PARIS–(BUSINESS WIRE)–Feb 11, 2025–
MLCommons, in partnership with the AI Verify Foundation, today released v1.1 of AILuminate, incorporating new French language capabilities into its first-of-its-kind AI safety benchmark. The new update – which was announced at the Paris AI Action Summit – marks the next step towards a global standard for AI safety and comes as AI purchasers across the globe seek to evaluate and limit product risk in an emerging regulatory landscape. Like its v1.0 predecessor, the French-language v1.1 was developed collaboratively by AI researchers and industry experts, ensuring a trusted, rigorous analysis of chatbot risk that can be immediately incorporated into company decision-making.
“Companies around the world are increasingly incorporating AI in their products, but they have no common, trusted means of comparing model risk,” said Rebecca Weiss, Executive Director of MLCommons. “By expanding AILuminate’s language capabilities, we are ensuring that global AI developers and purchasers have access to the type of independent, rigorous benchmarking proven to reduce product risk and increase industry safety.”
Like the English v1.0, the French-language v1.1 of AILuminate assesses LLM responses to over 24,000 French language test prompts across twelve categories of hazardous behaviors – including violent crime, hate, and privacy. Unlike many peer benchmarks, none of the LLMs evaluated are given advance access to specific evaluation prompts or the evaluator model. This ensures a methodological rigor uncommon in standard academic research and an empirical analysis that can be trusted by industry and academia alike.
“Building safe and reliable AI is a global problem – and we all have an interest in coordinating on our approach,” said Peter Mattson, Founder and President of MLCommons. “Today’s release marks our commitment to championing a solution to AI safety that’s global by design and is a first step toward evaluating safety concerns across diverse languages, cultures, and value systems.”
The AILuminate benchmark was developed by the MLCommons AI Risk and Reliability working group, which brings together leading AI researchers from institutions including Stanford University, Columbia University, and TU Eindhoven; civil society representatives; and technical experts from Google, Intel, NVIDIA, Microsoft, Qualcomm Technologies, Inc., and other industry giants committed to a standardized approach to AI safety. Cognizant that AI safety requires a coordinated global approach, MLCommons also collaborated with international organizations such as the AI Verify Foundation to design the AILuminate benchmark.
“MLCommons’ work in pushing the industry toward a global safety standard is more important now than ever,” said Nicolas Miailhe, Founder and CEO of PRISM Eval. “PRISM is proud to support this work with our latest Behavior Elicitation Technology (BET), and we look forward to continuing to collaborate on this important trust-building effort – in France and beyond.”
Currently available in English and French, AILuminate will be made available in Chinese and Hindi later this year. For more information on MLCommons and the AILuminate Benchmark, please visit mlcommons.org.
About MLCommons
MLCommons is the world’s leader in AI benchmarking. An open engineering consortium supported by over 125 members and affiliates, MLCommons has a proven record of bringing together academia, industry, and civil society to measure and improve AI. The foundation for MLCommons began with the MLPerf benchmarks in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. Since then, MLCommons has continued to use collective engineering to build the benchmarks and metrics required for better AI – ultimately helping to evaluate and improve the accuracy, safety, speed, and efficiency of AI technologies.
View source version on businesswire.com: https://www.businesswire.com/news/home/20250210273057/en/
CONTACT: Kelly Berschauer
Marketing Director, MLCommons
[email protected]
SOURCE: MLCommons
Copyright Business Wire 2025.
PUB: 02/11/2025 12:00 AM/DISC: 02/10/2025 11:59 PM