Google’s AI research and development lab, DeepMind, has unveiled AlphaEvolve, an AI system designed to tackle complex math and science problems whose solutions are “machine-gradable.” The system uses “state-of-the-art” Gemini models to generate, critique, and evaluate possible answers to a given problem.
AlphaEvolve introduces a mechanism to reduce hallucinations in AI models by using an automatic evaluation system. This system scores the generated answers for accuracy, allowing it to work effectively on problems that can be self-evaluated, particularly in fields like computer science and system optimization.
To use AlphaEvolve, users must provide a problem statement, optionally accompanied by details such as instructions, equations, and relevant literature. They must also supply a mechanism for automatically assessing the system’s answers, typically in the form of a formula. Because AlphaEvolve can only describe solutions as algorithms, it is less suitable for non-numerical problems.
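As a rough illustration of that workflow, the sketch below pairs a user-supplied scoring formula with a loop that proposes candidates and keeps only those that score better. It is not DeepMind's implementation: the names `evaluate`, `propose`, and `evolve` are hypothetical, the objective is a toy formula chosen for this example, and a small random perturbation stands in for the Gemini models that generate candidates in the real system.

```python
import random
from typing import Callable, List, Tuple

Candidate = List[float]


def evaluate(candidate: Candidate) -> float:
    """Hypothetical user-supplied evaluator; a higher score is better.

    Toy objective (an assumption for this sketch): -(x - 3)^2 - (y + 1)^2,
    which reaches its maximum of 0 at the point (3, -1).
    """
    x, y = candidate
    return -((x - 3.0) ** 2) - ((y + 1.0) ** 2)


def propose(parent: Candidate, rng: random.Random) -> Candidate:
    """Stand-in for the candidate generator.

    Here it is only a small Gaussian perturbation of the parent,
    not a language model.
    """
    return [value + rng.gauss(0.0, 0.5) for value in parent]


def evolve(
    evaluator: Callable[[Candidate], float],
    start: Candidate,
    generations: int = 200,
    children: int = 8,
    seed: int = 0,
) -> Tuple[Candidate, float]:
    """Propose candidates repeatedly and keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = start, evaluator(start)
    for _ in range(generations):
        for _ in range(children):
            child = propose(best, rng)
            score = evaluator(child)
            # The automatic evaluator is the only judge: a candidate
            # that does not score higher is discarded.
            if score > best_score:
                best, best_score = child, score
    return best, best_score


if __name__ == "__main__":
    best, score = evolve(evaluate, [0.0, 0.0])
    print("best candidate:", best)
    print("score:", score)
```

The point of the sketch is the division of labor: the user writes only `evaluate`, and every proposed answer is accepted or rejected by that score alone, which is why the approach fits problems that can be graded automatically.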
In benchmarking tests, AlphaEvolve was presented with around 50 math problems across various branches, including geometry and combinatorics. The system successfully “rediscovered” the best-known answers 75% of the time and uncovered improved solutions in 20% of cases. DeepMind also applied AlphaEvolve to practical problems, such as optimizing Google’s data center efficiency and speeding up model training runs.
According to DeepMind, AlphaEvolve generated an algorithm that recovers 0.7% of Google’s worldwide compute resources on average, and it suggested an optimization that reduced the overall time to train Gemini models by 1%. AlphaEvolve isn’t making groundbreaking discoveries, but DeepMind claims it saves time and frees up experts to focus on more critical tasks.
DeepMind plans to build a user interface for AlphaEvolve and launch an early access program for selected academics before considering a broader rollout. The lab asserts that AlphaEvolve’s capabilities make it a valuable tool for domain experts.
All Rights Reserved. Copyright , Central Coast Communications, Inc.