SAN FRANCISCO, CA / ACCESS Newswire / March 6, 2026 / The fusion of artificial intelligence and pure mathematics is accelerating, evolving from computational assistance to enabling creative breakthroughs. Neel Somani, a Berkeley-educated computer scientist and founder of the blockchain platform Eclipse, recently spearheaded an experiment that offers a glimpse into this transformative future. By organizing a team of undergraduate students to apply advanced AI models to unresolved Erdős problems, Somani sought not only to solve mathematical challenges but also to uncover the deeper architecture of discovery itself.
The project, named "GPT-Erdos," employed state-of-the-art tools like GPT-5.2 Pro and Deep Research to tackle unsolved mathematical conjectures. The results were striking, yielding accepted solutions, partial progress, and independent rediscoveries of results that had gone largely undocumented. Yet, as Somani highlights, the true value of this initiative lies in the insights it provided into the informal, often hidden principles that guide human research. As AI systems increasingly engage in "autoformalization," the process of converting human-readable proofs into machine-verifiable formats, they are prompting the scientific community to reconsider long-held notions of novelty, progress, and rigor.
The Complexity of Underspecification
One of the most intriguing insights from Somani's work revolves around the issue of underspecification. AI-generated solutions often expose ambiguities in how humans define success. During the GPT-Erdos experiment, Somani observed instances where the AI produced valid solutions that diverged methodologically from established approaches while remaining functionally equivalent.
This raises a critical question: How should such results be categorized? Are they novel discoveries, rediscoveries, or extensions of prior work? Somani notes that debates over novelty are not merely academic; they reflect deeper questions about intellectual contribution. When AI generates solutions without a clear historical lineage, it challenges the human desire for clean definitions of "newness," revealing the often messy reality of mathematical derivation. The experiment underscored that AI's "failures" are frequently not errors in results but failures of specification: cases where the AI meets technical criteria but falls short of satisfying human expectations for what constitutes a meaningful contribution.
Redefining Novelty in the Age of Automation
The challenge of defining novelty is not unique to AI; it has long divided even the most accomplished mathematicians. Somani points to examples where leading figures, such as Terence Tao, might view an AI-generated result as novel, while others might see it as derivative. This divergence underscores how heavily the mathematical community relies on intuition, rather than formal criteria, to assess the value of a proof.
Neel Somani suggests that the field may need to adopt a more formalized definition of novelty. One potential framework could involve measuring the minimum complexity required to express a proof. If a proof merely reconfigures existing theorems with new parameters, it may lack novelty. However, if it necessitates the creation of multiple new, non-trivial theorems, it likely represents a genuine advancement. Drawing on his expertise in cryptography and quantitative research, Somani proposes taking inspiration from zero-knowledge proofs, defining mathematical "knowledge" as the ability to reconstruct a proof using existing results within polynomial time.
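To make the intuition concrete, here is a deliberately simplified sketch of what a complexity-based novelty score might look like. This is not Somani's actual formalism; the function names and the step-counting heuristic are illustrative assumptions only, standing in for the far richer idea of polynomial-time reconstructability.

```python
# Hypothetical toy: score a proof's novelty by the fraction of its steps
# that cannot be expressed using an existing theorem library.
# All identifiers here are invented for illustration.

def novelty_score(proof_steps, known_theorems):
    """Return the fraction of proof steps introducing new machinery.

    proof_steps: list of lemma/theorem identifiers used in the proof.
    known_theorems: set of identifiers already in the literature.
    """
    if not proof_steps:
        return 0.0
    new_steps = [s for s in proof_steps if s not in known_theorems]
    return len(new_steps) / len(proof_steps)

library = {"cauchy_schwarz", "pigeonhole", "induction"}
routine = ["pigeonhole", "induction"]            # reconfigures known results
creative = ["pigeonhole", "new_lemma_a", "new_lemma_b"]  # new machinery

print(novelty_score(routine, library))   # 0.0: no new theorems required
print(novelty_score(creative, library))  # high score: mostly new lemmas
```

Under this toy metric, a proof that merely re-parameterizes known theorems scores zero, while one requiring multiple new lemmas scores highly, mirroring the distinction Somani draws.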
The Elusive Nature of "Interestingness"
Beyond the mechanics of proving theorems lies a more abstract challenge: determining which problems are worth solving. Human mathematicians possess an intuitive sense of "interestingness," a heuristic that balances difficulty with potential impact. Large Language Models (LLMs), however, lack this intuition. They cannot inherently discern which mathematical questions might unlock breakthroughs in physics or engineering, nor can they grasp the cultural or aesthetic significance of a problem.
Neel Somani argues that this limitation extends beyond mathematics into fields like business and art. Just as AI struggles to identify genuinely novel business ideas, it also struggles to prioritize meaningful mathematical inquiries. These values, deeply embedded in human experience, are absent from training data. As a result, the rise of autoformalization serves as a mirror, exposing the "soft" concepts that humans rely on as invisible guardrails for progress.
From Mathematical Proofs to Software Reliability
While the philosophical implications of autoformalization are profound, its practical applications are immediate, particularly in domains requiring absolute reliability. Somani, whose work with Eclipse focuses on decentralized technology, sees a direct connection between formal mathematical proofs and software security. In fields like quantitative finance and blockchain development, the goal is often to create systems that are provably correct.
The rapid proliferation of AI-generated code introduces new risks. Somani refers to this as "slop code": software produced so quickly that human review becomes a bottleneck. Autoformalization offers a solution by enabling formal methods at scale. Instead of relying on human oversight to catch issues like memory safety violations or exception handling errors, formalized AI systems could provide provable guarantees. This shift could make formal verification, once considered too cumbersome for general software development, a practical standard for critical infrastructure.
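True formal verification relies on proof assistants or model checkers rather than testing; as a minimal flavor of the idea of a machine-checked safety guarantee, the sketch below exhaustively verifies a bounds-safety property over a bounded input domain. The function and the property are hypothetical examples, not drawn from Eclipse's codebase.

```python
# Bounded exhaustive check of a safety property: an illustrative stand-in
# for formal verification (real tools prove the property for ALL inputs).

def safe_index(buf, i):
    """Clamp an index into range so access can never go out of bounds."""
    return buf[max(0, min(i, len(buf) - 1))]

buf = [10, 20, 30]
# Machine-check the no-out-of-bounds property across a bounded domain:
# safe_index must never raise IndexError and must return a real element.
for i in range(-100, 100):
    assert safe_index(buf, i) in buf
print("bounded safety check passed")
```

Where a test suite samples a few inputs, verification (here crudely approximated by exhaustion) establishes that the property cannot fail anywhere in the checked domain, which is the kind of guarantee Somani describes for provably correct systems.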
Toward a Metric for "Closeness"
Looking ahead, Somani identifies a key gap in current tools: the absence of a metric for "closeness" to completion. Today, formal verification operates in binary terms-a proof either verifies or it does not. Yet, the history of scientific discovery is rarely so clear-cut. Major breakthroughs, such as the Einstein field equations, often emerged through heuristics and metaphors long before they were rigorously formalized.
Somani envisions a future where autoformalization incorporates a differentiable surrogate function to measure how close a proof is to being correct. Such a tool would allow researchers to distinguish between proofs that are fundamentally flawed and those that are nearly complete. This development could transform AI from a binary checker into a true collaborator, capable of navigating the heuristic, iterative process of discovery.
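As a rough illustration of what such a surrogate might look like, the sketch below maps the fraction of discharged verification obligations to a smooth score in (0, 1) instead of a binary verified/failed answer. The scoring function, its parameters, and the notion of "obligations" are all hypothetical assumptions, not a description of any existing tool.

```python
import math

# Hypothetical "closeness" surrogate: score a proof attempt by how many
# of its verification obligations are discharged, squashed through a
# logistic curve so the score varies smoothly with progress.

def closeness(discharged, total, steepness=10.0):
    """Map progress on `total` obligations to a smooth score in (0, 1)."""
    if total == 0:
        return 1.0  # nothing left to prove
    progress = discharged / total
    # Logistic squashing: differentiable in `progress`, centered at 50%.
    return 1.0 / (1.0 + math.exp(-(progress - 0.5) * steepness))

print(round(closeness(1, 10), 3))  # fundamentally incomplete: low score
print(round(closeness(9, 10), 3))  # nearly complete: high score
```

A smooth signal like this is what would let an AI system distinguish a proof that is one lemma away from one that is structurally broken, the gap Somani identifies in today's binary verifiers.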
A New Paradigm for Inquiry
Neel Somani's experiment demonstrates that even if AI progress were to halt today, the practice of mathematics has already been irrevocably changed. The ability to verify proofs via machine and rapidly assimilate existing approaches allows researchers to focus on high-level conceptualization rather than rote memorization.
As the founder of Eclipse and a mentor to emerging computer scientists, Somani continues to explore how these technologies can reshape decentralized systems and academic inquiry. He also founded Lipschitz Strategies, LLC, a consulting firm that provides marketing and technical advisory services for software companies. The future of mathematics is not just about machines solving problems; it is about machines helping humans redefine the very nature of the questions they seek to answer.
SOURCE: Lipschitz Strategies LLC
