Abstract
Large language models (LLMs) have demonstrated remarkable capabilities on natural language processing tasks that require multi-step logical reasoning, such as automated theorem proving. We present KG-Prover, a novel framework that leverages knowledge graphs mined from reputable mathematical texts to augment general-purpose LLMs for constructing and formalizing mathematical proofs. We build a knowledge graph of over 60,000 nodes and 300,000 edges representing mathematical concepts and their interrelations. We study the effects of scaling graph-based, test-time compute with KG-Prover, demonstrating significant performance improvements over baselines. General-purpose LLMs improve by up to 21% on miniF2F-test when combined with KG-Prover, with consistent improvements of 2–11% on the ProofNet, miniF2F-test, and MUSTARD datasets. Furthermore, KG-Prover with o4-mini achieves over 50% on miniF2F-test.