Lean4AI: An Overview
The other day, I was chatting with a friend and trying to explain my research program. The question arises: How should someone new learn about this topic?
My default answer had been to start with the original proposal, and then to read the follow up posts on this blog and to play with the GitHub repos linked from them. The drawback is that with six blog posts so far and about the same number of github repos, it would a very long read.
Thinking about this more, I believe a reading guide that lays out the various components in a more structured and logical sequence can be valuable. Hence this post. Readers may use the reading guide to focus their attention on a particular branch of inquiry that they find more interesting. Or, you may still want to read everything, but with the overall structure in mind, you can hopefully get a better sense of how they fit together.
Also, does our project need a catchy title? I’ve been thinking about the title Lean4AI. This is a play on related project titles like “AI for Lean”, “AI for Math” etc. Sub-slogan: Using iteractive theorem provers like Lean to tackle the great challenges of generative AI: Safety and Hallucination. Comments and suggestions are welcome. Meanwhile, I got the domain name lean4ai.org, which currently points to this blog site.
The Lean4AI Reading Guide
- Read the original proposal, Sections 1 and 2, and Section 3 up to the beginning of the Projects. This will give you a good sense of the motivation of the project, and the overall direction.
- Skim through the rest of the proposal including the Projects, to get a rough sense of what they are. If you find one or more of the projects particularly exciting, feel free to read those in more detail, and refer to the relevant materials in the next bullet.
- The subsequent material can be divided into three streams:
- Dataset creation and sharing. Current efforts are happening in the following sub-projects:
- Code with Proof: the Arena. This is a website where coding challenges with formal specifications can be posted and solved. Relevant reading: Progress report 1. Code on GitHub: Arena, SafeVerify. A demo of the website is up.
- Code with Proof Benchmark. This is a human-written (by me) suite of code-with-proof problems and solutions. Code on GitHub: Benchmark. Also available on HuggingFace. Try these problems on the Arena demo site.
- FormalizeWithTest. Autoformalization of coding problems, verified with test cases. Relevant reading: Progress Report 2. Code on GitHub: FormalizeWithTest, LeanTool. Some of the results of the autoformalization are on Arena demo.
- Hallucination Detecting and Recovery. We use Lean to help LLMs detect and recover from hallucinations. Reading: Progress report 3, 3.5, 3.6. Code on GitHub: WakingUp is the main repo; dataset preprocessing provided by FormalizeWithTest; also uses LeanTool for LLM-Lean interaction.
- Programming with dependent types. A promising approach for producing verified code, and for easily locating errors while generating code. Reading: Memoization example, DP example, and some discussion of future directions relating to hallucination detection. Code on GitHub: Some of the solutions in Benchmark are written in this style. Practice on Arena demo.
- Dataset creation and sharing. Current efforts are happening in the following sub-projects:
Diagram of Data Creation Processes
(Added 2/5) Here is a rough drawing of how our various projects relate to each other in terms of data creation.
Contribubting
If you are interested in contributing or collaborating, feel free to contact me at GasStationCodeManager@gmail.com. If you are looking for research project ideas, check out the future work discussions in the various posts mentioned above. If you are looking for code contribution ideas, check out some of the open issues in my GitHub projects.