Provably-Correct Vibe Coding

http://ProvablyCorrectVibeCoding.com

This demo app is a frontend to LeanTool and SafeVerify. In more detail:

The user creates a formal specification for a Lean coding task. Or take an existing problem from Code with Proofs: the Arena. (Upcoming feature: autoformalize from natural language problem description and a test set)
An AI agent attempts to output code with proof of correctness. The AI is an LLM (e.g. Sonnet/o3/Deepseek/Gemini/Grok4) with user-provided API key, armed with LeanTool. LeanTool provides not only ability to iteratively fix syntax errors, it also facilitates hallucination detection via property-based testing.
The solution is checked against the specification using SafeVerify. If the solution passes, the user now has an implementation of the coding task that is guaranteed to satisfy the specification.

This demo is a proof-of-concept of a vision that I have been building towards via my open-source projects: a future with safe, hallucination-free coding AIs. We are of course quite a bit of distance from that goal, but hopefully this gives an idea of what that might look like.

Next Steps

Contributions / collaborations welcome! Some ideas for improvements:

Integration with LeanExplore’s MCP server
Streaming output
As mentioned above, autoformalization of problem descriptions using the FormalizeWithTest approach
Right now the app is using the simple feedback loop provided by LeanTool. Could instead use another “agentic” coding assistant like Claude Code, Gemini CLI or Goose, armed with LeanTool MCP for the Lean interaction.