DeepIsaHOL progress report: current machine learning for the Isabelle proof assistant

Title:DeepIsaHOL progress report: current machine learning for the Isabelle proof assistant

Tags:Isabelle, large language model, machine learning, proof assistant, reinforcement learning and theorem proving

Abstract:

I report on the progress of the DeepIsaHOL project for doing modern machine learning (ML) on the Isabelle interactive theorem prover (ITP) data. The project is ongoing and focused on creating infrastructure for doing ML experiments with Isabelle data such as fine-tuning large language models (LLMs) or converting Isabelle into a reinforcement learning (RL) environment. The project’s ultimate objective is to provide tools based on these experiments to aid Isabelle users in the proving process. This report describes the project’s current state and the tools it has generated, including a generic data extraction algorithm from Isabelle’s theory files, its Scala and Python interfaces, simple (Python and Scala) read, eval, print, loops (REPLs) for interacting with Isabelle programmatically, and initial use of the framework’s data for training a Google’s FLAN-T5 small LLM. The project remains open and welcomes ITP and ML enthusiasts to collaborate on it.