The Formal Abstracts (FABSTRACTS) project will establish a formal abstract service that will express the results of mathematical publications in a computer-readable form that captures the semantic content of publications.

Specifically, the service will

- give a statement of the main theorem of each published mathematical paper in a language that is both human and machine readable.
- link each term in theorem statements to a precise definition of that term (again in human/machine readable form), and
- ground every statement and definition in the system in some foundational system for doing mathematics (the Lean theorem prover).

What is the long-term vision for Formal Abstracts (FABSTRACTS) in mathematics?

## FABSTRACTS enables machine learning in math

Many mathematicians dream that computers will someday fundamentally

transform how mathematics is practiced, and FABSTRACTS expresses

in concrete terms how that transformation might be achieved.

Automation of mathematics has been a major aim of artificial

intelligence from its inception. This broad aim includes all aspects

of mathematical activity such as conjecturing, generating examples,

finding counterexamples, computing, learning,

collaborating, proving, generalizing, refereeing and checking,

searching, exploring, and transforming the literature.

For many of these activities, it is important or even essential to

have a machine-readable semantic representation of mathematical

objects. This is achieved by FABSTRACTS.

We foresee tools for exploration such as a *Google Earth* for

mathematics, providing an intuitive visual map of the entire world of

mathematics, combining formal abstracts, computation, and other

networked content — all displayed at user-selected resolution.

## It gets easier

Just as the FABSTRACTS project will enable AI in mathematics, the

development of AI will enable FABSTRACTS.

Looking forward beyond the challenging initial groundwork to get

FABSTRACTS up and running, the generation of formal abstracts will

become nearly fully automated. The production of formal abstracts

will become easier and more efficient with every passing year.

Tools for the natural language processing of mathematics are under

development. As these tools develop, automation will replace the

manual preparation of formal abstracts. Automation will make it feasible

to construct hundreds of thousands (and eventually millions)

of formal abstracts.

For example, Kofler and Neumaier have developed a powerful parser for common mathematical language, and Neumaier is currently rewriting some of his

published math books in multilingual editions so as to be parsed and

translated by their software. In other promising work,

Ganesalingam’s thesis gave a brilliant linguistic analysis of the

language of mathematics. The current Naproche (Natural language

Proof Checking) system accepts a rich subset of mathematical language

including TeX-style formulas.

We expect mathematicians to be able to supply their own formal abstracts

in a structured English format, which will be parsed and translated to the Lean language.

We foresee future cooperation between FABSTRACTS and a major math

abstract service (such as zbMATH or MathSciNet).

The abstract service will provide statements of

theorems from publications in a form suited for fast formal

abstraction, and with FABSTRACTS offering enhancements to those services.

Progress in document analysis will eventually permit the automated

extraction of formal abstracts directly from the documents (building

on the work of M. Suzuki and collaborators at Kyushu University).

## FABSTRACTS as a permanent service

FABSTRACTS will become integrated with the workflow of

several tens of thousands of mathematicians.

We foresee a TeX package that allows mathematicians to cite formal

definitions and theorem statements for the convenience of readers, and

for the automatic processing of texts:

Let $M$ be a \formal{banach manifold}[fabstracts.org/10.1109/5.771073]. %TeX

We foresee further integration of FABSTRACTS with computation (in the

style of the current SageTeX package, which performs SageMath

computation and graphics from within a TeX document).

We foresee the translation of formal abstracts into many languages

and formats. Lewis has constructed a two-way bridge between Mathematica and Lean. Computer programs already exist that give automated translations between some of the major proof assistants. The work of Kohlhase, Rabe, and

their coworkers develops general infrastructure for the translation of

formal mathematics between different systems. The point is

that before we translate, we first need any formal representation,

and that is achieved by FABSTRACTS.

There are many expected applications,

including training sets for machine learning, enhancement of automated

theorem proving technologies, machine translation of mathematical

texts into foreign languages, detecting errors in the mathematical

literature, and intelligent mathematical search

tools.

Finally, we foresee that FABSTRACTS will inspire a broader movement

for the increased precision in science, with mathematicians taking up

the challenge to formalize the proofs of formal abstracts, and

scientists finding precise ways to give formal expression to the

objects of their research.

This is great! What is the concrete timeline and roadmap? And who will working on it? Can you use more funding from industry?