A Review of the Lean Theorem Prover

Thomas Hales

This review gives a mathematician’s perspective on Lean.

What is great about Lean?

I turned to Lean when I started looking for a system with more foundational strength than HOL-Light, a proof assistant I have been happily using for many years. HOL-Light is great at what it does (such as real and complex analysis), but it falls short in its treatment of big objects, such as the category of all sets.

Both Coq and Lean are based on the logical foundations of CiC (calculus of inductive constructions). Unlike HOL, CiC can readily encode most structures in widespread use in mathematics.

I chose Lean (over Coq and other competitors) for a various reasons.

1. Lean sounds wonderful:  open source, a small trusted kernel, a powerful elaboration engine including a Prolog-like algorithm for type-class resolution, multi-core support, incremental compilation, support for both constructive and classical mathematics, successful projects in homotopy type theory, excellent documentation, and a web browser interface.  Lean source files can be edited in Emacs, Visual Studio Code, or CoCalc.

Lean seems to have learned from the past decade of research into the Coq system.

2. In more detail, a “minimalist and high performance kernel” was an explicit goal of the Lean. Independent implementations of the kernel have been given (Selsam in 2000 lines, etc.) alleviating any concerns about a bug in the C++ implementation of Lean.

3. The semantics of Lean are now completely spelled out (thanks to Mario Carneiro, building on Werner). In particular, Carneiro has built a model of Lean’s logic (CiC with non-cumulative universes) in ZFC set theory (augmented by a countable number of inaccessible cardinals).

4. Lean has a clean syntax. For example, to add two elements in an abelian group, one can simply write x+y and Lean correctly infers the group in which the addition is to be performed. I have more to say about Lean’s syntax later.

5. Lean makes it easy to switch from constructive to classical logic (you just open the classical logic module). Lean makes quotient types easy (unlike Coq, when tends to work with awkward setoids).

6. Lean is its own meta language. I find this very appealing. Contrast this with HOL-Light, which has OCaml as meta-language or Coq which has a domain-specific language Ltac for tactics.

7. Finally, there was a personal reason. Carnegie-Mellon University is a center of Lean library development. I live in Pittsburgh and am a regular participant in CMU’s Lean group meetings.

What’s not so great about Lean?

1. The kernel is bloated. Specifically, from what I hear, for performance reasons, mutually inductive types will soon be moved into the kernel. This destroys the former claims of a minimalist kernel.

2. Lean is not backwards compatible. Lean 3 broke the Lean 2 libraries, and old libraries still haven’t been ported to Lean 3. After nearly 2 years, it doesn’t look like that will ever happen. Instead new libraries are being built at great cost. Lean 4 is guaranteed to break the Lean 3 libraries when it arrives.  In short, Lean is experimental, evolving, and unstable.

3. The learning curve is steep. It is very hard to learn to use Lean proficiently. Are you a graduate student at Stanford, CMU, or Pitt writing a thesis on Lean? Are you a student at Imperial being guided by Kevin Buzzard? If not, Lean might not be for you.

4. Lean is its own metalanguage. Lean is new, and the language libraries are almost nonexistent.  Hardly any major programs have been written in Lean.   Java programmers outnumber Lean programmers by a million to one.  To first approximation, it is impossible to do any serious programming in Lean.

I miss HOL-Light, whose metalanguage is an established programming language (OCaml) with large supporting libraries. Whenever I needed a tactic, I just sat down and wrote it in OCaml. I can’t do this in Lean, and it is extremely frustrating to work in a environment that is not equipped for serious programming.

In fact, I could live with almost all my other criticisms, if I could write my own code in Lean to fix what I don’t like. I have discussed Lean programming issues with experts, and the tools simply are not available.

5. Typing in Lean is nominal rather than structural.  As a mathematician and OCaml programmer, I would have preferred design decisions that favor a structural approach.  Mathematicians are used to working in set theory, and decisions should be made that push the type theory to be more set-like, by making types more interoperable.

6. There are performance issues. It is not clear (to me or perhaps even to anyone) why performance is such a big problem, because Lean was implemented in C++ for the sake of performance. However in fact, the compilation of the math libraries is currently very slow. Something is wrong here.

7. Ugly projection chains are required. Consider the definition of semigroup (a mathematical structure with an associative binary operation)

universe u
class semigroup (α : Type u) extends has_mul α :=
(mul_assoc : ∀ a b c : α, a * b * c = a * (b * c))

This looks simple enough, and it has an appealing syntax that allows the symbol * to be used for multiplication because the class extends has_mul. However, the elaborator expands this into something ugly

@[class, priority 100]
structure semigroup : Type u → Type u
fields:
semigroup.mul : Π {α : Type u} [c : semigroup α], α → α → α
semigroup.mul_assoc : ∀ {α : Type u} [c : semigroup α] (a b c_1 : α),
@eq α
(@has_mul.mul α (@has_mul.mk α (@semigroup.mul α c))
(@has_mul.mul α (@has_mul.mk α (@semigroup.mul α c)) a b) c_1)
(@has_mul.mul α (@has_mul.mk α (@semigroup.mul α c)) a
(@has_mul.mul α (@has_mul.mk α (@semigroup.mul α c)) b c_1))

Every time the multiplication symbol is used, the semigroup is explicitly converted to another structure has_mul, which is responsible for handling the multiplication.

I have intentionally selected a very simple example. Things are much much worse when you look at the ugly expansions of a more complicated mathematical structure (such as an ordered ring).

8. Structure depends on notation. Lean has a library of results about multiplicative groups and a separate library of results about additive groups. The only difference is that one uses the symbol * for the group operation and the other uses the symbol + for the group operation. Mathematician find it absurd that the theorems in group theory have been made to depend on the symbol used for composition.

The problem grows worse in structures that depend on groups. For example, there are four types of products of two groups, because the first factor can be either a +-group or a *-group and the second factor can also be either of the two.

There is an interesting history behind this design flaw.  Originally, type classes were introduced into functional languages to make ad hoc polymorphism less ad hoc. What started as a way to clean up operator overloading notation, ended up welding notation to structures where it is not welcome.

9. No diamonds are allowed. From a programming language point of view, I understand the issues surrounding diamonds. However, for mathematicians, diamonds are extremely natural and they occur in great abundance under many names (pullbacks, fibered products, Cartesian squares, etc.). It seems to me that a system designed for doing mathematics should do more than just declare them illegal.

Read the full discussion that led to the diamond ban, where it argued (unpersuasively) that even though the algebra library used diamonds “in many different places,” the ban would only “be a minor annoyance.” In fact, I find it to be a major impediment. I’m told that I can just give my own implementation of classes with diamonds, but see item 4 above — I effectively have no computer language to use for this.

10. Structures are meaninglessly parameterized from a mathematical perspective. To briefly introduce the topic of parameters and bundling, users choose whether data appears as an external parameter. Here is an example of magma, both in parametric and bundled form

structure magma_with_parameter (α : Type*) :=
(op : α → α → α)
structure magma_bundled :=
(α : Type*)
(op : α → α → α)

I think of the parametric versus bundled variants as analogous to currying or not; are the arguments to a function presented in succession or as a single ordered tuple? However, there is a big difference between currying functions and currying structures. Switching between curried and uncurried functions is cheap, but it is nearly impossible in Lean to curry a structure. That is, what is bundled cannot be later opened up as a parameter. (Going the other direction towards increased bundling of structures is easily achieved with sigma types.) This means that library designers are forced to take a conservative approach and expose as a parameter anything that any user might reasonably want exposed, because once it is bundled, it is not coming back.

Mathematicians tend to be flexible about what information they choose to expose about a structure, according to whatever information is relevant to the given context. They don’t want a predetermined bundling decision that is hard-wired into the libraries.

How are Lean structures meaninglessly parametrized? As an illustration, the type α : Type* of the underlying carrier of the group is passed as a parameter (as in the magma_with_parameter example above). For a mathematician this is entirely meaningless information that is passed as a parameter. Who cares about the particularities of the underlying set of the group? What matters is the group structure. A mathematician says, let G be a group, but never let α be a set and let G be a group constructed on carrier α. Lean requires the user to give both α and G every damn time. Other language designers fight on behalf of the users to save them keystrokes and boilerplate. I expect the same from Lean.

This gets worse as the algebraic structures grow in complexity. You cannot just say, let M be a topological module of a topological ring R. I count 7 different structures that first have to be explicitly declared every time a topological module is introduced in Lean. This is intolerable. I haven’t made the count for tensor products of topological bi-modules, but you get the idea. And in case you are wondering, yes, mathematicians do  take tensor products of topological bi-modules.

(Lean does it this way for a reason: it uses the exposed parameter α as an index for type-class resolution, as a cheap way to avoid diamonds, etc. But why should mathematicians be forced to wander into Lean internals?)

11. Lean discards valuable information that is later reconstructed (at a cost) by its type class resolution engine.

For example, we write G:group α in Lean to say that G is a group with carrier α. Then, when we write the associative law, ∀ (x y z : α), x * y * z = x * (y * z), valuable information is lost because all typing information refers to the carrier α instead of the group G, so Lean must hunt around for a group G associated with the raw type α.

If there are two groups on the same underlying carrier (which happens for example with semi-direct products), then Lean gives unpredictable results. There is a way to set priorities here, but that is not advised, because these settings have global scope.

The Mizar approach seems saner. We write G:group (bundled in Mizar style) and

∀ (x y z : G), x * y * z = x * (y * z).

The difference with Lean might appear small, but now the statement has not discarded the group G and we do not have to hunt for it. Since G is not a type, z : G coerces to z : G.carrier, or more verbosely, z : group.carrier G, and the group is still visible.

11 thoughts on “A Review of the Lean Theorem Prover

  1. Dear Thomas,

    Thanks for all your interesting comments. I’m curious if you could expand on #4, which seems maybe the most critical one for you at the moment.

    Could you give examples of libraries that you miss most?

    (I recently ran into the pain of having to define lazy lists from scratch — is that the sort of thing you have in mind? Another similar pain point for me is not having an established library for lifting the tactic monad; the basics are there, but without examples or conveniences.)

    You’re right that there definitely a downside to having Lean be “its own metalanguage”. (The upside to me is nevertheless quite convincing; I’ve been very pleasantly surprised that my students and I can write our own tactics as we need them.)

  2. Thanks, this is a wonderful summary of the pros and cons of Lean! Could you elaborate a bit on C5 (structural vs nominal typing), though? I don’t quite see what you’re suggesting. Would your concerns be addressed by HoTT, where we can identify equivalent types, or is your concern on the level of judgmental equality?

  3. Isn’t the last point more a weakness of Lean’s library than of Lean itself? Coercions are there in Lean, so using a bundled group the same syntax as in Mizar should be possible?

  4. Thanks for this post, it is very instructive to implementors of proof assistants!

    From the implementor’s perspective (Agda) let me comment on some of the drawbacks you list.

    – 5. (structural typing) would only aggravate 6. (performance issues) and 7. (ugly terms printed back to the user). A type checker up to structure would produce structures the user has not written himself and has not given a name; thus, to comprehend the errors of the type checker, the user would have to comprehend these machine-generated structures. Coq’s CIC has more structural feel than Agda, e.g., they have a fix combinator for anonymous recursive functions, and this shows up sometimes in error messages; completely incomprehensible. My verdict would be that 5 is a bad idea with what our current technology for error explanation to the user can achieve.

    – 7. (performance issues) have to be seen relative to the state of the art. Currently all type-theoretic proof assistants have performance issues one-way or the other, because of the inherent cost of type checking dependent types. This cost can be removed by much less automatic type checker, which require annotation concerning unfolding of definitions. However, I do not expect such “stupid” type checkers to find the applause of the users.

    – 10. (parametrization) Er, yes, this is owed to type theory. Standard mathematics makes hardly use of indexing (parametrization), while they are the bread-and-butter of working in type theory.
    But I agree that existing solutions for “anti-Sigma”, such as records with manifest fields (Pollack, TPHOLs 2000) haven’t found their way into proof assistants in the way they should have.

    – 11. Coq has coercive subtyping which achieves what you want here. (In Agda, we don’t have it (yet) as it makes the type checker even more complicated.)

    They say it takes 20 years to develop a proof assistant of some maturity (Coq, Isabelle, …). It is kind of reassuring for the competition that lean hasn’t reached that level in 5 years.

    Cheers, Andreas

  5. Very interesting summary, thanks a lot!

    I do wonder if you have explored the math-comp library, and how many of these problems does it share.

Leave a comment