Typista.org

Lean for Scala programmers

February 28, 2021


Over the next few blog posts I’m going to use Lean and Scala to explore the following topics:

  • What is a proof assistant, and how is it different from a regular programming language?
  • What are dependent types?
  • What is a formal proof?
  • How does the Curry-Howard isomorphism work in practice?

First impressions of Lean

From wikipedia:

Lean is a theorem prover and programming language It is based on the calculus of constructions with inductive types.

The Lean project was created around 2013 by Leonardo de Moura (at Microsoft Research). The current version is 3.23.

In my opinion, the “killer app” for Lean at the moment is mathlib. It is perhaps the largest single repository of formalized mathematics out there.

This fantastic article on quanta magazine does a great job describing some of the momentum around Lean and mathlib.

The upcoming Lean 4 release (at milestone 1 right now) promises to bring increased performance, better interop with native code, and overall a better value proposition for using it as a practical programming language (alas, VSCode interactive support is not fully functional yet, so I’ll stick to 3.23 for now).

Coincidentally the almost-here Scala 3 release made dramatic simplifications to some form of dependently typed programming. It is my opinion that observing dependently typed constructions in a “native” environment such as Lean can help to understand and demystify some features of Scala’s type system.

(Scala version used: 3.0.0-M3)

The easiest way to follow along is to use the online interactive environment, but you can also install it locally if you wish.


With this out of the way, let’s start by dissecting a simple example.

Calculating the length of a list

Lean (try it):

def length {A: Type}: list A -> ℕ
    | [] := 0
    | (h :: t) := 1 + length t 
    
-- used like
#reduce length ["a", "b"]
-- 2

Scala 3:

def length[A <: Any]: List[A] => Int =
  case Nil => 0
  case h :: t => 1 + length(t)

// used like
length(List("a", "b"))

A few things that need discussion are:

Lean commands

We’ll be using Lean mostly in interactive mode. For that purpose we’ll be using two commands

#check

Prints the type of the expression

#reduce

Evaluates the expression using the trusted kernel and prints the resulting value

A: Type vs A <: Any

Types in Lean are first-class objects, meaning that they can be treated as regular values: they have a type, they can be stored, passed around, returned from functions. So A: Type means that A is a type.

Since every expression has a type, it is natural to ask: what is the type of Type? Turns out there is a type called Type 1 that has Type as an inhabitant.

Type : Type 1

and a type Type 2 that has Type 1 as an inhabitant.

Type 1 : Type 2

and so on.

In fact, Type is just an alias for Type 0 and we have an infinite tower of types

...
Type 2 : Type 3
Type 1 : Type 2
Type 0 : Type 1

It’s types all the way up!

Graphically:

universe hierarchy

In this diagram the boxes at the top are types, and their inhabitants (also types) are displayed inside. The inhabitants that are not types (such as regular numbers) are displayed at the bottom, connected by broken lines.

Each concrete number i above is called a universe level.

Note that list ℕ has type Type 0, whereas list Type has type Type 1.

If we want to declare type variables of any universe we can use the syntax {A: Type*} like so:

Lean (try it):

def length {A: Type*}: list A -> ℕ
    | [] := 0
    | (h :: t) := 1 + length t     

with this change we can now use our function in a list of types:

#reduce length [bool, ℕ -> ℕ]
-- 2

(We’ll explore universes in more detail later)

{...} vs [...]

{A : Type} means that the type argument A can be inferred. Regular (non inferrable) parameters are declared as (A: Type).

Also, in Scala [A <: Any] can be shortened as [A]. In Lean in some cases {A : Type} can also be shortened as {A}, but it is common to write the full form.

As Scala’s type parameters [...] are always potentially inferrable, they correspond to Lean’s {...} syntax.

-- correct:
length ["a", "b"]


-- incorrect:
length string ["a", "b"]

Unicode symbols

Unicode symbols are used a lot in Lean, and many objects have a corresponding unicode alias. is an alias for nat (the type of natural numbers, and it is entered as \N in vscode).

A trivial proof

As a first example of a proof consider the (admittedly trivial) result:

try it

example: length ["a", "b"] = 2 := rfl

It is so trivial that we don’t even bother naming it. Lean has the special keyword example for such anonymous proofs.

Propositions

The expression

length ["a", "b"] = 2 

Is a proposition (of type Prop). There are many ways to create propositions; a common one is using the equality operator (=).

Observe that neither length ["a", "b"] nor 2 are types, and yet the whole expression is used in type position in our example:

example: <PROPOSITION> := <PROOF>
              ^              ^
              |              | 
            TYPE X       EXPRESSION OF TYPE X

In Lean, propositions are types that sit at the bottom of the universe hierarchy:

...
Type 2 : Type 3
Type 1 : Type 2
Type 0 : Type 1
ℕ      : Type 0 
Prop   : Type 0
2 = 3  : Prop
2 = 2  : Prop
rfl 2  : 2 = 2

...

Graphically:

universe hierarchy props

You can see that Type 0 has Prop as an inhabitant (alongside with , and others not shown). Prop’s own inhabitants (such as 2 = 2, 2 = 3) are types as well.

rfl

(rfl is a short hand for reflexivity)

This is a function that constructs propositions of the form a = a, i.e. propositions where the left-hand side is exactly the same as the right-hand side.

“exactly” here means: up to evaluation. Lean evaluates length ["a", "b"] to obtain the equivalent proposition

2 = 2

So how do we actually prove that 2 = 2?

By providing a program (rfl) that can construct a value of this type.

On the other hand the proposition 2 = 3 is false, because you can’t create (by design) a value of this type. i.e. this type is uninhabited.

This is the basis of the Curry-Howard isomorphism, that represents propositions as types and proofs as programs.

Back to Scala

In Scala 3 we can’t use our length function in type position, so if we want to do something similar we need to create a type level version:

import compiletime.ops.int._

type length[tt <: Tuple] <: Int = tt match
  case EmptyTuple => 0
  case h *: t => 1 + length[t]

// example
summon[ length[("a", "b")] =:= 2 ]

Observe how we’re using tuples as the type-level equivalent for lists.

This example shows one of the biggest differences between theorem provers such as Lean and “normal” programming languages like Scala.

Lean has a unified syntax for both type-checking and evaluation of expressions. So far we’ve only used the type checker, but Lean also has an interpreter that can execute a subset of Leans’ code.

Scala in contrast has basically two languages combined into one: the (pure, functional) type-level language that is “evaluated” at compile time, and the regular Scala term-level code that is compiled into JVM bytecode.

Map function on lists

As a second example let’s implement map:

Lean (try it):

def map {A B: Type*} (f: A -> B): list A -> list B
    | [] := []
    | (x :: xs) := f x :: map xs


#reduce map (λ x, x + 1) [1, 2, 3]
-- [2, 3, 4]

-- or even better:
example: map (λ x, x + 1) [1, 2, 3] = [2, 3, 4] := rfl

Scala 3:

def map[A, B](f: A => B): List[A] => List[B] =
  case Nil => Nil
  case x :: xs => f(x) :: map(f)(xs)


map((x: Int) => x + 1)(List(1, 2, 3))
// List(2, 3, 4)

Scala 3 (type-level version):

import compiletime.ops.int._

type map[A, f[_ <: A], as <: Tuple] <: Tuple = as match
	case EmptyTuple => EmptyTuple
	case x *: xs => f[x] *: map[A, f, xs]

// example
summon[map[Int, [x <: Int] =>> x + 1, (1, 2, 3)] =:= (2, 3, 4)]

Notes:

First, observe how multiple arguments of the same type can be grouped, so

{A B: Type*}

is the same as

{A: Type*} {B: Type*} 

Second, if you look carefully at the recursive map xs invocation you’ll notice that we’re not passing f as a first argument.

In the signature

def map {A B: Type*} (f: A -> B): list A -> list B
--       \____________________/  \____/
--                   |             |
--             automatically   explicitly
--                passed          passed   

The fact that A, B and f are declared before the colon makes Lean pass all three variables implicitly when calling map recursively. So map xs is expanded as map A B f xs internally.

Third, notice the syntax for anonymous functions

λ x, x + 1

which corresponds to Scala’s

x => x + 1

and (type -level lambda)

[x <: Int] =>> x + 1

(λ can be entered as \la<TAB>)

Also noticed how Lean can infer the type of the function, whereas Scala can’t because the function is passed before the list.

A simple proof

Before wrapping up, let’s go ahead and show off the kind of results that can be proved with Lean

Lemma: Given an arbitrary function f:ABf: A \to B , for every list lst:list A\mathtt{lst}: \mathtt{list\ A} it is true that

length (map f lst)=length lst\mathtt{length}\ (\mathtt{map}\ f \ \mathtt{lst})= \mathtt{length}\ \mathtt{lst}

i.e. map does not change the length of a list

We’re going to use the lemma keyword this time:

(Try it)

lemma map_length {A B: Type*} (f: A -> B):
	
  ∀ (lst: list A), length (map f lst) = length lst
	
| [] := rfl
| (h :: t) := by simp [length, map, map_length]

There’s a lot going on here, but hopefully over the next few blog posts we’ll learn enough Lean to understand every part of it, and use it to prove similar results.

Let’s just comment briefly on some key points:

  • lemma is a keyword. It works exactly as a function definition (you can replace it with def and everything will work).
  • map_length is the name of our lemma/function. It takes three arguments: types A, B and a function f: A -> B
  • It produces a proof of the proposition:

    • ∀ (lst: list A), length (map f lst) = length lst
    • (Remember that propositions are types, that’s why this is in type position)
    • Which corresponds to the statement: “for all lists of type A, the length of the list produced by mapping f is the same as the length of the original list”
  • Since lists are either empty [] or h :: t , we can prove this result by cases:

    • The case of an empty list is trivial and can be proved by rfl.
    • The other case can be proved semi-automatically by Lean. We are telling it to use the tactic simp
  • Tactics are macros that automate a lot of the boilerplate required to prove things manually.

Stay tuned for more!

Resources