Lean for Scala programmers - Part 1

February 28, 2021

First impressions of Lean

Over the next few blog posts I’m going to use Lean and Scala to explore the following topics:

What is a proof assistant, and how is it different from a regular programming language?
What are dependent types?
What is a formal proof?
How does the Curry-Howard isomorphism work in practice?

First impressions of Lean

Lean is a theorem prover and programming language. It is based on the calculus of constructions with inductive types.

The Lean project was created around 2013 by Leonardo de Moura (at Microsoft Research).

In my opinion, the “killer app” for Lean at the moment is mathlib. It is perhaps the largest single repository of formalized mathematics out there.

This fantastic article on quanta magazine does a great job describing some of the momentum around Lean and mathlib.

We’ll use Lean 4, even though it’s still not officially released yet. If you want to follow along you can follow installation instructions here.

Coincidentally, Scala 3 made dramatic simplifications to some form of dependently typed programming. It is my opinion that observing dependently typed constructions in a “native” environment such as Lean can help to understand and demystify some features of Scala’s type system.

Audience: Scala intermediate/advanced.
Scala version used: 3.1.2
Lean 4 version: nightly-2022-08-09
Latest revision: Sep 4, 2022

Let’s start by disecting a simple example.

1. Calculating the length of a list

Lean 4

def length (ls: List A): Nat := 
  match ls with
    | []     =>  0
    | _ :: t =>  1 + length t 

#eval length ["a", "b"]
-- 2

#eval length [Bool, Nat -> Nat, String × String, List Nat]
-- 4

Scala 3

def length[A](ls: List[A]): Int =
  ls match
    case Nil    => 0
    case _ :: t => 1 + length(t)

// example
length(List("a", "b")) == 2

We’ll use Tuple types in place of type level lists:

Scala 3 (type level version)

import compiletime.ops.int._

type Length[T <: Tuple] <: Int = T match
  case EmptyTuple => 0
  case _ *: t     => 1 + Length[t]

// examples
summon[ Length[("a", "b")] =:= 2 ]

summon[ Length[(Boolean, Int => Int, (String, String), List[Int])] =:= 4 ]

A few things that need discussion are:

Type inference

Often times type variables such as A here can be inferred correctly by Lean.

Unified syntax

This example shows one of the biggest differences between dependently type languages such as Lean and “normal” programming languages like Scala:

Lean has a unified syntax for both type-checking and evaluation of expressions. Expressions can be compiled (into low level C code) and executed as a normal program.

Scala in contrast has basically two languages combined into one: the (pure, functional) type-level language that is “evaluated” at compile time, and the regular Scala term-level code that is compiled into JVM bytecode.

Lean commands

We’ll be using Lean mostly in interactive mode. For that purpose we’ll be using two commands:


`#check`	Prints the type of the expression
`#eval`	Evaluates the expression and prints the result to the console (or VSCode).

Unicode symbols

Unicode symbols are used a lot in Lean, and many objects have a corresponding unicode alias. For example × is an alias for Prod (the product of two types, and it is entered as \x in vscode).

2. Propositions

The expression

length ["a", "b"] = 2

Is a proposition (of type Prop).

Propositions are types (of a special kind) on their own, so they can be used in type position:

def myExample: length ["a", "b"] = 2 := sorry

sorry is the same as Scala’s ???

In many cases we want to create a proof without even naming it, so we can use the following syntax:

example: length ["a", "b"] = 2 := sorry

There are many ways to create propositions; a common one is using the equality operator (=).

Ok, so how do we create a value of this type?

Equality

example: length ["a", "b"] = 2 := rfl

The rfl function constructs propositions of the form a = a, i.e. propositions where the left-hand side is exactly the same as the right-hand side; “exactly” here means: up to evaluation. Lean evaluates length ["a", "b"] to obtain the equivalent proposition

2 = 2

On the other hand the proposition 2 = 3 is false, because you can’t create (by design) a value of this type. i.e. this type is uninhabited.

This is the basis of the Curry-Howard isomorphism, that represents propositions as types and proofs as programs.

3. Map function on lists

As a second example let’s implement map. This time instead of explicit match we’re going to construct a function by cases (this is just syntactic sugar for match a with):

Lean 4

def map (f: A -> B): List A -> List B
  | []      =>  []
  | a :: as =>  f a :: map f as

example: map (fun x => x + 1) [1, 2, 3] = [2, 3, 4] := rfl

example: map Option [Nat, String] = [Option Nat, Option String] := rfl

Scala 3

def map[A, B](f: A => B): List[A] => List[B] =
  case Nil     => Nil
  case a :: as => f(a) :: map(f)(as)

map((x: Int) => x + 1)(List(1, 2, 3))
// List(2, 3, 4)

Scala 3 (type level version)

import compiletime.ops.int._

type Map[F[_], As <: Tuple] <: Tuple = As match
  case EmptyTuple => EmptyTuple
  case a *: as    => F[a] *: Map[F, as]

// example
summon[ Map[[x <: Int] =>> x + 1, (1, 2, 3)] =:= (2, 3, 4) ]    

summon[ Map[Option, (Int, String)] =:= (Option[Int], Option[String]) ]

Note that Scala has a special syntax for type level anonymous functions:

// value level function
(x: Int) => x + 1

// type level function
[X <: Int] =>> X + 1

whereas in Lean the same syntax works for both cases:

fun (x: Int) => x + 1

4. A simple proof

Before wrapping up, let’s go ahead and show off the kind of results that can be proved with Lean

Lemma: Given an arbitrary function $f: A \to B$ , for every list $\mathtt{lst}: \mathtt{list\ A}$ it is true that
$\mathtt{length}\ (\mathtt{map}\ f \ \mathtt{lst})= \mathtt{length}\ \mathtt{lst}$
i.e. map does not change the length of a list

def map_length (f: A -> B): (lst: List A) -> length (map f lst) = length lst
  | []     => rfl
  | _ :: t => by simp [length, map_length]

Let’s just comment briefly on some key points:

Given a function f and a list lst, this function will create a proof of the proposition
- length (map f lst) = length lst
Which corresponds to the statement: “for all functions f: A -> B and lists of type A, the length of the list produced by mapping f is the same as the length of the original list”
Since lists are either empty [] or h :: t , we can prove this result by cases:
- The case of an empty list is trivial and can be proved by rfl.
- The other case can be proved semi-automatically by Lean. We are telling it to use the tactic simp
Tactics are macros that automate a lot of the boilerplate required to prove things manually.

It is customary to use the keywords theorem and ∀..., (forall) when proving things:

theorem map_length (f: A -> B): ∀ (lst: List A), length (map f lst) = length lst
  | []     => rfl
  | _ :: t => by simp [length, map_length]

Continue in part 2 with Inductive types, Structures and Algebraic Data Types.