Let's Learn Lean : Basics-I
There has been a lot of discussion around Formal Verification, and Lean has been at the forefront of all this. So instead of observing from the sidelines, let's try to learn it from the ground up.
In this series, I am porting the Software Foundations book (written originally for Coq) into Lean 4. We will build everything from scratch to understand the mathematics that powers verified software.
Why Lean 4?
Lean is an interactive theorem prover developed by Leonardo de Moura at Microsoft Research. It has become incredibly popular among mathematicians (via the Mathlib library) and the ZK community.
Unlike older provers like Coq that require "extracting" code to an intermediate language like OCaml to run it, Lean 4 is a general-purpose language. We can write our compiler in Lean, verify it in Lean, and compile it directly to a standalone C binary. This reduces the "trust surface" and offers higher performance.
Data and Functions
We are ignoring the standard libraries and building everything from scratch to understand how these tools work.
"Days of the Week"
Let's start with a very simple example. We want to tell the theorem prover that we are defining a set of data values—a type.
We use the inductive keyword. We use where to start the definition (similar to Python syntax), and list the values. Note the naming convention: Lean types and constructors use CamelCase.
inductive Day where
| monday
| tuesday
| wednesday
| thursday
| friday
| saturday
| sunday
deriving Repr
What is deriving Repr?
Lean is efficient and "lazy." It won't generate the code to convert our type into a printable string unless we explicitly ask it to derive a "Representation" (Repr). Without this, we wouldn't be able to print the days to the console later.
Writing Functions
Now that we have defined Day, we can write functions that operate on it. Let's write a function that calculates the next business day.
We use def to define functions and match to handle the logic.
def nextWorkingDay (d : Day) : Day :=
match d with
| Day.monday => Day.tuesday
| Day.tuesday => Day.wednesday
| Day.wednesday => Day.thursday
| Day.thursday => Day.friday
| Day.friday => Day.monday
| Day.saturday => Day.monday
| Day.sunday => Day.monday
Note on Namespacing:
You will notice we use Day.monday instead of just monday. In Lean, constructors are namespaced inside the type. This keeps the global namespace clean—a massive advantage in large projects.
Computation and Testing
To check our work, we use the #eval command. This calculates the result immediately inside the editor.
#eval nextWorkingDay Day.friday
-- Result: Day.monday
#eval nextWorkingDay (nextWorkingDay Day.saturday)
-- Result: Day.tuesday
The First Proof (Assertions)
Testing values is good, but proving them is better.
We can define a test case using the example keyword. Instead of writing a complex proof script, we can simply provide the proof term rfl (short for reflexivity) inside a by block.
example : nextWorkingDay (nextWorkingDay Day.saturday) = Day.tuesday := by
rfl
The rfl tactic computes the values on both sides of the = sign and checks if they are definitionally equal. If they compute to the same thing, the proof passes.
Booleans
Now, let's implement Booleans from scratch.
Note: Because Bool is built into Lean's standard library, we will wrap our work in a namespace to avoid conflicts.
namespace MySpace
inductive Bool where
| true
| false
deriving Repr
Boolean Functions
We can define functions over booleans just like we did for days. Here are the definitions for negation (negb), logical AND (andb), and logical OR (orb).
def negb (b : Bool) : Bool :=
match b with
| Bool.true => Bool.false
| Bool.false => Bool.true
def andb (b1 : Bool) (b2 : Bool) : Bool :=
match b1 with
| Bool.true => b2
| Bool.false => Bool.false
def orb (b1 : Bool) (b2 : Bool) : Bool :=
match b1 with
| Bool.true => Bool.true
| Bool.false => b2
We can verify these with truth tables using example:
example : orb Bool.true Bool.false = Bool.true := by rfl
example : orb Bool.false Bool.false = Bool.false := by rfl
Notation
Writing orb x y is tedious. We prefer standard infix notation like x || y. Lean allows us to define this with specific precedence levels.
infixl:65 " && " => andb
infixl:60 " || " => orb
infixlmeans "infix left-associative".- The numbers (e.g.,
65) represent binding power (precedence). andb (65) binds tighter than orb (60).
Now we can write readable proofs:
example : (Bool.true || (Bool.true || Bool.true)) = Bool.true := by rfl
Conditional Expressions (if is tricky)
In most languages, if is a primitive. In Lean, if is tied strictly to logic.
A standard if b then x else y statement in Lean expects b to be a Proposition that is Decidable. Our custom Bool is just a data type; Lean doesn't automatically know how to treat it as a logical condition.
To use if with our custom type, we would need to derive DecidableEq and check equality explicitly:
inductive Bool' where
| true
| false
deriving Repr, DecidableEq -- Allows us to check equality (x = y)
def negb' (b : Bool') : Bool' :=
if b = Bool'.true then Bool'.false else Bool'.true
For the rest of this series, we will mostly stick to Pattern Matching (match), as it is more fundamental.
Types and Functions
Every expression in Lean has a type. We can inspect them using the #check command.
#check Bool.true
-- Result: Bool.true : Bool
#check negb
-- Result: negb : Bool → Bool
The type Bool → Bool is pronounced "Bool arrow Bool." It means: "Give me a Bool, and I will return a Bool." Similarly, the type of andb, written bool → bool → bool, can be read, "Given two inputs, each of type bool, this function produces an output of type bool."
Wrapper Types (New Types from Old)
Let's define a Color type that can be Black, White, or a Primary color (Red, Green, Blue).
Notice the syntax primary (p : Rgb). This means the primary constructor requires an argument.
inductive Rgb where
| red
| green
| blue
deriving Repr
inductive Color where
| black
| white
| primary (p : Rgb) -- Wraps the Rgb type
deriving Repr
When we pattern match, we can extract this data:
def monochrome (c : Color) : Bool :=
match c with
| Color.black => Bool.true
| Color.white => Bool.true
| Color.primary _ => Bool.false -- The '_' is a wildcard (we don't care which RGB it is)
Tuples
We can create types that hold multiple values by defining a single constructor with multiple parameters. This is effectively a "Struct."
namespace TuplePlayground
inductive Bit where
| B1
| B0
deriving Repr
inductive Nybble where
| bits (b0 b1 b2 b3 : Bit)
deriving Repr
-- Creating a Nybble
#check (Nybble.bits Bit.B1 Bit.B0 Bit.B1 Bit.B0)
-- Pattern matching to read the bits
def allZero (nb : Nybble) : Bool :=
match nb with
| Nybble.bits Bit.B0 Bit.B0 Bit.B0 Bit.B0 => Bool.true
| Nybble.bits _ _ _ _ => Bool.false
end TuplePlayground
Tip: In production Lean code, we often use the
structurekeyword for types that have only one constructor. However, for learning the foundations, usinginductivehelps us understand that a struct is just a data type with a single shape.
Numbers (Peano Arithmetic)
We wrap this section in a namespace so that our own definition of natural numbers does not interfere with the one from the standard library.
namespace NatPlayground
All the types we have defined so far—Enumerated types like Day and Tuples like Nybble—are finite. The natural numbers, on the other hand, are an infinite set. To represent them, we need a richer form of type declaration.
There are many representations of numbers to choose from. You are familiar with decimal notation (base 10), binary (base 2), or hexadecimal (base 16). The binary representation is valuable in computer hardware because digits map to voltage levels. However, here we choose a representation that makes proofs simpler.
We use Unary representation (base 1), where only a single digit is used—like making scratches on a cave wall. To represent unary numbers with a datatype, we use two constructors:
O: Represents Zero.S: Represents the Successor of a number (n + 1).
inductive Nat where
| O
| S (n : Nat)
deriving Repr
With this definition:
0isO.1isS O.2isS (S O).
The Logic:
The definition says precisely how expressions in the set Nat can be built:
Obelongs to the setNat.- If
nbelongs toNat, thenS nalso belongs toNat.
Crucially, the constructor S has the type Nat → Nat, just like functions. However, S does not "do" anything. It doesn't perform a calculation. It is simply a way of writing down numbers. Just like the digit "1" in "100" is data, S is data.
Predecessor Function
We can write functions that "peel off" these wrappers. For example, the predecessor function:
def pred (n : Nat) : Nat :=
match n with
| Nat.O => Nat.O
| Nat.S n' => n'
The second branch reads: "If n has the form S n', return n'."
Recursion
For most interesting computations, simple pattern matching isn't enough; we need recursion. For example, to check if n is even, we need to check if n-2 is even.
In Coq, you must use the keyword Fixpoint to define recursive functions.
In Lean, we use standard def. Lean is smart enough to analyze the function and prove it terminates (because the argument n' is smaller than n).
def even (n : Nat) : Bool :=
match n with
| Nat.O => Bool.true -- 0 is even
| Nat.S Nat.O => Bool.false -- 1 is odd
| Nat.S (Nat.S n') => even n' -- Recurse
Recursive Addition
Addition is defined by "peeling off" wrappers from one number and putting them on the other.
def plus (n m : Nat) : Nat :=
match n with
| Nat.O => m
| Nat.S n' => Nat.S (plus n' m)
Mental Model:
If we calculate plus 3 2, the simplification steps look like this:
plus (S (S (S O))) (S (S O))
==> S (plus (S (S O)) (S (S O)))
==> S (S (plus (S O) (S (S O))))
==> S (S (S (plus O (S (S O)))))
==> S (S (S (S (S O)))) -- which is 5
Multiplication and Subtraction
Multiplication is just repeated addition.
def mult (n m : Nat) : Nat :=
match n with
| Nat.O => Nat.O
| Nat.S n' => plus m (mult n' m)
def minus (n m : Nat) : Nat :=
match n, m with
| Nat.O, _ => Nat.O
| Nat.S n', Nat.O => n
| Nat.S n', Nat.S m' => minus n' m'
infixl:65 " + " => plus
infixl:65 " - " => minus
infixl:70 " * " => mult
#eval (6 + 3 : _root_.Nat)
Here, we define eqb (equality boolean).
def eqb (n m : Nat) : Bool :=
match n, m with
| Nat.O, Nat.O => Bool.true
| Nat.O, Nat.S _ => Bool.false
| Nat.S _, Nat.O => Bool.false
| Nat.S n', Nat.S m' => eqb n' m'
infix:50 " =? " => eqb
-- Less than or Equal function
def leb (n m : Nat) : Bool :=
match n with
| Nat.O => Bool.true
| Nat.S n' =>
match m with
| Nat.O => Bool.false
| Nat.S m' => leb n' m'
infix:70 " <=? " => leb
While we can define this, we usually don't have to. Lean has a built-in typeclass called BEq (Boolean Equality). Nat already has an instance of this, so we can just write if n == m then .... We defined it here just for the sake of knowing.
Exercises
1. Factorial
Recall the mathematical definition: factorial(0) = 1 and factorial(n) = n * factorial(n-1). Translate this into Lean.
def factorial (n : Nat) : Nat :=
-- REPLACE THIS WITH YOUR DEFINITION
sorry
example : factorial (Nat.S (Nat.S (Nat.S Nat.O))) = (Nat.S (Nat.S (Nat.S (Nat.S (Nat.S (Nat.S Nat.O)))))) := by sorry
2. Less Than (ltb)
Fill in the definition of ltb (less-than boolean).
def ltb (n m : Nat) : Bool :=
-- REPLACE THIS WITH YOUR DEFINITION
sorry
infix:70 " <? " => ltb
example : (ltb (Nat.S (Nat.S Nat.O)) (Nat.S (Nat.S Nat.O))) = Bool.false := by sorry
example : (ltb (Nat.S (Nat.S Nat.O)) (Nat.S (Nat.S (Nat.S (Nat.S Nat.O))))) = Bool.true := by sorry
end NatPlayground
Concluding and moving forward...
With all these definitions, we will be concluding the first blog in this series. Basics-II will be continuation of the first part where we will cover tactics for proofs : Simplification, Rewriting, and Case Analysis. We will also look at a more efficient representation of numbers: Binary Numerals.
You can find all the source code in this repo. The repo provides you with the Lean code, and you can follow the Basics.v file for the Coq code. For all the exercises mentioned in this article, the solutions can be found in the equivalent .v files, and you can write them in the Lean file yourself.
Main Reference : Software Foundations