Surfer's blog

Let's Learn Lean : Polymorphism and Higher-Order Functions

Welcome back to the fifth instalment in our series of porting Software Foundations to Lean 4.

In the previous article, Lists, we successfully built structured data types like NatList (lists of natural numbers) and Bag (multisets). We implemented various functions like app, rev, and length specifically for them.

But there is a problem. What if we need a list of Booleans? We would have to define a new type BoolList, copy-paste app, rev, and length, and rename them to bool_app, bool_rev, etc.

This is not very optimal and violates the principle of not repeating ourself.

In this article, we are going to solve this using Polymorphism. We will also explore Higher-Order Functions—functions that treat other functions as data.

Let's dive in.

1. Polymorphism

In the last chapter, we defined NatList like this:

inductive NatList where
  | nil
  | cons (n : Nat) (l : NatList)

To make this generic, we need to replace Nat with an arbitrary type variable. In Coq, you might see Inductive list (X : Type).

In Lean 4, we define polymorphic types using Greek letters (like α, β, γ) by convention.

Note: For this chapter, instead of defining our own MyList type, we will switch to using Lean's built-in List type. This allows us to use standard notations like [] and :: out of the box.

Implicit Arguments

When we define a polymorphic function, we need to tell Lean that it takes a type α as a parameter.

If we write (α : Type), we have to provide the type explicitly every time we call the function (e.g., length Nat my_list). This is tedious.

Instead, we use curly braces {α : Type} to mark the argument as Implicit. This tells Lean: "Please figure out what α is from the context".

Let's define a polymorphic repeatN function (renamed to avoid conflict with the tactic repeat).

-- α is an implicit type parameter.
-- x has type α.
-- The result is a List containing elements of type α.
def repeatN {α : Type} (x : α) (count : Nat) : List α :=
  match count with
  | 0 => []
  | Nat.succ count' => x :: (repeatN x count')

Because α is implicit, we don't need to pass Nat or Bool when calling it. Lean infers it automatically based on the value of x.

-- Lean infers α is Nat because 4 is a Nat
#eval repeatN 4 2
-- Result: [4, 4]

-- Lean infers α is Bool because false is a Bool
#eval repeatN false 2
-- Result: [false, false]

Exercises: Polymorphic Lists

Since we are now using the built-in List, standard operators like ++ (append) are already polymorphic. Let's try to prove some properties about them.

These proofs work exactly like they did for NatList, but now they work for List α for any α.

Exercise 1: app_nil_r

Prove that appending nil ([]) to the right of any list returns the list itself.

theorem app_nil_r {α : Type} (l : List α) : l ++ [] = l := by
  /- FILL IN HERE -/ sorry

Exercise 2: app_assoc

Prove that list appending is associative.

theorem app_assoc {α : Type} (l m n : List α) : l ++ m ++ n = l ++ (m ++ n) := by
  /- FILL IN HERE -/ sorry

Exercise 3: app_length

Prove that the length of two appended lists is the sum of their lengths.

theorem app_length {α : Type} (l1 l2 : List α) :
  (l1 ++ l2).length = l1.length + l2.length := by
  /- FILL IN HERE -/ sorry

2. Polymorphic Pairs

Just like we generalized lists, we can generalize pairs.

In the previous article, we defined NatProd manually to hold two natural numbers. But what if we want a pair of a Nat and a Bool? Or a pair of a List and a Nat?

In Lean, we have the built-in generic type Prod α β. This represents a pair where the first element has type α and the second has type β .

Notation:

def myPair : Nat × Bool := (3, true)

#check myPair
-- Result: myPair : Nat × Bool

Note: It is easy to confuse (x, y) (the value) with X × Y (the type).

Combine (Zip)

A classic functional programming exercise is "zipping" two lists together. If we have a list of α and a list of β, we can combine them into a single list of pairs α × β.

We call this function combine (often called zip in other languages).

def combine {α β : Type} (l1 : List α) (l2 : List β) : List (α × β) :=
  match l1, l2 with
  | [], _ => []
  | _, [] => []
  | x :: tx, y :: ty => (x, y) :: combine tx ty

Exercise 4: Split (Unzip)

The inverse of combine is split (often called unzip). It takes a list of pairs and returns a pair of lists.

Give a try implementing it yourself!

def split {α β : Type} (l : List (α × β)) : (List α) × (List β) :=
  /- REPLACE THIS LINE WITH YOUR DEFINITION -/ sorry

example : split [(1, false), (2, false)] = ([1, 2], [false, false]) := by
  /- FILL IN HERE -/ sorry

3. Polymorphic Options

In the Lists article, we encountered a problem: what does the hd (head) function return when the list is empty?

We solved this by defining NatOption, which was either Some n or None.

Now, we can generalize this to any type. Lean provides a built-in Option type that works exactly like our NatOption, but for any α.

inductive Option (α : Type) where
  | none : Option α
  | some (val : α) : Option α

We can now rewrite the nth_error function so that it works with any type of list. It returns the n-th element of a list, or none if the index is out of bounds.

def nth_error {α : Type} (l : List α) (n : Nat) : Option α :=
  match l with
  | [] => none
  | a :: l' =>
    match n with
    | 0 => some a
    | Nat.succ n' => nth_error l' n'

Let's verify it works for both numbers and booleans:

example : nth_error [4, 5, 6, 7] 0 = some 4 := rfl
example : nth_error [true] 2 = none := rfl

Exercise 5: hd_error

Complete the definition of a polymorphic version of the hd function. Unlike the standard hd which requires a default value, this should just return Option α.

def hd_error {α : Type} (l : List α) : Option α :=
  /- REPLACE THIS LINE WITH YOUR DEFINITION -/ sorry

example : hd_error [1, 2] = some 1 := by
  /- FILL IN HERE -/ sorry

-- Note: Sometimes Lean needs help knowing what 'α' is if the list is empty.
-- We can provide it explicitly using the (name := value) syntax.
example : hd_error (α := Nat) [] = none := by
  /- FILL IN HERE -/ sorry

4. Functions as Data

In Lean, functions are "first-class citizens". This means they are treated just like any other data type (like Nat or List). You can pass functions as arguments to other functions, return them as results, and store them in data structures.

Functions that manipulate other functions are called Higher-Order Functions.

Simple Example

To warm up, let's write a function that takes a function f and a value n, and applies f to n three times.

def doit3times {α : Type} (f : α -> α) (n : α) : α :=
  f (f (f n))

def minustwo (n : Nat) : Nat := n - 2

example : doit3times minustwo 9 = 3 := rfl

Filter

A more useful higher-order function is filter. It takes a list and a predicate (a function from α to Bool). It returns a new list containing only the elements where the predicate returns true.

def filter {α : Type} (test : α -> Bool) (l : List α) : List α :=
  match l with
  | [] => []
  | h :: t =>
    if test h then h :: (filter test t)
    else filter test t

For example, if we want to filter a list to keep only even numbers, we might first define a helper function is_even:

def is_even (n : Nat) : Bool :=
  n % 2 == 0

example : filter is_even [1, 2, 3, 4] = [2, 4] := rfl

Or, if we want to keep lists that have length 1:

def length_is_1 {α : Type} (l : List α) : Bool :=
  l.length == 1

example : filter length_is_1 [[1, 2], [3], [4], [5, 6, 7], [], [8]]
  = [[3], [4], [8]] := rfl

Anonymous Functions

It is arguably a little sad to be forced to define functions like is_even and length_is_1 and give them global names, just to be able to pass them as arguments to filter. We will probably never use them again, so they just clutter up our namespace.

Fortunately, there is a better way. Lean allows us to construct functions "on the fly" using the fun x => ... syntax (similar to lambdas in Python or arrow functions in JS).

Here is the filter example, rewritten to use an anonymous function:

-- Keep only even numbers
example : filter (fun x => x % 2 == 0) [1, 2, 3, 4] = [2, 4] := rfl

-- Keep lists of length 1
example : filter (fun l => l.length == 1) [[1, 2], [3], [4], [5, 6, 7], [], [8]]
  = [[3], [4], [8]] := rfl

Exercise 6: filter_even_gt7

Use filter (instead of Fixpoint recursion) to write a function that keeps only even numbers greater than 7.

def filter_even_gt7 (l : List Nat) : List Nat :=
  /- REPLACE THIS LINE WITH YOUR DEFINITION -/ sorry

example : filter_even_gt7 [1, 2, 6, 9, 10, 3, 12, 8] = [10, 12, 8] := by
  /- FILL IN HERE -/ sorry

Exercise 7: partition

Use filter to write partition. This function takes a list and separates it into a pair of lists: the first containing elements that satisfy the test, and the second containing those that don't.

def partition {α : Type} (test : α -> Bool) (l : List α) : List α × List α :=
  /- REPLACE THIS LINE WITH YOUR DEFINITION -/ sorry

example : partition (fun x => x % 2 != 0) [1, 2, 3, 4, 5] = ([1, 3, 5], [2, 4]) := by
  /- FILL IN HERE -/ sorry

Map

map is another fundamental higher-order function. It takes a transformation function f (from α to β) and applies it to every element of a list.

def map {α β : Type} (f : α -> β) (l : List α) : List β :=
  match l with
  | [] => []
  | h :: t => (f h) :: (map f t)

example : map (fun x => x + 3) [2, 0, 2] = [5, 3, 5] := rfl

Exercise 8: map_rev

Show that map and reverse commute. You might need to prove an auxiliary lemma first!

theorem map_rev {α β : Type} (f : α -> β) (l : List α) :
  map f (l.reverse) = (map f l).reverse := by
  /- FILL IN HERE -/ sorry

Exercise 9: flat_map

flat_map is similar to map, but the function f returns a List β instead of a single β. The results are then "flattened" into a single list.

def flat_map {α β : Type} (f : α -> List β) (l : List α) : List β :=
  /- REPLACE THIS LINE WITH YOUR DEFINITION -/ sorry

example : flat_map (fun n => [n, n+1, n+2]) [1, 5, 10]
  = [1, 2, 3, 5, 6, 7, 10, 11, 12] := by
  /- FILL IN HERE -/ sorry

Fold

fold (specifically "fold right") is an even more powerful function. It is the inspiration for the "reduce" operation in MapReduce frameworks.

Intuitively, fold f [1, 2, 3, 4] b inserts the binary operator f between every pair of elements.

say, for example, Sum of list fold (+) [1, 2, 3, 4] 0 ==> 1 + (2 + (3 + (4 + 0)))

def fold {α β : Type} (f : α -> β -> β) (l : List α) (b : β) : β :=
  match l with
  | [] => b
  | h :: t => f h (fold f t b)

example : fold Nat.add [1, 2, 3, 4] 0 = 10 := rfl

Exercise 10: fold_length

Many common list functions can be implemented in terms of fold. Try to implement length using fold.

def fold_length {α : Type} (l : List α) : Nat :=
  fold (fun _ n => n + 1) l 0

theorem fold_length_correct {α : Type} (l : List α) :
  fold_length l = l.length := by
  /- FILL IN HERE -/ sorry

5. Functions That Construct Functions

Most of the higher-order functions we have talked about so far take functions as arguments. Now let's look at examples that return functions as results.

This concept is often called Currying, named after the logician Haskell Curry.

The type α -> β -> γ can be read as:

  1. A function taking two arguments (α and β) and returning γ.
    OR
  2. A function taking one argument (α) and returning a function (β -> γ).

In Lean, all multi-argument functions are technically curried by default.

Exercise 11: Currying

We can define functions to convert between these two forms. Try to implement them and prove that they are inverses of each other!

def prod_curry {α β γ : Type}
  (f : (α × β) -> γ) (x : α) (y : β) : γ := f (x, y)

def prod_uncurry {α β γ : Type}
  (f : α -> β -> γ) (p : α × β) : γ :=
  f p.1 p.2

theorem uncurry_curry {α β γ : Type}
  (f : α -> β -> γ) (x : α) (y : β) :
  prod_curry (prod_uncurry f) x y = f x y := by
  /- FILL IN HERE -/ sorry

theorem curry_uncurry {α β γ : Type}
  (f : (α × β) -> γ) (p : α × β) :
  prod_uncurry (prod_curry f) p = f p := by
  /- FILL IN HERE -/ sorry

Concluding and Moving Ahead...

So this was all about making our code reusable and powerful using Polymorphism and Higher-Order Functions.

We started by breaking free from specific types like NatList and learned to write generic code using List α. We then discovered how to manipulate data concisely using powerful tools like map, filter, and fold.

We are gonna stop here. You can find all the source code in this repo. All the exercises can be found here, and you can give them a try yourself.

In the next article, we will learn about more Tactics in Lean. See you there !

Main Reference : Software Foundations