From 9f128aeba65b4b70353bc33bc373e5e55f138a88 Mon Sep 17 00:00:00 2001
From: Jannis Limperg <jannis@limperg.de>
Date: Sun, 29 Sep 2024 13:24:11 +0200
Subject: [PATCH] 03: fix explanation of capture-avoiding substitution

As pointed out by Kayla Thomas [1], the paragraph on capture-avoiding
substitution used an example that didn't actually demonstrate capture-avoiding
substitution.

While I was at it, I also made the explanation a bit more explicit.

[1] https://leanprover.zulipchat.com/#narrow/stream/239415-metaprogramming-.2F-tactics/topic/.22Metaprogramming.20in.20Lean.204.22/near/473287822
---
 lean/main/03_expressions.lean | 39 +++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/lean/main/03_expressions.lean b/lean/main/03_expressions.lean
index 3a247c6..6d044c6 100644
--- a/lean/main/03_expressions.lean
+++ b/lean/main/03_expressions.lean
@@ -113,22 +113,34 @@ and a curse
 
 ## De Bruijn Indexes
 
-Consider the following lambda expression `(λ f x => f x x) (λ x y => x + y) 5`,
-we have to be very careful when we reduce this,
-because we get a clash in the variable `x`.
-
-To avoid variable name-clash carnage,
-`Expr`s use a nifty trick called __de Bruijn indexes__.
+Consider the lambda expression `(λ x : ℕ => λ y : ℕ => x + y) y`.
+When we evaluate it naively, by replacing `x` with `y` in the body of the outer lambda, we obtain `λ y : ℕ => y + y`.
+But this is incorrect: the lambda is a function with two arguments that adds one argument to the other, yet the evaluated version adds its argument to itself.
+The root of the issue is that the name `y` is used for both the variable outside the lambdas and the variable bound by the inner lambda.
+Having different variables use the same name means that when we evaluate, or β-reduce, an application, we must be careful not to confuse the different variables.
+
+To address this issue, Lean does not, in fact, refer to bound variables by name.
+Instead, it uses __de Bruijn indexes__.
 In de Bruijn indexing,
 each variable bound by a `lam` or a `forallE` is converted into a number `#n`.
-The number says how many binders up the `Expr` tree we should look
-to find the binder which binds this variable.
+The number says how many binders up the `Expr` tree we should skip
+to find the binder that binds this variable.
 So our above example would become
-(putting wildcards `_` in the type arguments for now for brevity):
-``app (app (lam `f _ (lam `x _ (app (app #1 #0) #0))) (lam `x _ (lam `y _ (app (app plus #1) #0)))) five``
-Now we don't need to rename variables when we perform β-reduction.
-We also really easily check if two `Expr`s containing bound expressions are equal.
-This is why the signature of the `bvar` case is `Nat → Expr` and not `Name → Expr`.
+(replacing inessential parts of the expression with `_` for brevity):
+```
+app (lam `x _ (lam `y _ (app (app `plus #1) #0) _) _) (fvar _)
+```
+The `fvar` represents `y` and the lambdas' variables are now represented by `#0` and `#1`.
+When we evaluate this application, we replace the bound variable belonging to ``lam `x`` (here `#1`) with the argument `fvar _`, obtaining
+```
+(lam `y _ (app (app `plus (fvar _)) #0) _)
+```
+This is pretty-printed as
+```
+λ y_1 => y + y_1
+```
+Note that Lean has automatically chosen a name `y_1` for the remaining bound variable that does not clash with the name of the `fvar` `y`.
+The chosen name is based on the name suggestion `y` contained in the `lam`.
 
 If a de Bruijn index is too large for the number of binders preceding it,
 we say it is a __loose `bvar`__;
@@ -141,6 +153,7 @@ outside of some very low-level functions,
 Lean expects that expressions do not contain any loose `bvar`s.
 Instead, whenever we would be tempted to introduce a loose `bvar`,
 we immediately convert it into an `fvar` ("free variable").
+(Hence, Lean's binder representation is "locally nameless".)
 Precisely how that works is discussed in the next chapter.
 
 If there are no loose `bvar`s in an expression, we say that the expression is __closed__.