Matthias-2015-07-27.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Style-Type" content="text/css" />
  <meta name="generator" content="pandoc" />
  <title></title>
  <style type="text/css">code{white-space: pre;}</style>
  <link rel="stylesheet" href="pandoc.css" type="text/css" />
</head>
<body>
<p>Started in the 30s with Church.</p>
<p>Gödel has a language that can prove anything.</p>
<p>Church says you can prove anything with functions.</p>
<p>Set notation was <code>^ x | x² &lt; 0</code>, Church annotated <code>'^ | x² &lt; 0</code>, which was typeset as <code>λ | x² &lt; 0</code>.</p>
<pre><code>e ⩴ x | λx.e | e e</code></pre>
<p>See things as trees:</p>
<pre><code>((λx.x)(λx.x)) = (λ.←) (λ.←)</code></pre>
<p>The names don't matter; the meaning of a variable is its binding site.</p>
<pre><code>λdavid. (david david) = λjeff. (jeff jeff)</code></pre>
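<p>One way to make &quot;the meaning of a variable is its binding site&quot; concrete is to replace every bound name by the distance to its binder and compare the results. The following is a minimal Python sketch; the tuple encoding of terms and the helper names (<code>nameless</code>, <code>alpha_equal</code>) are assumptions made for illustration, not part of the lecture.</p>

```python
# Terms as tuples (an assumed encoding):
#   ("var", x) | ("lam", x, body) | ("app", fun, arg)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def nameless(term, binders=()):
    """Replace each bound variable by the distance to its binding λ."""
    tag = term[0]
    if tag == "var":
        name = term[1]
        if name in binders:
            # index 0 is the nearest enclosing λ
            return ("bound", binders.index(name))
        return ("free", name)
    if tag == "lam":
        return ("lam", nameless(term[2], (term[1],) + binders))
    return ("app", nameless(term[1], binders), nameless(term[2], binders))

def alpha_equal(s, t):
    """Two terms are the same up to renaming iff their nameless forms match."""
    return nameless(s) == nameless(t)

david = lam("david", app(var("david"), var("david")))
jeff = lam("jeff", app(var("jeff"), var("jeff")))
print(alpha_equal(david, jeff))  # True
```

<p>The nameless form is the tree-with-back-arrows picture written down as data: the names are gone and only the pointer to the binder remains.</p>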
<p>&quot;<code>λ</code> is just 7th-grade algebra hyped up a little bit.&quot;</p>
<pre><code>f(x) ≔ x²

f(6) + 5 = 6² + 5 = 41

(λx.e)e&#39; = e[x←e&#39;] -- β

(λx.x) a = a

(λxy.xyy)ab = (λy.ayy)b = abb</code></pre>
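<p>The <code>β</code> rule and the last calculation above can be replayed mechanically. This is a sketch under assumptions: terms are encoded as tuples, and <code>free_vars</code>, <code>fresh</code>, <code>subst</code> and <code>beta</code> are hypothetical helper names. Substitution renames bound variables where needed so that free variables of the argument are not captured.</p>

```python
import itertools

def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    # Hypothetical naming scheme: v0, v1, ... skipping names already in use.
    return next(n for n in (f"v{i}" for i in itertools.count()) if n not in avoid)

def subst(e, x, s):
    """e[x←s], renaming bound variables when needed to avoid capture."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "app":
        return app(subst(e[1], x, s), subst(e[2], x, s))
    y, body = e[1], e[2]
    if y == x:
        return e                                  # x is shadowed by this λ
    if y in free_vars(s):
        z = fresh(free_vars(s) | free_vars(body) | {x})
        body, y = subst(body, y, var(z)), z       # rename the binder first
    return lam(y, subst(body, x, s))

def beta(t):
    """(λx.e) e' = e[x←e'] at the root only; None if there is no λ on the outside."""
    if t[0] == "app" and t[1][0] == "lam":
        return subst(t[1][2], t[1][1], t[2])
    return None

a, b = var("a"), var("b")
f = lam("x", lam("y", app(app(var("x"), var("y")), var("y"))))  # λxy.xyy
step1 = beta(app(f, a))          # λy.ayy
step2 = beta(app(step1, b))      # abb
print(step2 == app(app(a, b), b))  # True
```

<p>Note that <code>beta(app(app(f, a), b))</code> returns <code>None</code>: the redex <code>(λxy.xyy)a</code> is not at the root, so <code>β</code> by itself does not apply to the whole term.</p>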
<p>Lambdas are not functions! What about partial functions?</p>
<p><code>β</code> is a relation, and requires a <code>λ</code> on the outside.</p>
<pre><code>λx.λy.λz.⸤(λx.x)(zz)⸥y</code></pre>
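<p>We can't reduce this term with <code>β</code> directly: the redex (marked <code>⸤ ⸥</code>) sits under three <code>λ</code>s, not on the outside.</p>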
<p><code>β</code> is a &quot;notion of reduction&quot;.</p>
<p>We start defining <code>=⸤β⸥</code>:</p>
<pre><code>e β e&#39;
---------
e =⸤β⸥ e&#39;

e =⸤β⸥ e&#39;
---------------
λx.e =⸤β⸥ λx.e&#39;</code></pre>
<p>The template structure <code>(λx.e)e'</code> is called a &quot;redex&quot;.</p>
<p>[&quot;redex&quot; is not Latin, it just means reducible expression. &quot;contractum&quot; is Latin.]</p>
<p>We need another set of rules.</p>
<pre><code>e =⸤β⸥ e&#39;
----------------
e₀ e =⸤β⸥ e₀ e&#39;

e =⸤β⸥ e&#39;
----------------
e e₀ =⸤β⸥ e&#39; e₀</code></pre>
<p>We are creating a &quot;syntactic compatibility closure&quot;.</p>
<p>We also need the reflexive, symmetric and transitive closure of <code>=⸤β⸥</code>.</p>
<p>This gives an equivalence relation.</p>
<p>We have &quot;a system of calculating equivalences between terms&quot;.</p>
<p>Q: &quot;does something have meaning?&quot;</p>
<p>Two possible meanings:</p>
<ol style="list-style-type: decimal">
<li>&quot;Can you prove 'true = false'?&quot;, or &quot;Is everything related?&quot; You need a meta-proof that, within the system, some two terms cannot be proved equal. This is called a consistency theorem, developed by Church and Rosser (&quot;the Church-Rosser lemma&quot;). It shows that the system relates some terms, but not all terms.</li>
<li>Is there a topologically, algebraically generated space of functions generated by <code>λ</code> and satisfying <code>=⸤β⸥</code>? This was worked out by Dana Scott.</li>
</ol>
<p>&quot;lambda-calculus and denotational semantics had a terrible influence on computer science&quot; --MF</p>
<p>1958-1960: Lisp (1958) and Algol 60 (1960) were created.</p>
<p>Lisp:</p>
<ul>
<li>introduced <code>λ</code>-notation, got it wrong</li>
</ul>
<p>Algol 60:</p>
<ul>
<li>based on the substitution model of <code>λ</code>-calculus</li>
<li>call-by-name parameter passing (<code>β</code>-rule), which was very slow</li>
<li>then also introduced call-by-value</li>
<li>cbn vs cbv: &quot;one was correct, one was fast&quot;</li>
</ul>
<p>For the next 15 years, people struggled to relate call-by-name (correct) with call-by-value (fast).</p>
<p>Landin (1960s, '62, '63) invented the idea of abstract syntax. Böhm did the same thing. McCarthy tried something similar.</p>
<p>Abelson and Sussman popularized &quot;applicative order application&quot;.</p>
<p>Dana Scott assigned a mathematical meaning to <code>λ</code>-calculus:</p>
<ol style="list-style-type: decimal">
<li>Created the function space</li>
<li>Assigned a mapping from <code>λ → ⟦⟧</code></li>
</ol>
<p>MF's opinion: denotational semantics took us off track.</p>
<p>Plotkin solved all of this (1972/1974) in &quot;Call-by-name, call-by-value and the lambda calculus&quot;. It launched enough research ideas to fill 15 people's entire research lives. Read this paper; it's really good!</p>
<p>It gives an algorithm to understand what a calculus and a semantics are for a programming language (13 steps?).</p>
<p>Launched research into CPS.</p>
<ol style="list-style-type: decimal">
<li>Pick a term language, scoped</li>
<li>Pick a subset of terms, called programs, and another subset, called values (first appearance of words 'program' and 'value' in study of <code>λ</code> up to that point.)</li>
</ol>
<ul>
<li>Programs are things we don't really know what to do with immediately</li>
<li><p>Values are things &quot;you see&quot; at the end of computation. <code>λ</code> is a value.</p>
<pre><code>        .- input
 (λi.e) e&#39; ~~~~~~&gt; output
 -----
 ^ program proper</code></pre></li>
</ul>
<ol start="3" style="list-style-type: decimal">
<li><p>Define a notion of reduction: <code>β</code> and <code>β-value</code></p>
<p>βᵥ: (λx.e)v ~&gt; e[x←v]</p></li>
<li><p>Uniformly create a calculus <code>=ₓ</code> from the notions of reduction.</p>
<p><code>=ₙ</code> from β, and <code>=ᵥ</code> from βᵥ</p></li>
</ol>
<p>This gives a way of equating arbitrary program fragments.</p>
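<p>The difference between the two notions of reduction is only the side condition on the argument. A small Python sketch, assuming a tuple encoding of terms and closed arguments (so substitution needs no renaming); <code>is_value</code>, <code>beta</code> and <code>beta_v</code> are names invented for this illustration:</p>

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(e, x, s):
    """e[x←s] without renaming; safe only when s is closed (an assumption)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else lam(e[1], subst(e[2], x, s))
    return app(subst(e[1], x, s), subst(e[2], x, s))

def is_value(t):
    return t[0] == "lam"          # λ is a value; applications are not

def beta(t):
    """β: (λx.e) e' reduces to e[x←e'] for any argument e'."""
    if t[0] == "app" and t[1][0] == "lam":
        return subst(t[1][2], t[1][1], t[2])
    return None

def beta_v(t):
    """βᵥ: (λx.e) v reduces to e[x←v] only when the argument is a value."""
    if t[0] == "app" and is_value(t[2]):
        return beta(t)
    return None

identity = lam("z", var("z"))
const = lam("x", lam("y", var("y")))
arg = app(identity, identity)            # an application, not a value yet
print(beta(app(const, arg)))             # ('lam', 'y', ('var', 'y'))
print(beta_v(app(const, arg)))           # None
print(beta_v(app(const, identity)))      # ('lam', 'y', ('var', 'y'))
```

<p>With <code>βᵥ</code> the argument <code>(λz.z)(λz.z)</code> has to be reduced to a value before the outer application can fire.</p>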
<ol start="5" style="list-style-type: decimal">
<li><p>Define a semantics from <code>=ₓ</code></p>
<p>evalₓ ∈ 𝒫(Program × Value)</p>
<p>e evalₓ v 𝑖𝑓𝑓 e =ₓ v</p></li>
<li><p>Prove that <code>evalₓ</code> is a function.</p></li>
</ol>
<p>Via the Church-Rosser Lemma, <code>evalₓ</code> is a (partial) function.</p>
<pre><code>   evalₓ(e) ≔ { n          for &quot;base&quot; value 
              | &#39;closure   for λ-expression }</code></pre>
<p>You can now prove things like:</p>
<pre><code>   e (Y e) =ₙ Y e</code></pre>
<p>Computation should be directed. For now the direction is not specified, which is problematic.</p>
<ol start="7" style="list-style-type: decimal">
<li><p>Prove that <code>=ₓ</code> satisfies a &quot;standardization&quot; property:</p>
<p>𝑖𝑓 e =ₓ e' 𝑡ℎ𝑒𝑛 you can show this in an algorithmic fashion</p></li>
</ol>
<p>An algorithm means you know how to pick the next redex.</p>
<p>The algorithm is the same for CBN and CBV.</p>
<p>&quot;left-most outer-most strategy&quot;, 𝑖.𝑒. standard reduction. <code>|--&gt;ₓ</code>.</p>
<p>A strategy is a meta-function for picking a redex.</p>
<p>&quot;If all you care about is the value at the end, you can use standard reduction&quot;.</p>
<pre><code>   evalₓ(e) = v 𝑖𝑓𝑓 e |--&gt;ₓ* v</code></pre>
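<p>As a sketch of what &quot;eval via standard reduction&quot; looks like for CBN: repeatedly contract the left-most outer-most redex that is not under a <code>λ</code>, and stop at a value. The encoding and the names <code>step_n</code> and <code>eval_n</code> are assumptions; the <code>fuel</code> bound is a device of this sketch only, added because eval is a partial function and some programs never reach a value. Programs are assumed closed, so substitution without renaming suffices.</p>

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(e, x, s):
    """e[x←s] without renaming; fine for closed programs (an assumption)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else lam(e[1], subst(e[2], x, s))
    return app(subst(e[1], x, s), subst(e[2], x, s))

def step_n(t):
    """One CBN standard-reduction step, or None if there is no redex to pick."""
    if t[0] != "app":
        return None
    f, a = t[1], t[2]
    if f[0] == "lam":
        return subst(f[2], f[1], a)       # the argument is passed unevaluated
    g = step_n(f)                         # otherwise look in function position
    return None if g is None else app(g, a)

def eval_n(t, fuel=1000):
    """Return the value of t, or None if stuck or no value within `fuel` steps."""
    for _ in range(fuel):
        if t[0] == "lam":
            return t
        nxt = step_n(t)
        if nxt is None:
            return None
        t = nxt
    return t if t[0] == "lam" else None

half = lam("x", app(var("x"), var("x")))
omega = app(half, half)                          # reduces to itself forever
prog = app(lam("x", lam("y", var("y"))), omega)
print(eval_n(prog))            # ('lam', 'y', ('var', 'y'))
print(eval_n(omega, fuel=50))  # None
```

<p>The first program has a value under CBN because the diverging argument is never needed.</p>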
<p>Proof in Curry and Feys: the Curry-Feys theorem.</p>
<p>(Aside: you must give readers a guide for how to pronounce math notation! A reader should be able to read your paper aloud.)</p>
<p>We have two semantics, <code>evalₓ</code> based on standard reduction, and <code>=ₓ</code> based on equality.</p>
<p>Must prove that <code>evalₓSR</code> is the same function as <code>evalₓ=</code>.</p>
<p>CBN calculus inconsistent with CBV interpreter, CBV calculus inconsistent with CBN interpreter.</p>
<p>What do calculations on programs mean?</p>
<ol style="list-style-type: decimal">
<li>(Syntactic) because you prove Church/Rosser, you know that calculations are consistent with the &quot;fast&quot; interpreter</li>
<li>(Semantic) via a snippet from Jim Morris's (63) dissertation, which created polymorphic lambda calculus (PAL): introduce a relation known as observational equivalence.</li>
</ol>
<p><code>e ≃ e'</code> means for all ways of placing a term into a complete program (a context) called C, evalₓ(C[e]) ~ evalₓ(C[e'])</p>
<p>Two versions: <code>≃ₙ</code> and <code>≃ᵥ</code>. These are the largest possible consistent equivalence relations that let you calculate programs. Therefore they are unique (because they are the largest). They are the <em>truth</em>.</p>
<p>Every programming language has &quot;the truth&quot; (<code>≃ₙ</code>) by virtue of having an interpreter. The goal is to make the proof system (<code>=ₓ</code>) consistent with the truth.</p>
<p>MF: &quot;On the expressive power of programming languages&quot;: a previous draft attempted to prove <code>≃ᵥ ⊆ ≃ₙ</code>, which had been proved false two months earlier.</p>
<p>CBV and CBN functional programming are not related other than in the syntax of the terms. CBN is not &quot;a different strategy&quot;.</p>
<p>Laziness and CBN are related, by subset.</p>
<p>Q: what use is studying functional programming if programs aren't purely functional?</p>
<pre><code>(f (call/cc g)) ~ g(f)</code></pre>
<p>A calculus equation for a very imperative idea.</p>
<p>Technical insights: &quot;evaluation context semantics&quot; <em>use contexts instead of inference rules</em>.</p>
<pre><code>e β e&#39;
---------    &lt;-- inference rule
e =⸤β⸥ e&#39;</code></pre>
<p>&quot;Syntactic compatibility&quot;</p>
<p>&quot;left-most-outer-most&quot;</p>
<p>Contexts:</p>
<pre><code>e ⩴ x | λx.e | e e
C ⩴ □ | λx.C | C e | e C</code></pre>
<p>one-hole contexts.</p>
<p><code>C[e]</code> &quot;textually&quot; put <code>e</code> into hole.</p>
<pre><code>(λx.□)(λy.y)
      
 / \
λ  λ←←
|  | ↑
□  ⋅→→</code></pre>
<p>with contexts:</p>
<pre><code>=⸤β⸥ : e =⸤β⸥ e&#39; 𝑖𝑓𝑓  ∃ C,
     e  = C[(λx.e₀)e₁]
     e&#39; = C[e₀[x ←e₁]]</code></pre>
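<p>The context-based definition can be turned into a small program: enumerate every way of splitting a term into a one-hole context and a subterm, and contract whenever the subterm is a redex. This is a sketch under assumptions: the tuple encoding, the <code>HOLE</code> marker and the helper names are invented here, and <code>subst</code> does no renaming, so it is only correct when bound names in the function body do not clash with free names of the argument.</p>

```python
HOLE = ("hole",)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def plug(c, e):
    """C[e]: textually put e into the hole (this may capture variables)."""
    if c == HOLE:
        return e
    if c[0] == "var":
        return c
    if c[0] == "lam":
        return lam(c[1], plug(c[2], e))
    return app(plug(c[1], e), plug(c[2], e))

def decompositions(t):
    """Yield every (C, e) such that t == plug(C, e), following C ⩴ □ | λx.C | C e | e C."""
    yield HOLE, t
    if t[0] == "lam":
        for c, e in decompositions(t[2]):
            yield lam(t[1], c), e
    elif t[0] == "app":
        for c, e in decompositions(t[1]):
            yield app(c, t[2]), e
        for c, e in decompositions(t[2]):
            yield app(t[1], c), e

def subst(e, x, s):
    """Naive e[x←s]: no renaming (an assumption of this sketch)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else lam(e[1], subst(e[2], x, s))
    return app(subst(e[1], x, s), subst(e[2], x, s))

def one_step(t):
    """All e' such that t = C[(λx.e0) e1] and e' = C[e0[x←e1]] for some C."""
    out = []
    for c, e in decompositions(t):
        if e[0] == "app" and e[1][0] == "lam":
            out.append(plug(c, subst(e[1][2], e[1][1], e[2])))
    return out

zz = app(var("z"), var("z"))
inner = app(app(lam("x", var("x")), zz), var("y"))     # ((λx.x)(zz)) y
term = lam("x", lam("y", lam("z", inner)))             # λx.λy.λz.((λx.x)(zz)) y
print(one_step(term) == [lam("x", lam("y", lam("z", app(zz, var("y")))))])  # True
```

<p>The single context rule replaces the whole family of compatibility inference rules: the redex nested under three <code>λ</code>s is found by choosing <code>C = λx.λy.λz.□ y</code>.</p>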
<p>Evaluation context:</p>
<pre><code>E ⩴ □ | E e</code></pre>
<p>Thm: in E[(λx.e)e'], the term (λx.e)e' is the left-most outer-most (LMOM) redex.</p>
<p>For CBV you need:</p>
<pre><code>E ⩴ □ | v E | E e</code></pre>
<p>You could also use:</p>
<pre><code>E ⩴ □ | e E | E v</code></pre>
<p>also left-most-outer-most!</p>
<pre><code>E[(λx.e)e&#39;] |--&gt;ₙ E[e[x←e&#39;]]</code></pre>
<p>fully describes CBN standard reduction.</p>
<pre><code>E[(λx.e)v] |--&gt;ᵥ E[e[x←v]]</code></pre>
<p>fully describes CBV standard reduction.</p>
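<p>A sketch of both standard reductions driven by evaluation contexts. An evaluation context is represented here as a list of frames, outermost first: <code>("arg", e)</code> stands for <code>E e</code> and <code>("fun", v)</code> for <code>v E</code>. The encoding and the names <code>decompose</code>, <code>plug</code> and <code>step</code> are assumptions for illustration; programs are assumed closed, so substitution needs no renaming.</p>

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(e, x, s):
    """e[x←s] without renaming; fine for closed programs (an assumption)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else lam(e[1], subst(e[2], x, s))
    return app(subst(e[1], x, s), subst(e[2], x, s))

def decompose(t, cbv):
    """Split t into (E, redex), or None if t is a value or stuck."""
    frames = []
    while t[0] == "app":
        f, a = t[1], t[2]
        if f[0] != "lam":                 # E e: search the function position
            frames.append(("arg", a))
            t = f
        elif cbv and a[0] != "lam":       # v E: CBV only, search the argument
            frames.append(("fun", f))
            t = a
        else:
            return frames, t              # t is the redex in the hole
    return None

def plug(frames, e):
    """E[e]: rebuild the term around e, innermost frame first."""
    for kind, other in reversed(frames):
        e = app(e, other) if kind == "arg" else app(other, e)
    return e

def step(t, cbv):
    """One standard-reduction step: E[(λx.e) a] becomes E[e[x←a]]."""
    found = decompose(t, cbv)
    if found is None:
        return None
    frames, redex = found
    return plug(frames, subst(redex[1][2], redex[1][1], redex[2]))

half = lam("x", app(var("x"), var("x")))
omega = app(half, half)
const = lam("x", lam("y", var("y")))
prog = app(const, omega)
print(step(prog, cbv=False))         # ('lam', 'y', ('var', 'y'))
print(step(prog, cbv=True) == prog)  # True
```

<p>The same program steps to a value under the CBN contexts and steps to itself forever under the CBV contexts, because the CBV grammar sends the search into the argument <code>Ω</code>.</p>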
<p>&quot;evaluation context semantics&quot; should be called &quot;standard reduction semantics&quot;.</p>
<p>Technical Insight 2:</p>
<pre><code>E[ THING v ]</code></pre>
<p><code>THING</code> can manipulate <code>E</code>, the evaluation context.</p>
<p>From this you can do side-effects, continuations, etc.</p>
<p>𝑒.𝑔.</p>
<pre><code>E[raise e] ~&gt; raise e</code></pre>
<p>full equational system for exceptions:</p>
<pre><code>x | λx.e | ee | raise e</code></pre>
<p>calculation system:</p>
<pre><code>C[(λx.e)v]    =ₑₓ C[e[x←v]]
C[E[raise e]] =ₑₓ C[raise e]</code></pre>
<p>These two rules give you a consistent Church/Rosser system for exceptions. Same two equations work for CBN.</p>
<p>Standard reduction:</p>
<pre><code>E[(λx.e)v]     |--&gt;ₑₓ E[e[x←v]]
E[E&#39;[raise e]] |--&gt;ₑₓ E[raise e]   [ |--&gt;ₑₓ raise e , as a coincidence  ]</code></pre>
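<p>The two standard-reduction rules for exceptions can be sketched by extending a CBV evaluation-context stepper with one extra case: when the hole contains <code>raise e</code>, the whole evaluation context is thrown away. The encoding, the constructor name <code>throw</code>, and the choice of CBV contexts <code>E ⩴ □ | E e | v E</code> are assumptions of this sketch; programs are assumed closed.</p>

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)
def throw(e): return ("raise", e)

def subst(e, x, s):
    """e[x←s] without renaming; fine for closed programs (an assumption)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else lam(e[1], subst(e[2], x, s))
    if e[0] == "raise":
        return throw(subst(e[1], x, s))
    return app(subst(e[1], x, s), subst(e[2], x, s))

def plug(frames, e):
    for kind, other in reversed(frames):
        e = app(e, other) if kind == "arg" else app(other, e)
    return e

def step(t):
    """One standard step, or None if t is final (a value, a bare raise, or stuck)."""
    frames = []
    while t[0] == "app":
        f, a = t[1], t[2]
        if f[0] != "lam":
            frames.append(("arg", a))     # E e
            t = f
        elif a[0] != "lam":
            frames.append(("fun", f))     # v E
            t = a
        else:
            return plug(frames, subst(f[2], f[1], a))   # E[(λx.e) v]
    if t[0] == "raise" and frames:
        return t                          # E[raise e] steps to raise e
    return None

def run(t, fuel=100):
    """Iterate step; returns the final term, or None if fuel runs out."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    return None

ident = lam("z", var("z"))
body = app(var("f"), throw(ident))                    # f (raise (λz.z))
prog = app(app(lam("f", body), ident), lam("w", var("w")))
print(run(prog))  # ('raise', ('lam', 'z', ('var', 'z')))
```

<p>After one <code>βᵥ</code> step the term is <code>((λz.z) (raise (λz.z))) (λw.w)</code>; the raise then discards both surrounding applications in a single step.</p>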
<p>Standard reduction for assignment:</p>
<pre><code>e = x | λx.e | e e | set! x e | letrec ((x v) ..) e
v = λx.e

E = □ | E e | v E | set! x E

(βₛₑₜ):  (λx.e) v                          R  letrec ((x v)) e
(x):     letrec (.. (x v) ..) E[x]         R  letrec (.. (x v) ..) E[v]
(set!):  letrec (.. (x v) ..) E[set! x u]  R  letrec (.. (x u) ..) E[λx.x]

(scope extrusion:)
  E[letrec (...) e]  R  letrec (...) E[e]

(merge:)
  letrec (.. (x v) ..) (letrec (.. (y u) ..) e)  R  letrec (.. (x v) .. .. (y u) ..) e</code></pre>
<p>You can calculate in parallel, but standard reduction doesn't capture parallel execution.</p>
<p>Technical Insight 3:</p>
<pre><code>t:   E[(λx.e)e&#39;]   = Pₜ
t+1: E[e[x←e&#39;]]    = Pₜ₊₁</code></pre>
<p>Idea: separate <code>E</code> from the expression where the &quot;machine&quot; is looking for a redex.</p>
<p>Two register machine: control and stack registers:</p>
<pre><code>⟨e,E⟩</code></pre>
<p>Next idea: change data representation from context to stack:</p>
<pre><code>⟨e,[app₁]⟩
   [app₂]
   [app₃]
    ...</code></pre>
<p>Next idea: substitution is hard and inefficient. Make substitution lazy; reveals an explicit environment.</p>
<pre><code>control:     e
environment: ρ  mapping free-variables to values
stack:       κ  control stack</code></pre>
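<p>Putting the three registers together gives a machine in the style usually called CEK. The following is a minimal CBV sketch, not a reconstruction of the lecture's exact machine: the tuple encoding, the frame tags <code>"ar"</code> and <code>"fn"</code>, the dictionary environments, and the <code>fuel</code> bound are all assumptions made for illustration. A value is a closure: a <code>λ</code>-term paired with the environment it was created in.</p>

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def cek(term, fuel=10000):
    """Run a closed term; return (λ-term, environment) or None if fuel runs out."""
    control, env, stack = term, {}, []
    for _ in range(fuel):
        tag = control[0]
        if tag == "var":
            # Lazy substitution: look the variable up instead of rewriting terms.
            control, env = env[control[1]]
        elif tag == "app":
            # Evaluate the function position first; remember the argument.
            stack.append(("ar", control[2], env))
            control = control[1]
        elif not stack:
            return control, env           # a λ with an empty stack: final answer
        else:
            frame = stack.pop()
            if frame[0] == "ar":
                # The function is now a value; evaluate the saved argument next.
                stack.append(("fn", control, env))
                control, env = frame[1], frame[2]
            else:
                # The argument is now a value; bind it and enter the function body.
                _, fun, fun_env = frame
                control, env = fun[2], {**fun_env, fun[1]: (control, env)}
    return None

first = lam("x", lam("y", var("x")))
ia = lam("a", var("a"))
ib = lam("b", var("b"))
print(cek(app(app(first, ia), ib)))  # (('lam', 'a', ('var', 'a')), {})
```

<p>No substitution is ever performed: the environment records what each variable would have been replaced by, and the stack plays the role of the evaluation context <code>E</code>.</p>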
</body>
</html>