---
title: "Tensor diagrams"
description: |
  Penrose notation for deep learning.
author:
  - name: Piotr Migdał
    url: https://p.migdal.pl
    affiliation: Quantum Flytrap
    affiliation_url: https://quantumgame.io
  - name: Claudia Zendejas-Morales
    url: http://claudiazm.xyz
    affiliation: Quantum Flytrap
    affiliation_url: https://quantumflytrap.com
date: "`r Sys.Date()`"
bibliography: references.bib
output:
  distill::distill_article:
    toc: true
    toc_float: true
    self_contained: false
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
```{r echo=FALSE, message=FALSE, results='asis'}
cat(
"<script src='https://unpkg.com/[email protected]/dist/esbuild/tensor-diagrams.js'></script>",
"<link rel='stylesheet' href='tensorDiagram.css'>"
)
```
Multidimensional arrays are the basic building blocks of any deep neural network, and of many other models in statistics and machine learning.
However, even for relatively simple operations, the notation becomes monstrous:
$$
\sum_{i_1=1}^n \sum_{i_2=1}^n \sum_{i_3=1}^n \sum_{i_4=1}^n
p_{i_1} T_{i_1 i_2} O_{i_2 j_1} T_{i_2 i_3} O_{i_3 j_2} T_{i_3 i_4} O_{i_4 j_3}
$$
If you are already familiar with the problem, you may guess that it is the probability of observing the sequence $(j_1, j_2, j_3)$ in a Hidden Markov Model. If you are not, it takes time and effort to parse and comprehend the formula.
In fact, I made a typo when writing it for the first time.
This notation has several problems:

- We introduce indices just to sum over them.
- We need to manually inspect which indices repeat (and how many times).
- There is no overview of the structure.
In this expository article we introduce tensor diagram notation in the context of deep learning.
That is, we focus on arrays of real numbers and the relevant operations.
We present it as a convenient notation for matrix summation, nothing more, nothing less.
In this notation, the described equation becomes
<div id="example1" class="equation"></div>
```{js, results='asis', echo=FALSE, message=FALSE}
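// Each .addTensor(name, position, left, right[, up, down][, options]) call below adds a node
// with the given lists of indices on each side; .addContraction(a, b, index) connects the
// shared index between the a-th and b-th added tensors (0-based, in order of addition).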
TensorDiagram.new()
.addTensor("p", {x: 0, y: 0}, [], ["i1"])
.addTensor("T", {x: 1, y: 0}, ["i1"], ["tx"])
.addTensor("dot", {x: 2, y: 0}, ["tx"], ["xt"], [], ["xo"], { shape:"dot", showLabel: false })
.addTensor("T", {x: 3, y: 0}, ["xt"], ["ty"])
.addTensor("dot", {x: 4, y: 0}, ["ty"], ["yt"], [], ["yo"], { shape:"dot", showLabel: false })
.addTensor("T", {x: 5, y: 0}, ["yt"], ["tz"])
.addTensor("dot", {x: 6, y: 0}, ["tz"], [], [], ["zo"], { shape:"dot", showLabel: false })
.addTensor("O", {x: 2, y: 1}, [], [], ["xo"], ["j1"], { labelPos: "left"} )
.addTensor("O", {x: 4, y: 1}, [], [], ["yo"], ["j2"], { labelPos: "left"} )
.addTensor("O", {x: 6, y: 1}, [], [], ["zo"], ["j3"], { labelPos: "left"} )
.addContraction(0, 1, "i1")
.addContraction(1, 2, "tx")
.addContraction(2, 3, "xt")
.addContraction(2, 7, "xo")
.addContraction(3, 4, "ty")
.addContraction(4, 5, "yt")
.addContraction(4, 8, "yo")
.addContraction(5, 6, "tz")
.addContraction(6, 9, "zo")
.setSize(400, 200)
.draw("#example1");
```
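For comparison, the same contraction can be computed directly with `einsum`. Below is a minimal sketch in NumPy; the sizes `n` and `m` and the random, row-normalized matrices are purely illustrative assumptions:

```python
import numpy as np

n, m = 5, 4  # n hidden states, m observation symbols (illustrative sizes)
rng = np.random.default_rng(0)

p = rng.random(n); p /= p.sum()                        # initial distribution p_i
T = rng.random((n, n)); T /= T.sum(1, keepdims=True)   # transition matrix T_{i i'}
O = rng.random((n, m)); O /= O.sum(1, keepdims=True)   # observation matrix O_{i j}

# p_{i1} T_{i1 i2} O_{i2 j1} T_{i2 i3} O_{i3 j2} T_{i3 i4} O_{i4 j3},
# summed over i1..i4, evaluated for every observed triple (j1, j2, j3)
prob = np.einsum("a,ab,bx,bc,cy,cd,dz->xyz", p, T, O, T, O, T, O)
print(prob.shape)  # (4, 4, 4)
```

Note that `einsum` spares us the explicit sums, but still makes us invent and track index letters; the diagram does not.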
In this article we use **tensor** as in `torch.Tensor` or TensorFlow, i.e. a multidimensional array of numbers with some additional structure. This is a very specific case of what a mathematician would call a tensor.
In this context, the **tensor product** is the **outer product**.
(A note for mathematical purists and fetishists: here we work in a finite-dimensional Hilbert space over real numbers, in a fixed basis.)
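For instance, for two vectors as NumPy arrays (a quick illustration):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0, 5.0])

# tensor (outer) product: (u ⊗ v)_{ij} = u_i * v_j
uv = np.einsum("i,j->ij", u, v)
assert np.allclose(uv, np.outer(u, v))
```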
Also, in physics, tensors need to fulfill certain transformation criteria; see [II 02: Differential Calculus of Vector Fields from The Feynman Lectures on Physics](https://www.feynmanlectures.caltech.edu/):
> it is not generally true that any three numbers form a vector. It is true only if, when we rotate the coordinate system, the components of the vector transform among themselves in the correct way.
## Background
Tensor diagrams were invented by Penrose[@penrose_applications_1971].
For a first contact with tensor diagrams, I suggest staring at the beautiful diagrams in [@bradley_matrices_2019]. If you have some background in quantum mechanics, go for a short introduction[@biamonte_tensor_2017] or a slightly longer one[@bridgeman_hand-waving_2017].
For a complete introduction, I suggest the book [@coecke_picturing_2017] or the lecture notes [@biamonte_lectures_2020].
Tensor diagrams are a popular tool for quantum state decompositions in condensed matter physics[@orus_practical_2014; @verstraete_matrix_2008].
## Key parts
### Tensors
A scalar $a$, a vector $v_i$, a matrix $M_{ij}$, and a (third-order) tensor $T_{ijk}$ are represented by:
<div id="example2" class="equation"></div>
```{js, results='asis', echo=FALSE, message=FALSE}
TensorDiagram.new()
.addTensor("a", {x: 0, y: 0}, [], [])
.addTensor("v", {x: 2, y: 0}, [], ["i"])
.addTensor("M", {x: 5, y: 0}, ["i"], ["j"])
.addTensor("T", {x: 8, y: 0}, ["i"], ["k"], [], ["j"])
.setSize(580, 140)
.draw("#example2");
```
Each loose end corresponds to an index.
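In array terms, the number of loose ends is the order of the array (`ndim` in NumPy); a quick illustration:

```python
import numpy as np

a = np.array(1.0)        # scalar: 0 loose ends
v = np.zeros(3)          # vector v_i: 1 loose end
M = np.zeros((3, 4))     # matrix M_{ij}: 2 loose ends
T = np.zeros((3, 4, 5))  # third-order tensor T_{ijk}: 3 loose ends
print(a.ndim, v.ndim, M.ndim, T.ndim)  # 0 1 2 3
```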
### Basic operations
```{r child = 'basic-operations.Rmd'}
```
### Dot operator
```{r child = 'dot-operators.Rmd'}
```
### Other examples
```{r child = 'other-examples.Rmd'}
```
## Applications
### Machine learning
```{r child = 'machine-learning.Rmd'}
```
### Deep learning
```{r child = 'deep-learning.Rmd'}
```
## You may have seen tensor diagrams
Feynman diagrams[@kaiser_physics_2005] (for unbounded operators on infinite-dimensional Hilbert spaces) and quantum computing.
TODO: Do LSTM diagrams qualify?
## Conclusion
Tensor diagrams are cool!
## Footnote
Written in [R Markdown Distill](https://rstudio.github.io/distill).
Other inspirations:
- [Einsum is All you Need - Einstein Summation in Deep Learning - Tim Rocktäschel](https://rockt.github.io/2018/04/30/einsum)
- [Simple diagrams of convoluted neural networks - Piotr Migdał](https://medium.com/inbrowserai/simple-diagrams-of-convoluted-neural-networks-39c097d2925b)
- [Tensor Diagram Notation - tensornetwork.org](http://tensornetwork.org/diagrams/) for graphic style
- [Thinking in tensors, writing in PyTorch (a hands-on deep learning intro)](https://github.com/stared/thinking-in-tensors-writing-in-pytorch), especially chapter 1 (it is an eternal Work in Progress, though)
- [bra-ket-vue](https://github.com/Quantum-Game/bra-ket-vue) - a visualizer for quantum states and matrices, aware of the tensor structure
- [Tensor diagram notation in D3.js - JSFiddle](https://jsfiddle.net/stared/8huz5gy7/) - the initial version of this visualization
- [HN on Terry Tao on some desirable properties of mathematical notation](https://news.ycombinator.com/item?id=23911903) - including the comment "Is it just me, or does probability theory in general have fairly terrible notation?"
## TODO
Ideas:
- Antisymmetric tensor and cross product.
- Tensors in the sense of TensorFlow and `torch.tensor` (e.g. with einsum).