---
title: "Tensor diagrams"
description: |
  Penrose notation for deep learning.
author:
  - name: Piotr Migdał
    url: https://p.migdal.pl
    affiliation: Quantum Flytrap
    affiliation_url: https://quantumgame.io
  - name: Claudia Zendejas-Morales
    url: http://claudiazm.xyz
    affiliation: Quantum Flytrap
    affiliation_url: https://quantumflytrap.com
date: "`r Sys.Date()`"
bibliography: references.bib
output:
  distill::distill_article:
    toc: true
    toc_float: true
    self_contained: false
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
```{r echo=FALSE, message=FALSE, results='asis'}
cat(
"<script src='https://unpkg.com/[email protected]/dist/esbuild/tensor-diagrams.js'></script>",
"<link rel='stylesheet' href='tensorDiagram.css'>"
)
```
Multidimensional arrays are the basic building blocks of any deep neural network, and of many other models in statistics and machine learning.
However, even for relatively simple operations, the notation becomes monstrous:
$$
\sum_{i_1=1}^n \sum_{i_2=1}^n \sum_{i_3=1}^n \sum_{i_4=1}^n
p_{i_1} T_{i_1 i_2} O_{i_2 j_1} T_{i_2 i_3} O_{i_3 j_2} T_{i_3 i_4} O_{i_4 j_3}
$$
If you are already familiar with the problem, you may guess that it is the probability of observing the sequence $(j_1, j_2, j_3)$ in a Hidden Markov Model. If you are not, it takes time and effort to parse and comprehend the formula.
In fact, I made a typo when writing it for the first time.
This notation has several problems:

- We introduce indices just to sum over them.
- We need to manually inspect which indices repeat (and how many times).
- There is no overview of the structure.
In this expository article we introduce tensor diagram notation in the context of deep learning.
That is, we focus on arrays of real numbers and the relevant operations.
We present it as a convenient notation for matrix summation, nothing more, nothing less.
In this notation, the described equation becomes
<div id="example1" class="equation"></div>
```{js, results='asis', echo=FALSE, message=FALSE}
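// Each .addTensor(name, position, left, right[, up, down][, options]) call below adds a node
// with the given lists of indices on each side; .addContraction(a, b, index) connects the
// shared index between the a-th and b-th added tensors (0-based, in order of addition).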
TensorDiagram.new()
.addTensor("p", {x: 0, y: 0}, [], ["i1"])
.addTensor("T", {x: 1, y: 0}, ["i1"], ["tx"])
.addTensor("dot", {x: 2, y: 0}, ["tx"], ["xt"], [], ["xo"], { shape:"dot", showLabel: false })
.addTensor("T", {x: 3, y: 0}, ["xt"], ["ty"])
.addTensor("dot", {x: 4, y: 0}, ["ty"], ["yt"], [], ["yo"], { shape:"dot", showLabel: false })
.addTensor("T", {x: 5, y: 0}, ["yt"], ["tz"])
.addTensor("dot", {x: 6, y: 0}, ["tz"], [], [], ["zo"], { shape:"dot", showLabel: false })
.addTensor("O", {x: 2, y: 1}, [], [], ["xo"], ["j1"], { labelPos: "left"} )
.addTensor("O", {x: 4, y: 1}, [], [], ["yo"], ["j2"], { labelPos: "left"} )
.addTensor("O", {x: 6, y: 1}, [], [], ["zo"], ["j3"], { labelPos: "left"} )
.addContraction(0, 1, "i1")
.addContraction(1, 2, "tx")
.addContraction(2, 3, "xt")
.addContraction(2, 7, "xo")
.addContraction(3, 4, "ty")
.addContraction(4, 5, "yt")
.addContraction(4, 8, "yo")
.addContraction(5, 6, "tz")
.addContraction(6, 9, "zo")
.setSize(400, 200)
.draw("#example1");
```
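For comparison, the same contraction can be computed directly with `einsum`. Below is a minimal sketch in NumPy; the sizes `n` and `m` and the random, row-normalized matrices are purely illustrative assumptions:

```python
import numpy as np

n, m = 5, 4  # n hidden states, m observation symbols (illustrative sizes)
rng = np.random.default_rng(0)

p = rng.random(n); p /= p.sum()                        # initial distribution p_i
T = rng.random((n, n)); T /= T.sum(1, keepdims=True)   # transition matrix T_{i i'}
O = rng.random((n, m)); O /= O.sum(1, keepdims=True)   # observation matrix O_{i j}

# p_{i1} T_{i1 i2} O_{i2 j1} T_{i2 i3} O_{i3 j2} T_{i3 i4} O_{i4 j3},
# summed over i1..i4, evaluated for every observed triple (j1, j2, j3)
prob = np.einsum("a,ab,bx,bc,cy,cd,dz->xyz", p, T, O, T, O, T, O)
print(prob.shape)  # (4, 4, 4)
```

Note that `einsum` spares us the explicit sums, but still makes us invent and track index letters; the diagram does not.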
In this article we use **tensor** as in `torch.Tensor` or TensorFlow, i.e. a multidimensional array of numbers with some additional structure. This is a very specific case of what a mathematician would call a tensor.
In this context, the **tensor product** is the **outer product**.
(A note for mathematical purists and fetishists: here we work in a finite-dimensional Hilbert space over real numbers, in a fixed basis.)
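For instance, for two vectors as NumPy arrays (a quick illustration):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0, 5.0])

# tensor (outer) product: (u ⊗ v)_{ij} = u_i * v_j
uv = np.einsum("i,j->ij", u, v)
assert np.allclose(uv, np.outer(u, v))
```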
Also, in physics, tensors need to fulfill certain transformation criteria; see [II 02: Differential Calculus of Vector Fields from The Feynman Lectures on Physics](https://www.feynmanlectures.caltech.edu/):
> it is not generally true that any three numbers form a vector. It is true only if, when we rotate the coordinate system, the components of the vector transform among themselves in the correct way.
## Background
Tensor diagrams were invented by Penrose[@penrose_applications_1971].
For a first contact with tensor diagrams, I suggest staring at the beautiful diagrams in [@bradley_matrices_2019]. If you have some background in quantum mechanics, go for a short introduction[@biamonte_tensor_2017] or a slightly longer one[@bridgeman_hand-waving_2017].
For a complete introduction, I suggest the book [@coecke_picturing_2017] or the lecture notes [@biamonte_lectures_2020].
Tensor diagrams are a popular tool for quantum state decompositions in condensed matter physics[@orus_practical_2014; @verstraete_matrix_2008].
## Key parts
### Tensors
A scalar $a$, a vector $v_i$, a matrix $M_{ij}$, and a (third-order) tensor $T_{ijk}$ are represented by:
<div id="example2" class="equation"></div>
```{js, results='asis', echo=FALSE, message=FALSE}
TensorDiagram.new()
.addTensor("a", {x: 0, y: 0}, [], [])
.addTensor("v", {x: 2, y: 0}, [], ["i"])
.addTensor("M", {x: 5, y: 0}, ["i"], ["j"])
.addTensor("T", {x: 8, y: 0}, ["i"], ["k"], [], ["j"])
.setSize(580, 140)
.draw("#example2");
```
Each loose end corresponds to an index.
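In array terms, the number of loose ends is the order of the array (`ndim` in NumPy); a quick illustration:

```python
import numpy as np

a = np.array(1.0)        # scalar: 0 loose ends
v = np.zeros(3)          # vector v_i: 1 loose end
M = np.zeros((3, 4))     # matrix M_{ij}: 2 loose ends
T = np.zeros((3, 4, 5))  # third-order tensor T_{ijk}: 3 loose ends
print(a.ndim, v.ndim, M.ndim, T.ndim)  # 0 1 2 3
```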
### Basic operations
```{r child = 'basic-operations.Rmd'}
```
### Dot operator
```{r child = 'dot-operators.Rmd'}
```
### Other examples
```{r child = 'other-examples.Rmd'}
```
## Applications
### Machine learning
```{r child = 'machine-learning.Rmd'}
```
### Deep learning
```{r child = 'deep-learning.Rmd'}
```
## You may have seen tensor diagrams
Feynman diagrams[@kaiser_physics_2005] (for unbounded operators on infinite-dimensional Hilbert spaces) and quantum computing.
TODO: Do LSTM diagrams qualify?
## Conclusion
Tensor diagrams are cool!
## Footnote
Written in [R Markdown Distill](https://rstudio.github.io/distill).
Other inspirations:
- [Einsum is All you Need - Einstein Summation in Deep Learning - Tim Rocktäschel](https://rockt.github.io/2018/04/30/einsum)
- [Simple diagrams of convoluted neural networks - Piotr Migdał](https://medium.com/inbrowserai/simple-diagrams-of-convoluted-neural-networks-39c097d2925b)
- [Tensor Diagram Notation - tensornetwork.org](http://tensornetwork.org/diagrams/) for graphic style
- [Thinking in tensors, writing in PyTorch (a hands-on deep learning intro)](https://github.com/stared/thinking-in-tensors-writing-in-pytorch), especially chapter 1 (it is an eternal Work in Progress, though)
- [bra-ket-vue](https://github.com/Quantum-Game/bra-ket-vue) - a visualizer for quantum states and matrices, aware of the tensor structure
- [Tensor diagram notation in D3.js - JSFiddle](https://jsfiddle.net/stared/8huz5gy7/) - the initial version of this visualization
- [HN on Terry Tao on some desirable properties of mathematical notation](https://news.ycombinator.com/item?id=23911903) - including the comment "Is it just me, or does probability theory in general have fairly terrible notation?"
## TODO
Ideas:
- Antisymmetric tensor and cross product.
- Tensors in the sense of TensorFlow and `torch.tensor` (e.g. with einsum).