add docs for all-attention memory key/value
lucidrains committed Nov 25, 2020
1 parent 11d77c7 commit 6e66977
Showing 2 changed files with 22 additions and 1 deletion.
23 changes: 22 additions & 1 deletion README.md
@@ -180,9 +180,30 @@ model(x, mask = mask) # (1, 1024, 20000)

## Features

### Augmenting Self-attention with Persistent Memory

<img src="./images/all-attention.png"></img>

https://arxiv.org/abs/1907.01470

Proposes augmenting self-attention with persistent, learned memory keys / values that every query can attend to in addition to the input tokens. They can be added to either the encoder or the decoder.

```python
from x_transformers import Decoder, Encoder

enc = Encoder(
    dim = 512,
    depth = 6,
    heads = 8,
    attn_num_mem_kv = 16 # 16 memory key / values
)
```
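
For intuition, below is a minimal sketch of the idea (not the library's actual implementation; the `PersistentMemoryAttention` class and its internals are hypothetical): a small set of learned key / value vectors, shared across the batch, is concatenated to the per-token keys and values before the attention softmax, so every query can also attend to this persistent memory.

```python
import torch
from torch import nn

class PersistentMemoryAttention(nn.Module):
    # simplified, illustration-only attention block with persistent memory key / values
    def __init__(self, dim, heads = 8, num_mem_kv = 16):
        super().__init__()
        assert dim % heads == 0
        dim_head = dim // heads
        self.heads = heads
        self.scale = dim_head ** -0.5

        self.to_q = nn.Linear(dim, dim, bias = False)
        self.to_kv = nn.Linear(dim, dim * 2, bias = False)
        self.to_out = nn.Linear(dim, dim)

        # persistent memory key / values - learned parameters, independent of the input
        self.mem_k = nn.Parameter(torch.randn(heads, num_mem_kv, dim_head))
        self.mem_v = nn.Parameter(torch.randn(heads, num_mem_kv, dim_head))

    def forward(self, x):
        b, n, d, h = *x.shape, self.heads

        q = self.to_q(x)
        k, v = self.to_kv(x).chunk(2, dim = -1)

        # split heads: (b, n, d) -> (b, h, n, d // h)
        q, k, v = map(lambda t: t.reshape(b, n, h, -1).transpose(1, 2), (q, k, v))

        # broadcast the memory key / values over the batch and prepend them to k / v
        mem_k = self.mem_k.unsqueeze(0).expand(b, -1, -1, -1)
        mem_v = self.mem_v.unsqueeze(0).expand(b, -1, -1, -1)
        k = torch.cat((mem_k, k), dim = -2)
        v = torch.cat((mem_v, v), dim = -2)

        # every query attends to the n tokens plus the num_mem_kv memory slots
        attn = (q @ k.transpose(-1, -2) * self.scale).softmax(dim = -1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.to_out(out)

x = torch.randn(1, 1024, 512)
attn = PersistentMemoryAttention(512, heads = 8, num_mem_kv = 16)
out = attn(x) # (1, 1024, 512)
```

In the library itself, the same effect is switched on per layer with the `attn_num_mem_kv` keyword shown above.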

## Todo

To be explained and documented

- [x] memory key / values - All-attention paper
- [x] ~~memory key / values - All-attention paper~~
- [x] memory tokens - Memory Transformers
- [x] scale normalization - Transformers Without Tears
- [x] feedforward gated linear variant - Noam's GLU Variants
Binary file added images/all-attention.png
