set up read the docs for levanter (#301)
dlwh authored Sep 6, 2023
1 parent 77c8f42 commit e98c351
Showing 9 changed files with 150 additions and 44 deletions.
17 changes: 17 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,17 @@
# Read the Docs configuration file for MkDocs projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.11"
mkdocs:
  configuration: mkdocs.yml
# Optionally declare the Python requirements required to build your docs
python:
  install:
    - requirements: docs/requirements.txt
7 changes: 4 additions & 3 deletions docs/Getting-Started-TPU-VM.md
@@ -56,10 +56,11 @@ In addition to creating the instance, it will also mount the `/files/` nfs share
venv and a copy of the repo.

**Notes**:
- This uploads setup scripts via scp. If the ssh-key that you used for Google Cloud requires passphrase or your ssh key

* This uploads setup scripts via scp. If the ssh-key that you used for Google Cloud requires passphrase or your ssh key
path is not `~/.ssh/google_compute_engine`, you will need to modify the script.
- The command will spam you with a lot of output, sorry.
- If you use a preemptible instance, you probably want to use the "babysitting" script that automatically re-creates
* The command will spam you with a lot of output, sorry.
* If you use a preemptible instance, you probably want to use the "babysitting" script that automatically re-creates
the VM. That's explained down below in the "Running Levanter GPT-2" section.


16 changes: 8 additions & 8 deletions docs/Getting-Started-Training.md
@@ -17,8 +17,8 @@ To launch the training of a GPT2 model, run the following command:
python src/levanter/main/train_lm.py --config_path config/gpt2_small.yaml
```

This will execute the training pipeline pre-defined in the [train_lm.py](../src/levanter/main/train_lm.py) and set model and training configuration
set in [gpt2_small.yaml](../config/gpt2_small.yaml). You can find more template configurations in the [config](../config/) directory.
This will execute the training pipeline pre-defined in the [train_lm.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/main/train_lm.py) and set model and training configuration
set in [gpt2_small.yaml](https://github.com/stanford-crfm/levanter/tree/main/config/gpt2_small.yaml). You can find more template configurations in the [config](https://github.com/stanford-crfm/levanter/tree/main/config/) directory.

Configuration files are processed using [Pyrallis](https://github.com/dlwh/draccus). Pyrallis is yet-another yaml-to-dataclass library.
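
As a rough illustration of the yaml-to-dataclass idea, the sketch below shows how a dotted command-line key such as `trainer.steps_per_eval` could be applied to nested dataclasses. It uses only the standard library and is not the Pyrallis or draccus API; `TrainerSketch`, `ConfigSketch`, and `apply_override` are hypothetical names, and the default values are placeholders.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class TrainerSketch:
    # Hypothetical stand-in for a trainer config; the defaults are placeholders.
    num_train_steps: int = 1000
    steps_per_eval: int = 1000


@dataclass(frozen=True)
class ConfigSketch:
    # Hypothetical top-level config holding a nested trainer section.
    trainer: TrainerSketch = field(default_factory=TrainerSketch)


def apply_override(config, dotted_key, raw_value):
    """Return a copy of `config` with the field named by `dotted_key` replaced.

    The raw string is converted with the type of the current value, which is
    enough for int, float, and str fields (bool would need special handling).
    """
    head, _, rest = dotted_key.partition(".")
    current = getattr(config, head)  # raises AttributeError for unknown keys
    if rest:
        # Descend into the nested dataclass and rebuild on the way back up.
        return replace(config, **{head: apply_override(current, rest, raw_value)})
    return replace(config, **{head: type(current)(raw_value)})
```

For example, `apply_override(ConfigSketch(), "trainer.steps_per_eval", "500")` returns a new config whose `trainer.steps_per_eval` is the integer 500 while every other field keeps its default.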

@@ -45,8 +45,8 @@ This will overwrite the default model and training configurations and set the fo
that the hidden dimension must be divisible by the number of heads.
- `trainer.num_train_steps`: The number of training steps to run.
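
The divisibility constraint on the hidden dimension can be checked before launching a run. The following is a minimal sketch, assuming each attention head receives an equal slice of the hidden dimension; `head_dim` is a hypothetical helper and not part of Levanter.

```python
def head_dim(hidden_dim: int, num_heads: int) -> int:
    """Return the per-head width, or raise if the split is not exact."""
    if num_heads <= 0:
        raise ValueError("num_heads must be positive")
    if hidden_dim % num_heads != 0:
        raise ValueError(
            f"hidden_dim={hidden_dim} is not divisible by num_heads={num_heads}"
        )
    return hidden_dim // num_heads
```

With a hidden dimension of 768 and 12 heads, each head is 64 wide; a hidden dimension of 770 with 12 heads raises `ValueError`.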

You can find a complete list of parameters to change from the `TrainerConfig` in [trainer.py](src/levanter/trainer.py) and `Gpt2Config` in
[gpt2.py](src/levanter/models/gpt2.py).
You can find a complete list of parameters to change from the `TrainerConfig` in [trainer.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/trainer.py) and `Gpt2Config` in
[gpt2.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/models/gpt2.py).

### Change Checkpoint Settings
To change the frequency of saving checkpoints, you can use the following command:
@@ -59,7 +59,7 @@ python src/levanter/main/train_lm.py \
--trainer.checkpointer.save_interval 20m
```

This will overwrite the default checkpoint settings from the `TrainerConfig` and `CheckpointerConfig` in [checkpoint.py](src/levanter/checkpoint.py) to
This will overwrite the default checkpoint settings from the `TrainerConfig` and `CheckpointerConfig` in [checkpoint.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/checkpoint.py) to
save checkpoints every 20 minutes. The checkpoint will be saved to the directory `checkpoints/gpt2/${wandb_id}`
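
To make the `20m` interval concrete, here is a small sketch of how such a string could be turned into a duration. The set of accepted suffixes (`s`, `m`, `h`, `d`) is an assumption made for illustration; `parse_interval` is a hypothetical helper and may not match how Levanter parses `save_interval`.

```python
import re
from datetime import timedelta

# Assumed suffixes for this sketch only; the real parser may accept other forms.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_interval(text: str) -> timedelta:
    """Parse strings such as '20m' or '2h' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([smhd])\s*", text)
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

Under these assumptions, `parse_interval("20m")` equals `timedelta(minutes=20)`.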

Note that:
@@ -79,7 +79,7 @@ python src/levanter/main/train_lm.py \
--trainer.steps_per_eval 500
```

This will overwrite the default eval frequency (every 1,000) from the `TrainerConfig` in [config.py](src/levanter/config.py) to every 500 steps.
This will overwrite the default eval frequency (every 1,000) from the `TrainerConfig` in [config.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/config.py) to every 500 steps.

### Change Parallelism Settings
By default, Levanter will split the number of examples in `train_batch_size` equally across all available GPUs.
@@ -114,7 +114,7 @@ python src/levanter/main/train_lm.py \
--trainer.wandb.group my_new_exp_group
```

This will overwrite the default WandB configuration from the `TrainerConfig` in [config.py](src/levanter/config.py).
This will overwrite the default WandB configuration from the `TrainerConfig` in [config.py](https://github.com/stanford-crfm/levanter/tree/main/src/levanter/config.py).
We pass all of these arguments verbatim to the `wandb.init()` function.
For more information on the WandB configuration, please refer to the [WandB documentation](https://docs.wandb.ai/ref/python/init).

@@ -126,7 +126,7 @@ To do so, you can use the following command:
python src/levanter/main/train_lm.py \
--config_path config/gpt2_small.yaml \
--trainer.load_checkpoint_path checkpoints/gpt2/wandb_id \
--trainer.wandb.resume True \
--trainer.wandb.resume true \
--trainer.wandb.id asdf1234
```

4 changes: 2 additions & 2 deletions docs/Installation.md
@@ -8,8 +8,8 @@
end="<!--levanter-installation-end-->"
%}

If you're using a TPU, more complete documentation for setting that up is available [here](docs/Getting-Started-TPU-VM.md).
If you're using CUDA, more complete documentation for setting that up is available [here](docs/Getting-Started-CUDA.md).
If you're using a TPU, more complete documentation for setting that up is available [here](Getting-Started-TPU-VM.md).
If you're using CUDA, more complete documentation for setting that up is available [here](Getting-Started-CUDA.md).
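
Link fixes like the ones in this commit (a `docs/` prefix that no longer resolves once the page itself lives under `docs/`) can be caught with a small script. The sketch below uses only the standard library, scans Markdown files for inline links, and reports relative targets that do not exist on disk. `find_broken_relative_links` is a hypothetical helper, and the regular expression only handles simple `[text](target)` links.

```python
import re
from pathlib import Path

# Matches simple inline Markdown links and captures the target.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything that starts with a URL scheme (https:, mailto:, ...) is external.
SCHEME_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_relative_links(docs_dir):
    """Return (file, target) pairs for relative links that do not resolve."""
    docs_root = Path(docs_dir)
    broken = []
    for md_file in sorted(docs_root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            if SCHEME_PATTERN.match(target) or target.startswith("#"):
                continue  # skip external URLs and same-page anchors
            path_part = target.split("#", 1)[0]
            # Relative links resolve against the directory of the linking file.
            if not (md_file.parent / path_part).exists():
                broken.append((md_file.relative_to(docs_root).as_posix(), target))
    return broken
```

Running `find_broken_relative_links("docs")` from the repository root would list each offending file together with the link target that failed to resolve.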

## Setting up a development environment

2 changes: 1 addition & 1 deletion docs/Levanter-1.0-Release.md
@@ -549,7 +549,7 @@ learn differently from Transformers.

To get started, first install the appropriate version of JAX for your system. See [JAX's installation instructions](https://github.com/google/jax/blob/main/README.md#installation) as it varies from platform to platform.

If you're using a TPU, more complete documentation for setting that up is available [here](docs/Getting-Started-TPU-VM.md). GPU support is still in-progress; documentation is available [here](docs/Getting-Started-CUDA.md).
If you're using a TPU, more complete documentation for setting that up is available [here](Getting-Started-TPU-VM.md). GPU support is still in-progress; documentation is available [here](Getting-Started-CUDA.md).

Next, clone the repository and install it with pip:

25 changes: 25 additions & 0 deletions docs/css/material.css
@@ -0,0 +1,25 @@
.md-main__inner {
margin-bottom: 1.5rem;
}

/* Custom admonition: preview */
:root {
--md-admonition-icon--preview: url('data:image/svg+xml;charset=utf-8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M15.5 12a3.5 3.5 0 1 1-7 0 3.5 3.5 0 0 1 7 0Z"/><path d="M12 3.5c3.432 0 6.124 1.534 8.054 3.241 1.926 1.703 3.132 3.61 3.616 4.46a1.6 1.6 0 0 1 0 1.598c-.484.85-1.69 2.757-3.616 4.461-1.929 1.706-4.622 3.24-8.054 3.24-3.432 0-6.124-1.534-8.054-3.24C2.02 15.558.814 13.65.33 12.8a1.6 1.6 0 0 1 0-1.598c.484-.85 1.69-2.757 3.616-4.462C5.875 5.034 8.568 3.5 12 3.5ZM1.633 11.945a.115.115 0 0 0-.017.055c.001.02.006.039.017.056.441.774 1.551 2.527 3.307 4.08C6.691 17.685 9.045 19 12 19c2.955 0 5.31-1.315 7.06-2.864 1.756-1.553 2.866-3.306 3.307-4.08a.111.111 0 0 0 .017-.056.111.111 0 0 0-.017-.056c-.441-.773-1.551-2.527-3.307-4.08C17.309 6.315 14.955 5 12 5 9.045 5 6.69 6.314 4.94 7.865c-1.756 1.552-2.866 3.306-3.307 4.08Z"/></svg>');
}

.md-typeset .admonition.preview,
.md-typeset details.preview {
border-color: rgb(220, 139, 240);
}

.md-typeset .preview>.admonition-title,
.md-typeset .preview>summary {
background-color: rgba(142, 43, 155, 0.1);
}

.md-typeset .preview>.admonition-title::before,
.md-typeset .preview>summary::before {
background-color: rgb(220, 139, 240);
-webkit-mask-image: var(--md-admonition-icon--preview);
mask-image: var(--md-admonition-icon--preview);
}
26 changes: 26 additions & 0 deletions docs/css/mkdocstrings.css
@@ -0,0 +1,26 @@
/* Indentation. */
div.doc-contents:not(.first) {
padding-left: 25px;
border-left: .05rem solid var(--md-typeset-table-color);
}

/* Mark external links as such. */
a.external::after,
a.autorefs-external::after {
/* https://primer.style/octicons/arrow-up-right-24 */
mask-image: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M18.25 15.5a.75.75 0 00.75-.75v-9a.75.75 0 00-.75-.75h-9a.75.75 0 000 1.5h7.19L6.22 16.72a.75.75 0 101.06 1.06L17.5 7.56v7.19c0 .414.336.75.75.75z"></path></svg>');
content: ' ';

display: inline-block;
vertical-align: middle;
position: relative;

height: 1em;
width: 1em;
background-color: var(--md-typeset-a-color);
}

a.external:hover::after,
a.autorefs-external:hover::after {
background-color: var(--md-accent-fg-color);
}
9 changes: 9 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,9 @@
mkdocs
mkdocstrings
mkdocstrings-python
mkdocs-material
mkdocs-material-extensions
mkdocs-autorefs
mkdocs-include-markdown-plugin
mkdocs-literate-nav
mkdocs-macros-plugin
88 changes: 58 additions & 30 deletions mkdocs.yml
@@ -1,46 +1,75 @@
site_name: CRFM Levanter
site_name: Levanter
repo_url: https://github.com/stanford-crfm/levanter/
edit_uri: blob/main/docs/
theme:
name: readthedocs
name: material
highlightjs: false
features:
- content.code.copy
markdown_extensions:
- attr_list
- admonition
#- callouts
- footnotes
- codehilite
- pymdownx.details # Allowing hidden expandable regions denoted by ???
- pymdownx.magiclink
- pymdownx.superfences
- pymdownx.arithmatex: # Render LaTeX via MathJax
generic: true
- pymdownx.superfences # Seems to enable syntax highlighting when used with the Material theme.
- pymdownx.snippets: # Include one Markdown file into another
base_path: docs
- pymdownx.inlinehilite
- pymdownx.snippets:
check_paths: true
- pymdownx.superfences
- toc:
permalink: "¤"
toc_depth: "2-3"

plugins:
- search
- autorefs
- mkdocstrings:
handlers:
python:
setup_commands:
- import pytkdocs_tweaks
- pytkdocs_tweaks.main()
paths: [src]
import:
- https://docs.python.org/3/objects.inv
- https://jax.readthedocs.io/en/latest/objects.inv
- https://docs.kidger.site/equinox/objects.inv
options:
show_root_heading: true
show_signature_annotations: true
show_bases: false
show_source: false
show_root_full_path: false
show_if_no_docstring: true
members_order: source
merge_init_into_class: true
docstring_options:
ignore_init_summary: true

docstring_style: google
show_source: false
docstring_section_style: list
heading_level: 5
inherited_members: true
merge_init_into_class: true
load_external_modules: true
preload_modules: [haliax, haliax.core]
# separate_signature: true
show_root_heading: true
show_root_full_path: false
# show_signature_annotations: true
show_symbol_type_heading: false
show_symbol_type_toc: false
signature_crossrefs: true
line_length: 100
- include-markdown
extra_css:
- docstrings.css
markdown_extensions:
- pymdownx.magiclink
- pymdownx.arithmatex: # Render LaTeX via MathJax
generic: true
- pymdownx.superfences # Seems to enable syntax highlighting when used with the Material theme.
- pymdownx.details # Allowing hidden expandable regions denoted by ???
- pymdownx.snippets: # Include one Markdown file into another
base_path: docs
- admonition
- toc:
permalink: "¤" # Adds a clickable permalink to each section heading
toc_depth: 4
- css/material.css
- css/mkdocstrings.css


watch:
- src
- scripts
- infra
- config
- docs
nav:
- 'Home': 'index.md'
- 'User Guide': 'Getting-Started-Training.md'
@@ -49,6 +78,5 @@ nav:
- 'Installation.md'
- 'Getting-Started-TPU-VM.md'
- 'Getting-Started-CUDA.md'
- Technical Documentation:
- 'Overview.md'
- 'design/Data-Loader-Design.md'
- Other:
- 'Levanter-1.0-Release.md'
