Improve links in docs (#124)
alan-cooney authored Nov 30, 2023
1 parent d5c66f8 commit ee49ed8
Showing 7 changed files with 171 additions and 132 deletions.
.vscode/cspell.json (2 additions, 0 deletions)

@@ -33,6 +33,7 @@
"gelu",
"githistory",
"hobbhahn",
"htmlproofer",
"hyperband",
"hyperparameters",
"imageuri",
@@ -53,6 +54,7 @@
"miniter",
"mkdocs",
"mkdocstrings",
"mknotebooks",
"monosemantic",
"monosemanticity",
"multipled",
docs/content/index.md (21 additions, 14 deletions)

@@ -11,21 +11,30 @@ A sparse autoencoder for mechanistic interpretability research.
pip install sparse_autoencoder
```

+ ## Quick Start
+
+ Check out the [demo notebook](demo) for a guide to using this library.
+
+ We also highly recommend skimming the reference docs to see all the features that are available.
+
## Features

This library contains:

1. **A sparse autoencoder model**, along with all the underlying PyTorch components you need to
customise and/or build your own:
- - Encoder, constrained unit norm decoder and tied bias PyTorch modules in `autoencoder`.
- - L1 and L2 loss modules in `loss`.
- - Adam module with helper method to reset state in `optimizer`.
+ - Encoder, constrained unit norm decoder and tied bias PyTorch modules in
+   [sparse_autoencoder.autoencoder][].
+ - L1 and L2 loss modules in [sparse_autoencoder.loss][].
+ - Adam module with helper method to reset state in [sparse_autoencoder.optimizer][].
2. **Activations data generator** using TransformerLens, with the underlying steps in case you
want to customise the approach:
- - Activation store options (in-memory or on disk) in `activation_store`.
- - Hook to get the activations from TransformerLens in an efficient way in `source_model`.
- - Source dataset (i.e. prompts to generate these activations) utils in `source_data`, that
-   stream data from HuggingFace and pre-process (tokenize & shuffle).
+ - Activation store options (in-memory or on disk) in [sparse_autoencoder.activation_store][].
+ - Hook to get the activations from TransformerLens in an efficient way in
+   [sparse_autoencoder.source_model][].
+ - Source dataset (i.e. prompts to generate these activations) utils in
+   [sparse_autoencoder.source_data][], that stream data from HuggingFace and pre-process
+   (tokenize & shuffle).
3. **Activation resampler** to help reduce the number of dead neurons.
4. **Metrics** that log at various stages of training (e.g. during training, resampling and
validation), and integrate with wandb.
@@ -38,10 +47,8 @@ The library is designed to be modular. By default it takes the approach from [To
Monosemanticity: Decomposing Language Models With Dictionary Learning
](https://transformer-circuits.pub/2023/monosemantic-features/index.html), so you can pip install
the library and get started quickly. Then when you need to customise something, you can just extend
- the abstract class for that component (e.g. you can extend `AbstractEncoder` if you want to
- customise the encoder layer, and then easily drop it in the standard `SparseAutoencoder` model to
- keep everything else as is. Every component is fully documented, so it's nice and easy to do this.
-
- ## Demo
-
- Check out the [demo notebook](demo) for a guide to using this library.
+ the abstract class for that component (e.g. you can extend
+ [`AbstractEncoder`][sparse_autoencoder.autoencoder.components.abstract_encoder] if you want to
+ customise the encoder layer, and then easily drop it in the standard
+ [`SparseAutoencoder`][sparse_autoencoder.autoencoder.model] model to keep everything else as is.
+ Every component is fully documented, so it's nice and easy to do this.
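
To make the modular design concrete, here is a minimal plain-PyTorch sketch of the architecture these components describe: an encoder, a tied pre-encoder bias, a decoder constrained to unit-norm dictionary directions, and an L2 + L1 loss. It deliberately does not use the library's own classes, and every name in it is illustrative; see the reference docs for the real `AbstractEncoder` and `SparseAutoencoder` interfaces.

```python
import torch
from torch import nn


class TinySparseAutoencoder(nn.Module):
    """Illustrative sparse autoencoder, not the library's implementation."""

    def __init__(self, n_input_features: int, n_learned_features: int) -> None:
        super().__init__()
        # Tied bias: subtracted from the input before encoding and added back
        # to the decoder output.
        self.tied_bias = nn.Parameter(torch.zeros(n_input_features))
        self.encoder = nn.Linear(n_input_features, n_learned_features)
        self.decoder = nn.Linear(n_learned_features, n_input_features, bias=False)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        learned_activations = torch.relu(self.encoder(x - self.tied_bias))
        reconstructed = self.decoder(learned_activations) + self.tied_bias
        return learned_activations, reconstructed

    @torch.no_grad()
    def constrain_decoder_unit_norm(self) -> None:
        # Each dictionary direction is a column of the decoder weight; project
        # the columns back onto the unit sphere after every optimiser step.
        self.decoder.weight /= self.decoder.weight.norm(dim=0, keepdim=True)


model = TinySparseAutoencoder(n_input_features=512, n_learned_features=2048)
source_activations = torch.randn(64, 512)  # stand-in for stored activations

learned, reconstructed = model(source_activations)

# L2 reconstruction loss plus an L1 sparsity penalty on the learned activations.
l1_coefficient = 1e-3
loss = (
    (reconstructed - source_activations).pow(2).mean()
    + l1_coefficient * learned.abs().sum(dim=-1).mean()
)
loss.backward()
model.constrain_decoder_unit_norm()
```

In the library itself you would instead subclass the relevant abstract component and drop it into the standard model, keeping the rest of the training pipeline unchanged.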
docs/gen_ref_pages.py (9 additions, 1 deletion)

@@ -62,12 +62,20 @@ def generate_documentation(path: Path, module_path: Path, full_doc_path: Path) -
    if module_path.name == "__main__":
        return

+   # Get the mkdocstrings identifier for the module
    parts = list(module_path.parts)
    parts.insert(0, "sparse_autoencoder")
    identifier = ".".join(parts)

+   # Read the first line of the file docstring, and set as the header
+   with path.open() as fd:
+       first_line = fd.readline()
+       first_line_without_docstring = first_line.replace('"""', "").strip()
+       first_line_without_last_dot = first_line_without_docstring.rstrip(".")
+       title = first_line_without_last_dot or module_path.name
+
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
-       fd.write(f"::: {identifier}")
+       fd.write(f"# {title}" + "\n\n" + f"::: {identifier}")

    mkdocs_gen_files.set_edit_path(full_doc_path, path)
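
The added block turns the first line of a module's docstring into the generated page's title (written as a `#` heading above the `::: identifier` mkdocstrings directive), falling back to the module name when the docstring is empty. A standalone sketch of that string handling, with a made-up docstring line:

```python
# Hypothetical first docstring line of sparse_autoencoder/autoencoder/model.py.
first_line = '"""Sparse Autoencoder Model."""\n'

first_line_without_docstring = first_line.replace('"""', "").strip()
first_line_without_last_dot = first_line_without_docstring.rstrip(".")
title = first_line_without_last_dot or "model"  # fallback: the module name

assert title == "Sparse Autoencoder Model"
# An empty first line would instead fall back to title == "model".
```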

mkdocs.yml (5 additions, 4 deletions)

@@ -5,7 +5,7 @@ site_dir: docs/generated
repo_url: https://github.com/ai-safety-foundation/sparse_autoencoder
repo_name: ai-safety-foundation/sparse_autoencoder
edit_uri: "" # Disabled as we use mkdocstrings which auto-generates some pages
- # strict: true
+ strict: true

theme:
name: material
@@ -42,6 +42,8 @@ markdown_extensions:
- pymdownx.arithmatex: # Render LaTeX via MathJax
generic: true
- pymdownx.superfences # Seems to enable syntax highlighting when used with the Material theme.
+ - pymdownx.magiclink
+ - pymdownx.saneheaders
- pymdownx.details # Allowing hidden expandable regions denoted by ???
- pymdownx.snippets: # Include one Markdown file into another
base_path: docs/content
@@ -53,9 +55,6 @@ markdown_extensions:
plugins:
- search
- autorefs
- # - gen-files:
- #     scripts:
- #       - docs/gen_ref_pages.py
- section-index
- literate-nav:
nav_file: SUMMARY.md
@@ -73,3 +72,5 @@ plugins:
line_length: 100
show_symbol_type_heading: true
edit_uri: ""
+ - htmlproofer:
+     raise_error: True
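
Together with un-commenting `strict: true` above (MkDocs then treats build warnings as errors), the new `htmlproofer` plugin with `raise_error: True` should make the documentation build fail on broken links instead of publishing them silently, which fits this commit's goal of improving the docs' links.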