Start of executable docs #777

ianhi · 2025-02-24T20:42:56Z

Making the docs executable lets us catch errors automatically. For now I'm using markdown-exec, because it was the only way to integrate with the existing markdown docs while preserving the tabs used on several pages. The downside is that it is currently a bit verbose: each executable code block needs extra options at its start. I think this is fixable through some issues I've opened upstream (pawamoy/markdown-exec#77, pawamoy/markdown-exec#76 (comment)). There are two alternatives:

- mkdocs-jupyter: would allow writing documentation in Jupyter notebooks. However, we would lose some of the pymdownx extensions (e.g. tabbing).
- Sphinx with myst: switching the docs build to sphinx and using myst would allow writing everything in Jupyter as well as preserving the tab behavior. However, this would require swapping the docs framework.

Have doc build fail on unexpected error:

markdown-exec currently only warns on an unexpected error. We can make the docs build fail by passing --strict to mkdocs build (pawamoy/markdown-exec#75).

Alternatively, mkdocs-jupyter provides this already, or we can achieve the same with myst/jupyterbook.
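
As a rough illustration of what a strict build means in practice, the sketch below counts warnings emitted through Python's `logging` module and returns a failing exit code if any were recorded. It is not markdown-exec or mkdocs code; the logger name `docs_build` and the names `WarningCounter` and `finish_build` are hypothetical and exist only for this example.

```python
import logging


class WarningCounter(logging.Handler):
    """Count every record logged at WARNING level or above."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.count = 0

    def emit(self, record: logging.LogRecord) -> None:
        self.count += 1


def finish_build(counter: WarningCounter, strict: bool) -> int:
    """Return an exit code: non-zero only when strict and warnings were seen."""
    return 1 if strict and counter.count > 0 else 0


# Hypothetical logger standing in for the docs build.
log = logging.getLogger("docs_build")
log.propagate = False  # keep the demo quiet
counter = WarningCounter()
log.addHandler(counter)

# An executed code block that fails unexpectedly is reported as a warning.
log.warning("code block raised an unexpected error")

lenient_exit = finish_build(counter, strict=False)  # warning is tolerated
strict_exit = finish_build(counter, strict=True)    # warning fails the build
```

The design point is that the code blocks themselves do not need to change: the same warning is either tolerated or fatal depending on one switch at build time.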

ianhi · 2025-02-24T21:16:50Z

Annoyingly, this requires readthedocs to know how to install the dev version of icechunk; otherwise it won't be able to execute the docs properly. It may take some figuring out to make that happen in readthedocs with poetry etc.

docs/docs/icechunk-python/quickstart.md

dcherian · 2025-02-25T16:58:19Z

docs/.readthedocs.yaml

@@ -14,7 +15,11 @@ build:
      - poetry config virtualenvs.create false
    post_install:
      # Install deps and build using poetry
-      - . "$READTHEDOCS_VIRTUALENV_PATH/bin/activate" && cd docs && poetry install
+      - . "$READTHEDOCS_VIRTUALENV_PATH/bin/activate" && cd docs && poetry install && cd ../icechunk-python && maturin develop && cd ../docs


We can also install directly from GitHub with pip if we know the commit ID. Example:

pip install git+https://github.com/earth-mover/icechunk.git@COMMIT#subdirectory=icechunk-python

This will require maturin in the env, which we seem to have.
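
If the commit ID has to be filled in by a script, the requirement string above can be assembled programmatically. The helper below is a minimal sketch; `pip_git_requirement` is a hypothetical name, and `COMMIT` stays a placeholder exactly as in the example above.

```python
import sys


def pip_git_requirement(
    commit: str,
    repo_url: str = "https://github.com/earth-mover/icechunk.git",
    subdirectory: str = "icechunk-python",
) -> str:
    """Build a pip requirement that installs one commit from a repo subdirectory."""
    if not commit:
        raise ValueError("a commit ID is required")
    return f"git+{repo_url}@{commit}#subdirectory={subdirectory}"


# "COMMIT" is a placeholder; substitute a real commit ID before running pip.
requirement = pip_git_requirement("COMMIT")
install_cmd = [sys.executable, "-m", "pip", "install", requirement]
print(" ".join(install_cmd))
```

Invoking pip as `sys.executable -m pip` keeps the install in the same environment as the interpreter running the script.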

ianhi · 2025-03-01T00:16:22Z

With the current strategy we need an upstream change before this works. I put in a PR: pawamoy/markdown-exec#80

Otherwise I can't finish the version control page, which has code blocks that intentionally raise exceptions.
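
For context, one possible workaround (not what this PR does, which relies on the upstream change) is to catch the expected exception inside the block and print it, so the block itself finishes without error. In the sketch below, `render_expected_error` and `commit_without_changes` are hypothetical stand-ins, and the error message is invented for illustration; none of it is icechunk API.

```python
from typing import Callable


def render_expected_error(func: Callable[[], object], expected: type) -> str:
    """Run func and return the expected exception formatted as 'Name: message'."""
    try:
        func()
    except expected as err:
        return f"{type(err).__name__}: {err}"
    # Reaching this point means the docs example no longer fails as intended.
    raise AssertionError(f"expected {expected.__name__} was not raised")


def commit_without_changes() -> None:
    """Stand-in for a docs example that is supposed to fail."""
    raise ValueError("no changes to commit")


output = render_expected_error(commit_without_changes, ValueError)
print(output)
```

The drawback is that the source shown to the reader gains try/except noise, which is why handling expected failures in the docs tooling is the cleaner route.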

dcherian · 2025-03-01T00:22:43Z

> can't finish the version control page, which has code blocks that intentionally raise exceptions.

Can we skip this one for now and continue with the rest?

ianhi · 2025-03-01T00:54:24Z

> Can we skip this one for now and continue with the rest?

sure

dcherian · 2025-03-01T02:27:58Z

docs/docs/icechunk-python/dask.md

 # Assuming you have a valid writable Session named icechunk_session
-dataset = xr.tutorial.open_dataset("rasm", chunks={"time": 1}).isel(time=slice(24))
+with icechunk_session.allow_pickling():


This should only be needed for the read further down.

docs/docs/icechunk-python/quickstart2.ipynb

dcherian · 2025-03-01T02:29:04Z

docs/docs/icechunk-python/version-control.md

 for snapshot in repo.ancestry(branch="main"):
    print(snapshot)


Suggested change

-for snapshot in repo.ancestry(branch="main"):
-    print(snapshot)
+print(list(repo.ancestry(branch="main")))

perhaps?

That renders poorly, unfortunately: everything comes out as one really long line.
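
A middle ground between the loop and the single `print(list(...))` call is to join the items with newlines, which keeps one statement but prints one snapshot per line. The sketch below uses a hypothetical `FakeSnapshot` dataclass in place of real icechunk snapshot objects, so the printed representation is only illustrative.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FakeSnapshot:
    """Hypothetical stand-in for a snapshot returned by repo.ancestry()."""

    id: str
    message: str


def format_ancestry(snapshots: Iterable[object]) -> str:
    """Render one snapshot per line instead of one long list repr."""
    return "\n".join(str(s) for s in snapshots)


history = [
    FakeSnapshot("BBBBBB", "second commit"),
    FakeSnapshot("AAAAAA", "first commit"),
]
text = format_ancestry(history)
print(text)
```

`print(*history, sep="\n")` gives the same output without the helper.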

dcherian · 2025-03-01T02:29:59Z

docs/docs/icechunk-python/version-control.md


-```mermaid
+```python exec="on" result="mermaid" session="version"
+main_commits = [s.id[:6] for s in list(repo.ancestry(branch='main'))]


We've wanted a way to do this for quite a while!

You mean creating the mermaid diagram from Python? It would be amazing if it could just auto-generate the tree for you in a notebook. We would just have to bundle mermaid.js.
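
To make the idea concrete, here is a minimal sketch of turning a list of commit IDs into mermaid `gitGraph` text. It assumes the IDs arrive newest first (as the `[1]` indexing used elsewhere in this PR suggests for `repo.ancestry`) and handles only a single linear branch; `ancestry_to_mermaid` and the sample IDs are hypothetical.

```python
from typing import Iterable


def ancestry_to_mermaid(commit_ids_newest_first: Iterable[str]) -> str:
    """Build mermaid gitGraph source for a single linear branch."""
    lines = ["gitGraph"]
    # mermaid draws commits in the order they are listed, so go oldest first.
    for commit_id in reversed(list(commit_ids_newest_first)):
        lines.append(f'    commit id: "{commit_id}"')
    return "\n".join(lines)


# Made-up short IDs, newest first.
diagram = ancestry_to_mermaid(["bbb222", "aaa111"])
print(diagram)
```

Branches and tags would need additional `branch`, `checkout`, and `tag` statements, which this sketch leaves out.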

ianhi · 2025-03-01T02:31:06Z

docs/docs/icechunk-python/dask.md

-First let's start a distributed Client and create an IcechunkStore.
-
-```python
-# initialize a distributed Client
-from distributed import Client
-
-client = Client()
-


A notable change here is that the first example doesn't use a client: it was not working without allow_pickling, so I only use the client after allow_pickling has been described.

ianhi · 2025-03-01T02:31:39Z

docs/docs/icechunk-python/dask.md


-icechunk.xarray.to_icechunk(dataset, session)
+# `to_icechunk` takes care of "allow_pickling" for you
+icechunk.xarray.to_icechunk(dataset, icechunk_session, mode="w")


had to add the mode to avoid an error.

docs/docs/icechunk-python/quickstart.md

ianhi · 2025-03-01T02:33:41Z

docs/docs/icechunk-python/version-control.md

-```python
-session = repo.readonly_session(snapshot_id="BSHY7B1AGAPWQC14Q18G")
+```python exec="on" session="version" source="material-block" result="code"
+session = repo.readonly_session(snapshot_id=list(repo.ancestry(branch="main"))[1].id)


Is there a nicer way to get the second-to-last commit, equivalent to HEAD~1?

next(next(repo.ancestry(branch="main")))?
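
Note that nesting `next` calls applies the inner result, the first item, to the outer `next`, which normally raises `TypeError` because an item is not itself an iterator. A sketch of an alternative using `itertools.islice` is shown below; it uses a plain list of strings in place of `repo.ancestry(...)`, and `nth_ancestor` is a hypothetical helper name.

```python
from itertools import islice
from typing import Iterable, TypeVar

T = TypeVar("T")


def nth_ancestor(ancestry: Iterable[T], n: int) -> T:
    """Return the n-th item (0 is the tip) without building the full list."""
    return next(islice(ancestry, n, None))


# Stand-in for repo.ancestry(branch="main"), newest first.
commits = iter(["tip", "parent", "grandparent"])
parent = nth_ancestor(commits, 1)  # analogous to HEAD~1

# Nested next() fails: the inner call returns "tip", which is not an iterator.
try:
    next(next(iter(["tip", "parent"])))
    nested_error = None
except TypeError as err:
    nested_error = type(err).__name__
```

If the history is shorter than `n + 1` commits, `next` raises `StopIteration`; passing a default as the second argument to `next` avoids that.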

ianhi · 2025-03-01T02:36:19Z

docs/mkdocs.yml

+  - toc:
+      permalink: "#"


Also added permalinks that show up when you hover. This is unrelated to the executable docs changes.

ianhi · 2025-03-01T02:57:46Z

docs/docs/icechunk-python/parallel.md

@@ -59,9 +59,9 @@ def write_timestamp(*, itime: int, session: Session) -> None:

 Now execute the writes.

-```python exec="on" session="parallel" source="material-block" result="code"
+<!-- ```python exec="on" session="parallel" source="material-block" result="code" -->
+```python


This example runs fine for me locally, but ends up not writing anything to the store when running on readthedocs.

ianhi force-pushed the ian/docs/exec-docs branch from ff2f51e to 08341c9 Compare February 24, 2025 21:02

dcherian reviewed Feb 25, 2025

dcherian reviewed Feb 25, 2025

ianhi force-pushed the ian/docs/exec-docs branch from 26d4c49 to ea18b8b Compare March 1, 2025 00:53

ianhi added 22 commits February 28, 2025 19:56

doc: initial executeable docs

5d65b26

doc: add permalink anchors

4a2fbe9

doc: executable quickstart

079c5f2

remove old docs stuff

3c0c0bd

doc: strict docs build

7e95499

doc: add markdown-exec dependency

146801d

doc: readthedocs build rust

01e2a3e

doc: rtd build

d567379

doc: rtd build

8d346fe

doc: rtd build

5fb8ed1

doc: rtd build

ddd6283

doc: rtd build

d7f8cc8

doc: rtd build

66c4522

doc: rtd build

0a4dc89

doc: rtd build

bb650a3

docs: end of file

64e74e7

docs: back to snapshot_id

e2d1bf1

docs: add parallel to executed docs

170559b

docs: add xarray to doc req

7fa4d1a

doc: build add pooch

9b50cf4

doc: build add scipy

b9e5c54

doc: add exec dask

45820c9

ianhi added 7 commits February 28, 2025 19:56

doc: formatting

4f70655

doc: exec xarray

ce755a9

doc: exec version ctrl

52c6c16

doc: dataset rendering

3d8b943

doc: formatting output

bbe0780

doc: remove parts of parallel from execution

86201c9

doc: add more doc dependencies

4c6966a

ianhi force-pushed the ian/docs/exec-docs branch from ea18b8b to 4c6966a Compare March 1, 2025 00:57

ianhi added 4 commits February 28, 2025 20:27

doc: bld add dask distributed

ca5914f

doc: spelling

8a7f081

doc: linting

da754d9

doc: execute final dask block

723e7b0

dcherian reviewed Mar 1, 2025

dcherian reviewed Mar 1, 2025

doc: fix errors

ec2753d

dcherian reviewed Mar 1, 2025

ianhi commented Mar 1, 2025

ianhi added 2 commits February 28, 2025 21:32

Update docs/docs/icechunk-python/quickstart.md

9c83871

doc: remove old file

e775d27

ianhi commented Mar 1, 2025

ianhi added 2 commits February 28, 2025 21:54

doc: formatting

b35caf0

doc: don't run parallel

082ad54

ianhi commented Mar 1, 2025
