[DAPHNE-daphne-eu#550] HTML documentation generated from our markdown files

This commit is the squashed version of PR daphne-eu#550. I did not take the time to sum up the individual log messages of the 56 original commits. Find the commit messages of the squashed commits below (in chronological order).

Closes daphne-eu#550

Co-authored-by: Patrick Damme <[email protected]>

docs mvp with only markdown docs

doc readme: fix urls to work on web page

add logos

fix markdown syntax

who is this mark dwon?

remove copyright info from mark dwon files

shorten mark dwon headers

shorten mark dwon headers

bundle DaphneDSL docs

bundle DaphneLib docs

minor md change

[doc] fix some links

[doc] widen html width

add repo infos

[mkdocs] manual ordering of documentation

use nav tabs

[doc][html] set color for light mode to daphne blue

[doc][html] set color palette with light and dark version for DAPHNE blue

[doc] change doc links back to working on gh

[doc] append .md suffix back to doc links

[doc] move binary format to dev docs

[doc] gh action: exchange urls before doc build

[doc] gh action: rm .md suffix for doc build prep

[doc] gh actions insert full qualified gh url to all files except for /doc/ paths

[doc] doc readme, minor bug fix

[doc] fix urls

[doc] add doc/development/Contributing.md dummy which will be overridden

[doc] bugfix in mkdocs

[doc] add dev docs for writing docs

[doc] gh actions link replacement fix

[doc] gh actions: also make links to file in repo root work

[doc] fix links in gettingstarted

[doc] rename containers/Readme to README

[doc] fix broken links

[doc] gh actions: make sed cmds better readable

[doc] support links to issues

[doc] add favicon

[doc] rearrange docs

[doc] change header for config

[doc] put <> stuff into inline code

[doc] indentation fixes

[doc] make nav bar sticky

[doc] refactor and add new md files to html

Minor fixes.

[doc] fix angle bracket usage

[doc] try different gh workflow

[doc] gh workflow yaml

[doc] readd license header to all files

[doc] change push trigger to main branch

[doc] gh workflow: add PR trigger

Minor polishing.
m-birke authored and corepointer committed Jul 4, 2023
1 parent 00382c1 commit c27d7e9
Showing 37 changed files with 1,310 additions and 928 deletions.
39 changes: 39 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,39 @@
name: Build HTML Docs from Markdown and Push to GitHub Pages Branch
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  build:
    name: Build Docs
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Prepare Docs
        run: |
          cp CONTRIBUTING.md doc/development/Contributing.md
          sed -ri 's@(\(/doc/[a-zA-Z0-9/]*)(.md)@\1@g' doc/*.md
          sed -ri 's@(\(/doc/[a-zA-Z0-9/]*)(.md)@\1@g' doc/**/*.md
          sed -ri 's@\(/([a-zA-Z0-9]*.[a-zA-Z0-9]*)@\(https://github.com/daphne-eu/daphne/tree/main/\1@g' doc/*.md
          sed -ri 's@\(/([a-zA-Z0-9]*.[a-zA-Z0-9]*)@\(https://github.com/daphne-eu/daphne/tree/main/\1@g' doc/**/*.md
          sed -ri 's@]\(/([a-z]+)@]\(https://github.com/daphne-eu/daphne/tree/main/\1@g' doc/*.md
          sed -ri 's@]\(/([a-z]+)@]\(https://github.com/daphne-eu/daphne/tree/main/\1@g' doc/**/*.md
          sed -i 's@](https://github.com/daphne-eu/daphne/tree/main/doc/@](/daphne/@g' doc/*.md
          sed -i 's@](https://github.com/daphne-eu/daphne/tree/main/doc/@](/daphne/@g' doc/**/*.md
          sed -ri 's@]\(/issues/([0-9]+)@]\(https://github.com/daphne-eu/daphne/issues/\1@g' doc/*.md
      - name: Build
        uses: Tiryoh/actions-mkdocs@v0
        with:
          mkdocs_version: 'latest'
          requirements: 'doc/docs-build-requirements.txt'
          configfile: 'mkdocs.yml' # option
      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./doc_build
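
The link rewriting done by the `Prepare Docs` step can be tried outside the workflow. The following Python sketch mirrors three of the `sed` substitutions: stripping the `.md` suffix from `/doc/` links, prefixing repository-absolute paths with the GitHub URL, and redirecting `doc/` links to the site root `/daphne/`. The function name `rewrite_links` is hypothetical, and Python's `re` module stands in for `sed -r`, so edge cases may behave differently.

```python
import re

GITHUB_TREE = "https://github.com/daphne-eu/daphne/tree/main/"


def rewrite_links(markdown: str) -> str:
    """Apply three of the workflow's link substitutions to a markdown string."""
    # 1. Drop the ".md" suffix from absolute links into /doc/.
    #    As in the workflow, the "." is unescaped and matches any character.
    out = re.sub(r"(\(/doc/[a-zA-Z0-9/]*)(.md)", r"\1", markdown)
    # 2. Prefix repository-absolute paths with the GitHub tree URL.
    out = re.sub(
        r"\(/([a-zA-Z0-9]*.[a-zA-Z0-9]*)",
        lambda m: "(" + GITHUB_TREE + m.group(1),
        out,
    )
    # 3. Links that now point into doc/ on GitHub are redirected to the
    #    generated site. A literal string replacement is used here, whereas
    #    the sed pattern treats "." as a wildcard.
    out = out.replace("](" + GITHUB_TREE + "doc/", "](/daphne/")
    return out


print(rewrite_links("[Deploy](/doc/Deploy.md)"))
print(rewrite_links("[daphne.cpp](/src/api/cli/daphne.cpp)"))
```

With these three steps, `[Deploy](/doc/Deploy.md)` becomes `[Deploy](/daphne/Deploy)`, while `[daphne.cpp](/src/api/cli/daphne.cpp)` ends up pointing at the file on GitHub. Relative links such as `[x](README.md)` are left untouched.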
5 changes: 4 additions & 1 deletion .gitignore
@@ -6,6 +6,9 @@ build_*/
/lib
/tmp

# documentation build output
doc_build/

# dependencies
thirdparty/*
!thirdparty/llvm-project/
@@ -16,7 +19,7 @@ thirdparty/*

# Python
__pycache__/
/venv*
/**/*venv*

# Jetbrains IDE
.idea/
File renamed without changes.
2 changes: 1 addition & 1 deletion deploy/README.md
@@ -103,7 +103,7 @@ This directory includes a set of **bash scripts** providing support for:
## List of Files in this Directory

1. This short [README](README.md) file to explain directory structure and point to more documentation at [Deploy](/doc/Deploy.md).
2. A [script](build-daphne-singularity-image.sh) that builds the "daphne.sif" singularity image from the [Docker image](../containers/Readme.md)
2. A [script](build-daphne-singularity-image.sh) that builds the "daphne.sif" singularity image from the [Docker image](/containers/README.md)
daphneeu/daphne-dev
3. [deploy-distributed-on-slurm](deploy-distributed-on-slurm.sh) script allows the user to deploy DAPHNE with SLURM.
4. [deployDistributed](deployDistributed.sh) script builds and sends DAPHNE to remote machines manually with SSH (no tools like Slurm needed).
65 changes: 35 additions & 30 deletions doc/BinaryFormat.md
@@ -14,40 +14,38 @@ See the License for the specific language governing permissions and
limitations under the License.
-->

# DAPHNE Binary Data Format
# Binary Data Format

DAPHNE defines its own binary representation for the serialization of in-memory data objects (matrices/frames).
This representation is intended to be used by default whenever we need to transfer or persistently store these in-memory objects, e.g., for

- the data transfer in the distributed runtime
- a custom binary file format
- the eviction of in-memory data to secondary storage

*Disclaimer:* The current specification is a first draft and will likely be refined as we proceed.
At the moment, we focus on the case of a single block per data object.

**Endianess**

For now, we assume *little endian*.

**Images**
**Endianess:** For now, we assume *little endian*.

In the images below, all addresses and sizes are specified in bytes (`[B]`).
**Images:** In the images below, all addresses and sizes are specified in bytes (`[B]`).

### Binary Representation of a Whole Data Object
## Binary Representation of a Whole Data Object

The binary representation of a data object (matrix/frame) starts with a header containing general and data type-specific information.
The data object is partitioned into rectangular blocks (in the extreme case, this can mean a single block).
All blocks are represented individually (see binary representation of a single block below) and stored along with their position in the data object.

```
```text
+--------+------+
| header | body |
+--------+------+
```

**Header**
### Header

The header consists of the following information:

- DAPHNE binary format version number (`1` for now) (uint8)
- data type `dt` (uint8)
- number of rows `#r` (uint64)
@@ -81,9 +79,10 @@ We currently support the following **value types**:
Depending on the data type, there are more information in the header:

*For `DenseMatrix` and `CSRMatrix`*:

- value type `vt` (uint8)

```
```text
addr[B] 0 0 1 1 2 9 10 17 18 18
+---+----+----+-----+-----+
| 1 | dt | #r | #c | vt |
@@ -92,28 +91,30 @@ size[B] 1 1 8 8 1
```
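
As a rough illustration of the `DenseMatrix`/`CSRMatrix` header layout, the following Python sketch packs and unpacks the five fields with the standard `struct` module. The helper names are hypothetical, and the numeric codes passed for `dt` and `vt` in the usage lines are placeholders, since the actual code tables are not part of this hunk.

```python
import struct

# Little-endian, no padding:
# version (uint8), dt (uint8), #r (uint64), #c (uint64), vt (uint8)
MATRIX_HEADER = struct.Struct("<BBQQB")


def pack_matrix_header(num_rows, num_cols, dt_code, vt_code, version=1):
    """Serialize the header of a matrix; dt_code/vt_code are caller-supplied."""
    return MATRIX_HEADER.pack(version, dt_code, num_rows, num_cols, vt_code)


def unpack_matrix_header(buf):
    """Read the header back from the first 19 bytes of buf."""
    version, dt_code, num_rows, num_cols, vt_code = MATRIX_HEADER.unpack_from(buf, 0)
    return {
        "version": version,
        "dt": dt_code,
        "rows": num_rows,
        "cols": num_cols,
        "vt": vt_code,
    }


# Placeholder codes (assumptions): dt=1, vt=7
header = pack_matrix_header(3, 4, dt_code=1, vt_code=7)
print(len(header), unpack_matrix_header(header))
```

The packed header occupies 19 bytes, matching addresses 0 to 18 in the diagram.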

*For `Frame`*:

- value type `vt` (uint8), for each column
- length of the label `len` (uint16) and label `lbl` (character string), for each column

```
```text
addr[B] 0 0 1 1 2 9 10 17 18 18+#c-1 18+#c *
+---+----+----+-----+-------+ +----------+--------+--------+ +-----------+-----------+
| 1 | dt | #r | #c | vt[0] | ... | vt[#c-1] | len[0] | lbl[0] | ... | len[#c-1] | lbl[#c-1] |
+---+----+----+-----+-------+ +----------+--------+--------+ +-----------+-----------+
size[B] 1 1 8 8 1 1 2 len[0] 2 len[#c-1]
```
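
The `Frame` header has a variable length because of the per-column labels. A minimal Python sketch of the layout follows; the function name is hypothetical, the `dt`/`vt` codes are placeholders, and encoding labels as UTF-8 is an assumption, since the text only says "character string".

```python
import struct


def pack_frame_header(num_rows, columns, dt_code, version=1):
    """columns: list of (vt_code, label) pairs, one per column."""
    # Fixed part: version (uint8), dt (uint8), #r (uint64), #c (uint64)
    parts = [struct.pack("<BBQQ", version, dt_code, num_rows, len(columns))]
    # One value type byte per column, stored contiguously.
    parts.append(bytes(vt for vt, _ in columns))
    # Then, per column, the label length (uint16) followed by the label.
    for _, label in columns:
        raw = label.encode("utf-8")  # assumption: labels are UTF-8 encoded
        parts.append(struct.pack("<H", len(raw)) + raw)
    return b"".join(parts)


# Placeholder codes (assumptions): dt=2, vt=7 and vt=3
header = pack_frame_header(10, [(7, "a"), (3, "bc")], dt_code=2)
print(len(header))
```

For a frame with `#c` columns, the resulting size is `18 + #c` bytes plus `2 + len[i]` bytes per label.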

**Body**
### Body

The body consists of a sequence of:

- a pair of
    - row index `rx` (uint64)
    - column index `cx` (uint64)
- a binary block representation

For the special case of a single block, this looks as follows:

```
```text
addr[B] 0 7 8 15 16 *
+---+----+----------+
| 0 | 0 | block[0] |
@@ -122,26 +123,27 @@ addr[B] 0 7 8 15 16 *
size[B]
```

### Binary Representation of a Single Block
## Binary Representation of a Single Block

A single data block is a rectangular partition of a data object.
In the extreme case, a single block can span the entire data object in both dimensions (one block per data object).

General block header

- number of rows `#r` (uint32)
- number of columns `#c` (uint32)
- block type `bt` (uint8)
- block type-specific information (see below)

```
```text
addr[B] 0 3 4 7 8 8 9 *
+----+----+----+--------------------------+
| #r | #c | bt | block type-specific info |
+----+----+----+--------------------------+
size[B] 4 4 1 *
```

**Block types**
## Block types

We define different block types to allow for a space-efficient representation depending on the data.
When serializing a data object, the block types are not required to match the in-memory representation (e.g., the blocks of a `DenseMatrix` could use the *sparse* binary representation).
@@ -159,53 +161,54 @@ Most block types store their value type as part of the block type-specific infor
Note that the value type used for the binary representation is not required to match the value type of the in-memory object (e.g., `DenseMatrix<uint64_t>` may be represented as a *dense* block with value type `uint8_t`, if the value range permits).
Furthermore, each block may be represented using its individual value type.

**Empty block**
### Empty block

This block type is used to represent blocks that contain only zeros of the respective value type very space-efficiently.

Block type-specific information: *none*

```
```text
addr[B] 0 3 4 7 8 8
+----+----+---+
| #r | #c | 0 |
+----+----+---+
size[B] 4 4 1
```

**Dense block**
### Dense block

Block type-specific information:

- value type `vt` (uint8)
- values `v` in row-major (value type `vt`)

Below, `S` denotes the size (in bytes) of a single value of type `vt`.

```
```text
addr[B] 0 3 4 7 8 8 9 9 10 10+#r*#c*S
+----+----+---+----+---------+---------+ +---------------+
| #r | #c | 1 | vt | v[0, 0] | v[0, 1] | ... | v[#r-1, #c-1] |
+----+----+---+----+---------+---------+ +---------------+
size[B] 4 4 1 1 S S S
```
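
To make the dense layout concrete, here is a small Python sketch that serializes a dense block in row-major order. The function name is hypothetical, and the `vt` code used in the usage lines is a placeholder; the block type code `1` is taken from the diagram.

```python
import struct

BLOCK_TYPE_DENSE = 1  # block type code shown in the diagram


def pack_dense_block(rows, vt_code, value_format):
    """rows: list of equally long lists of values.
    value_format: struct format character for one value, e.g. "d" for float64.
    """
    num_rows = len(rows)
    num_cols = len(rows[0]) if rows else 0
    # #r (uint32), #c (uint32), bt (uint8), vt (uint8)
    header = struct.pack("<IIBB", num_rows, num_cols, BLOCK_TYPE_DENSE, vt_code)
    # Values follow in row-major order without padding.
    flat = [value for row in rows for value in row]
    body = struct.pack("<%d%s" % (len(flat), value_format), *flat)
    return header + body


# Placeholder value type code (assumption): vt=7 for float64
block = pack_dense_block([[1.0, 2.0], [3.0, 4.0]], vt_code=7, value_format="d")
print(len(block))
```

The total size is `10 + #r*#c*S` bytes; for the 2x2 block of 8-byte values above, that is 42 bytes.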

**Sparse block (compressed sparse row, CSR)**
### Sparse block (compressed sparse row, CSR)

Block type-specific information:

- value type `vt` (uint8)
- number of non-zeros in the block `#nzb` (uint64)
- for each row
    - number of non-zeros in the row `#nzr` (uint32)
    - for each non-zero in the row
        - column index `cx` (uint32)
        - value `v` (value type `vt`)

Note that both a row and the entire block might contain no non-zeros.

Below, `S` denotes the size (in bytes) of a single value of type `vt`.


```
```text
18 + 4*#r +
addr[B] 0 3 4 7 8 8 9 9 10 17 18 #nzb*(4+S)
+----+----+---+----+------+--------+ +--------+ +-----------+
@@ -230,14 +233,15 @@ size[B] 4 4 1 1 8 4+#nzr[i]*(4+S)
4 S
```
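
The size formula `18 + 4*#r + #nzb*(4+S)` can be checked with a short Python sketch of the CSR block layout. The function name is hypothetical, and both the `vt` code and the block type code are placeholders supplied by the caller, since neither value is visible in this hunk.

```python
import struct


def pack_csr_block(rows, num_cols, vt_code, value_format, block_type):
    """rows: one list of (column index, value) pairs per row (may be empty).
    value_format: struct format character for one value, e.g. "d" for float64.
    """
    nnz_block = sum(len(row) for row in rows)
    # #r (uint32), #c (uint32), bt (uint8), vt (uint8), #nzb (uint64)
    parts = [struct.pack("<IIBBQ", len(rows), num_cols, block_type, vt_code, nnz_block)]
    for row in rows:
        # #nzr (uint32), written even for rows without non-zeros
        parts.append(struct.pack("<I", len(row)))
        for col, value in row:
            # cx (uint32) followed by the value
            parts.append(struct.pack("<I" + value_format, col, value))
    return b"".join(parts)


# Placeholder codes (assumptions): vt=7 for float64, block type 2 for CSR
rows = [[(0, 1.5)], [], [(1, 2.5), (2, 3.5)]]
block = pack_csr_block(rows, num_cols=3, vt_code=7, value_format="d", block_type=2)
print(len(block))
```

For three rows with three non-zeros of 8 bytes each, the block takes `18 + 4*3 + 3*(4+8) = 66` bytes.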

**Ultra-sparse block (coordinate, COO)**
### Ultra-sparse block (coordinate, COO)

Ultra-sparse blocks contain almost no non-zeros, so we want to keep the overhead of the meta data low.
Thus, we distinguish blocks with a single column (where we don't need to store the column index) and blocks with more than one column.

*Blocks with a single column*
### Blocks with a single column

Block type-specific information:

- value type `vt` (uint8)
- number of non-zeros in the block `#nzb` (uint32)
- for each non-zero
@@ -246,7 +250,7 @@ Block type-specific information:

Below, `S` denotes the size (in bytes) of a single value of type `vt`.

```
```text
addr[B] 0 3 4 7 8 8 9 9 10 13 14 14+#nzb*(4+S)
+----+----+---+----+------+-------+ +-------+ +------------+
| #r | #c | 3 | vt | #nzb | nz[0] | ... | nz[i] | ... | nz[#nzb-1] |
@@ -262,9 +266,10 @@ size[B] 4 4 1 1 4 4+S 4+S 4+S
4 S
```
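
For the single-column case, a minimal Python sketch of the COO layout could look as follows. The function name is hypothetical and the `vt` code is a placeholder. That the 4-byte field of each non-zero is its row index is inferred from the remark that the column index need not be stored.

```python
import struct

BLOCK_TYPE_COO = 3  # block type code shown in the diagram


def pack_coo_single_column(num_rows, entries, vt_code, value_format):
    """entries: list of (row index, value) pairs for a block with one column.
    value_format: struct format character for one value, e.g. "d" for float64.
    """
    # #r (uint32), #c (uint32, always 1 here), bt (uint8), vt (uint8), #nzb (uint32)
    parts = [struct.pack("<IIBBI", num_rows, 1, BLOCK_TYPE_COO, vt_code, len(entries))]
    for row, value in entries:
        # Row index (uint32, assumed) followed by the value
        parts.append(struct.pack("<I" + value_format, row, value))
    return b"".join(parts)


# Placeholder value type code (assumption): vt=7 for float64
block = pack_coo_single_column(5, [(0, 1.0), (4, 2.0)], vt_code=7, value_format="d")
print(len(block))
```

The result takes `14 + #nzb*(4+S)` bytes, i.e. 38 bytes for two 8-byte values.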

*Blocks with more than one column*
### Blocks with more than one column

Block type-specific information:

- value type `vt` (uint8)
- number of non-zeros in the block `#nzb` (uint32)
- for each non-zero
@@ -274,7 +279,7 @@ Block type-specific information:

Below, `S` denotes the size (in bytes) of a single value of type `vt`.

```
```text
addr[B] 0 3 4 7 8 8 9 9 10 13 14 14+#nzb*(8+S)
+----+----+---+----+------+-------+ +-------+ +------------+
| #r | #c | 3 | vt | #nzb | nz[0] | ... | nz[i] | ... | nz[#nzb-1] |
7 changes: 4 additions & 3 deletions doc/Config.md
@@ -14,7 +14,7 @@ See the License for the specific language governing permissions and
limitations under the License.
-->

# DAPHNE Configuration: Getting Information from the User
# Configuration - Getting Information from the User

The behavior of the DAPHNE system can be influenced by the user by means of a cascading configuration mechanism.
There is a set of options that can be passed from the user to the system.
@@ -27,17 +27,18 @@ The cascade consists of the following steps:
- (In the future, DaphneDSL will also offer means to change the configuration at run-time.)

The `DaphneUserConfig` is available to all parts of the code, including:

- The DAPHNE compiler: The `DaphneUserConfig` is passed to the `DaphneIrExecutor` and from there to all passes that need it.
- The DAPHNE runtime: The `DaphneUserConfig` is part of the `DaphneContext`, which is passed to all kernels.

Hence, information provided by the user can be used to influence both, the compiler and the runtime.
*The use of environment variables to pass information into the system is discouraged.*

### How to extend the configuration?
## How to extend the configuration?

If you need to add additional information from the user, you must take roughly the following steps:

1. Create a new member in `DaphneUserConfig` and hard-code a reasonable default.
2. Add a command-line argument to the system's CLI API in [src/api/cli/daphne.cpp](/src/api/cli/daphne.cpp). We use LLVM's [CommandLine 2.0 library](https://llvm.org/docs/CommandLine.html) for parsing CLI arguments. Make sure to update the corresponding member of the `DaphneUserConfig` with the parsed argument.
3. *For compiler passes*: If necessary, pass on the `DaphneUserConfig` to the compiler pass you are working on in [src/compiler/execution/DaphneIrExecutor.cpp](/src/compiler/execution/DaphneIrExecutor.cpp). *For kernels*: All kernels automatically get the `DaphneUserConfig` via the `DaphneContext` (their last parameter), so no action is required from your side.
4. Access the new member of the `DaphneUserConfig` in your code.
4. Access the new member of the `DaphneUserConfig` in your code.