tools: add documentation regarding our api tooling
Introduces a proper imperative description of how the current API documentation build system works. Refs: nodejs/next-10#169
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,296 @@ | ||
# Node.js API Documentation Tooling | ||
|
||
The Node.js API documentation is generated by in-house tooling that resides | ||
within the [tools/doc](https://github.com/nodejs/node/tree/main/tools/doc) | ||
directory. | ||
|
||
The build process (using `make doc`) uses this tooling to parse the markdown | ||
files in [doc/api](https://github.com/nodejs/node/tree/main/doc/api) and | ||
generate the following: | ||
|
||
1. Human-readable HTML in `out/doc/api/*.html` | ||
2. A JSON representation in `out/doc/api/*.json` | ||
|
||
These are published to nodejs.org for multiple versions of Node.js. For | ||
example, the latest version of the human-readable HTML is published to | ||
[nodejs.org/en/docs](https://nodejs.org/en/docs/), and the latest version of | ||
the JSON documentation is published to | ||
[nodejs.org/api/all.json](https://nodejs.org/api/all.json). | ||
|
||
<!-- TODO: Add docs about how the publishing process happens --> | ||
|
||
**The key things to know about the tooling include:** | ||
|
||
1. The entry-point is `tools/doc/generate.mjs`. | ||
2. The tooling supports the CLI arguments listed in the table below. | ||
3. The tooling processes one file at a time. | ||
4. The tooling uses a set of dependencies as described in the dependencies | ||
section. | ||
5. The tooling parses the input files and does several transformations to the | ||
AST (Abstract Syntax Tree). | ||
6. The tooling generates a JSON output that contains the metadata and content of | ||
the Markdown file. | ||
7. The tooling generates an HTML output that contains a human-readable, | ||
ready-to-view version of the file. | ||
|
||
This documentation serves the purpose of explaining the existing tooling | ||
processes, to allow easier maintenance and evolution of the tooling. It is not | ||
meant to be a guide on how to write documentation for Node.js. | ||
|
||
#### Vocabulary & Good-to-Knows | ||
|
||
* AST means "Abstract Syntax Tree". It is a tree data structure that represents | ||
the structure of a document in a given format. In our case, the AST is a tree | ||
representation of the contents of the Markdown file. | ||
* MDN means [Mozilla Developer Network](https://developer.mozilla.org/en-US/) | ||
and it is a website that contains documentation for web technologies. We use | ||
it as a reference for the structure of the documentation. | ||
* The | ||
[Stability Index](https://nodejs.org/dist/latest/docs/api/documentation.html#stability-index) | ||
is used to communicate the stability of a given Node.js module. The stability | ||
levels include: | ||
* Stability 0: Deprecated. (This module is Deprecated) | ||
* Stability 1: Experimental. (This module is Experimental) | ||
* Stability 2: Stable. (This module is Stable) | ||
* Stability 3: Legacy. (This module is Legacy) | ||
* Within Remark, YAML snippets (`<!-- something -->`) are considered HTML nodes | ||
because YAML is not valid Markdown content (it is not part of the Markdown | ||
spec). | ||
* "New Tooling" refers to the API build tooling (written from scratch) that was | ||
introduced in `nodejs/nodejs.dev` and might replace the current one from | ||
`nodejs/node`. | ||
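To make the AST vocabulary above more concrete, here is a minimal sketch of what such a tree can look like. The tree is hand-written for illustration (it is not output captured from the tooling), the node types follow the mdast convention used by Remark, and the `walk` and `collectTypes` helpers are hypothetical names.

```javascript
// Hand-written, illustrative tree in the general shape of a Markdown AST.
// Note how the YAML snippet is carried as an `html` node.
const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'File system' }] },
    { type: 'html', value: '<!-- YAML\nadded: v0.1.90\n-->' },
    { type: 'paragraph', children: [{ type: 'text', value: 'Some description.' }] },
  ],
};

// Depth-first walk that calls `visitor` once for every node in the tree.
function walk(node, visitor) {
  visitor(node);
  for (const child of node.children ?? []) walk(child, visitor);
}

// Collects the node types in visiting order.
function collectTypes(root) {
  const types = [];
  walk(root, (node) => types.push(node.type));
  return types;
}

console.log(collectTypes(tree).join(' > '));
```

Most transformations described in this document amount to walking a tree like this one and replacing or decorating individual nodes.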
|
||
## CLI Arguments | ||
|
||
The tooling requires a `filename` argument and supports extra arguments (some | ||
also required) as shown below: | ||
|
||
| Argument | Description | Required | Example | | ||
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------- | ---------------------------------- | | ||
| `--node-version=` | The version of Node.js that is being documented. It defaults to `process.version` which is supplied by Node.js itself | No | v19.0.0 | | ||
| `--output-directory=` | The directory where the output files will be generated. | Yes | `./out/api/` | | ||
| `--apilinks=` | This file is used as an index to specify the source file for each module | No | `./out/doc/api/apilinks.json` | | ||
| `--versions-file=` | This file is used to specify an index of all previous versions of Node.js. It is used for the Version Navigation on the API docs page. | No | `./out/previous-doc-versions.json` | | ||
|
||
**Note:** The files passed to the `apilinks` and `versions-file` parameters are | ||
both generated by the Node.js build process (Makefile), and each of them | ||
contains a JSON object. | ||
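As an illustration of the argument table above, the following sketch parses `--name=value` arguments and applies the documented default and required rules. It is not the parsing code used by the tooling: `parseCliArgs`, `FLAG_TO_OPTION`, and the option key names are hypothetical.

```javascript
// Hypothetical sketch; not the argument handling used by the real tooling.
const FLAG_TO_OPTION = new Map([
  ['node-version', 'nodeVersion'],
  ['output-directory', 'outputDirectory'],
  ['apilinks', 'apilinks'],
  ['versions-file', 'versionsFile'],
]);

function parseCliArgs(argv) {
  // `--node-version` falls back to the version of the running Node.js binary.
  const options = { nodeVersion: process.version, filenames: [] };
  for (const arg of argv) {
    const match = /^--([\w-]+)=(.*)$/.exec(arg);
    if (!match) {
      options.filenames.push(arg); // anything else is treated as an input file
      continue;
    }
    const key = FLAG_TO_OPTION.get(match[1]);
    if (!key) throw new Error(`Unknown argument: --${match[1]}`);
    options[key] = match[2];
  }
  if (options.filenames.length === 0) throw new Error('A filename is required');
  if (!options.outputDirectory) throw new Error('--output-directory is required');
  return options;
}

console.log(parseCliArgs(['doc/api/fs.md', '--output-directory=./out/api/']));
```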
|
||
### Basic Usage | ||
|
||
```bash | ||
# cd tools/doc | ||
npm run node-doc-generator ${filename} | ||
``` | ||
|
||
**OR** | ||
|
||
```bash | ||
# nodejs/node root directory | ||
make doc | ||
``` | ||
|
||
## Dependencies and how the Tooling works internally | ||
|
||
The API tooling uses an AST-based library called | ||
[unified](https://github.com/unifiedjs/unified) to process the input file as a | ||
syntax tree whose nodes can easily be modified and updated. | ||
|
||
In addition to `unified` we also use | ||
[Remark](https://github.com/remarkjs/remark) for manipulating the Markdown part, | ||
and [Rehype](https://github.com/rehypejs/rehype) to help convert to and from | ||
Markdown. | ||
|
||
### What are the steps of the internal tooling? | ||
|
||
The tooling uses the pipeline-like engine of `unified` to chain each part of | ||
the process. (The description below is simplified.) | ||
|
||
* It starts by reading the front matter section of the Markdown file with | ||
[remark-frontmatter](https://www.npmjs.com/package/remark-frontmatter). | ||
* Then the tooling parses the Markdown with `remark-parse` and adds support for | ||
[GitHub Flavoured Markdown](https://github.github.com/gfm/). | ||
* The tooling proceeds by parsing some of the Markdown nodes and transforming | ||
them to HTML. | ||
* The tooling proceeds to generate the JSON output of the file. | ||
* Finally, it does its final node transformations and generates stringified | ||
HTML. | ||
* It then stores the JSON output in a JSON file, adds extra styling to the | ||
HTML, and stores the HTML file. | ||
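The pipe-like flow of the steps above can be sketched as follows. This is a simplified stand-in written for this document, not the `unified` API: `createPipeline`, `parse`, and `stringify` are hypothetical, and the parser only understands headings and plain paragraphs.

```javascript
// Hypothetical stand-in for a pipe-like processor; this is NOT the unified API.
function createPipeline() {
  const steps = [];
  return {
    // Registers a step and returns the pipeline so calls can be chained.
    use(step) {
      steps.push(step);
      return this;
    },
    // Feeds the output of each step into the next one.
    run(input) {
      return steps.reduce((value, step) => step(value), input);
    },
  };
}

// Step 1: turn Markdown text into a flat list of nodes.
const parse = (markdown) =>
  markdown.split('\n').filter(Boolean).map((line) => {
    const match = /^(#{1,6}) (.*)$/.exec(line);
    return match
      ? { type: 'heading', depth: match[1].length, value: match[2] }
      : { type: 'paragraph', value: line };
  });

// Step 2: turn the nodes into an HTML string.
const stringify = (nodes) =>
  nodes
    .map((node) =>
      node.type === 'heading'
        ? `<h${node.depth}>${node.value}</h${node.depth}>`
        : `<p>${node.value}</p>`)
    .join('\n');

const html = createPipeline().use(parse).use(stringify).run('# Title\n\nSome text');
console.log(html);
```

Each step only needs to agree with its neighbours on the shape of the data, so extra transformations can be slotted in between parsing and stringifying.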
|
||
### What is each file responsible for? | ||
|
||
The files listed below are the ones referenced and actually used during the | ||
build process of the API docs as we see on <https://nodejs.org/api>. The | ||
remaining files from the directory might be used by other steps of the Node.js | ||
Makefile or might even be deprecated/remnant of old processes and might need to | ||
be revisited/removed. | ||
|
||
* **`html.mjs`**: Responsible for transforming nodes by decorating them with | ||
visual artifacts for the HTML pages; | ||
* For example, transforming man or JS doc references to links correctly | ||
referring to respective External documentation. | ||
* **`json.mjs`**: Responsible for generating the JSON output of the file; | ||
* It is mostly responsible for going through the whole Markdown file and | ||
generating a JSON object that represents the metadata of a specific module. | ||
* For example, for the FS module, it will generate an object with all its | ||
methods, events, and classes, using several regular expressions (regex) to | ||
extract the information needed. | ||
* **`generate.mjs`**: Main entry-point of doc generation for a specific file. It | ||
does end-to-end processing of a documentation file; | ||
* **`allhtml.mjs`**: A script executed after all files are generated to create a | ||
single "all" page containing all the HTML documentation; | ||
* **`alljson.mjs`**: A script executed after all files are generated to create a | ||
single "all" page containing all the JSON entries; | ||
* **`markdown.mjs`**: Contains a utility to replace Markdown links to work with | ||
the <https://nodejs.org/api/> website. | ||
* **`common.mjs`**: Contains a few utility functions that are used by the other | ||
files. | ||
* **`type-parser.mjs`**: Used to replace "type references" (e.g. "String" or | ||
"Buffer") with links to the correct internal/external documentation pages | ||
(i.e. MDN or other Node.js documentation pages). | ||
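The idea behind `type-parser.mjs` can be sketched as a lookup from type name to documentation URL. The sketch below is an assumption-heavy illustration: the `{Type}` notation, the two entries in `typeLinks`, their URLs, and the `linkTypes` helper are all chosen for the example, and the real mapping lives in `type-parser.mjs`.

```javascript
// Illustrative lookup table with example URLs; the real mapping is in
// type-parser.mjs and is considerably larger.
const typeLinks = new Map([
  ['string', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#string_type'],
  ['Buffer', 'buffer.html#class-buffer'],
]);

// Replaces `{Type}` references (an assumed notation) with HTML anchors.
// Unknown types are left untouched.
function linkTypes(text) {
  return text.replace(/\{([^}]+)\}/g, (whole, name) => {
    const url = typeLinks.get(name);
    return url ? `<a href="${url}">&lt;${name}&gt;</a>` : whole;
  });
}

console.log(linkTypes('Returns: {Buffer}'));
```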
|
||
**Note:** Other files not mentioned here might be used during the process but | ||
are not relevant to the generation of the API docs themselves. You will notice | ||
that a lot of the logic within the build process is **specific** to the current | ||
<https://nodejs.org/api/> infrastructure, such as adding JavaScript snippets | ||
and styles, transforming certain Markdown elements into HTML, and adding | ||
certain HTML classes. | ||
|
||
**Note:** Regarding the previous **Note**, we are currently working on API | ||
tooling that is generic and independent of the current nodejs.org | ||
infrastructure. | ||
[The new tooling that is functional is available at the nodejs.dev repository](https://github.com/nodejs/nodejs.dev/blob/main/scripts/syncApiDocs.js) | ||
and uses plain regular expressions (no AST) and [MDX](https://mdxjs.com/). | ||
|
||
## The Build Process | ||
|
||
The build process that happens on `generate.mjs` follows the steps below: | ||
|
||
* Links within the Markdown are replaced directly within the source Markdown | ||
(AST) (`markdown.replaceLinks`) | ||
* This happens within `markdown.mjs` and basically it adds suffixes or | ||
modifies link references within the Markdown | ||
* This is necessary for the `https://nodejs.org` infrastructure as all pages | ||
are suffixed with `.html` | ||
* Text (and some YAML) Nodes are transformed/modified through | ||
`html.preprocessText` | ||
* JSON output is generated through `json.jsonAPI` | ||
* The title of the page is inferred through `html.firstHeader` | ||
* Nodes are transformed into HTML Elements through `html.preprocessElements` | ||
* The HTML Table of Contents (ToC) is generated through `html.buildToc` | ||
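The first step in the list above (link replacement) can be pictured with the following sketch. It assumes that relative links point at `.md` files and need an `.html` suffix instead. `replaceMarkdownLinks` is a hypothetical helper, not the `markdown.replaceLinks` implementation.

```javascript
// Hypothetical sketch: rewrites relative `.md` link targets to `.html`,
// keeping any `#anchor`. Absolute URLs (containing `:`) are not matched.
function replaceMarkdownLinks(markdown) {
  return markdown.replace(
    /\]\(([^)#\s:]+)\.md(#[^)\s]*)?\)/g,
    (whole, page, hash = '') => `](${page}.html${hash})`,
  );
}

console.log(replaceMarkdownLinks('See [fs.readFile](fs.md#fsreadfile) and [Buffer](buffer.md).'));
```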
|
||
### `html.mjs` | ||
|
||
This file is responsible for AST node transformations that either decorate | ||
Markdown nodes with more data or transform them into HTML nodes that serve a | ||
specific visual purpose, for example, generating the "Added in" label, the | ||
Source Links, the Stability Index, or the History table. | ||
|
||
**Note:** Methods not listed below are either not relevant or are utility | ||
methods for string/array/object manipulation (i.e., they are used by the other | ||
methods mentioned below). | ||
|
||
#### `preprocessText` | ||
|
||
**New Tooling:** Most of the features within this method are available within | ||
the new tooling. | ||
|
||
This method does two things: | ||
|
||
* Replaces the Source Link YAML entry `<!-- source_link= -->` with a "Source | ||
  Link" HTML anchor element. | ||
* Replaces type references within the Markdown (text) (e.g. "String", "Buffer") | ||
  with the correct HTML anchor element that links to the correct documentation | ||
  page. | ||
  * The original node then gets mutated from text to HTML. | ||
  * It also updates references to Linux "MAN" pages to Web versions of them. | ||
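The Source Link replacement described above can be sketched as below. The base URL, the generated markup, and the `sourceLinkToHtml` helper are assumptions made for the example and do not reproduce what `html.mjs` emits.

```javascript
// Assumed base URL, for illustration only.
const SOURCE_BASE = 'https://github.com/nodejs/node/blob/main/';

// Turns an `html` node holding `<!-- source_link=path -->` into an `html`
// node holding an anchor. Any other node is returned unchanged.
function sourceLinkToHtml(node) {
  if (node.type !== 'html') return node;
  const match = /^<!--\s*source_link=(\S+?)\s*-->$/.exec(node.value.trim());
  if (!match) return node;
  const path = match[1];
  return {
    type: 'html',
    value: `<p><strong>Source Code:</strong> <a href="${SOURCE_BASE}${path}">${path}</a></p>`,
  };
}

console.log(sourceLinkToHtml({ type: 'html', value: '<!-- source_link=lib/fs.js -->' }).value);
```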
|
||
#### `firstHeader` | ||
|
||
**New Tooling:** All features within this method are available within the new | ||
Tooling. | ||
|
||
This method attempts to extract the first heading of the page (recursively) to | ||
define the "title" of the page. | ||
|
||
**Note:** As all API Markdown files start with a heading, this could probably | ||
be simplified. | ||
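A recursive search of this kind could look like the sketch below. `findFirstHeading` and `textOf` are hypothetical helpers operating on an mdast-shaped tree, not the code of `html.firstHeader`.

```javascript
// Returns the first `heading` node found in a depth-first search, or null.
function findFirstHeading(node) {
  if (node.type === 'heading') return node;
  for (const child of node.children ?? []) {
    const found = findFirstHeading(child);
    if (found) return found;
  }
  return null;
}

// Concatenates the text of a node and all of its descendants.
function textOf(node) {
  return node.value ?? (node.children ?? []).map(textOf).join('');
}

const page = {
  type: 'root',
  children: [
    { type: 'html', value: '<!-- introduced_in=v0.10.0 -->' },
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'File system' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Promise example' }] },
  ],
};

console.log(textOf(findFirstHeading(page)));
```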
|
||
#### `preprocessElements` | ||
|
||
**New Tooling:** All features within this method are available within the new | ||
tooling. | ||
|
||
This method is responsible for multiple transformations of the AST nodes. | ||
Most of them turn the source node into an HTML element with a specific | ||
purpose, such as: | ||
|
||
* Updating Markdown `code` blocks by adding language highlighting. | ||
  * It also adds the "CJS"/"MJS" switch to nodes that are followed by their | ||
    CJS/ESM equivalents. | ||
* Increasing the heading level of each heading. | ||
* Parsing YAML blocks and transforming them into HTML elements (see more at the | ||
  `parseYAML` method). | ||
* Updating blockquotes that are prefixed by the word "Stability" into a | ||
  Stability Index HTML element. | ||
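Two of the transformations listed above are small enough to sketch. The helpers `bumpHeading` and `parseStability` are hypothetical, and the `Stability: <number> - <label>` text format is an assumption made for the example.

```javascript
// Increases the heading level by one, capped at the HTML maximum of 6.
function bumpHeading(node) {
  return node.type === 'heading'
    ? { ...node, depth: Math.min(node.depth + 1, 6) }
    : node;
}

// Extracts the stability index from blockquote text such as
// "Stability: 2 - Stable" (assumed format). Returns null for other text.
function parseStability(text) {
  const match = /^Stability: (\d)(?: - (.*))?$/.exec(text.trim());
  return match ? { index: Number(match[1]), label: match[2] ?? '' } : null;
}

console.log(bumpHeading({ type: 'heading', depth: 2 }));
console.log(parseStability('Stability: 2 - Stable'));
```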
|
||
#### `parseYAML` | ||
|
||
**New Tooling:** Most of the features within this method are available within | ||
the new tooling. | ||
|
||
This method is responsible for parsing the `<!-- YAML snippets -->` and | ||
transforming them into HTML elements. | ||
|
||
It follows a "schema" that consists of the following options: | ||
|
||
| YAML Key | Description | Example | Example Result | Available on new tooling | | ||
| ------------- | ------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | --------------------------- | ------------------------ | | ||
| `added` | It's used to reference when a certain "module", "class" or "method" was added on Node.js | `added: v0.1.90` | `Added in: v0.1.90` | Yes | | ||
| `deprecated` | It's used to reference when a certain "module", "class" or "method" was deprecated on Node.js | `deprecated: v0.1.90` | `Deprecated since: v0.1.90` | Yes | | ||
| `removed` | It's used to reference when a certain "module", "class" or "method" was removed on Node.js | `removed: v0.1.90` | `Removed in: v0.1.90` | No | | ||
| `changes` | It's used to describe all the changes (historical ones) that happened within a certain "module", "class" or "method" in Node.js | `[{ version: v0.1.90, pr-url: '', description: '' }]` | -- | Yes | | ||
| `napiVersion` | It's used to describe in which version of the N-API this "module", "class" or "method" is available within Node.js | `napiVersion: 1` | `N-API version: 1` | Yes | | ||
|
||
**Note:** The `changes` field gets prepended with the `added`, `deprecated` and | ||
`removed` fields if they exist. The table only gets generated if a `changes` | ||
field exists. In the new tooling only "added" is prepended for now. | ||
|
||
#### `buildToc` | ||
|
||
**New Tooling:** This feature is natively available within the new tooling | ||
through MDX. | ||
|
||
This method generates the Table of Contents based on all the Headings of the | ||
Markdown file. | ||
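As a rough illustration, a table of contents can be derived from a list of headings as shown below. The sketch emits a nested Markdown list rather than the HTML produced by `html.buildToc`, and both `slugify` and `buildTocSketch` (including the slug format) are assumptions.

```javascript
// Assumed slug format: lowercase, with non-alphanumeric runs collapsed to `-`.
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-|-$/g, '');
}

// Builds a nested Markdown list, indenting each entry by its heading depth.
function buildTocSketch(headings) {
  return headings
    .map(({ depth, text }) => `${'  '.repeat(depth - 1)}* [${text}](#${slugify(text)})`)
    .join('\n');
}

console.log(buildTocSketch([
  { depth: 1, text: 'File system' },
  { depth: 2, text: 'fs.readFile(path)' },
]));
```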
|
||
#### `altDocs` | ||
|
||
**New Tooling:** All features within this method are available within the new | ||
tooling. | ||
|
||
This method generates a version picker for the current page to be shown in older | ||
versions of the API docs. | ||
|
||
### `json.mjs` | ||
|
||
This file is responsible for generating a JSON object that (supposedly) is used | ||
for IDE-Intellisense or for indexing of all the "methods", "classes", "modules", | ||
"events", "constants" and "globals" available within a certain Markdown file. | ||
|
||
It attempts a best-effort extraction of the data by using several regular | ||
expression patterns (regex). | ||
|
||
**Note:** JSON output generation is currently not supported by the new tooling, | ||
but it is in the pipeline for development. | ||
|
||
#### `jsonAPI` | ||
|
||
This method traverses all the AST nodes by iterating through each one of them | ||
and infers the kind of information each node contains through regex. Then it | ||
mutates the data and appends it to the final JSON object. | ||
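The regex-based inference described above can be illustrated with heading text. The patterns below are simplified assumptions written for this example (the real expressions in `json.mjs` are more numerous and more permissive), and `classifyHeading` is a hypothetical helper.

```javascript
// Simplified, assumed patterns; tried in order, first match wins.
const headingPatterns = [
  { type: 'class', regex: /^Class: `?([\w.]+)`?$/ },
  { type: 'event', regex: /^Event: `?'([^']+)'`?$/ },
  { type: 'method', regex: /^`?([\w.]+)\(([^)]*)\)`?$/ },
];

// Infers what kind of entry a heading describes and extracts its name.
function classifyHeading(text) {
  for (const { type, regex } of headingPatterns) {
    const match = regex.exec(text);
    if (match) return { type, name: match[1] };
  }
  return { type: 'misc', name: text };
}

console.log(classifyHeading('Class: `fs.Dir`'));
console.log(classifyHeading('`fs.readFile(path[, options], callback)`'));
```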
|
||
For more in-depth information, refer to the `json.mjs` file, which is | ||
extensively commented. |