Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: naming conventions in README #101

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 40 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,8 +56,8 @@ files:
all: # used as the prefix for the generated dependency file names for conda or requirements files (has no effect on pyproject.toml files)
output: [conda, requirements] # which dependency file types to generate. required, can be "conda", "requirements", "pyproject", "none" or a list of non-"none" values
conda_dir: conda/environments # where to put conda environment.yaml files. optional, defaults to "conda/environments"
requirements_dir: python/cudf # where to put requirements.txt files. optional, but recommended. defaults to "python"
pyproject_dir: python/cudf # where to put pyproject.toml files. optional, but recommended. defaults to "python"
requirements_dir: python/cudf # where to put requirements.txt files. optional, but recommended. Used for devcontainers. defaults to "python"
pyproject_dir: python/cudf # where to put pyproject.toml files. optional, but recommended. Used for wheel builds. defaults to "python"
matrix: # (optional) contains an arbitrary set of key/value pairs to determine which dependency files that should be generated. These values are included in the output filename.
cuda: ["11.5", "11.6"] # which CUDA version variant files to generate.
arch: [x86_64] # which architecture version variant files to generate. This value should be the result of running the `arch` command on a given machine.
Expand Down Expand Up @@ -99,6 +99,8 @@ files:

When `output: none` is used, the `conda_dir`, `requirements_dir` and `matrix` keys can be omitted. The use case for `output: none` is described in the [_Additional CLI Notes_](#additional-cli-notes) section below.

If more than one `files` key specifies the same output and *_dir, their contents are merged. This is how different sections are written to a given output file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If more than one `files` key specifies the same output and *_dir, their contents are merged. This is how different sections are written to a given output file.
If more than one `files` key specifies the same output and `*_dir`, their contents are merged. This is how different sections are written to a given output file.

I think this was meant to be read as a literal wildcard. If so, it should be inline code-formatted so another one showing up later in the line doesn't accidentally change this to italics.


#### `extras`

A given file may include an `extras` entry that may be used to provide inputs specific to a particular file type
Expand All @@ -109,17 +111,15 @@ Here is an example:
files:
build:
output: pyproject
includes: # a list of keys from the `dependencies` section which should be included in the generated files
- build
extras:
table: table_name
key: key_name
```

Currently the supported extras by file type are:
- pyproject.toml
- table: The table in pyproject.toml where the dependencies should be written. Acceptable values are "build-system", "project", and "project.optional-dependencies".
- key: The key corresponding to the dependency list in `table`. This may only be provided for the "project.optional-dependencies" table since the key name is fixed for "build-system" ("requires") and "project" ("dependencies"). Note that this implicitly prohibits including optional dependencies via an inline table under the "project" table.
- table: The table in pyproject.toml where the dependencies should be written. Acceptable values are "build-system", "project", "project.optional-dependencies", and "tool.rapids-build-backend"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically table is not limited to these four values - it can be the name of any table, though these are definitely the most common. Let's think about how we can word this one differently.

- key: The key corresponding to the dependency list in `table`. This may only be provided for the "project.optional-dependencies" or "tool.rapids-build-backend" table since the key name is fixed for "build-system" ("requires") and "project" ("dependencies"). Note that this implicitly prohibits including optional dependencies via an inline table under the "project" table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more accurate to say that key cannot be provided for build-system and project, but can be provided for all other values.


### `channels` Key

Expand Down Expand Up @@ -223,6 +223,40 @@ dependencies:
- pytest
```

## Naming conventions

With non-trivial collections of files and dependencies, names can look very related and become hard to distinguish. Some general guidelines that will help keep things more understandable:

### Patterns for [RAPIDS build backend](https://github.com/rapidsai/rapids-build-backend)

RBB builds generally include at least two sections:

```
files:
# This dependency brings in RBB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This dependency brings in RBB
# This dependency brings in RBB + the build backend it wraps (e.g. scikit-build-core, setuptools)

py_build_<something>:
output: pyproject
extras:
table: build-system # This is the thing to match with the py_build_* name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
table: build-system # This is the thing to match with the py_build_* name
table: build-system

I didn't understand the language in this comment and the one below that "this is the thing to match". Could you rephrase that, or consider dropping these inline comments?

The comments above the sections already pretty clearly describe the purpose.

includes:
- <rapids_build_skbuild>
# This dependency expresses the build-time dependencies for your recipe (what might otherwise go in [build-system] when not using RBB)
py_rapids_build_<something>:
output: pyproject
extras:
table: tool.rapids-build-backend # This is the thing to match with the py_rapids_build_* name
key: requires
includes:
- <normal_project_build_time_deps>
```

### Patterns for dependencies

To help distinguish dependency specifications from file specifications, it is recommended to prefix dependency entries
with `dep_<type>_<name>`, where type is one of `build`, `run`, or `test`. This is functionally superfluous, as the `extras.table`
value will ultimately determine where the dependency entries will go, but following the convention will make your dependencies.yaml
file easier to follow.

Comment on lines +226 to +259
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I'll let @vyasr, @jameslamb, and @bdice verify.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I've seen a dependencies.yaml across RAPIDS that has any entries beginning with a literal dep_... did you mean for that be depends_on_? I have seen depends_on_{something}.

I'd probably just remove that, in favor of gently recommending grouping dependency entries with related purposes using similar prefixes, like build_{project}, run_{project}, test_{project}. cudf and its multiple sub-projects (cudf, cudf-kafka, cudf-polars, etc.) is a great example of where this aids readability:

https://github.com/rapidsai/cudf/blob/78f4a8a3f639677358bce83a699f92c90476ae75/dependencies.yaml#L23-L30


It's also not always the case that a dependency entry is only reserved for one of build, run, or test. For example, it's common to see a matrix of cuda-version dependencies used to constrain tests, docs builds, and the conda environments checked into source control that contributors are encouraged to use to run examples.

Like this: https://github.com/rapidsai/cudf/blob/78f4a8a3f639677358bce83a699f92c90476ae75/dependencies.yaml#L377

My perspective... the most important thing being documented in this paragraph is not the typical patterns for the names...it's the fact that the names are not functionally meaningful (e.g. you could add _sparkles to the end of all of them and not break anything), other than that they're unique identifiers to be used as references elsewhere in the same dependencies.yaml. I'd restructure this paragraph to lead with that, and then maybe include a nudge about standardizing the names if you'd like.

## How Dependency Lists Are Merged

The information from the top-level `files` and `dependencies` keys are used to determine which dependencies should be included in the final output of the generated dependency files.
Expand Down
Loading