Add README and initial docs
lbianchi-lbl committed Mar 12, 2024
1 parent 4424a7e commit 1087a5c
Showing 5 changed files with 100 additions and 26 deletions.
72 changes: 46 additions & 26 deletions README.md
@@ -1,26 +1,46 @@
# Example submission

## Installation

```bash
pushd "$(mktemp -d)"
QUARTO_RELEASE="1.3.361"
INSTALL_DIR="$HOME/opt/quarto"
BIN_DIR="$HOME/.local/bin"
mkdir -p "$INSTALL_DIR" "$BIN_DIR"
wget "https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_RELEASE}/quarto-${QUARTO_RELEASE}-linux-amd64.tar.gz"
tar -C "$INSTALL_DIR" -xvzf "quarto-${QUARTO_RELEASE}-linux-amd64.tar.gz"
ln -s "${INSTALL_DIR}/quarto-${QUARTO_RELEASE}/bin/quarto" "${BIN_DIR}/quarto"
which -a quarto
quarto check
popd
```

## Compiling document

```bash
git clone https://github.com/USRSE/jupyter-notebook-templates && cd jupyter-notebook-templates
git checkout reproducible-document
ls -l
quarto render reproducible_document_template.ipynb --to html
```
# Continuous Integration (CI) resources for US-RSE'24 computational notebooks submissions

This repository contains instructions, GitHub Actions workflows, and accessory scripts to help authors test their submissions for the Computational Notebooks track at the [US-RSE'24 conference](https://USRSE.github.io/usrse24).

## In a nutshell

- We have developed an **automated workflow** to test that a repository satisfies the **requirements for submission to the US-RSE'24 notebooks track**, and that the **notebook can be run in the same standardized, self-contained environment** that will be used during the review process
- Authors **can, but are not required to, enable this workflow** to validate their repository at any stage, including while developing their notebook and/or before finalizing their submission
- To enable the workflow and start testing your repository, refer to the

## A few key questions

### What is Continuous Integration?

- Broadly speaking, Continuous Integration (CI) is a software engineering practice that helps **ensure code works as expected outside of a developer's local environment**
- Typically, CI consists of a **set of checks configured to run automatically whenever code in a repository is updated** (e.g. when new commits are pushed to a branch, a Pull Request (PR) is opened, etc.)
- **If any of the checks fail, the developer is alerted**, giving them the opportunity to **fix the issues in the code before it makes its way to its intended destination** (e.g. distributed to users, deployed to a production environment, etc.)

### What is GitHub Actions?

- GitHub Actions (GHA) is the name of GitHub's built-in tool to run automated workflows, typically (but not only) used to run CI workflows on code hosted on GitHub repositories
- GHA is available **free of charge** for all GitHub public repositories

### Why is CI needed for notebooks at the US-RSE conference?

- One of the biggest challenges with computational notebooks is **ensuring that a notebook can be run** by people other than its author(s), in different computational environments, and/or at different times after its creation, an ability sometimes known as _computational reproducibility_
- While this is a general issue affecting any context where notebooks (or indeed, any computational artifact) are used, these concerns also apply concretely to the **computational notebooks submission track** at the US-RSE conference:
  - If **reviewers** are not able to run the notebooks for the submissions they're reviewing, they'll likely be **unable to evaluate the submission** based on its full intended functionality; or, they might try to fix the issues preventing the notebook from being run (missing dependencies, incompatible versions, etc.), which results in extra work, frustration, and/or less consistency across multiple reviewers
  - Even if **authors** try their best to **provide resources for reproducing a valid computational environment** in which their submission can be run (such as documentation, packaging/environment metadata, etc.), the **lack of both an automated way to test those resources and a documented standard for the computational environment** that will be used limits their ability to validate them (and, therefore, to estimate how likely it is that their notebooks will run as expected during review) before finalizing their submission
- By providing **a set of automated checks that can run on the repository before submission**, based on the **same standardized tools, specifications, and computational environment available to reviewers**, the CI workflow addresses both of these issues, giving authors the possibility to **focus their efforts toward a concrete goal for computational reproducibility** for their US-RSE notebooks submission, hopefully only requiring a reasonable amount of extra effort

### I'm interested in submitting a notebook to US-RSE'24, but I'm not sure about this CI thing. Am I still able to submit without it?

- In one sentence: **absolutely, yes!** Using this CI workflow **is not a requirement for submission** for US-RSE'24
- **Using this CI workflow is completely optional**. Authors who choose not to enable it for their repository for any reason will not be penalized in any way, as long as their repository satisfies the mandatory requirements described in the submission instructions

### What do I have to do to enable the US-RSE notebooks CI workflow for my GitHub repository?

Refer to the [Getting Started](docs/getting-started.md) section of the [documentation](docs/) in this repository.

## Next steps

If you'd like to know more about the US-RSE'24 notebooks submissions:

- Join the `#usrse24` channel on the US-RSE Slack workspace to receive general news about the conference, as well as updates specific to the notebooks CI resources hosted in this repository
- Watch this repository to receive notifications about new versions, functionality being added, etc.
Binary file added docs/enable-actions.png
1 change: 1 addition & 0 deletions docs/faq.md
@@ -0,0 +1 @@
# Frequently Asked Questions
47 changes: 47 additions & 0 deletions docs/getting-started.md
@@ -0,0 +1,47 @@
# Getting started

## Step 1: enabling GitHub Actions in your repository

Depending on how your repository is configured, the Actions functionality might need to be enabled first, along with the permissions needed by the workflow to perform certain steps.

1. Navigate to the `Settings` tab of your repository, then select the `Actions` section from the menu on the left-hand side
2. Select the options highlighted in red in the figure below, making sure to confirm each selection by clicking on the corresponding `Save` button

![](enable-actions.png)

## Step 2: adding the workflow definition file

GitHub Actions workflows are defined by files in the `.github/workflows` directory using the YAML format. A workflow definition file contains two key components:

- The `on` section, specifying one or more triggers, i.e. the conditions under which the workflow will be run
- The `jobs` section, specifying one or more jobs, i.e. the tasks that will be executed during each workflow run

In this case, our workflow (the _caller workflow_) will contain a single job referencing the workflow `check-submission.yml` defined in this repository, i.e. `USRSE/notebooks-submissions` (the _called workflow_).

1. From a local clone of your repository, or alternatively using the GitHub web interface, create a subdirectory named `.github/workflows`
2. Inside `.github/workflows`, create a file with any name and the `.yml` extension, e.g. `.github/workflows/checks.yml`, containing the following snippet:

```yml
name: Check submission for US-RSE'24

on:
  push:  # `on.push` means that the workflow will be triggered any time one or more commits are pushed to the repository

jobs:
  check-submission:  # `check-submission` is the job-id and can be chosen arbitrarily
    uses: USRSE/notebooks-submissions/.github/workflows/check-submission.yml@v1
    with:
      notebook: my-sample-notebook.qmd  # replace my-sample-notebook.qmd with the actual path to your notebook
```
3. Change the value of the `jobs.<job-id>.with.notebook:` input field to match the actual path to your notebook, then save the `.github/workflows/checks.yml` file and stage, commit, and push the changes using Git (if editing locally); or follow the prompts and create a commit directly to the current branch (if using the GitHub web interface)
4. Navigate to the `Actions` tab of your GitHub repository. If everything is configured correctly, after a few moments, you should see a workflow run corresponding to your latest pushed commits
5. Click on that workflow run to switch to the Run summary page, where the overall status, in-progress and completed jobs, and ultimately the outcome of the various parts of the workflow are displayed.
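
For authors working from a local clone, steps 1 to 3 above can be sketched on the command line as follows. This is a minimal sketch under stated assumptions rather than the only way to proceed: `checks.yml` and `my-sample-notebook.qmd` are the placeholder names used above, and `REPO_DIR` is a hypothetical variable introduced here that falls back to a scratch directory, so the snippet can be tried without touching a real repository.

```bash
# Assumption: REPO_DIR points at a local clone of your repository.
# If unset, a throwaway directory is used so the sketch is safe to try as-is.
REPO_DIR="${REPO_DIR:-$(mktemp -d)}"
cd "$REPO_DIR"

# Create the directory where GitHub Actions looks for workflow definitions
mkdir -p .github/workflows

# Write the caller workflow; the quoted 'EOF' keeps the shell from expanding anything
cat > .github/workflows/checks.yml <<'EOF'
name: Check submission for US-RSE'24

on:
  push:

jobs:
  check-submission:
    uses: USRSE/notebooks-submissions/.github/workflows/check-submission.yml@v1
    with:
      notebook: my-sample-notebook.qmd  # replace with the actual path to your notebook
EOF

# Inside a real clone, stage, commit, and push the new file:
# git add .github/workflows/checks.yml
# git commit -m "Add US-RSE'24 submission check workflow"
# git push
```

The Git commands are left commented out because they only make sense inside an actual clone with a configured remote.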

## Step 3: configuring your repository

- After verifying that the workflow is running as expected, it's likely that you'll need to configure the repository with the _auxiliary files_ needed to set up the computational environment where the notebook will be run
- The exact list of files will depend on the specific programming language (Python, R, ...), file format (`.qmd`, `.ipynb`, `.Rmd`, ...), external dependencies, etc needed to run your notebook
- To help you choose which files should be added to your repository based on your notebook's needs, refer to the following resources:
  - The [`usrse-notebooks-sample-submission` topic on GitHub](https://github.com/topics/usrse-notebooks-sample-submission) collects a set of repositories showcasing some of the possible configurations
  - The [_Configuring your repository_ section of the repo2docker documentation](https://repo2docker.readthedocs.io/en/latest/configuration/index.html) contains a comprehensive list of files supported by the US-RSE'24 notebooks CI workflow build infrastructure, which (similarly to mybinder.org) is based on the versatile `repo2docker` tool
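
As an illustration of what such an auxiliary file might look like, the sketch below writes an `environment.yml`, one of the configuration files described in the repo2docker documentation, for a hypothetical Python-based notebook. The environment name, Python version, and package list are placeholders chosen for this example, not requirements of the US-RSE workflow, and `REPO_DIR` is an assumed variable that falls back to a scratch directory.

```bash
# Assumption: REPO_DIR points at a local clone of your repository.
# If unset, a throwaway directory is used so the sketch is safe to try as-is.
REPO_DIR="${REPO_DIR:-$(mktemp -d)}"
cd "$REPO_DIR"

# environment.yml: a conda environment specification that repo2docker can build from.
# All names and versions below are placeholders; list what your notebook actually imports.
cat > environment.yml <<'EOF'
name: notebook-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - matplotlib
EOF

# Show the resulting file for a quick visual check
cat environment.yml
```

Other languages and setups use different files (for example `requirements.txt` for pip or `install.R` for R), so the repo2docker documentation linked above remains the reference for what applies to a given notebook.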
6 changes: 6 additions & 0 deletions docs/reference.md
@@ -0,0 +1,6 @@
# Reference

## Resources

- [Configuring your repository](https://repo2docker.readthedocs.io/en/latest/configuration/index.html): comprehensive list of files supported by the repo2docker/Binder infrastructure used by the US-RSE notebooks CI workflow
- [GitHub Actions documentation](https://docs.github.com/en/actions)
