Document release process in the code base (#4933)
This PR executes on the decision made on the [3rd of December,
2024](https://docs.google.com/document/d/15bvqKLuVZ6Wp28ZExqRO3YGYi4ILlHGGfJgK2bBBdFU/edit?tab=t.0#heading=h.gjdgxs),
to store documentation alongside code.

* Create a detailed description of the release process in its own page
* Start recording some of the Advice Process decisions in the
documentation for the wallet, under /contributor/decisions tree
* Add release checklist template

#4934
abailly authored Feb 3, 2025
2 parents 40e856f + 949b38b commit cf000f4
Showing 11 changed files with 484 additions and 3 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -93,3 +93,6 @@ package.json
artifacts
deposit-*
cabal.project.freeze

# Emacs
*~
8 changes: 7 additions & 1 deletion docs/site/src/SUMMARY.md
@@ -54,8 +54,14 @@
 - [Testing](contributor/how/testing.md)
 - [Continuous Integration](contributor/how/continuous-integration.md)
 - [Release Process](contributor/how/release-process.md)
-- [Release checklist](contributor/how/release-checklist.md)
+- [Release Checklist template](contributor/how/release-checklist.md)
 - [Code Review Guidelines](contributor/how/code-review-guidelines.md)
 - [Notes](contributor/notes.md)
 - [Updating Dependencies](contributor/notes/updating-dependencies.md)
 - [Notes from upgrading GHC version](contributor/notes/notes-from-upgrading-ghc-version.md)
+- [Decisions Record](contributor/decisions.md)
+- [2024-12-03 - Store documents alongside code](contributor/decisions/2024-12-03-document-with-code.md)
+- [2024-03-23 - Release Process](contributor/decisions/2024-03-13-release-process.md)
+- [2023-07-28 - Team Workflow](contributor/decisions/2023-07-28-workflow-review.md)
+- [2023-01-27 - Continuous Integration](contributor/decisions/2023-01-27-continuous-integration.md)
+- [2022-10-04 - Document Storage](contributor/decisions/2022-10-04-document-storage.md)
3 changes: 3 additions & 0 deletions docs/site/src/contributor/decisions.md
@@ -0,0 +1,3 @@
# Decisions Record

These documents keep a timestamped record of process-related decisions.
42 changes: 42 additions & 0 deletions docs/site/src/contributor/decisions/2022-10-04-document-storage.md
@@ -0,0 +1,42 @@
# **Document Storage :: AP**

| | |
|---------|------------|
| Started | 2022-09-30 |
| Decided | 2022-10-14 |

## **Why**

We need to store our documents (decision records, design documents, and so on) somewhere, but there is a plethora of options: Google Docs, Confluence, Github, a custom wiki, …, each with its own pros and cons. This decision records our current choice and its rationale.

## **Decision**

The entry point to our documentation is the Adrestia (Haskell) Team Dashboard on Google Docs. All other documents are hyperlinked from this dashboard.

For documents that are not public, such as meeting minutes or Advice Process records, we use Google Docs.

For documents that may eventually be public, such as design documentation or release checklists, we use Rodney’s HedgeDoc instance at [https://md.adrestia.iohkdev.io/](https://md.adrestia.iohkdev.io/). (As Rodney is no longer with us, I’m actually not happy with that state of affairs. Moritz Angerman has set up another instance at [https://hd.devx.iog.io/](https://hd.devx.iog.io/).)

## **Details**

See Section “Decision”.

## **Rationale**

**Making** a **decision** — Document storage systems come with various trade-offs, and none seems to be perfect. The act of simply picking one system is at least as important as making sure that the pick is reasonable — we can’t even record our choice unless we have already made it\! Thus, I (the decision maker) have decided to favor a speedy decision over an optimal one.

What do we want out of a document storage system? Here is a list of features, in order of importance, and some thoughts as to how these are satisfied with my current decision.

**Sharing** and **Access Control** — We need to be able to share documents with team members, and maybe external parties. Also, we need to keep some types of documents internal to the company or team.

**Readability** — Documents should be pleasant on the eyes. For me, this means that they should have a rendered view, but they do not need to be WYSIWYG.

**Commenting** and **Tracking Changes** — It should be pleasant to comment on documents and track changes that different contributors made. For me, this means that it should be possible to comment on the *rendered* view. For example, markdown documents in a Github repository do not satisfy this property, as I can only see changes in the source code, and comments are also attached to the source code rather than the rendering.

**Writing format** — Documents should be easy to export to Markdown, so that we can more easily move them between different storage solutions, e.g. from HedgeDoc to our Github repositories. Unfortunately, Google Docs does not allow easy export to Markdown; this is particularly painful for documents with a lot of source code — hence we use HedgeDoc for more technical documents.

**Navigation** — Documents should be easy to **search** (“I know what I’m looking for, but I cannot find it.”) and to **discover** (“I don’t know what I’m looking for, but I find it anyway”).

In my experience, good discoverability comes from good curation. In my view, hyperlinks are the best tool for curating document listings. For example, Confluence has a hierarchical view for each space, but I still find it hard to navigate if the sections are not curated properly. The [Haskell wikibook](https://en.wikibooks.org/wiki/Haskell) uses ToC templates and manually curated tables of hyperlinks; I think it works very well.

For searching a document, curation helps, too, but a search box is usually more effective.
@@ -0,0 +1,156 @@
# **Continuous Integration**

| | |
|--------------|------------|
| Started | 2022-12-06 |
| Decided | 2023-01-27 |
| Last amended | 2023-04-04 |

## **Why**

The sudden decommissioning of Hydra forces us to revisit our Continuous Integration (CI) setup.

## **Decision**

We predominantly rely on [Buildkite](https://buildkite.com) as our CI system.

We build **artifacts** and run **checks** on them. Artifacts include compiled executables, but also the source code itself. Checks include unit and integration tests, but also source code linters. We specify our artifacts in `flake.nix`, and most of our checks, too.

We perform these builds and checks with different **granularity** — in order to keep our computing resources within reasonable limits, we don’t automatically check everything on every commit. Instead, the granularities are:

* `post-commit` \= (at most once) after each commit; for a quick sanity check after a push
* `pre-merge` \= before merging each pull request; for a reasonably complete, automated check that master will satisfy functional requirements
* `post-merge` \= after merging each pull request to master; for an exhaustive check that master satisfies functional requirements on all platforms
* `nightly` \= every night; for an exhaustive, automated check of functional and non-functional requirements

The following **table** lists our artifacts, the checks performed on them (`.` for build), the granularity at which the check is performed, and the CI system used for doing that:

| Artifact | Check | Granularity | CI System | (Status) |
| :---- | :---- | :---- | :---- | ----- |
| Source code | Code formatting style | post-commit | Buildkite | 🔵 |
| Documentation | . | post-commit | Github Action | 🔵 |
| Pull request (PR) | Mergeable to master concurrently with other PRs | pre-merge | Bors | 🔵 |
| Compiled modules | . | post-commit | Buildkite | 🔵 |
| | Unit tests (linux) | post-commit | Buildkite | 🔵 |
| | Unit tests (macos) | post-merge | Buildkite | 🔵 |
| | Unit tests (windows) | nightly | Github Action | 🔵 |
| Executables / Release archive | . (linux) | pre-merge | Buildkite | 🟡[ADP-2502](https://cardanofoundation.atlassian.net/browse/ADP-2502) |
| | . (macos) | post-merge | Buildkite | 🔵 |
| | . (windows, cross-compiled) | pre-merge | Buildkite | 🔵 |
| Executables | Integration tests (linux) | pre-merge | Buildkite | 🔵 |
| | ~~Integration tests (macos)~~ | | Buildkite | 🔴[ADP-2522](https://cardanofoundation.atlassian.net/browse/ADP-2522) |
| | ~~Integration tests (windows)~~ | | Github Action | 🔴[ADP-2517](https://cardanofoundation.atlassian.net/browse/ADP-2517) |
| | Benchmarks | nightly | Buildkite | 🔵 |
| Release archive | E2E tests | nightly | Github Action | 🔵 |
| Docker image | . | post-commit | Buildkite | 🔵 |

Legend: Status 🔵 = working; 🟡 = needs work; 🔴 = not working
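
The table above can also be read as data. The following Python sketch is illustrative only: the `Check` record and the `checks_at` helper are hypothetical names that do not exist in the repository, and only a few rows of the table are transcribed.

```python
from dataclasses import dataclass

# Granularities defined in this decision, from most to least frequent.
GRANULARITIES = ("post-commit", "pre-merge", "post-merge", "nightly")


@dataclass(frozen=True)
class Check:
    artifact: str
    check: str        # "." stands for "build the artifact"
    granularity: str
    ci_system: str


# A few rows transcribed from the table above (not the full table).
CHECKS = [
    Check("Source code", "Code formatting style", "post-commit", "Buildkite"),
    Check("Compiled modules", "Unit tests (linux)", "post-commit", "Buildkite"),
    Check("Compiled modules", "Unit tests (macos)", "post-merge", "Buildkite"),
    Check("Compiled modules", "Unit tests (windows)", "nightly", "Github Action"),
    Check("Executables", "Integration tests (linux)", "pre-merge", "Buildkite"),
    Check("Release archive", "E2E tests", "nightly", "Github Action"),
]


def checks_at(granularity: str) -> list[Check]:
    """Return the transcribed checks that run automatically at a granularity."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity}")
    return [c for c in CHECKS if c.granularity == granularity]
```

For example, `checks_at("nightly")` selects the Windows unit tests and the E2E tests from the rows listed here.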

## **Details**

### **Granularity**

* Granularity refers to **automatic** actions taken by the CI system. It should be possible to trigger a build or check manually at any time.
* The purpose of granularity is to **conserve** computing **resources** — in a world with infinite resources, the system would perform every build and check on every commit.
* The name of the granularity “**post-commit**” was chosen for brevity: the action is performed automatically on the **latest** commit after a `git push`, not on the git commits in between. In other words, the action is performed at most once per commit.
* We use the “**post-merge**” granularity for actions that
  * consume scarce resources and have a high chance of failing, e.g. builds and checks on macOS
* We use the “**nightly**” granularity for actions that
  * consume many resources, e.g. benchmarks
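
The “post-commit” rule above (only the latest commit of a push, and at most once per commit) can be sketched as follows. This is a minimal illustration, not how Buildkite schedules builds; the function name and the representation of a push as a list of commit ids are assumptions made for the example.

```python
def commits_to_check(pushes: list[list[str]]) -> list[str]:
    """Pick the commits a post-commit action would run on.

    Each push is a list of commit ids, oldest first. Only the latest
    commit of each push is selected, and no commit is selected twice.
    """
    seen: set[str] = set()
    selected: list[str] = []
    for push in pushes:
        if not push:
            continue  # an empty push triggers nothing
        head = push[-1]
        if head not in seen:  # at most once per commit
            seen.add(head)
            selected.append(head)
    return selected
```

With three pushes `["a", "b", "c"]`, `["d"]` and `["d"]` again, only `c` and `d` are selected: the intermediate commits `a` and `b` are never checked automatically.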

### **CI System**

As a general rule, we choose

* Github Actions for actions that
  * are very simple and do not require a nix store / environment
  * run on Windows
* Buildkite otherwise
  * especially for actions that require a nix store
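
Read as a decision procedure, the rule above might look like the sketch below. The function and its three boolean parameters are hypothetical and exist only to make the rule explicit; real choices may involve other considerations.

```python
def choose_ci_system(runs_on_windows: bool, needs_nix_store: bool, is_simple: bool) -> str:
    """Apply the general rule: Github Actions for Windows or simple,
    nix-free actions; Buildkite for everything else."""
    if runs_on_windows:
        return "Github Actions"
    if is_simple and not needs_nix_store:
        return "Github Actions"
    return "Buildkite"
```

Note that a cross-compiled Windows build does not count as "runs on Windows" here, since the build itself executes on a Linux agent.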

We have a **tension** where we have to set up some checks (e.g. unit tests) in two different environments due to different availability of operating systems:

* Linux, macOS — in Buildkite
* Windows — in Github Actions

We hope to address this tension by requesting a **Windows machine** for use with **Buildkite**.

### **Platform macOS**

At the time of writing, we have two mac-mini machines that act as Buildkite agents. Unfortunately, they are frequently overloaded and fail the builds or checks. Hence, we only use granularity “post-merge” or “nightly” for them.

### **Company Processes**

For developing and maintaining our CI, we may use DevX/SRE expertise from IOG.

* Our **tribe** is responsible for choosing our CI tooling
* Our **tribe** should have a process for getting DevX/SRE support
* Our tribe’s DevX/SRE resources can help teach us how to debug problems that arise
* Link to the [SRE Chapter of IOG](https://input-output.atlassian.net/wiki/spaces/CI/pages/3528785931/SRE+Chapter)

## **Rationale**

### **Artifacts and checks**

The two main concerns of a CI pipeline are: building **artifacts** and running **checks**.

The purpose of building an artifact is to produce, say, an executable or HTML. The purpose of running a check is to check that the artifact satisfies certain properties, e.g. all unit tests pass.

Different CI systems, like Hydra, Cicero, Buildkite or Github Actions, have a different focus regarding these concerns.

* The world view of Hydra is that everything is about building artifacts. Hydra was surprisingly successful as a CI tool, because this world view can be used for running checks, too: they can be expressed as trivial artifacts, where success of the check is equivalent to success of building `()`, and failure of the check is equivalent to failure of the artifact build.
* The world view of Github Actions, Buildkite or Cicero is that everything is about running checks. The drawback is that building artifacts is more difficult and we have problems managing the build cache.
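
The idea of expressing a check as a trivial artifact can be illustrated with a small sketch. This is not how Hydra or Nix are implemented; it only mirrors the idea in Python, and `check_as_artifact` is a hypothetical name.

```python
from typing import Callable


def check_as_artifact(check: Callable[[], bool]) -> Callable[[], tuple]:
    """Wrap a check as a "build" whose only output is the empty tuple ().

    The build succeeds exactly when the check passes, so a system that
    only knows how to build artifacts can still run checks.
    """
    def build() -> tuple:
        if not check():
            raise RuntimeError("build failed: check did not pass")
        return ()  # trivial artifact: carries no information beyond success
    return build
```

A passing check yields the artifact `()`, while a failing check surfaces as a failed build.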

For us, the main takeaway is that we should try to separate these concerns clearly.

Our **artifacts** include: **source code** and **compiled** **executables**. We have different **checks** on these: Linters and style checkers on the source code, unit and integration tests on the executables.

As we are coming from Hydra, compiling executables is easiest to do through a **cached nix store**. At the moment, it looks like only Buildkite has good support for that; hence we choose Buildkite.

### **Our options for CI system**

Buildkite

* Pro — Good at artifacts, working nix cache
* Pro — Good documentation, easy to write
* Con — Dependency on machine (currently provided by SRE / Samuel Leathers)
* Con — no Windows machine
* Con — Dependency on permissions (currently only SRE / Samuel Leathers has write permission)

In a pinch, the dependencies can be solved by forking the repository and providing our own machines.

Github Action

* Neutral — Good at small actions, but problems at scale
* Neutral — Good documentation, but a bit cumbersome to write
* Pro — No dependency on machine
* Pro — Windows machine
* Pro — No dependency on permissions

Cicero

* Con — Poor at artifacts, nix cache currently not working properly
* Con — Poor documentation
* Neutral — Dependency on machine (provided by SRE, but they have long-term commitment)
* Con — no Windows machine
* Pro — No dependency on permission

## **References**

\[1\] G Kim, K Behr, G Spafford; [The Phoenix Project](https://www.goodreads.com/book/show/17255186-the-phoenix-project); IT Revolution Press (2013). A business novel about the DevOps movement: make the flow of work visible and automate it, to an extreme of, say, 30 releases per day.

\[2\] [Cicero on Github](https://github.com/input-output-hk/cicero#readme)

# **Scratchbook**

## **Random Findings**

Installing Nix with the `cachix/install-nix-action` Github Action: [https://github.com/input-output-hk/cardano-node/blob/db396b163af615aa89286aa985583ef8843cfcde/.github/workflows/check-mainnet-config.yml#L16-L23](https://github.com/input-output-hk/cardano-node/blob/db396b163af615aa89286aa985583ef8843cfcde/.github/workflows/check-mainnet-config.yml#L16-L23)

## **Documentation Findings**

### **Cicero**

[Cicero](https://github.com/input-output-hk/cicero#readme) \= An *engine* for executing actions. An “action” is an arbitrary program (Bash, Python, Nix, …) that is run in the Nomad execution environment.

[Tullia](https://github.com/input-output-hk/tullia#readme) \= A domain-specific language, embedded in the Nix language, for expressing actions to be run with Cicero. This is useful when writing Cicero actions that mainly build things with Nix.
34 changes: 34 additions & 0 deletions docs/site/src/contributor/decisions/2023-07-28-workflow-review.md
@@ -0,0 +1,34 @@
# **Workflow Review**

| | |
|---------|------------|
| Started | 2023-07-07 |
| Decided | 2023-07-28 |


## **Why**

As we have now moved to the Cardano Foundation, it’s an appropriate time to review how we operate as a team and see what improvements we can implement.

## **Decision**

1. Due to the distributed nature of the team, where possible and practical we will continue to favor asynchronous working.
2. We will adopt a Kanban-esque approach to managing our workflow, therefore:
    1. Work items will be continuously prioritized in line with agreed goals and roadmap.
    2. Weekly planning & replenishment meetings will be held.
    3. We will no longer plan work into sprints or iterations.
    4. Updated meetings and ceremonies:

| Meeting / Activity | Frequency | How |
|:------------------------------|:----------------------|:----------------------------------------------------------------------------|
| Daily Updates | Daily / as applicable | Provided in slack \#hal-daily |
| Planning & Replenishment | Weekly | Friday meeting (½ hr) |
| Development Meeting | Weekly | Wednesday (1hr) |
| Backlog Refinement & Grooming | Continuously | Asynchronously, supported by breakout sessions and ad hoc team meetings. |
| Demos | Ad hoc | Scheduled upon feature completion. Will demo as part of a feature showcase. |
| Retro | Quarterly / ad hoc | Deep Dive meeting (3 hrs) |

3. How best to implement an internal team “No Meeting Day” will be determined via its own Advice Process.
4. Public team channel (\#hal-public) created to facilitate communications and collaboration within the Cardano Foundation and IOG.
5. Current ticketing system JIRA will continue to be used, subject to future Advice Process.
6. Cardano Wallet [public forum on Github](https://github.com/cardano-foundation/cardano-wallet/discussions) to be used for engaging with the community and also for announcing new releases.
24 changes: 24 additions & 0 deletions docs/site/src/contributor/decisions/2024-03-13-release-process.md
@@ -0,0 +1,24 @@
# Update Release Process

| | |
|---------|------------|
| Started | 2023-07-19 |
| Decided | 2024-03-23 |

## Why

The release v2023-07-18 of cardano-wallet highlighted the need to further automate the release process, and to clarify consistency between artifacts and tests. The release v2023-12-18 highlighted further gaps in test and dependency maintenance.

## Decision

* We continue to use **trunk-based development**, where all code is merged into the master branch frequently, such that this branch is (almost) ready to be released at any point in time.
* A **release** is a collection of artifacts that have been tested together and will be published.
* We **create** a release from a specific Git commit, marked by a **Git tag**.
* We create and **test artifacts** without human involvement using **automation**.
* We publish human-readable release notes, specifically a **changelog** and a list of **known issues**.
* We **publish** a release by clicking a button on [Github Releases](https://docs.github.com/en/repositories/releasing-projects-on-github). Artifacts are automatically pushed to other platforms.
* The release is made under **human supervision** using a **release checklist**.
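
As a small illustration of marking a release with a Git tag, the sketch below builds and parses tag names. It assumes that tags follow the `vYYYY-MM-DD` pattern seen in the two release names mentioned under “Why”; that pattern is inferred from those names, and the helper functions are hypothetical.

```python
import re
from datetime import date

# Assumed tag shape, inferred from "v2023-07-18" and "v2023-12-18".
TAG_PATTERN = re.compile(r"^v(\d{4})-(\d{2})-(\d{2})$")


def release_tag(day: date) -> str:
    """Format the tag name for a release cut on the given day."""
    return f"v{day.isoformat()}"


def parse_release_tag(tag: str) -> date:
    """Recover the release date from a tag, rejecting malformed tags."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)  # raises ValueError for impossible dates
```

Round-tripping a date through `release_tag` and `parse_release_tag` returns the same date, which is a cheap sanity check before a tag is pushed.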

## Details

See [Release process](../how/release-process.md).