Unit test guide and catch example #138

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from
File: `src/book/03-guides/02-tools/05-catch-guide.mdx` (new file, 65 additions)
---
section: Guides
chapter: Tools
title: Catch Getting Started
description: How to write your first test with Catch.
slug: /guides/tools/catch-guide
---

[Catch](https://github.com/catchorg/Catch2) is the C++ testing framework we use for our unit tests. In this guide we will cover the basics of Catch by working through a concrete example. This guide assumes familiarity with C++ and testing concepts. To brush up on testing, see the [Guide for Writing Tests](/guides/general/writing-tests).

Note that the [Catch docs](https://github.com/catchorg/Catch2/tree/devel/docs) should be the first thing to look at if you're wondering about a specific Catch-ism.

## A Basic Example

Catch test cases have a few components. The most important is the `TEST_CASE(...)` macro, which wraps a group of associated assertions. Inside each `TEST_CASE` scope, you should follow the **AAA** structure: _Arrange_ the data first - both inputs and expected outputs - then _Act_ by calling the function you're testing, then _Assert_ that the results match the ground truth. We'll build an example around the following toy utility function:

```cpp
[[nodiscard]] inline int add(const int a, const int b) noexcept {
return a + b;
}
```

To test it, we'll need pairs of inputs and their associated outputs. Usually, if there's a lot of data, we'll want to keep it separate from the test logic, but for this example we'll keep it local. Also note that for utilities like this one, we would want a much more comprehensive set of test cases.

```cpp
#include <array>
#include <catch.hpp>
#include <cstddef>
#include <utility>

using utility::math::add;

TEST_CASE("Testing integer add utility", "[utility][math][add]") {
    // Arrange
    static constexpr std::size_t NUM_TESTS = 5;
    const std::array<std::pair<int, int>, NUM_TESTS> inputs = {
        {{0, 0}, {1, 1}, {-1, -1}, {123000, 456}, {-1000, 1000}}};
    const std::array<int, NUM_TESTS> ground_truth = {0, 2, -2, 123456, 0};
    std::array<int, NUM_TESTS> outputs{};
    // Act
    for (std::size_t i = 0; i < NUM_TESTS; ++i) {
        outputs[i] = add(inputs[i].first, inputs[i].second);
    }
    // Assert
    for (std::size_t i = 0; i < NUM_TESTS; ++i) {
        INFO("In test case number " << i);
        INFO("Inputs are (" << inputs[i].first << ", " << inputs[i].second << ")");
        INFO("Ground truth is " << ground_truth[i]);
        INFO("Function output is " << outputs[i]);
        REQUIRE(outputs[i] == ground_truth[i]);
    }
}
```

### Dissecting the Example

As you can see in this example, everything happens inside the scope of the `TEST_CASE`. We use `INFO` macros to record exactly what is happening for each assertion. You shouldn't worry too much about creating huge, unwieldy logs with `INFO` macros, because by default Catch only prints `INFO` messages for assertions that fail.

The first argument to `TEST_CASE` is a string naming the test. You can use a test's name to run just that test, so names should be specific and descriptive. The second argument is a set of tags, which let you easily divide the tests into groups. We usually use each sub-namespace the function lives in as a tag. The Catch `TEST_CASE` has much more functionality than demonstrated here; you can find the detailed documentation [here](https://github.com/catchorg/Catch2/blob/devel/docs/test-cases-and-sections.md).

The rest of the example is just going through the **AAA** process (the comments are for illustrative purposes).

## Floating Point Considerations

Floating point arithmetic is imprecise by nature. Because of this imprecision, equality comparisons between distinct non-zero floating point computations will generally fail. Catch has features to deal with the error from floating point operations. If you're testing functions which compute floating point numbers, we recommend that you [read about those features](https://github.com/catchorg/Catch2/blob/devel/docs/assertions.md#floating-point-comparisons).

In short, you'll need to define a margin of error - either relative or absolute - that you can tolerate, and use it to construct an `Approx` for each floating point `REQUIRE` assertion.

## Conclusions

Catch is concise and powerful. It has many more features than the basic example presented here - this is just enough to make you dangerous. Now go and write some tests!
File: `src/book/03-guides/04-general/03-glossary.mdx` (6 additions)

[**Affine Transform**](https://en.wikipedia.org/wiki/Affine_transformation): A transform which keeps straight lines straight, and parallel lines parallel, without preserving distances or angles between lines otherwise.

[**Approximation Error**](https://en.wikipedia.org/wiki/Approximation_error): The difference between a numerical approximation and the actual value being approximated. It can be expressed either as a relative error, the ratio of the error to the actual value, or as an absolute error, which is just the magnitude of the error without reference to the actual value.

[**Arch**](https://www.archlinux.org/): A lightweight Linux distribution which the NUgus robots use as their primary operating system.

[**Armadillo (arma)**](http://arma.sourceforge.net/): The C++ linear algebra library developed by the CSIRO, previously used by NUbots. Documentation can be found [here](http://arma.sourceforge.net/docs.html).
Expand Down Expand Up @@ -76,6 +78,8 @@ slug: /guides/general/glossary

[**GitHub**](https://github.com/): A source code repository hosting service, which uses a git backend. NUbots uses GitHub to manage the source code.

[**GitHub Issue**](https://guides.github.com/features/issues/): A numbered thread associated with a GitHub repository. Issues are commonly used to track bugs and enhancements, or simply to host discussions.

[**Green Horizon**](/system/subsystems/vision#green-horizon-detector): The edge of the football field, as calculated by the vision system.

[**igus Humanoid Open Platform**](https://arxiv.org/abs/1809.11110): The 3D-printed platform on which the NUgus design is based.

**Quintic Walk**: Open loop walk engine which uses quintic splines to create trajectories. Created by Bit-Bots, based on code from team Rhoban, then ported to run with NUClear.

[**Random Seed**](https://en.wikipedia.org/wiki/Random_seed): A number which is used by a (pseudo-)random number generator to start generating random numbers. Providing a seed for a generator makes its subsequent calls consistent from run to run.

[**Rhoban**](https://www.rhoban-project.fr/): The RoboCup team from Bordeaux University, who created the original quintic walk adapted by Bit-Bots, and ported to the NUbots codebase.

[**RoboCup Symposium**](https://www.robocup.org/symposium): The research conference which is held after RoboCup each year, featuring research from the teams competing.
File: `src/book/03-guides/04-general/04-writing-tests.mdx` (new file, 60 additions)
---
section: Guides
chapter: General
title: How to Write Tests
description: Information about writing tests
slug: /guides/general/writing-tests
---

This guide presents general information about unit testing and how to write good tests.

## What is a Unit Test?

A unit test is a test of a small piece of a codebase - a unit. Unit tests are meant to exercise only that piece of code, and they serve to validate its correctness. This contrasts with integration tests, which test how pieces of code interact and behave together. For the rest of this guide, we'll work with the following toy utility function:
```cpp
[[nodiscard]] inline int add(const int a, const int b) noexcept {
return a + b;
}
```

## Anatomy of a Unit Test

The basic pieces you will need to create a unit test for a function are the following:

1. A set of inputs for the function. These should match the input parameters. In our case, we'll need a set of pairs of integers.
2. The associated outputs which you want for each input. These are often referred to as the "ground truth", because they are the true values your function should output. For our `add` example with the pair of inputs (2, 2), we want the output 4, because `add(2, 2) == 4`.

The process of running the tests is as simple as calling the function with your inputs and verifying that the function's output was as expected. In our example, one of the test cases could be running `add(2, 2)` and checking that the result is 4.

## Testing Approach and Philosophy

There are many guidelines you can find online to help you to write good tests. Here are a few:

- Tests should be simple and readable enough to be correct on inspection. You shouldn't have to puzzle over whether a test itself is correct - ideally, you can read it and know that it's legitimate.
- Make test cases independent. The outcome of one test case shouldn't affect the outcome of others.
- Demonstrate how a piece of code should be used with its tests. We can't google for examples of people using our software, so create examples with your tests.
- Tests should be deterministic - [seed your randomness](https://en.wikipedia.org/wiki/Random_seed). If you're testing something particularly reliant on randomness or which generates randomness, compensate by using a variety of seeds with many cases each.
- Follow the **AAA** structure: Arrange, Act, Assert. Each test should follow the general design of first setting up your input variables (Arrange), calling your unit with those variables (Act), then finally checking your outputs match what they should (Assert).

### General Approach

Write the easy tests first, then think about edge cases and code coverage. For `add`, you might make the `(2, 2) == 4` case first, then `(123000, 456) == 123456`. After you have some simple cases, you could consider throwing in some zeros and negative numbers - cases where the observed behaviour is different somehow.

Later, you might consider what should happen on integer overflow/wraparound, making sure that errors are handled correctly. At that stage you could also try to make a test case covering every possible branch of your code. The amount of code you execute in a set of tests is referred to as the "code coverage" of the tests.

### Regression Tests

When we find and fix a bug in the codebase, we should add a test case which makes sure the code doesn't _regress_ into the buggy behaviour. This is called a regression test. Regression tests should be labelled with comments indicating the behaviour they're watching for, and if there is a GitHub issue related to the bug, the comment should reference it. The test should be written such that it would fail before the fix and pass after it.

For a concrete example, imagine that a bug was found with `add` where if both inputs were negative, it always returned 0. Good practice would be to add a test case or small set of test cases where both inputs were negative - such as `(-1, -1) == -2` and `(-123, -456) == -579` - labelling them as regression tests for that bug. These would clearly fail if the bug came back (although this example is quite contrived, because there should have already been tests with both inputs negative).

### Black Box vs White Box Testing

When we write tests, we can make them completely ignorant of the internals of the code. Such a test cares only about the inputs and outputs, treating everything in the middle as a black box we can't see inside - this is black box testing.

White box testing looks at the internals and makes tests which depend on them. This means that any significant change to the implementation of a function which doesn't change its interface is likely to break white box tests which depended on the old function. The fragility of white box tests to change is the chief reason that black box tests are preferred.

Ideally, you can write tests as you design your software interfaces, then write the code so that it fulfills the needs of the tests. This is the basis of [**Test-Driven Development**](https://en.wikipedia.org/wiki/Test-driven_development) (TDD), a powerful means of creating high-quality software. Tests written as part of a TDD process are inherently black box tests, because the implementations they're testing don't exist yet.

Grey box testing is somewhere between black box and white box testing. It isn't completely ignorant of the implementation, but grey box tests should be easily adaptable if the implementation changes. It will usually be necessary to have some sort of insight into the implementation of the unit in order to get full code coverage, but a rule of thumb is the blacker the box, the better.