Commit

Apply changes from code review
Signed-off-by: Ahdra Merali <[email protected]>
Ahdra Merali committed Apr 16, 2024
1 parent 6a66715 commit f04b96c
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/source/tutorial/test_a_project.md
@@ -12,12 +12,12 @@ This section explains the following:
This section does not cover:

* Automating your tests - instead read our [automated testing documentation](../development/automated_testing.md).
-* More advanced features of testing, including [mocking](https://realpython.com/python-mock-library/#what-is-mocking) and [parametrizing tests](https://docs.pytest.org/en/7.1.x/example/parametrize.html).
+* More advanced features of testing, including [mocking](https://realpython.com/python-mock-library/#what-is-mocking) and [parameterising tests](https://docs.pytest.org/en/7.1.x/example/parametrize.html).

Check warning on line 15 in docs/source/tutorial/test_a_project.md

GitHub Actions / vale

[vale] docs/source/tutorial/test_a_project.md#L15

[Kedro.Spellings] Did you really mean 'parameterising'?
Raw output
{"message": "[Kedro.Spellings] Did you really mean 'parameterising'?", "location": {"path": "docs/source/tutorial/test_a_project.md", "range": {"start": {"line": 15, "column": 124}}}, "severity": "WARNING"}


## Writing tests for Kedro nodes: Unit testing

-Kedro expects nodes functions to be [pure functions](https://realpython.com/python-functional-programming/#what-is-functional-programming); a pure function is one whose output follows solely from its inputs, without any observable side effects. Testing these functions checks that a node will behave as expected - for a given set of input values, a node will produce the expected output. These tests are referred to as unit tests.
+Kedro expects node functions to be [pure functions](https://realpython.com/python-functional-programming/#what-is-functional-programming); a pure function is one whose output follows solely from its inputs, without any observable side effects. Testing these functions checks that a node will behave as expected - for a given set of input values, a node will produce the expected output. These tests are referred to as unit tests.
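
As a minimal sketch of why pure functions are easy to unit test, consider the example below. The function `scale_prices` and its test are hypothetical and written only for illustration; they are not part of Kedro or of the tutorial project. Because the output depends only on the arguments, the test needs nothing beyond a known input and the output expected for it.

```python
def scale_prices(prices: list[float], factor: float) -> list[float]:
    """Hypothetical pure function: the result depends only on the arguments."""
    # Build a new list rather than modifying `prices` in place,
    # so the caller's data is left untouched (no side effects).
    return [price * factor for price in prices]


def test_scale_prices() -> None:
    # Arrange: prepare a known input.
    prices = [10.0, 20.0]

    # Act: call the function under test.
    result = scale_prices(prices, factor=1.5)

    # Assert: the output matches, and the input was not mutated.
    assert result == [15.0, 30.0]
    assert prices == [10.0, 20.0]


test_scale_prices()
```

With pytest, the final explicit call would be unnecessary, since pytest discovers and runs functions whose names start with `test_`.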

Let us explore what this looks like in practice. Consider the node function `split_data` defined in the data science pipeline:

@@ -44,7 +44,7 @@ def split_data(data: pd.DataFrame, parameters: dict[str, Any]) -> Tuple:

</details>

-The function takes a pandas DataFrame and dictionary of parameters as input, and splits the input data into 4 different data objects as per the parameters provided. We recommend following [pytest's anatomy of a test](https://docs.pytest.org/en/7.1.x/explanation/anatomy.html#anatomy-of-a-test) which breaks a test down into 4 steps: arrange, act, assert, and cleanup. For this specific function, these steps will be:
+The function takes a pandas DataFrame and dictionary of parameters as input, and splits the input data into four different data objects as per the parameters provided. We recommend following [pytest's anatomy of a test](https://docs.pytest.org/en/7.1.x/explanation/anatomy.html#anatomy-of-a-test) which breaks a test down into four steps: arrange, act, assert, and cleanup. For this specific function, these steps will be:

Check warning on line 47 in docs/source/tutorial/test_a_project.md

GitHub Actions / vale

[vale] docs/source/tutorial/test_a_project.md#L47

[Kedro.Spellings] Did you really mean 'pytest's'?
Raw output
{"message": "[Kedro.Spellings] Did you really mean 'pytest's'?", "location": {"path": "docs/source/tutorial/test_a_project.md", "range": {"start": {"line": 47, "column": 193}}}, "severity": "WARNING"}

1. Arrange: Prepare the inputs `data` and `parameters`.
2. Act: Make a call to `split_data` and capture the outputs with `X_train`, `X_test`, `Y_train`, and `Y_test`.
@@ -178,7 +178,7 @@ def create_pipeline(**kwargs) -> Pipeline:
```
</details>

-The pipeline takes a DataFrame and dictionary of parameters as input, splits the data in accordance to the parameters, and uses it to train and evaluate a regression model. With an integration test, we can validate that this sequence of nodes runs as expected. As we did with our unit tests, we break this down into several steps:
+The pipeline takes a pandas `DataFrame` and dictionary of parameters as input, splits the data in accordance to the parameters, and uses it to train and evaluate a regression model. With an integration test, we can validate that this sequence of nodes runs as expected. As we did with our unit tests, we break this down into several steps:

Check warning on line 181 in docs/source/tutorial/test_a_project.md

GitHub Actions / vale

[vale] docs/source/tutorial/test_a_project.md#L181

[Kedro.toowordy] 'evaluate' is too wordy
Raw output
{"message": "[Kedro.toowordy] 'evaluate' is too wordy", "location": {"path": "docs/source/tutorial/test_a_project.md", "range": {"start": {"line": 181, "column": 154}}}, "severity": "WARNING"}

Check warning on line 181 in docs/source/tutorial/test_a_project.md

GitHub Actions / vale

[vale] docs/source/tutorial/test_a_project.md#L181

[Kedro.toowordy] 'validate' is too wordy
Raw output
{"message": "[Kedro.toowordy] 'validate' is too wordy", "location": {"path": "docs/source/tutorial/test_a_project.md", "range": {"start": {"line": 181, "column": 216}}}, "severity": "WARNING"}

1. Arrange: Prepare the runner and its inputs `pipeline` and `catalog`.
2. Act: Run the pipeline.
0 comments on commit f04b96c