Grouping in tests

Certain tests accept an optional group_by_columns argument, which runs the check separately within each group instead of once across the whole table. This can be useful when:

  • Some data checks can only be expressed within a group (e.g. ID values should be unique within a group but can be repeated between groups)
  • Some data checks are more precise when done by group (e.g. not only should table rowcounts be equal but the counts within each group should be equal)
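As a rough sketch of the second case, a grouped row-count comparison might look like the following. The model names `orders` and `stg_orders` and the column `order_date` are hypothetical placeholders, not names taken from this project:

```yaml
models:
  - name: orders                           # hypothetical model under test
    tests:
      - dbt_utils.equal_rowcount:
          compare_model: ref('stg_orders') # hypothetical model to compare against
          # Compare row counts per order_date, not just the table totals
          group_by_columns: ['order_date']
```

Without group_by_columns, this test would only compare the total row counts, so a surplus of rows in one group could mask a shortfall in another.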

This feature is currently available for the following tests:

  • equal_rowcount()
  • fewer_rows_than()
  • recency()
  • at_least_one()
  • not_constant()
  • sequential_values()
  • non_null_proportion()

To use this feature, pass the names of the grouping columns as a list. For example, to test for at least one valid value within each group, use the group_by_columns argument as follows:

  - name: data_test_at_least_one
    columns:
      - name: field
        tests:
          - dbt_utils.at_least_one:
              group_by_columns: ['customer_segment']
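
Because the argument is a list, it should also be possible to group by more than one column. The sketch below assumes a hypothetical model `events` with hypothetical columns `status`, `customer_segment`, and `region`, and uses not_constant from the list of supported tests:

```yaml
models:
  - name: events                # hypothetical model name
    columns:
      - name: status            # hypothetical column under test
        tests:
          - dbt_utils.not_constant:
              # Each (customer_segment, region) pair is treated as its own group
              group_by_columns: ['customer_segment', 'region']
```

Here the check is applied to each combination of the listed columns, so finer groupings mean stricter tests.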