From b3ab685864e4d79c37a9446abc1a037f91cb5b55 Mon Sep 17 00:00:00 2001
From: Stan Brubaker <120737309+stanbrub@users.noreply.github.com>
Date: Fri, 17 Jan 2025 17:29:38 -0700
Subject: [PATCH] Update TestingConcepts.md

---
 docs/TestingConcepts.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TestingConcepts.md b/docs/TestingConcepts.md
index 38f9a28..7605ba9 100644
--- a/docs/TestingConcepts.md
+++ b/docs/TestingConcepts.md
@@ -11,7 +11,7 @@ The *Bench* API uses the builder pattern to guide the test writer in generating
 Repeating tests can be useful for testing the effects of caching (e.g. load file multiple times; is it faster on subsequent loads?), or overcoming a lack of precision in OS timers (e.g. run a fast function many times and average), or average out variability between runs (there are always anomalies). On the other hand, if the context of the test is processing large data sets, then it's better to measure against large data sets where possible. This provides a benchmark test that's closer to the real thing when it comes to memory consumption, garbage collection, thread usage, and JIT optimizations. Repeating tests, though useful in some scenarios, can have the effect of taking the operation under test out of the benchmark equation because of cached results, resets for each iteration, limited heap usage, or smaller data sets that are too uniform.
 
 ### Adjust Scale For Each Test
-When measuring a full set of benchmarks for transforming data, some benchmarks will naturally be faster than others (e.g. sums vs joins). Running all benchmarks at the same scale (e.g. 10 million rows) could yield results where one benchmark takes a minute and another takes 100 milliseconds. Is the 100 ms test meaningful, especially when measured in a JVM? Not really, because there is no time to assess the impact of JVM ergonomics or the effect of OS background tasks. A better way is to set scale multipliers to amplify row count for tests that need it. 
+When measuring a full set of benchmarks for transforming data, some benchmarks will naturally be faster than others (e.g. sums vs joins). Running all benchmarks at the same scale (e.g. 10 million rows) could yield results where one benchmark takes a minute and another takes 100 milliseconds. Is the 100 ms test meaningful, especially when measured in a JVM? Not really, because there is no time to assess the impact of JVM ergonomics or the effect of OS background tasks. A better way is to set scale multipliers to amplify row count for tests that need it and aim for a meaningful test duration.
 
 ### Test-centric Design
 Want to know what tables and operations the test uses? Go to the test. Want to know what the framework is doing behind the scenes? Step through the test. Want to run one or more tests? Start from the test rather than configuring an external tool and deploying to that. Let the framework handle the hard part. The point is that a benchmark test against a remote server should be as easy and clear to write as a unit test. As far as is possible, data generation should be defined in the same place it's used... in the test.