Benchmarking framework for Python to measure the system load and execution time of functions.
PySysLoadBench is a package that lets you benchmark functions with a special focus on system metrics: CPU load and RAM utilization, in addition to execution time. From these measurements, statistics such as the mean, maximum, standard deviation and various percentiles are generated. CPU and RAM utilization are sampled every 0.05 seconds. CPU and RAM used by child processes created by your benchmarked function are also measured.
For additional isolation and more precise results, a new process is created for every run. The individual rounds of a run, however, are executed in the same process. This allows imports and global variables defined in a setup function to be reused later, and it lets you (de)activate garbage collection depending on your priorities (see `gc_active` in Parameters). Garbage collection is also triggered before every round of `benchmark` to allow for more reproducible and accurate RAM measurements.

Be aware that the measurement process is created using `spawn` and not `fork`, since forking processes can lead to issues if CUDA is used in the benchmarked functions or anywhere in the parent process. Afterwards, the start method is set back to the default for UNIX systems, `fork`, to avoid unexpected behavior if new processes are created in the benchmarked function.
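The following is a minimal sketch of the general technique (explicitly spawning the worker while the rest of the program keeps its default start method), using only the standard `multiprocessing` module; it is an illustration, not PySysLoadBench's actual implementation:

```python
import multiprocessing as mp

def worker():
    # runs in a freshly spawned interpreter, so no state is inherited via fork
    print('running in a spawned process')

if __name__ == '__main__':
    ctx = mp.get_context('spawn')   # explicit spawn context instead of changing the global default
    p = ctx.Process(target=worker)
    p.start()
    p.join()
```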
Install the package using pip:

```
pip install git+https://github.com/Jan-Hoc/PySysLoadBench.git
```
You have two options for using this benchmark. The first is the `Run` class, with which you can benchmark functions, retrieve the calculated statistics, display the CPU, RAM and timing results in the console, and generate graphs for the individual runs. The other option is the `Benchmark` class. It lets you gather several runs, return their statistics, save them (including some system information) to a JSON file, and generate graphs comparing the results of the different runs. See the examples below for their use.
When performing a run, whether through `Run` or `Benchmark`, you can set the following parameters:
Parameter Name | Type(s) | Default | Description |
---|---|---|---|
`name` | `str` | X | Name of the run; must be unique for the `Benchmark` or `Run` object |
`benchmark` | `Callable` | X | Function which should be benchmarked; must take the arguments passed through `kwargs` |
`setup` | `Callable`, `None` | `None` | Function called once at the start of the run; its execution is not measured; must take the arguments passed through `kwargs` |
`prerun` | `Callable`, `None` | `None` | Function called before every execution of `benchmark`; its execution is not measured; must take the arguments passed through `kwargs` |
`rounds` | `int` | `1` | Number of rounds for which `benchmark` is executed; more rounds give more reliable results but take more time |
`warmup_rounds` | `int` | `0` | Number of unmeasured warmup rounds at the beginning of the run; these rounds are executed after `setup` |
`gc_active` | `bool` | `True` | Set to `False` to deactivate garbage collection during the single executions of `benchmark` for more reproducible time measurements; keeping it active gives more realistic system load metrics |
`kwargs` | `dict`, `None` | `None` | Dictionary of arguments passed to `benchmark`, `setup` and `prerun` |
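As an illustration, a run that sets every parameter from the table could look roughly like the sketch below. It assumes the `Run.benchmark_run` call style used in the usage examples further down and that `setup`, `prerun` and `gc_active` can be passed as keyword arguments; the function and run names are made up for this example:

```python
from sysloadbench import Run

def my_setup(**kwargs):
    # one-time, unmeasured initialization for the whole run
    pass

def my_prerun(**kwargs):
    # unmeasured preparation before every measured round
    pass

def my_benchmark(x: int):
    sum(range(x))

r = Run()
r.benchmark_run('full_parameter_example',  # name: must be unique per Run/Benchmark object
                my_benchmark,              # benchmark: the measured function
                setup=my_setup,
                prerun=my_prerun,
                rounds=50,
                warmup_rounds=5,
                gc_active=False,           # trade realistic load metrics for reproducible timings
                kwargs={'x': 1_000_000})
```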
For timing, CPU and RAM, the maximum value, the mean, the standard deviation and the 25th, 50th, 75th, 90th, 95th and 99th percentiles are calculated over the different rounds.
Additionally, these metrics are also returned/saved for the individual rounds of the CPU and RAM measurements.
The metrics are also displayed in graphs: a separate graph for timing, CPU load and RAM utilization, for every run.
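The exact key layout of the returned statistics is not documented here, but since they are also saved as JSON, a quick way to explore them is to pretty-print the dictionary returned by `run_statistics()` (a small sketch, reusing the hypothetical run from the previous example):

```python
import json

stats = r.run_statistics('full_parameter_example')
print(json.dumps(stats, indent=2, default=str))  # default=str in case some values are not JSON-serializable
```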
- Do not run other programs on your system during benchmarking to allow for the most accurate results
- Use `setup` for any initializations which should not be measured and only need to be executed once
- Use `prerun` for any setup or cleanup which should be done for `benchmark` but should not be measured, e.g. resetting or initializing global variables (see the sketch after this list)
- Since garbage collection is called before every run of `benchmark`, make sure to delete unused variables in `setup` and `prerun` to benefit from more accurate RAM measurements
- Be aware that, e.g. due to caching, the first round might take longer than the following ones. Keep this in mind when deciding whether to use warmup rounds
- Run the benchmark for multiple rounds to average out potential inconsistencies
- Make sure the benchmarked function does not complete too quickly, since otherwise the system metrics will be inaccurate due to the sampling interval
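To illustrate the `prerun` tip above, here is a small sketch in which a global buffer is reset before every measured round so that leftovers from the previous round do not distort the RAM measurements (the names and the keyword-argument call style are assumptions for this example, not part of the documented API):

```python
from sysloadbench import Run

data = []  # global state used by the benchmarked function

def reset_buffer(**kwargs):
    global data
    data = []  # reset outside the measurement, before every round

def fill_buffer(n: int):
    for i in range(n):
        data.append(i)

r = Run()
r.benchmark_run('prerun_example', fill_buffer, prerun=reset_buffer,
                rounds=20, kwargs={'n': 500_000})
```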
Example using the `Run` class:

```python
from sysloadbench import Run

def example_func(x: int):
    l = []
    for i in range(x):
        l.append(i)

args = {'x': 1000000}

r = Run()
r.benchmark_run('run_example_func', example_func, rounds=100, warmup_rounds=2, kwargs=args)

# return statistics of the run
stats = r.run_statistics('run_example_func')

# generate graphs for statistics
r.create_graphs('run_example_func', './benchmark_graphs')
```
Example using the `Benchmark` class:

```python
from sysloadbench import Benchmark

def example_func_1(x: int):
    l = []
    for i in range(x):
        l.append(i)

def example_setup_2():
    global np  # make the import visible to example_func_2
    import numpy as np

def example_func_2(x: int):
    l = np.array([])
    for i in range(x):
        l = np.append(l, i)

args = {'x': 1000000}

b = Benchmark('Test Benchmark')

# example 1 without a setup function
b.add_run('Run example_func_1', example_func_1, rounds=100, warmup_rounds=2, kwargs=args)

# retrieve statistics of just that run
stats_1 = b.run_statistics('Run example_func_1')

# example 2 including a setup function
b.add_run('Run example_func_2', example_func_2, example_setup_2, rounds=100, warmup_rounds=2, kwargs=args)

# retrieve statistics of all runs
stats_all = b.statistics()

# save statistics and graphs to Path.cwd()/sysloadbench_results/<benchmark name>
b.save_results()
```
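If you want to inspect what `save_results()` wrote, a small sketch like the following lists the generated files; it only assumes the output directory named in the comment above (with the benchmark name as the subdirectory) and makes no claims about the file names inside it:

```python
from pathlib import Path

results_dir = Path.cwd() / 'sysloadbench_results' / 'Test Benchmark'
for path in sorted(results_dir.rglob('*')):
    print(path.relative_to(results_dir))
```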