Given an instrumented target application and grammar, SlackLine uses our grammar module to annotate the grammar with a cost. It then uses the grammar to generate derivations bound by some pre-defined length (budget). SlackLine uses input splicing and MCTS to drive the search toward more expensive inputs.
-
Finding Short Slow Inputs Faster with Grammar-Based Search (pre-print-copy). Please use the reference below when citing the paper.
Ziyad Alsaeed and Michal Young. 2023. Finding Short Slow Inputs Faster with Grammar-Based Search. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA ’23), July 17–21, 2023, Seattle, WA, USA. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3597926.3598118
-
TreeLine and SlackLine: Grammar-Based Performance Fuzzing on Coffee Break (pre-print-copy). Please use the reference below when citing the paper.
Ziyad Alsaeed and Michal Young. 2023. TreeLine and SlackLine: Grammar-Based Performance Fuzzing on Coffee Break. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA ’23), July 17–21, 2023, Seattle, WA, USA. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3597926.3604925
Following Nautilus, our mutator can "splice" a previously generated subtree at any non-terminal in the grammar. These previously generated subtrees are kept in the "chunk_store", following the chunkstore structure in Nautilus. Like Nautilus, we record previously generated trees (actually just hashes of the strings they generate) so we don't waste time testing the same string over and over.
Key differences include
- No bells and whistles. No havoc. There are only three ways to
get a new string:
- Splice a previously generated subtree into the current tree
- Replace any subtree at a non-terminal with a randomly generated subtree
- Generate a new random tree from the root (really a special case of the second approach)
- Length control: This is critical for performance fuzzing, and not for finding other kinds of bugs. We never generate a string that is longer than a fixed limit, which can be in characters or in tokens. But we also do not minimize inputs as Nautilus and AFL do: We want strings that are pretty close to the limit.
-
Cloning this repository:
The provided repository has submodules. Therefore, you have to clone it appropriately (e.g., use the
--recurse-submodules
flag). -
Build the Docker image:
The back end instrumented execution of an application must take place in a Docker container. This must be built once. When building the docker image, you have two options. First, building it from this repo. Second, building it from the docker repository.
-
Option 1: Build from Dockerfile
The command below uses the Dockerfile given to build the image from scratch. Note that it will take a long time (~10-20 min) as we build AFL and each target application.
docker build -t slackline-img:latest .
-
Option 2: Build from Docker Repository
The command below simply uses our docker repo to download a built image. Please note that the rest of the documentation assumes you follow Option-1. Thus, you should adjust the image and container naming when following this option to avoid any naming conflict.
docker pull zalsaeed/slackline
-
-
Run a new container
After it has been built, it can be started in Docker, like this
docker run -p 2300:2300 --name slackline -it slackline-img /bin/bash
This publishes port
2300
, which can then be reached either within the Docker container or from the host machine (e.g., from an Intel-based Mac laptop for testing). But before we can test input generation for a particular application, we need an instrumented version of that application running under the test harness in the Docker container. For example, to experiment with an instrumented version of GraphViz, we need to build and run the instrumented version of GraphViz. The build process is described within shell script for each established target application (this is the GraphViz build script). -
Run the AFL listener for a target application:
Using the commands provided as sample on each target application README file (wf, libxml, lunasvg, graphviz, flex), run the AFL listener for that target app.
afl-socket -i /home/slackline/target_apps/graphviz/inputs/ -o /home/results/graphviz-001 -p -N 500 -d dot
Note that the instrumented harness is stateful (it remembers the coverage and performance records that have been observed), so a fresh experiment requires quitting the harness and restarting it in the Docker console for the container.
-
Run SlackLine's algorithm:
To run the search process you have two options.
-
Option 1: Run from your local machine.
Run slackline.py with the configuration you want (see defaults.yaml) form your local machine. This means that you are responsible for all python's dependencies.
python3 slackline.py
-
Option 2: Use the same container to run the slackline.py.
Open another bash screen on the same container you have up and running.
docker exec -it slackline /bin/bash
Then navigate to
/home/slackline/src
and run SlackLine as you would on your local machine.python3 slackline.py
Note that the defaults.yaml configuration files is set to run GraphViz for one hour. You should change the configurations according to your run goal.
-
A five seconds run on GraphViz, could generate something similar to the directory list in the sample below
(default /tmp/slackline/<experiment-id>
).
Note we removed some of the generated inputs here for brevity.
.
|-- list
| |-- id:00000001-cost:0000050030-exec:00000001-hs:1808-crtime:1684322423777-dur:11+cost
| |-- ...
| |-- id:00000098-cost:0000107852-exec:00000146-hs:1808-crtime:1684322424940-dur:1174+cov
| |-- id:00000099-cost:0000095064-exec:00000150-hs:1808-crtime:1684322424962-dur:1196+quant
| |-- id:00000100-cost:0000830775-exec:00000151-hs:17757-crtime:1684322424979-dur:1213+cost
| |-- id:00000101-cost:0001006458-exec:00000152-hs:22176-crtime:1684322424999-dur:1233+cost
| |-- ...
| `-- id:00000251-cost:0000246459-exec:00000558-hs:2664-crtime:1684322428775-dur:5009+cov
|-- report.txt
|-- settings.yaml
`-- summary.txt
The output can be described as the following:
- list/: Is the directory where all the generated inputs are saved. For each input we track a set of data points that we
use for naming it. Note we only save interesting inputs (+cov, +quant, +cost).
id
: The input sequence.cost
: The total number of edge hits that the input exercised.exec
: The number of execution from the beginning of the run until the generation of this input.hs
(hot-spot): the count of edge hits for the edge that was hit the most.crtime
: Creation time of the input.dur
(duration): The time it toke to generate the given input since the beginning of th run in milliseconds.+cov
: The input exercised some new coverage.+quant
: The input is in the top quantile (tdigest ranking)+cost
: The input exercised some new cost.
- report.txt: A full report of the search progress.
- settings.yaml: The exact configurations used for running this experiment in case we want to regenerate it.
- summary.txt: The run high-level observations (see sample below).
The most expensive input can be found by sorting the input based on cost
or find the last input that
exercised a +cost
. In the case of the five seconds run there would not be enough time to find an actual
expensive input. Nevertheless, below we show the content of the input named
id:00000101-cost:0001006458-exec:00000152-hs:22176-crtime:1684322424999-dur:1233+cost
.
strict digraph 5{q,i:s,b->_:_,7:nwq,D:s,b,z:s,8,_:s,b,6:s}
In addition, here is a sample of the summary.txt file content. It briefly describes the characteristics of the search attempt.
*** Summary of search ***
Results logged to /tmp/slackline/app:graphviz-gram:parser-based-crtime:1684322423/list
261 nodes on search frontier
5 sweeps of frontier
1_006_458 highest execution cost encountered
194 (34.8%) occurrences new coverage (AFL bucketed criterion)
9 (1.6%) occurrences new max count on an edge (AFL mod in TreeLine and PerfFuzz)
21 (3.8%) occurrences new max of edges executed (measure of execution cost)
27 (4.8%) retained for being relatively costly
---
607 attempts to generate a mutant
463 (76.3%) spliced hybrids
144 (23.7%) random generation
607 mutants generated by splicing OR randomly expanding a node
49 (8.1%) stale mutants (duplicated previously generated string)
558 (91.9%) valid (not duplicate) mutants submitted for execution
---
193 (41.7%) progress from splicing
58 (40.3%) progress from random subtree generation
---
To collect the all the inputs information in an easy to handle CSV file, you could use the collect_costs.py script. The script provides an easy-to-use bulk data collection process. However, be aware that it could take a long time if the majority of the inputs are expensive (timing out). To run the script you only need to specify the directory where the experiment(s) are saved and the location to the instrumented binary file. For our example, it could look like this.
python3 collect_costs.py /tmp/slackline/ dot
The results will be saved to the same directory, and it will be similar to the CSV sample below.
id,exec,crtime,dur,mtime(seconds),size(byte),cost,hotspot,coverage
000001,00000000001,1685176283422,0000000062,1685176283.422196,28,63048,1806,3937
000002,00000000002,1685176283443,0000000083,1685176283.4431958,21,51421,1806,2426
000003,00000000003,1685176283449,0000000089,1685176283.448196,21,51455,1806,2435
000004,00000000004,1685176283453,0000000093,1685176283.454196,29,51519,1806,2439
...
SlackLine (and its companion TreeLine), are built modularly, allowing for a complete change in the search strategy. The basic functionalities like running inputs and reading and annotating grammar are in separate modules. One can either extend the search strategy of SlackLine or replace SlackLine as a whole.
All the dependencies are managed by the docker file provided. However, a major requirements for building and running SlackLine is to build it on x86 processor. This is required for AFL's instrumentation to work.
-
Cores of Linux as a host:
If you host the docker image on a Linux distribution, make sure
core_pattern
is set tocore
in the host machine, as given in the command below.sudo sysctl -w kernel.core_pattern="core"
Otherwise, AFL will complain with an error similar to the one below.
[*] Checking core_pattern... [-] Hmm, your system is configured to send core dump notifications to an external utility. This will cause issues: there will be an extended delay between stumbling upon a crash and having this information relayed to the fuzzer via the standard waitpid() API. To avoid having crashes misinterpreted as timeouts, please log in as root and temporarily modify /proc/sys/kernel/core_pattern, like so: echo core >/proc/sys/kernel/core_pattern