From 680163261427ed341094e0cc47f7bb8ef3cff3f4 Mon Sep 17 00:00:00 2001
From: Jakob Nybo Nissen
Date: Sun, 21 Mar 2021 11:30:12 +0100
Subject: [PATCH] Add docs for :simd generator

---
 docs/src/index.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/src/index.md b/docs/src/index.md
index 6ec323c4..b48a6f76 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -385,4 +385,9 @@ elseif cs < 0
 end
 ```
 
-Automa.jl has three kinds of code generators. The first and default one uses two lookup tables to pick up the next state and the actions for the current state and input. The second one expands these lookup tables into a series of if-else branches. The third one is based on `@goto` jumps. These three code generators are named as `:table`, `:inline`, and `:goto`, respectively. To sepcify a code generator, you can pass the `code=:table|:inline|:goto` argument to `Automa.generate_exec_code`. The generated code size and its runtime speed highly depends on the machine and actions. However, as a rule of thumb, the code size and the runtime speed follow this order (i.e. `:table` will generates the smallest but the slowest code while `:goto` will the largest but the fastest). Also, specifying `check=false` turns off bounds checking while executing and often improves the runtime performance slightly.
+Automa.jl has four kinds of code generators. The first and default one uses two lookup tables to look up the next state and the actions for the current state and input. The second one expands these lookup tables into a series of if-else branches. The third one is based on `@goto` jumps. The fourth one is identical to the third, except that it uses SIMD operations where applicable. These four code generators are named `:table`, `:inline`, `:goto`, and `:simd`, respectively. To specify a code generator, you can pass the `code=:table|:inline|:goto|:simd` argument to `Automa.generate_exec_code`. The generated code size and its runtime speed depend heavily on the machine and actions. However, as a rule of thumb, both the code size and the runtime speed increase in this order (i.e. `:table` generates the smallest but slowest code, while `:simd` generates the largest but fastest). Also, specifying `checkbounds=false` turns off bounds checking during execution and often improves the runtime performance slightly.
+
+Note that the `:simd` generator has several requirements:
+* First, `checkbounds=false` must be set
+* Second, `loopunroll` must be the default `0` (as loops are SIMD unrolled)
+* Third, `getbyte` must be the default `Base.getindex`