Merge pull request #66 from jakobnissen/simddocs
Add docs for :simd generator
jakobnissen authored Mar 21, 2021
2 parents 5282d7e + 6801632 commit d1b047e
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion docs/src/index.md
@@ -385,4 +385,9 @@ elseif cs < 0
end
```

Automa.jl has three kinds of code generators. The first and default one uses two lookup tables to pick the next state and the actions for the current state and input. The second one expands these lookup tables into a series of if-else branches. The third one is based on `@goto` jumps. These three code generators are named `:table`, `:inline`, and `:goto`, respectively. To specify a code generator, you can pass the `code=:table|:inline|:goto` argument to `Automa.generate_exec_code`. The generated code size and its runtime speed depend heavily on the machine and actions. However, as a rule of thumb, the code size and the runtime speed follow this order (i.e. `:table` generates the smallest but slowest code, while `:goto` generates the largest but fastest). Also, specifying `check=false` turns off bounds checking while executing and often improves the runtime performance slightly.
Automa.jl has four kinds of code generators. The first and default one uses two lookup tables to pick the next state and the actions for the current state and input. The second one expands these lookup tables into a series of if-else branches. The third one is based on `@goto` jumps. The fourth one is identical to the third one, except that it uses SIMD operations where applicable. These four code generators are named `:table`, `:inline`, `:goto`, and `:simd`, respectively. To specify a code generator, you can pass the `code=:table|:inline|:goto|:simd` argument to `Automa.generate_exec_code`. The generated code size and its runtime speed depend heavily on the machine and actions. However, as a rule of thumb, the code size and the runtime speed follow this order (i.e. `:table` generates the smallest but slowest code, while `:simd` generates the largest but fastest). Also, specifying `checkbounds=false` turns off bounds checking while executing and often improves the runtime performance slightly.

Note that the `:simd` generator has several requirements:
* First, `checkbounds=false` must be set
* Second, `loopunroll` must be the default `0` (as loops are SIMD unrolled)
* Third, `getbyte` must be the default `Base.getindex`
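
To make the difference between the table and inline strategies concrete, here is a minimal hand-written sketch. It is not the code Automa.jl generates: the toy machine (accepting the regex `a+b`), the `TRANSITIONS` table, and the functions `run_table` and `run_inline` are all hypothetical names chosen for illustration. The `@goto` and SIMD variants are not shown.

```julia
# Hypothetical toy machine accepting `a+b`; NOT Automa.jl's generated code.
# States: 1 = start, 2 = seen one or more 'a', 3 = accepted, 0 = error.

# Table strategy: the next state is looked up by (input byte, current state).
const TRANSITIONS = zeros(Int, 256, 3)      # every entry defaults to the error state 0
TRANSITIONS[Int(UInt8('a')) + 1, 1] = 2     # start  --a--> seen_a
TRANSITIONS[Int(UInt8('a')) + 1, 2] = 2     # seen_a --a--> seen_a
TRANSITIONS[Int(UInt8('b')) + 1, 2] = 3     # seen_a --b--> accepted

function run_table(data::AbstractVector{UInt8})
    cs = 1
    for byte in data
        cs = TRANSITIONS[Int(byte) + 1, cs]  # one table lookup per input byte
        cs == 0 && return false
    end
    return cs == 3
end

# Inline strategy: the same transitions written out as if-else branches.
function run_inline(data::AbstractVector{UInt8})
    cs = 1
    for byte in data
        if cs == 1
            cs = byte == UInt8('a') ? 2 : 0
        elseif cs == 2
            if byte == UInt8('a')
                cs = 2
            elseif byte == UInt8('b')
                cs = 3
            else
                cs = 0
            end
        else
            cs = 0                           # no transitions leave the accepted state
        end
        cs == 0 && return false
    end
    return cs == 3
end
```

Both functions accept the same inputs, for example `run_table(codeunits("aab"))` and `run_inline(codeunits("aab"))`. The table version keeps the code small at the cost of a memory lookup per byte, while the inline version trades larger code for branch-only transitions, which mirrors the size and speed ordering described above.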
