0.17.2

github-actions released this 19 Sep 10:14

Added

  • Obscure loop optimisation (#1110).

  • Faster matrix transposition in C backend.

  • Library code generated with CUDA backend can now be called from
    multiple threads.

  • Better optimisation of concatenations of array literals and
    replicates.

  • Array creation C API functions now accept const pointers.

  • Arrays can now be indexed (but not sliced) with any signed integer
    type (#1122).

  • Added --list-devices command to OpenCL binaries (#1131).

  • Added --help command to C, CUDA and OpenCL binaries (#1131).

Removed

  • The integer modules no longer contain iota and replicate
    functions. The top-level ones still exist.

  • The size module type has been removed from the prelude.

Changed

  • Range literals may no longer be produced from unsigned integers.

Fixed

  • Entry points with names that are not valid C (or Python)
    identifiers are now reported as problematic, rather than
    producing invalid C code.

  • Exotic tiling bug (#1112).

  • Missing synchronisation for in-place updates at group level.

  • Fixed (in a hacky way) an issue where reduce_by_index would use
    too much local memory on AMD GPUs when using the OpenCL backend.
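
The first fix in the list above concerns entry point names that cannot be used as C or Python identifiers. The sketch below illustrates what such a check might look like. It is not the compiler's actual implementation: the helper names are hypothetical, and it assumes that a C identifier consists of ASCII letters, digits and underscores, does not start with a digit, and is not a C99 keyword. A name such as step' is legal in Futhark (identifiers may contain apostrophes) but would fail this check.

```python
import keyword
import re

# Assumption: ASCII-only C identifiers (letters, digits, underscores,
# not starting with a digit). Real C compilers may accept more.
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# C99 keywords; a name matching one of these cannot be a function name.
_C_KEYWORDS = frozenset({
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while", "_Bool", "_Complex",
    "_Imaginary",
})


def is_valid_c_identifier(name):
    """Return True if name could be used as a C identifier (hypothetical helper)."""
    return _C_IDENTIFIER.fullmatch(name) is not None and name not in _C_KEYWORDS


def is_valid_python_identifier(name):
    """Return True if name could be used as a Python identifier (hypothetical helper)."""
    return name.isidentifier() and not keyword.iskeyword(name)


def problematic_entry_points(names):
    """Return the names that are unusable in C or in Python, in input order."""
    return [
        n for n in names
        if not (is_valid_c_identifier(n) and is_valid_python_identifier(n))
    ]


if __name__ == "__main__":
    candidates = ["main", "step'", "while", "lambda", "dot_product"]
    for name in problematic_entry_points(candidates):
        print("problematic entry point name:", name)
```

Checking both languages at once is the conservative choice here, since a name such as lambda is fine in C but reserved in Python.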