Releases: diku-dk/futhark
nightly
0.25.27
Added
- Improved reverse-mode AD of `scan` with complicated operators. Work by Peter Adema and Sophus Valentin Willumsgaard.
Fixed
- `futhark eval`: any errors in the provided .fut file would cause a "file not found" error message.
- Handling of module-dependent size expressions in type abbreviations (#2209).
- A `let`-bound size would mistakenly be in scope of the bound expression (#2210).
- An overzealous floating-point simplification rule.
- Corrected AD of `x**y` where `x==0` (#2216).
- `futhark fmt`: correct file name in parse errors.
- A bug in the "sink" optimisation pass could cause compiler crashes.
- Compile errors with newer versions of `ispc`.
0.25.26
Fixed
- `futhark pkg`: fixed parsing of Git timestamps in Z time zone.
- GPU backends did not handle array constants correctly in some cases.
- `futhark fmt`: do not throw away doc comments for `local` definitions.
- `futhark fmt`: improve formatting of value specs.
- `futhark fmt`: add `--check` option.
0.25.25
Added
- Improvements to `futhark fmt`.
Fixed
- Sizes that go out of scope due to use of higher order functions will now work in more cases by adding existentials. (#2193)
- Tracing inside AD operators with the interpreter now prints values properly.
- Compiled and interpreted code now have the same treatment of inclusive ranges with start==end and negative step size, e.g. `1..0...1` produces `[1]` rather than an invalid range error.
- Inconsistent handling of types in lambda lifting (#2197).
- Invalid primal results from `vjp2` in interpreter (#2199).
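The inclusive-range entry above can be pictured with a small model. This is only a sketch in Python, not Futhark's implementation: the function name `inclusive_range` is hypothetical, and the error conditions it checks are assumptions made for illustration. The one behaviour taken from the changelog is that `1..0...1` (start 1, second element 0, end 1) yields `[1]`.

```python
def inclusive_range(start: int, second: int, end: int) -> list[int]:
    """Model of an inclusive range written start..second...end.

    The step is implied by the first two elements (second - start).
    """
    step = second - start
    if step == 0:
        # Assumption: a zero step is treated as an invalid range.
        raise ValueError("invalid range: zero step")
    # Assumption: an end on the wrong side of start is an invalid range.
    if (step > 0 and end < start) or (step < 0 and end > start):
        raise ValueError("invalid range: end is unreachable")
    result = []
    x = start
    # Walk from start towards end, including end itself when it is hit.
    while (step > 0 and x <= end) or (step < 0 and x >= end):
        result.append(x)
        x += step
    return result


single = inclusive_range(1, 0, 1)     # start == end, negative step
descending = inclusive_range(5, 3, 0)  # ordinary descending range
```

In this model, start == end with a negative step is not an error case: the loop admits the start element once and then stops, which is the single-element result the entry describes.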
0.25.24
Added
- `futhark doc` now produces better (and stable) anchor IDs.
- `futhark profile` now supports multiple JSON files.
- `futhark fmt`, by William Due and Therese Lyngby.
- Lambdas can now be passed as the last argument to a function application.
Fixed
- Negation of floating-point positive zero now produces a negative zero.
- Necessary inlining of functions used inside AD constructs.
- A compile time regression for programs that used higher order functions very aggressively.
- Uniqueness bug related to slice simplification.
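The negative-zero entry above concerns IEEE 754 signed zeros. The changelog does not say what caused the old behaviour, so the sketch below (in Python, with a hypothetical helper `is_negative_zero`) only shows why the distinction is observable and one common way it gets lost: computing negation as subtraction from zero.

```python
import math


def is_negative_zero(x: float) -> bool:
    """True only for -0.0; +0.0 and -0.0 compare equal, so inspect the sign."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


positive_zero = 0.0

# True negation flips the sign bit, giving -0.0.
negated = -positive_zero

# Subtraction from zero gives +0.0 under the default rounding mode,
# so it is not a faithful replacement for negation.
subtracted = 0.0 - positive_zero
```

Both results compare equal to `0.0`, so the difference only shows up through sign-sensitive operations such as `copysign` or division.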
0.25.23
Added
- Trailing commas are now allowed for arrays, records, and tuples in the textual value format and in FutharkScript.
- Faster floating-point atomics with OpenCL backend on AMD and NVIDIA GPUs. This affects histogram workloads.
- AD is now supported by the interpreter (thanks to Marcus Jensen).
Fixed
- Some instances of invalid copy removal. (Again.)
- An issue related to entry points with nontrivial sizes in their arguments, where the entry points were also used as normal functions elsewhere. (#2184)
0.25.22
Added
- `futhark script` now supports an `-f` option.
- `futhark script` now supports the builtin procedure `$store`.
Removed
Changed
Fixed
- An error in tuning file validation.
- Constant folding for loops that produce floating point results could result in different numerical behaviour.
- Compiler crash in memory short circuiting (#2176).
0.25.21
Added
- Logging now prints more GPU information on context initialisation.
- GPU cache size can now be configured (tuning param: `default_cache`).
- GPU shared memory can now be configured (tuning param: `default_shared_memory`).
- GPU register capacity can now be configured.
- `futhark script` now accepts a `-b` option for producing binary output.
Fixed
- Type names for element types of array indexing functions in C interface are now often better - although there are still cases where you end up with hashed names. (#2172)
- In some cases, GPU failures would not be reported properly if a previous failure was pending.
- `auto output` didn't work if the `.fut` file did not have any path components.
- Improved detection of malformed tuning files.
0.25.20
Added
- Better error message when in-place updates fail at runtime due to a
shape mismatch.
Fixed
- `#[unroll]` on an outer loop now no longer causes unrolling of all loops nested inside the loop body.
- Obscure issue related to replications of constants in complex intrablock kernels.
- Interpreter no longer crashes on attributes in patterns.
- Fixes to array indexing through C API when using GPU backends.
0.25.19
Added
- The compiler now does slightly less aggressive inlining. Use the `#[inline]` attribute if you want to force inlining of some function.
- Arrays of opaque types now support indexing through the C API. Arrays of records can also be constructed. (#2082)
Fixed
- The `opencl` backend now always passes `-cl-fp32-correctly-rounded-divide-sqrt` to the kernel compiler, in order to match CUDA and HIP behaviour.