Let me preface this by saying that I am a scientist, specifically a physicist and planetary scientist, so my main domain in the world of programming is scientific computation. In my field there are a lot of long calculations that are highly vectorizable and often benefit massively from parallelization. I know this discussion was started before, but it did not lead anywhere, and that was nearly 4 years ago, so I believe we could discuss the merits of this once more.
Let me outline my ideas with a simple example from scientific computing, in this case from astronomy. A common occurrence in observational astronomy is that you have a raw image from an observation (consider this a simple 2D array where each value represents the number of photons detected at a sensor pixel). These raw images need to be corrected for errors, which is done by applying so-called flat field and dark current corrections (there are more, but for simplicity I will leave it at these two). This correction is usually rather straightforward and can be written as science = (raw - dark)/(flat - dark), where raw, dark and flat are 2D arrays (images) of the same dimensions.
The equation is obviously highly parallelizable, as it consists of simple elementwise subtraction and division, which can be dispatched to multiple cores or, even better, GPUs. Now let us take a look at how other languages handle these vectorized calculations.
First, the wisdom of the old: the FORmula TRANslator, Fortran, which was made exactly for such calculations.
Program Astro
    real, dimension(3,3) :: flat, dark, raw, science
    dark = reshape((/1,1,1,1,1,1,1,1,1/), shape(dark))
    flat = reshape((/2,2,2,2,2,2,2,2,2/), shape(flat))
    raw = reshape((/2,2,2,2,100,2,2,2,2/), shape(raw))
    ! In Fortran, elementwise calculations are automatically supported on arrays of the same shape.
    science = (raw - dark)/(flat - dark)
end Program Astro
The new kid on the block: Julia.
dark = [1 1 1; 1 1 1; 1 1 1]
flat = [2 2 2; 2 2 2; 2 2 2]
raw = [2 2 2; 2 100 2; 2 2 2]
# Julia's dot syntax broadcasts an operator elementwise over arrays of the same
# dimensions; it also allows broadcasting 1D vectors along 2D arrays.
science = (raw .- dark) ./ (flat .- dark)
Python, and the insane numerical stack developed on top of it.
import numpy as np

dark = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
flat = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2]])
raw = np.array([[2, 2, 2], [2, 100, 2], [2, 2, 2]])
# NumPy vectorizes these elementwise operations automatically.
science = (raw - dark) / (flat - dark)
These are the potential implementations. I added Python in this case even though NumPy is an external library, because it is the de facto standard for the domain of scientific computing.
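For contrast, here is roughly what the same correction has to look like in V today. This is only a minimal sketch, assuming the images are plain [][]f64 arrays; the correct function is hypothetical and not taken from any existing V library.

fn correct(raw [][]f64, dark [][]f64, flat [][]f64) [][]f64 {
    // Explicit nested loops: the elementwise intent is buried in the indexing.
    mut science := [][]f64{len: raw.len, init: []f64{len: raw[0].len}}
    for i in 0 .. raw.len {
        for j in 0 .. raw[i].len {
            science[i][j] = (raw[i][j] - dark[i][j]) / (flat[i][j] - dark[i][j])
        }
    }
    return science
}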
I would personally be in favor of adding such SIMD intrinsics. Let me outline the advantages this would bring:
Advantages:
• It is syntactically different from just using the operator on an array, so there is less potential for confusion.
• As mentioned in the docs on operator overloading, “Operator overloading goes against V's philosophy of simplicity and predictability.” If we have a shorthand notation for these SIMD calculations, we know immediately from the syntax that the operators work elementwise on the values in the array, instead of on some custom data structure we would first have to know about (see the sketch after this list).
• As also mentioned there, it is simply easier to read and write a + b instead of a.add(b).
• If we add SIMD syntax, it becomes easier to optimize numeric code.
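To make the second point concrete, here is a minimal, hypothetical sketch of the overloading situation: with operator overloading the semantics of - live in the type's definition, which the reader has to look up, whereas a dedicated elementwise operator would carry its meaning at the call site. The Image struct here is invented purely for illustration.

struct Image {
    data [][]f64
}

// The meaning of `img_a - img_b` is whatever this definition says; the call
// site alone does not tell you it is an elementwise subtraction.
fn (a Image) - (b Image) Image {
    mut data := [][]f64{len: a.data.len, init: []f64{len: a.data[0].len}}
    for i in 0 .. a.data.len {
        for j in 0 .. a.data[i].len {
            data[i][j] = a.data[i][j] - b.data[i][j]
        }
    }
    return Image{data: data}
}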
While I am in favor of adding a syntax like this, I am willing to acknowledge its demerits:
Disadvantages:
• It is additional syntax that is, technically speaking, not necessary, since the same result can be achieved with plain loops (as in the V sketch above).
• An argument has been made that SIMD intrinsics don’t actually help performance
My personal take: I would propose something similar to Julia's dot syntax. Not necessarily in full form, since Julia's dot syntax also allows broadcasting over a function with func.(x), which might become confusing, but at least for operators, e.g. x .+ y (sketched below). I would like to hear other people's opinions on this as well, and to discuss the implementation, advantages, disadvantages and everything else.
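To make the proposal concrete, this is how the correction from the examples above might read under such a syntax. This is purely illustrative, hypothetical V, not valid code today:

// Hypothetical: each dot operator applies elementwise over arrays of the same shape.
science := (raw .- dark) ./ (flat .- dark)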
I hope that we have a nice discussion and can all be civil ^^