Let me preface this by saying that I am a scientist, specifically a physicist and planetary scientist, so my main domain in the world of programming is scientific computation. In my field there are a lot of long calculations that are highly vectorizable and often benefit massively from parallelization. I know this discussion was started before, but it did not lead anywhere, and that was nearly 4 years ago, so I believe we could discuss the merits of this once more.
Let me outline my ideas with a simple example from scientific computing, in this case from astronomy. A common occurrence in observational astronomy is that you have a raw image from an observation (consider this a simple 2D array where each value represents the number of photons detected at a sensor pixel). These raw images need to be corrected for errors, which is done by applying so-called flat field and dark current corrections (there are more, but for simplicity I will leave it at these two). This correction is usually rather straightforward and can be written as science = (raw - dark)/(flat - dark), where raw, dark and flat are 2D arrays (images) of the same dimensions.
The equation is obviously highly parallelizable, as it consists of simple elementwise subtraction and division, which can be dispatched to multiple cores or, even better, GPUs. Now let us take a look at how other languages handle these vectorized calculations.
First, the wisdom of the old: the FORmula TRANslator, Fortran, which was made exactly for such calculations.
Program Astro
    real, dimension(3,3) :: flat, dark, raw, science
    dark = reshape((/1,1,1,1,1,1,1,1,1/), shape(dark))
    flat = reshape((/2,2,2,2,2,2,2,2,2/), shape(flat))
    raw = reshape((/2,2,2,2,100,2,2,2,2/), shape(raw))
    ! In Fortran, elementwise calculations are automatically supported on arrays of the same shape.
    science = (raw - dark)/(flat - dark)
end Program Astro
The new kid on the block: Julia.
dark = [1 1 1; 1 1 1; 1 1 1]
flat = [2 2 2; 2 2 2; 2 2 2]
raw = [2 2 2; 2 100 2; 2 2 2]
# Julia's dot syntax broadcasts an operator elementwise over arrays of the same
# dimensions; it also allows broadcasting 1D vectors along 2D arrays.
science = (raw .- dark) ./ (flat .- dark)
Python, and the insane numerical stack developed on top of it.
import numpy as np

dark = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
flat = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2]])
raw = np.array([[2, 2, 2], [2, 100, 2], [2, 2, 2]])
# NumPy vectorizes these elementwise operations automatically.
science = (raw - dark) / (flat - dark)
These are the potential implementations. I added Python in this case even though NumPy is an external library, because it is the de facto standard for the domain of scientific computing.
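For contrast, here is roughly what the same correction has to look like in V today. This is only a minimal sketch, assuming the images are plain [][]f64 arrays; the correct function is hypothetical and not taken from any existing V library.

fn correct(raw [][]f64, dark [][]f64, flat [][]f64) [][]f64 {
    // Explicit nested loops: the elementwise intent is buried in the indexing.
    mut science := [][]f64{len: raw.len, init: []f64{len: raw[0].len}}
    for i in 0 .. raw.len {
        for j in 0 .. raw[i].len {
            science[i][j] = (raw[i][j] - dark[i][j]) / (flat[i][j] - dark[i][j])
        }
    }
    return science
}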
I would personally be in favor of adding such SIMD intrinsics. Let me outline the advantages this would bring:
Advantages:
• It is syntactically different from just using the operator on an array, so there is less potential for confusion.
• As mentioned in the docs on operator overloading, “Operator overloading goes against V's philosophy of simplicity and predictability.” If we have a shorthand notation for these SIMD calculations, we know immediately from the syntax that the operators work elementwise on the values in the array, instead of on some custom data structure we would first have to know about (see the sketch after this list).
• As also mentioned there, it is simply easier to read and write a + b instead of a.add(b).
• If we add SIMD syntax, it becomes easier to optimize numeric code.
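To make the second point concrete, here is a minimal, hypothetical sketch of the overloading situation: with operator overloading the semantics of - live in the type's definition, which the reader has to look up, whereas a dedicated elementwise operator would carry its meaning at the call site. The Image struct here is invented purely for illustration.

struct Image {
    data [][]f64
}

// The meaning of `img_a - img_b` is whatever this definition says; the call
// site alone does not tell you it is an elementwise subtraction.
fn (a Image) - (b Image) Image {
    mut data := [][]f64{len: a.data.len, init: []f64{len: a.data[0].len}}
    for i in 0 .. a.data.len {
        for j in 0 .. a.data[i].len {
            data[i][j] = a.data[i][j] - b.data[i][j]
        }
    }
    return Image{data: data}
}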
While I am in favor of adding a syntax like this, I am willing to acknowledge its demerits:
Disadvantages:
• It is additional syntax that is, technically speaking, not necessary, since the same result can be achieved with plain loops (as in the V sketch above).
• An argument has been made that SIMD intrinsics don’t actually help performance
My personal take: I would propose something similar to Julia's dot syntax. Not necessarily in full form, since Julia's dot syntax also allows broadcasting over a function with func.(x), which might become confusing, but at least for operators, e.g. x .+ y (sketched below). I would like to hear other people's opinions on this as well, and to discuss the implementation, advantages, disadvantages and everything else.
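To make the proposal concrete, this is how the correction from the examples above might read under such a syntax. This is purely illustrative, hypothetical V, not valid code today:

// Hypothetical: each dot operator applies elementwise over arrays of the same shape.
science := (raw .- dark) ./ (flat .- dark)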
I hope that we have a nice discussion and can all be civil ^^