relaxed i8x16.swizzle #22
On PPC, On z/Arch there is |
This instruction is straightforward and was used as an example motivator for the relaxed-simd proposal itself. One question that comes to mind, though, is the mechanism for enabling it. I think this has been discussed before, but how would we be expected to enable specific instructions to be their relaxed version while others remain unrelaxed? |
We will not enable an existing instruction to be executed in a relaxed manner. The relaxed instruction will be a completely new instruction with a different opcode. |
Yes, so then you could have a module that has both swizzle and relaxed swizzle instructions? What I am wondering then is if I am writing code in C that is auto-vectorized for example, is there expected to be a way to specify this to the compiler that's targeting Wasm? |
Yup that is possible.
Not at the moment. Maybe we can introduce an Emscripten flag to do this, similar to the |
Yes, a flag makes sense. In fact, I imagine that with proper dependence analysis the compiler could figure out whether it is safe to use the relaxed version of an instruction. Perhaps that should even be a criterion, or at least part of the motivation, when proposing a relaxed instruction: that with a compiler flag giving permission and proper analysis, a compiler could determine when it is safe to generate the relaxed version. |
Good idea, but likely not possible in the most general case. E.g. if the swizzle depends on a mutable global/imported value, |
I don't expect that compiler would be able to generate either the normal |
Note: vtbl is not available on ARM v8-M MVE AFAICT. |
RISC-V V has |
For Power, this would likely require vperm with a shift left on the selection vector (vperm uses bits 3:7 of each byte of the selection vector); it will then select modulo 16. |
relaxed i8x16.swizzle
relaxed i8x16.swizzle(a, s) selects lanes from a using indices in s. Indices in the range [0,15] select the i-th element of a; the result for any out-of-range index (i.e. an index in [16,255]) is implementation-defined.
Implementation examples for x86-64 and ARM64, plus a reference implementation in terms of 128-bit Wasm SIMD:
x86/64, pshufb: out-of-range indices return different results: 0 if the top bit of the index is set, otherwise the i % 16-th element.
ARM/ARM64, vtbl and tbl: out-of-range indices return 0.
RISC-V V, vrgather.vv a, b: out-of-range indices return 0 (assuming VEW set to 8, LMUL set to 1, VLEN set to 128, so VLMAX = 16).
Simd128, i8x16.swizzle.
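To make the per-architecture differences concrete, here is a minimal scalar sketch in Python of the lane selection described above. The function names (`swizzle_simd128`, `swizzle_pshufb`, `swizzle_tbl`) and the sample vectors are hypothetical and chosen only for illustration; this models the documented out-of-range behavior lane by lane and is not how any engine actually lowers the instruction.

```python
# Scalar models of 16-lane byte swizzles. All names are illustrative only.

def swizzle_simd128(a, s):
    """Wasm SIMD i8x16.swizzle: any index >= 16 yields 0."""
    return [a[i] if i < 16 else 0 for i in s]

def swizzle_pshufb(a, s):
    """x86 pshufb: 0 if the index's top bit is set, else a[i % 16]."""
    return [0 if (i & 0x80) else a[i & 0x0F] for i in s]

def swizzle_tbl(a, s):
    """ARM tbl with a single 16-byte table: any index >= 16 yields 0."""
    return [a[i] if i < 16 else 0 for i in s]

# Sample input: 16 distinct byte values and a mix of in- and out-of-range indices.
a = list(range(100, 116))
s = [0, 1, 15, 16, 17, 31, 127, 128, 200, 255, 3, 2, 1, 0, 5, 4]

x86 = swizzle_pshufb(a, s)
arm = swizzle_tbl(a, s)
ref = swizzle_simd128(a, s)

# Lanes whose index is in [0, 15] agree on every model.
in_range_agree = all(
    x86[k] == arm[k] == ref[k] for k, i in enumerate(s) if i < 16
)

# Lanes where the pshufb model and the tbl model disagree:
# indices in [16, 127], where pshufb wraps modulo 16 and tbl returns 0.
differing_lanes = [k for k in range(16) if x86[k] != arm[k]]
```

Under this model, a relaxed swizzle may return either the `swizzle_pshufb` or the `swizzle_tbl` result for the differing lanes, so portable code can rely only on lanes whose index is in range.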
Difference between x86/64 and ARM/ARM64
Swizzle is quite a common operation, e.g. used in multiple places in meshoptimizer.