Skip to content
This repository was archived by the owner on Jun 26, 2020. It is now read-only.

Add 8x16 shift legalizations for x86 SIMD #1346

Closed
wants to merge 2 commits into from

Conversation

abrown
Copy link
Collaborator

@abrown abrown commented Jan 14, 2020

  • This has not been discussed in an issue.
  • A short description of what this does, why it is needed: I did not implement the i8x16 shifts previously because there was discussion about whether they should be included in the spec (Inefficient x64 codegen for 8x16 shifts WebAssembly/simd#117). I still feel they should not be included due to the large overhead they impose on x86 but I am (rather unwillingly) implementing them here so we can get additional performance measurements.
  • This PR contains test cases, if meaningful.
  • A reviewer from the core maintainer team has been assigned for this PR.

…ing an 8x16 legalization

Though hardly an optimal instruction to use, the `i8x16.ishl` lowering is necessary because of an insistence in the SIMD spec committee that the instruction is desirable for performance on non-x86 platforms and Wasm instruction symmetry (see WebAssembly/simd#117). This legalization uses a 128-byte table of vectors to mask off the parts of the i8x16 vector that are unaffected by a i16x8 shift.
let shift_index = pos.ins().bitcast(I64X2, arg1);
let bitcast_vector = pos.ins().raw_bitcast(I16X8, arg0);
let shifted_vector = pos.ins().x86_psll(bitcast_vector, shift_index);
let mask_base = pos.ins().iconst(I64, 42); // FIXME this needs to be the base pointer of the constant table
Copy link
Collaborator Author

@abrown abrown Jan 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunfishcode, @bnjbvr: I don't know how to get a pointer to the table of constants but I need something like that in order to load the mask. Is there a different way to do this? I looked at binemit.rs::const_disp4 and I don't have sink (present there) when I'm here (doing legalizations).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunfishcode, here's my current thinking on this:

  • I considered whether having some sort of symbol (unrelated to globals) in the CL IR like $func.rodata would make this work; if that symbol could have a known address at compile time then I could use that and constant offsets to get at runtime-determined constants.
  • But can the address of that symbol be known at compile time? I think it can if it is function-related (not global) and relative to the PC: 1) once we know how long the function will be, calculate all of the offsets of jump tables, constants, and (now) symbols, and 2) use functions like jt_disp4, const_disp4, and (now) symbol_disp4 in binemit.rs to pull out those calculated offsets when we emit the code.
  • Looking at this from distance, we would be doing the same type of thing three times and the code is fragile due to the distance between when the offsets are assigned (relax_branches), referenced in code (binemit.rs), and actually populated with data (emit_function). Perhaps we add an abstraction to Function like RelativeOffsets that would map any of these things (jump table, constant, symbol) to their offset; it would only be usable after some new counting pass (removing that logic from relax_branches), would be queryable when encoding instructions and would know how to lay out the actual data in emit_function

@alexcrichton
Copy link
Member

Thanks for the PR again, and as a procedural note the Cranelift repository has now merged into the wasmtime repository.

PRs are no longer landing in this repository, and unfortunately there's no "one button" solution to move a PR to the wasmtime repository. A script has been prepared, however, to assist you in transferring this PR to the wasmtime repo. Feel free to reach out on Zulip with any questions!

@abrown abrown closed this Mar 28, 2020
@abrown abrown deleted the add-shift-8x16 branch March 28, 2020 00:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants