-
-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S390x: add preliminary support for SystemZ #4810
base: master
Are you sure you want to change the base?
Conversation
Thank you for bringing it to our attention. We will check internally and confirm. |
Many thanks for working on the SystemZ port! I'll have a look at the details. |
I tried to build this on s390x on Fedora rawhide, using gdc as a bootstrap compiler and ran into the following error:
Any ideas what's going on? |
8ccb0d2
to
496476a
Compare
This is now fixed. |
Thanks! With this, I was able to build an initial version of ldc (using gdc), but when I try to build ldc with itself, it now fails with the following:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing SystemZ support! I'm not familiar with D at all, but I tried to review this from a perspective of compliance with the system ABI. This looks good to me for the most part, see inline comments for some issues / questions.
|
||
struct StructSimpleFlattenRewrite : BaseBitcastABIRewrite { | ||
LLType *type(Type *ty) override { | ||
const size_t type_size = size(ty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the C ABI, a struct containing just a single float or double member is passed like a plain float or double, i.e. possibly in a floating-point register. I don't see this being handled anywhere here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I forgot to mention one similar case, sorry: a struct containing just a single element of vector type (or recursively another such struct), is passed like a plain vector type (i.e. in vector registers).
} | ||
if (t->ty == TY::Tint128 || t->ty == TY::Tcomplex80) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector types of size up to 16 should be passed in vector registers, but only when compiling for an architecture that supports vector registers in the first place (i.e. z13 and above). Older machines use another ABI where vector types are always passed via reference. Do you intend to support both ABIs, or do you plan to simply require z13 or later (either in general, or whenever vector types are used)? That may be a reasonable choice at this point, but I guess you should make sure that the machine type / features are set up accordingly for the LLVM back-end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector types of size up to 16 should be passed in vector registers, but only when compiling for an architecture that supports vector registers in the first place (i.e. z13 and above). Older machines use another ABI where vector types are always passed via reference. Do you intend to support both ABIs, or do you plan to simply require z13 or later (either in general, or whenever vector types are used)? That may be a reasonable choice at this point, but I guess you should make sure that the machine type / features are set up accordingly for the LLVM back-end.
I don't think in D, and there is an easy way to construct TY::Tint128
(you can use core.int128.Cent
, but that type is { i64, i64 }
in D ABI). Even if we lower it to int128
and pass by reference, LLVM will still correctly pass it using vector registers (see https://godbolt.org/z/a8xEEfezz).
For TY::Tcomplex80
, this lowers to fp128
and will automatically be handled by LLVM according to the -mcpu
values passed to LLVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector types of size up to 16 should be passed in vector registers, but only when compiling for an architecture that supports vector registers in the first place (i.e. z13 and above). Older machines use another ABI where vector types are always passed via reference. Do you intend to support both ABIs, or do you plan to simply require z13 or later (either in general, or whenever vector types are used)? That may be a reasonable choice at this point, but I guess you should make sure that the machine type / features are set up accordingly for the LLVM back-end.
I don't think in D, and there is an easy way to construct
TY::Tint128
(you can usecore.int128.Cent
, but that type is{ i64, i64 }
in D ABI). Even if we lower it toint128
and pass by reference, LLVM will still correctly pass it using vector registers (see https://godbolt.org/z/a8xEEfezz).
int128
is always passed via reference, also in your godbolt example (it uses vector registers temporarily to set up the value, but the actual argument is passed at 160(%r15)
, with %r2
pointing to that address.
For
TY::Tcomplex80
, this lowers tofp128
and will automatically be handled by LLVM according to the-mcpu
values passed to LLVM.
fp128
is always passed in a pair of floating-point registers, no matter what -mcpu
.
What I was refering to applies solely to actual vector types (of size up to 16). Those are passed via reference on pre-z13 machines, and in vector registers on z13 and later.
} | ||
// "A struct or union of any other size, a complex type, an __int128, a long | ||
// double, a _Decimal128, or a vector whose size exceeds 16 bytes" | ||
if (size(t) > 16 || t->iscomplex() || t->isimaginary()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by the size <=8 check above vs. the size > 16 check here. What about structs with sizes in between the two? They should be passed by reference - I'm not sure if this is what this code does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any struct
objects that are between 8 to 16 bytes would be passed by reference (will be determined by DtoIsInMemoryOnly
).
{ | ||
// Arg is passed in one register | ||
alias T1 = U[0]; | ||
static if (is(T1 == double) || is(T1 == float)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, single-element float/double structs might need to be handled here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now somewhat addressed in the compiler, where single-element float/double structs are rewritten to float/double in va_args
before passing to this code.
{ | ||
TypeInfo arg1, arg2; | ||
if (!ti.argTypes(arg1, arg2)) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm not familiar with how "runtime" va_arg works in D - is this when the type itself it not known at compile-time? I notice this function doesn't appear to handle floating-point types at all anywhere. Can those never occur here?
("sd $fp, %0") : "=m" (regs[9]); | ||
("sd $ra, %0") : "=m" (sp); | ||
("sd $fp, %0") : "=m" (regs[9]); | ||
("sd $sp, %0") : "=m" (sp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated to SystemZ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this hunk when the pull request is cleaned up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does appear to be correct. You can also submit as separate PR through the github web interface. Thnx.
.cfi_def_cfa_offset 384 | ||
/* store the (optional) backchain data */ | ||
stg %r1, 0(%r15) | ||
aghi %r1, -64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looks like you're accessing the stack below r15 here. This is unsafe on SystemZ - we do not have any "red zone", i.e. all memory below r15 may be overwritten at any time by a signal handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed now
Wrt. SystemZ CI, we apparently could
|
Yes, that would definitely be an option - that is the intended purpose of those VMs we make available. As an alternative, some projects go the route of running SystemZ tests under qemu (see e.g. https://github.com/bytecodealliance/wasmtime/blob/main/.github/workflows/main.yml). For performance reasons, in this case it might be preferable to run the compiler natively (as cross-compiler) and only run the resulting test cases in qemu. |
ecea763
to
c0cd3a2
Compare
Is that a task you are willing to pick up? Thanks! |
using namespace dmd; | ||
|
||
struct SimpleHardfloatRewrite : ABIRewrite { | ||
Type *getFirstFieldType(Type *ty) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the C ABI, this is applied recursively: a struct with a single element that is a struct with a single element of floating-point type is also passed like a float, etc.
|
||
struct StructSimpleFlattenRewrite : BaseBitcastABIRewrite { | ||
LLType *type(Type *ty) override { | ||
const size_t type_size = size(ty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I forgot to mention one similar case, sorry: a struct containing just a single element of vector type (or recursively another such struct), is passed like a plain vector type (i.e. in vector registers).
} | ||
if (t->ty == TY::Tint128 || t->ty == TY::Tcomplex80) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector types of size up to 16 should be passed in vector registers, but only when compiling for an architecture that supports vector registers in the first place (i.e. z13 and above). Older machines use another ABI where vector types are always passed via reference. Do you intend to support both ABIs, or do you plan to simply require z13 or later (either in general, or whenever vector types are used)? That may be a reasonable choice at this point, but I guess you should make sure that the machine type / features are set up accordingly for the LLVM back-end.
I don't think in D, and there is an easy way to construct
TY::Tint128
(you can usecore.int128.Cent
, but that type is{ i64, i64 }
in D ABI). Even if we lower it toint128
and pass by reference, LLVM will still correctly pass it using vector registers (see https://godbolt.org/z/a8xEEfezz).
int128
is always passed via reference, also in your godbolt example (it uses vector registers temporarily to set up the value, but the actual argument is passed at 160(%r15)
, with %r2
pointing to that address.
For
TY::Tcomplex80
, this lowers tofp128
and will automatically be handled by LLVM according to the-mcpu
values passed to LLVM.
fp128
is always passed in a pair of floating-point registers, no matter what -mcpu
.
What I was refering to applies solely to actual vector types (of size up to 16). Those are passed via reference on pre-z13 machines, and in vector registers on z13 and later.
/* Save callee-saved floating point registers | ||
s390x ABI has a very unique way for storing fp registers: | ||
even-pairs first and odd-pairs last */ | ||
std %f8, 0(%r15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clobbers the backchain data. (Which would make backchain-based unwinding, e.g. for profiling, break if a sample hits while in this routine.) Normally, what you'd do is to allocate a new register-save area below the 64 bytes of temporary stack space, and fill in only the backchain in that area.
|
||
/* Save stack pointer, the stack pointer is adjusted so that | ||
GC won't see the float point registers */ | ||
stg %r15, 0(%r2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what exactly that comment means, but I don't see any adjustment to the stack pointer here?
I could certainly set up the machine and the Actions runner. For integration with your CI I'd probably need help from someone who understands its current setup ... |
Thanks, that'd be great!
No worries, once we have a registered runner, we can take over. According to https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners, the registration requires a token from us, which is valid for one hour. So a bit of timely coordination will be required. |
This pull request adds preliminary support for IBM Z systems (running in z/Architecture mode).