-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[0011] Resource element type validation #69
[0011] Resource element type validation #69
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only read through a bit of the introduction so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The descriptions are good and the examples are very helpful!
@@ -0,0 +1,142 @@ | |||
* Proposal: [0010](0010-validating-resource-container-elements.md) | |||
* Author(s): [Joshua Batista](https://github.com/bob80905) | |||
* Sponsor: TBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're the sponsor
* Status: **Under Consideration** | ||
* Impacted Project(s): (LLVM) | ||
|
||
*During the review process, add the following fields as needed:* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this and the PRs line
* Issues: [#75676](https://github.com/llvm/llvm-project/issues/75676) | ||
|
||
## Introduction | ||
Resources are often used in HLSL, with various resource element types (RETs). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new acronym? Not objecting, but I don't think I've seen it before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, completely new as far as I know. Just thought I'd use an acronym since that phrase is littered everywhere in this spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, when I see RET I think of this: https://www.felixcloutier.com/x86/ret
of rules that define valid RETs for this resource type. | ||
|
||
RETs for non-`RawBuffer` variants may include basic types (ints and uints of sizes 16 | ||
and 32, as well as half, and float). Structs that contain fields of these basic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool, uint64_t and double are currently allowed as well https://godbolt.org/z/eb7Yh73W6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly from my conversation with @tex3d, though the uint64_t
and double
cases compile, they will cause undefined behavior at runtime, and we do not want to allow any types that are larger than 32 bits.
Even if 2 64-bit types can technically fit in 4 32-bit fields, it's a hard limit that there may not be more than 4 elements, and each element may not be more than 32 bits.
We allow bool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support for 64-bit values seems like a totally reasonable thing to do. Is there something fundamental about them that mean we can't support them properly, or is this just that DXC is buggy?
What about in SPIR-V today? Looks like SPIR-V supports int64_t2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's something fundamental (If I recall correctly, I think it had to do something with the hardware converter / DXGI format being capped at 32 bits in D3D, and that is not changeable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I said anything about them not working or causing undefined behavior. The undefined behavior case was tied to matching unorm or snorm in HLSL with the DXGI resource type for typed UAVs.
They definitely are supported on DXC (by decomposing to 32-bit components), and FXC the same way IIRC (for double).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps what was thought of was the fact that you can't use these with Sample operations (only Load/Store), since it's equivalent to a typed resource with an int32 component type, which doesn't support those operations.
In fact, if you try it, you get a validation error (we only catch it there), as it attempts to use a 64-bit overload for the sample operation, and define an invalid %dx.types.ResRet.f64
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it's equivalent to a typed resource with an int32 component type, which doesn't support those operations.
Circling back to this conversation, I recall that this isn't correct anymore - there is more to add.
We added integer sampling support in SM 6.7. This relaxed checks in the front end that disallowed integer types. So integer types are actually allowed in SM 6.7 on (but not for comparison sampling).
Unfortunately, when 64-bit types are used with sample operations, those resource return types won't be translated into two i32 values per 64-bit value. That's what leads to the invalid operation type for these caught by validation. These types could be supported using integer sampling, if we had broken them up as we do for load operations.
I think this means that when using a 64-bit component type in a typed resource element, we should consider this as being implicitly translated to two 32-bit uints per 64-bit component for the actual type used. Then the rules are applied according to the translated element type.
There is one thing bugging me: double component type seemingly should trigger an error in IsValidObjectElement
, but it doesn't. See the code here:
https://github.com/microsoft/DirectXShaderCompiler/blob/a023a95f73618e8ca1791147d3722a3e1ea7608f/tools/clang/lib/Sema/SemaHLSL.cpp#L6058
primitive RET. | ||
|
||
RETs for `RawBuffer` variants are much less constrained, the only rule is that the RET | ||
may not be an incomplete type (a handle type or a resource type). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can HLSL authors interact with handle types directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should technically be possible.
I wrote a test in clang\test\SemaHLSL\resource_binding_attr_error.hlsl
where __hlsl_resource_t
, the handle type, was directly declared, so it is spellable.
But I don't think we intend users to interact with them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically the __hlsl_resource_t
type is spellable, but there is not much one can do with it. It needs to be decorated with resource type attributes to signify what kind of resource it is representing, and it will never get automatic binding unless it is embedded in a struct that looks like a resource class (has the handle as a first field named "h").
type. For every `type_trait` that is not true for the given RET, an associated error | ||
message will be emitted. | ||
|
||
The list of type_traits that define a valid non-`RawBuffer` RET are descsribed below: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
The list of type_traits that define a valid non-`RawBuffer` RET are descsribed below: | |
The list of type_traits that define a valid non-`RawBuffer` RET are described below: | |
|-|-| | ||
| `__is_complete_type` | An RET should either be a complete type, or a user defined type that has been completely defined. | | ||
| `__is_intangible_type` | An RET should not contain any handles with unknown sizes, i.e., should not be intangible. So, we should assert this type_trait is false. | | ||
| `__is_homogenous_aggregate` | RETs may be basic types, but if they are aggregate types, then all underlying basic types should be the same type. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In LLVM parlance, I think "aggregate type" applies to structs and arrays, but not vectors nor matrices. https://github.com/llvm/llvm-project/blob/a1d64626ba16f5128530ac771c6e641b1155184f/llvm/include/llvm/IR/Type.h#L291-L293
So I think we need another term for this.
| `__is_complete_type` | An RET should either be a complete type, or a user defined type that has been completely defined. | | ||
| `__is_intangible_type` | An RET should not contain any handles with unknown sizes, i.e., should not be intangible. So, we should assert this type_trait is false. | | ||
| `__is_homogenous_aggregate` | RETs may be basic types, but if they are aggregate types, then all underlying basic types should be the same type. | | ||
| `__is_contained_in_four_groups_of_thirty_two_bits` | RETs should fit in four 32-bit quantities | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are two criteria here that are munged into one. That's how DXC implies it is working as well and it was only relevant when we got sub 32-bit types, but now that we have, there are two separate requirements here.
- The total number of components must be four or fewer. That's how DXC behaves and we might consider changing it, but I don't think it makes sense to do so now.
- The total size must be less than that of 4x32 bits. I think this wording is more confusing than clarifying. I would prefer to state it as 128 bits or if you prefer, 16 bytes.
The reason for separating these is that 8 uint16s will fit in the size, but is more than 4 components and is rejected by DXC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To your second point, I think the constraint is slightly stronger, each element may not exceed 32 bits. I alluded to the reason earlier.
Yes, the wording here can be made clearer, I'll adjust.
struct f { | ||
e x[2]; | ||
e y[2]; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a bit easier to follow if, instead of arbitrary struct names, they described what they contained. So struct x
could be struct oneInt
and struct a
could be struct twoInt
etc
|
||
// diagnostic: "resource element type 'd' has non-homogenous aggregate type" | ||
RWBuffer<d> r8; // invalid - struct `a` has int types, and this is not homogenous with the float4 contained in `c`. | ||
// the type_trait that would catch this is `__is_homogenous_aggregate` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also has 4 floats and 2 ints, which is more than 4 components and also won't fit in 16 bytes. Might be better to keep the examples at one violation each.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed Introduction and Motivation.
In this code, the RET is `float`, and the resource type is `RWBuffer`. | ||
There are two types of buffers, RawBuffers and TypedBuffers. `RWBuffer` | ||
is a TypedBuffer variant, and `StructuredBuffer` is a RawBuffer variant. | ||
There is a distinct set of rules that define valid RETs for RawBuffer types, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since StructuredBuffers are RawBuffers I assume you meant to compare RawBuffer to TypedBuffers here?
There are two types of buffers, RawBuffers and TypedBuffers. `RWBuffer` | ||
is a TypedBuffer variant, and `StructuredBuffer` is a RawBuffer variant. | ||
There is a distinct set of rules that define valid RETs for RawBuffer types, | ||
and a separate set of rules that define valid RETs for `StructuredBuffer` types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and a separate set of rules that define valid RETs for `StructuredBuffer` types. | |
and a separate set of rules that define valid RETs for TypedBuffer types. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to understand more about how you propose using the type_traits.
In addition, I think the document throughout needs to be updated to indicate that double and uint64_t are valid element types.
| type_trait | Description| | ||
|-|-| | ||
| `__is_complete_type` | An RET should either be a complete type, or a user defined type that has been completely defined. | | ||
| `__is_resource_element_type` | An RET should be an arithmetic type, or a bool, or a vector or matrix or UDT containing such types. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is different between this and !__builtin_hlsl_is_intangible
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the purposes of the single static assert. I don't think that just because an RET is not intangible, it is always allowed as an RET. Testing the affirmative seems to be better for long term. For example, what if HLSL supports strings in the future? We'd want to disallow strings as RETs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree about strings, but I'm also not sure how HLSL strings will be implemented, but my guess is that they'll be constant only so an offset and a length is the most I expect to represent the "value" of a string.
Setting aside strings since they aren't designed. As described, __is_resource_element_type
is equivalent to !__builtin_hlsl_is_intangible
, so I think we should have one trait not two because it simplifies the language rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with Chris offline, we both agree that an arithmetic type, or a bool, or a vector or matrix or UDT containing such types
is exactly equivalent to !__builtin_hlsl_is_intangible
, so I'll use that existing builtin.
| `__builtin_is_homogenous` | A TypedBuffer RET with the DXIL IR target should never have two different subelement types. | | ||
| `__builtin_is_contained_in_four_groups_of_thirty_two_bits` | A TypedBuffer RET with the DXIL IR target should not have more than 4 elements, and the total size of the RET may not exceed 128 bits. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we could simplify this down to something like __builtin_hlsl_is_line_vector_layout_compatible
to mark that a type is layout compatible with a line-width vector. That would handle requiring homogeneousness, element count, and total size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked to Chris offline, we agreed __builtin_hlsl_is_line_vector_layout_compatible
could be simplified by using the size_of builtin and other template tricks to verify vector length and total size in bytes.
The list of type traits that will be available for use are described below: | ||
| type trait | Description| | ||
|-|-| | ||
| `__is_complete_type` | An RET should either be a complete type, or a user defined type that has been completely defined. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may not actually need __is_complete_type
. I know that the type needs to be complete, but __is_intangible
will error if the type is incomplete.
…target IR dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some editorial notes, but otherwise LGTM.
RETs for typed buffer resources: | ||
* Are not intangible (e.g., isn't a resource type) | ||
* Must be vectors or scalars of arithmetic types, not bools nor enums nor arrays | ||
* The type should be line-vector layout compatible (homogenous, at most 4 elements, and at most 128 bits in size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have always had a problem with this terminology.
- It's inventing new terminology not connected to any prior terminology used for the concept to which it applies: "line-vector"? It's trying to sound more general than it actually is.
- If you were to think about what it could/should be referring to, you could justifiably think it's about what fits into one constant buffer row (line?) in the legacy (current) constant buffer layout. But that would be a mistake, since the rules are different. (for example, you can put 8 native 16-bit values or a struct containing non-homogenous types in one row)
- It implies that this type is used in the (memory) layout of the resource itself. It's not: the element stored and the layout is dependent on runtime factors: the DXGI resource type and device-dependent factors. The element type in HLSL for a typed resource is just how you want to interpret the values loaded from that up-to-4-component typed resource element.
- The description sounds like it would support non-homogenous types (a struct with several different types that fit should work, as it would for constant buffers)
- This 128-bit restriction sounds equivalent, but hides the true meaning: each 64-bit component in HLSL is translated to 2-32-bit components for the resource element type, with a bitcast for translation. This has implications for the actual element type used, compatible DXGI type mapping, and allowed operations. This has a limit of 2 64-bit components because those 2 components expand into a 4 component element type, not because it's 128 bits in size.
- We could probably create specializations that handle 64-bit type translation directly to/from the translated element type in our HLSL header, rather than adding special-case code to the compiler. In this scenario, the real element type for the resource does not use a 64-bit type as the component type, but uses the translated element type with 32-bit uint components instead.
I would rather a more specific term be used for the more specific typed resource element constraint.
How about "typed-element compatible" and __builtin_hlsl_typed_element_compatible
?
- A "typed-element" flows from existing terminology: typed resources are textures and buffers with a DXGI type applying to the resource view descriptor at runtime. These support translation upon load/store to the HLSL element type used in a shader (depending on the types at either end and the operation being performed).
- The element must be a scalar or vector of an integer or floating point type with a maximum of 4 components. All loads and stores are performed at element granularity, which is a 4 component vector (or struct) in the shader IR, whether or not all those components are read/defined.
- If a 64-bit component type is used in HLSL, each 64-bit value maps to 2-32-bit
uint
components which will be reinterpreted together as the 64-bit value in HLSL. This has implications for supported operations, and the maximum number of 64-bit components allowed.double2
or[u]int64_t2
becomesuint4
, which is the maximum number of components, and because the real component used is integer, it triggers those operation restrictions as well.
These above rules and description would apply to this "typed-element compatible"/__builtin_hlsl_typed_element_compatible
constraint.
The short description could be:
- slightly modified: "at most 4 homogenous scalars, and at most 128 bits in total size" (avoiding using "element" for scalar/component since the whole thing is the element type)
- or "at most 4 homogenous scalars after translation of 64-bit scalars into pairs of uint32_t scalars". This makes the implied translation clear right in the short constraint description.
- or subsume the scalar or vector and component type rule:
"scalar or vector of a floating-point or integer type, with a maximum of 4 components after translating 64-bit components into pairs ofuint32_t
components"
There some additional constraints on the element type relating to the way the resource is used (methods called), but these would need to be enforced separately on the method calls:
- If comparison sample (
SampleCmp*
) methods are used, the element type in HLSL must be a scalar floating point type of at most 32-bits. - If other
Sample
methods are used, integer components are only allowed in the element type when the shader model is 6.7 or greater. This is triggered by the use of double components because they break down into 2uint32_t
components.
There are additional runtime restrictions for legal type combinations between the shader and the DXGI type, but these cannot be enforced by the compiler when they depend on runtime state. For instance: typed UAVs using a unorm or snorm DXGI type need to have a matching unorm float or snorm float component type in the HLSL resource element declaration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming used here is intended to be consistent with microsoft/hlsl-specs#321. Please comment on that PR if you think we need to adjust the naming there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good to me. A few small comments.
@@ -0,0 +1,217 @@ | |||
* Proposal: [0010](0010-resource-element-type-validation.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we already have a proposal 10, so make sure to update the number before merging.
Two builtins will be used to validate typed buffer element types. Any resource | ||
element type may not be intangible, so the negation of `__builtin_hlsl_is_intangible` | ||
will be used for both typed and raw buffer resources. | ||
A new built-in, `__builtin_hlsl_typed_element_compatible`, will be added in order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to bike shed, but this builtin name seems odd to me because it doesn't really describe what this is an element of. Maybe __builtin_hlsl_typed_buffer_element_compatible
or __builtin_hlsl_typed_resource_element_compatible
to add the missing context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll go with typed resource.
This spec describes how the compiler will validate resource element types.
This spec is needed before the implementation, which will eventually solve llvm/llvm-project#75676