Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIR-V] Add hlsl_private address space for HLSL/SPIR-V #122103

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions clang/include/clang/Basic/AddressSpaces.h
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ enum class LangAS : unsigned {

// HLSL specific address spaces.
hlsl_groupshared,
hlsl_private,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Title and description only talks about SPIRV, but this is a purely language concept you're adding here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, slightly reworded the description.
The thing is, only SPIR-V will be using this for now (same as the few others we'll add line hlsl_input, hlsl_output).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only SPIR-V will be using this for now

That's not really how languages work. It's a property of the language and exists independently of what any codegen is doing with it. As it is it sounds like you're jamming some knob you want for codegen through here that isn't part of the language.

This needs reference to some HLSL spec describing the address space without any mention of SPIRV.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This being a small stepping stone for the larger change bringing hlsl_input/hlsl_output, I haven't wrote a specific HLSL spec part, only relied on current MSDN & HLSL implementations (I agree, not great).
For the hlsl_input/hlsl_output address spaces, here is the wg-hlsl proposal:
llvm/wg-hlsl#97

But you are right, maybe that's something we should now write properly in the hlsl-spec repo


// Wasm specific address spaces.
wasm_funcref,
Expand Down
2 changes: 2 additions & 0 deletions clang/lib/AST/TypePrinter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2553,6 +2553,8 @@ std::string Qualifiers::getAddrSpaceAsString(LangAS AS) {
return "__funcref";
case LangAS::hlsl_groupshared:
return "groupshared";
case LangAS::hlsl_private:
return "hlsl_private";
default:
return std::to_string(toTargetAddressSpace(AS));
}
Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/TargetInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@ static const LangASMap FakeAddrSpaceMap = {
11, // ptr32_uptr
12, // ptr64
13, // hlsl_groupshared
14, // hlsl_private
20, // wasm_funcref
};

Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/AArch64.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ static const unsigned ARM64AddrSpaceMap[] = {
static_cast<unsigned>(AArch64AddrSpace::ptr32_uptr),
static_cast<unsigned>(AArch64AddrSpace::ptr64),
0, // hlsl_groupshared
0, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
2 changes: 2 additions & 0 deletions clang/lib/Basic/Targets/AMDGPU.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@ const LangASMap AMDGPUTargetInfo::AMDGPUDefIsGenMap = {
llvm::AMDGPUAS::FLAT_ADDRESS, // ptr32_uptr
llvm::AMDGPUAS::FLAT_ADDRESS, // ptr64
llvm::AMDGPUAS::FLAT_ADDRESS, // hlsl_groupshared
llvm::AMDGPUAS::FLAT_ADDRESS, // hlsl_private
};

const LangASMap AMDGPUTargetInfo::AMDGPUDefIsPrivMap = {
Expand All @@ -83,6 +84,7 @@ const LangASMap AMDGPUTargetInfo::AMDGPUDefIsPrivMap = {
llvm::AMDGPUAS::FLAT_ADDRESS, // ptr32_uptr
llvm::AMDGPUAS::FLAT_ADDRESS, // ptr64
llvm::AMDGPUAS::FLAT_ADDRESS, // hlsl_groupshared
llvm::AMDGPUAS::FLAT_ADDRESS, // hlsl_private
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what hlsl_private means but using flat here is almost certainly wrong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This address space will be used for:

  • static [const] global variables
    Which matches the Private storage class for SPIR-V.

Would LOCAL_ADDRESS be a better match?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe PRIVATE_ADDRESS would be best, since it's local per thread/wave

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it acts like thread local storage, we don't have an implementation of that right now. So it would be none of these. PRIVATE_ADDRESS globals will just fail and nothing else has thread-local like behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HLSL documentation for those is currently this: https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-variable-syntax

FXC/DXC implementation of a static int my_global would be equivalent to a c++ static thread_local int my_global. Initialized once, thread local.

If this is not available in AMDGPU yet, what value shall be picked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should use PRIVATE_ADDRESS. It will at least fail in a way that makes it somewhat obvious what the issue is


};
} // namespace targets
Expand Down
19 changes: 10 additions & 9 deletions clang/lib/Basic/Targets/DirectX.h
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,16 @@ static const unsigned DirectXAddrSpaceMap[] = {
0, // cuda_constant
0, // cuda_shared
// SYCL address space values for this map are dummy
0, // sycl_global
0, // sycl_global_device
0, // sycl_global_host
0, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
3, // hlsl_groupshared
0, // sycl_global
0, // sycl_global_device
0, // sycl_global_host
0, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
3, // hlsl_groupshared
10, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/NVPTX.h
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ static const unsigned NVPTXAddrSpaceMap[] = {
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
0, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
42 changes: 22 additions & 20 deletions clang/lib/Basic/Targets/SPIR.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,15 +38,16 @@ static const unsigned SPIRDefIsPrivMap[] = {
0, // cuda_constant
0, // cuda_shared
// SYCL address space values for this map are dummy
0, // sycl_global
0, // sycl_global_device
0, // sycl_global_host
0, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
0, // sycl_global
0, // sycl_global_device
0, // sycl_global_host
0, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
10, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand All @@ -69,17 +70,18 @@ static const unsigned SPIRDefIsGenMap[] = {
// cuda_constant pointer can be casted to default/"flat" pointer, but in
// SPIR-V casts between constant and generic pointers are not allowed. For
// this reason cuda_constant is mapped to SPIR-V CrossWorkgroup.
1, // cuda_constant
3, // cuda_shared
1, // sycl_global
5, // sycl_global_device
6, // sycl_global_host
3, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
1, // cuda_constant
3, // cuda_shared
1, // sycl_global
5, // sycl_global_device
6, // sycl_global_host
3, // sycl_local
0, // sycl_private
0, // ptr32_sptr
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
10, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/SystemZ.h
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ static const unsigned ZOSAddressMap[] = {
1, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
0, // hlsl_private
0 // wasm_funcref
};

Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/TCE.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@ static const unsigned TCEOpenCLAddrSpaceMap[] = {
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
0, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/WebAssembly.h
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ static const unsigned WebAssemblyAddrSpaceMap[] = {
0, // ptr32_uptr
0, // ptr64
0, // hlsl_groupshared
0, // hlsl_private
20, // wasm_funcref
};

Expand Down
1 change: 1 addition & 0 deletions clang/lib/Basic/Targets/X86.h
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ static const unsigned X86AddrSpaceMap[] = {
271, // ptr32_uptr
272, // ptr64
0, // hlsl_groupshared
0, // hlsl_private
// Wasm address space values for this target are dummy values,
// as it is only enabled for Wasm targets.
20, // wasm_funcref
Expand Down
17 changes: 17 additions & 0 deletions clang/lib/CodeGen/CodeGenModule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5362,6 +5362,23 @@ LangAS CodeGenModule::GetGlobalVarAddressSpace(const VarDecl *D) {
if (OpenMPRuntime->hasAllocateAttributeForGlobalVar(D, AS))
return AS;
}

if (LangOpts.HLSL) {
if (D == nullptr)
return LangAS::hlsl_private;

// Except resources (Uniform, UniformConstant) & instanglble (handles)
if (D->getType()->isHLSLResourceType() ||
D->getType()->isHLSLIntangibleType())
return D->getType().getAddressSpace();

if (D->getStorageClass() != SC_Static)
return D->getType().getAddressSpace();

LangAS AS = D->getType().getAddressSpace();
return AS == LangAS::Default ? LangAS::hlsl_private : AS;
}

Copy link
Member

@hekota hekota Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered applying the hlsl_private address space in the Sema (SemaHLSL::ActOnVariableDeclarator handler) onto the VarDecl QualType?

FYI, I am currently looking into adding hlsl_constant address space to be used for constant buffer declarations and that is probably going to be the place where I'll add it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe this is the best place to do it since it is unspellable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall having tried this method first, but for some reasons moved it to Codegen, but I don't recall the full context.

I just tried moving it in sema again, and seems there is a slight issue with this-reference-template.hlsl since the VarDecl type is from a template, something goes wrong (but works if the AS fixup is done in codegen).

return getTargetCodeGenInfo().getGlobalVarAddressSpace(*this, D);
}

Expand Down
2 changes: 1 addition & 1 deletion clang/test/CodeGenHLSL/GlobalDestructors.hlsl
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ void main(unsigned GI : SV_GroupIndex) {

// NOINLINE: define internal void @_GLOBAL__D_a() [[IntAttr:\#[0-9]+]]
// NOINLINE-NEXT: entry:
// NOINLINE-NEXT: call void @_ZN4TailD1Ev(ptr @_ZZ3WagvE1T)
// NOINLINE-NEXT: call void @_ZN4TailD1Ev(ptr addrspacecast (ptr addrspace(10) @_ZZ3WagvE1T to ptr))
// NOINLINE-NEXT: call void @_ZN6PupperD1Ev(ptr @GlobalPup)
// NOINLINE-NEXT: ret void

Expand Down
27 changes: 27 additions & 0 deletions clang/test/CodeGenHLSL/out-of-line-static.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK
// RUN: %clang_cc1 -triple spirv-pc-vulkan1.3-compute -std=hlsl202x -emit-llvm -o - -disable-llvm-passes %s | FileCheck %s --check-prefixes=CHECK

struct S {
static int Value;
};

int S::Value = 1;
// CHECK: @_ZN1S5ValueE = global i32 1, align 4

[shader("compute")]
[numthreads(1,1,1)]
void main() {
S s;
int value1, value2;
// CHECK: %s = alloca %struct.S, align 1
// CHECK: %value1 = alloca i32, align 4
// CHECK: %value2 = alloca i32, align 4

// CHECK: [[tmp:%.*]] = load i32, ptr @_ZN1S5ValueE, align 4
// CHECK: store i32 [[tmp]], ptr %value1, align 4
value1 = S::Value;

// CHECK: [[tmp:%.*]] = load i32, ptr @_ZN1S5ValueE, align 4
// CHECK: store i32 [[tmp]], ptr %value2, align 4
value2 = s.Value;
}
9 changes: 5 additions & 4 deletions clang/test/CodeGenHLSL/static_global_and_function_in_cb.hlsl
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,13 @@

// CHECK-DAG: @[[CB:.+]] = external constant { float }

// CHECK-DAG:@_ZL1b = internal addrspace(10) global float 3.000000e+00, align 4
static float b = 3;

cbuffer A {
float a;
// CHECK-DAG:@_ZL1b = internal global float 3.000000e+00, align 4
static float b = 3;
float a;
// CHECK:load float, ptr @[[CB]], align 4
// CHECK:load float, ptr @_ZL1b, align 4
// CHECK:load float, ptr addrspacecast (ptr addrspace(10) @_ZL1b to ptr), align 4
float foo() { return a + b; }
}

Expand Down
2 changes: 1 addition & 1 deletion clang/test/SemaTemplate/address_space-dependent.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ void neg() {

template <long int I>
void tooBig() {
__attribute__((address_space(I))) int *bounds; // expected-error {{address space is larger than the maximum supported (8388586)}}
__attribute__((address_space(I))) int *bounds; // expected-error {{address space is larger than the maximum supported (8388585)}}
}

template <long int I>
Expand Down
Loading