diff --git a/README.md b/README.md index 49d9d68..c9b3c04 100644 --- a/README.md +++ b/README.md @@ -11,6 +11,26 @@ Khronos SPIR-V Registry](https://www.khronos.org/registry/spir-v/). - Issue tracking for all SPIR-V specifications - Pull requests to add new SPIR-V extensions +## Publishing a new extension + +To publish a new extension, please create a pull request which includes: + +- The extension document in the asciidoc format named following + the `SPV__.asciidoc` pattern. The document should be placed + in the `extensions/` folder. +- README.md update with the link to the new extension once published + +To publish a non-semantic extended instruction set, please create a pull request which includes: + +- The instruction set in the asciidoc format named following + the `NonSemantic..asciidoc` pattern. The document should be placed + in the `nonsemantic` folder. +- README.md update with the link to the new extension once published + +Please see [BUILD.md](BUILD.md) for instructions to create an HTML specification for this repo. + +Note: we no longer push the HTML alongside the extension. + ## Extension Specifications ### KHR Extensions (Khronos) @@ -168,7 +188,3 @@ Khronos SPIR-V Registry](https://www.khronos.org/registry/spir-v/). * [NonSemantic.DebugBreak ]( https://github.khronos.org/SPIRV-Registry/nonsemantic/NonSemantic.DebugBreak.html) * [NonSemantic.DebugPrintf ]( https://github.khronos.org/SPIRV-Registry/nonsemantic/NonSemantic.DebugPrintf.html) * [NonSemantic.Shader.DebugInfo.100 ]( https://github.khronos.org/SPIRV-Registry/nonsemantic/NonSemantic.Shader.DebugInfo.100.html) - -## Building HTML Specifications - -Please see [BUILD.md](BUILD.md) for instructions to create an HTML specification for this repo. 
diff --git a/extensions/AMD/SPV_AMDX_shader_enqueue.html b/extensions/AMD/SPV_AMDX_shader_enqueue.html index 4eb3f1c..86fbaa6 100644 --- a/extensions/AMD/SPV_AMDX_shader_enqueue.html +++ b/extensions/AMD/SPV_AMDX_shader_enqueue.html @@ -1,851 +1,12 @@ - - - - - - - -SPV_AMDX_shader_enqueue - - - - - -
-
-

Name Strings

-
-
-

SPV_AMDX_shader_enqueue

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Headers repository: -https://github.com/KhronosGroup/SPIRV-Headers

-
-
-
-
-

Provisional

-
-
-

This extension is provisional and should not be used in production applications. -The functionality may change in ways that break backwards compatibility between -revisions, and before final release.

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Matthäus Chajdas, AMD

    -
  • -
  • -

    Nicolai Hähnle, AMD

    -
  • -
  • -

    Junda Liu, AMD

    -
  • -
  • -

    Maciej Jesionowski, AMD

    -
  • -
  • -

    Daniel Brown, AMD

    -
  • -
  • -

    Stuart Smith, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Provisional.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-07-26

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.6, Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-
-
-

Overview

-
-
-

This extension adds the ability for developers to enqueue compute -and mesh shader workgroups from compute shaders.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_AMDX_shader_enqueue"
-
-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding this row to the table:

-
- ----- - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

5067

ShaderEnqueueAMDX
-Uses shader enqueue capabilities

Shader

-
-
-

Storage Class

-
-

Modify Section 3.7, "Storage Class", adding this row to the table:

-
- ----- - - - - - - - - - - - - - -
Storage ClassEnabling Capabilities

5068

NodePayloadAMDX
-Storage for Node Payloads.
-
-Variables declared with OpVariable in the GLCompute execution model with the CoalescingAMDX execution mode are visible across all invocations within a workgroup; and other variables declared with OpVariable in this storage class are visible across all invocations within a node dispatch. -Variables declared with this storage class are readable and writable, and must not have initializers.
-
-Pointers to this storage class are also used to point to payloads allocated and enqueued for other nodes.

ShaderEnqueueAMDX

-
-
-
-
-

Execution Modes

-
-
-

Modify Section 3.6, "Execution Mode", adding the following rows to the table:

-
- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

5069

CoalescingAMDX
-Indicates that a GLCompute shader has coalescing semantics. (GLCompute only)
-
-Must not be declared alongside StaticNumWorkgroupsAMDX or MaxNumWorkgroupsAMDX.

ShaderEnqueueAMDX

5071

MaxNodeRecursionAMDX
-Maximum number of times a node can enqueue payloads for itself.

<id>
-Number of recursions

ShaderEnqueueAMDX

5070

IsApiEntryAMDX
-Indicates whether the shader can be dispatched directly by the client API or not. (GLCompute and MeshEXT execution models only)
-
-Is Entry is a scalar Boolean value, with a value of true indicating that it can be dispatched from the API, and false indicating that it cannot. -If not specified, defaults to true.
-
-Must be set to false if SharesInputWithAMDX is specified.

<id>
-Is Entry

ShaderEnqueueAMDX

5072

StaticNumWorkgroupsAMDX
-Statically declare the number of workgroups dispatched for this shader, instead of obeying an API- or payload-specified value. (GLCompute and MeshEXT only)
-
-Must not be declared alongside CoalescingAMDX or MaxNumWorkgroupsAMDX.

<id>
-x size

<id>
-y size

<id>
-z size

ShaderEnqueueAMDX

5077

MaxNumWorkgroupsAMDX
-Declare the maximum number of workgroups dispatched for this shader. Dispatches must not exceed this value. (GLCompute and MeshEXT only)
-
-Must not be declared alongside CoalescingAMDX or StaticNumWorkgroupsAMDX.

<id>
-x size

<id>
-y size

<id>
-z size

ShaderEnqueueAMDX

5073

ShaderIndexAMDX
-Declare the node index for this shader. (GLCompute and MeshEXT only)

<id>
-Shader Index

ShaderEnqueueAMDX

5102

SharesInputWithAMDX
-Declare that this shader is paired with another node, such that it will be dispatched with the same input payload when the identified node is dispatched.
-Node Name and Shader Index indicate the node that the input will be shared with.
-
-Node Name must be an OpConstantStringAMDX or OpSpecConstantStringAMDX instruction.

<id>
-Node Name

<id>
-Shader Index

ShaderEnqueueAMDX

-
-
-
-
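The declaration constraints in the Execution Modes table can be sketched as a small check. This is only an illustration, not part of the extension: the helper name `check_execution_modes` and the dictionary representation of declared modes (mode name mapped to a tuple of operand values) are assumptions made for the example.

```python
# Hypothetical helper, not defined by the extension: checks the declaration
# constraints stated in the Execution Modes table.
# `modes` maps an execution mode name to a tuple of its operand values.
EXCLUSIVE = {"CoalescingAMDX", "StaticNumWorkgroupsAMDX", "MaxNumWorkgroupsAMDX"}


def check_execution_modes(modes):
    """Return a list of human-readable constraint violations (empty if none)."""
    errors = []

    # CoalescingAMDX, StaticNumWorkgroupsAMDX and MaxNumWorkgroupsAMDX
    # must not be declared alongside one another.
    declared = sorted(EXCLUSIVE.intersection(modes))
    if len(declared) > 1:
        errors.append("declared together: " + ", ".join(declared))

    # IsApiEntryAMDX defaults to true when absent, so a shader that uses
    # SharesInputWithAMDX has to declare it explicitly as false.
    if "SharesInputWithAMDX" in modes:
        is_entry = modes.get("IsApiEntryAMDX", (True,))[0]
        if is_entry:
            errors.append("IsApiEntryAMDX must be false with SharesInputWithAMDX")

    return errors
```

Returning a list rather than raising on the first problem lets a caller report every violation for an entry point at once.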

Decorations

-
-
-

Modify Section 3.20, "Decoration", adding the following row to the table:

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5020

NodeMaxPayloadsAMDX
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX.
-
-OpTypeNodePayloadArrayAMDX must have this decoration. -The operand indicates the maximum number of payloads that can be in the array, and the maximum number of payloads that can be enqueued with this type.

<id>
-Max number of payloads

ShaderEnqueueAMDX

5019

NodeSharesPayloadLimitsWithAMDX
-Decorates an OpTypeNodePayloadArrayAMDX declaration to indicate that payloads of this type share output resources with Payload Type when allocated.
-
-Without the decoration, each type’s resources are separately allocated against the output limits; by using the decoration only the limits of Payload Type are considered. -Applications must still ensure that at runtime the actual usage does not exceed these limits, as this decoration only modifies static validation.
-
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX declaration, -Payload Type must be a different OpTypeNodePayloadArrayAMDX declaration, and -Payload Type must not be itself decorated with NodeSharesPayloadLimitsWithAMDX.
-
-It is only necessary to decorate one OpTypeNodePayloadArrayAMDX declaration to indicate sharing between two node outputs. -Multiple variables can be decorated with the same Payload Type to indicate sharing across multiple node outputs.

<id>
-Payload Type

ShaderEnqueueAMDX

5091

PayloadNodeNameAMDX
-Decorates an OpTypeNodePayloadArrayAMDX declaration to indicate that the payloads in the array -will be enqueued for the shader with Node Name.
-
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX declaration.
-
-Node Name must be an OpConstantStringAMDX or OpSpecConstantStringAMDX instruction.

<id>
-Node Name

ShaderEnqueueAMDX

5098

PayloadNodeBaseIndexAMDX
-Decorates an OpTypeNodePayloadArrayAMDX declaration to indicate a base index that -will be added to the Node Index when allocating payloads of this type. -If not specified, it is equivalent to specifying a value of 0.
-
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX declaration.

<id>
-Base Index

ShaderEnqueueAMDX

5099

PayloadNodeSparseArrayAMDX
-Decorates an OpTypeNodePayloadArrayAMDX declaration to indicate that nodes at some node indexes may not exist in the execution graph pipeline and cannot be used to allocate payloads.
-
-If not specified, all node indexes between 0 and the PayloadNodeArraySizeAMDX value must be valid nodes in the graph.
-
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX declaration.

ShaderEnqueueAMDX

5100

PayloadNodeArraySizeAMDX
-Decorates an OpTypeNodePayloadArrayAMDX declaration to indicate the maximum node index that can be used when allocating payloads of this type, including the base index offset in PayloadNodeBaseIndexAMDX decoration (if present). -If not specified, the node array is considered unbounded.
-
-Must only be used to decorate an OpTypeNodePayloadArrayAMDX declaration.
-
-If PayloadNodeSparseArrayAMDX is not set to true for a type initialized by OpAllocateNodePayloadsAMDX, this must be specified.

<id>
-Array Size

ShaderEnqueueAMDX

5078

TrackFinishWritingAMDX
-Decorates a structure to indicate that when used as a payload it can be written to and works with the OpFinishWritingNodePayloadAMDX instruction.
-
-Must only be used to decorate a structure type declaration.
-
-If the payload enqueued for a node is using a structure decorated with this value, the input payload in the NodePayloadAMDX storage class in the receiving node must use a structure decorated with it as well.

ShaderEnqueueAMDX

5105

PayloadDispatchIndirectAMDX
-Indicates the dispatch indirect arguments describing the number of workgroups to dispatch in a payload. -Must only be used with OpMemberDecorate to decorate the member of a structure.

-

Must decorate a structure member with a type of OpTypeInt or OpTypeVector with two or three components. -The integer type or the type of the vector component must be an OpTypeInt with up to 32-bit Width and 0 Signedness. -If a single integer is used, the Y and Z dispatch indirect arguments are assumed to be 1. -If a vector of two components is used, the Z dispatch indirect argument is assumed to be 1.

ShaderEnqueueAMDX

-
-
-
-
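The defaulting rule for PayloadDispatchIndirectAMDX (missing Y and Z arguments are assumed to be 1) can be sketched as follows. The function name `expand_dispatch_args` is hypothetical, and modelling the payload member as a Python integer or sequence is an assumption made only for illustration.

```python
def expand_dispatch_args(value):
    """Return (x, y, z) workgroup counts for a scalar or 2/3-component vector.

    Hypothetical illustration of the PayloadDispatchIndirectAMDX rule:
    a single integer supplies X only, a two-component vector supplies X and Y,
    and every missing argument is assumed to be 1.
    """
    if isinstance(value, int):
        components = (value,)
    else:
        components = tuple(value)
        if len(components) not in (2, 3):
            raise ValueError("vector must have two or three components")

    # Components are unsigned integers of at most 32 bits.
    for c in components:
        if not 0 <= c < 2**32:
            raise ValueError("component does not fit an unsigned 32-bit integer")

    return components + (1,) * (3 - len(components))
```

For example, `expand_dispatch_args(8)` yields `(8, 1, 1)` and `expand_dispatch_args([4, 2])` yields `(4, 2, 1)`.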

Builtins

-
-
-

Modify Section 3.21, "BuiltIn", adding the following row to the table:

-
- ----- - - - - - - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

5021

RemainingRecursionLevelsAMDX
-The number of times this node can still enqueue payloads for itself.
-Is equal to 0 if at the leaf or if the node is not recursive at all.

ShaderEnqueueAMDX

5073

ShaderIndexAMDX
-Index assigned to the current shader.

ShaderEnqueueAMDX

-
-
-
-

Instructions

-
-
-

Add the following new instructions:

-
- ------ - - - - - - - - - - - - -

OpConstantStringAMDX
-
-Declare a new string constant.
-
-String is the value of the constant.
-
-Unlike OpString, this is a semantically meaningful instruction and cannot be safely removed from a module.

Capability:
-ShaderEnqueueAMDX

3 + variable

5103

Result <id>

Literal
-String

- ------ - - - - - - - - - - - - -

OpSpecConstantStringAMDX
-
-Declare a new string specialization constant.
-
-String is the default value of the constant.
-
-Unlike OpString, this is a semantically meaningful instruction and cannot be safely removed from a module.
-
-This instruction can be specialized to become an OpConstantStringAMDX instruction.
-
-See Specialization.

Capability:
-ShaderEnqueueAMDX

3 + variable

5104

Result <id>

Literal
-String

- ------ - - - - - - - - - - - - -

OpTypeNodePayloadArrayAMDX
-
-Declare a new payload array type. Its length is not known at compile time.
-
-Payload Type is the type of each payload in the array.
-
- See OpNodePayloadArrayLengthAMDX for getting the length of an array of this type.
-
-A payload array can be allocated by either OpAllocateNodePayloadsAMDX to be enqueued as an output, or via OpVariable in the NodePayloadAMDX storage class to be consumed as an input.
-
-Can be dereferenced using an access chain in the same way as OpTypeRuntimeArray or OpTypeArray.

Capability:
-ShaderEnqueueAMDX

3

5076

Result <id>

<id>
-Payload Type

- --------- - - - - - - - - - - - - - - - -

OpAllocateNodePayloadsAMDX
-
-Allocates payloads for a node to be later enqueued via OpEnqueueNodePayloadsAMDX.
-
-Result Type must be an OpTypePointer to an OpTypeNodePayloadArrayAMDX in the NodePayloadAMDX storage class.
-
-The payloads are allocated for the node identified by the Node Name in the PayloadNodeNameAMDX decoration on Result Type, -with an index equal to the sum of its PayloadNodeBaseIndexAMDX decoration (if present) and Node Index. -
-Payloads are allocated for the Scope indicated by Visibility, and are visible to all invocations in that Scope.
-
-Payload Count is the number of payloads to allocate in the resulting array. -
-Behavior is undefined if Payload Count is greater than the NodeMaxPayloadsAMDX decoration on Result Type.
-
-Payload Count and Node Index must be dynamically uniform within the scope identified by Visibility.
-
-Visibility must only be either Invocation or Workgroup.
-
-This instruction must be called in uniform control flow within the same workgroup.

Capability:
-ShaderEnqueueAMDX

6

5074

<id>
-Result Type

Result <id>

Scope <id>
-Visibility

<id>
-Payload Count

<id>
-Node Index
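How the target node index is derived for OpAllocateNodePayloadsAMDX can be sketched in a few lines. The helper name `resolve_allocation` and its parameters are hypothetical; the sketch only restates the two rules above, that the index is the sum of the PayloadNodeBaseIndexAMDX value (0 if absent) and Node Index, and that Payload Count must not exceed NodeMaxPayloadsAMDX.

```python
def resolve_allocation(node_index, payload_count, max_payloads, base_index=0):
    """Return the index of the node that payloads are allocated for.

    Hypothetical model of the allocation rules:
    - base_index stands for the PayloadNodeBaseIndexAMDX decoration
      (equivalent to 0 when the decoration is not present);
    - max_payloads stands for the NodeMaxPayloadsAMDX decoration.
    """
    if payload_count > max_payloads:
        # In SPIR-V this is undefined behavior; the sketch rejects it instead.
        raise ValueError("Payload Count exceeds NodeMaxPayloadsAMDX")
    return base_index + node_index
```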

- ----- - - - - - - - - - - - -

OpEnqueueNodePayloadsAMDX
-
-Enqueues a previously allocated payload array for execution by its node.
-
-Payload Array is a pointer to a payload array that was previously allocated by OpAllocateNodePayloadsAMDX.
-
-This instruction must be called in uniform control flow within the workgroup.

Capability:
-ShaderEnqueueAMDX

2

5075

<id>
-Payload Array

- ------- - - - - - - - - - - - - - -

OpNodePayloadArrayLengthAMDX
-
-Query the length of a payload array. Must only be used with input payload arrays or allocated output payload arrays.
-
-Result will be equal to the Payload Count value used to allocate Payload Array, or to the number of received payloads if the shader is using CoalescingAMDX execution mode. Otherwise, Result will be 1.
-
-Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.
-
-Payload Array is a pointer to a payload array previously allocated by OpAllocateNodePayloadsAMDX, or declared via OpVariable in the NodePayloadAMDX storage class as an input.

Capability:
-ShaderEnqueueAMDX

4

5090

<id>
-Result Type

Result <id>

<id>
-Payload Array

- -------- - - - - - - - - - - - - - - -

OpIsNodePayloadValidAMDX
-
-Check if the node payload identified by the Node Name in the PayloadNodeNameAMDX decoration, -with an index equal to the sum of its PayloadNodeBaseIndexAMDX decoration (if present) and Node Index -can be allocated.
-
-Result is equal to OpConstantTrue if the payload is valid and can be allocated, OpConstantFalse otherwise.
-
-Result Type must be OpTypeBool.
-
-Payload Type must be an OpTypeNodePayloadArrayAMDX declaration.
-
-NodeIndex must be less than the value specified by the PayloadNodeArraySizeAMDX decoration if specified.

Capability:
-ShaderEnqueueAMDX

5

5101

<id>
-Result Type

Result <id>

<id>
-Payload Type

<id>
-Node Index

- ------- - - - - - - - - - - - - - -

OpFinishWritingNodePayloadAMDX
-
-Optionally indicates that all writes to the input payload by the current workgroup have completed.
-
-Result is equal to OpConstantTrue if all workgroups that can access this payload have called this function.
-
-Must not be called if the shader is using CoalescingAMDX execution mode, -or if the shader was dispatched with a vkCmdDispatchGraph* client API command, -rather than enqueued from another shader.
-
-Must not be called if the input payload is not decorated with TrackFinishWritingAMDX.
-
-Result Type must be OpTypeBool.
-
-Payload must be the result of an OpVariable in the NodePayloadAMDX storage class.

Capability:
-ShaderEnqueueAMDX

4

5078

<id>
-Result Type

Result <id>

<id>
-Payload

-
-

Validation Rules

-
-

In section 2.16, Validation Rules for Shader Capabilities, add NodePayloadAMDX to the list of storage classes where composite variables must be explicitly laid out.

-
-
-
-
-
-

Issues

-
-
-
    -
  • -

    None

    -
  • -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-07-22

Tobias Hector

Initial revision.

2

2024-07-26

Tobias Hector

Update to better match HLSL

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMDX_shader_enqueue.html + + +

extensions/AMD/SPV_AMDX_shader_enqueue.html

+ + diff --git a/extensions/AMD/SPV_AMD_gcn_shader.html b/extensions/AMD/SPV_AMD_gcn_shader.html index 5ac001e..731d700 100644 --- a/extensions/AMD/SPV_AMD_gcn_shader.html +++ b/extensions/AMD/SPV_AMD_gcn_shader.html @@ -1,332 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_gcn_shader - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_gcn_shader

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 AMD.

-
-
-
-
-

Status

-
-
-

Released.

-
-
-
-
-

Version

-
-
-

Modified Date: October 13, 2016 -Revision: 2

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -GL_AMD_gcn_shader OpenGL Shading Language Specification extension for SPIR-V.

-
-
-

This extension exposes miscellaneous features of the AMD "Graphics Core Next" -shader architecture. This includes cube map query functions and functionality -to query the elapsed shader core time.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_gcn_shader extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_gcn_shader"
-
-
-
-
-
-

New Instructions

-
-
-

This extension adds the following extended instructions

-
-
-
-
CubeFaceCoordAMD = 2
-CubeFaceIndexAMD = 1
-TimeAMD = 3
-
-
-
-

To use the extended instructions described below, declare:

-
-
-
-
OpExtInstImport %ext "SPV_AMD_gcn_shader"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.32.1, Miscellaneous Instructions

-
-
-

(Add to the end of the section a list of instructions as described below)

-
-
-

CubeFaceCoordAMD

-
-

The function cubeFaceCoordAMD returns a two-component floating point vector that -represents the 2D texture coordinates that would be used for accessing the selected -cube map face for the given cube map texture coordinates given as parameter P.

-
-
-

The operand <P> must be a pointer to a 3-component 32-bit floating-point vector.

-
-
-

This instruction is only valid in Fragment, Geometry, GLCompute, TessellationControl, -TessellationEvaluation and Vertex execution models.

-
-
-

Result Type must be a 2-component 32-bit floating-point vector.

-
-
-
-
3 | 2 | <id> Result Type | <id> P
-
-
-
-
-

CubeFaceIndexAMD

-
-

The function CubeFaceIndexAMD returns a single floating point value that represents -the index of the cube map face that would be accessed by texture lookup functions -for the cube map texture coordinates given as parameter. The returned value -corresponds to cube map faces as follows:

-
-
-
    -
  • -

    0.0 for the cube map face facing the positive X direction

    -
  • -
  • -

    1.0 for the cube map face facing the negative X direction

    -
  • -
  • -

    2.0 for the cube map face facing the positive Y direction

    -
  • -
  • -

    3.0 for the cube map face facing the negative Y direction

    -
  • -
  • -

    4.0 for the cube map face facing the positive Z direction

    -
  • -
  • -

    5.0 for the cube map face facing the negative Z direction

    -
  • -
-
-
-

The operand <P> must be a 3-component 32-bit floating-point vector.

-
-
-

This instruction is only valid in Fragment, Geometry, GLCompute, TessellationControl, -TessellationEvaluation and Vertex execution models.

-
-
-

Result Type must be a 32-bit floating-point scalar.

-
-
-
-
3 | 1 | <id> Result Type | <id> P
-
-
-
-
-
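The face numbering returned by CubeFaceIndexAMD can be illustrated with a CPU-side sketch of the usual major-axis selection. The function name `cube_face_index` is hypothetical, and the tie-breaking order (Z over Y over X when magnitudes are equal) is an assumption: the text above does not say how ties are resolved.

```python
def cube_face_index(p):
    """Return the cube map face index (0.0 to 5.0) for direction p = (x, y, z).

    Illustrative sketch only. The face is chosen by the component with the
    largest magnitude; ties are resolved Z, then Y, then X, which is an
    assumption rather than something the extension text states.
    """
    x, y, z = p
    ax, ay, az = abs(x), abs(y), abs(z)

    if az >= ax and az >= ay:
        return 4.0 if z >= 0 else 5.0   # positive Z / negative Z
    if ay >= ax:
        return 2.0 if y >= 0 else 3.0   # positive Y / negative Y
    return 0.0 if x >= 0 else 1.0       # positive X / negative X
```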

TimeAMD

-
-

The TimeAMD instruction returns a 64-bit value representing the current execution clock -as seen by the shader processor. Time monotonically increments as the processor -executes instructions. The returned time will wrap after it exceeds the maximum -value representable in 64 bits. The units of time are not defined and need not be -constant. Time is not dynamically uniform. That is, shader invocations executing -as part of a single draw or dispatch will not necessarily see the same value of -time. Time is also not guaranteed to be consistent across shader stages. For -example, there is no requirement that time sampled inside a fragment shader invocation -will be greater than the time sampled in the vertex that led to its execution.

-
-
-

This instruction is only valid in Fragment, Geometry, GLCompute, TessellationControl, -TessellationEvaluation and Vertex execution models.

-
-
-

Use of this instruction requires declaration of the Int64 capability.

-
-
-

Result Type must be a 64-bit unsigned integer scalar.

-
-
-
-
2 | 3 | <id> Result Type
-
-
-
-
-
-
-
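Because the TimeAMD value wraps at 64 bits, a difference between two samples is best computed modulo 2^64. The sketch below shows this on the host side; the name `elapsed_ticks` is hypothetical, and the result is only meaningful if the counter wrapped at most once between the two samples taken by the same invocation. Since the units of time are undefined, the result is a tick count, not a duration.

```python
MASK64 = (1 << 64) - 1  # largest value representable in 64 bits


def elapsed_ticks(start, end):
    """Return the tick difference between two 64-bit samples, tolerating one wrap."""
    return (end - start) & MASK64
```

For instance, `elapsed_ticks(10, 25)` gives 15, and a sample pair that straddles the wrap, such as `elapsed_ticks(MASK64 - 1, 3)`, gives 5.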

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

May 31, 2016

Dominik Witczak

Initial revision based on AMD_gcn_shader.

2

October 13, 2016

Dominik Witczak

Added missing numerical value assignments, removed extension number.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_gcn_shader.html + + +

extensions/AMD/SPV_AMD_gcn_shader.html

+ + diff --git a/extensions/AMD/SPV_AMD_gpu_shader_half_float.html b/extensions/AMD/SPV_AMD_gpu_shader_half_float.html index 6e64c7e..d5fd594 100644 --- a/extensions/AMD/SPV_AMD_gpu_shader_half_float.html +++ b/extensions/AMD/SPV_AMD_gpu_shader_half_float.html @@ -1,289 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_gpu_shader_half_float - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_gpu_shader_half_float

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Donglin Wei, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Proposed.

-
-
-
-
-

Version

-
-
-

Modified Date: June 12, 2018 -Revision: 3

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_gpu_shader_half_float.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_gpu_shader_half_float, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces 16-bit floating point support to extended instructions -described in the GLSL.std.450 extended instruction set.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_gpu_shader_half_float extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_gpu_shader_half_float"
-
-
-
-
-
-

Summary

-
-
-

This extension adds support for 16-bit floating-point component types for the -following instructions described in the GLSL.std.450 extended instruction set:

-
-
-
-
InterpolateAtCentroid
-InterpolateAtSample
-InterpolateAtOffset
-
-
-
-
-
-

Modifications to the OpenGL Shading Language 4.50 Extended Instruction Set Specification, Version 1.00

-
-
-

Modify Section 2, Binary Form

-
-
-

InterpolateAtCentroid

-
-

(Replace the following sentence:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 32-bit floating-point.
-
-(with:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 16- or -32-bit floating-point.

-
-
-
-

InterpolateAtSample

-
-

(Replace the following sentence:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 32-bit floating-point.
-
-(with:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 16- or -32-bit floating-point.

-
-
-
-

InterpolateAtOffset

-
-

(Replace the following sentence:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 32-bit floating-point.
-
-(with:)
-
-The operand interpolant must be a pointer to a scalar or vector whose component type is 16- or -32-bit floating-point.
-
-
-(Replace the following sentence:)
-
-The offset operand must be a vector of 2 components of 32-bit floating-point type.
-
-(with:)
-
-The offset operand must be a vector of 2 components of 16- or 32-bit floating-point type.

-
-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

3

June 12, 2018

Dominik Witczak

Remove non-interpolate extended instruction modifications, as GLSL.std.450 spec now includes these changes.

2

June 22, 2017

Dominik Witczak

Removed incorrect language regarding accessing the new functionality.

1

September 21, 2016

Dominik Witczak

Initial revision based on AMD_gpu_shader_half_float.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_gpu_shader_half_float.html + + +

extensions/AMD/SPV_AMD_gpu_shader_half_float.html

+ + diff --git a/extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html b/extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html index 6fdac04..560356e 100644 --- a/extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html +++ b/extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html @@ -1,734 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_gpu_shader_half_float_fetch - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_gpu_shader_half_float_fetch

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Shipping.

-
-
-
-
-

Version

-
-
-

Modified Date: February 2, 2018 -Revision: 1

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 2 of the version 1.2 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 5 of the OpenGL extension -AMD_gpu_shader_half_float_fetch.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_gpu_shader_half_float_fetch, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces 16-bit sampled type support to image instructions.

-
-
-

This extension introduces support for a 16-bit floating-point type, which can now -be used as a coordinate and a result texel type.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_gpu_shader_half_float_fetch extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_gpu_shader_half_float_fetch"
-
-
-
-
-
-

Summary

-
-
-

This extension adds a new Float16ImageAMD capability.

-
-
-

This extension adds support for 16-bit float Result Type which can now be used by the following image instructions:

-
-
-
-
OpImageDrefGather
-OpImageFetch
-OpImageGather
-OpImageRead
-OpImageSampleDrefExplicitLod
-OpImageSampleDrefImplicitLod
-OpImageSampleExplicitLod
-OpImageSampleImplicitLod
-OpImageSampleProjDrefExplicitLod
-OpImageSampleProjDrefImplicitLod
-OpImageSampleProjExplicitLod
-OpImageSampleProjImplicitLod
-OpImageSparseDrefGather
-OpImageSparseFetch
-OpImageSparseGather
-OpImageSparseRead
-OpImageSparseSampleDrefExplicitLod
-OpImageSparseSampleDrefImplicitLod
-OpImageSparseSampleExplicitLod
-OpImageSparseSampleImplicitLod
-
-
-
-

This extension adds support for 16-bit float type used as a coordinate type for the following image instructions:

-
-
-
-
OpImageDrefGather
-OpImageGather
-OpImageSampleDrefExplicitLod
-OpImageSampleDrefImplicitLod
-OpImageSampleExplicitLod
-OpImageSampleImplicitLod
-OpImageSampleProjDrefExplicitLod
-OpImageSampleProjDrefImplicitLod
-OpImageSampleProjExplicitLod
-OpImageSampleProjImplicitLod
-OpImageSparseDrefGather
-OpImageSparseGather
-OpImageSparseSampleDrefExplicitLod
-OpImageSparseSampleDrefImplicitLod
-OpImageSparseSampleExplicitLod
-OpImageSparseSampleImplicitLod
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

Float16ImageAMD

5008

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-

Modify Section 3.31, Capability:

-
-

Append the following Capability to the table:

-
- ----- - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

Float16ImageAMD

-

Expands image type declaration instruction and image instructions to allow them to use 16-bit type in image declaration/sampling/read/write/sparse read operations.

Shader

SPV_AMD_gpu_shader_half_float_fetch

-
-
-

Modify Section 3.14, Image Operands:

-
-

(For Bias Image Operand’s description, replace the following sentence:)
-It must be a floating-point type scalar.
-(with:)
-It must be a floating-point type (incl. 16-bit OpTypeFloat) scalar.

-
-
-

(For Lod Image Operand’s description, replace the following sentence:)
-For sampling operations, it must be a floating-point type scalar.
-(with:)
-For sampling operations, it must be a floating-point type (incl. 16-bit OpTypeFloat) scalar.

-
-
-

(For Grad Image Operand’s description, replace the following sentence:)
-They must be a scalar or vector of floating-point type.
-(with:)
-They must be a scalar or vector of floating-point type (incl. 16-bit OpTypeFloat).

-
-
-

(For MinLod Image Operand’s description, replace the following sentence:)
-It must be a floating-point type scalar.
-(with:)
-It must be a floating-point type (incl. 16-bit OpTypeFloat) scalar.

-
-
-
-

Modify Section 3.32.6, Type-Declaration Instructions:

-
-

Update language for the following types:

-
-
-
-

OpTypeImage

-
-

(Replace the following sentence:)
-
-Sampled Type is the type of the components that result from sampling or reading from this image type. Must be a scalar numerical type or OpTypeVoid.
-
-(with:)
-
-Sampled Type is the type of the components that result from sampling or reading from this image type. Must be a scalar numerical type (incl. 16-bit OpTypeFloat) or OpTypeVoid.

-
-
-
-

Modify Section 3.32.10, Image Instructions:

-
-

Update language for the following image instructions:

-
-
-
-

OpImageDrefGather

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageFetch

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.

-
-
-
-

OpImageGather

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageRead

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.

-
-
-
-

OpImageSampleDrefExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSampleDrefImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSampleExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type or integer type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type or integer type.

-
-
-
-

OpImageSampleImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type or integer type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type or integer type.

-
-
-
-

OpImageSampleProjDrefExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following part of the sentence:)
-
-Coordinate is a floating-point vector containing (..)
-
-(with:)
-
-Coordinate is a floating-point (incl. 16-bit) vector containing (..)

-
-
-
-

OpImageSampleProjDrefImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following part of the sentence:)
-
-Coordinate is a floating-point vector containing (..)
-
-(with:)
-
-Coordinate is a floating-point (incl. 16-bit) vector containing (..)

-
-
-
-

OpImageSampleProjExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following part of the sentence:)
-
-Coordinate is a floating-point vector containing (..)
-
-(with:)
-
-Coordinate is a floating-point (incl. 16-bit) vector containing (..)

-
-
-
-

OpImageSampleProjImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following part of the sentence:)
-
-Coordinate is a floating-point vector containing (..)
-
-(with:)
-
-Coordinate is a floating-point (incl. 16-bit) vector containing (..)

-
-
-
-

OpImageSparseDrefGather

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSparseFetch

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.

-
-
-
-

OpImageSparseGather

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.

-
-
-

(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSparseRead

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar or vector of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type or integer type.

-
-
-
-

OpImageSparseSampleDrefExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSparseSampleDrefImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar of integer type or floating-point type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a scalar of integer type or floating-point (incl. 16-bit OpTypeFloat) type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type.

-
-
-
-

OpImageSparseSampleExplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type or integer type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type or integer type.

-
-
-
-

OpImageSparseSampleImplicitLod

-
-

(Replace the following sentence:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type or integer type.
-
-(with:)
-
-Result Type must be an OpTypeStruct with two members. (..) The second member must be a vector of four components of floating-point type (incl. 16-bit OpTypeFloat) or integer type.
-
-(Replace the following sentence:)
-
-Coordinate must be a scalar or vector of floating-point type or integer type.
-
-(with:)
-
-Coordinate must be a scalar or vector of floating-point (incl. 16-bit OpTypeFloat) type or integer type.

-
-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

February 2, 2018

Dominik Witczak

Initial revision based on AMD_gpu_shader_half_float_fetch.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html + + +

extensions/AMD/SPV_AMD_gpu_shader_half_float_fetch.html

+
+
diff --git a/extensions/AMD/SPV_AMD_gpu_shader_int16.html b/extensions/AMD/SPV_AMD_gpu_shader_int16.html
index 0165779..6094481 100644
--- a/extensions/AMD/SPV_AMD_gpu_shader_int16.html
+++ b/extensions/AMD/SPV_AMD_gpu_shader_int16.html
@@ -1,272 +1,12 @@
-
-
-
-
-
-
-
-SPIR-V Extension SPV_AMD_gpu_shader_int16
-
-
-
-
-
-
-
-

Name Strings

-
-
-

SPV_AMD_gpu_shader_int16

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Matthaeus G. Chajdas, AMD

    -
  • -
  • -

    Quentin Lin, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Timothy Lottes, AMD

    -
  • -
  • -

    Zhi Cai, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Shipping.

-
-
-
-
-

Version

-
-
-

Modified Date: 06/08/2017 -Revision: 1

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_gpu_shader_int16.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_gpu_shader_int16, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces 16-bit signed and unsigned integer support to extended -instructions described in the GLSL.std.450 extended instruction set.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_gpu_shader_int16 extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_gpu_shader_int16"
-
-
-
-
-
-

Summary

-
-
-

This extension adds support for 16-bit signed and unsigned integer component types -for the following instructions described in the GLSL.std.450 extended instruction set:

-
-
-
-
Frexp
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.00, Revision 10

-
-
-

Modify OpSpecConstantOp definition:

-
-
-

(Append OpUConvert to the list of permitted ..Convert Opcodes):

-
-
-

Opcode must be one of the following opcodes. This literal operand is limited to a single word.

-
-
-

OpSConvert, OpFConvert, OpUConvert, (..)

-
-
-
-
-

Modifications to the OpenGL Shading Language 4.50 Extended Instruction Set Specification, Version 1.00

-
-
-

Modify Section 2, Binary Form

-
-
-

Frexp

-
-

(Replace the following sentence:)
-
-The exp operand must be a pointer to a scalar or vector with integer component type, with 32-bit component width.
-
-(with:)
-
-The exp operand must be a pointer to a scalar or vector with integer component type, with 16- or 32-bit component width.

-
-
-
-

FrexpStruct

-
-

(Replace the following sentence:)
-
-Member 1 must be a scalar or vector with integer component type, with 32-bit component width.
-
-(with:)
-
-Member 1 must be a scalar or vector with integer component type, with 16- or 32-bit component width.

-
-
-
-
-
-
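The relaxed width rule for the Frexp exponent can be illustrated with a small sketch. It assumes Python's `math.frexp` as a stand-in for the GLSL.std.450 `Frexp` behaviour (significand in [0.5, 1.0), integral power-of-two exponent); the helper name `frexp_int16` and the range constants are hypothetical and not part of the extension.

```python
import math

# Range of a signed 16-bit integer, the narrowest exp type the extension allows.
INT16_MIN, INT16_MAX = -32768, 32767


def frexp_int16(x):
    """Split x into (significand, exponent) and check the exponent fits in 16 bits."""
    significand, exponent = math.frexp(x)
    if not INT16_MIN <= exponent <= INT16_MAX:
        raise OverflowError("exponent does not fit in a 16-bit integer")
    return significand, exponent


sig, exp = frexp_int16(48.0)  # 48.0 == 0.75 * 2**6
```

The range check never fails for ordinary floating-point inputs, which suggests why a 16-bit exponent is wide enough for this instruction.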

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

June 8, 2017

Dominik Witczak

Initial revision based on AMD_gpu_shader_int16.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_gpu_shader_int16.html + + +

extensions/AMD/SPV_AMD_gpu_shader_int16.html

+
+
diff --git a/extensions/AMD/SPV_AMD_shader_ballot.html b/extensions/AMD/SPV_AMD_shader_ballot.html
index 18cf1a2..b07fac1 100644
--- a/extensions/AMD/SPV_AMD_shader_ballot.html
+++ b/extensions/AMD/SPV_AMD_shader_ballot.html
@@ -1,798 +1,12 @@
-
-
-
-
-
-
-
-SPIR-V Extension SPV_AMD_shader_ballot
-
-
-
-
-
-
-
-

Name Strings

-
-
-

SPV_AMD_shader_ballot

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Matthäus G. Chajdas, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Released.

-
-
-
-
-

Version

-
-
-

Modified Date: March 28, 2018 -Revision: 6

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.10 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_shader_ballot.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_shader_ballot, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces eight core instructions and four new extended -instructions to SPIR-V that enable additional subgroup operations in shaders.

-
-
-

This extension adds 16-bit result type support to a number of core group operations.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_shader_ballot extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_shader_ballot"
-
-
-
-
-
-

New Instructions

-
-
-

This extension adds the following core instructions

-
-
-
-
OpGroupIAddNonUniformAMD = 5000
-OpGroupFAddNonUniformAMD = 5001
-OpGroupFMinNonUniformAMD = 5002
-OpGroupUMinNonUniformAMD = 5003
-OpGroupSMinNonUniformAMD = 5004
-OpGroupFMaxNonUniformAMD = 5005
-OpGroupUMaxNonUniformAMD = 5006
-OpGroupSMaxNonUniformAMD = 5007
-
-
-
-

This extension adds the following extended instructions

-
-
-
-
SwizzleInvocationsAMD = 1
-SwizzleInvocationsMaskedAMD = 2
-WriteInvocationAMD = 3
-MbcntAMD = 4
-
-
-
-

To use the new core instructions, declare:

-
-
-
-
OpCapability Groups
-OpExtension "SPV_AMD_shader_ballot"
-
-
-
-

To use the new extended instructions, declare:

-
-
-
-
OpExtension "SPV_AMD_shader_ballot"
-%ext = OpExtInstImport "SPV_AMD_shader_ballot"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.32.21, Group Instructions

-
-
-

OpGroupIAdd

-
-

(Replace the following sentence):

-
-
-

<Result Type> must be a 32-bit or 64-bit <integer type> scalar.

-
-
-

(with):

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-
-

OpGroupUMin

-
-

(Replace the following sentence):

-
-
-

<Result Type> must be a 32-bit or 64-bit <integer type> scalar.

-
-
-

(with):

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-
-

OpGroupSMin

-
-

(Replace the following sentence):

-
-
-

<Result Type> must be a 32-bit or 64-bit <integer type> scalar.

-
-
-

(with):

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-
-

OpGroupUMax

-
-

(Replace the following sentence):

-
-
-

<Result Type> must be a 32-bit or 64-bit <integer type> scalar.

-
-
-

(with):

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-
-

OpGroupSMax

-
-

(Replace the following sentence):

-
-
-

<Result Type> must be a 32-bit or 64-bit <integer type> scalar.

-
-
-

(with):

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

(Add to the end of the section)

-
-
-
-

OpGroupIAddNonUniformAMD

-
-

An integer add group operation specified for all values of <X> -specified by invocations in the group.

-
-
-

The identity <I> is 0.

-
-
-

All invocations of this module within <Execution> must reach this point of execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5000  | <id> Result Type | <id> Result  | Scope <id> Execution | Group Operation | <id> X
-
-
-
-
-
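The role of the identity and of the Group Operation operand can be sketched with a small simulation. The three operation kinds (Reduce, InclusiveScan, ExclusiveScan) come from the core SPIR-V specification; the function name `group_operation`, the `is_active` list, and the assumption that only active invocations contribute in non-uniform control flow are illustrative and are not stated by this extension text.

```python
from functools import reduce
from operator import add


def group_operation(values, is_active, op, identity, kind):
    """Apply Reduce / InclusiveScan / ExclusiveScan over the active invocations."""
    active_values = [v for v, a in zip(values, is_active) if a]
    if kind == "Reduce":
        total = reduce(op, active_values, identity)
        results = [total] * len(active_values)
    elif kind in ("InclusiveScan", "ExclusiveScan"):
        results = []
        running = identity
        for value in active_values:
            if kind == "ExclusiveScan":
                results.append(running)  # excludes this invocation's own value
            running = op(running, value)
            if kind == "InclusiveScan":
                results.append(running)  # includes this invocation's own value
    else:
        raise ValueError(f"unknown group operation: {kind}")
    # Scatter results back to invocation order; inactive invocations get None.
    remaining = iter(results)
    return [next(remaining) if a else None for a in is_active]


x = [1, 2, 3, 4]
active = [True, False, True, True]
reduced = group_operation(x, active, add, 0, "Reduce")
inclusive = group_operation(x, active, add, 0, "InclusiveScan")
exclusive = group_operation(x, active, add, 0, "ExclusiveScan")
```

With an integer add and identity 0, the first active invocation of the exclusive scan receives the identity itself.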

OpGroupFAddNonUniformAMD

-
-

A floating-point add group operation specified for all values of <X> specified -by invocations in the group.

-
-
-

The identity <I> is 0.

-
-
-

All invocations of this module within <Execution> must reach this point of -execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5001 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

OpGroupFMinNonUniformAMD

-
-

A floating-point minimum group operation specified for all values of <X> specified -by invocations in the group.

-
-
-

The identity <I> is +INF.

-
-
-

All invocations of this module within <Execution> must reach this point of -execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5002 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

OpGroupUMinNonUniformAMD

-
-

An unsigned integer minimum group operation specified for all values of <X> -specified by invocations in the group.

-
-
-

The identity <I> is UINT_MAX when X is 32 bits wide and ULONG_MAX when <X> is -64 bits wide.

-
-
-

All invocations of this module within <Execution> must reach this point of execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control flow -within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5003 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

OpGroupSMinNonUniformAMD

-
-

A signed integer minimum group operation specified for all values of <X> specified -by invocations in the group.

-
-
-

The identity <I> is INT_MAX when X is 32 bits wide and LONG_MAX when <X> is 64 -bits wide.

-
-
-

All invocations of this module within <Execution> must reach this point of -execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5004 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

OpGroupFMaxNonUniformAMD

-
-

A floating-point maximum group operation specified for all values of <X> specified -by invocations in the group.

-
-
-

The identity <I> is -INF.

-
-
-

All invocations of this module within <Execution> must reach this point of -execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5005 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

OpGroupUMaxNonUniformAMD

-
-

An unsigned integer maximum group operation specified for all values of <X> -specified by invocations in the group.

-
-
-

The identity <I> is 0.

-
-
-

All invocations of this module within <Execution> must reach this point of execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control flow -within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5006 | <id> Result Type | <id> Result | <id> Scope Execution> | Group Operation | <id> X
-
-
-
-
-

OpGroupSMaxNonUniformAMD

-
-

A signed integer maximum group operation specified for all values of <X> specified -by invocations in the group.

-
-
-

The identity <I> is INT_MIN when X is 32 bits wide and LONG_MIN when <X> is 64 -bits wide.

-
-
-

All invocations of this module within <Execution> must reach this point of execution.

-
-
-

This instruction is able to work correctly if placed within non-uniform control -flow within <Execution>.

-
-
-

<Result Type> must be an <integer type> scalar.

-
-
-

<Execution> must be Workgroup or Subgroup Scope.

-
-
-

The type of <X> must be the same as <Result Type>.

-
-
-
-
6 | 5007 | <id> Result Type | <id> Result | <id> Scope Execution | Group Operation | <id> X
-
-
-
-
-

SwizzleInvocationsAMD

-
-

Swizzles data within a group of 4 consecutive invocations of the subgroup based -on <offset> as described below:

-
-
-
-
for (i = 0; i < SubgroupSize; i+=4) {
-    dataOut[i+0] = isActive[i+offset.x] ? dataIn[i+offset.x] : 0;
-    dataOut[i+1] = isActive[i+offset.y] ? dataIn[i+offset.y] : 0;
-    dataOut[i+2] = isActive[i+offset.z] ? dataIn[i+offset.z] : 0;
-    dataOut[i+3] = isActive[i+offset.w] ? dataIn[i+offset.w] : 0;
-}
-
-
-
-

Where:

-
-
-
    -
  • -

    isActive[i] tells whether the invocation with the index <i> is currently active -within the subgroup.

    -
  • -
  • -

    dataIn[i] is the value of <data> for invocation index <i>.

    -
  • -
  • -

    dataOut[i] is the return value of the function for invocation index <i>.

    -
  • -
-
-
-

The operand data can be any scalar or vector type.

-
-
-

The operand offset must be an unsigned integer vector with 4 components, and each -component is a constant integer with a value in the range [0, 3].

-
-
-

Result Type and the type of operand <data> must be the same type.

-
-
-
-
3 | 1 | <id> data | <id> offset
-
-
-
-
-
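The SwizzleInvocationsAMD pseudocode above can be modelled on the host for experimentation. This is a minimal sketch, assuming the subgroup is represented as plain Python lists; `swizzle_invocations`, `data_in`, and `is_active` are hypothetical names chosen to mirror the pseudocode, not an API of any driver or tool.

```python
def swizzle_invocations(data_in, is_active, offset):
    """Model SwizzleInvocationsAMD for one subgroup, in groups of 4 invocations."""
    if len(offset) != 4 or any(not 0 <= o <= 3 for o in offset):
        raise ValueError("offset must hold four values in the range [0, 3]")
    size = len(data_in)
    if size % 4 != 0 or len(is_active) != size:
        raise ValueError("subgroup size must be a multiple of 4")
    data_out = [0] * size
    for base in range(0, size, 4):
        for lane in range(4):
            src = base + offset[lane]
            # Inactive source invocations contribute 0, as in the pseudocode.
            data_out[base + lane] = data_in[src] if is_active[src] else 0
    return data_out


data = [10, 11, 12, 13, 20, 21, 22, 23]
active = [True, True, False, True, True, True, True, True]
result = swizzle_invocations(data, active, (1, 0, 3, 2))
```

Here the offset (1, 0, 3, 2) swaps neighbouring invocations inside each group of four, and the inactive invocation 2 shows up as a 0 in the slot that reads from it.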

SwizzleInvocationsMaskedAMD

-
-

Swizzles data within a group of 32 consecutive invocations with a -limited mask as described below:

-
-
-
-
for (i = 0; i < SubgroupSize; i++) {
-   j = (((i & 0x1f) & mask.x) | mask.y) ^ mask.z;
-   j |= (i & 0x20); // which group of 32
-   dataOut[i] = isActive[j] ? dataIn[j] : 0;
-}
-
-
-
-

Where:

-
-
-
    -
  • -

    isActive[i] tells whether the invocation with the index <i> is currently active -within the subgroup.

    -
  • -
  • -

    dataIn[i] is the value of <data> for invocation index <i>.

    -
  • -
  • -

    dataOut[i] is the return value of the function for invocation index <i>.

    -
  • -
-
-
-

The operand data can be any scalar or vector type.

-
-
-

The operand mask must be an unsigned integer vector with 3 components, and each -component is a constant integer with a value in the range [0, 31].

-
-
-

Result Type and the type of operand <data> must be the same type.

-
-
-
-
3 | 2 | <id> data | <id> mask
-
-
-
-
-
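The masked swizzle above can be modelled the same way. This sketch assumes the three mask components act as AND, OR, and XOR masks in that order, exactly as the pseudocode applies them; the function name and the list-based subgroup representation are hypothetical, and the subgroup size is assumed to be a multiple of 32.

```python
def swizzle_invocations_masked(data_in, is_active, mask):
    """Model SwizzleInvocationsMaskedAMD for one subgroup, in groups of 32."""
    if len(mask) != 3 or any(not 0 <= m <= 31 for m in mask):
        raise ValueError("mask must hold three values in the range [0, 31]")
    size = len(data_in)
    if size % 32 != 0 or len(is_active) != size:
        raise ValueError("subgroup size must be a multiple of 32")
    and_mask, or_mask, xor_mask = mask
    data_out = [0] * size
    for i in range(size):
        j = (((i & 0x1F) & and_mask) | or_mask) ^ xor_mask
        j |= i & 0x20  # stay within the same group of 32
        data_out[i] = data_in[j] if is_active[j] else 0
    return data_out


data = list(range(100, 132))
active = [True] * 32
active[3] = False
result = swizzle_invocations_masked(data, active, (0x1F, 0, 1))
```

With an AND mask of 0x1F, no OR bits, and an XOR mask of 1, each invocation reads from its odd or even neighbour.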

WriteInvocationAMD

-
-

Returns <inputValue> for all active invocations in the subgroup except for the -invocation whose invocation index within the subgroup is <invocationIndex>. -Within a subgroup, the outputs are defined as described below:

-
-
-
-
for (i = 0; i < SubgroupSize; i++) {
-   out[i] = (i == invocationIndex) ? writeValue : inputValue;
-}
-
-
-
-

Where out[i] is the return value of the function for invocation index <i>.

-
-
-

Result Type must be a scalar or vector type.

-
-
-

The type of inputValue and writeValue must be the same as Result Type.

-
-
-

invocationIndex must be a 32-bit unsigned integer with a value in the range -[0, SubgroupSize - 1].

-
-
-

writeValue and invocationIndex must be dynamically uniform within the subgroup, -otherwise the result of the operation is undefined.

-
-
-
-
4 | 3  | <id> inputValue | <id> writeValue | <id> invocationIndex
-
-
-
-
-
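A host-side model of WriteInvocationAMD, following the pseudocode above, might look like the sketch below. It assumes `input_values` holds each invocation's own inputValue while `write_value` and `invocation_index` are uniform across the subgroup; the function name is hypothetical.

```python
def write_invocation(input_values, write_value, invocation_index):
    """Model WriteInvocationAMD: one invocation's result is replaced by write_value."""
    size = len(input_values)
    if not 0 <= invocation_index < size:
        raise ValueError("invocation_index must be in [0, SubgroupSize - 1]")
    return [
        write_value if i == invocation_index else input_values[i]
        for i in range(size)
    ]


out = write_invocation([1, 2, 3, 4], 99, 2)
```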

MbcntAMD

-
-

Returns the bit count of SubgroupLtMaskARB with <mask> as described below:

-
-
-
-
%X = OpBitwiseAnd u32 %SubgroupLtMaskARB %mask
-<Result> = OpBitCount u32 %X
-
-
-
-

Result Type and mask must be 32-bit unsigned integers.

-
-
-
-
4 | <id> mask
-
-
-
-
-
-
-
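The MbcntAMD computation can be sketched as a bit count over a "less than" mask. This assumes, as the pseudocode does, that the mask is a single 32-bit word, and that SubgroupLtMask has one bit set for every invocation with a lower index than the current one; `mbcnt` and `invocation_id` are hypothetical names.

```python
def mbcnt(mask, invocation_id):
    """Count the bits of mask that belong to lower-numbered invocations."""
    if not 0 <= invocation_id < 32:
        raise ValueError("invocation_id must be in [0, 31] for a 32-bit mask")
    lt_mask = (1 << invocation_id) - 1  # bits for all invocations below this one
    return bin(mask & lt_mask & 0xFFFFFFFF).count("1")


count = mbcnt(0b10110110, 5)  # bits 1, 2 and 4 lie below invocation 5
```

A typical use of this kind of count is to give each invocation with its bit set a compacted, zero-based slot index.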

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

1.

-
-
-

Supported <result types> for group operation instructions depend on capabilities which are -defined elsewhere in the SPIR-V code. Specifically, these capabilities may come from -other SPIR-V extensions, which are out of scope for this extension specification.

-
-
-

Due to the above, we have decided to relax the language restricting allowed result types for -group operation instructions so that it now mentions general integer type, instead of -specialized integer types.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

April 21, 2016

Quentin Lin

Initial revision based on AMD_shader_ballot.

2

May 20, 2016

Dominik Witczak

Document refactoring

3

May 20, 2016

Matthäus G. Chajdas

Document refactoring

4

August 11, 2016

Rex Xu

Add new core instructions to handle group operations placed with non-uniform control flow.

5

October 13, 2016

Dominik Witczak

Added missing numerical value assignments, removed extension number

6

March 28, 2018

Dominik Witczak

Generalized type restrictions for result types of group operation instructions to integer types. Added issue#1.

7

May 16, 2019

Dominik Witczak

Fixed an issue in the section describing how to use the new functionality. Fixed MbcntAMD’s return type.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_ballot.html + + +

extensions/AMD/SPV_AMD_shader_ballot.html

+
+
diff --git a/extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html b/extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html
index 22862f7..416d599 100644
--- a/extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html
+++ b/extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html
@@ -1,300 +1,12 @@
-
-
-
-
-
-
-
-SPV_AMD_shader_early_and_late_fragment_tests
-
-
-
-
-
-
-
-

Name Strings

-
-
-

SPV_AMD_shader_early_and_late_fragment_tests

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Tobias Hector, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-11-05

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds the EarlyAndLateFragmentTestsAMD Execution -Mode, which enables both early and late fragment tests in some circumstances.

-
-
-

Additionally, it adds execution modes describing how the shader stencil -value is written, allowing stencil writes to be used with this new mode.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_AMD_shader_early_and_late_fragment_tests"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.6, "Execution Mode", adding the following row to the table:

-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution mode | Extra Operands | Enabling Capabilities

5017

EarlyAndLateFragmentTestsAMD
-Fragment tests can be performed both before and after fragment shader execution, with the latter tests taking values written to FragDepth and FragStencilRefEXT into account. Early tests are not guaranteed; late tests are. -
-If neither ExecutionModeDepthReplacing nor ExecutionModeStencilRefReplacingEXT is specified, this mode functions identically to EarlyFragmentTests.
-If this and ExecutionModeStencilRefReplacingEXT are both specified, one of StencilRefGreaterAMD, StencilRefLessAMD, or StencilRefUnchangedAMD must also be specified.
-If this and ExecutionModeDepthReplacing are both specified, one of DepthGreater, DepthLess, or DepthUnchanged must also be specified.
-
-Only valid with the Fragment Execution Model.
-See client API for detail on fragment operations.

Shader

5079

StencilRefUnchangedFrontAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is equal to the stencil reference value set for the front face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

5080

StencilRefGreaterFrontAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is greater than or equal to the stencil reference value set for the front face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

5081

StencilRefLessFrontAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is less than or equal to the stencil reference value set for the front face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

5082

StencilRefUnchangedBackAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is equal to the stencil reference value set for the back face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

5083

StencilRefGreaterBackAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is greater than or equal to the stencil reference value set for the back face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

5084

StencilRefLessBackAMD
-Indicates that early per-fragment tests may assume that any FragStencilRefEXT built in-decorated value written by the shader is less than or equal to the stencil reference value set for the back face in the client API after masking. -Late per-fragment tests will use the written value as normal.
-
-Only valid with the Fragment Execution Model.
-At most one of StencilRefGreaterAMD, StencilRefLessAMD, and StencilRefUnchangedAMD can be specified.

StencilExportEXT

-
-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-11-05

Tobias Hector

Initial extension.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html + + +

extensions/AMD/SPV_AMD_shader_early_and_late_fragment_tests.html

+ + diff --git a/extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html b/extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html index ad0d0c4..07a8e17 100644 --- a/extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html +++ b/extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html @@ -1,458 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_shader_explicit_vertex_parameter - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_shader_explicit_vertex_parameter

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Cai Zhi, AMD

    -
  • -
  • -

    Dominik Witczak, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 AMD.

-
-
-
-
-

Status

-
-
-

Released.

-
-
-
-
-

Version

-
-
-

Modified Date: October 13, 2016 -Revision: 4

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -GL_AMD_shader_explicit_vertex_parameter, OpenGL Shading Language Specification -extension, for SPIR-V.

-
-
-

This extension provides the capability of arbitrary interpolation in a fragment -shader. -It adds a new extended instruction to access vertex parameter explicitly and -adds 6 new built-in variables to get the barycentric coordinate of a fragment.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_shader_explicit_vertex_parameter extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_shader_explicit_vertex_parameter"
-
-
-
-
-
-

New Builtins

-
-
-

This extension adds the following builtins:

-
-
-
-
BaryCoordNoPerspAMD         = 4992
-BaryCoordNoPerspCentroidAMD = 4993
-BaryCoordNoPerspSampleAMD   = 4994
-BaryCoordSmoothAMD          = 4995
-BaryCoordSmoothCentroidAMD  = 4996
-BaryCoordSmoothSampleAMD    = 4997
-BaryCoordPullModelAMD       = 4998
-
-
-
-

BaryCoordNoPerspAMD, BaryCoordNoPerspCentroidAMD, BaryCoordNoPerspSampleAMD, -BaryCoordSmoothAMD, BaryCoordSmoothCentroidAMD and BaryCoordSmoothSampleAMD -must only decorate input variables whose type is a 32-bit float vector with -2 components, like:

-
-
-
-
OpDecorate %BaryCoordNoPerspAMD BuiltIn BaryCoordNoPerspAMD
-
-%float32_t           = OpTypeFloat 32
-%vec2                = OpTypeVector %float32_t 2
-%vec2_ptr            = OpTypePointer Input %vec2
-%BaryCoordNoPerspAMD = OpVariable %vec2_ptr Input
-
-
-
-

BaryCoordPullModelAMD must only decorate input variables whose type is a 32-bit -float vector with 3 components, like:

-
-
-
-
OpDecorate %BaryCoordPullModelAMD BuiltIn BaryCoordPullModelAMD
-
-%float32_t             = OpTypeFloat 32
-%vec3                  = OpTypeVector %float32_t 3
-%vec3_ptr              = OpTypePointer Input %vec3
-%BaryCoordPullModelAMD = OpVariable %vec3_ptr Input
-
-
-
-
-
-

New Decorations

-
-
-

This extension introduces the following new decorations:

-
-
-
-
ExplicitInterpAMD
-
-
-
-

ExplicitInterpAMD must only be applied to an object or a member of a structure type. -It indicates that custom interpolation must be used. The object or member can -only be accessed by the new extended instruction InterpolateAtVertexAMD. Only -valid for Input and Output Storage Classes.

-
-
-
-
-

New Instructions

-
-
-

This extension adds the extended instruction

-
-
-
-
InterpolateAtVertexAMD
-
-
-
-

To use the extended instructions described below, declare:

-
-
-
-
%ext = OpExtInstImport "SPV_AMD_shader_explicit_vertex_parameter"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.21, the BuiltIn list.

-
-
-

(Add to the list of builtins with a Shader capability)

-
-
-
-
BaryCoordNoPerspAMD
-BaryCoordNoPerspCentroidAMD
-BaryCoordNoPerspSampleAMD
-BaryCoordSmoothAMD
-BaryCoordSmoothCentroidAMD
-BaryCoordSmoothSampleAMD
-BaryCoordPullModelAMD
-
-
-
-

(Add a description)

-
-
-

Except for BaryCoordPullModelAMD, the BaryCoord??AMD builtins -provide the (I,J) pair of the barycentric coordinates interpolated at a fixed -location within the pixel. The K coordinate can be derived given the identity -I+J+K=1.0.

-
-
-

The BaryCoordPullModelAMD builtin returns (1/W, 1/I, 1/J) at the pixel center and -the shader can use it to calculate gradients and to interpolate I, J, and W to any -desired sample location.

-
-
-

The interpolation mode of BaryCoord??AMD builtins is as follows:

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Variable nameDescription

BaryCoordNoPerspAMD

Linear interpolation evaluated at the pixel’s center

BaryCoordNoPerspCentroidAMD

Linear interpolation evaluated at the centroid

BaryCoordNoPerspSampleAMD

Linear interpolation evaluated at each covered sample

BaryCoordSmoothAMD

Perspective interpolation evaluated at the pixel’s center

BaryCoordSmoothCentroidAMD

Perspective interpolation evaluated at the centroid

BaryCoordSmoothSampleAMD

Perspective interpolation evaluated at each covered sample

-
-

Modify Section 3.32.1, Miscellaneous Instructions

-
-
-

(Add to the end of the section a list of instructions with "InterpolationFunction" -capability)

-
-
-

InterpolateAtVertexAMD

-
-

Returns the value of the input <interpolant> without any interpolation, i.e. the -raw output value of previous shader stage.

-
-
-

It is guaranteed that the association of the vertex index and barycentric coordinate -is represented with the following table.

-
- ---- - - - - - - - - - - - - - - - - - - - - -
<vertexIdx>Barycentric coordinates

0

I=0, J=0, K=1

1

I=1, J=0, K=0

2

I=0, J=1, K=0

-
-

However this order has no association with the vertex order specified -by the application in the originating draw.
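As a rough illustration of the table above, the following Python sketch rebuilds an interpolated value from three raw per-vertex values and an (I, J) pair, using the identity I+J+K=1.0. The function name `interpolate_explicit` and the plain-float inputs are assumptions made for this example only; in a shader the raw values would come from InterpolateAtVertexAMD and the (I, J) pair from one of the BaryCoord builtins.

```python
def interpolate_explicit(v0, v1, v2, i, j):
    """Weight three raw per-vertex values by barycentric coordinates.

    Following the <vertexIdx> table: vertex 0 carries weight K,
    vertex 1 carries weight I, and vertex 2 carries weight J.
    K is derived from the identity I + J + K = 1.0.
    """
    k = 1.0 - i - j
    return k * v0 + i * v1 + j * v2
```

Passing one of the three coordinate triples from the table returns the matching raw vertex value unchanged, which is a quick way to sanity-check the weight assignment.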

-
-
-

The operand <interpolant> must be a pointer to the Input Storage Class.

-
-
-

The operand <interpolant> must be a pointer to a scalar or vector.

-
-
-

This instruction is only valid in the Fragment execution model.

-
-
-

Result Type and the type of <interpolant> must be the same type.

-
-
-

Use of this instruction requires declaration of the InterpolationFunction -capability.

-
-
-

The operand <vertexIdx> must be a constant integer expression with a value of 0, 1, -or 2.

-
-
-
-
3 | 1 | <id> interpolant | <id> vertexIdx
-
-
-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

April 21, 2016

Quentin Lin

Initial revision based on AMD_shader_explicit_vertex_parameter.

2

May 20, 2016

Dominik Witczak

Document refactoring

3

May 30, 2016

Dominik Witczak

Minor corrections

4

October 13, 2016

Dominik Witczak

Added missing numerical value assignments, removed extension number

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html + + +

extensions/AMD/SPV_AMD_shader_explicit_vertex_parameter.html

+ + diff --git a/extensions/AMD/SPV_AMD_shader_fragment_mask.html b/extensions/AMD/SPV_AMD_shader_fragment_mask.html index 861ed97..3beaee5 100644 --- a/extensions/AMD/SPV_AMD_shader_fragment_mask.html +++ b/extensions/AMD/SPV_AMD_shader_fragment_mask.html @@ -1,326 +1,12 @@ - - - - - - - -SPV_AMD_shader_fragment_mask - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_shader_fragment_mask

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Aaron Hagan, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Draft.

-
-
-
-
-

Version

-
-
-

Modified Date: August 16, 2017 -Revision: 1

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.12 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_shader_fragment_mask.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_shader_fragment_mask, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces two core instructions to SPIR-V that enable fetching -the fragment mask of compressed multisampled color surfaces, and an instruction -for sampling the surface with the fragment mask.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be -present in the module:

-
-
-
-
OpExtension "SPV_AMD_shader_fragment_mask"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FragmentMaskAMD = 5010
-
-
-
-
-
-

New Instructions

-
-
-

This extension adds the following core instructions

-
-
-
-
OpFragmentMaskFetchAMD = 5011
-OpFragmentFetchAMD     = 5012
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.32.10, Image Instructions

-
-
-

Add the following items to the instruction sections.

-
-
-

OpFragmentMaskFetchAMD

-
-

The fragment mask can be fetched from a compressed multisampled color surface with a -call to fragmentMaskFetchAMD in the shader. The returned value is a single unsigned -integer where each successive group of 4 bits specifies the color fragment index corresponding -to a color sample, starting from the least significant bit.
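The bit layout described above can be sketched in Python as follows. This is only an illustration of the 4-bits-per-sample packing, not driver or shader code; the helper name `fragment_index` is hypothetical, and the mask is assumed to be the 32-bit unsigned value returned by the instruction.

```python
def fragment_index(mask: int, sample: int) -> int:
    """Return the color fragment index stored for `sample` in a fragment mask.

    Each sample occupies 4 bits, with sample 0 in the least significant
    bits, so a 32-bit mask can describe at most 8 samples.
    """
    if not 0 <= sample < 8:
        raise ValueError("a 32-bit mask holds at most 8 four-bit entries")
    return (mask >> (4 * sample)) & 0xF
```

For example, with a mask of `0x3210`, samples 0 through 3 map to fragment indices 0 through 3 respectively.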

-
-
-

Result Type must be a 32-bit unsigned integer type scalar.

-
-
-

Image must be an object whose type is OpTypeImage with MS of 1. Dim must be 2D or SubpassData

-
-
-

Coordinate is an integer scalar or vector containing (u[, v] …​ [, array layer]) -as needed by the definition of Sampled Image. When the Image Dim operand is SubpassData, -Coordinate is relative to the current fragment location. That is, the integer value -(rounded down) of the current fragment’s window-relative (x, y) coordinate is added to (u, v).

-
- -------- - - - - - - - - - - -

5

5011

<id> Result Type

Result <id>

<id> Image

<id> Coordinate

-
-
-

OpFragmentFetchAMD

-
-

The color fragment for a particular sample can be fetched using the corresponding fragment -mask index and calling fragmentFetchAMD.

-
-
-

Result Type must be a vector of four components of floating-point type or integer type. -Its components must be the same as Sampled Type of the underlying OpTypeImage (unless that -underlying Sampled Type is OpTypeVoid).

-
-
-

Image must be an object whose type is OpTypeImage with MS of 1. Dim must be 2D or SubpassData

-
-
-

Coordinate is an integer scalar or vector containing (u[, v] …​ [, array layer]) -as needed by the definition of Sampled Image. When the Image Dim operand is SubpassData, -Coordinate is relative to the current fragment location. That is, the integer value -(rounded down) of the current fragment’s window-relative (x, y) coordinate is added to (u, v).

-
-
-

Fragment Index is the fragment mask index used to sample the color fragment.

-
- --------- - - - - - - - - - - - -

6

5012

<id> Result Type

Result <id>

<id> Image

<id> Coordinate

<id> Fragment Index

-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

August 16, 2017

Aaron Hagan

Initial revision based on SPV_AMD_shader_fragment_mask.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_fragment_mask.html + + +

extensions/AMD/SPV_AMD_shader_fragment_mask.html

+ + diff --git a/extensions/AMD/SPV_AMD_shader_image_load_store_lod.html b/extensions/AMD/SPV_AMD_shader_image_load_store_lod.html index b2edb41..1105218 100644 --- a/extensions/AMD/SPV_AMD_shader_image_load_store_lod.html +++ b/extensions/AMD/SPV_AMD_shader_image_load_store_lod.html @@ -1,282 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_shader_image_load_store_lod - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_shader_image_load_store_lod

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Quentin Lin, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Shipping.

-
-
-
-
-

Version

-
-
-

Modified Date: 08/21/2017

-
-
-

Revision: 1

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_shader_image_load_store_lod.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_shader_image_load_store_lod, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension enhances Lod image operand to also support OpImageRead, OpImageWrite -and OpImageSparseRead image instructions.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_shader_image_load_store_lod extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_shader_image_load_store_lod"
-
-
-
-
-
-

Summary

-
-
-

This extension:

-
-
-
    -
  • -

    adds a new CapabilityImageReadWriteLodAMD capability.

    -
  • -
  • -

    expands definition of Lod image operand, as described in the overview.

    -
  • -
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

CapabilityImageReadWriteLodAMD

5015

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.31, Capability

-
-
-

Append the following Capability to the table:

-
- ----- - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

CapabilityImageReadWriteLodAMD

-

Expands Lod image operand definition to support image read/write/sparse read operations.

CapabilityShader

SPV_AMD_shader_image_load_store_lod

-
-

Modify Section 3.14, Image Operands

-
-
-

Lod

-
-

(Replace the following sentence:)
-
-Only valid with explicit-lod instructions.
-
-(with:)
-
-Only valid with explicit-lod instructions, OpImageRead, OpImageWrite, or OpImageSparseRead.

-
-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

08/21/2017

Dominik Witczak

Initial revision based on AMD_shader_image_load_store_lod.

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_image_load_store_lod.html + + +

extensions/AMD/SPV_AMD_shader_image_load_store_lod.html

+ + diff --git a/extensions/AMD/SPV_AMD_shader_trinary_minmax.html b/extensions/AMD/SPV_AMD_shader_trinary_minmax.html index 5417e9c..dd20e6b 100644 --- a/extensions/AMD/SPV_AMD_shader_trinary_minmax.html +++ b/extensions/AMD/SPV_AMD_shader_trinary_minmax.html @@ -1,415 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_shader_trinary_minmax - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_shader_trinary_minmax

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Matthäus G. Chajdas, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 AMD.

-
-
-
-
-

Status

-
-
-

Released.

-
-
-
-
-

Version

-
-
-

Modified Date: October 13, 2016 -Revision: 4

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.1 of the -SPIR-V Specification.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_shader_trinary_minmax, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension introduces nine new trinary extended instructions to SPIR-V. -These functions allow the minimum, maximum or median of three inputs to be found -with a single function call. These operations may be useful for sorting and -filtering operations, for example. By explicitly performing a trinary operation -with a single built-in function, shader compilers and optimizers may be able to -generate better instruction sequences for sorting and other multi-input -functions.
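For reference, the three operations can be expressed with ordinary two-input min and max, as in the scalar Python sketch below. The names `min3`, `max3`, and `mid3` are illustrative only and do not belong to the extension; the sketch ignores NaN inputs, for which the floating-point variants leave the result undefined.

```python
def min3(x, y, z):
    """Smallest of three values."""
    return min(x, min(y, z))


def max3(x, y, z):
    """Largest of three values."""
    return max(x, max(y, z))


def mid3(x, y, z):
    """Median of three values, built from two-input min and max."""
    return max(min(x, y), min(max(x, y), z))
```

The `mid3` form uses four two-input operations, which hints at why a single trinary instruction can give a compiler more room to generate a shorter sequence.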

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_shader_trinary_minmax extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_shader_trinary_minmax"
-
-
-
-
-
-

New Instructions

-
-
-

This extension adds the following extended instructions:

-
-
-
-
FMin3AMD = 1
-UMin3AMD = 2
-SMin3AMD = 3
-FMax3AMD = 4
-UMax3AMD = 5
-SMax3AMD = 6
-FMid3AMD = 7
-UMid3AMD = 8
-SMid3AMD = 9
-
-
-
-

To use these extended instructions, declare:

-
-
-
-
%ext = OpExtInstImport "SPV_AMD_shader_trinary_minmax"
-
-
-
-

FMin3AMD

-
-

Returns the per-component minimum value of x, y, and z. The result is undefined -if one of the operands is a NaN.

-
-
-

The operands must all be a scalar or vector whose component type is floating-point.

-
-
-

Result Type and the type of all operands must be the same type. Results are -computed per component.

-
-
-
-
4 | 1 | <id> x | <id> y | <id> z
-
-
-
-
-

UMin3AMD

-
-

Returns the per-component minimum value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is unsigned integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 2 | <id> x | <id> y | <id> z
-
-
-
-
-

SMin3AMD

-
-

Returns the per-component minimum value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is signed integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 3 | <id> x | <id> y | <id> z
-
-
-
-
-

FMax3AMD

-
-

Returns the per-component maximum value of x, y, and z. The result is undefined -if one of the operands is a NaN.

-
-
-

The operands must all be a scalar or vector whose component type is floating-point.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 4 | <id> x | <id> y | <id> z
-
-
-
-
-

UMax3AMD

-
-

Returns the per-component maximum value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is unsigned -integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 5 | <id> x | <id> y | <id> z
-
-
-
-
-

SMax3AMD

-
-

Returns the per-component maximum value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is signed -integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 6 | <id> x | <id> y | <id> z
-
-
-
-
-

FMid3AMD

-
-

Returns the per-component median value of x, y, and z. The result is undefined if -one of the operands is a NaN.

-
-
-

The operands must all be a scalar or vector whose component type is floating-point.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 7 | <id> x | <id> y | <id> z
-
-
-
-
-

UMid3AMD

-
-

Returns the per-component median value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is unsigned integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 8 | <id> x | <id> y | <id> z
-
-
-
-
-

SMid3AMD

-
-

Returns the per-component median value of x, y, and z.

-
-
-

The operands must all be a scalar or vector whose component type is signed integer.

-
-
-

Result Type and the type of all operands must be the same type. Results are computed -per component.

-
-
-
-
4| 9 | <id> x | <id> y | <id> z
-
-
-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

April 21, 2016

Quentin Lin

Initial revision based on AMD_shader_trinary_minmax.

2

May 20, 2016

Dominik Witczak

Document refactoring

3

May 30, 2016

Dominik Witczak

Minor corrections

4

October 13, 2016

Dominik Witczak

Added missing numerical value assignments, removed extension number

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_shader_trinary_minmax.html + + +

extensions/AMD/SPV_AMD_shader_trinary_minmax.html

+ + diff --git a/extensions/AMD/SPV_AMD_texture_gather_bias_lod.html b/extensions/AMD/SPV_AMD_texture_gather_bias_lod.html index 5a472b6..54b843d 100644 --- a/extensions/AMD/SPV_AMD_texture_gather_bias_lod.html +++ b/extensions/AMD/SPV_AMD_texture_gather_bias_lod.html @@ -1,318 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_AMD_texture_gather_bias_lod - - - - - -
-
-

Name Strings

-
-
-

SPV_AMD_texture_gather_bias_lod

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Graham Sellers, AMD

    -
  • -
  • -

    Matthäus G. Chajdas, AMD

    -
  • -
  • -

    Qun Lin, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Timothy Lottes, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Proposed.

-
-
-
-
-

Version

-
-
-

Modified Date: June 22, 2017 -Revision: 2

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 4 of the version 1.1 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -AMD_texture_gather_bias_lod.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -AMD_texture_gather_bias_lod, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension lets applications specify bias of implicit level of detail and -explicit control of level of detail for texture gather operations.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_AMD_texture_gather_bias_lod extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_AMD_texture_gather_bias_lod"
-
-
-
-
-
-

Summary

-
-
-

This extension lets applications specify a bias applied to the implicit level of -detail, or the explicit level of detail to use for texture gather operations.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

CapabilityImageGatherBiasLodAMD

5009

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.14, Image Operands

-
-
-

(Replace the following language, as included in the "Image Operands" section of the table -for the "Bias" Image Operand:)
-
-Only valid with implicit-lod instructions.
-
-(with:)
-
-Only valid with implicit-lod instructions, OpImageGather, or OpImageSparseGather.
-
-
-(Replace the following language, as included in the "Image Operands" section of the table - for the "Lod" Image Operand):
-
-Only valid with explicit-lod instructions.
-
-(with:)
-
-Only valid with explicit-lod instructions, OpImageGather, or OpImageSparseGather.

-
-
-

Modify Section 3.31, Capability

-
-
-

Append the following Capability to the table:

-
- ----- - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

ImageGatherBiasLodAMD

-

Uses texture gather with either bias added to the implicit level-of-detail or explicit level-of-detail.

Shader

SPV_AMD_texture_gather_bias_lod

-
-
-
-

Validation Rules

-
-
-
    -
  • -

    An instruction can have at most one of the Lod and Bias image operands.

    -
  • -
-
-
-
-
-

Issues

-
-
-

1) What level-of-detail do texture gather functions use, if the extension is defined?

-
-
-

RESOLVED: If the SPV_AMD_texture_gather_bias_lod extension is enabled, all texture -gather functions (i.e., both the ones that do not take the extra bias argument and -the ones that do) fetch texels from the implicit LOD in the fragment shader stage. In all -other shader stages, the base level is used instead.

-
-
-

If the extension is disabled, all texture gather functions always fetch texels -from the base mip level.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

3

May 27, 2020

Nicolai Hähnle

Explicitly state that Lod and Bias cannot be used -simultaneously in the same instruction.

2

June 22, 2017

Dominik Witczak

Typo fix (OpCapabilityImageGatherBiasLodAMD ⇒ CapabilityImageGatherBiasLodAMD)

1

February 21, 2017

Dominik Witczak

Initial revision based on AMD_texture_gather_bias_lod

-
-
-
- - \ No newline at end of file + + + + + + extensions/AMD/SPV_AMD_texture_gather_bias_lod.html + + +

extensions/AMD/SPV_AMD_texture_gather_bias_lod.html

+ + diff --git a/extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html b/extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html index 777df81..7de386c 100644 --- a/extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html +++ b/extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html @@ -1,308 +1,12 @@ - - - - - - - -SPV_ARM_cooperative_matrix_layouts - - - - - -
-
-

Name Strings

-
-
-

SPV_ARM_cooperative_matrix_layouts

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kevin Petit, Arm Ltd.

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Arm Ltd.

-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-05-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.6.

-
-
-

This extension requires SPV_KHR_cooperative_matrix.

-
-
-
-
-

Overview

-
-
-

This extension adds support for cooperative matrix memory layouts used on Arm GPUs.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_ARM_cooperative_matrix_layouts"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Validation Rules

-
-

Modify section 2.16.1 Universal Validation Rules:

-
-
    -
  • -

    If the MemoryLayout provided to OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR is RowBlockedInterleavedARM -or ColumnBlockedInterleavedARM then their Result Type, or -Object, respectively, must be a cooperative matrix type whose -Rows is a multiple of 4 and whose Columns is a multiple of -16 / sizeof(Component Type).

    -
  • -
  • -

    If the MemoryLayout provided to OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR is RowBlockedInterleavedARM -then their Result Type, or Object, respectively, must be a cooperative -matrix type whose Columns is a multiple of 16 / sizeof(Component Type) -multiplied by the Stride operand to the OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR instruction.

    -
  • -
  • -

    If the MemoryLayout provided to OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR is ColumnBlockedInterleavedARM -then their Result Type, or Object, respectively, must be a cooperative -matrix type whose Rows is a multiple of 4 times the Stride operand to -the OpCooperativeMatrixLoadKHR or OpCooperativeMatrixStoreKHR instruction.
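The three rules above can be read as a single dimension check, sketched below in Python. This is one interpretation written for illustration, not validator code: the function name `layout_dims_valid` and its parameters are hypothetical, `component_size` is assumed to be the size of Component Type in bytes (and to divide 16), and `stride` is assumed to be a positive constant known at check time.

```python
def layout_dims_valid(layout, rows, columns, component_size, stride):
    """Check matrix dimensions against the blocked-interleaved layout rules.

    A block holds 4 rows by (16 / component_size) columns. Both layouts
    require whole blocks; each then adds one Stride-scaled constraint.
    """
    cols_per_block = 16 // component_size
    if rows % 4 != 0 or columns % cols_per_block != 0:
        return False
    if layout == "RowBlockedInterleavedARM":
        return columns % (cols_per_block * stride) == 0
    if layout == "ColumnBlockedInterleavedARM":
        return rows % (4 * stride) == 0
    raise ValueError(f"unexpected layout: {layout}")
```

With 2-byte components a block is 4 by 8, so a 4 by 16 matrix with a stride of 2 passes the row-blocked check, while a 4 by 8 matrix with the same stride does not.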

    -
  • -
-
-
-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

4201

CooperativeMatrixLayoutsARM
-Uses ARM cooperative matrix layouts

CooperativeMatrixKHR

-
-
-
-
-

3.X Cooperative Matrix Layout

-
-

Add the following to the table introduced by SPV_KHR_cooperative_matrix:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Cooperative Matrix LayoutEnabling Capabilities

4202

RowBlockedInterleavedARM
-Matrix elements are grouped in blocks of 64 bytes. Each block stores a sub-matrix -of 4 rows and a number of columns that depends on the size of the Component Type -of the matrix. The number of columns in a block is given by -16 / sizeof(Component Type). The matrix elements within individual blocks are -laid out in row-major order. Blocks are laid out in row-major order. Blocks are -interleaved at a 4-byte granularity in groups whose size is given by the -Stride operand to OpCooperativeMatrixLoadKHR or OpCooperativeMatrixStoreKHR.

CooperativeMatrixLayoutsARM

4203

ColumnBlockedInterleavedARM
-Matrix elements are grouped in blocks of 64 bytes. Each block stores a sub-matrix -of 4 rows and a number of columns that depends on the size of the Component Type -of the matrix. The number of columns in a block is given by -16 / sizeof(Component Type). The matrix elements within individual blocks are -laid out in row-major order. Blocks are laid out in column-major order. Blocks are -interleaved at a 4-byte granularity in groups whose size is given by the -Stride operand to OpCooperativeMatrixLoadKHR or OpCooperativeMatrixStoreKHR.

CooperativeMatrixLayoutsARM

-
-
-
-
-
-
-
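The block geometry described in the layout table can be checked with a little arithmetic. The following is a minimal sketch, not part of the extension: the function names are hypothetical, and it only restates the "16 / sizeof(Component Type)" rule and the "Rows is a multiple of 4 times the Stride operand" rule quoted above for ColumnBlockedInterleavedARM.

```python
BLOCK_ROWS = 4        # every block stores a sub-matrix of 4 rows
BLOCK_ROW_BYTES = 16  # bytes per block row, so 4 rows * 16 bytes = 64 bytes


def columns_per_block(component_size_bytes):
    """Columns in one block: 16 / sizeof(Component Type)."""
    if component_size_bytes <= 0 or BLOCK_ROW_BYTES % component_size_bytes != 0:
        raise ValueError("component size must evenly divide 16 bytes")
    return BLOCK_ROW_BYTES // component_size_bytes


def block_size_bytes(component_size_bytes):
    """Total bytes in one block; 64 for every component size accepted above."""
    return BLOCK_ROWS * columns_per_block(component_size_bytes) * component_size_bytes


def rows_valid_for_column_blocked(rows, stride):
    """Hypothetical check: Rows must be a multiple of 4 times the Stride operand."""
    return stride > 0 and rows % (BLOCK_ROWS * stride) == 0


# 8-bit, 16-bit and 32-bit components give 16, 8 and 4 columns per block.
for size in (1, 2, 4):
    print(size, columns_per_block(size), block_size_bytes(size))
```

Under these assumptions, a 16-row matrix with a Stride of 2 passes the check, while a 12-row matrix does not.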

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-05-29

Kevin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html + + +

extensions/ARM/SPV_ARM_cooperative_matrix_layouts.html

+ + diff --git a/extensions/ARM/SPV_ARM_core_builtins.html b/extensions/ARM/SPV_ARM_core_builtins.html index 47fb045..dedff3b 100644 --- a/extensions/ARM/SPV_ARM_core_builtins.html +++ b/extensions/ARM/SPV_ARM_core_builtins.html @@ -1,274 +1,12 @@ - - - - - - - -SPV_ARM_core_builtins - - - - - -
-
-

Name Strings

-
-
-

SPV_ARM_core_builtins

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kévin Petit, Arm Ltd.

    -
  • -
  • -

    Christopher Gautier, Arm Ltd.

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 Arm Ltd.

-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-11-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new builtin decorations that can be used to decorate -integer variables, giving programs a means to query information about the -cores and warps they are running on.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_ARM_core_builtins"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Builtin

-
-

Modify section 3.21, "Builtin", adding these rows to the Builtin table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltinEnabling capability

4160

CoreIDARM
-An ID between 0 and CoreMaxIDARM for the core the current invocation is running on.

CoreBuiltinsARM

4161

CoreCountARM
-The number of cores on the device.

CoreBuiltinsARM

4162

CoreMaxIDARM
-The max ID that can be reported for a core. This may be different from CoreCountARM - 1.

CoreBuiltinsARM

4163

WarpIDARM
-An ID between 0 and WarpMaxIDARM (per core) for the warp the current invocation is running on.

CoreBuiltinsARM

4164

WarpMaxIDARM
-The max ID that can be reported for a warp.

CoreBuiltinsARM

-
-
-
-
-
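Because CoreMaxIDARM may differ from CoreCountARM - 1, any per-core table indexed by CoreIDARM is safer when sized from the maximum ID than from the core count. The sketch below illustrates that point on the host side only. The values 6 and 9 are made up for illustration, and `make_per_core_table` is a hypothetical helper rather than anything defined by the extension.

```python
def make_per_core_table(core_max_id):
    """Allocate one slot for every ID in the inclusive range 0..core_max_id."""
    return [0] * (core_max_id + 1)


# Hypothetical device: 6 cores whose IDs are not densely numbered.
core_count = 6
core_max_id = 9

table = make_per_core_table(core_max_id)

# An invocation reporting the highest possible core ID still has a valid slot.
observed_core_id = 9
table[observed_core_id] += 1

print(len(table), table[observed_core_id])
```

Sizing the table from `core_count` instead would leave IDs 6 through 9 without a slot in this made-up configuration.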

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

4165

CoreBuiltinsARM
-Uses the CoreIDARM, CoreCountARM, CoreMaxIDARM, WarpIDARM, WarpMaxIDARM builtin decorations.

-
-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-11-29

Kévin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/ARM/SPV_ARM_core_builtins.html + + +

extensions/ARM/SPV_ARM_core_builtins.html

+ + diff --git a/extensions/EXT/SPV_EXT_arithmetic_fence.html b/extensions/EXT/SPV_EXT_arithmetic_fence.html index 5e5b9a5..6df702a 100644 --- a/extensions/EXT/SPV_EXT_arithmetic_fence.html +++ b/extensions/EXT/SPV_EXT_arithmetic_fence.html @@ -1,373 +1,12 @@ - - - - - - - -SPV_EXT_arithmetic_fence - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_arithmetic_fence

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Arvind Sudarsanam, Intel

    -
  • -
  • -

    Pawel Jurek, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Greg Lueck, Intel

    -
  • -
  • -

    Kévin Petit, ARM

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-07-16

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds the OpArithmeticFenceEXT instruction, which prevents fast-math -optimizations between its argument and the expression that contains it. -For example, under a fast floating-point model a compiler can perform reassociation:

-
-
-

If all these OpFAdd instructions have the "Fast" FP fast math mode then:

-
-
-
-
%ab = OpFAdd %float %a %b
-%abc = OpFAdd %float %ab %c
-%abc_fence = OpArithmeticFenceEXT %float %abc
-%result = OpFAdd %float %abc_fence %d
-
-
-
-

can be transformed into:

-
-
-
-
%bc = OpFAdd %float %b %c
-%abc = OpFAdd %float %a %bc
-%abc_fence = OpArithmeticFenceEXT %float %abc
-%result = OpFAdd %float %abc_fence %d
-
-
-
-

but not into:

-
-
-
-
%ab = OpFAdd %float %a %b
-%ab_fence = OpArithmeticFenceEXT %float %ab
-%cd = OpFAdd %float %c %d
-%result = OpFAdd %float %ab_fence %cd
-
-
-
-

This instruction is the equivalent of the llvm.arithmetic.fence intrinsic function.
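Reassociation matters because floating-point addition is not associative, so regrouping operands can change the result. The following sketch only demonstrates that numerical effect with IEEE 754 double precision in Python. It does not model SPIR-V or any compiler, and the operand values are chosen purely for illustration.

```python
# Operands picked so that the grouping changes the rounded result.
a, b, c = 1e16, -1e16, 1.0

# Source order: a and b cancel exactly, then c is added.
left_to_right = (a + b) + c

# Reassociated order: b + c rounds back to -1e16, so c is lost.
reassociated = a + (b + c)

print(left_to_right, reassociated)
```

A fence around a subexpression asks the optimizer to keep that grouping intact, which is how a program can opt out of this kind of regrouping for selected values while still using fast-math modes elsewhere.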

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_EXT_arithmetic_fence"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ArithmeticFenceEXT
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the ArithmeticFenceEXT capability:

-
-
-
-
OpArithmeticFenceEXT
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

ArithmeticFenceEXT

6144

OpArithmeticFenceEXT

6145

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

6144

ArithmeticFenceEXT
-Allows use of the OpArithmeticFenceEXT instruction

-
-
-
-
-

Instructions

-
-

In section 3.49.1. Miscellaneous Instructions, add a new instruction:

-
- ------- - - - - - - - - - - - - - -

OpArithmeticFenceEXT
-Returns Target. Indicates that the optimizer cannot move or combine Target -with the expression that uses Result of the instruction.
-
-Target must be scalar or vector of floating-point type.
-
-Result Type must be the same as the return type of the Target instruction.
-

Capability: -ArithmeticFenceEXT

4

6145

Result Type <id>

Result <id>

Target <id>

-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-05-26

Dmitry Sidorov

Initial revision

2

2024-07-16

Dmitry Sidorov

Prepare for publication

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_arithmetic_fence.html + + +

extensions/EXT/SPV_EXT_arithmetic_fence.html

+ + diff --git a/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html b/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html index 64f9c88..bdd03de 100644 --- a/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html +++ b/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html @@ -1,317 +1,12 @@ - - - - - - - -SPV_EXT_demote_to_helper_invocation - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_demote_to_helper_invocation

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA Corporation

    -
  • -
  • -

    Alan Baker, Google LLC

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-06-06

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new OpDemoteToHelperInvocationEXT instruction which is -similar to OpKill in that it suppresses subsequent stores and writes to -outputs, but is not a flow control instruction and does not necessarily terminate -the shader invocation. This is a better match for D3D’s discard instruction, -and preserves the ability to rely on uniform flow control for derivatives -after the discard.

-
-
-

This extension also adds a new OpIsHelperInvocationEXT instruction which -returns whether the invocation is currently a helper invocation. That is, at -the beginning of a fragment shader invocation it returns the same value as -the HelperInvocation input, and after demotion it returns true. The -HelperInvocation builtin decoration is used on a variable in the Input -storage class, and it wouldn’t make sense for an input variable’s value to -change over the course of the invocation’s execution.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_demote_to_helper_invocation"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

5379

DemoteToHelperInvocationEXT
-Allow the OpDemoteToHelperInvocationEXT and OpIsHelperInvocationEXT -instructions

Shader

-
-
-
-
-

Instructions

-
-

Modify Section 3.32.1, "Miscellaneous Instructions", adding the new instructions:

-
- ---- - - - - - - - - - - -

OpDemoteToHelperInvocationEXT
-
-Demote fragment shader invocation to a helper invocation. Any stores to memory -after this instruction are suppressed and the fragment does not write outputs to -the framebuffer.
-
-Unlike the OpKill instruction, this does not necessarily terminate the -invocation. It is not considered a flow control instruction (flow control does -not become non-uniform) and does not terminate the block. The implementation -may terminate helper invocations before the end of the shader as an -optimization, but doing so must not affect derivative calculations and does not -make control flow non-uniform.
-
-After this instruction executes, the value of a HelperInvocation builtin -variable is undefined. Use OpIsHelperInvocationEXT to determine whether -invocations are helper invocations in the presence of -OpDemoteToHelperInvocationEXT.
-
-This instruction is only valid in the Fragment Execution Model.

Capability:
-DemoteToHelperInvocationEXT

1

5380

- ------ - - - - - - - - - - - - -

OpIsHelperInvocationEXT
-
-Result is true if the invocation is currently a helper invocation, -otherwise result is false. An invocation is currently a helper invocation -if it was originally invoked as a helper invocation or if it has been demoted -to a helper invocation by OpDemoteToHelperInvocationEXT.
-
-Result Type must be a Boolean type scalar.
-
-This instruction is only valid in the Fragment Execution Model.

Capability:
-DemoteToHelperInvocationEXT

3

5381

<id> Result Type

<id> Result

-
-
-
-
-
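The interaction between the two instructions can be pictured as a small state machine. This is a toy model under stated assumptions, not an implementation: the class and method names are hypothetical, and it assumes stores from an invocation that started as a helper are also dropped, which the text above does not state.

```python
class FragmentInvocation:
    """Toy model of one fragment shader invocation."""

    def __init__(self, started_as_helper):
        # Value of the HelperInvocation input at the start of the shader.
        self.helper_input = started_as_helper
        # Current state, as OpIsHelperInvocationEXT would report it.
        self._is_helper = started_as_helper
        self.memory = {}

    def demote_to_helper(self):
        # Models OpDemoteToHelperInvocationEXT: execution continues afterwards.
        self._is_helper = True

    def is_helper(self):
        # Models OpIsHelperInvocationEXT.
        return self._is_helper

    def store(self, key, value):
        # Assumption: stores are suppressed whenever the invocation is a helper.
        if not self._is_helper:
            self.memory[key] = value


inv = FragmentInvocation(started_as_helper=False)
inv.store("before", 1)
inv.demote_to_helper()
inv.store("after", 2)  # suppressed, but the invocation keeps running

print(inv.is_helper(), inv.helper_input, inv.memory)
```

The model keeps `helper_input` fixed to mirror the point that an input variable does not change during execution, which is why a separate query instruction is needed.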

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-06-06

Jeff Bolz

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_demote_to_helper_invocation.html + + +

extensions/EXT/SPV_EXT_demote_to_helper_invocation.html

+ + diff --git a/extensions/EXT/SPV_EXT_descriptor_indexing.html b/extensions/EXT/SPV_EXT_descriptor_indexing.html index 07e2762..632e7a4 100644 --- a/extensions/EXT/SPV_EXT_descriptor_indexing.html +++ b/extensions/EXT/SPV_EXT_descriptor_indexing.html @@ -1,644 +1,12 @@ - - - - - - - -SPV_EXT_descriptor_indexing - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_descriptor_indexing

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Neil Henning, Codeplay

    -
  • -
  • -

    Matthaeus Chajdas, AMD

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-12-17

Revision

5

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_KHR_storage_buffer_storage_class.

-
-
-

This extension provides SPIR-V support for the GL_EXT_nonuniform_qualifier -GLSL extension.

-
-
-
-
-

Overview

-
-
-

This extension adds new capabilities to support the Vulkan -VK_EXT_descriptor_indexing extension along with a new decoration -in order to enable support for the GL_EXT_nonuniform_qualifier -GLSL extension.

-
-
-

The NonUniformEXT decoration is used to indicate that a variable -or instruction is non-uniform (or divergent control flow) for -different invocations. The ShaderNonUniformEXT capability is -used to indicate that a module uses this decoration.

-
-
-

The RuntimeDescriptorArrayEXT capability is used to indicate that -a module uses arrays of resources which are declared with -OpTypeRuntimeArray.

-
-
-

The InputAttachmentArrayDynamicIndexingEXT, -UniformTexelBufferArrayDynamicIndexingEXT, and -StorageTexelBufferArrayDynamicIndexingEXT capabilities are used to -indicate that a module uses an array of InputAttachment, SampledBuffer, -or ImageBuffer, respectively, with dynamic indexing.

-
-
-

The UniformBufferArrayNonUniformIndexingEXT, -SampledImageArrayNonUniformIndexingEXT, -StorageBufferArrayNonUniformIndexingEXT, -StorageImageArrayNonUniformIndexingEXT, -InputAttachmentArrayNonUniformIndexingEXT, -UniformTexelBufferArrayNonUniformIndexingEXT, and -StorageTexelBufferArrayNonUniformIndexingEXT capabilities are used to -indicate that a module uses an array of Block-decorated uniforms, -sampled images, StorageBuffer, non-sampled images, InputAttachment, -SampledBuffer, or ImageBuffer resources, respectively, with -non-uniform indexing.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_descriptor_indexing"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the following new capabilities:

-
-
-
-
ShaderNonUniformEXT
-RuntimeDescriptorArrayEXT
-InputAttachmentArrayDynamicIndexingEXT
-UniformTexelBufferArrayDynamicIndexingEXT
-StorageTexelBufferArrayDynamicIndexingEXT
-UniformBufferArrayNonUniformIndexingEXT
-SampledImageArrayNonUniformIndexingEXT
-StorageBufferArrayNonUniformIndexingEXT
-StorageImageArrayNonUniformIndexingEXT
-InputAttachmentArrayNonUniformIndexingEXT
-UniformTexelBufferArrayNonUniformIndexingEXT
-StorageTexelBufferArrayNonUniformIndexingEXT
-
-
-
-
-
-

New Decorations

-
-
-

Decoration added under the ShaderNonUniformEXT capability:

-
-
-
-
NonUniformEXT
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NameValueUsage

NonUniformEXT

5300

Decoration

ShaderNonUniformEXT

5301

Capability

RuntimeDescriptorArrayEXT

5302

Capability

InputAttachmentArrayDynamicIndexingEXT

5303

Capability

UniformTexelBufferArrayDynamicIndexingEXT

5304

Capability

StorageTexelBufferArrayDynamicIndexingEXT

5305

Capability

UniformBufferArrayNonUniformIndexingEXT

5306

Capability

SampledImageArrayNonUniformIndexingEXT

5307

Capability

StorageBufferArrayNonUniformIndexingEXT

5308

Capability

StorageImageArrayNonUniformIndexingEXT

5309

Capability

InputAttachmentArrayNonUniformIndexingEXT

5310

Capability

UniformTexelBufferArrayNonUniformIndexingEXT

5311

Capability

StorageTexelBufferArrayNonUniformIndexingEXT

5312

Capability

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Add a new Section 2.X, Uniformity)
-
-
-
-
-

SPIR-V has multiple notions of uniformity of values. A result <id> decorated -as Uniform (for a particular scope) is a contract that all invocations -within that scope will compute the same value for that result, for a given -dynamic instance of an instruction. This is useful to enable implementations -to store results in a scalar register file (scalarization), for example. -Results are assumed not to be uniform unless decorated as such.

-
-
-

An <id> is defined to be dynamically uniform for a dynamic instance of an -instruction if all invocations (in an invocation group) that execute the -dynamic instance have the same value for that <id>. This is not something that -is explicitly decorated, it is just a property that arises. This property is -assumed to hold for operands of certain instructions, such as the Image -operand of image instructions, unless that operand is decorated as -NonUniformEXT. Some implementations require more complex instruction -expansions to handle non-dynamically uniform values in certain instructions, -and thus it is mandatory for certain operands to be decorated as -NonUniformEXT if they are not guaranteed to be dynamically uniform.

-
-
-

While the names may suggest otherwise, nothing forbids an <id> from being -decorated as both Uniform and NonUniformEXT. Since dynamically uniform -is at a larger scope (invocation group) than the default Uniform scope -(subgroup), it is even possible for the <id> to be uniform at the subgroup -scope but not dynamically uniform.

-
-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5300

NonUniformEXT
-Apply to an object. Asserts that the value -backing the decorated <id> is not dynamically uniform. -See the API specification for more information.

ShaderNonUniformEXT

-
-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5301

ShaderNonUniformEXT
-Uses the NonUniformEXT decoration on a variable or instruction.

Shader

SPV_EXT_descriptor_indexing

5302

RuntimeDescriptorArrayEXT
-Uses arrays of resources which are sized at run-time.

Shader

SPV_EXT_descriptor_indexing

5303

InputAttachmentArrayDynamicIndexingEXT
-Arrays of InputAttachment-s use dynamically uniform indexing.

InputAttachment

SPV_EXT_descriptor_indexing

5304

UniformTexelBufferArrayDynamicIndexingEXT
-Arrays of SampledBuffer-s use dynamically uniform indexing

SampledBuffer

SPV_EXT_descriptor_indexing

5305

StorageTexelBufferArrayDynamicIndexingEXT
-Arrays of ImageBuffer-s use dynamically uniform indexing

ImageBuffer

SPV_EXT_descriptor_indexing

5306

UniformBufferArrayNonUniformIndexingEXT
-Block-decorated arrays in uniform storage classes use non-uniform indexing.

ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5307

SampledImageArrayNonUniformIndexingEXT
-Arrays of sampled images use non-uniform indexing.

ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5308

StorageBufferArrayNonUniformIndexingEXT
-Arrays in the StorageBuffer Storage Class, or BufferBlock-decorated arrays -use non-uniform indexing.

ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5309

StorageImageArrayNonUniformIndexingEXT
-Arrays of non-sampled images use non-uniform indexing.

ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5310

InputAttachmentArrayNonUniformIndexingEXT
-Arrays of InputAttachment-s use non-uniform indexing.

InputAttachment, ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5311

UniformTexelBufferArrayNonUniformIndexingEXT
-Arrays of SampledBuffer-s use non-uniform indexing.

SampledBuffer, ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

5312

StorageTexelBufferArrayNonUniformIndexingEXT
-Arrays of ImageBuffer-s use non-uniform indexing.

ImageBuffer, ShaderNonUniformEXT

SPV_EXT_descriptor_indexing

-
-
-
-
-
-
-
-
-
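The "Depends On" column of the Capability table forms a small dependency graph. As a rough sketch, the rows can be transcribed into a dictionary and walked to list everything a capability pulls in. The dictionary covers only the rows added by this extension. Core capabilities such as Shader and InputAttachment are treated as leaves here, which is a simplification, and `implied_capabilities` is a hypothetical helper.

```python
# "Depends On" column for the capabilities added by SPV_EXT_descriptor_indexing.
DEPENDS_ON = {
    "ShaderNonUniformEXT": ["Shader"],
    "RuntimeDescriptorArrayEXT": ["Shader"],
    "InputAttachmentArrayDynamicIndexingEXT": ["InputAttachment"],
    "UniformTexelBufferArrayDynamicIndexingEXT": ["SampledBuffer"],
    "StorageTexelBufferArrayDynamicIndexingEXT": ["ImageBuffer"],
    "UniformBufferArrayNonUniformIndexingEXT": ["ShaderNonUniformEXT"],
    "SampledImageArrayNonUniformIndexingEXT": ["ShaderNonUniformEXT"],
    "StorageBufferArrayNonUniformIndexingEXT": ["ShaderNonUniformEXT"],
    "StorageImageArrayNonUniformIndexingEXT": ["ShaderNonUniformEXT"],
    "InputAttachmentArrayNonUniformIndexingEXT": ["InputAttachment", "ShaderNonUniformEXT"],
    "UniformTexelBufferArrayNonUniformIndexingEXT": ["SampledBuffer", "ShaderNonUniformEXT"],
    "StorageTexelBufferArrayNonUniformIndexingEXT": ["ImageBuffer", "ShaderNonUniformEXT"],
}


def implied_capabilities(capability):
    """Return every capability reachable through the Depends On column."""
    seen = set()
    stack = [capability]
    while stack:
        current = stack.pop()
        # Capabilities missing from the table are treated as leaves.
        for dep in DEPENDS_ON.get(current, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


print(sorted(implied_capabilities("InputAttachmentArrayNonUniformIndexingEXT")))
```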

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_descriptor_indexing"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    SPIR-V 1.3 Core spec says the following for OpTypeRuntimeArray -"Objects of this type can only be created with OpVariable using the -StorageBuffer or Uniform Storage Classes. This should be removed and -defer to the environment specifications instead."

    -
    -
    -
    -

    RESOLVED: Agree. This is superseded by the language in the Vulkan -and OpenGL SPIR-V Environment specifications which states that -"OpTypeRuntimeArray must only be used for the last member of an -OpTypeStruct in the Uniform Storage Class and is decorated as -BufferBlock."

    -
    -
    -
    -
  2. -
  3. -

    What type of SPIR-V instructions can the NonUniformEXT decoration -be used on?

    -
    -
    -
    -

    RESOLVED: Using the same language as Uniform (Apply to an object). -In SPIR-V, we have the following definition:

    -
    -
    -
    -
    Object: An instantiation of a non-void type, either as the
    -Result <id> of an operation, or created through OpVariable.
    -
    -
    -
    -

    which means it can apply to both variable declarations and specific -instructions.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-01-19

Daniel Koch

Initial draft

2

2018-02-14

Daniel Koch

address WG feedback

3

2018-02-21

Daniel Koch

Resolve issue 2.

4

2018-12-11

Daniel Koch

Update issue 2 for resolution of Issue 317.

5

2018-12-17

Daniel Koch

Add 2.x Uniformity section.

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_descriptor_indexing.html + + +

extensions/EXT/SPV_EXT_descriptor_indexing.html

+ + diff --git a/extensions/EXT/SPV_EXT_fragment_fully_covered.html b/extensions/EXT/SPV_EXT_fragment_fully_covered.html index 9342528..02d7842 100644 --- a/extensions/EXT/SPV_EXT_fragment_fully_covered.html +++ b/extensions/EXT/SPV_EXT_fragment_fully_covered.html @@ -1,327 +1,12 @@ - - - - - - - -SPV_EXT_fragment_fully_covered - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_fragment_fully_covered

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Piers Daniell, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-

Complete

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-07-07

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a new built-in variable FullyCoveredEXT in SPIR-V.

-
-
-

The new functionality is enabled under the FragmentFullyCoveredEXT capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_fragment_fully_covered"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FragmentFullyCoveredEXT
-
-
-
-
-
-

New Builtins

-
-
-

Builtin IDs added:

-
-
-
-
FullyCoveredEXT
-
-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

FullyCoveredEXT

5264

FragmentFullyCoveredEXT

5265

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-
-
(Modify Section 3.21, BuiltIn to include new builtins)
-
-
-
- ----- - - - - - - - - - - - - - -
BuiltInRequired Capability

5264

FullyCoveredEXT
-Rasterized fragment is fully covered by the generating primitive. -Input to the Fragment Execution Model. -See Vulkan EXT_conservative_rasterization extension for more detail.

FragmentFullyCoveredEXT

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5265

FragmentFullyCoveredEXT

Shader

SPV_EXT_fragment_fully_covered

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_fragment_fully_covered"
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-07-07

Daniel Koch

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_fragment_fully_covered.html + + +

extensions/EXT/SPV_EXT_fragment_fully_covered.html

+ + diff --git a/extensions/EXT/SPV_EXT_fragment_invocation_density.html b/extensions/EXT/SPV_EXT_fragment_invocation_density.html index b3a6bd7..f0d4eb6 100644 --- a/extensions/EXT/SPV_EXT_fragment_invocation_density.html +++ b/extensions/EXT/SPV_EXT_fragment_invocation_density.html @@ -1,379 +1,12 @@ - - - - - - - -SPV_EXT_fragment_invocation_density - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_fragment_invocation_density

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Pat Brown, NVIDIA

    -
  • -
  • -

    Matthew Netsch, Qualcomm

    -
  • -
  • -

    Jan-Harald Fredriksen, Arm

    -
  • -
  • -

    Jeff Leger, Qualcomm

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-11-07

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL -GL_EXT_fragment_invocation_density extension.

-
-
-

In the corresponding API extensions, applications can use a texture -(such as a shading rate image, or fragment density map) to control the -number of fragment shader invocations that will be spawned for a -particular neighborhood of covered pixels. We refer to the density -of fragment shader invocations as the "fragment invocation density".

-
-
-

This extension adds support for two new fragment shader built-ins under the -new FragmentDensityEXT capability. These built-ins can be used to determine -the density of fragment shader invocations.

-
-
-

A FragSizeEXT decorated variable will represent the width and height -of a rectangle of pixels that is being shaded by this fragment shader -invocation.

-
-
-

A FragInvocationCountEXT decorated variable will represent the maximum number -of fragment shader invocations executed for each fragment.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_fragment_invocation_density"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add new rows to the Builtin table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

5292

FragSizeEXT
-Input that represents the width and height of a rectangle of pixels -corresponding to this invocation. -Only valid in the Fragment Execution Model. -See the API specification for more detail.

FragmentDensityEXT

SPV_EXT_fragment_invocation_density

5293

FragInvocationCountEXT
-Input that represents the maximum number of fragment shader invocations -executed for each fragment, as derived from the effective invocation density -for the fragment. -Only valid in the Fragment Execution Model. -See the API specification for more detail.

FragmentDensityEXT

SPV_EXT_fragment_invocation_density

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5291

FragmentDensityEXT
-Uses the FragSizeEXT or FragInvocationCountEXT Builtins.

Shader

SPV_EXT_fragment_invocation_density

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_fragment_invocation_density"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension compare to SPV_NV_shading_rate?

    -
    -
    -
    -

    RESOLVED: SPV_NV_shading_rate was intended to support the VK_NV_shading_rate_image -extension. This version is a little bit more generic and is intended -to support both VK_NV_shading_rate_image and VK_EXT_fragment_density_map. -However, neither of those extensions is strictly needed for this extension -to be of interest to applications.

    -
    -
    -

    This extension uses the slightly more generic term -"Fragment invocation density" instead of "shading rate" and the -names of various tokens are different, per the following table, -but otherwise the extensions are intended to provide equivalent -functionality.

    -
    - ---- - - - - - - - - - - - - - - - - - - - - -
    SPV_NV_shading_rateSPV_EXT_fragment_invocation_density

    ShadingRateNV

    FragmentDensityEXT

    FragmentSizeNV

    FragSizeEXT

    InvocationsPerPixelNV

    FragInvocationCountEXT

    -
    -
    -
  2. -
  3. -

    Should we re-use the tokens from SPV_NV_shading_rate or do we need to -assign new ones?

    -
    -
    -
    -

    RESOLVED: Re-using the tokens from SPV_NV_shading_rate as this is meant -to be a drop-in replacement.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-11-07

Daniel Koch

Initial draft

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_fragment_invocation_density.html + + +

extensions/EXT/SPV_EXT_fragment_invocation_density.html

+ + diff --git a/extensions/EXT/SPV_EXT_fragment_shader_interlock.html b/extensions/EXT/SPV_EXT_fragment_shader_interlock.html index 177efad..af28dbf 100644 --- a/extensions/EXT/SPV_EXT_fragment_shader_interlock.html +++ b/extensions/EXT/SPV_EXT_fragment_shader_interlock.html @@ -1,642 +1,12 @@ - - - - - - - -SPV_EXT_fragment_shader_interlock - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_fragment_shader_interlock

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Kerch Holt, NVIDIA

    -
  • -
  • -

    Piers Daniell, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-05-09

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4, Revision 1, Unified.

-
-
-

This extension interacts with the SPV_NV_shading_rate and -SPV_EXT_fragment_invocation_density extensions.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension provides SPIR-V support for the -GL_ARB_fragment_shader_interlock GLSL extension.

-
-
-

This extension provides SPIR-V support for the shading_rate_interlock_ordered -and shading_rate_interlock_unordered layouts defined in the -GL_NV_shading_rate_image GLSL extension.

-
-
-
-
-

Overview

-
-
-

This extension provides new instructions -OpBeginInvocationInterlockEXT and OpEndInvocationInterlockEXT that -delimit a critical section of fragment shader code. For pairs of shader -invocations with "overlapping" coverage in a given pixel, the -implementation will guarantee that the critical section of the fragment -shader will be executed for only one fragment at a time.

-
-
-

There are six different interlock execution modes supported by this extension. -The execution modes PixelInterlockOrderedEXT and PixelInterlockUnorderedEXT -provide mutual exclusion in the critical section for any pair of fragments -corresponding to the same pixel, or pixels if the fragment covers more -than one pixel. When using multisampling, the execution -modes SampleInterlockOrderedEXT and SampleInterlockUnorderedEXT only provide -mutual exclusion for pairs of fragments that both cover at least one -common sample in the same pixel; these are recommended for performance if -shaders use per-sample data structures. When using the ShadingRateNV or -FragmentDensityEXT capabilities, the execution modes ShadingRateInterlockOrderedEXT and -ShadingRateInterlockUnorderedEXT provide mutual exclusion for pairs of -fragments with at least one associated sample in common from -all pixels belonging to the fragments.

-
-
-

Additionally, when the PixelInterlockOrderedEXT, -SampleInterlockOrderedEXT or ShadingRateInterlockOrderedEXT execution -mode is used, the interlock also guarantees that the critical section for -multiple shader invocations with "overlapping" coverage will be executed -in primitive order, as defined by the -Primitive Order -section of the Vulkan specification. -Such a guarantee is useful for applications that perform blending in -the fragment shader, where the application requires fragment values -to be composited in primitive order.

-
-
-

This extension can be useful for algorithms that need to access per-pixel -data structures via shader loads and stores. Algorithms using this -extension can access per-pixel data structures in critical sections -without other invocations accessing the same per-pixel data. Additionally, -the ordering guarantees are useful for cases where the API ordering of -fragments is meaningful. For example, applications may be able to execute -programmable blending operations in the fragment shader, where the -destination buffer is read via image loads and the final value is written -via image stores.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_fragment_shader_interlock"
-
-
-
-
-
-

New Execution Modes

-
-
-

This extension introduces these new execution modes:

-
-
-
-
SampleInterlockOrderedEXT
-SampleInterlockUnorderedEXT
-
-
-
-

which can be used with the FragmentShaderSampleInterlockEXT capability.

-
-
-

This extension introduces these new execution modes:

-
-
-
-
PixelInterlockOrderedEXT
-PixelInterlockUnorderedEXT
-
-
-
-

which can be used with the FragmentShaderPixelInterlockEXT capability.

-
-
-

This extension introduces these new execution modes:

-
-
-
-
ShadingRateInterlockOrderedEXT
-ShadingRateInterlockUnorderedEXT
-
-
-
-

which can be used with the FragmentShaderShadingRateInterlockEXT capability.

-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
FragmentShaderSampleInterlockEXT
-FragmentShaderPixelInterlockEXT
-FragmentShaderShadingRateInterlockEXT
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the FragmentShaderSampleInterlockEXT, -FragmentShaderPixelInterlockEXT, or FragmentShaderShadingRateInterlockEXT -capabilities.

-
-
-
-
OpBeginInvocationInterlockEXT
-OpEndInvocationInterlockEXT
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

FragmentShaderSampleInterlockEXT

5363

OpBeginInvocationInterlockEXT

5364

OpEndInvocationInterlockEXT

5365

PixelInterlockOrderedEXT

5366

PixelInterlockUnorderedEXT

5367

SampleInterlockOrderedEXT

5368

SampleInterlockUnorderedEXT

5369

ShadingRateInterlockOrderedEXT

5370

ShadingRateInterlockUnorderedEXT

5371

FragmentShaderShadingRateInterlockEXT

5372

FragmentShaderPixelInterlockEXT

5378

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 2.16.2, Validation Rules for Shader Capabilities)
-
-
-
-
-
-
(Add new items under "Entry point and execution model")
-
-
-
    -
  • -

    An OpEntryPoint with the Fragment Execution Model can set at most -one of the PixelInterlockOrderedEXT, PixelInterlockUnorderedEXT, -SampleInterlockOrderedEXT, SampleInterlockUnorderedEXT, -ShadingRateInterlockOrderedEXT, or ShadingRateInterlockUnorderedEXT -Execution Modes.

    -
  • -
  • -

    If the entry point has any of the interlock ordering execution modes, -it must dynamically execute each of OpBeginInvocationInterlockEXT and -OpEndInvocationInterlockEXT, in that program order, exactly once.

    -
  • -
-
-
-
-
-
-
-
-
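The two entry-point rules above can be sketched as a small checker. This is a minimal illustration rather than how any real validator works: the helper name `check_interlock_rules` and its trace-based input (a list of opcode names in the order one invocation executes them) are assumptions made for the example. The execution-mode names and token values are taken from the Token Number Assignments table.

```python
# Token values copied from the extension's Token Number Assignments table.
INTERLOCK_EXECUTION_MODES = {
    "PixelInterlockOrderedEXT": 5366,
    "PixelInterlockUnorderedEXT": 5367,
    "SampleInterlockOrderedEXT": 5368,
    "SampleInterlockUnorderedEXT": 5369,
    "ShadingRateInterlockOrderedEXT": 5370,
    "ShadingRateInterlockUnorderedEXT": 5371,
}
BEGIN = "OpBeginInvocationInterlockEXT"
END = "OpEndInvocationInterlockEXT"


def check_interlock_rules(execution_modes, executed_ops):
    """Hypothetical helper: return rule violations for one Fragment entry point.

    execution_modes: names of the Execution Modes set on the entry point.
    executed_ops: opcode names in the order a single invocation executes them.
    """
    errors = []
    used = [m for m in execution_modes if m in INTERLOCK_EXECUTION_MODES]
    if len(used) > 1:
        errors.append("more than one interlock execution mode is set")
    if used:
        # Keep only the interlock instructions, preserving program order.
        interlock_ops = [op for op in executed_ops if op in (BEGIN, END)]
        if interlock_ops != [BEGIN, END]:
            errors.append("Begin and End must each execute exactly once, in that order")
    return errors
```

A real validator has to reason about all control-flow paths statically; the trace used here only stands in for "dynamically execute" in the rule text.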
(Modify Section 3.6, Execution Mode)
-
-
-
-
-

(add new rows to the Execution Mode table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution mode | Enabling Capabilities | Extra Operands

5366

PixelInterlockOrderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderPixelInterlockEXT

5367

PixelInterlockUnorderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderPixelInterlockEXT

5368

SampleInterlockOrderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderSampleInterlockEXT

5369

SampleInterlockUnorderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderSampleInterlockEXT

5370

ShadingRateInterlockOrderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderShadingRateInterlockEXT

5371

ShadingRateInterlockUnorderedEXT
-controls overlap behavior of fragment shader interlock. -See the Fragment Shader Interlock -section of the Vulkan specification for details. Only valid in the Fragment Execution Model.

FragmentShaderShadingRateInterlockEXT

-
-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Capability | Implicitly Declares | Enabled by Extension

5363

FragmentShaderSampleInterlockEXT
-Uses the SampleInterlockOrderedEXT, or SampleInterlockUnorderedEXT, -Execution Modes

Shader

SPV_EXT_fragment_shader_interlock

5378

FragmentShaderPixelInterlockEXT
-Uses the PixelInterlockOrderedEXT, or PixelInterlockUnorderedEXT, -Execution Modes

Shader

SPV_EXT_fragment_shader_interlock

5372

FragmentShaderShadingRateInterlockEXT
-Uses the ShadingRateInterlockOrderedEXT, or ShadingRateInterlockUnorderedEXT -Execution Modes

Shader

SPV_EXT_fragment_shader_interlock, and SPV_NV_shading_rate or SPV_EXT_fragment_invocation_density

-
-
-
-
(Modify Section 3.32.1, Miscellaneous Instructions, adding new rows to the table)
-
-
-
- ---- - - - - - - - - - - -

OpBeginInvocationInterlockEXT
-
-Delimits the start of a critical section of the Fragment shader.
-
-See the Fragment Shader Interlock -section in the Vulkan specification for details.

Capability:
-FragmentShaderSampleInterlockEXT, FragmentShaderPixelInterlockEXT, FragmentShaderShadingRateInterlockEXT

1

5364

- ---- - - - - - - - - - - -

OpEndInvocationInterlockEXT
-
-Delimits the end of a critical section of the Fragment shader.
-
-See the Fragment Shader Interlock -section in the Vulkan specification for details.

Capability:
-FragmentShaderSampleInterlockEXT, FragmentShaderPixelInterlockEXT, FragmentShaderShadingRateInterlockEXT

1

5365

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_fragment_shader_interlock"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Where does the language specifying the synchronization -between the critical sections belong?

    -
    -
    -
    -

    RESOLVED: It’s defined in the Vulkan specification -in the Fragment Shader Interlock -and Memory Model sections.

    -
    -
    -
    -
  2. -
  3. -

    Is there an implicit memory barrier between critical sections?

    -
    -
    -
    -

    RESOLVED: Yes, this is also defined in the Vulkan specification and -is defined in terms of a new memory model Scope for fragment shader -interlock. Doing an implicit memory barrier allows implementations -to use the scope best suited to their implementation, which is not -necessarily covered by one of the existing scopes.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-05-09

Piers Daniell

Initial revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_fragment_shader_interlock.html + + +

extensions/EXT/SPV_EXT_fragment_shader_interlock.html

+ + diff --git a/extensions/EXT/SPV_EXT_image_raw10_raw12.html b/extensions/EXT/SPV_EXT_image_raw10_raw12.html index 997a9ce..36c3450 100644 --- a/extensions/EXT/SPV_EXT_image_raw10_raw12.html +++ b/extensions/EXT/SPV_EXT_image_raw10_raw12.html @@ -1,232 +1,12 @@ - - - - - - - -SPV_EXT_image_raw10_raw12 - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_image_raw10_raw12

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kevin Petit, Arm Ltd.

    -
  • -
  • -

    Sven van Haastregt, Arm Ltd.

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-06-21

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds Image Channel Data Type definitions for RAW10 and RAW12 -image formats.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_image_raw10_raw12"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Image Channel Data Type

-
-

Modify Section 3.13, "Image Channel Data Type", adding these rows to the table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Image Channel Data Type | Enabling Capabilities

19

UnsignedIntRaw10EXT

Kernel

20

UnsignedIntRaw12EXT

Kernel

-
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-06-21

Kevin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_image_raw10_raw12.html + + +

extensions/EXT/SPV_EXT_image_raw10_raw12.html

+ + diff --git a/extensions/EXT/SPV_EXT_mesh_shader.html b/extensions/EXT/SPV_EXT_mesh_shader.html index 1b1386d..b13d1d0 100644 --- a/extensions/EXT/SPV_EXT_mesh_shader.html +++ b/extensions/EXT/SPV_EXT_mesh_shader.html @@ -1,1153 +1,12 @@ - - - - - - - -SPV_EXT_mesh_shader - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_mesh_shader

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Christoph Kubisch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Sahil Parmar, NVIDIA

    -
  • -
  • -

    Patrick Mours, NVIDIA

    -
  • -
  • -

    Slawomir Grajewski, Intel

    -
  • -
  • -

    Timur Kristóf, Valve

    -
  • -
  • -

    Pankaj Mistry, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-09-16

Revision

7

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension interacts with SPV_EXT_viewport_array2.

-
-
-

This extension interacts with SPV_EXT_shader_viewport_index_layer.

-
-
-

This extension interacts with SPV_KHR_fragment_shading_rate.

-
-
-

This extension interacts with SPV_KHR_multiview.

-
-
-

This extension interacts with SPIR-V 1.2 (LocalSizeId).

-
-
-

This extension interacts with SPIR-V 1.3 and -SPV_KHR_shader_draw_parameters (DrawIndex).

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_EXT_mesh_shader extension in SPIR-V. It adds two new programmable shader -types, task and mesh shaders, which are used instead of the standard -programmable vertex processing pipeline. Both new shader types have execution -environments similar to that of compute shaders.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_mesh_shader"
-
-
-
-
-
-

New Execution Models

-
-
-

This extension introduces new execution models:

-
-
-
-
TaskEXT
-MeshEXT
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
MeshShadingEXT
-
-
-
-
-
-

New Execution Modes

-
-
-

Execution Modes added under the MeshShadingEXT capability:

-
-
-
-
OutputLinesEXT
-OutputTrianglesEXT
-OutputPrimitivesEXT
-
-
-
-
-
-

New Storage Classes

-
-
-

Storage Classes added under the MeshShadingEXT capability:

-
-
-
-
TaskPayloadWorkgroupEXT
-
-
-
-
-
-

New Decorations

-
-
-

Decorations added under the MeshShadingEXT capability:

-
-
-
-
PerPrimitiveEXT
-
-
-
-
-
-

New BuiltIns

-
-
-

BuiltIns added under the MeshShadingEXT capability:

-
-
-
-
PrimitivePointIndicesEXT
-PrimitiveLineIndicesEXT
-PrimitiveTriangleIndicesEXT
-CullPrimitiveEXT
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the MeshShadingEXT capability:

-
-
-
-
OpEmitMeshTasksEXT
-OpSetMeshOutputsEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify Section 2.2.5, Control Flow)
-
-

Add OpEmitMeshTasksEXT to the list of Termination Instructions.

-
-
(Modify Section 2.16.1, Universal Validation Rules)
-
-
-
    -
  • -

    OpSetMeshOutputsEXT must be called before any variable from the Output storage class -is written to. Behavior is undefined if any invocation executes this instruction -more than once or under non-uniform control flow. The arguments for the instruction -are taken from the first invocation in each workgroup.

    -
  • -
  • -

    OpEmitMeshTasksEXT must be the last instruction in a block. Only instructions -executed before OpEmitMeshTasksEXT have observable side effects. Behavior is undefined -if any invocation terminates without executing this instruction, or if any invocation -executes this instruction in non-uniform control flow. The arguments for the instruction -are taken from the first invocation in each workgroup.

    -
  • -
  • -

    Update Atomic access rule

    -
    -
      -
    • -

      Add Storage Class TaskPayloadWorkgroupEXT to the list of storage classes that -pointers taken by atomic operation instructions can point to.

      -
    • -
    -
    -
  • -
-
-
-
(Modify Section 2.16.2, Validation Rules for Shader Capabilities)
-
-
-
-
-
-
(Add new items under "Entry point and execution model")
-
-
-
    -
  • -

    Each OpEntryPoint with the MeshEXT Execution Model must have an -OpExecutionMode with exactly one of OutputPoints, OutputLinesEXT, or -OutputTrianglesEXT Execution Modes.

    -
  • -
  • -

    Each OpEntryPoint with the MeshEXT Execution Model must specify both the -OutputPrimitivesEXT and OutputVertices Execution Modes.

    -
  • -
  • -

    Each OpEntryPoint with the MeshEXT or TaskEXT Execution Models can have -at most one global OpVariable of storage class TaskPayloadWorkgroupEXT.

    -
  • -
  • -

    OpSetMeshOutputsEXT is only valid in the MeshEXT Execution Model.

    -
  • -
  • -

    OpEmitMeshTasksEXT is only valid in the TaskEXT Execution Model.

    -
  • -
-
-
-
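The MeshEXT entry-point rules listed above can be illustrated with a short check. This is a sketch under stated assumptions: the function `check_mesh_entry_point` and its inputs (a list of execution-mode names plus a count of TaskPayloadWorkgroupEXT variables in the entry point's interface) are hypothetical and do not correspond to any real validator API.

```python
# Execution Mode names come from the rules above; the helper is hypothetical.
PRIMITIVE_MODES = {"OutputPoints", "OutputLinesEXT", "OutputTrianglesEXT"}
REQUIRED_MODES = {"OutputPrimitivesEXT", "OutputVertices"}


def check_mesh_entry_point(execution_modes, payload_variable_count=0):
    """Return a list of rule violations for one MeshEXT entry point."""
    modes = set(execution_modes)
    errors = []
    # Exactly one output primitive type must be selected.
    if len(modes & PRIMITIVE_MODES) != 1:
        errors.append("exactly one output primitive Execution Mode is required")
    # Both the primitive and vertex maxima must be declared.
    missing = REQUIRED_MODES - modes
    if missing:
        errors.append("missing Execution Modes: " + ", ".join(sorted(missing)))
    # At most one TaskPayloadWorkgroupEXT variable per entry point.
    if payload_variable_count > 1:
        errors.append("more than one TaskPayloadWorkgroupEXT variable")
    return errors
```

For instance, an entry point declaring `OutputTrianglesEXT`, `OutputPrimitivesEXT` and `OutputVertices` with a single payload variable passes all three checks.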
(Add new items under "Decorations")
-
-
-
    -
  • -

    The PerPrimitiveEXT decoration must be applied only to variables in the -Output Storage Class in the MeshEXT Execution Model or variables in the -Input Storage Class in the Fragment Execution Model.

    -
  • -
-
-
-
-
-
-
-
-
(Modify Section 3.3, Execution Model, adding rows to the Execution Model table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Execution Model | Enabling Capabilities

5364

TaskEXT
-Task shading stage.

MeshShadingEXT

5365

MeshEXT
-Mesh shading stage.

MeshShadingEXT

-
-
-
-
(Modify Section 3.6, Execution Mode, adding rows to the Execution Mode table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution Mode | Enabling Capabilities | Extra Operands

5269

OutputLinesEXT
-Stage output primitive is lines. -Only valid with the MeshEXT Execution Model.

MeshShadingEXT

5298

OutputTrianglesEXT
-Stage output primitive is triangles. -Only valid with the MeshEXT Execution Model.

MeshShadingEXT

5270

OutputPrimitivesEXT
-For the mesh stage, the maximum number of primitives the shader will ever emit -for the invocation group. -Only valid with the MeshEXT Execution Model.

MeshShadingEXT

Literal Number
-Primitive count

-
-
-
-
(Modify the definition of the following Execution Modes, allowing them to be used in TaskEXT or MeshEXT Execution Models)
-
-
-
- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution Mode | Enabling Capabilities | Extra Operands

17

LocalSize
-Indicates the workgroup size in the x, y, and z dimensions. -Only valid with the GLCompute, TaskEXT, MeshEXT or Kernel Execution -Models.

Literal Number
-x size

Literal Number
-y size

Literal Number
-z size

26

OutputVertices
-Only valid with the Geometry, TessellationControl, -TessellationEvaluation, or MeshEXT Execution Models.

Literal Number
-Vertex count

For a geometry stage, the maximum number of vertices the shader will -ever emit in a single invocation.

Geometry

For a tessellation-control stage, the number of vertices in the output -patch produced by the tessellation control shader, which also specifies -the number of times the tessellation control shader is invoked.

Tessellation

For a mesh stage, the maximum number of vertices the shader will ever emit -for the invocation group.

MeshShadingEXT

27

OutputPoints
-Stage output primitive is points. -Only valid with the Geometry and MeshEXT Execution Models.

Geometry, MeshShadingEXT

38

LocalSizeId
-Same as LocalSize, but using <id> operands instead of literals. -Only valid with the GLCompute, TaskEXT, MeshEXT or Kernel Execution -Models.

Missing before version 1.2.

<id>
-x size

<id>
-y size

<id>
-z size

-
-
-
-
(Modify Section 3.7, Storage Class, adding a new row to the Storage Class table)
-
-
-
- ----- - - - - - - - - - - - - - -
Storage Class | Enabling Capabilities

5402

TaskPayloadWorkgroupEXT
-Used for storing payload data associated with a task shader invocation group. -Shared across all invocations within a workgroup. Visible across all functions. -Only valid with the TaskEXT and MeshEXT Execution Models. -Variables declared with this storage class must not have initializers, can be -both read and written to in TaskEXT Execution Model, but are read-only in -MeshEXT Execution Model.

MeshShadingEXT

-
-
-
-
(Modify Section 3.20, Decoration, adding a new row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
Decoration | Enabling Capabilities | Extra Operands

5271

PerPrimitiveEXT
-Must only be used on a memory object declaration or a member of a structure -type. Indicates that the variable has separate instances for each primitive -in the output.

-

Only valid for variables of Input Storage Class in Fragment Execution Model and -Output Storage Class in MeshEXT Execution Model.

MeshShadingEXT

-
-
-
-
(Modify Section 3.21, BuiltIn, adding rows to the BuiltIn table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

5299

CullPrimitiveEXT
-Primitive cull state in the MeshEXT Execution Model. -See the Vulkan API specification for more detail.

MeshShadingEXT

5294

PrimitivePointIndicesEXT
-Output array of vertex index values in the MeshEXT Execution Model. -See the Vulkan API specification for more detail.

MeshShadingEXT

5295

PrimitiveLineIndicesEXT
-Output array of vertex index values in the MeshEXT Execution Model. -See the Vulkan API specification for more detail.

MeshShadingEXT

5296

PrimitiveTriangleIndicesEXT
-Output array of vertex index values in the MeshEXT Execution Model. -See the Vulkan API specification for more detail.

MeshShadingEXT

-
-
-
-
(Modify the definition of the following BuiltIns, allowing them to be used in TaskEXT or MeshEXT Execution Models)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

0

Position
-Output vertex position from a vertex processing or -MeshEXT Execution Model. -See the client API specification for more detail.

Shader

1

PointSize
-Output point size from a vertex processing or -MeshEXT Execution Model. -See the client API specification for more detail.

Shader

3

ClipDistance
-Array of clip distances output from a vertex processing or -MeshEXT Execution Model. -See the client API specification for more detail.

ClipDistance

4

CullDistance
-Array of cull distances output from a vertex processing or -MeshEXT Execution Model. -See the client API specifications for more detail.

CullDistance

7

PrimitiveId
-See the client API specifications for more detail.

Primitive ID in a Geometry Execution Model

Geometry

Primitive ID in a Tessellation Execution Model

Tessellation

Primitive ID output in a MeshEXT Execution Model

MeshShadingEXT

9

Layer
-Layer selection for multi-layer framebuffer. -See the client API specification for more detail.

Layer output by a Geometry Execution Model, input to a Fragment -Execution Model.

Geometry

Layer output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

Layer output by a MeshEXT Execution Model.

ShaderViewportIndexLayerEXT, MeshShadingEXT

10

ViewportIndex
-Viewport selection for viewport transformation when using multiple viewports. -See the client API specification for more detail.

Viewport index output by a Geometry Execution Model, input to a Fragment -Execution Model.

MultiViewport

Viewport index output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

Viewport index output by a MeshEXT Execution Model

ShaderViewportIndexLayerEXT, MeshShadingEXT

24

NumWorkgroups
-Number of workgroups in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models. -See the client API specifications for more detail.

25

WorkgroupSize
-Workgroup size in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models. -See the client API specifications for more detail.

26

WorkgroupId
-Workgroup ID in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models. -See the client API specifications for more detail.

27

LocalInvocationId
-Local invocation ID in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models. -See the client API specifications for more detail.

28

GlobalInvocationId
-Global invocation ID in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models. -See the client API specifications for more detail.

29

LocalInvocationIndex
-Local invocation index in GLCompute, TaskEXT or MeshEXT Execution Models. -See Vulkan or OpenGL API specifications for more detail.
-
-Workgroup Linear ID in a Kernel Execution Model. -See OpenCL API specification for more detail.

38

NumSubgroups
-Number of subgroups in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models.
-See the client API specification for more detail.

Kernel, GroupNonUniform

40

SubgroupID
-Subgroup ID in GLCompute, TaskEXT, MeshEXT or Kernel -Execution Models.
-See the client API specification for more detail.

Kernel, GroupNonUniform

4426

DrawIndex
-Contains the index of the draw currently being processed. -Only valid in vertex processing, MeshEXT or Fragment -Execution Models. -See the Vulkan 1.1 or OpenGL 4.6 specifications for more details.

DrawParameters
-
-Missing before version 1.3.

4432

PrimitiveShadingRateKHR
-Output primitive fragment shading rate. -Only valid in the Vertex, Geometry or MeshEXT Execution Models. -See the client API specification for more detail.

FragmentShadingRateKHR

4440

ViewIndex
-Input view index of the view currently being rendered to. -Only valid in the vertex processing, MeshEXT or -Fragment Execution Models. -See the client API specification for more detail.

ViewIndex

-
-
-
-
(Modify the definition of the following Memory Semantics, changing WorkgroupMemory to include the new TaskPayloadWorkgroupEXT Storage Class)
-
-
-
- ----- - - - - - - - - - - - - - -
Memory Semantics | Enabling Capabilities

0x100

WorkgroupMemory
-Apply the memory-ordering constraints to Workgroup or -TaskPayloadWorkgroupEXT Storage Class memory.

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5283

MeshShadingEXT
-Uses the TaskEXT or MeshEXT Execution Models.

Shader

-
-
-
-
(Modify Section 3.37.1, Miscellaneous Instructions, adding rows to the Miscellaneous Instructions table)
-
-
-
- -------- - - - - - - - - - - - - - - -

OpEmitMeshTasksEXT
-
-Defines the grid size of subsequent mesh shader workgroups to generate -upon completion of the task shader workgroup.
-
-Group Count X Y Z must each be a 32-bit unsigned integer value. -They configure the number of local workgroups in each respective dimension -for the launch of child mesh tasks. See the Vulkan API specification for more detail.
-
-Payload is an optional pointer to the payload structure to pass to the generated mesh shader invocations. -Payload must be the result of an OpVariable with a storage class of TaskPayloadWorkgroupEXT.
-
-The arguments are taken from the first invocation in each workgroup. -Behaviour is undefined if any invocation terminates without executing this instruction, -or if any invocation executes this instruction in non-uniform control flow.

-

This instruction also serves as an OpControlBarrier instruction, and also -performs and adheres to the description and semantics of an OpControlBarrier -instruction with the Execution and Memory operands set to Workgroup and -the Semantics operand set to a combination of WorkgroupMemory and -AcquireRelease.

-

Ceases all further processing: Only instructions executed before -OpEmitMeshTasksEXT have observable side effects.
-
-This instruction must be the last instruction in a block.
-
-This instruction is only valid in the TaskEXT Execution Model.

Capability:
-MeshShadingEXT

4 + variable

5294

<id>
-Group Count X

<id>
-Group Count Y

<id>
-Group Count Z

Optional
-<id>
-Payload

- ------ - - - - - - - - - - - - -

OpSetMeshOutputsEXT
-
-Sets the actual output size of the primitives and vertices that the mesh shader -workgroup will emit upon completion.
-
-Vertex Count must be a 32-bit unsigned integer value. -It defines the array size of per-vertex outputs.
-
-Primitive Count must be a 32-bit unsigned integer value. -It defines the array size of per-primitive outputs.
-
-The arguments are taken from the first invocation in each workgroup. -Behavior is undefined if any invocation executes this instruction more than once or under -non-uniform control flow. -Behavior is undefined if there is any control flow path to an output write that is not preceded -by this instruction.
-
-This instruction is only valid in the MeshEXT Execution Model.

Capability:
-MeshShadingEXT

3

5295

<id>
-Vertex Count

<id>
-Primitive Count

-
-
-
-
-
-
-
-
-
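The word counts and opcodes in the two instruction tables above determine each instruction's first word. As a minimal sketch, assuming the standard SPIR-V binary layout (word count in the high 16 bits of the first word, opcode in the low 16 bits), the encoding could look like this. The helper name `encode_instruction` and the `<id>` values in the tests are made up for illustration.

```python
# Opcodes from the instruction tables above.
OP_EMIT_MESH_TASKS_EXT = 5294
OP_SET_MESH_OUTPUTS_EXT = 5295


def encode_instruction(opcode, operand_ids):
    """Hypothetical helper: build the word stream for an instruction that has
    no result type and no result <id>, as is the case for both opcodes here."""
    word_count = 1 + len(operand_ids)  # the first word counts itself
    first_word = (word_count << 16) | opcode
    return [first_word, *operand_ids]


def decode_first_word(word):
    """Split a first word back into (word_count, opcode)."""
    return word >> 16, word & 0xFFFF
```

With three group-count operands OpEmitMeshTasksEXT occupies 4 words, and 5 when the optional Payload `<id>` is present, which matches the "4 + variable" word count in the table. OpSetMeshOutputsEXT always occupies 3 words.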

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_mesh_shader"
-
-
-
-
-
-

Issues

-
-
-

1) Can there be more than one OpVariable with storage class TaskPayloadWorkgroupEXT?

-
-
-

Answer: OpEmitMeshTasksEXT has an optional operand "payload". There can be at most -one <id> of type OpVariable with storage class TaskPayloadWorkgroupEXT associated with -an OpEntryPoint. This OpVariable should be a global OpVariable.

-
-
-

Hence, for a SPIR-V module with a single OpEntryPoint there can be at most one such OpVariable. -For multiple entry points, refer to the answer to issue 2.

-
-
-

2) For a SPIR-V module with multiple entry points, how are payloads represented?

-
-
-

Answer: In a SPIR-V module with multiple entry points, each OpEntryPoint should be associated with at most -one global OpVariable of storage class TaskPayloadWorkgroupEXT. Thus more than one -such OpVariable can be present in the module, but only one OpVariable of storage class TaskPayloadWorkgroupEXT -is allowed as part of the interface of an OpEntryPoint. -To support this requirement in OpEntryPoint, the SPIR-V version has to be 1.4 or above.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2021-03-25

Christoph Kubisch

Initial revision

2

2021-08-30

Patrick Mours

Add modifications to NumWorkGroups, NumSubgroups and SubgroupID

3

2021-11-26

Patrick Mours

Add TaskPayloadWorkgroupEXT storage class and payload argument to OpEmitMeshTasksEXT

4

2022-04-11

Pankaj Mistry

Require SPIR-V 1.4 and add validation rules for TaskPayloadWorkgroupEXT

5

2022-08-31

Pankaj Mistry

Added validation rules for OpSetMeshOutputsEXT and OpEmitMeshTasksEXT

6

2022-09-06

Pankaj Mistry

Added OpEmitMeshTasksEXT as a termination instruction and added atomic access validation rule for TaskPayloadWorkgroupEXT

7

2022-09-16

Ricardo Garcia

Forbid more than one TaskPayloadWorkgroupEXT variable in each TaskEXT entry point

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_mesh_shader.html + + +

extensions/EXT/SPV_EXT_mesh_shader.html

+ + diff --git a/extensions/EXT/SPV_EXT_opacity_micromap.html b/extensions/EXT/SPV_EXT_opacity_micromap.html index 1f3442f..e4035f6 100644 --- a/extensions/EXT/SPV_EXT_opacity_micromap.html +++ b/extensions/EXT/SPV_EXT_opacity_micromap.html @@ -1,271 +1,12 @@ - - - - - - - -SPV_EXT_opacity_micromap - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_opacity_micromap

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Headers repository: -https://github.com/KhronosGroup/SPIRV-Headers

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Joshua Barczak, Intel

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-07-28

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension requires SPV_KHR_ray_query or SPV_KHR_ray_tracing.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_EXT_opacity_micromap extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_opacity_micromap"
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify sub-section 3.RF, Ray Flags, adding to the existing table)
-
-
-
-
-

3.RF, Ray Flags

-
- ----- - - - - - - - - - - - - - -
Ray FlagsEnabling Capabilities

1024

ForceOpacityMicromap2StateEXT
-Force opacity micromaps intersected by this ray to be evaluated in the 2 state mode. -See the Ray Opacity Micromap in the Vulkan API specification.

RayTracingOpacityMicromapEXT

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5381

RayTracingOpacityMicromapEXT
-Uses the ForceOpacityMicromap2StateEXT enumerant.

Shader

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_opacity_micromap"
-
-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-07-28

Eric Werness

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_opacity_micromap.html + + +

extensions/EXT/SPV_EXT_opacity_micromap.html

+ + diff --git a/extensions/EXT/SPV_EXT_optnone.html b/extensions/EXT/SPV_EXT_optnone.html index a4e0cde..cb391d1 100644 --- a/extensions/EXT/SPV_EXT_optnone.html +++ b/extensions/EXT/SPV_EXT_optnone.html @@ -1,284 +1,12 @@ - - - - - - - -SPV_EXT_optnone - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_optnone

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Mariya Podchishchaeva, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Dmitry Sidorov, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-06-21

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 3, Unified

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new possible value for the Function Control mask - OptNoneEXT, -which represents a strong request to not optimize the function. It is useful in cases -where the user wants to debug a particular function without sacrificing performance of -the entire application, and this is accomplished by disabling optimizations solely -for that particular function instead of the entire module.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_optnone"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6, Revision 3, Unified

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6094

OptNoneEXT
-Allows use of the OptNoneEXT Function Control mask value.

-
-
-
-
-

Function Control

-
-

In section 3.24 "Function Control" add the following row to the Function Control -table:

-
- ----- - - - - - - - - - - - - - -
Function ControlEnabling Capabilities

0x10000

OptNoneEXT
-Strong request, to the extent possible, to not optimize the function. Only functions -with Inline Function Control mask can be considered to be inlined into the -function.
-This function should never be inlined.
-It must not be used together with Inline bit.

OptNoneEXT

-
-
-
-
-
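The restriction in the Function Control table (OptNoneEXT must not be combined with the Inline bit) can be sketched as a simple mask check. This is an illustrative sketch only: `check_function_control` is a hypothetical helper, the `0x10000` value comes from the table above, and the remaining bit values are the core SPIR-V Function Control values.

```python
# Core SPIR-V Function Control bits.
INLINE = 0x1
DONT_INLINE = 0x2
PURE = 0x4
CONST = 0x8
# Value from the Function Control table in this extension.
OPT_NONE_EXT = 0x10000


def check_function_control(mask):
    """Return a list of rule violations for a Function Control mask (hypothetical helper)."""
    errors = []
    if (mask & OPT_NONE_EXT) and (mask & INLINE):
        errors.append("OptNoneEXT must not be used together with Inline")
    return errors
```

For example, `check_function_control(OPT_NONE_EXT | DONT_INLINE)` reports no violations, while `check_function_control(OPT_NONE_EXT | INLINE)` reports one.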

Issues

-
-
-

Discussion:

-
-
-

…​

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-12-15

Mariya Podchishchaeva

Initial revision

2

2024-06-21

Dmitry Sidorov

Prepare for publication

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_optnone.html + + +

extensions/EXT/SPV_EXT_optnone.html

+ + diff --git a/extensions/EXT/SPV_EXT_physical_storage_buffer.html b/extensions/EXT/SPV_EXT_physical_storage_buffer.html index dad2ab6..15f705a 100644 --- a/extensions/EXT/SPV_EXT_physical_storage_buffer.html +++ b/extensions/EXT/SPV_EXT_physical_storage_buffer.html @@ -1,713 +1,12 @@ - - - - - - - -SPV_EXT_physical_storage_buffer - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_physical_storage_buffer

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Mariusz Merecki, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-09-18

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 5, Unified.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-
-
-

Overview

-
-
-

This extension adds a new storage class PhysicalStorageBufferEXT which is -similar to StorageBuffer except pointers to the PhysicalStorageBufferEXT -storage class are treated as physical pointer types according to a new -addressing model PhysicalStorageBuffer64EXT. This addressing model is a -hybrid of logical and physical addressing, with only pointers to -PhysicalStorageBufferEXT storage class being physical, and using 64-bit -addresses. It also adds a new capability PhysicalStorageBufferAddressesEXT -and enables a few instructions currently supported for Addresses.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_physical_storage_buffer"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

2.2 Terms

-
-

Add new terms to section 2.2.2 Types:

-
-
-

Physical Pointer Type: A pointer type is a physical -pointer type if the storage class of the type pointed to uses physical -addressing according to the addressing model.

-
-
-

Logical Pointer Type: A pointer type is a logical -pointer type if it is not a physical pointer type.

-
-
-

Modify the following definitions:

-
-
-

Concrete Type: A numerical scalar, vector, matrix type, -or physical pointer type, or any aggregate containing only these types.

-
-
-

Abstract Type: An OpTypeVoid or OpTypeBool, or logical -pointer type, or any aggregate type containing any of these.

-
-
-

Modify the definition of Memory Object Declaration:

-
-
-

Memory Object Declaration: An OpVariable, or -an OpFunctionParameter of pointer type, or the contents of an OpVariable -that holds a pointer to PhysicalStorageBufferEXT storage class -or holds an array of such pointers.

-
-
-

Modify the first part of the definition of Variable pointer from:

-
-
-

Variable pointer: A pointer that results from one of the -following instructions: …​

-
-
-

to:

-
-
-

Variable pointer: A pointer of logical pointer type that -results from one of the following instructions: …​

-
-
-
-

2.16 Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
-

Change:

-
-
-
-
"If the *Logical* <<Addressing_Model, addressing model>> is selected and the
-*VariablePointers* <<Capability,capability>> is not declared:"
-
-
-
-

to:

-
-
-
-
"If the *VariablePointers* <<Capability,capability>> is not declared, the
-following rules apply to <<LogicalPointerType,logical pointer types>>:".
-
-
-
-

Change:

-
-
-
-
*OpVariable* cannot allocate an object whose type is a pointer type (that
-is, it cannot create an object in memory that is itself a pointer and
-whose result would thus be a pointer to a pointer)
-
-
-
-

to:

-
-
-
-
*OpVariable* cannot allocate an object whose type is a
-<<LogicalPointerType,logical pointer type>> (that is, it cannot create an
-object in memory that is itself a logical pointer and whose result would
-thus be a pointer to a logical pointer)
-
-
-
-

Change:

-
-
-
-
"If the *Logical *<<Addressing_Model, addressing model>> is selected and the
-*VariablePointers* or *VariablePointersStorageBuffer* <<Capability,capability>> is
-declared (in addition to what is allowed above by the *Logical* addressing model):"
-
-
-
-

to:

-
-
-
-
"If the *VariablePointers* or *VariablePointersStorageBuffer* <<Capability,capability>>
-is declared, the following are allowed for <<LogicalPointerType,logical pointer types>>:".
-
-
-
-

Change:

-
-
-
-
*OpVariable* can allocate an object whose type is a pointer type, if the
-<<Storage_Class, Storage Class>> of the *OpVariable* is one of the
-following: ...
-
-
-
-

to:

-
-
-
-
*OpVariable* can allocate an object whose type is a
-<<LogicalPointerType,logical pointer type>>, if the
-<<Storage_Class, Storage Class>> of the *OpVariable* is one of the
-following: ...
-
-
-
-

Change:

-
-
-
-
A <<VariablePointer,variable pointer>> with the Logical addressing model cannot ...
-
-
-
-

to:

-
-
-
-
A <<VariablePointer,variable pointer>> cannot ...
-
-
-
-

Add the following rules:

-
-
-

If the addressing model is not PhysicalStorageBuffer64EXT, then the -PhysicalStorageBufferEXT storage class must not be used.

-
-
-

Add PhysicalStorageBufferEXT to the list of storage classes that support -atomic access.

-
-
-

OpVariable must not use a storage class of PhysicalStorageBufferEXT.

-
-
-

If an OpVariable's pointee type is a pointer (or array of pointers) in -PhysicalStorageBufferEXT storage class, then the variable must be decorated -with exactly one of AliasedPointerEXT or RestrictPointerEXT.

-
-
-

If an OpFunctionParameter is a pointer (or array of pointers) in -PhysicalStorageBufferEXT storage class, then the function parameter must be -decorated with exactly one of Aliased or Restrict.

-
-
-

If an OpFunctionParameter is a pointer (or array of pointers) and its -pointee type is a pointer in PhysicalStorageBufferEXT storage class, then -the function parameter must be decorated with exactly one of -AliasedPointerEXT or RestrictPointerEXT.

-
-
-

Any pointer value whose storage class is PhysicalStorageBufferEXT and that -points to a matrix or an array of matrices or a row or element of a matrix must be the result of -an OpAccessChain or OpPtrAccessChain instruction whose base is a structure type (or -recursively must be the result of a sequence of only access chains from a structure to the final -value). Such a pointer must only be used as the Pointer operand to OpLoad or OpStore.

-
-
-

The result of OpConstantNull must not be a pointer into the PhysicalStorageBufferEXT -storage class.

-
-
-

When used with SPIR-V 1.4 or higher, operands to OpPtrEqual, OpPtrNotEqual, and OpPtrDiff -must not be pointers into the PhysicalStorageBufferEXT storage class.

-
-
-
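Two of the added rules lend themselves to a short sketch: the storage class is only usable under the new addressing model, and a variable holding such pointers needs exactly one of the two pointer decorations. The helper names and input shapes below are hypothetical; the numeric enumerants are the ones listed in this extension's tables.

```python
# Enumerant values from this extension's tables.
ADDRESSING_MODEL_PSB64_EXT = 5348   # PhysicalStorageBuffer64EXT
STORAGE_CLASS_PSB_EXT = 5349        # PhysicalStorageBufferEXT
RESTRICT_POINTER_EXT = 5355         # RestrictPointerEXT
ALIASED_POINTER_EXT = 5356          # AliasedPointerEXT


def storage_class_allowed(addressing_model, storage_class):
    """PhysicalStorageBufferEXT requires the PhysicalStorageBuffer64EXT addressing model."""
    if storage_class == STORAGE_CLASS_PSB_EXT:
        return addressing_model == ADDRESSING_MODEL_PSB64_EXT
    return True


def pointer_variable_decorations_ok(decorations):
    """An OpVariable holding PhysicalStorageBufferEXT pointers must carry
    exactly one of AliasedPointerEXT or RestrictPointerEXT."""
    relevant = [d for d in decorations
                if d in (RESTRICT_POINTER_EXT, ALIASED_POINTER_EXT)]
    return len(relevant) == 1
```

A real validator would also need to cover the function-parameter and matrix-pointer rules; those are left out here to keep the sketch small.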

Modify section 2.16.2. Validation Rules for Shader Capabilities:

-
-
-

Add PhysicalStorageBufferEXT to the list of storage classes in which -composite objects must be explicitly laid out.

-
-
-

Add PhysicalStorageBufferEXT to the list of storage classes to which the -result of a FPRoundingMode-decorated conversion instruction can be stored.

-
-
-
-

2.18 Memory Model

-
-

Modify section 2.18.2. Aliasing:

-
-
-

Replace the paragraph about Simple, GLSL, and VulkanKHR memory models:

-
-
-

The Simple, GLSL, and VulkanKHR memory models can assume that aliasing -is generally not present between the memory object declarations. -Specifically, the consumer is free to assume aliasing is not present between -memory object declarations, unless the memory object declarations explicitly -indicate they alias.

-
-
-

Aliasing is indicated by applying the Aliased decoration to a memory object -declaration’s <id>, for OpVariable and OpFunctionParameter <id>s. -Applying Restrict is allowed, but has no effect.

-
-
-

For variables holding PhysicalStorageBufferEXT pointers, applying the -AliasedPointerEXT decoration on the OpVariable <id> indicates that the -PhysicalStorageBufferEXT pointers are potentially aliased. Applying -RestrictPointerEXT is allowed, but has no effect. Variables holding -PhysicalStorageBufferEXT pointers must be decorated as either -AliasedPointerEXT or RestrictPointerEXT.

-
-
-

Only those memory object declarations decorated with Aliased or -AliasedPointerEXT may alias each other.

-
-
-

Modify the Aliasing table in section 2.18.2:

-
-
-

Add a new row for PhysicalStorageBufferEXT that is a copy of -StorageBuffer. Add PhysicalStorageBufferEXT everywhere StorageBuffer is -used in the "Second Storage Classes" column.

-
-
-

Add to the description of the Aliasing table:

-
-
-

For the PhysicalStorageBufferEXT storage class, OpVariable is understood -to mean the PhysicalStorageBufferEXT pointer value(s) stored in the -variable. An Aliased PhysicalStorageBufferEXT pointer stored in a -Function variable can potentially alias with other variables in the same -function, or with global variables or function parameters.

-
-
-
-

3.4 Addressing Model

-
-
- ----- - - - - - - - - - - - - - -
Addressing ModelEnabling Capabilities

5348

PhysicalStorageBuffer64EXT
-Indicates pointers whose storage classes are PhysicalStorageBufferEXT -are physical pointer types with address width equal to 64 bits, and pointers to all other -storage classes are logical.

PhysicalStorageBufferAddressesEXT

-
-
-
-
-

3.7 Storage Class

-
-
- ----- - - - - - - - - - - - - - -
Storage ClassEnabling Capabilities

5349

PhysicalStorageBufferEXT
-Shared externally, readable and writable, visible across all functions in all -invocations in all work groups. Graphics storage buffers using physical -addressing.

PhysicalStorageBufferAddressesEXT

-
-
-
-
-

3.20 Decorations

-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5355

RestrictPointerEXT
-Apply to an OpVariable, to indicate the compiler may compile as if there -is no aliasing of the pointer stored in the variable. See the Aliasing -section for more detail.

PhysicalStorageBufferAddressesEXT

5356

AliasedPointerEXT
-Apply to an OpVariable, to indicate the compiler is to generate accesses to -the pointer stored in the variable that work correctly in the presence of -aliasing. See the Aliasing section for more detail.

PhysicalStorageBufferAddressesEXT

-
-
-
-
-

3.25 Memory Semantics <id>

-
-

Add PhysicalStorageBufferEXT to the list of storage classes synchronized by -UniformMemory.

-
-
-
-

3.26 Memory Access

-
-

Add to the description of Aligned:

-
-
-

Valid values are defined by the execution environment.

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

5347

PhysicalStorageBufferAddressesEXT

Shader

-
-
-
-

Add PhysicalStorageBufferEXT to the list of storage classes for the -StorageBuffer16BitAccess, UniformAndStorageBuffer16BitAccess, -StorageBuffer8BitAccess, and UniformAndStorageBuffer8BitAccess -capabilities.

-
-
-
-

Instructions

-
-

Modify the OpTypeForwardPointer, OpConvertUToPtr, OpConvertPtrToU, and -OpPtrAccessChain instructions to add PhysicalStorageBufferAddressesEXT to -their capability lists.

-
-
-

Modify OpConvertUToPtr to require that the result type must be a physical -pointer type.

-
-
-

Modify OpConvertPtrToU to require that the Pointer operand must have a -physical pointer type.

-
-
-
-
-
-

Issues

-
-
-

1) How can we support comparing pointers to "null"?

-
-
-

Resolution: This can be accomplished by converting the pointer to an integer -with OpConvertPtrToU. But as mentioned in issue (5), doing so requires the -Int64 capability.

-
-
-

2) Should we define a null pointer value in memory?

-
-
-

Discussion: The environment spec can define a particular bit pattern for -NULL, the core SPIR-V spec should not.

-
-
-

Resolution: SPIR-V doesn’t define it, but Vulkan defines it to 0.

-
-
-

3) Can we reuse Aligned to specify a minimum alignment on a load/store?

-
-
-

Resolution: The SPIR-V spec will be changed to say that the meaning of -Aligned is defined by the execution environment, and Vulkan will define -it to be the minimum alignment, at least for physical storage buffer -pointers.

-
-
-

4) Which instructions from Addresses don’t we need?

-
-
-

Discussion: OpSizeOf seems unnecessary without polymorphism in the high -level language. Variable pointers doesn’t enable OpInBoundsPtrAccessChain, -do we need it? OpCopyMemorySized? MaxByteOffset(Id) decorations?

-
-
-

Resolution: Omit all of them listed above, as they are not strictly needed.

-
-
-

5) Does this extension depend on the Int64 capability?

-
-
-

Resolution: This extension can be used without Int64, but OpConvertUToPtr -and OpConvertPtrToU can’t be used in that case.

-
-
-

6) How do Coherent/Volatile work?

-
-
-

Resolution: We rely on the per-instruction availability/visibility and -volatile memory access operands and image operands, many of which were added -by the SPV_KHR_vulkan_memory_model extension. So that extension must be used -to get coherent/volatile access.

-
-
-

7) What changes are needed to the Aliasing section?

-
-
-

Resolution: Pointers to the PhysicalStorageBufferEXT storage class don’t -quite fit the pre-existing definitions because the pointer is not created by -OpVariable, rather it is loaded from memory or generated with -OpConvertUToPtr. So we extend the definition of a memory object declaration -to include a variable that holds a PhysicalStorageBufferEXT pointer, and add -a way to decorate that the object in the variable is aliased/restrict rather -than just the variable itself.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-12-07

Jeff Bolz

Initial revision

2

2019-09-18

David Neto

Interaction with OpConstantNull, and new SPIR-V 1.4 instructions

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_physical_storage_buffer.html + + +

extensions/EXT/SPV_EXT_physical_storage_buffer.html

+ + diff --git a/extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html b/extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html index 046fbc3..30acc1e 100644 --- a/extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html +++ b/extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html @@ -1,211 +1,12 @@ - - - - - - - -SPV_EXT_relaxed_printf_string_address_space - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_relaxed_printf_string_address_space

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-

Pekka Jääskeläinen, Parmance
-Henry Linjamäki, Parmance
-Brice Videau, Argonne National Laboratory
-Anastasia Stulova, Arm

-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-06-13

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 3, Unified

-
-
-

This extension is written against the OpenCL Extended Instruction Set -Specification, Version 1.00, Revision 7

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension relaxes the address space requirement of the printf builtin -to support the cl_ext_relaxed_printf_string_address_space OpenCL C extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_relaxed_printf_string_address_space"
-
-
-
-
-
-

Modifications to the OpenCL Extended Instruction Set Specification, Version 1.00, Revision 4

-
-
-

Modify Section 2.8, "Misc instructions", the printf description, the -sentence "format must be a pointer(constant) to i8." to -"format must be a pointer(constant, global, local, private, generic) to i8."

-
-
-

Issues

-
-

None

-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

3

2022-06-13

Pekka Jääskeläinen, Anastasia Stulova

The format pointer can also point to the generic address space.

2

2022-05-14

Pekka Jääskeläinen, Brice Videau

Changed to an 'ext' extension for multi-vendor adoption promotion.

1

2022-04-08

Pekka Jääskeläinen

Initial RFC version.

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html + + +

extensions/EXT/SPV_EXT_relaxed_printf_string_address_space.html

+ + diff --git a/extensions/EXT/SPV_EXT_replicated_composites.html b/extensions/EXT/SPV_EXT_replicated_composites.html index 23e53e9..e0aaf51 100644 --- a/extensions/EXT/SPV_EXT_replicated_composites.html +++ b/extensions/EXT/SPV_EXT_replicated_composites.html @@ -1,351 +1,12 @@ - - - - - - - -SPV_EXT_replicated_composites - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_replicated_composites

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kevin Petit, Arm Ltd.

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Yuehang Wu, Arm Ltd.

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2024-04-03

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-05-17

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-05-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds instructions to create composite objects whose -constituents all have the same value without requiring the value to be -provided for each constituent.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_replicated_composites"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

6024

ReplicatedCompositesEXT
-Uses OpConstantCompositeReplicateEXT, OpSpecConstantCompositeReplicateEXT, or OpCompositeConstructReplicateEXT

-
-
-
-
-

Instructions

-
-

Modify Section 3.42.7, "Constant-Creation Instructions", adding two new instructions:

-
- ------- - - - - - - - - - - - - - -

OpConstantCompositeReplicateEXT
-
-Declare a new composite constant whose constituents all have the same value.
-
-Result Type must be a homogeneous composite type.
-
-Value is the value to use for all constituents. Value must have the -same type as the constituents of the result. Value must be the -<id> of a non-specialization constant-instruction declaration or an OpUndef. -

Capability:
-ReplicatedCompositesEXT

4

4461

<id> Result Type

Result <id>

<id> Value

- ------- - - - - - - - - - - - - - -

OpSpecConstantCompositeReplicateEXT
-
-Declare a new composite specialization constant whose constituents all have the same value.
-
-Result Type must be a homogeneous composite type. -
-Value is the value to use for all constituents. Value must have the -same type as the constituents of the result. Value must be the -<id> of a specialization constant, constant declaration, or an OpUndef.
-
-This instruction will be specialized to an OpConstantCompositeReplicateEXT -instruction.
-
-See Section 1.9, Specialization. -

Capability:
-ReplicatedCompositesEXT

4

4462

<id> Result Type

Result <id>

<id> Value

-
-

Modify Section 3.42.12, "Composite Instructions", adding one new instruction:

-
- ------- - - - - - - - - - - - - - -

OpCompositeConstructReplicateEXT
-
-Construct a new composite object whose constituents all have the same value.
-
-Result Type must be a homogeneous composite type.
-
-Value is the value to use for all constituents. Value must have the -same type as the constituents of the result. -

Capability:
-ReplicatedCompositesEXT

4

4463

<id> Result Type

Result <id>

<id> Value

-
-
-
-
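As a rough illustration of the three instructions above, the sketch below packs one of them into SPIR-V words and shows what "replicated" means for the resulting constituents. The helper names are hypothetical. The word count (4) and opcodes (4461, 4462, 4463) come from the instruction tables, and the first-word layout (word count in the high 16 bits, opcode in the low 16 bits) is the standard SPIR-V instruction encoding.

```python
OP_CONSTANT_COMPOSITE_REPLICATE_EXT = 4461
OP_SPEC_CONSTANT_COMPOSITE_REPLICATE_EXT = 4462
OP_COMPOSITE_CONSTRUCT_REPLICATE_EXT = 4463


def encode_replicate(opcode, result_type_id, result_id, value_id):
    """Encode a replicate instruction as four 32-bit words."""
    word_count = 4
    first_word = (word_count << 16) | opcode
    return [first_word, result_type_id, result_id, value_id]


def expand_constituents(value_id, constituent_count):
    """Constituent list an equivalent non-replicated composite would need:
    the same <id> repeated once per constituent."""
    return [value_id] * constituent_count
```

For a four-component vector, `expand_constituents(7, 4)` yields `[7, 7, 7, 7]`, which is the operand list the replicated form avoids spelling out.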
-

Issues

-
-
-

None, for now.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-05-29

Kevin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_replicated_composites.html + + +

extensions/EXT/SPV_EXT_replicated_composites.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_atomic_float16_add.html b/extensions/EXT/SPV_EXT_shader_atomic_float16_add.html index b8983c8..eedc9cd 100644 --- a/extensions/EXT/SPV_EXT_shader_atomic_float16_add.html +++ b/extensions/EXT/SPV_EXT_shader_atomic_float16_add.html @@ -1,310 +1,12 @@ - - - - - - - -SPV_EXT_shader_atomic_float16_add - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_atomic_float16_add

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-01-13

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension depends on and extends the SPV_EXT_shader_atomic_float_add extension.

-
-
-
-
-

Overview

-
-
-

This extension extends the SPV_EXT_shader_atomic_float_add extension to support atomically adding to 16-bit floating-point numbers in memory.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float16_add"
-
-
-
-

Because this extension extends the SPV_EXT_shader_atomic_float_add extension, the following OpExtension must also be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_add"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the new capability:

-
-
-
-
AtomicFloat16AddEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.31, "Capability", adding this row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6095

AtomicFloat16AddEXT
-Uses the OpAtomicFAddEXT instruction with 16-bit floating point values.

-
-
-
-

Add the AtomicFloat16AddEXT capability to the OpAtomicFAddEXT instruction added by SPV_EXT_shader_atomic_float_add:

-
- ---------- - - - - - - - - - - - - - - - - -

OpAtomicFAddEXT
-
-(The description of this instruction is unchanged from SPV_EXT_shader_atomic_float_add.)

Capability:
-AtomicFloat32AddEXT AtomicFloat64AddEXT AtomicFloat16AddEXT

7

6035

<id> Result type

Result <id>

<id> Pointer

Scope <id> Memory

Memory Semantics <id> Semantics

<id> Value

-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float16_add"
-
-
-
-

An OpExtension must also be added for the SPV_EXT_shader_atomic_float_add extension that this extension depends on:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_add"
-
-
-
-
    -
  • -

    When using OpAtomicFAddEXT, 16-bit floating-point values are allowed.

    -
  • -
  • -

    If OpAtomicFAddEXT is used with 16-bit floating-point values, the AtomicFloat16AddEXT -capability must be declared.

    -
  • -
-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-01-13

Ben Ashbaugh

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_atomic_float16_add.html + + +

extensions/EXT/SPV_EXT_shader_atomic_float16_add.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_atomic_float_add.html b/extensions/EXT/SPV_EXT_shader_atomic_float_add.html index f92164c..581f1c5 100644 --- a/extensions/EXT/SPV_EXT_shader_atomic_float_add.html +++ b/extensions/EXT/SPV_EXT_shader_atomic_float_add.html @@ -1,350 +1,12 @@ - - - - - - - -SPV_EXT_shader_atomic_float_add - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_atomic_float_add

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Vikram Kushwaha, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Nicolai Hahnle, AMD

    -
  • -
  • -

    Brian Sumner, AMD

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-03-17

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds an atomic add instruction for floating-point numbers.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_add"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
AtomicFloat32AddEXT
-AtomicFloat64AddEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.31, "Capability", adding this row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6033

AtomicFloat32AddEXT
-Uses the OpAtomicFAddEXT instruction with 32-bit floating point values.

6034

AtomicFloat64AddEXT
-Uses the OpAtomicFAddEXT instruction with 64-bit floating point values.

-
-
-
-

(Modify section 3.32.18, Atomic Instructions, adding to the end of the list of instructions)

-
- ---------- - - - - - - - - - - - - - - - - -

OpAtomicFAddEXT
-
-Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:

-

1) load through Pointer to get an Original Value,

-

2) get a New Value by float addition of Original Value and Value, and

-

3) store the New Value back through Pointer.

-

The instruction’s result is the Original Value.

-

Result Type must be a floating-point type scalar.

-

The type of Value must be the same as Result Type. The type of the value pointed to by Pointer must be the same as Result Type.

-

Memory must be a valid memory Scope.

Capability:
-AtomicFloat32AddEXT AtomicFloat64AddEXT

7

6035

<id> Result type

Result <id>

<id> Pointer

Scope <id> Memory

Memory Semantics <id> Semantics

<id> Value

-
-
-
-
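The three steps of OpAtomicFAddEXT can be sketched on the host side as follows. This is only an illustrative model, not how any implementation performs the operation: the `AtomicFloatCell` class and its lock are hypothetical stand-ins for the memory reached through Pointer, and Python floats are always 64-bit.

```python
import threading


class AtomicFloatCell:
    """Hypothetical model of a float location reached through Pointer."""

    def __init__(self, value):
        self._value = value
        # The lock stands in for atomicity with respect to other atomic accesses.
        self._lock = threading.Lock()

    def atomic_fadd(self, value):
        """Model of OpAtomicFAddEXT: returns the Original Value."""
        with self._lock:
            original = self._value        # 1) load through Pointer
            new_value = original + value  # 2) float addition
            self._value = new_value       # 3) store the New Value back
            return original               # the result is the Original Value

    def load(self):
        with self._lock:
            return self._value


cell = AtomicFloatCell(1.5)
previous = cell.atomic_fadd(2.0)
```

Note that the instruction returns the value held before the addition, so `previous` holds the old contents while the cell holds the sum.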

New Instructions

-
-
-

Instructions added under AtomicFloat32AddEXT or AtomicFloat64AddEXT capability:

-
-
-
-
OpAtomicFAddEXT
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_add"
-
-
-
-
    -
  • -

    When using OpAtomicFAddEXT only 32- or 64-bit floating-point values are allowed.

    -
  • -
  • -

    If OpAtomicFAddEXT is used with 32-bit floating-point values, the AtomicFloat32AddEXT -capability must be declared.

    -
  • -
  • -

    If OpAtomicFAddEXT is used with 64-bit floating-point values, the AtomicFloat64AddEXT -capability must be declared.

    -
  • -
-
-
-
-
-
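The validation rules above amount to a width-to-capability lookup. A minimal sketch, assuming a hypothetical `check_atomic_fadd` helper that receives the float width and the set of capability names declared by the module (this is not the API of any real validator):

```python
# Capability names are taken from this extension; the helper itself is hypothetical.
REQUIRED_CAPABILITY = {
    32: "AtomicFloat32AddEXT",
    64: "AtomicFloat64AddEXT",
}


def check_atomic_fadd(float_width, declared_capabilities):
    """Return a list of rule violations for one OpAtomicFAddEXT use."""
    errors = []
    required = REQUIRED_CAPABILITY.get(float_width)
    if required is None:
        # Only 32- or 64-bit floating-point values are allowed.
        errors.append("OpAtomicFAddEXT does not allow %d-bit floats" % float_width)
    elif required not in declared_capabilities:
        errors.append("%s must be declared" % required)
    return errors
```

For example, a module that declares only `AtomicFloat32AddEXT` passes the check for 32-bit values and fails it for 64-bit values.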

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-07-15

Vikram Kushwaha

Internal revisions

2

2021-03-17

Spencer Fricke

Clarify result type is scalar

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_atomic_float_add.html + + +

extensions/EXT/SPV_EXT_shader_atomic_float_add.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html b/extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html index 8e70715..fde6a3c 100644 --- a/extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html +++ b/extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html @@ -1,395 +1,12 @@ - - - - - - - -SPV_EXT_shader_atomic_float_min_max - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_atomic_float_min_max

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-10-05

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds atomic min and max instructions for floating-point numbers.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_min_max"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
AtomicFloat16MinMaxEXT
-AtomicFloat32MinMaxEXT
-AtomicFloat64MinMaxEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.31, "Capability", adding this row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5616

AtomicFloat16MinMaxEXT
-Uses the OpAtomicFMinEXT or OpAtomicFMaxEXT instruction with 16-bit floating point values.

5612

AtomicFloat32MinMaxEXT
-Uses the OpAtomicFMinEXT or OpAtomicFMaxEXT instruction with 32-bit floating point values.

5613

AtomicFloat64MinMaxEXT
-Uses the OpAtomicFMinEXT or OpAtomicFMaxEXT instruction with 64-bit floating point values.

-
-
-
-

(Modify section 3.32.18, Atomic Instructions, adding to the end of the list of instructions)

-
- ---------- - - - - - - - - - - - - - - - - -

OpAtomicFMinEXT
-
-Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:

-

1) load through Pointer to get an Original Value,

-

2) get a New Value which is the float minimum of Original Value and Value, and

-

3) store the New Value back through Pointer.

-

Given x and y are two numbers (neither is NaN) and qNaN is an IEEE quiet NaN, sNaN is an IEEE signaling NaN, and NaN is any NaN value, the minimum operation performed as part of OpAtomicFMinEXT has the following semantics:

-

* min(x, y) = x if x < y and y otherwise,

-

* min(-0, +0) = min(+0, -0) = +0 or -0,

-

* min(x, qNaN) = min(qNaN, x) = x,

-

* min(qNaN, qNaN) = qNaN,

-

* min(x, sNaN) = min(sNaN, x) = NaN or x, and

-

* min(NaN, sNaN) = min(sNaN, NaN) = NaN

-

For cases which consume and produce a NaN value, any NaN value may be generated; it need not be the same as the NaN which was consumed.

-

The instruction’s result is the Original Value.

-

Result Type must be a floating-point type.

-

The type of Value must be the same as Result Type. The type of the value pointed to by Pointer must be the same as Result Type.

-

Memory must be a valid memory Scope.

Capability:
-AtomicFloat16MinMaxEXT AtomicFloat32MinMaxEXT AtomicFloat64MinMaxEXT

7

5614

<id> Result type

Result <id>

<id> Pointer

Scope <id> Memory

Memory Semantics <id> Semantics

<id> Value

- ---------- - - - - - - - - - - - - - - - - -

OpAtomicFMaxEXT
-
-Perform the following steps atomically with respect to any other atomic accesses within Scope to the same location:

-

1) load through Pointer to get an Original Value,

-

2) get a New Value which is the float maximum of Original Value and Value, and

-

3) store the New Value back through Pointer.

-

Given x and y are two numbers (neither is NaN) and qNaN is an IEEE quiet NaN, sNaN is an IEEE signaling NaN, and NaN is any NaN value, the maximum operation performed as part of OpAtomicFMaxEXT has the following semantics:

-

* max(x, y) = x if x > y and y otherwise,

-

* max(-0, +0) = max(+0, -0) = +0 or -0,

-

* max(x, qNaN) = max(qNaN, x) = x,

-

* max(qNaN, qNaN) = qNaN,

-

* max(x, sNaN) = max(sNaN, x) = NaN or x, and

-

* max(NaN, sNaN) = max(sNaN, NaN) = NaN

-

For cases which consume and produce a NaN value, any NaN value may be generated; it need not be the same as the NaN which was consumed.

-

The instruction’s result is the Original Value.

-

Result Type must be a floating-point type.

-

The type of Value must be the same as Result Type. The type of the value pointed to by Pointer must be the same as Result Type.

-

Memory must be a valid memory Scope.

Capability:
-AtomicFloat16MinMaxEXT AtomicFloat32MinMaxEXT AtomicFloat64MinMaxEXT

7

5615

<id> Result type

Result <id>

<id> Pointer

Scope <id> Memory

Memory Semantics <id> Semantics

<id> Value

-
-
-
-
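The NaN handling of the minimum and maximum operations can be sketched as below. This is one permitted choice among several, not the definitive behavior: Python cannot portably distinguish quiet from signaling NaNs, so the sketch treats every NaN alike, returns the numeric operand whenever exactly one operand is NaN (allowed for both the qNaN and sNaN cases), and makes no promise about which zero is returned for mixed-sign zeros. The function names `fmin` and `fmax` are illustrative only.

```python
import math


def fmin(a, b):
    """One permitted model of the minimum used by OpAtomicFMinEXT."""
    if math.isnan(a) and math.isnan(b):
        return math.nan          # NaN in, NaN out (any NaN is acceptable)
    if math.isnan(a):
        return b                 # min(NaN, x) = x
    if math.isnan(b):
        return a                 # min(x, NaN) = x
    return a if a < b else b     # the smaller of two numbers


def fmax(a, b):
    """One permitted model of the maximum used by OpAtomicFMaxEXT."""
    if math.isnan(a) and math.isnan(b):
        return math.nan
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b     # the larger of two numbers
```

An implementation that returns NaN for `min(x, sNaN)` would be equally conformant, so code relying on these instructions should not depend on either outcome.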

New Instructions

-
-
-

Instructions added under AtomicFloat16MinMaxEXT, AtomicFloat32MinMaxEXT, or AtomicFloat64MinMaxEXT capability:

-
-
-
-
OpAtomicFMinEXT
-OpAtomicFMaxEXT
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_shader_atomic_float_min_max"
-
-
-
-
    -
  • -

    When using OpAtomicFMinEXT or OpAtomicFMaxEXT only 16-, 32-, or 64-bit floating-point values are allowed.

    -
  • -
  • -

    If OpAtomicFMinEXT or OpAtomicFMaxEXT is used with 16-bit floating-point values, -the AtomicFloat16MinMaxEXT capability must be declared.

    -
  • -
  • -

    If OpAtomicFMinEXT or OpAtomicFMaxEXT is used with 32-bit floating-point values, -the AtomicFloat32MinMaxEXT capability must be declared.

    -
  • -
  • -

    If OpAtomicFMinEXT or OpAtomicFMaxEXT is used with 64-bit floating-point values, -the AtomicFloat64MinMaxEXT capability must be declared.

    -
  • -
-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-08-14

Faith Ekstrand

Internal revisions

2

2020-10-05

Ben Ashbaugh

Added fp16 capability

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html + + +

extensions/EXT/SPV_EXT_shader_atomic_float_min_max.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_image_int64.html b/extensions/EXT/SPV_EXT_shader_image_int64.html index 99cd0cd..3daf45b 100644 --- a/extensions/EXT/SPV_EXT_shader_image_int64.html +++ b/extensions/EXT/SPV_EXT_shader_image_int64.html @@ -1,268 +1,12 @@ - - - - - - - -SPV_EXT_shader_image_int64 - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_image_int64

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Matthaeus Chajdas, AMD

    -
  • -
  • -

    Graham Wihlidal, Epic Games

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-03-17

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 1, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension enables the use of atomic, load, and store operations on -images with a 64-bit format.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_image_int64"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

3.11 Image Formats

-
-

Modify Section 3.11, "Image Formats", adding these rows to the Image Formats table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Image FormatEnabling Capabilities

40

R64ui

Int64ImageEXT

41

R64i

Int64ImageEXT

-
-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

5016

Int64ImageEXT
-Uses atomic, load, or store instructions on images with 64-bit integer types. -If using atomics, Int64Atomics must be declared.

Int64

-
-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-10-23

Tobias Hector

Initial revision

2

2021-03-17

Spencer Fricke

Clarify Int64Atomics interaction

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_image_int64.html + + +

extensions/EXT/SPV_EXT_shader_image_int64.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_stencil_export.html b/extensions/EXT/SPV_EXT_shader_stencil_export.html index 38b8648..ad174c6 100644 --- a/extensions/EXT/SPV_EXT_shader_stencil_export.html +++ b/extensions/EXT/SPV_EXT_shader_stencil_export.html @@ -1,293 +1,12 @@ - - - - - - - -SPIR-V Extension SPV_EXT_shader_stencil_export - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_stencil_export

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Dominik Witczak, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
  • -

    Aaron Hagan, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Released.

-
-
-
-
-

Version

-
-
-

Modified Date: 08/10/2017

-
-
-

Revision: 3

-
-
-
-
-

Dependencies

-
-
-

This extension is written against Revision 1 of the version 1.10 of the -SPIR-V Specification.

-
-
-

The extension is written against Revision 1 of the OpenGL extension -ARB_shader_stencil_export.

-
-
-
-
-

Overview

-
-
-

This extension is written to provide the functionality of the -ARB_shader_stencil_export, OpenGL Shading Language Specification extension, -for SPIR-V.

-
-
-

This extension adds a new capability, as well as a new built-in. Both, when combined, -let the application output a specific reference stencil value from within a fragment -shader.

-
-
-
-
-

Extension Name

-
-
-

To enable SPV_EXT_shader_stencil_export extension in SPIR-V, use

-
-
-
-
OpExtension "SPV_EXT_shader_stencil_export"
-
-
-
-
-
-

New Execution Mode

-
-
-

This extension introduces a new execution mode:

-
-
-
-
ExecutionModeStencilRefReplacingEXT
-
-
-
-
-
-

New Builtins

-
-
-

This extension adds the following builtins:

-
-
-
-
FragStencilRefEXT = 5014
-
-
-
-

FragStencilRefEXT must only decorate an output variable whose type is -an arbitrary-sized integer type scalar.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - -

CapabilityStencilExportEXT

5013

BuiltInFragStencilRefEXT

5014

ExecutionModeStencilRefReplacingEXT

5027

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.21, the BuiltIn list.

-
-
-

(Add to the list of builtins with a CapabilityStencilExportEXT capability)

-
-
-
-
FragStencilRefEXT = 5014
-
-
-
-

The FragStencilRefEXT builtin can be used to output a reference stencil -value.

-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

3

08/10/2017

Aaron Hagan

Add ExecutionModeStencilRefReplacingEXT execution mode.

2

07/26/2017

Dominik Witczak

Language improvements.

1

07/19/2017

Dominik Witczak

Initial revision based on ARB_shader_stencil_export.

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_stencil_export.html + + +

extensions/EXT/SPV_EXT_shader_stencil_export.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_tile_image.html b/extensions/EXT/SPV_EXT_shader_tile_image.html index 05e0bb0..10d2b29 100644 --- a/extensions/EXT/SPV_EXT_shader_tile_image.html +++ b/extensions/EXT/SPV_EXT_shader_tile_image.html @@ -1,796 +1,12 @@ - - - - - - - -SPV_EXT_shader_tile_image - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_tile_image

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Sandeep Kakarlapudi, Arm

    -
  • -
  • -

    Jan-Harald Fredriksen, Arm

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021-2023 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-23

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds functionality to enable fragment shaders to read color, -depth or stencil data from the framebuffer.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_tile_image"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
TileImageColorReadAccessEXT
-TileImageDepthReadAccessEXT
-TileImageStencilReadAccessEXT
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the TileImageColorReadAccessEXT capability:

-
-
-
-
OpColorAttachmentReadEXT
-
-
-
-

Instructions added under the TileImageDepthReadAccessEXT capability:

-
-
-
-
OpDepthAttachmentReadEXT
-
-
-
-

Instructions added under the TileImageStencilReadAccessEXT capability:

-
-
-
-
OpStencilAttachmentReadEXT
-
-
-
-
-
-

New Execution Modes

-
-
-

Execution modes added under the TileImageColorReadAccessEXT capability:

-
-
-
-
NonCoherentColorAttachmentReadEXT
-
-
-
-

Execution modes added under the TileImageDepthReadAccessEXT capability:

-
-
-
-
NonCoherentDepthAttachmentReadEXT
-
-
-
-

Execution modes added under the TileImageStencilReadAccessEXT capability:

-
-
-
-
NonCoherentStencilAttachmentReadEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Validation Rules

-
-

In section 2.16.1 Universal Validation Rules, add TileImageEXT to the list of -storage classes following this sentence:

-
-
-
-
Any pointer operand to an OpFunctionCall must point into one of the following
-storage classes:
-
-
-
-

In 2.16.2. Validation Rules for Shader Capabilities, add the following -under Decorations:

-
-
-
    -
  • -

    It is invalid to apply the Coherent decoration to variables in the -TileImageEXT storage class if the NonCoherentColorAttachmentReadEXT -execution mode is declared.

    -
  • -
-
-
-
-

Execution Modes

-
-

In section 3.6 "Execution Mode", add the following rows to the Execution Mode -table:

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - -

Execution Mode

Extra Operands

Enabling Capabilities

4169

NonCoherentColorAttachmentReadEXT
-Disables rasterization order access guarantees for color attachment reads. Only -valid in the Fragment execution model.

TileImageColorReadAccessEXT

4170

NonCoherentDepthAttachmentReadEXT
-Disables rasterization order access guarantees for depth attachment reads. Only -valid in the Fragment execution model.

TileImageDepthReadAccessEXT

4171

NonCoherentStencilAttachmentReadEXT
-Disables rasterization order access guarantees for stencil attachment reads. Only -valid in the Fragment execution model.

TileImageStencilReadAccessEXT

-
-
-

Storage Classes

-
-

In section 3.7 "Storage Class", add the following rows to the Storage Class -table:

-
- ----- - - - - - - - - - - - - - -
Storage ClassEnabling Capabilities

4172

TileImageEXT
-Visible across all functions in all fragment invocations at a pixel location -within a render pass. For holding framebuffer color attachment memory. Only -valid with image type variables with Dim TileImageDataEXT. See the Client -API specification for more details on -tile images.

TileImageColorReadAccessEXT

-
-
-

Dims

-
-

In section 3.8 "Dim", add the following row to the Dim table:

-
- ----- - - - - - - - - - - - - - -
DimEnabling Capabilities

4173

TileImageDataEXT

TileImageColorReadAccessEXT

-
-
-

Decorations

-
-

In section 3.20 "Decoration", modify the description of "Location" by replacing the -last sentence:

-
-
-
-
-

Only valid for the Input, Output, and UniformConstant Storage Classes.

-
-
-
-
-

with:

-
-
-
-
-

Only valid for the Input, Output, UniformConstant, TileImageEXT Storage Classes.

-
-
-
-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

4166

TileImageColorReadAccessEXT
-Uses TileImageDataEXT Dim and TileImageEXT storage -class to create tile image variables. Uses -OpColorAttachmentReadEXT to read tile image variables.

4167

TileImageDepthReadAccessEXT
-Uses OpDepthAttachmentReadEXT

4168

TileImageStencilReadAccessEXT
-Uses OpStencilAttachmentReadEXT

-
-
-
-
-

Instructions

-
-

In section 3.37.6 ("Type-Declaration Instructions"), modify the definition of -OpTypeImage to include:

-
-
-
-
-

If Dim is TileImageDataEXT, Sampled Type must not be OpTypeVoid, -Sampled must be 2, Image Format must be Unknown, Depth must be 0, -Arrayed must be 0 and the Execution Model must be Fragment.

-
-
-
-
-

Modify the definition of OpTypeSampledImage to read:

-
-
-
-
-

Image Type must be an OpTypeImage with a sampled parameter of 0 or 1. It is the -type of the image in the combined sampler and image type. Starting with version -1.6, it must not have a Dim of Buffer.

-
-
-
-
-

In section 3.42.8 ("Memory Instructions"), modify the definition of OpImageTexelPointer to read:

-
-
-
-
-

The Dim operand of Type must not be SubpassData or TileImageDataEXT.

-
-
-
-
-

In section 3.42.10 ("Image Instructions"), modify the definition of the instructions as shown:

-
-
-

OpImageRead

-
-
-
-
-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2, and Dim operand is not TileImageDataEXT.

-
-
-
-
-

OpImageWrite

-
-
-
-
-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2, and Dim operand is not TileImageDataEXT.

-
-
-
-
-

OpImageQueryFormat

-
-
-
-
-

Image must be an object whose type is OpTypeImage with a Dim operand which is not TileImageDataEXT.

-
-
-
-
-

OpImageQueryOrder

-
-
-
-
-

Image must be an object whose type is OpTypeImage with a Dim operand which is not TileImageDataEXT.

-
-
-
-
-

OpImageSparseRead

-
-
-
-
-

The Image Dim operand must not be SubpassData or TileImageDataEXT.

-
-
-
-
-

Add the new instructions:

-
- -------- - - - - - - - - - - - - - - -

-OpColorAttachmentReadEXT
-
- Read the current value of a tile image variable at the current fragment - location. The read access is guaranteed to be in rasterization order as defined - by the client API specification unless the NonCoherentColorAttachmentReadEXT - execution mode is set.
-
- Result is the returned value.
-
- Result Type must be a scalar or vector of floating-point type or integer - type. It must be a scalar or vector with component type the same as Sampled - Type of the OpTypeImage.
-
- Attachment must be an object whose type is OpTypeImage with a Dim of - TileImageDataEXT
-
- Sample is the sample number of the sample to read at the current fragment - location. It must be an integer type scalar. If Sample is not specified, - it is as if Sample has the value 0. The sample numbering is identical to that - used for SampleId.
-
- This instruction is only valid in the Fragment Execution Model.
-

Capability:
-TileImageColorReadAccessEXT

4 + variable

4160

<id> Result Type

Result <id>

<id> Attachment

Optional <id> Sample

- ------- - - - - - - - - - - - - - -

-OpDepthAttachmentReadEXT
-
- Read the current depth value at the fragment location. The read access is - guaranteed to be in rasterization order as defined by the client API - specification unless the NonCoherentDepthAttachmentReadEXT execution mode - is set.
-
- Result is the returned depth value.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Sample is the sample number of the sample to read at the current fragment - location. It must be an integer type scalar. If Sample is not specified, - it is as if Sample has the value 0. The sample numbering is identical to that - used for SampleId.
-
- This instruction is only valid in the Fragment Execution Model.
-

Capability:
-TileImageDepthReadAccessEXT

3 + variable

4161

<id> Result Type

Result <id>

Optional <id> Sample

- ------- - - - - - - - - - - - - - -

-OpStencilAttachmentReadEXT
-
- Read the current stencil value at the current fragment location. The read - access is guaranteed to be in rasterization order as defined by the client API - specification unless the NonCoherentStencilAttachmentReadEXT execution - mode is set.
-
- Result is the returned stencil value.
-
- Result Type must be a 32-bit integer type scalar.
-
- Sample is the sample number of the sample to read at the current fragment - location. It must be an integer type scalar. If Sample is not specified, - it is as if Sample has the value 0. The sample numbering is identical to that - used for SampleId.
-
- This instruction is only valid in the Fragment Execution Model.
-

Capability:
-TileImageStencilReadAccessEXT

3 + variable

4162

<id> Result Type

Result <id>

Optional <id> Sample

-
-
-
-
-
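The "3 + variable" word count of the depth and stencil reads follows from the optional Sample operand. A minimal sketch of how such an instruction could be laid out as 32-bit words, relying on the SPIR-V rule that the first word holds the word count in its high 16 bits and the opcode in its low 16 bits. The helper names and the id values in the usage line are hypothetical; the opcode 4161 is the one listed above for OpDepthAttachmentReadEXT.

```python
OP_DEPTH_ATTACHMENT_READ_EXT = 4161  # opcode from the instruction table


def first_word(word_count, opcode):
    """Pack the word count (high 16 bits) and the opcode (low 16 bits)."""
    return (word_count << 16) | opcode


def encode_depth_attachment_read(result_type_id, result_id, sample_id=None):
    """Return the instruction as a list of 32-bit words."""
    operands = [result_type_id, result_id]
    if sample_id is not None:
        operands.append(sample_id)     # the optional Sample operand
    word_count = 1 + len(operands)     # 3 without Sample, 4 with it
    return [first_word(word_count, OP_DEPTH_ATTACHMENT_READ_EXT)] + operands


# Hypothetical ids: 5 is the result type, 9 the result, 12 the sample.
words = encode_depth_attachment_read(5, 9, sample_id=12)
```

A consumer can tell whether Sample is present purely from the word count, which is why no separate flag is needed.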

Issues

-
-
-

Issues 1 to 4 have been copied from VK_EXT_shader_tile_image for easy reference.

-
-
-
    -
  1. -

    Should we reuse OpTypeImage, or introduce a new type for declaring tile images?

    -
    -
    -
    -

    RESOLVED: OpTypeImage is reused with a special Dim for tile images, following -what was done for subpass attachments.

    -
    -
    -

    An alternative would have been to make tile images their own type, and -introduce an OpTypeTileImage type. -That would require less special-casing of OpTypeImage, but comes with higher -initial burden in tooling.

    -
    -
    -
    -
  2. -
  3. -

    Should Color, Depth, and Stencil reads use the same SPIR-V opcode?

    -
    -
    -
    -

    RESOLVED: No. The extension introduces separate opcodes.

    -
    -
    -

    Tile based GPUs which guarantee framebuffer residency in tile memory can offer -efficient raster order access to color, depth, stencil data with relatively low -overhead. Some GPU implementations would have a significant performance penalty -in raster order access if the implementation cannot determine from the SPIR-V -shader whether a specific access is color, depth, or stencil.

    -
    -
    -
    -
  4. -
  5. -

    Should Depth and Stencil read opcodes consume an image operand specifying the -attachment, or should it be implicit?

    -
    -
    -
    -

    RESOLVED: No operand is necessary, as depth and stencil uniquely -identify their attachments, unlike color.

    -
    -
    -

    The other options considered were:

    -
    -
    -
      -
    1. -

      Allow depth and stencil tile images to be declared as variables. -Tile images are defined to map to the color attachment specified via the -Location decoration - some equivalent needs to be defined for depth and -stencil. Pixel Local Storage like functionality of supporting format -reinterpretation is only supported for color attachments, and hence -must be disallowed for depth and stencil. There is very little benefit to -declaring the depth and stencil variables given these restrictions.

      -
    2. -
    3. -

      Depth and stencil tile images are exposed as built-in variables.

      -
    4. -
    -
    -
    -

    Given the design choice made for issue 8, the other options do not add -any value.

    -
    -
    -
    -
  6. -
  7. -

    Should this extension re-use the image Dim SubpassData or introduce a new Dim?

    -
    -
    -
    -

    RESOLVED: The extension introduces a new Dim.

    -
    -
    -

    This extension is intended to serve as foundation for further functionality, -for example Pixel Local Storage like format reinterpretation, or to define -the tile size and allow tile shaders to access any pixel within the tile.

    -
    -
    -

    In SPIR-V, input attachments use images with Dim of SubpassData. -We use a new Dim so we can easily distinguish whether an image is an input -attachment or a tile image.

    -
    -
    -
    -
  8. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-03-23

Sandeep Kakarlapudi

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_tile_image.html + + +

extensions/EXT/SPV_EXT_shader_tile_image.html

+ + diff --git a/extensions/EXT/SPV_EXT_shader_viewport_index_layer.html b/extensions/EXT/SPV_EXT_shader_viewport_index_layer.html index 6b4325b..d4a30df 100644 --- a/extensions/EXT/SPV_EXT_shader_viewport_index_layer.html +++ b/extensions/EXT/SPV_EXT_shader_viewport_index_layer.html @@ -1,387 +1,12 @@ - - - - - - - -SPV_EXT_shader_viewport_index_layer - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_shader_viewport_index_layer

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Slawomir Grajewski, Intel

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-07-25

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_NV_viewport_array2.

-
-
-
-
-

Overview

-
-
-

This extension adds new capabilities to support the OpenGL -GL_ARB_shader_viewport_layer_array and the Vulkan -VK_EXT_shader_viewport_index_layer extensions in SPIR-V.

-
-
-

The new ShaderViewportIndexLayerEXT capability allows the -Layer and ViewportIndex builtin variables to be exported -from Vertex or Tessellation shaders, in addition to Geometry -shaders. This is functionality added to GLSL by both -GL_ARB_shader_viewport_layer_array and GL_NV_viewport_array2, -and separately by GL_AMD_vertex_shader_layer and -GL_AMD_vertex_shader_viewport_index.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_shader_viewport_index_layer"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ShaderViewportIndexLayerEXT
-
-
-
-
-
-

New Decorations

-
-
-

None.

-
-
-
-
-

New Builtins

-
-
-

The existing Layer and ViewportIndex builtins are extended and may -also be used as outputs in the Vertex and Tessellation Execution -Models under the ShaderViewportIndexLayerEXT capability.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

ShaderViewportIndexLayerEXT

5254

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(Modify the definition of Layer and ViewportIndex as follows, allowing -them to be outputs from Vertex and Tessellation shaders)

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

9

Layer
-Layer selection for multi-layer framebuffer. See Vulkan or OpenGL API -specification for more detail.

Layer output by a Geometry Execution Model, -input to a Fragment Execution Model.

Geometry

Layer output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

10

ViewportIndex
-Viewport selection for viewport transformation when using multiple viewports. -See Vulkan or OpenGL API specification for more detail.

Viewport Index output by a Geometry Execution Model, -input to a Fragment Execution Model.

MultiViewport

Viewport Index output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

-
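The modified BuiltIn table can be read as a lookup from builtin and execution model to enabling capability. A minimal sketch, assuming a hypothetical `required_capability` helper and using the generic label "Tessellation" for the tessellation execution models, as the table does:

```python
# Capability that enables each builtin for Geometry output / Fragment input.
BASE_CAPABILITY = {
    "Layer": "Geometry",
    "ViewportIndex": "MultiViewport",
}


def required_capability(builtin, execution_model):
    """Return the enabling capability for a builtin in a given execution model."""
    if builtin not in BASE_CAPABILITY:
        raise ValueError("not covered by this extension: " + builtin)
    if execution_model in ("Geometry", "Fragment"):
        return BASE_CAPABILITY[builtin]
    if execution_model in ("Vertex", "Tessellation"):
        # Output from Vertex or Tessellation needs the new capability.
        return "ShaderViewportIndexLayerEXT"
    raise ValueError("builtin not available in " + execution_model)
```

Under this reading, a vertex shader that writes Layer needs `ShaderViewportIndexLayerEXT`, while a geometry shader doing the same needs only `Geometry`.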
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5254

ShaderViewportIndexLayerEXT

MultiViewport

SPV_EXT_shader_viewport_index_layer

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_EXT_shader_viewport_index_layer"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension relate to the similar functionality in -SPV_NV_viewport_array2?

    -
    -
    -
    -

    RESOLVED: The ShaderViewportIndexLayerEXT capability in this extension -is an alias of ShaderViewportIndexLayerNV from SPV_NV_viewport_array2, and -provides the same functionality.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-07-25

Daniel Koch

Initial draft based on subset of SPV_NV_viewport_array2.

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_shader_viewport_index_layer.html + + +

extensions/EXT/SPV_EXT_shader_viewport_index_layer.html

+ + diff --git a/extensions/EXT/SPV_EXT_ycbcr_attachments.html b/extensions/EXT/SPV_EXT_ycbcr_attachments.html index ea9f760..a0be2de 100644 --- a/extensions/EXT/SPV_EXT_ycbcr_attachments.html +++ b/extensions/EXT/SPV_EXT_ycbcr_attachments.html @@ -1,293 +1,12 @@ - - - - - - - -SPV_EXT_ycbcr_attachments - - - - - -
-
-

Name Strings

-
-
-

SPV_EXT_ycbcr_attachments

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Chris Forbes, Google LLC

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-08-03

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new YCbCrAttachmentEXT decoration which is intended -to be equivalent to the GLSL layout(yuv) qualifier and to be used by layered -implementations to implement it.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_EXT_ycbcr_attachments"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the following new capability:

-
-
-
-
YCbCrAttachmentsEXT
-
-
-
-
-
-

New Decorations

-
-
-

Decoration added under the YCbCrAttachmentsEXT capability:

-
-
-
-
YCbCrAttachmentEXT
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5632

YCbCrAttachmentEXT
-Apply to an OpVariable in the Output storage class. Specifies that the corresponding color attachment will be YCbCr. -Only valid with the Fragment Execution Model. See the API specification for more information.

YCbCrAttachmentsEXT

-
-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ---- - - - - - - - - - - - - - - - -
Capability

Implicitly Declares

5633

YCbCrAttachmentsEXT
-Uses the YCbCrAttachmentEXT decoration on a variable.

-
-
-
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-08-03

Chris Forbes

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/EXT/SPV_EXT_ycbcr_attachments.html + + +

extensions/EXT/SPV_EXT_ycbcr_attachments.html

+ + diff --git a/extensions/GOOGLE/SPV_GOOGLE_decorate_string.html b/extensions/GOOGLE/SPV_GOOGLE_decorate_string.html index 8d84729..f2e0161 100644 --- a/extensions/GOOGLE/SPV_GOOGLE_decorate_string.html +++ b/extensions/GOOGLE/SPV_GOOGLE_decorate_string.html @@ -1,357 +1,12 @@ - - - - - - - -SPV_GOOGLE_decorate_string - - - - - -
-
-

Name Strings

-
-
-

SPV_GOOGLE_decorate_string

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Hai Nguyen, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Lei Zhang, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-02-16

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.0 Revision 12.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides two new instructions to decorate a variable or a struct -member with a string.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_GOOGLE_decorate_string"
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

OpDecorateStringGOOGLE

5632

OpMemberDecorateStringGOOGLE

5633

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.0

-
-
-
-
(Modify Section 2.4, Logical Layout of a Module)
-
-

Add OpDecorateStringGOOGLE and OpMemberDecorateStringGOOGLE to the list -of annotation instructions under layout section 8.a.

-
-
(Modify Section 3.32.3, Annotation instructions)
-
-

Add OpDecorateStringGOOGLE and OpMemberDecorateStringGOOGLE:

-
-
-
-
-
- ------- - - - - - - - - - - - - -

OpDecorateStringGOOGLE
-
-Add a string decoration to another <id>.
-
-Target is the <id> to decorate. It can potentially be any <id> that is a -forward reference, except it must not be the <id> of an OpDecorationGroup.
-
-Decoration is a decoration that takes at least one Literal String operand, -and has only Literal String operands.

4 + variable

5632

<id>
-Target

Decoration

Literal String, Literal String, …​
-See Decoration.

- -------- - - - - - - - - - - - - - -

OpMemberDecorateStringGOOGLE
-
-Add a string decoration to a member of a structure type.
-
-Structure type is the <id> of a type from OpTypeStruct.
-
-Member is the number of the member to decorate in the type. The first member is member 0, the next is member 1, …​
-
-Decoration is a decoration that takes at least one Literal String operand, -and has only Literal String operands.

5 + variable

5633

<id>
-Structure Type

Literal Number
-Member

Decoration

Literal String
-See Decoration.

-
-
-
-
-
-

Validation Rules

-
-
-

To use this extension within a SPIR-V module the following OpExtension instruction -must be present in the module:

-
-
-
-
OpExtension "SPV_GOOGLE_decorate_string"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Do we need OpMemberDecorateStringGOOGLE?

    -
    -
    -
    -

    RESOLVED: - Yes. It may be desirable to decorate members of a builtins interface block with string decorations.

    -
    -
    -
    -
  2. -
  3. -

    Should OpDecorateStringGOOGLE and OpMemberDecorateStringGOOGLE be allowed to participate -in decoration groups?

    -
    -
    -
    -

    RESOLVED: - No. String decorations are intended to only decorate variables or members of a structure type.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-10-05

Hai Nguyen

Initial draft

2

2017-12-01

David Neto

Make it a KHR extension, add validation rules, assign token numbers

3

2018-02-04

Hai Nguyen

Changed to GOOGLE extension, removed HLSL semantics

4

2018-02-04

Hai Nguyen

Added GOOGLE suffix, added token numbers, removed redundant sections

-
-
-
- - \ No newline at end of file + + + + + + extensions/GOOGLE/SPV_GOOGLE_decorate_string.html + + +

extensions/GOOGLE/SPV_GOOGLE_decorate_string.html

+ + diff --git a/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html b/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html index c55dedd..7583ec8 100644 --- a/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html +++ b/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html @@ -1,371 +1,12 @@ - - - - - - - -SPV_GOOGLE_hlsl_functionality1 - - - - - -
-
-

Name Strings

-
-
-

SPV_GOOGLE_hlsl_functionality1

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Hai Nguyen, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Lei Zhang, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-09-12

Revision

5

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 4, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension also subsumes the SPV_GOOGLE_decorate_string extension.

-
-
-
-
-

Overview

-
-
-

This extension provides two new decorations to extend HLSL functionality: -HlslCounterBuffer and HlslSemantic.

-
-
-

The decoration HlslCounterBuffer is used with OpDecorateId to link a counter -buffer to a UAV resource that has an associated counter.

-
-
-

The decoration HlslSemantic is used with OpDecorateStringGOOGLE and -OpMemberDecorateStringGOOGLE to decorate an input or output variable id -with a string representing the semantic as defined in the HLSL source. The -decoration HlslSemantic can also be used with future string decorations that -may get added to SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_GOOGLE_hlsl_functionality1"
-
-
-
-
-
-

New Capabilities

-
-
-

None.

-
-
-
-
-

New Decorations

-
-
-
-
HlslCounterBufferGOOGLE
-HlslSemanticGOOGLE
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

HlslCounterBufferGOOGLE

5634

HlslSemanticGOOGLE

5635

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.0

-
-
-
-
(Modify Section 3.20, Decoration)
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5634

HlslCounterBufferGOOGLE
-An <id> of the counter buffer associated with the target buffer.

<id>
- Counter Buffer ID

5635

HlslSemanticGOOGLE
-A string describing the intended use of a value. -Use this to express user-defined and system semantic values for shaders -originally written in HLSL. The semantic string is case insensitive.

Literal String
- Semantic

-
-
-
(Modify Section 3.32 "Instructions")
-
-

In the OpDecorateId instruction’s Capability box, remove the statement -"Missing before version 1.2", to allow this instruction to be used with -this extension. Also, in the description of the OpDecorateId instruction, -change "All such <id> Extra Operands must be constant instructions." to -"All such <id> Extra Operands must be constant instructions or the -result <id> of OpVariable instructions."

-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_GOOGLE_hlsl_functionality1"
-
-
-
-

HlslCounterBufferGOOGLE can only be applied to a variable in the Uniform storage -class. The parameter should also be a variable in the Uniform storage class.

-
-
-

HlslSemanticGOOGLE can only be applied to a variable or a member of a struct type. -If applied to a variable, it must be in the Input or Output storage class.

-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-02-04

Hai Nguyen

Initial draft

2

2018-02-16

Hai Nguyen

Added GOOGLE suffix, validation rules, and token numbers

3

2018-03-16

Lei Zhang

Clarified version and extension requirements

4

2018-04-13

Lei Zhang

Subsumed the SPV_GOOGLE_decorate_string extension

5

2018-09-12

Lei Zhang

Allowed OpDecorateId to take result <id> of OpVariable

-
-
-
- - \ No newline at end of file + + + + + + extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html + + +

extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.html

+ + diff --git a/extensions/GOOGLE/SPV_GOOGLE_user_type.html b/extensions/GOOGLE/SPV_GOOGLE_user_type.html index 8b5b2d8..1009a8c 100644 --- a/extensions/GOOGLE/SPV_GOOGLE_user_type.html +++ b/extensions/GOOGLE/SPV_GOOGLE_user_type.html @@ -1,519 +1,12 @@ - - - - - - - -SPV_GOOGLE_user_type - - - - - -
-
-

Name Strings

-
-
-

SPV_GOOGLE_user_type

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Hai Nguyen, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Ehsan Nasiri, Google

    -
  • -
  • -

    Natalie Chouinard, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-01-29

Revision

5

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 4, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension also implicitly includes the SPV_GOOGLE_decorate_string -extension.

-
-
-
-
-

Overview

-
-
-

This extension provides one new decoration, UserTypeGOOGLE, that allows HLL -shader compilers to optionally embed unambiguous type information.

-
-
-

The decoration UserTypeGOOGLE is used with OpDecorateString(GOOGLE) and -OpMemberDecorateString(GOOGLE) to store a string name for the type of the -decorated object in the user’s source. It can decorate only a variable, a -type, or a member of a structure type. This has no semantic impact.

-
-
-

The following table describes standard HLSL resource types and their -corresponding string names:

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

cbuffer

cbuffer

tbuffer

tbuffer

AppendStructuredBuffer

appendstructuredbuffer

Buffer

buffer

ByteAddressBuffer

byteaddressbuffer

ConstantBuffer

constantbuffer

ConsumeStructuredBuffer

consumestructuredbuffer

InputPatch

inputpatch

OutputPatch

outputpatch

RasterizerOrderedBuffer

rasterizerorderedbuffer

RasterizerOrderedByteAddressBuffer

rasterizerorderedbyteaddressbuffer

RasterizerOrderedStructuredBuffer

rasterizerorderedstructuredbuffer

RasterizerOrderedTexture1D

rasterizerorderedtexture1d

RasterizerOrderedTexture1DArray

rasterizerorderedtexture1darray

RasterizerOrderedTexture2D

rasterizerorderedtexture2d

RasterizerOrderedTexture2DArray

rasterizerorderedtexture2darray

RasterizerOrderedTexture3D

rasterizerorderedtexture3d

RaytracingAccelerationStructure

raytracingaccelerationstructure

RWBuffer

rwbuffer

RWByteAddressBuffer

rwbyteaddressbuffer

RWStructuredBuffer

rwstructuredbuffer

RWTexture1D

rwtexture1d

RWTexture1DArray

rwtexture1darray

RWTexture2D

rwtexture2d

RWTexture2DArray

rwtexture2darray

RWTexture3D

rwtexture3d

StructuredBuffer

structuredbuffer

SubpassInput

subpassinput

SubpassInputMS

subpassinputms

Texture1D

texture1d

Texture1DArray

texture1darray

Texture2D

texture2d

Texture2DArray

texture2darray

Texture2DMS

texture2dms

Texture2DMSArray

texture2dmsarray

Texture3D

texture3d

TextureBuffer

texturebuffer

TextureCube

texturecube

TextureCubeArray

texturecubearray

-
-

If the HLSL type includes template parameters, they will be appended to the -corresponding string name in the format :<comma-separated-list>. For example, -the SPIR-V variable corresponding to an HLSL variable with type -Texture2DMSArray<float4, 64> would be decorated with -"texture2dmsarray:<float4,64>".
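
The naming rule above can be sketched in a few lines of Python. This is only an illustration: `user_type_string` is a hypothetical helper, not part of any SPIR-V tool, and it assumes that lowercasing the HLSL type name reproduces the string names listed in the table (which holds for every entry shown there).

```python
from typing import Optional, Sequence


def user_type_string(hlsl_type: str,
                     template_args: Optional[Sequence[str]] = None) -> str:
    """Build a UserTypeGOOGLE string name (hypothetical helper).

    Assumption: the string name is the lowercased HLSL type name, as in
    the table of standard resource types.
    """
    name = hlsl_type.lower()
    if not template_args:
        return name
    # Template parameters are appended as ":<comma-separated-list>".
    args = ",".join(arg.strip() for arg in template_args)
    return f"{name}:<{args}>"


# The example from the text: Texture2DMSArray<float4, 64>
print(user_type_string("Texture2DMSArray", ["float4", "64"]))
# A type without template parameters keeps just its lowercased name.
print(user_type_string("ByteAddressBuffer"))
```

The first call yields the string given in the text, `texture2dmsarray:<float4,64>`; the resulting string would then be the Literal String operand of a UserTypeGOOGLE decoration.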

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_GOOGLE_user_type"
-
-
-
-
-
-

New Capabilities

-
-
-

None.

-
-
-
-
-

New Decorations

-
-
-
-
UserTypeGOOGLE
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

UserTypeGOOGLE

5636

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.0

-
-
-
-
(Modify Section 3.20, Decoration)
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5636

UserTypeGOOGLE

Literal String
- User Type Name

-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_GOOGLE_user_type"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Can UserTypeGOOGLE be used with variables or is it restricted to types?

    -
  2. -
  3. -

    You can have multiple UserTypeGOOGLE decorations on the same object or member -of an object.  Those conflicts can come as types are collapsed by the front-end. -This is ok.

    -
  4. -
  5. -

    When OpExtension "SPV_GOOGLE_user_type" is included, all features of -SPV_GOOGLE_decorate_string can be used without explicitly declaring that -SPV_GOOGLE_decorate_string extension.

    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-05-16

Hai Nguyen

Initial draft

2

2019-05-17

Hai Nguyen

Added GOOGLE suffix and token number

3

2019-05-17

David Neto

Clarified definition of UserTypeGOOGLE

4

2019-07-30

Ehsan Nasiri

Added table of user type names

5

2024-01-29

Natalie Chouinard

Added and disambiguated some type names

-
-
-
- - \ No newline at end of file + + + + + + extensions/GOOGLE/SPV_GOOGLE_user_type.html + + +

extensions/GOOGLE/SPV_GOOGLE_user_type.html

+ + diff --git a/extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html b/extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html index 00e5c15..f3f9965 100644 --- a/extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html +++ b/extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html @@ -1,330 +1,12 @@ - - - - - - - -SPV_HUAWEI_cluster_culling_shader - - - - - -
-
-

Name Strings

-
-
-

SPV_HUAWEI_cluster_culling_shader

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Cheng Wei, HUAWEI

    -
  • -
  • -

    Jinyongjie, HUAWEI

    -
  • -
  • -

    Yuchang Wang, HUAWEI

    -
  • -
-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-12-04

Revision

2

-
-

Dependencies -This extension is written against the SPIR-V Specification, Version 1.6

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL GL_HUAWEI_cluster_culling_shader extension, which adds a new programmable shader type, the cluster culling shader, that has an execution environment similar to that of compute shaders. The developer can use this new shader to perform coarse-level geometry culling on the GPU and directly emit visible clusters with a shading rate to the subsequent rendering pipeline.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_HUAWEI_cluster_culling_shader"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
ClusterCullingShadingHUAWEI
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-
-
(Modify section 3.3, Execution model, adding a row to the table)
-
-
- ----- - - - - - - - - - - - -

Execution Model

Enabling Capabilities

6272

ClusterCullingHUAWEI
-Cluster Culling shading stage.

ClusterCullingShadingHUAWEI

-
-
-
(Modify section 3.31, Capability, adding a row to the Capability table)
-
-
- ----- - - - - - - - - - - - -

Capabilities

Implicitly Declares

6273

ClusterCullingShadingHUAWEI

Shader (also see extension SPV_HUAWEI_cluster_culling_shader)

-
-
-
(Modify section 3.21, Built-in, adding 9 new rows to the Built-in table)
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Builtin

Enabling Capabilities

6274

IndexCountHUAWEI This variable specifies the number of vertices to draw in a cluster.

SPV_HUAWEI_cluster_culling_shader

6275

VertexCountHUAWEI This variable specifies the number of vertices to draw in a cluster.

SPV_HUAWEI_cluster_culling_shader

6276

InstanceCountHUAWEI This variable specifies the number of instances to draw in a cluster. This variable should be one.

SPV_HUAWEI_cluster_culling_shader

6277

FirstIndexHUAWEI This variable specifies the base index of a cluster within the index buffer.

SPV_HUAWEI_cluster_culling_shader

6278

FirstVertexHUAWEI This variable specifies the base index of a cluster within the index buffer.

SPV_HUAWEI_cluster_culling_shader

6279

VertexOffsetHUAWEI This variable specifies the value added to the vertex index of a cluster before indexing into the vertex buffer.

SPV_HUAWEI_cluster_culling_shader

6280

FirstInstanceHUAWEI This variable specifies the instance ID of the first instance to draw.

SPV_HUAWEI_cluster_culling_shader

6281

ClusterIdHUAWEI This variable specifies the index of cluster being rendered by this drawing command. - This id can be passed from CCS to the vertex shader, so that the vertex shader can also get information related to the cluster.

SPV_HUAWEI_cluster_culling_shader

6283

ClusterShadingRateHUAWEI This variable specifies the shading rate of a cluster being rendered by this drawing command.

SPV_HUAWEI_cluster_culling_shader

-
-
-
(Modify section 3.42.25, Reserved Instructions, adding a row to the table)
-
-
- ---- - - - - - - - - - - -

OpDispatchClusterHUAWEI

Capabilities: -ClusterCullingShadingHUAWEI

6282

6273

-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-11-16

Yu-Chang Wang

Initial Public Release

2

2023-12-04

Yu-Chang Wang

add cluster shading rate

-
-
-
- - \ No newline at end of file + + + + + + extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html + + +

extensions/HUAWEI/SPV_HUAWEI_cluster_culling_shader.html

+ + diff --git a/extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html b/extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html index e9fea81..76265b7 100644 --- a/extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html +++ b/extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html @@ -1,225 +1,12 @@ - - - - - - - -SPV_HUAWEI_subpass_shading - - - - - -
-
-

Name Strings

-
-
-

SPV_HUAWEI_subpass_shading

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Hueilong Wang, HUAWEI

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-04-20

Revision

1

-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 6, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL GL_HUAWEI_subpass_shading -extension which adds one new type of compute pipeline, subpass shading, which -is allowed to access input attachments like a graphics pipeline in a subpass.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_HUAWEI_subpass_shading"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
SubpassShadingHUAWEI
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6272

SubpassShadingHUAWEI
-Allows the use of InputAttachmentIndex and SubpassData Dim -in GLCompute Execution Model.

InputAttachment

-
-
-
-
-
(Modify InputAttachmentIndex to allow using this in a subpass shading shader, by changing the existing rule)
-
-
-
-
-
Only valid in the Fragment Execution Model and for variables of type OpTypeImage with a Dim operand of SubpassData.
-
-
-
-

to instead say

-
-
-
-
Only valid when the Execution Model is Fragment or GLCompute with SubpassShadingHUAWEI Capability, and for variables of type OpTypeImage with a Dim operand of SubpassData.
-
-
-
-
-
(Modify OpTypeImage to allow using SubpassData in a subpass shading shader, by changing the existing rule)
-
-
-
-
-
If Dim is SubpassData, Sampled must be 2, Image Format must be Unknown, and the Execution Model must be Fragment.
-
-
-
-

to instead say

-
-
-
-
If Dim is SubpassData, Sampled must be 2, Image Format must be Unknown, and the Execution Model must be Fragment or GLCompute with SubpassShadingHUAWEI Capability.
-
-
-
-
-
- - \ No newline at end of file + + + + + + extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html + + +

extensions/HUAWEI/SPV_HUAWEI_subpass_shading.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html b/extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html index 18c7566..d9e7ead 100644 --- a/extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html +++ b/extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html @@ -1,987 +1,12 @@ - - - - - - - -SPV_INTEL_arbitrary_precision_fixed_point - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_arbitrary_precision_fixed_point

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ajaykumar Kannan, Intel

    -
  • -
  • -

    Shuo Niu, Intel

    -
  • -
  • -

    Daniel Zhang, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-21

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification Version 1.6, Revision 2.

-
-
-

While this extension does not require SPV_INTEL_arbitrary_precision_integers, the new operators it adds are significantly more useful when that extension is supported as the combination of the two extensions allows for more freedom in the width of arbitrary precision data types that can be represented (see the definition of the W parameter below).

-
-
-

This extension is built on Mentor Graphics ac_datatypes spec v3.9.2.

-
-
-
-
-

Overview

-
-
-

This extension introduces operations for arbitrary precision fixed point numbers called ac_fixed. -The ac_fixed datatype is an industry standard for fixed point numbers and is published by Mentor Graphics at hlslibs.org. -This datatype and its corresponding operations can be useful on targets that can take advantage of narrower representation such as FPGAs.

-
-
-

Data Representation

-
-

The ac_fixed datatype will be represented in SPIR-V as a pseudo type using OpTypeInt. -It is characterized by three parameters: W, I, and S.

-
-
-
    -
  • -

    W is the total width of the datatype (including a sign bit, if required) and is encoded in the width of the OpTypeInt.

    -
  • -
  • -

    I determines the position of the decimal point.

    -
    -
      -
    • -

      When I >= 0, I bits starting from the MSB of W are before the decimal point. If I is greater than W, then additional 0 bits are implied after the bits of W and before the decimal point.

      -
    • -
    • -

      When I < 0, -I 0 bits are implied after the decimal point and before the MSB of W.

      -
    • -
    -
    -
  • -
  • -

    S determines if this is a signed or an unsigned number. Note that the support for signedness in OpTypeInt is not leveraged here. If the ac_fixed is signed, then the MSB (most significant bit) will contain the sign bit.

    -
  • -
-
-
-

The datatype itself does not contain any information regarding I and S. -Each operation will contain information about the input and result datatypes (including W, S, and I) where W is implicit from the size of the OpTypeInt.

-
-
-

A variable of ac_fixed type can contain both an integer component and a fractional component depending on the value of I. -Based on its value, the number of bits allocated for the integer and fractional portions will change. -Note that it is also possible that one of the two portions may have no bits.
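
To make the roles of W, I, and S concrete, here is a minimal Python sketch that interprets a raw W-bit pattern as an ac_fixed value. It is not taken from the extension: `ac_fixed_value` is a hypothetical helper, and it assumes that signed values use two's complement (the text only states that the MSB holds the sign bit). Under the rules above, the binary point sits I bits below the MSB, so the value is the integer reading of the bits scaled by 2^(I - W).

```python
from fractions import Fraction


def ac_fixed_value(raw: int, W: int, I: int, signed: bool) -> Fraction:
    """Interpret the W-bit pattern `raw` as an ac_fixed number (sketch).

    Assumption: signed values are stored in two's complement.
    """
    if not 0 <= raw < (1 << W):
        raise ValueError("raw must fit in W bits")
    if signed and raw >= (1 << (W - 1)):
        raw -= 1 << W  # two's-complement reinterpretation
    # I bits from the MSB lie before the point, so W - I bits lie after it.
    # I > W or I < 0 simply shifts the scale further in either direction.
    return Fraction(raw) * Fraction(2) ** (I - W)


print(ac_fixed_value(0b01011000, W=8, I=4, signed=False))  # bits 0101.1000
print(ac_fixed_value(0b11110000, W=8, I=4, signed=True))   # negative value
print(ac_fixed_value(0b00000001, W=8, I=10, signed=False))  # I > W
print(ac_fixed_value(0b10000000, W=8, I=-2, signed=False))  # I < 0
```

Using `Fraction` keeps the results exact, which makes it easy to check the cases where the integer or fractional portion has no bits at all.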

-
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_arbitrary_precision_fixed_point"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ArbitraryPrecisionFixedPointINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the ArbitraryPrecisionFixedPointINTEL capability:

-
-
-
-
OpFixedSqrtINTEL
-OpFixedRecipINTEL
-OpFixedRsqrtINTEL
-OpFixedSinINTEL
-OpFixedCosINTEL
-OpFixedSinCosINTEL
-OpFixedSinPiINTEL
-OpFixedCosPiINTEL
-OpFixedSinCosPiINTEL
-OpFixedLogINTEL
-OpFixedExpINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

ArbitraryPrecisionFixedPointINTEL

5922

OpFixedSqrtINTEL

5923

OpFixedRecipINTEL

5924

OpFixedRsqrtINTEL

5925

OpFixedSinINTEL

5926

OpFixedCosINTEL

5927

OpFixedSinCosINTEL

5928

OpFixedSinPiINTEL

5929

OpFixedCosPiINTEL

5930

OpFixedSinCosPiINTEL

5931

OpFixedLogINTEL

5932

OpFixedExpINTEL

5933

-
-
-
-

Modifications to the SPIR-V Specification Version 1.6

-
-
-

After Section 3.16, add a new section "3.16b Quantization Modes" as follows:

-
-
-

Quantization Modes

- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ValueModeBehaviorEnabling Capabilities

0

TRN_INTEL

Truncate towards -Inf

ArbitraryPrecisionFixedPointINTEL

1

TRN_ZERO_INTEL

Truncate towards 0

ArbitraryPrecisionFixedPointINTEL

2

RND_INTEL

Round towards +Inf

ArbitraryPrecisionFixedPointINTEL

3

RND_ZERO_INTEL

Round towards 0

ArbitraryPrecisionFixedPointINTEL

4

RND_INF_INTEL

Round positive values toward +Inf and negative values toward -Inf

ArbitraryPrecisionFixedPointINTEL

5

RND_MIN_INF_INTEL

Round towards -Inf

ArbitraryPrecisionFixedPointINTEL

6

RND_CONV_INTEL

Round towards even

ArbitraryPrecisionFixedPointINTEL

7

RND_CONV_ODD_INTEL

Round towards odd

ArbitraryPrecisionFixedPointINTEL

-
-

After Section 3.16, add a new section "3.16c Overflow Modes" as follows:

-
-
-
-

Overflow Modes

- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ValueModeBehaviorEnabling Capabilities

0

WRAP_INTEL

Drop the bits to the left of the MSB

ArbitraryPrecisionFixedPointINTEL

1

SAT_INTEL

Saturate to the closest of MIN or MAX

ArbitraryPrecisionFixedPointINTEL

2

SAT_ZERO_INTEL

Set to 0 on overflow

ArbitraryPrecisionFixedPointINTEL

3

SAT_SYM_INTEL

For unsigned, treat as SAT_INTEL.

-

For signed: a positive overflow will saturate at the maximum positive value, whereas a negative overflow will saturate to the negation of the maximum positive value, as opposed to the most negative value.

ArbitraryPrecisionFixedPointINTEL

-
-
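
The overflow modes in the table above can be illustrated with a short Python sketch operating on the raw integer behind a W-bit fixed-point value. This is an assumption-laden illustration, not the extension's definition: `apply_overflow` is a hypothetical helper, signed values are assumed to be two's complement, and values already in range are assumed to pass through unchanged in every mode.

```python
def apply_overflow(value: int, W: int, signed: bool, mode: str) -> int:
    """Fit an out-of-range raw integer into W bits (illustrative sketch)."""
    lo = -(1 << (W - 1)) if signed else 0
    hi = (1 << (W - 1)) - 1 if signed else (1 << W) - 1
    if lo <= value <= hi:
        return value  # assumption: in-range values are left untouched
    if mode == "WRAP_INTEL":
        # Drop the bits to the left of the MSB.
        wrapped = value & ((1 << W) - 1)
        if signed and wrapped > hi:
            wrapped -= 1 << W
        return wrapped
    if mode == "SAT_INTEL":
        # Saturate to the closest of MIN or MAX.
        return hi if value > hi else lo
    if mode == "SAT_ZERO_INTEL":
        return 0
    if mode == "SAT_SYM_INTEL":
        if not signed:
            return hi if value > hi else lo  # same as SAT_INTEL
        # Negative overflow saturates to -(max positive), not the minimum.
        return hi if value > hi else -hi
    raise ValueError(f"unknown overflow mode: {mode}")


for mode in ("WRAP_INTEL", "SAT_INTEL", "SAT_ZERO_INTEL", "SAT_SYM_INTEL"):
    print(mode, apply_overflow(-200, 8, True, mode))
```

Running the loop on a signed 8-bit target shows how the same out-of-range input lands on four different results, which is the distinction the Behavior column describes.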

After Section 3.16, add a new section "3.16d Signedness Modes" as follows:

-
-
-
-

Signedness Modes

- ------ - - - - - - - - - - - - - - - - - - - - - - -
ValueModeBehaviorEnabling Capabilities

0

UNSIGNED_INTEL

Input and result types are unsigned

ArbitraryPrecisionFixedPointINTEL

1

SIGNED_INTEL

Input and result types are signed

ArbitraryPrecisionFixedPointINTEL

-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5922

ArbitraryPrecisionFixedPointINTEL

-

Enables arbitrary precision fixed-point math instructions.

-
-
-

Instructions

-
-

In Section 3.32.13, Arithmetic Instructions, add the following instructions:

-
- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedSqrtINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the square root of the value is returned in Result. -The behavior of this function is undefined for input values < 0.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5923

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O
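As an illustration of the operand layout above, the following Python sketch packs an OpFixedSqrtINTEL instruction into 32-bit words. It relies on the general SPIR-V rule that the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits. The helper name `encode_instruction`, the `<id>` values 10, 11 and 12, the literal values for I and rI, and the quantization value 0 are all placeholders assumed for this example.

```python
OP_FIXED_SQRT_INTEL = 5923  # opcode listed for OpFixedSqrtINTEL

# Enumerant values taken from the Signedness and Overflow Mode tables.
UNSIGNED_INTEL, SIGNED_INTEL = 0, 1
WRAP_INTEL, SAT_INTEL = 0, 1


def encode_instruction(opcode: int, operands: list[int]) -> list[int]:
    """Return the instruction as a list of 32-bit words (illustrative helper)."""
    word_count = 1 + len(operands)  # the first word counts itself
    if word_count > 0xFFFF or not 0 <= opcode <= 0xFFFF:
        raise ValueError("word count and opcode must each fit in 16 bits")
    first_word = (word_count << 16) | opcode
    # Signed literals such as I and rI are stored as two's complement words.
    return [first_word] + [w & 0xFFFFFFFF for w in operands]


words = encode_instruction(
    OP_FIXED_SQRT_INTEL,
    [
        10,            # <id> Result Type (hypothetical id)
        11,            # Result <id> (hypothetical id)
        12,            # Input <id> (hypothetical id)
        SIGNED_INTEL,  # Signedness S
        4,             # Literal I (placeholder)
        4,             # Literal rI (placeholder)
        0,             # Quantization mode (placeholder value)
        SAT_INTEL,     # OverflowMode O
    ],
)
```

The eight operands plus the leading word give the word count of 9 shown in the table.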

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedRecipINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the reciprocal (1/Input) of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5924

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedRsqrtINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the reciprocal square root (1/sqrt(Input)) of the value is returned in Result. The behavior of this function is undefined for input values < 0.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5925

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedSinINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the sine of the value is returned in Result. Note that the angles are measured in radians.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5926

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedCosINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the cosine of the value is returned in Result. Note that the angles are measured in radians.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5927

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedSinCosINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and both the sine and cosine of the value are returned in Result. Note that the angles are measured in radians.

-

Result Type must be a two-component vector of OpTypeInt. The first component of the Result contains the sine of the Input and is an arbitrary precision fixed point number. The second component of the Result contains the cosine of the Input and is also an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of each component of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5928

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedSinPiINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the sine of pi * Input is returned in Result. Note that the angles are measured in radians.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5929

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedCosPiINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the cosine of pi * Input is returned in Result. Note that the angles are measured in radians.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5930

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedSinCosPiINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and both the sine and cosine of pi * Input are returned in Result. Note that the angles are measured in radians.

-

Result Type must be a two-component vector of OpTypeInt. The first component of the Result contains the sine of pi * Input and is an arbitrary precision fixed point number. The second component of the Result contains the cosine of pi * Input and is also an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of each component of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5931

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedLogINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the log of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5932

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

- ------------ - - - - - - - - - - - - - - - - - - -

OpFixedExpINTEL

-

An OpTypeInt representing an arbitrary precision fixed point number (ac_fixed) is passed in as the Input and the exp of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision fixed point number.

-

S is chosen from Table 3.16d that indicates the Signedness Mode of the input and output types.

-

I is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the input type.

-

rI is a signed 32-bit integer that refers to the location of the fixed-point relative to the MSB of the result type.

-

Q is a QuantizationMode enum chosen from Table 3.16b that indicates the Quantization Mode of this operation.

-

O is an OverflowMode enum chosen from Table 3.16c that indicates the Overflow Mode of this operation.

Capability: -ArbitraryPrecisionFixedPointINTEL

9

5933

<id> Result Type

Result <id>

Input <id>

Signedness S

Literal I

Literal rI

QuantizationMode Q

OverflowMode O

-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-03-21

Ajaykumar Kannan

Initial Public Release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html + + +

extensions/INTEL/SPV_INTEL_arbitrary_precision_fixed_point.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html b/extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html index 4fdbcaf..2bcc12c 100644 --- a/extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html +++ b/extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html @@ -1,2340 +1,12 @@ - - - - - - - -SPV_INTEL_arbitrary_precision_floating_point - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_arbitrary_precision_floating_point

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ajaykumar Kannan, Intel

    -
  • -
  • -

    Shuo Niu, Intel

    -
  • -
  • -

    Daniel Zhang, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation

-
-
-
-
-

Status

-
-
-

Supported

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification Version 1.6, Revision 2.

-
-
-

While this extension does not require SPV_INTEL_arbitrary_precision_integers, the new operators it adds are significantly more useful when that extension is supported, because the combination of the two extensions allows more freedom in the width of the arbitrary precision floating point data types that can be represented.

-
-
-
-
-

Overview

-
-
-

This extension adds instructions for performing arbitrary precision floating point computations. Each arbitrary-precision floating point value is represented as an OpTypeInt as described below. This datatype and its corresponding operations can be useful on targets that can take advantage of narrower representations, such as FPGAs.

-
-
-

The datatype can be characterized by two parameters:

-
-
-
    -
  • -

    E: the number of exponent bits

    -
  • -
  • -

    M: the number of mantissa bits

    -
  • -
-
-
-

The total width of the OpTypeInt container is E+M+1, where the extra bit is used to represent the sign. Note that the signedness capabilities of OpTypeInt are not used for any of the operations. The data layout is shown below:

-
-
-

[ S (sign bit) ][ E (Exponent) ][ M (Mantissa) ]
^--MSB                                    LSB--^

-
-
-

The width of the data (E+M+1) is encoded with the width of the OpTypeInt. The other parameters regarding the type (namely E and M) are encoded in the arguments of the operations.
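The layout described above can be sketched in Python as follows. This is an illustration rather than anything defined by the extension: the function name `split_fields` is an assumption, and the code only separates the sign, exponent and mantissa bit fields from an integer container of width E+M+1. It does not interpret the exponent bias or any special values, since those are not described in this section.

```python
def split_fields(bits: int, width: int, mantissa_bits: int) -> tuple[int, int, int]:
    """Split a `width`-bit container into (sign, exponent, mantissa) bit fields."""
    exponent_bits = width - mantissa_bits - 1  # E = width - M - 1 (one sign bit)
    if exponent_bits < 1:
        raise ValueError("width must leave room for a sign bit and an exponent")
    if not 0 <= bits < (1 << width):
        raise ValueError("bits does not fit in the container")

    mantissa = bits & ((1 << mantissa_bits) - 1)                     # lowest M bits
    exponent = (bits >> mantissa_bits) & ((1 << exponent_bits) - 1)  # next E bits
    sign = (bits >> (width - 1)) & 1                                 # MSB
    return sign, exponent, mantissa
```

As a sanity check on the bit arithmetic only, a 32-bit container with M = 23 has the same field boundaries as IEEE 754 binary32, so `split_fields(0x3F800000, 32, 23)` returns `(0, 127, 0)`.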

-
-
-

Operation Controls

-
-

Each of the operations will also provide some control over the Rounding Mode and the Subnormal support.

-
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_arbitrary_precision_floating_point"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ArbitraryPrecisionFloatingPointINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the ArbitraryPrecisionFloatingPointINTEL capability:

-
-
-
-
OpArbitraryFloatAddINTEL
-OpArbitraryFloatSubINTEL
-OpArbitraryFloatMulINTEL
-OpArbitraryFloatDivINTEL
-OpArbitraryFloatGTINTEL
-OpArbitraryFloatGEINTEL
-OpArbitraryFloatLTINTEL
-OpArbitraryFloatLEINTEL
-OpArbitraryFloatEQINTEL
-OpArbitraryFloatRecipINTEL
-OpArbitraryFloatRSqrtINTEL
-OpArbitraryFloatCbrtINTEL
-OpArbitraryFloatHypotINTEL
-OpArbitraryFloatSqrtINTEL
-OpArbitraryFloatLogINTEL
-OpArbitraryFloatLog2INTEL
-OpArbitraryFloatLog10INTEL
-OpArbitraryFloatLog1pINTEL
-OpArbitraryFloatExpINTEL
-OpArbitraryFloatExp2INTEL
-OpArbitraryFloatExp10INTEL
-OpArbitraryFloatExpm1INTEL
-OpArbitraryFloatSinINTEL
-OpArbitraryFloatCosINTEL
-OpArbitraryFloatSinCosINTEL
-OpArbitraryFloatSinPiINTEL
-OpArbitraryFloatCosPiINTEL
-OpArbitraryFloatSinCosPiINTEL
-OpArbitraryFloatASinINTEL
-OpArbitraryFloatASinPiINTEL
-OpArbitraryFloatACosINTEL
-OpArbitraryFloatACosPiINTEL
-OpArbitraryFloatATanINTEL
-OpArbitraryFloatATanPiINTEL
-OpArbitraryFloatATan2INTEL
-OpArbitraryFloatPowINTEL
-OpArbitraryFloatPowRINTEL
-OpArbitraryFloatPowNINTEL
-OpArbitraryFloatConvertINTEL
-OpArbitraryFloatConvertFromUIntINTEL
-OpArbitraryFloatConvertFromSIntINTEL
-OpArbitraryFloatConvertToUIntINTEL
-OpArbitraryFloatConvertToSIntINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

ArbitraryPrecisionFloatingPointINTEL

5845

OpArbitraryFloatAddINTEL

5846

OpArbitraryFloatSubINTEL

5847

OpArbitraryFloatMulINTEL

5848

OpArbitraryFloatDivINTEL

5849

OpArbitraryFloatGTINTEL

5850

OpArbitraryFloatGEINTEL

5851

OpArbitraryFloatLTINTEL

5852

OpArbitraryFloatLEINTEL

5853

OpArbitraryFloatEQINTEL

5854

OpArbitraryFloatRecipINTEL

5855

OpArbitraryFloatRSqrtINTEL

5856

OpArbitraryFloatCbrtINTEL

5857

OpArbitraryFloatHypotINTEL

5858

OpArbitraryFloatSqrtINTEL

5859

OpArbitraryFloatLogINTEL

5860

OpArbitraryFloatLog2INTEL

5861

OpArbitraryFloatLog10INTEL

5862

OpArbitraryFloatLog1pINTEL

5863

OpArbitraryFloatExpINTEL

5864

OpArbitraryFloatExp2INTEL

5865

OpArbitraryFloatExp10INTEL

5866

OpArbitraryFloatExpm1INTEL

5867

OpArbitraryFloatSinINTEL

5868

OpArbitraryFloatCosINTEL

5869

OpArbitraryFloatSinCosINTEL

5870

OpArbitraryFloatSinPiINTEL

5871

OpArbitraryFloatCosPiINTEL

5872

OpArbitraryFloatSinCosPiINTEL

5840

OpArbitraryFloatASinINTEL

5873

OpArbitraryFloatASinPiINTEL

5874

OpArbitraryFloatACosINTEL

5875

OpArbitraryFloatACosPiINTEL

5876

OpArbitraryFloatATanINTEL

5877

OpArbitraryFloatATanPiINTEL

5878

OpArbitraryFloatATan2INTEL

5879

OpArbitraryFloatPowINTEL

5880

OpArbitraryFloatPowRINTEL

5881

OpArbitraryFloatPowNINTEL

5882

OpArbitraryFloatConvertINTEL

5841

OpArbitraryFloatConvertFromUIntINTEL

5842

OpArbitraryFloatConvertFromSIntINTEL

5838

OpArbitraryFloatConvertToUIntINTEL

5843

OpArbitraryFloatConvertToSIntINTEL

5839

-
-
-
-

Modifications to the SPIR-V Specification Version 1.6

-
-
-

After Section 3.16, add a new section "3.16a Subnormal Support" as follows:

-
-
-

Subnormal Support

-
-

Control whether subnormal support is enabled or not.

-
- ---- - - - - - - - - - - - - - - - - -
Value | Subnormal Support

0

Flush subnormal numbers to zero on inputs and outputs

1

Enable support for operating on subnormal numbers

-
-

After Section 3.16, add a new section "3.16d Rounding Accuracy" as follows:

-
-
-
-

Rounding Accuracy

-
-

Controls whether rounding operations can be relaxed to trade correctness for improved resource utilization.

-
- ----- - - - - - - - - - - - - - - - - - - - -
Value | Mode | Behavior

0

CORRECT_INTEL

Conform to the rounding mode specified by the instruction’s rounding mode operand.

1

FAITHFUL_INTEL

Allow some tolerance for error (within 1ULP of the infinitely precise result) for rounding.
-The returned result is one of the two floating point values closest to the mathematical result.

-

This mode is useful for devices that can trade CORRECT_INTEL rounding for improved resource utilization.
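The difference between the two accuracy modes can be illustrated with exact rational arithmetic. The sketch below is an assumption-laden simplification, not the extension's definition: it treats the representable values near the exact result as a uniform grid with spacing `ulp`, and the names `faithful_candidates` and `is_faithful` are hypothetical. Under FAITHFUL_INTEL either neighbouring grid value is acceptable, while CORRECT_INTEL would select the one dictated by the rounding mode operand.

```python
from fractions import Fraction
import math


def faithful_candidates(exact: Fraction, ulp: Fraction) -> tuple[Fraction, Fraction]:
    """Return the representable values just below and just above `exact`."""
    steps = exact / ulp
    lower = math.floor(steps) * ulp
    upper = math.ceil(steps) * ulp
    return lower, upper  # equal when `exact` is itself representable


def is_faithful(result: Fraction, exact: Fraction, ulp: Fraction) -> bool:
    """True if `result` is one of the two values bracketing `exact`."""
    return result in faithful_candidates(exact, ulp)
```

With `exact = Fraction(1, 3)` and `ulp = Fraction(1, 8)`, the candidates are 1/4 and 3/8, and both lie within one ULP of the exact value.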

-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5845

ArbitraryPrecisionFloatingPointINTEL

-

Allows the use of various operations for arbitrary precision floating-point math

-
-
-

Instructions

-
-

In Section 3.32.13, Arithmetic Instructions, add the following instructions:

-
- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatAddINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the result of A+B is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt values used to represent their corresponding arguments (A, B, Result).

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5846

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSubINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the result of A-B is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5847

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatMulINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the result of A*B is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5848

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatDivINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the result of A/B is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5849

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ---------- - - - - - - - - - - - - - - - - -

OpArbitraryFloatGTINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The two numbers are compared and a value of true is returned in Result if A > B. -Otherwise, a value of false is returned.

-

Result Type must be a Boolean type.

-

Result is of type OpTypeBool.

-

Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within A and B respectively. -Note that the exponent values (Ea, Eb) are inferred from the width of the OpTypeInt.

Capability: -ArbitraryPrecisionFloatingPointINTEL

7

5850

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

- ---------- - - - - - - - - - - - - - - - - -

OpArbitraryFloatGEINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The two numbers are compared and a value of true is returned in Result if A >= B. -Otherwise, a value of false is returned.

-

Result Type must be a Boolean type.

-

Result is of type OpTypeBool.

-

Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within A and B respectively. -Note that the exponent values (Ea, Eb) are inferred from the width of the OpTypeInt.

Capability: -ArbitraryPrecisionFloatingPointINTEL

7

5851

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

- ---------- - - - - - - - - - - - - - - - - -

OpArbitraryFloatLTINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The two numbers are compared and a value of true is returned in Result if A < B. -Otherwise, a value of false is returned.

-

Result Type must be a Boolean type.

-

Result is of type OpTypeBool.

-

Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within A and B respectively. -Note that the exponent values (Ea, Eb) are inferred from the width of the OpTypeInt.

Capability: -ArbitraryPrecisionFloatingPointINTEL

7

5852

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

- ---------- - - - - - - - - - - - - - - - - -

OpArbitraryFloatLEINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The two numbers are compared and a value of true is returned in Result if A <= B. -Otherwise, a value of false is returned.

-

Result Type must be a Boolean type.

-

Result is of type OpTypeBool.

-

Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within A and B respectively. -Note that the exponent values (Ea, Eb) are inferred from the width of the OpTypeInt.

Capability: -ArbitraryPrecisionFloatingPointINTEL

7

5853

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

- ---------- - - - - - - - - - - - - - - - - -

OpArbitraryFloatEQINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The two numbers are compared and a value of true is returned in Result if A == B. -Otherwise, a value of false is returned.

-

Result Type must be a Boolean type.

-

Result is of type OpTypeBool.

-

Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within A and B respectively. -Note that the exponent values (Ea, Eb) are inferred from the width of the OpTypeInt.

Capability: -ArbitraryPrecisionFloatingPointINTEL

7

5854

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatRecipINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the reciprocal of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5855

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatRSqrtINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the reciprocal of the square root of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5856

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatCbrtINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the cube root of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5857

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy
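The two numbers that open the operand row (9 and 5857 here) follow the usual SPIR-V instruction-table layout of word count and opcode. In the SPIR-V binary form the first word of an instruction packs the word count into the high 16 bits and the opcode into the low 16 bits. A minimal sketch of that packing follows; the helper name `first_word` is hypothetical.

```python
def first_word(word_count: int, opcode: int) -> int:
    """Pack word count (high 16 bits) and opcode (low 16 bits)."""
    if not (0 < word_count <= 0xFFFF and 0 <= opcode <= 0xFFFF):
        raise ValueError("word count and opcode must each fit in 16 bits")
    return (word_count << 16) | opcode


# OpArbitraryFloatCbrtINTEL: 1 opcode word + 8 operand words = 9 words.
word = first_word(9, 5857)
print(hex(word))
```

The count of 9 matches the eight operands listed (Result Type, Result, A, Ma, Mresult, Subnormal, Rounding, Accuracy) plus the first word itself.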

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatHypotINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the hypotenuse, sqrt(A^2 + B^2), is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5858

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy
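As an aside on why a dedicated hypotenuse operation is useful, the sketch below compares a direct transcription of sqrt(A^2 + B^2) with a library routine. It uses ordinary double precision, not the arbitrary precision formats of this extension, and says nothing about how an implementation of OpArbitraryFloatHypotINTEL actually computes its result.

```python
import math


def hypot_naive(a: float, b: float) -> float:
    # Direct transcription of sqrt(A^2 + B^2); the squares can overflow
    # even when the final result would be representable.
    return math.sqrt(a * a + b * b)


a = b = 1e200
print(hypot_naive(a, b))  # the intermediate squares overflow to infinity
print(math.hypot(a, b))   # the library routine returns a finite value
```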

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSqrtINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the square root of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5859

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatLogINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the natural logarithm, ln(A), is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5860

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatLog2INTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the base-2 logarithm, log2(A), is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5861

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatLog10INTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the base-10 logarithm, log10(A), is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5862

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatLog1pINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and ln(1+A) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5863

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy
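A short double-precision illustration of why ln(1+A) is offered as its own operation rather than being composed from an addition and a logarithm: for small A, forming 1+A first rounds A away entirely. The same reasoning applies to (e^A)-1. This uses Python's `math` module as a stand-in and makes no claim about the accuracy of the INTEL instructions.

```python
import math

a = 1e-18

# 1 + a rounds to exactly 1.0 in double precision, so the information
# carried by a is lost before the logarithm is taken.
composed = math.log(1.0 + a)

# log1p evaluates ln(1 + a) without forming 1 + a explicitly.
fused = math.log1p(a)

print(composed, fused)
```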

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatExpINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and e^A is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5864

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatExp2INTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and 2^A is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5865

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatExp10INTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and 10^A is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5866

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatExpm1INTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and (e^A)-1 is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5867

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSinINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the sine of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5868

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatCosINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the cosine of the value is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5869

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSinCosINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the sine and cosine of the value are returned in Result.

-

Result Type must be a two-component vector of OpTypeInt. The first component of the Result contains the sine of A and is an arbitrary precision floating point number. The second component of the Result contains the cosine of A and is also an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5870

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSinPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the sin(A*pi) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5871

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy
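To show what sin(A*pi) buys over multiplying by an approximation of pi first, here is a double-precision reference sketch. The function name `sin_pi` and the reduction strategy are illustrative choices made here, not the method of the extension; infinities and NaNs are not handled.

```python
import math


def sin_pi(a: float) -> float:
    """Double-precision reference for sin(a * pi)."""
    # sin(pi * a) has period 2 in a, and fmod is exact for floats,
    # so the reduction itself introduces no rounding error.
    r = math.fmod(a, 2.0)
    if r == math.floor(r):
        # Integer a: the sine of an integer multiple of pi is exactly zero.
        return math.copysign(0.0, a)
    return math.sin(math.pi * r)


print(sin_pi(1.0))               # exact zero
print(math.sin(math.pi * 1.0))   # small nonzero value, because math.pi is rounded
```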

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatCosPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the cos(A*pi) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5872

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatSinCosPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the sine and cosine of A*pi are returned in Result.

-

Result Type must be a two-component vector of OpTypeInt. The first component of the Result contains the sine of A*pi and is an arbitrary precision floating point number. The second component of the Result contains the cosine of A*pi and is also an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5840

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatASinINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arcsin(A) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5873

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatASinPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arcsin(A)/pi is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5874

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatACosINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arccos(A) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5875

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatACosPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arccos(A)/pi is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5876

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatATanINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arctan(A) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5877

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ------------ - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatATanPiINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A and the arctan(A)/pi is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

9

5878

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatATan2INTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the arctan2(A,B) = arctan(A/B) is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5879

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy
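The identity arctan2(A,B) = arctan(A/B) given above agrees with the common two-argument arctangent only when B is positive. The sketch below shows the difference using Python's `math.atan2`, which uses the signs of both arguments to select the quadrant. Whether OpArbitraryFloatATan2INTEL follows that four-quadrant convention is not stated in this section, so treat this purely as background on the two formulas.

```python
import math

a, b = 1.0, -1.0

# Four-quadrant arctangent: the point (b, a) = (-1, 1) lies in the
# second quadrant, so the angle is 3*pi/4.
four_quadrant = math.atan2(a, b)

# Single-argument arctangent of the quotient: the sign information of
# the individual operands is lost and the result is -pi/4.
quotient_only = math.atan(a / b)

print(four_quadrant, quotient_only)
```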

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatPowINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B and the value of A^B is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5880

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatPowRINTEL

-

Two OpTypeInt values representing two arbitrary precision floating point numbers are passed in as A and B. -The value of A^B is returned in Result. A must be greater than or equal to 0; otherwise, the result is undefined.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult, Ma and Mb are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result, A and B respectively. -Note that the exponent values (Ea, Eb, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5881

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal Mb

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- -------------- - - - - - - - - - - - - - - - - - - - - -

OpArbitraryFloatPowNINTEL

-

Two OpTypeInt values representing an arbitrary precision floating point number and an arbitrary precision integer number of signedness SignOfB are passed in as A and B respectively. -The value of A^B is returned in Result where B is a signed or unsigned integer of arbitrary size.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. -Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

SignOfB specifies whether B is signed or unsigned.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result. It is ignored if the Accuracy operand is set to "FAITHFUL_INTEL".

-

Accuracy is a RoundingAccuracy chosen from Table 3.16d that controls the rounding accuracy of the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

11

5882

<id> Result Type

Result <id>

A <id>

Literal Ma

B <id>

Literal SignOfB

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

RoundingAccuracy Accuracy

- ----------- - - - - - - - - - - - - - - - - - -

OpArbitraryFloatConvertINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A. -It is type converted into an arbitrary precision floating point number with the new specification (Eresult, Mresult) and returned as Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult and Ma are 32-bit unsigned integers that define the mantissa widths of the floating point types within Result and A respectively. Note that the exponent values (Ea, Eresult) are inferred from the width of the OpTypeInt.

-

Subnormal is a SubnormalMode chosen from Table 3.16a that specifies whether subnormal numbers should be supported or flushed to zero before and after the operation.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

8

5841

<id> Result Type

Result <id>

A <id>

Literal Ma

Literal Mresult

SubnormalMode Subnormal

RoundingMode Rounding

- --------- - - - - - - - - - - - - - - - -

OpArbitraryFloatConvertFromUIntINTEL

-

An OpTypeInt representing an unsigned integer is passed in as A. -It is type converted into an arbitrary precision floating point number with the specification (Eresult, Mresult). The result of the convert operation is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult is a 32-bit unsigned integer that defines the mantissa width of the floating point value in Result. Note that the exponent value (Eresult) is inferred from the width of the OpTypeInt.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

6

5842

<id> Result Type

Result <id>

A <id>

Literal Mresult

RoundingMode Rounding

- --------- - - - - - - - - - - - - - - - -

OpArbitraryFloatConvertFromSIntINTEL

-

An OpTypeInt representing a signed integer is passed in as A. -It is type converted into an arbitrary precision floating point number with the new specification (Eresult, Mresult). The result of the convert operation is returned in Result.

-

Result Type must be OpTypeInt.

-

Result is the <id> of the operation’s result, which is an arbitrary precision floating point number.

-

Mresult is a 32-bit unsigned integer that defines the mantissa width of the floating point value in Result. Note that the exponent value (Eresult) is inferred from the width of the OpTypeInt.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

6

5838

<id> Result Type

Result <id>

A <id>

Literal Mresult

RoundingMode Rounding

- --------- - - - - - - - - - - - - - - - -

OpArbitraryFloatConvertToUIntINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A. -It is type converted into an unsigned integer and returned as Result.

-

Result Type must be OpTypeInt, whose Signedness operand is 0. Behaviour is undefined if Result Type is not wide enough to hold the converted value.

-

Result is the <id> of the operation’s result, which is an arbitrary precision integer.

-

Ma is a 32-bit unsigned integer that defines the mantissa width of the floating point value in A. Note that the exponent value (Ea) is inferred from the width of the OpTypeInt.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

6

5843

<id> Result Type

Result <id>

A <id>

Literal Ma

RoundingMode Rounding

- --------- - - - - - - - - - - - - - - - -

OpArbitraryFloatConvertToSIntINTEL

-

An OpTypeInt representing an arbitrary precision floating point number is passed in as A. -It is type converted into a signed integer and returned as Result.

-

Result Type must be OpTypeInt. Behaviour is undefined if Result Type is not wide enough to hold the converted value.

-

Result is the <id> of the operation’s result, which is an arbitrary precision integer.

-

Ma is a 32-bit unsigned integer that defines the mantissa width of the floating point value in A. Note that the exponent value (Ea) is inferred from the width of the OpTypeInt.

-

Rounding is a RoundingMode chosen from Table 3.16 that controls the rounding mode for the result.

Capability: -ArbitraryPrecisionFloatingPointINTEL

6

5839

<id> Result Type

Result <id>

A <id>

Literal Ma

RoundingMode Rounding

-
-
-

Validation Rules

-
-
    -
  • -

    Any M* literal argument to any instruction added in this extension must not exceed the width of its corresponding OpTypeInt argument minus 1.

    -
  • -
-
-
-
-
-
-
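The validation rule above can be sketched as a small check. This is an illustrative sketch only and is not part of the extension: the helper names `check_mantissa_width` and `inferred_exponent_width` are hypothetical, and the layout of one sign bit plus exponent and mantissa bits (so that E = width - 1 - M) is an assumption. The extension text itself only states that the exponent width is inferred from the width of the OpTypeInt.

```python
def check_mantissa_width(int_width: int, mantissa_width: int) -> bool:
    """Return True if an M* literal satisfies the validation rule.

    The rule: M must not exceed the OpTypeInt width minus 1.
    """
    return 0 <= mantissa_width <= int_width - 1


def inferred_exponent_width(int_width: int, mantissa_width: int) -> int:
    """Infer the exponent width.

    ASSUMPTION (not stated in the extension text): the value is laid out
    as 1 sign bit + E exponent bits + M mantissa bits.
    """
    if not check_mantissa_width(int_width, mantissa_width):
        raise ValueError("mantissa width exceeds OpTypeInt width minus 1")
    return int_width - 1 - mantissa_width
```

Under that assumed layout, a 32-bit OpTypeInt with M = 23 yields an 8-bit exponent, which matches the familiar single-precision format.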

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-03-29

Ajaykumar Kannan

Initial Public Release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html + + +

extensions/INTEL/SPV_INTEL_arbitrary_precision_floating_point.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html b/extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html index dfc83eb..cef9a0c 100644 --- a/extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html +++ b/extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html @@ -1,254 +1,12 @@ - - - - - - - -SPV_INTEL_arbitrary_precision_integers - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_arbitrary_precision_integers

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ajaykumar Kannan, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 Intel Corporation

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-

Version

- ---- - - - - - - - - - - -

Last Modified Date

2020-03-27

Revision

1

-
-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification Version 1.5, Revision 2.

-
-
-
-
-

Overview

-
-
-

This extension relaxes the restriction that OpTypeInt must have a width of 32 bits. -Integers of arbitrary bit widths can be beneficial on targets that can exploit narrower widths, such as FPGAs.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_arbitrary_precision_integers"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ArbitraryPrecisionIntegersINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

ArbitraryPrecisionIntegersINTEL

5844

-
-
-
-

Modifications to the SPIR-V Specification Version 1.5

-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5844

ArbitraryPrecisionIntegersINTEL

-

Allows the use of OpTypeInt to declare integers of arbitrary width. -At a minimum, all bit widths up to 32 bits must be supported; implementations may extend support beyond 32 bits.

Int8, Int16

-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2020-03-27

Ajaykumar Kannan

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html + + +

extensions/INTEL/SPV_INTEL_arbitrary_precision_integers.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_bfloat16_conversion.html b/extensions/INTEL/SPV_INTEL_bfloat16_conversion.html index 2147eaa..6fc6896 100644 --- a/extensions/INTEL/SPV_INTEL_bfloat16_conversion.html +++ b/extensions/INTEL/SPV_INTEL_bfloat16_conversion.html @@ -1,328 +1,12 @@ - - - - - - - -SPV_INTEL_bfloat16_conversion - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_bfloat16_conversion

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Greg Lueck, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
  • -

    Arvind Sudarsanam, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Shipping

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-06

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision -2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds instructions to convert between single-precision 32-bit -floating-point values and 16-bit bfloat16 values. -The bfloat16 floating-point format is a truncated variant of the IEEE 754 -single-precision 32-bit floating-point format with one sign bit, eight exponent -bits, and seven mantissa bits. -This gives the 16-bit bfloat16 format similar dynamic range as the 32-bit -float format, albeit with lower precision than the 16-bit half format.

-
-
-

Please note that this extension does not introduce a bfloat16 type to SPIR-V -and instead the new instructions convert to or from a 16-bit integer type whose -bit pattern represents a bfloat16 value.
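The relationship between the 32-bit float format and the bfloat16 bit pattern described above can be sketched as follows. This is a minimal illustration, not the extension's definition: the function names are hypothetical, and round-to-nearest-even is an assumption, since the text here does not specify a rounding mode for the narrowing conversion.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Convert a value to a 16-bit bfloat16 bit pattern.

    ASSUMPTION: round-to-nearest-even is used when discarding the low
    16 bits of the float32 encoding.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep the value a NaN after truncation by forcing a mantissa bit.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add a bias so that exact ties round to the even neighbour.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Interpret a 16-bit pattern as bfloat16 and widen it to float32."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```

Widening is exact because bfloat16 is the upper half of the float32 encoding; only the narrowing direction has to round.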

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_bfloat16_conversion"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

6115

BFloat16ConversionINTEL

-
-
-
-
-

Instructions

-
-

Add to Section 3.42.11, Conversion Instructions:

-
- ------- - - - - - - - - - - - - - -

OpConvertFToBF16INTEL
-
-Convert value numerically from 32-bit floating point to bfloat16, which is -represented as a 16-bit unsigned integer.
-
-Result Type must be a scalar or vector of integer type. -The component width must be 16 bits. -The bit pattern in the Result represents a bfloat16 value.
-
-Float Value must be a scalar or vector of floating-point type. -It must have the same number of components as Result Type. -The component width must be 32 bits.
-
-Results are computed per component.

Capability:
-BFloat16ConversionINTEL

4

6116

<id>
-Result Type

Result <id>

<id>
-Float Value

- ------- - - - - - - - - - - - - - -

OpConvertBF16ToFINTEL
-
-Interpret a 16-bit integer value as bfloat16 and convert the value numerically -to 32-bit floating point.
-
-Result Type must be a scalar or vector of floating-point type. -The component width must be 32 bits.
-
-BFloat16 Value must be a scalar or vector of integer type, which is -interpreted as a bfloat16. -The type must have the same number of components as Result Type. -The component width must be 16 bits.
-
-Results are computed per component.

Capability:
-BFloat16ConversionINTEL

4

6117

<id>
-Result Type

Result <id>

<id>
-BFloat16 Value

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-03-06

Ben Ashbaugh

Initial revision for publication

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_bfloat16_conversion.html + + +

extensions/INTEL/SPV_INTEL_bfloat16_conversion.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_blocking_pipes.html b/extensions/INTEL/SPV_INTEL_blocking_pipes.html index 5295de2..784b23b 100644 --- a/extensions/INTEL/SPV_INTEL_blocking_pipes.html +++ b/extensions/INTEL/SPV_INTEL_blocking_pipes.html @@ -1,365 +1,12 @@ - - - - - - - -SPV_INTEL_blocking_pipes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_blocking_pipes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-07-17

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new pipe read and write functions that have blocking semantics instead of the non-blocking semantics of the existing pipe read/write functions. In this version, only the variants that don’t support reservations are specified.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_blocking_pipes"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
BlockingPipesINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the BlockingPipesINTEL capability:

-
-
-
-
OpReadPipeBlockingINTEL
-OpWritePipeBlockingINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - -

BlockingPipesINTEL

5945

OpReadPipeBlockingINTEL

5946

OpWritePipeBlockingINTEL

5947

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Capability

-
-

Modify section 3.31, Capability, adding a row to the capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5945

BlockingPipesINTEL

Pipes

-
-
-
-
-

Instructions

-
-

In section 3.32.23, Pipe Instructions, add two new instructions: OpReadPipeBlockingINTEL and OpWritePipeBlockingINTEL, as follows:

-
- -------- - - - - - - - - - - - - - - -

OpReadPipeBlockingINTEL

-

Read a packet from the pipe object specified by Pipe into Pointer. This instruction will not return until the read has completed successfully and thus if the pipe is empty it will block until there is data in the pipe.

-

Pipe must have a type of OpTypePipe with ReadOnly access qualifier.

-

Pointer must have a type of OpTypePointer with the same data type as Pipe and a Generic Storage Class.

-

Packet Size must be a 32-bit integer type scalar that represents the size in bytes of each packet in the pipe.

-

Packet Alignment must be a 32-bit integer type scalar that represents the alignment in bytes of each packet in the pipe.

-

Packet Size and Packet Alignment must satisfy the following:
-- 1 <= Packet Alignment <= Packet Size.
-- Packet Alignment must evenly divide Packet Size.

-

For concrete types, Packet Alignment should equal Packet Size. For aggregate types, Packet Alignment should be the size of the largest primitive type in the hierarchy of types.

Capability:
-BlockingPipesINTEL

5

5946

<id>
-Pipe

<id>
-Pointer

<id>
-Packet Size

<id>
-Packet Alignment

- -------- - - - - - - - - - - - - - - -

OpWritePipeBlockingINTEL

-

Write a packet from Pointer to the pipe object specified by Pipe. This instruction will not return until the write has completed successfully and thus if the pipe is full it will block until there is available capacity in the pipe.

-

Pipe must have a type of OpTypePipe with WriteOnly access qualifier.

-

Pointer must have a type of OpTypePointer with the same data type as Pipe and a Generic Storage Class.

-

Packet Size must be a 32-bit integer type scalar that represents the size in bytes of each packet in the pipe.

-

Packet Alignment must be a 32-bit integer type scalar that represents the alignment in bytes of each packet in the pipe.

-

Packet Size and Packet Alignment must satisfy the following:
-- 1 <= Packet Alignment <= Packet Size.
-- Packet Alignment must evenly divide Packet Size.

-

For concrete types, Packet Alignment should equal Packet Size. For aggregate types, Packet Alignment should be the size of the largest primitive type in the hierarchy of types.

Capability:
-BlockingPipesINTEL

5

5947

<id>
-Pipe

<id>
-Pointer

<id>
-Packet Size

<id>
-Packet Alignment

-
-
-
-
-
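The constraints that both pipe instructions place on Packet Size and Packet Alignment can be sketched as a simple predicate. This is only an illustration of the two stated conditions; the function name `packet_layout_is_valid` is hypothetical and not part of the extension.

```python
def packet_layout_is_valid(packet_size: int, packet_alignment: int) -> bool:
    """Check the two stated conditions on a pipe packet layout.

    1 <= Packet Alignment <= Packet Size, and
    Packet Alignment must evenly divide Packet Size.
    """
    if packet_alignment < 1 or packet_alignment > packet_size:
        return False
    return packet_size % packet_alignment == 0
```

For example, a 12-byte packet with 4-byte alignment passes, while the same packet with 8-byte alignment fails the divisibility condition.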

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-07-17

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_blocking_pipes.html + + +

extensions/INTEL/SPV_INTEL_blocking_pipes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_cache_controls.html b/extensions/INTEL/SPV_INTEL_cache_controls.html index d772098..6d9dd75 100644 --- a/extensions/INTEL/SPV_INTEL_cache_controls.html +++ b/extensions/INTEL/SPV_INTEL_cache_controls.html @@ -1,565 +1,12 @@ - - - - - - - -SPV_INTEL_cache_controls - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_cache_controls

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Andrzej Ratajewski, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Gregory Lueck, Intel

    -
  • -
  • -

    Victor Mustya, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-08-28

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6, -Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension allows cache control information to be applied to memory access -instructions.

-
-
-

The cache controls are a strong request that the SPIR-V consumer should use a -memory operation with the indicated cache controls. However, the SPIR-V consumer -may choose a different cache control if the requested one is unsupported or for -any other reason. The cache controls may affect the performance of a program, -but (with very few exceptions) must not affect the functional correctness.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_cache_controls"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
CacheControlsINTEL
-
-
-
-
-
-

New Decorations

-
-
-
-
CacheControlLoadINTEL
-CacheControlStoreINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - -

CacheControlsINTEL

6441

CacheControlLoadINTEL

6442

CacheControlStoreINTEL

6443

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6, Revision 2

-
-
-

Validation Rules

-
-

Modify Section 2.16.1, Universal Validation Rules, adding the following statements.

-
-
-
    -
  • -

    Decoration rules

    -
    -
      -
    • -

      A CacheControlLoadINTEL Decoration must be applied only as follows:

      -
      -
        -
      • -

        Only OpTypePointer values can be decorated.

        -
      • -
      • -

        Pointer types of the decorated instructions must have Function, -UniformConstant, CrossWorkgroup or Generic storage class.

        -
      • -
      • -

        It’s allowed to apply CacheControlLoadINTEL multiple times to the same -Pointer.

        -
      • -
      • -

        Two CacheControlLoadINTEL decorations decorating the same Pointer -must have different Cache Level values.

        -
      • -
      -
      -
    • -
    • -

      A CacheControlStoreINTEL Decoration must be applied only as follows:

      -
      -
        -
      • -

        Only OpTypePointer values can be decorated.

        -
      • -
      • -

        Pointer types of the decorated instructions must have Function, -CrossWorkgroup or Generic storage class.

        -
      • -
      • -

        It’s allowed to apply CacheControlStoreINTEL multiple times to the same -Pointer.

        -
      • -
      • -

        Two CacheControlStoreINTEL decorations decorating the same Pointer -must have different Cache Level values.

        -
      • -
      -
      -
    • -
    -
    -
  • -
-
-
-
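The decoration rules above can be sketched as a small checker. This is an illustrative sketch, not a validator implementation: the function name, the string representation of storage classes, and the list-of-levels input are all hypothetical choices made for the example.

```python
from collections import Counter

# Storage classes permitted by the rules above for each decoration.
LOAD_STORAGE_CLASSES = {"Function", "UniformConstant", "CrossWorkgroup", "Generic"}
STORE_STORAGE_CLASSES = {"Function", "CrossWorkgroup", "Generic"}


def check_cache_controls(kind: str, storage_class: str, cache_levels: list) -> list:
    """Return a list of rule violations for one decorated pointer.

    kind is "load" (CacheControlLoadINTEL) or "store" (CacheControlStoreINTEL).
    cache_levels holds the Cache Level literal of every decoration of that
    kind applied to the same pointer.
    """
    if kind not in ("load", "store"):
        raise ValueError("kind must be 'load' or 'store'")
    allowed = LOAD_STORAGE_CLASSES if kind == "load" else STORE_STORAGE_CLASSES
    errors = []
    if storage_class not in allowed:
        errors.append(f"storage class {storage_class} not allowed for {kind} controls")
    # Multiple decorations are allowed, but each must name a distinct level.
    for level, count in Counter(cache_levels).items():
        if count > 1:
            errors.append(f"cache level {level} is decorated {count} times")
    return errors
```

Note that UniformConstant is accepted only for load controls, which mirrors the difference between the two rule lists.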

Modify Section 3, Binary form, add new sub-sections after 3.18 Access Qualifier.

-
-
-
-
-

3.XX Load Cache Controls

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cache Control | Enabling Capabilities | Description

0

UncachedINTEL

CacheControlsINTEL

Hint that the load operation should not cache its data in the given cache -level.

1

CachedINTEL

CacheControlsINTEL

Hint that the load operation should cache its data in the given cache level.

2

StreamingINTEL

CacheControlsINTEL

Hint that the load operation should cache its data in the specified cache -level, using evict-first policy to minimize cache pollution caused by temporary -streaming data that may only be accessed once or twice.

3

InvalidateAfterReadINTEL

CacheControlsINTEL

By using this control the SPIR-V generator is asserting that the cache line -containing the data will not be read again until it’s overwritten, therefore -the load operation can invalidate the cache line and discard "dirty" data. If -the assertion is violated (the cache line is read again) then behavior is -undefined.

4

ConstCachedINTEL

CacheControlsINTEL

By using this control the SPIR-V generator is asserting that the cache line -containing the data will not be written until all invocations of the shader or -kernel execution are finished. If the assertion is violated (the cache line is -written), the behavior is undefined.

-
-

3.XX Store Cache Controls

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cache Control | Enabling Capabilities | Description

0

UncachedINTEL

CacheControlsINTEL

Hint that the store or atomic operation should not cache its data in the -given cache level.

1

WriteThroughINTEL

CacheControlsINTEL

Hint that the store or atomic operation should immediately write data to the -subsequent furthest cache, marking the cache line in the current cache as -"not dirty".

2

WriteBackINTEL

CacheControlsINTEL

Hint that the store or atomic operation should write data into the given -cache level and mark the cache line as "dirty". Upon eviction, "dirty" data -will be written into the furthest subsequent cache.

3

StreamingINTEL

CacheControlsINTEL

Same as WriteThroughINTEL, but use evict-first policy to limit cache -pollution by streaming output data.

-
-
-
-
-

Decorations

-
-

Modify Section 3.20, Decoration, adding rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

6442

CacheControlLoadINTEL
-Apply the cache controls to a Pointer. -The pointer must point to the Function, UniformConstant, CrossWorkgroup -or Generic Storage Class.
-
-If a memory-reading instruction uses the decorated pointer value as its input -parameter, the decoration is a hint that the load operation should be performed -with the specified cache semantics.
-
-Cache Level is an unsigned 32-bit integer telling the cache level to which -the control applies. The value 0 indicates the cache level closest to the -processing unit, the value 1 indicates the next furthest cache level, etc. -If some cache level does not exist, the decoration is ignored.
-
-If the exact Load Cache Control value is unsupported, the consumer may apply -implementation-specific closest match, but only if it does not change the -observable effect of the requested Load Cache Control.

Literal
-Cache Level

Load_Cache_Control
-Cache Control

CacheControlsINTEL

6443

CacheControlStoreINTEL
-Apply the cache controls to a Pointer. -The pointer must point to the Function, CrossWorkgroup or -Generic Storage Class.
-
-If a memory-writing or atomic instruction uses the decorated pointer value as -its input parameter, the decoration is a hint that the store operation should be -performed with the specified cache semantics.
-
-Cache Level is an unsigned 32-bit integer telling the cache level to which the -control applies. The value 0 indicates the cache level closest to the -processing unit, the value 1 indicates the next furthest cache level, etc. If -some cache level does not exist, the decoration is ignored.
-
-If the exact Store Cache Control value is unsupported, the consumer may apply -implementation-specific closest match, but only if it does not change the -observable effect of the requested Store Cache Control.

Literal
-Cache Level

Store_Cache_Control
-Cache Control

CacheControlsINTEL

-
-
-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

6441

CacheControlsINTEL

-
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How should the consumer handle shareable data with cache controls if -some cache level is non-coherent?
    -RESOLVED: The cache controls defined as "hints" cannot break the memory model. The -consumer must insert extra flush or invalidate operations to maintain the -memory model in case of non-coherent caches. The cache controls defined as -"assertions" may break the memory model, so users should take care to avoid the undefined -behavior cases.

    -
  2. -
  3. -

    How to mark an operation "uncached" on all the available cache levels?
    -RESOLVED: Use the Nontemporal Memory Operand defined in the core SPIR-V specification -instead of this extension.

    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-08-28

Victor Mustya

Initial public revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_cache_controls.html + + +

extensions/INTEL/SPV_INTEL_cache_controls.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html b/extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html index ba56c1a..a23475f 100644 --- a/extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html +++ b/extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html @@ -1,7942 +1,12 @@ - - - - - - - -SPV_INTEL_device_side_avc_motion_estimation - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_device_side_avc_motion_estimation

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Biju George, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Kristina Bessonova, Intel

    -
  • -
  • -

    Pawel Jurek, Intel

    -
  • -
  • -

    Alexey Sachkov, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Final Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-10-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

Video motion estimation (VME) is defined as a set of motion estimation operations that are used to determine the motion vectors, intra estimation angles and macroblock partitioning combination that best describe the transformation to the source macroblock, from blocks in one or more previous reference pictures (inter-prediction), or from other blocks in the same source picture (intra-prediction). It does this by searching for spatial and temporal patterns on the current and various forward and backward reference pictures.

-
-
-

The goal of this extension is to provide programmers with a fine-grained interface to the AVC VME media sampler in Intel graphics processors. It describes the specification of instructions that facilitate the programming of the VME media sampler to evaluate specific AVC motion estimation operations.

-
-
-

Instructions are defined for all the major operations of the VME media sampler. The major operations of the AVC VME media sampler in Intel Graphics Processors can be described as follows:

-
-
-
    -
  1. -

    Integer motion estimation (IME)
    -Perform motion estimation on a given source macroblock in a source image over a single or dual reference window in a reference image, at full-pixel resolution, to determine the best integer motion vectors and their associated distortions, and the best macroblock shape partitioning combination.

    -
  2. -
  3. -

    Motion estimation refinement (REF)
    -Perform refinement operations on the results of IME. The two sub-operations are:

    -
    -
      -
    • -

      Fractional motion estimation (FME)
      -Perform sub-pixel refinement on the results of an IME operation. Half-pixel (HPEL) or quarter-pixel (QPEL) refinements are performed to determine the best sub-pixel motion vectors and their associated distortions.

      -
    • -
    • -

      Bidirectional motion estimation (BME)
      -Perform bidirectional refinement on the results of an IME operation using two reference images to check if the bidirectional mode using two references yields lower distortion. An FME can optionally be performed implicitly as part of a bidirectional refinement.

      -
    • -
    -
    -
  4. -
  5. -

    Skip and Intra check (SIC)
    -Performs the following two sub-operations:

    -
    -
      -
    • -

      Skip check (SKC)
      -Compute the pixel distortion of a user-specified shape and motion vector combination. The VME media sampler fetches necessary pixels, performs fractional and bidirectional filtering (as necessary), and then computes the distortion between the derived reference and source. The skip decision can optionally be enhanced to include a 4x4 forward transform, the results of which are compared against a user specified threshold to emulate the effects of the forward quantization zeroing effect.

      -
    • -
    • -

      Intra prediction estimation (IPE)
      -Perform intra prediction on a given source macroblock to determine the best intra prediction modes and the best shape partitioning combination.

      -
    • -
    -
    -
  6. -
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_device_side_avc_motion_estimation"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
SubgroupAvcMotionEstimationINTEL
-SubgroupAvcMotionEstimationIntraINTEL
-SubgroupAvcMotionEstimationChromaINTEL
-
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the SubgroupAvcMotionEstimationINTEL capability (some are additionally defined under the SubgroupAvcMotionEstimationIntraINTEL or SubgroupAvcMotionEstimationChromaINTEL capability):

-
-
-
-
OpSubgroupAvcMceGetDefaultInterBaseMultiReferencePenaltyINTEL
-OpSubgroupAvcMceSetInterBaseMultiReferencePenaltyINTEL
-OpSubgroupAvcMceGetDefaultInterShapePenaltyINTEL
-OpSubgroupAvcMceSetInterShapePenaltyINTEL
-OpSubgroupAvcMceGetDefaultInterDirectionPenaltyINTEL
-OpSubgroupAvcMceSetInterDirectionPenaltyINTEL
-OpSubgroupAvcMceGetDefaultIntraLumaShapePenaltyINTEL
-OpSubgroupAvcMceGetDefaultInterMotionVectorCostTableINTEL
-OpSubgroupAvcMceGetDefaultHighPenaltyCostTableINTEL
-OpSubgroupAvcMceGetDefaultMediumPenaltyCostTableINTEL
-OpSubgroupAvcMceGetDefaultLowPenaltyCostTableINTEL
-OpSubgroupAvcMceSetMotionVectorCostFunctionINTEL
-OpSubgroupAvcMceGetDefaultIntraLumaModePenaltyINTEL
-OpSubgroupAvcMceGetDefaultNonDcLumaIntraPenaltyINTEL
-OpSubgroupAvcMceGetDefaultIntraChromaModeBasePenaltyINTEL
-OpSubgroupAvcMceSetAcOnlyHaarINTEL
-OpSubgroupAvcMceSetSourceInterlacedFieldPolarityINTEL
-OpSubgroupAvcMceSetSingleReferenceInterlacedFieldPolarityINTEL
-OpSubgroupAvcMceSetDualReferenceInterlacedFieldPolaritiesINTEL
-OpSubgroupAvcMceConvertToImePayloadINTEL
-OpSubgroupAvcMceConvertToImeResultINTEL
-OpSubgroupAvcMceConvertToRefPayloadINTEL
-OpSubgroupAvcMceConvertToRefResultINTEL
-OpSubgroupAvcMceConvertToSicPayloadINTEL
-OpSubgroupAvcMceConvertToSicResultINTEL
-OpSubgroupAvcMceGetMotionVectorsINTEL
-OpSubgroupAvcMceGetInterDistortionsINTEL
-OpSubgroupAvcMceGetBestInterDistortionsINTEL
-OpSubgroupAvcMceGetInterMajorShapeINTEL
-OpSubgroupAvcMceGetInterMinorShapeINTEL
-OpSubgroupAvcMceGetInterDirectionsINTEL
-OpSubgroupAvcMceGetInterMotionVectorCountINTEL
-OpSubgroupAvcMceGetInterReferenceIdsINTEL
-OpSubgroupAvcMceGetInterReferenceInterlacedFieldPolaritiesINTEL
-
-
-
-
-
OpVmeImageINTEL
-OpSubgroupAvcImeInitializeINTEL
-OpSubgroupAvcImeSetSingleReferenceINTEL
-OpSubgroupAvcImeSetDualReferenceINTEL
-OpSubgroupAvcImeRefWindowSizeINTEL
-OpSubgroupAvcImeAdjustRefOffsetINTEL
-OpSubgroupAvcImeConvertToMcePayloadINTEL
-OpSubgroupAvcImeSetMaxMotionVectorCountINTEL
-OpSubgroupAvcImeSetUnidirectionalMixDisableINTEL
-OpSubgroupAvcImeSetEarlySearchTerminationThresholdINTEL
-OpSubgroupAvcImeSetWeightedSadINTEL
-OpSubgroupAvcImeEvaluateWithSingleReferenceINTEL
-OpSubgroupAvcImeEvaluateWithDualReferenceINTEL
-OpSubgroupAvcImeEvaluateWithSingleReferenceStreamoutINTEL
-OpSubgroupAvcImeEvaluateWithDualReferenceStreamoutINTEL
-OpSubgroupAvcImeEvaluateWithSingleReferenceStreaminoutINTEL
-OpSubgroupAvcImeEvaluateWithDualReferenceStreaminoutINTEL
-OpSubgroupAvcImeConvertToMceResultINTEL
-OpSubgroupAvcImeGetSingleReferenceStreaminINTEL
-OpSubgroupAvcImeGetDualReferenceStreaminINTEL
-OpSubgroupAvcImeStripSingleReferenceStreamoutINTEL
-OpSubgroupAvcImeStripDualReferenceStreamoutINTEL
-OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL
-OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeDistortionsINTEL
-OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeReferenceIdsINTEL
-OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeMotionVectorsINTEL
-OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeDistortionsINTEL
-OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeReferenceIdsINTEL
-OpSubgroupAvcImeGetBorderReachedINTEL
-OpSubgroupAvcImeGetTruncatedSearchIndicationINTEL
-OpSubgroupAvcImeGetUnidirectionalEarlySearchTerminationINTEL
-OpSubgroupAvcImeGetWeightingPatternMinimumMotionVectorINTEL
-OpSubgroupAvcImeGetWeightingPatternMinimumDistortionINTEL
-
-
-
-
-
OpSubgroupAvcFmeInitializeINTEL
-OpSubgroupAvcBmeInitializeINTEL
-OpSubgroupAvcRefConvertToMcePayloadINTEL
-OpSubgroupAvcRefSetBidirectionalMixDisableINTEL
-OpSubgroupAvcRefSetBilinearFilterEnableINTEL
-OpSubgroupAvcRefEvaluateWithSingleReferenceINTEL
-OpSubgroupAvcRefEvaluateWithDualReferenceINTEL
-OpSubgroupAvcRefEvaluateWithMultiReferenceINTEL
-OpSubgroupAvcRefEvaluateWithMultiReferenceInterlacedINTEL
-OpSubgroupAvcRefConvertToMceResultINTEL
-
-
-
-
-
OpSubgroupAvcSicInitializeINTEL
-OpSubgroupAvcSicConfigureSkcINTEL
-OpSubgroupAvcSicConfigureIpeLumaINTEL
-OpSubgroupAvcSicConfigureIpeLumaChromaINTEL
-OpSubgroupAvcSicGetMotionVectorMaskINTEL
-OpSubgroupAvcSicConvertToMcePayloadINTEL
-OpSubgroupAvcSicSetIntraLumaShapePenaltyINTEL
-OpSubgroupAvcSicSetIntraLumaModeCostFunctionINTEL
-OpSubgroupAvcSicSetIntraChromaModeCostFunctionINTEL
-OpSubgroupAvcSicSetBilinearFilterEnableINTEL
-OpSubgroupAvcSicSetSkcForwardTransformEnableINTEL
-OpSubgroupAvcSicSetBlockBasedRawSkipSadINTEL
-OpSubgroupAvcSicEvaluateIpeINTEL
-OpSubgroupAvcSicEvaluateWithSingleReferenceINTEL
-OpSubgroupAvcSicEvaluateWithDualReferenceINTEL
-OpSubgroupAvcSicEvaluateWithMultiReferenceINTEL
-OpSubgroupAvcSicEvaluateWithMultiReferenceInterlacedINTEL
-OpSubgroupAvcSicConvertToMceResultINTEL
-OpSubgroupAvcSicGetIpeLumaShapeINTEL
-OpSubgroupAvcSicGetBestIpeLumaDistortionINTEL
-OpSubgroupAvcSicGetBestIpeChromaDistortionINTEL
-OpSubgroupAvcSicGetPackedIpeLumaModesINTEL
-OpSubgroupAvcSicGetIpeChromaModeINTEL
-OpSubgroupAvcSicGetPackedSkcLumaCountThresholdINTEL
-OpSubgroupAvcSicGetPackedSkcLumaSumThresholdINTEL
-OpSubgroupAvcSicGetInterRawSadsINTEL
-
-
-
-
-
-
-

New Types

-
-
-

Opaque Types added under the SubgroupAvcMotionEstimationINTEL capability:

-
-
-
-
OpTypeVmeImageINTEL
-OpTypeAvcMcePayloadINTEL
-OpTypeAvcImePayloadINTEL
-OpTypeAvcRefPayloadINTEL
-OpTypeAvcSicPayloadINTEL
-OpTypeAvcMceResultINTEL
-OpTypeAvcImeResultINTEL
-OpTypeAvcImeResultSingleReferenceStreamoutINTEL
-OpTypeAvcImeResultDualReferenceStreamoutINTEL
-OpTypeAvcImeSingleReferenceStreaminINTEL
-OpTypeAvcImeDualReferenceStreaminINTEL
-OpTypeAvcRefResultINTEL
-OpTypeAvcSicResultINTEL
-
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
OpVmeImageINTEL

5699

OpTypeVmeImageINTEL

5700

OpTypeAvcImePayloadINTEL

5701

OpTypeAvcRefPayloadINTEL

5702

OpTypeAvcSicPayloadINTEL

5703

OpTypeAvcMcePayloadINTEL

5704

OpTypeAvcMceResultINTEL

5705

OpTypeAvcImeResultINTEL

5706

OpTypeAvcImeResultSingleReferenceStreamoutINTEL

5707

OpTypeAvcImeResultDualReferenceStreamoutINTEL

5708

OpTypeAvcImeSingleReferenceStreaminINTEL

5709

OpTypeAvcImeDualReferenceStreaminINTEL

5710

OpTypeAvcRefResultINTEL

5711

OpTypeAvcSicResultINTEL

5712

OpSubgroupAvcMceGetDefaultInterBaseMultiReferencePenaltyINTEL

5713

OpSubgroupAvcMceSetInterBaseMultiReferencePenaltyINTEL

5714

OpSubgroupAvcMceGetDefaultInterShapePenaltyINTEL

5715

OpSubgroupAvcMceSetInterShapePenaltyINTEL

5716

OpSubgroupAvcMceGetDefaultInterDirectionPenaltyINTEL

5717

OpSubgroupAvcMceSetInterDirectionPenaltyINTEL

5718

OpSubgroupAvcMceGetDefaultIntraLumaShapePenaltyINTEL

5719

OpSubgroupAvcMceGetDefaultInterMotionVectorCostTableINTEL

5720

OpSubgroupAvcMceGetDefaultHighPenaltyCostTableINTEL

5721

OpSubgroupAvcMceGetDefaultMediumPenaltyCostTableINTEL

5722

OpSubgroupAvcMceGetDefaultLowPenaltyCostTableINTEL

5723

OpSubgroupAvcMceSetMotionVectorCostFunctionINTEL

5724

OpSubgroupAvcMceGetDefaultIntraLumaModePenaltyINTEL

5725

OpSubgroupAvcMceGetDefaultNonDcLumaIntraPenaltyINTEL

5726

OpSubgroupAvcMceGetDefaultIntraChromaModeBasePenaltyINTEL

5727

OpSubgroupAvcMceSetAcOnlyHaarINTEL

5728

OpSubgroupAvcMceSetSourceInterlacedFieldPolarityINTEL

5729

OpSubgroupAvcMceSetSingleReferenceInterlacedFieldPolarityINTEL

5730

OpSubgroupAvcMceSetDualReferenceInterlacedFieldPolaritiesINTEL

5731

OpSubgroupAvcMceConvertToImePayloadINTEL

5732

OpSubgroupAvcMceConvertToImeResultINTEL

5733

OpSubgroupAvcMceConvertToRefPayloadINTEL

5734

OpSubgroupAvcMceConvertToRefResultINTEL

5735

OpSubgroupAvcMceConvertToSicPayloadINTEL

5736

OpSubgroupAvcMceConvertToSicResultINTEL

5737

OpSubgroupAvcMceGetMotionVectorsINTEL

5738

OpSubgroupAvcMceGetInterDistortionsINTEL

5739

OpSubgroupAvcMceGetBestInterDistortionsINTEL

5740

OpSubgroupAvcMceGetInterMajorShapeINTEL

5741

OpSubgroupAvcMceGetInterMinorShapeINTEL

5742

OpSubgroupAvcMceGetInterDirectionsINTEL

5743

OpSubgroupAvcMceGetInterMotionVectorCountINTEL

5744

OpSubgroupAvcMceGetInterReferenceIdsINTEL

5745

OpSubgroupAvcMceGetInterReferenceInterlacedFieldPolaritiesINTEL

5746

OpSubgroupAvcImeInitializeINTEL

5747

OpSubgroupAvcImeSetSingleReferenceINTEL

5748

OpSubgroupAvcImeSetDualReferenceINTEL

5749

OpSubgroupAvcImeRefWindowSizeINTEL

5750

OpSubgroupAvcImeAdjustRefOffsetINTEL

5751

OpSubgroupAvcImeConvertToMcePayloadINTEL

5752

OpSubgroupAvcImeSetMaxMotionVectorCountINTEL

5753

OpSubgroupAvcImeSetUnidirectionalMixDisableINTEL

5754

OpSubgroupAvcImeSetEarlySearchTerminationThresholdINTEL

5755

OpSubgroupAvcImeSetWeightedSadINTEL

5756

OpSubgroupAvcImeEvaluateWithSingleReferenceINTEL

5757

OpSubgroupAvcImeEvaluateWithDualReferenceINTEL

5758

OpSubgroupAvcImeEvaluateWithSingleReferenceStreaminINTEL

5759

OpSubgroupAvcImeEvaluateWithDualReferenceStreaminINTEL

5760

OpSubgroupAvcImeEvaluateWithSingleReferenceStreamoutINTEL

5761

OpSubgroupAvcImeEvaluateWithDualReferenceStreamoutINTEL

5762

OpSubgroupAvcImeEvaluateWithSingleReferenceStreaminoutINTEL

5763

OpSubgroupAvcImeEvaluateWithDualReferenceStreaminoutINTEL

5764

OpSubgroupAvcImeConvertToMceResultINTEL

5765

OpSubgroupAvcImeGetSingleReferenceStreaminINTEL

5766

OpSubgroupAvcImeGetDualReferenceStreaminINTEL

5767

OpSubgroupAvcImeStripSingleReferenceStreamoutINTEL

5768

OpSubgroupAvcImeStripDualReferenceStreamoutINTEL

5769

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL

5770

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeDistortionsINTEL

5771

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeReferenceIdsINTEL

5772

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeMotionVectorsINTEL

5773

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeDistortionsINTEL

5774

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeReferenceIdsINTEL

5775

OpSubgroupAvcImeGetBorderReachedINTEL

5776

OpSubgroupAvcImeGetTruncatedSearchIndicationINTEL

5777

OpSubgroupAvcImeGetUnidirectionalEarlySearchTerminationINTEL

5778

OpSubgroupAvcImeGetWeightingPatternMinimumMotionVectorINTEL

5779

OpSubgroupAvcImeGetWeightingPatternMinimumDistortionINTEL

5780

OpSubgroupAvcFmeInitializeINTEL

5781

OpSubgroupAvcBmeInitializeINTEL

5782

OpSubgroupAvcRefConvertToMcePayloadINTEL

5783

OpSubgroupAvcRefSetBidirectionalMixDisableINTEL

5784

OpSubgroupAvcRefSetBilinearFilterEnableINTEL

5785

OpSubgroupAvcRefEvaluateWithSingleReferenceINTEL

5786

OpSubgroupAvcRefEvaluateWithDualReferenceINTEL

5787

OpSubgroupAvcRefEvaluateWithMultiReferenceINTEL

5788

OpSubgroupAvcRefEvaluateWithMultiReferenceInterlacedINTEL

5789

OpSubgroupAvcRefConvertToMceResultINTEL

5790

OpSubgroupAvcSicInitializeINTEL

5791

OpSubgroupAvcSicConfigureSkcINTEL

5792

OpSubgroupAvcSicConfigureIpeLumaINTEL

5793

OpSubgroupAvcSicConfigureIpeLumaChromaINTEL

5794

OpSubgroupAvcSicGetMotionVectorMaskINTEL

5795

OpSubgroupAvcSicConvertToMcePayloadINTEL

5796

OpSubgroupAvcSicSetIntraLumaShapePenaltyINTEL

5797

OpSubgroupAvcSicSetIntraLumaModeCostFunctionINTEL

5798

OpSubgroupAvcSicSetIntraChromaModeCostFunctionINTEL

5799

OpSubgroupAvcSicSetBilinearFilterEnableINTEL

5800

OpSubgroupAvcSicSetSkcForwardTransformEnableINTEL

5801

OpSubgroupAvcSicSetBlockBasedRawSkipSadINTEL

5802

OpSubgroupAvcSicEvaluateIpeINTEL

5803

OpSubgroupAvcSicEvaluateWithSingleReferenceINTEL

5804

OpSubgroupAvcSicEvaluateWithDualReferenceINTEL

5805

OpSubgroupAvcSicEvaluateWithMultiReferenceINTEL

5806

OpSubgroupAvcSicEvaluateWithMultiReferenceInterlacedINTEL

5807

OpSubgroupAvcSicConvertToMceResultINTEL

5808

OpSubgroupAvcSicGetIpeLumaShapeINTEL

5809

OpSubgroupAvcSicGetBestIpeLumaDistortionINTEL

5810

OpSubgroupAvcSicGetBestIpeChromaDistortionINTEL

5811

OpSubgroupAvcSicGetPackedIpeLumaModesINTEL

5812

OpSubgroupAvcSicGetIpeChromaModeINTEL

5813

OpSubgroupAvcSicGetPackedSkcLumaCountThresholdINTEL

5814

OpSubgroupAvcSicGetPackedSkcLumaSumThresholdINTEL

5815

OpSubgroupAvcSicGetInterRawSadsINTEL

5816

-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-

Terms

-
-

Modify Section 2.2, Terms, adding to the numbered list a new sub-section 2.2.X "AVC Motion Estimation":

-
-
-

The following terms, acronyms and definitions are used in and provide context for the AVC motion estimation group instructions.

-
-
-

Macro-block (MB)
-An image is partitioned into macro-blocks of size 16x16 pixels. It is the basic unit of processing for AVC video motion estimation operations.

-
-
-

Shape
-A MB may be partitioned into sub-blocks of one of the major shapes. A sub-block with an 8x8 major shape may be further independently partitioned into sub-blocks of one of the minor shapes. It is represented by predefined shape values.

-
-
-

Major Shapes
-Shapes of 16x16, 16x8, 8x16, or 8x8 partitions of a MB. A 16x16 major shape merely indicates that the MB was not further partitioned.

-
-
-

Minor Shapes
-Shapes of 8x8, 8x4, 4x8, or 4x4 sub-partitions of an 8x8 partition. An 8x8 minor shape merely indicates that the 8x8 major partition was not further sub-partitioned.

-
-
-
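The partitioning arithmetic implied by the major and minor shape definitions can be sketched as follows. This is only an illustration: the helper name `partition_count` and the shape lists are assumptions made for the example and are not defined by the extension.

```python
# Shapes as (width, height) in pixels, taken from the definitions above.
MAJOR_SHAPES = [(16, 16), (16, 8), (8, 16), (8, 8)]
MINOR_SHAPES = [(8, 8), (8, 4), (4, 8), (4, 4)]


def partition_count(parent_w, parent_h, w, h):
    """Number of (w x h) blocks that tile a (parent_w x parent_h) region."""
    if parent_w % w or parent_h % h:
        raise ValueError("shape does not tile the parent region")
    return (parent_w // w) * (parent_h // h)


# Blocks per 16x16 macro-block for each major shape.
major_counts = [partition_count(16, 16, w, h) for w, h in MAJOR_SHAPES]

# Sub-blocks per 8x8 partition for each minor shape.
minor_counts = [partition_count(8, 8, w, h) for w, h in MINOR_SHAPES]

# Finest partitioning: four 8x8 partitions, each split into four 4x4 blocks.
max_blocks_per_mb = partition_count(16, 16, 8, 8) * partition_count(8, 8, 4, 4)
```

Under these definitions a macro-block contains at most 16 blocks, reached when every 8x8 partition is split into 4x4 sub-blocks.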

Block
-A sub-block of a MB with one of the major or minor shapes.

-
-
-

Reference Image
-An image (typically from the previously decoded buffer in an encoder pipeline) from which motion estimation predictions are made.

-
-
-

Source Image
-The current image for which motion estimation predictions are made.

-
-
-

Source Macro-block Offset
-The 2D offset of the top left corner of the source MB in pixel units. It is represented by a pair of unsigned 16-bit integers.

-
-
-

Reference Window Offset
-The 2D offset of the top left corner of the reference search window w.r.t. the top left corner of the source MB in pixel units. It is represented by a pair of signed 16-bit integers in the range [-2048, 2047].

-
-
-

Reference identifier
-Reference identifiers are associated with pairs of forward (L0) / backward (L1) reference image parameters. Up to 16 pairs of reference image parameters are permitted, with the permitted values of reference identifiers ranging from 0 to 15. The reference identifiers are assigned in increasing order, following the order in which the reference image parameter pairs are declared in the kernel parameter operand list.

-
-
-

Motion Vector (MV)
-A 2D vector used for inter motion estimation that provides an offset from the top left corner of a block in the source image to the top left corner of an identically sized block in the reference image. Generally it is used to represent the best match of a block in the reference image to a block in the source image. The best match is determined as the block minimizing the distortion. MVs are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset. It is represented by a pair of signed 16-bit integers.

-
-
-

Packed Motion Vector
-A motion vector represented as a packed 32-bit unsigned integer. The lower 16 bits contain the X coordinate and the upper 16 bits contain the Y coordinate.

-
-
-
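As a rough illustration of the packed motion vector layout, the sketch below packs and unpacks the two components. The function names `pack_mv`, `unpack_mv` and `qpel_to_pixels` are hypothetical, and the sketch assumes the signed 16-bit components are stored in two's-complement form, which the text does not state explicitly.

```python
def pack_mv(x, y):
    """Pack signed 16-bit QPEL components: X in the low half, Y in the high half."""
    if not (-32768 <= x <= 32767 and -32768 <= y <= 32767):
        raise ValueError("components must fit in signed 16 bits")
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def _to_signed16(value):
    # Reinterpret an unsigned 16-bit value as two's-complement signed.
    return value - 0x10000 if value & 0x8000 else value


def unpack_mv(packed):
    """Return the (x, y) components of a packed 32-bit motion vector."""
    x = _to_signed16(packed & 0xFFFF)
    y = _to_signed16((packed >> 16) & 0xFFFF)
    return x, y


def qpel_to_pixels(component):
    """Convert a QPEL component (2 fractional LSBs) to pixel units."""
    return component / 4.0
```

For example, a component value of 6 in QPEL units corresponds to an offset of 1.5 pixels.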

Bidirectional Motion Vector (BMV)
-A pair of MVs for the forward(L0) and backward(L1) images. Depending on how the VME operation is configured only the forward or the backward MV or both may be valid.

-
-
-

Packed Bidirectional Motion Vector
-A bidirectional MV represented as a packed 64-bit unsigned integer. The lower 32-bits contain the forward packed MV, and the upper 32-bits contain the backward packed MV.

-
-
-
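The 64-bit layout of a packed bidirectional motion vector can be sketched in the same spirit. The names `pack_bmv` and `unpack_bmv` are hypothetical; the inputs are assumed to be packed 32-bit MVs as defined above.

```python
def pack_bmv(forward_packed, backward_packed):
    """Combine two packed 32-bit MVs: forward in the low half, backward in the high half."""
    for value in (forward_packed, backward_packed):
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("packed MV must fit in 32 unsigned bits")
    return (backward_packed << 32) | forward_packed


def unpack_bmv(packed):
    """Return (forward_packed, backward_packed) from a packed 64-bit value."""
    return packed & 0xFFFFFFFF, (packed >> 32) & 0xFFFFFFFF
```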

Sum Of Absolute Difference (SAD)
-The sum of absolute differences of every full/sub-pixel location in the source block w.r.t every corresponding full/sub pixel in the reference block as specified by a given MV. The sum of absolute differences may be optionally Haar transform adjusted. It is represented by an unsigned 16-bit integer value.

-
-
-
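A minimal sketch of the plain (not Haar-adjusted) SAD computation follows. It assumes blocks are given as equally sized lists of rows of 8-bit sample values; the function name `sad` and this input representation are choices made for the example, not part of the extension.

```python
def sad(source_block, reference_block):
    """Sum of absolute differences between two equally sized pixel blocks."""
    if len(source_block) != len(reference_block):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for src_row, ref_row in zip(source_block, reference_block):
        if len(src_row) != len(ref_row):
            raise ValueError("rows must have the same length")
        total += sum(abs(s - r) for s, r in zip(src_row, ref_row))
    return total


# Worst case for a 16x16 block of 8-bit samples.
black = [[0] * 16 for _ in range(16)]
white = [[255] * 16 for _ in range(16)]
worst_case = sad(black, white)
```

With 8-bit samples the worst case for a 16x16 block is 16 * 16 * 255 = 65280, which still fits in the unsigned 16-bit representation mentioned above.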

Haar Transform (HAAR)
-A simple wavelet transform that is used to refine the distortion measure of SAD. The per pixel difference goes through a 4x4 Haar transform. Then the SAD is replaced by the sum of the absolute values of the transform domain coefficients in the distortion. Haar transform is used as a coarse estimation of the integer transform.

-
-
-

Motion Vector Cost Center (CC)
-A MV has an associated cost w.r.t a cost center coordinate. The further away from the cost center, the larger will be the cost associated with the MV. Cost centers are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset.

-
-
-

Motion Vector Cost Center Delta (CCD)
-The 2D offset of the cost center relative to the top left corner of the source MB. Cost center deltas are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset. It is represented by a pair of signed 16-bit integers.

-
-
-

Packed Motion Vector Cost Center Delta
-A motion vector cost center delta represented as a packed 32-bit unsigned integer. The lower 16 bits contain the X coordinate and the upper 16 bits contain the Y coordinate.

-
-
-

Bidirectional Motion Vector Cost Center Delta
-A pair of cost center deltas for the forward and backward images.

-
-
-

Packed Bidirectional Motion Vector Cost Center Delta
-A packed bidirectional motion vector cost center delta represented as a 64-bit unsigned integer. The lower 32-bits contain the forward packed CCD, and the upper 32-bits contain the backward packed CCD.

-
-
-

Motion Vector Cost
-The MV cost is determined using a cost function described by a cost table that is indexed based on power-of-two distances from the user specified cost center, with a user specified precision (or unit) of the distances from the cost center.

-
-
-

U4U4 Byte Format
-Represents a value of (B<<S), where B, called base, is the 4 bit LSB of the byte and S, called shift, is the 4 bit MSB of the byte.

-
-
-
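The U4U4 decoding rule can be written out as a short sketch. The function name `decode_u4u4` is hypothetical.

```python
def decode_u4u4(byte):
    """Decode a U4U4 byte: value = base << shift."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("U4U4 value must be a single byte")
    base = byte & 0x0F          # 4 least significant bits
    shift = (byte >> 4) & 0x0F  # 4 most significant bits
    return base << shift
```

For instance, the byte 0x2A has base 10 and shift 2, so it represents the value 40.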

Motion Vector Cost Table
-A table which specifies the cost penalties at 8 control points. The first 7 control points represent the distances from cost center at powers-of-two locations (2^0 to 2^6), and the last control point represents the base penalty for distances that are out of range of the cost function curve. It is represented by a packed array of 8 U4U4 unsigned integer values.

-
-
-
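The sketch below decodes a motion vector cost table into its control points. It assumes the table is supplied as 8 bytes with entry 0 corresponding to distance 2^0; that ordering, the function name `decode_cost_table`, and the example byte values are assumptions for illustration only. How costs are derived between control points is not covered here.

```python
def decode_cost_table(table):
    """Decode 8 U4U4 bytes into ({distance: cost}, out_of_range_penalty)."""
    if len(table) != 8:
        raise ValueError("cost table must contain exactly 8 U4U4 entries")
    # Each U4U4 byte encodes base << shift (base: low nibble, shift: high nibble).
    costs = [(b & 0x0F) << (b >> 4) for b in bytes(table)]
    control_points = {2 ** i: costs[i] for i in range(7)}
    return control_points, costs[7]


# Illustrative values only; not a default table from any implementation.
example_table = bytes([0x00, 0x01, 0x02, 0x13, 0x14, 0x25, 0x26, 0x3F])
points, out_of_range = decode_cost_table(example_table)
```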

Motion Vector Cost Precision
-The precision (or unit) of the control points in the MV cost table. It can be used to control the precision and range of the cost function. It is represented by pre-defined cost precision values.

-
-
-

Shape Cost
-The cost associated with encoding a particular partition shape using inter or intra prediction. It is represented by a packed array of 10 U4U4 unsigned integer values.

-
-
-

Distortion
-The distortion is the sum of SAD, MV cost, shape cost and multi-reference cost for inter estimation, and the sum of SAD, mode cost, shape cost and non-dc cost for intra estimation. It is a measure of the cost of encoding a block and is represented by an unsigned 16-bit integer value.

-
-
-

Intra Mode
-An intra-prediction angle which provides a prediction for the current block from the edge pixels in its neighboring blocks. It is represented by pre-defined intra mode values.

-
-
-

Intra Mode Cost
-The cost associated with a computed intra mode for a block w.r.t a predicted intra mode based on the computed intra modes for its neighboring blocks.

-
-
-

Mode
-The decision whether the inter-prediction or intra-prediction minimizes distortion of a given MB.

-
-
-

Search unit (SU)
-The basic unit of searching. Possible reference search locations are grouped in a predefined 4x4 pattern, and all locations within the same group must be completely chosen or completely skipped. These predefined groups are called search units.

-
-
-

Search Path (SP)
-The path taken during searching in a reference window. The steps taken in a search path are in units of SUs. The search path must lie within the defined search window.

-
-
-

Luma
-Luma refers to either the Y-plane of a NV12 image or a regular image with the Image Channel Order and Image Channel Data Type restricted as R and UnormInt8.

-
-
-

Chroma
-Chroma refers to the UV-plane of a NV12 image.

-
-
-

Search Window (SW)
-The search area that will be covered during searching. The area of the search window is limited to 2K luma pixels.

-
-
-

Search Window Configuration
-The configuration of a search window which is a combination of the search path and search window.

-
-
-

The predefined search window configurations are:

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - -

EXHAUSTIVE

48x40 SW with exhaustive single reference search (or 32x32 dual SW for exhaustive dual-reference search); an exhaustive search means that all SUs within the search window are searched in a spiral pattern with the search center being the middle of the search window.

SMALL

28x28 SW with exhaustive search

TINY

24x24 SW with exhaustive search

EXTRA TINY

20x20 SW with exhaustive search

DIAMOND

48x40 SW with diamond single reference search (or 32x32 dual SW for diamond dual-reference search); a diamond pattern search path is used for the first 16 (or 7 per reference for dual reference search) SUs, and then gradient based searching is used for up to a maximum of 57 search units.

LARGE DIAMOND

48x40 SW with large diamond single reference search (or 32x32 dual SW for large diamond dual-reference search); a diamond pattern search path is used for the first 32 (or 10 per reference for dual reference search) SUs, and then gradient based searching is used for up to a maximum of 57 search units.

-
-
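The single-reference window sizes listed above can be checked against the search window area limit with a short sketch. It reads "2K luma pixels" as 2048, which is an assumption, and the names `SW_LIMIT_LUMA_PIXELS`, `SINGLE_REFERENCE_WINDOWS` and `fits_limit` are hypothetical.

```python
# Assumption: "2K luma pixels" is interpreted as 2048 pixels of area.
SW_LIMIT_LUMA_PIXELS = 2048

# Single-reference search window sizes (width, height) from the table above.
SINGLE_REFERENCE_WINDOWS = {
    "EXHAUSTIVE": (48, 40),
    "SMALL": (28, 28),
    "TINY": (24, 24),
    "EXTRA TINY": (20, 20),
    "DIAMOND": (48, 40),
    "LARGE DIAMOND": (48, 40),
}


def fits_limit(width, height):
    """True if a search window of the given size stays within the area limit."""
    return width * height <= SW_LIMIT_LUMA_PIXELS


areas = {name: w * h for name, (w, h) in SINGLE_REFERENCE_WINDOWS.items()}
all_fit = all(fits_limit(w, h) for w, h in SINGLE_REFERENCE_WINDOWS.values())
```

Under this reading, the largest predefined window (48x40) covers 1920 luma pixels.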

Inter Estimation
-The process of determining motion vectors and shapes that best describe the transformation from previously decoded 2D images in a video sequence to the currently processed image.

-
-
-

Intra-Prediction Estimation (IPE)
-The process of determining prediction angles and shapes that best describe the transformation from neighboring MBs in an image to the currently processed MB in the same image.

-
-
-

Luma Mode
-The prediction angle returned by IPE for the luma component for a block. It is represented by an unsigned 8-bit integer with the upper 4 bits set to zero.

-
-
-

Integer Motion Estimation (IME)
-Inter-motion estimation in integer pixel resolution.

-
-
-

Fractional Motion Estimation (FME)
-Inter-motion estimation in sub-pixel resolution. The result of integer motion estimation on a reference image is used to perform fractional refinement.

-
-
-

Bidirectional Motion Estimation (BME)
-The process of determining if the bi-directional prediction minimizes the distortion w.r.t. unidirectional prediction. The results of IME on forward (L0) and backward (L1) reference images are used to perform bi-directional refinement. BME can be performed in integer or sub-pixel resolution. If performed in sub-pixel resolution, an implicit FME operation is done before performing the BME.

-
-
-

Refinement (REF)
-A FME and/or BME refinement operation.

-
-
-

Skip/Spot Check (SKC)
-The operation determining the distortion associated with a given (uni or bidirectional) MV in a reference image(s) w.r.t a source image.

-
-
-

Skip and Intra Check (SIC)
-The process of performing both SKC and IPE in the same operation.

-
-
-

Motion Check or Estimation (MCE)
-A generic IME, REF, or SIC operation.

-
-
-

Forward Transform (FT)
-An 8x8 or 4x4 integer transform used to transform the residual to the frequency domain.

-
-
-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
- ------ - - - - - - - - - - - - - - - - - - - - -

5696

SubgroupAvcMotionEstimationINTEL

Groups

SPV_INTEL_device_side_avc_motion_estimation

5697

SubgroupAvcMotionEstimationIntraINTEL

SubgroupAvcMotionEstimationINTEL, SubgroupImageMediaBlockIOINTEL

SPV_INTEL_device_side_avc_motion_estimation

5698

SubgroupAvcMotionEstimationChromaINTEL

SubgroupAvcMotionEstimationIntraINTEL

SPV_INTEL_device_side_avc_motion_estimation

-
-
-
-
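One way to read the capability rows above is as a dependency chain, where each capability lists the capabilities it relies on. The sketch below computes the transitive set under that reading. The mapping is an interpretation of the flattened table, and the names `DEPENDS_ON` and `implied_capabilities` are hypothetical.

```python
# Dependencies as read from the capability table (an interpretation).
DEPENDS_ON = {
    "SubgroupAvcMotionEstimationINTEL": ["Groups"],
    "SubgroupAvcMotionEstimationIntraINTEL": [
        "SubgroupAvcMotionEstimationINTEL",
        "SubgroupImageMediaBlockIOINTEL",
    ],
    "SubgroupAvcMotionEstimationChromaINTEL": [
        "SubgroupAvcMotionEstimationIntraINTEL",
    ],
}


def implied_capabilities(capability):
    """Return every capability reachable through the dependency table."""
    seen = set()
    stack = [capability]
    while stack:
        current = stack.pop()
        for dependency in DEPENDS_ON.get(current, []):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


chroma_implies = implied_capabilities("SubgroupAvcMotionEstimationChromaINTEL")
```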

Types

-
-

Modify Section 2.2.2, Types, amending the definition of Sampler as follows:

-
-
-

Sampler: There are two categories of samplers: texture samplers and media samplers. Texture samplers describe settings for how to access, filter, or sample an image; these come either from literal declarations of settings or from an opaque reference to externally bound settings. Media samplers describe settings for motion estimation on an image; these come only from literal declarations of settings. Refer to Section 3.32.21, Group Instructions, for a detailed description of the instructions that use this type. In general, the word "sampler" by itself refers to a texture sampler. A media sampler is explicitly referred to as a "media sampler". A sampler does not include an image.

-
-
-

Modify Section 2.2.2, Types, adding "VME image" type after the definition of Sampled Image as follows:

-
-
-

VME Image: An image combined with a sampler, enabling VME accesses of the image’s contents.

-
-
-

Modify Section 2.2.2, Types, adding the following to the Opaque types:

-
-
-
    -
  • -

    OpTypeVmeImageINTEL

    -
  • -
  • -

    OpTypeAvcMcePayloadINTEL

    -
  • -
  • -

    OpTypeAvcImePayloadINTEL

    -
  • -
  • -

    OpTypeAvcRefPayloadINTEL

    -
  • -
  • -

    OpTypeAvcSicPayloadINTEL

    -
  • -
  • -

    OpTypeAvcMceResultINTEL

    -
  • -
  • -

    OpTypeAvcImeResultINTEL

    -
  • -
  • -

    OpTypeAvcImeResultSingleReferenceStreamoutINTEL

    -
  • -
  • -

    OpTypeAvcImeResultDualReferenceStreamoutINTEL

    -
  • -
  • -

    OpTypeAvcImeSingleReferenceStreaminINTEL

    -
  • -
  • -

    OpTypeAvcImeDualReferenceStreaminINTEL

    -
  • -
  • -

    OpTypeAvcRefResultINTEL

    -
  • -
  • -

    OpTypeAvcSicResultINTEL

    -
  • -
-
-
-

Modify Section 2.8, Types and Variables, adding the following to the third paragraph:

-
-
-

To perform motion estimation operations, a value of type OpTypeVmeImageINTEL is used, which contains both an image and a media sampler. Such a value can only be formed within a SPIR-V module, by combining an independent image and an independent sampler. Furthermore, its OpTypeImage must have a Dim of 2D.

-
-
-

Modify Section 3.32.6, Type Declaration Instructions, amending the description of OpTypeSampler as follows:

-
- --- - - - - - -
-

OpTypeSampler
-
-Declare the sampler type. Consumed by OpSampledImage or OpVmeImageINTEL. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

26

Result <id>

-
-

Modify Section 3.32.6, Type Declaration Instructions, adding the description of OpTypeVmeImageINTEL as follows:

-
- --- - - - - - -
-

OpTypeVmeImageINTEL
-
-Declare a VME image type, the Result Type of OpVmeImageINTEL. This type is opaque: values of this type have no defined physical size or bit pattern.

-
-
-

Image Type must be an OpTypeImage. It is the type of the image in the combined sampler and image type.

-
- ------ - - - - - - - - -

3

5700

Result <id>

<id> Image Type

-
-
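The operand layout above (word count 3, opcode 5700, Result <id>, Image Type <id>) maps onto the standard SPIR-V instruction encoding, in which the first word holds the word count in its high 16 bits and the opcode in its low 16 bits. The following sketch builds the three words; the function name `encode_op_type_vme_image` and the example ids are hypothetical.

```python
OP_TYPE_VME_IMAGE_INTEL = 5700  # opcode from the table above
WORD_COUNT = 3                  # first word + Result <id> + Image Type <id>


def encode_op_type_vme_image(result_id, image_type_id):
    """Return the 32-bit words of an OpTypeVmeImageINTEL instruction."""
    for word in (result_id, image_type_id):
        if not 0 < word <= 0xFFFFFFFF:
            raise ValueError("ids must be non-zero 32-bit values")
    # SPIR-V packs the word count in the high half and the opcode in the low half.
    first_word = (WORD_COUNT << 16) | OP_TYPE_VME_IMAGE_INTEL
    return [first_word, result_id, image_type_id]


# Example ids chosen arbitrarily for illustration.
words = encode_op_type_vme_image(result_id=7, image_type_id=3)
```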

Modify Section 3.32.6, Type Declaration Instructions, adding to the end of the list of type declarations:

-
- --- - - - - - -
-

OpTypeAvcMcePayloadINTEL
-
-Declare the Operand and/or Result Type of an AVC MCE Group Instruction. Consumed by AVC MCE Group Instruction and represents the payload for a basic IME/REF/SIC operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5704

Result <id>

- --- - - - - - -
-

OpTypeAvcImePayloadINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC MCE Group Instruction and represents the payload for a basic IME operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5701

Result <id>

- --- - - - - - -
-

OpTypeAvcRefPayloadINTEL
-
-Declare the Operand and/or Result Type of an AVC REF Group Instruction. Consumed by AVC REF Group Instruction and represents the payload for a basic REF operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5702

Result <id>

- --- - - - - - -
-

OpTypeAvcSicPayloadINTEL
-
-Declare the Operand and/or Result Type of an AVC SIC Group Instruction. Consumed by AVC SIC Group Instruction and represents the payload for a basic SIC operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5703

Result <id>

- --- - - - - - -
-

OpTypeAvcMceResultINTEL
-
-Declare the Operand and/or Result Type of an AVC MCE Group Instruction. Consumed by AVC MCE Group Instruction and represents the evaluation results of a basic IME/REF/SIC operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5705

Result <id>

- --- - - - - - -
-

OpTypeAvcImeResultINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC IME Group Instruction and represents the evaluation of a basic IME operation not using the streamin/streamout functionality. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5706

Result <id>

- --- - - - - - -
-

OpTypeAvcImeResultSingleReferenceStreamoutINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC IME Group Instruction and represents the additional results from the result of an IME evaluation using the streamout functionality that may be streamed-in in a subsequent IME streamin call. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5707

Result <id>

- --- - - - - - -
-

OpTypeAvcImeResultDualReferenceStreamoutINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC IME Group Instruction and represents the additional results from the result of an IME evaluation using the streamout functionality that may be streamed-in in a subsequent IME streamin call. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5708

Result <id>

- --- - - - - - -
-

OpTypeAvcImeSingleReferenceStreaminINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC IME Group Instruction and represents the additional results from the result of an IME evaluation using the streamout functionality that may be streamed-in in a subsequent IME streamin call. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5709

Result <id>

- --- - - - - - -
-

OpTypeAvcImeDualReferenceStreaminINTEL
-
-Declare the Operand and/or Result Type of an AVC IME Group Instruction. Consumed by AVC IME Group Instruction and represents the additional results from the result of an IME evaluation using the streamout functionality that may be streamed-in in a subsequent IME streamin call. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5710

Result <id>

- --- - - - - - -
-

OpTypeAvcRefResultINTEL
-
-Declare the Operand and/or Result Type of an AVC REF Group Instruction. Consumed by AVC REF Group Instruction and represents the evaluation of a basic REF operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5711

Result <id>

- --- - - - - - -
-

OpTypeAvcSicResultINTEL
-
-Declare the Operand and/or Result Type of an AVC SIC Group Instruction. Consumed by AVC SIC Group Instruction and represents the evaluation of a basic SIC operation. This type is opaque: values of this type have no defined physical size or bit pattern.

-
- ----- - - - - - - - -

2

5712

Result <id>

-
-
-
-

Binary Form

-
-

Modify Section 3, Binary Form, adding to the numbered list the following sub-sections:

-
-
-

Interlaced image field polarity values:

-
- ----- - - - - - - - - - - - - - - - - - - - -
Field polarity values | Enabling Capabilities

0x0

AVC_ME_INTERLACED_SCAN_TOP_FIELD_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_INTERLACED_SCAN_BOTTOM_FIELD_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter macro-block major shape values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Major shape values | Enabling Capabilities

0x0

AVC_ME_MAJOR_16x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_MAJOR_16x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_MAJOR_8x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_MAJOR_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter macro-block minor shape values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Minor shape values | Enabling Capabilities

0x0

AVC_ME_MINOR_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_MINOR_8x4_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_MINOR_4x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_MINOR_4x4_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter macro-block major direction values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - -
Major direction values | Enabling Capabilities

0x0

AVC_ME_MAJOR_FORWARD_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_MAJOR_BACKWARD_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_MAJOR_BIDIRECTIONAL_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter (IME) partition mask values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Partition mask values | Enabling Capabilities

0x0

AVC_ME_PARTITION_MASK_ALL_INTEL

SubgroupAvcMotionEstimationINTEL

0x7E

AVC_ME_PARTITION_MASK_16x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x7D

AVC_ME_PARTITION_MASK_16x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x7B

AVC_ME_PARTITION_MASK_8x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x77

AVC_ME_PARTITION_MASK_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x6F

AVC_ME_PARTITION_MASK_8x4_INTEL

SubgroupAvcMotionEstimationINTEL

0x5F

AVC_ME_PARTITION_MASK_4x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x3F

AVC_ME_PARTITION_MASK_4x4_INTEL

SubgroupAvcMotionEstimationINTEL

-
-
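The partition mask values above appear to follow a pattern: each per-shape mask clears exactly one bit of a 7-bit field, and `AVC_ME_PARTITION_MASK_ALL_INTEL` clears all of them. The sketch below checks that reading against the table. The 7-bit field width, the `cleared_bit` and `combine` helpers, and the idea that masks are combined with a bitwise AND are assumptions made for illustration, not statements from the extension text.

```python
# Partition mask values copied from the table above.
PARTITION_MASKS = {
    "16x16": 0x7E, "16x8": 0x7D, "8x16": 0x7B, "8x8": 0x77,
    "8x4": 0x6F, "4x8": 0x5F, "4x4": 0x3F,
}
ALL_BITS = 0x7F  # assumption: the mask field is 7 bits wide


def cleared_bit(mask):
    """Return the index of the single bit that a per-shape mask leaves clear."""
    diff = ALL_BITS & ~mask
    if diff == 0 or diff & (diff - 1):
        raise ValueError("expected exactly one clear bit")
    return diff.bit_length() - 1


def combine(*shapes):
    """Hypothetical helper: AND per-shape masks so each named shape's bit is clear."""
    mask = ALL_BITS
    for shape in shapes:
        mask &= PARTITION_MASKS[shape]
    return mask


for name, value in PARTITION_MASKS.items():
    print(name, hex(value), "clears bit", cleared_bit(value))
```

Under these assumptions, combining all seven shapes yields 0x0, which matches the value listed for `AVC_ME_PARTITION_MASK_ALL_INTEL`.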

Slice type values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - -
Slice type values | Enabling Capabilities

0x0

AVC_ME_SLICE_TYPE_PRED_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_SLICE_TYPE_BPRED_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_SLICE_TYPE_INTRA_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Search window configuration:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Search window configuration values | Enabling Capabilities

0x0

AVC_ME_SEARCH_WINDOW_EXHAUSTIVE_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_SEARCH_WINDOW_SMALL_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_SEARCH_WINDOW_TINY_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_SEARCH_WINDOW_EXTRA_TINY_INTEL

SubgroupAvcMotionEstimationINTEL

0x4

AVC_ME_SEARCH_WINDOW_DIAMOND_INTEL

SubgroupAvcMotionEstimationINTEL

0x5

AVC_ME_SEARCH_WINDOW_LARGE_DIAMOND_INTEL

SubgroupAvcMotionEstimationINTEL

0x6

AVC_ME_SEARCH_WINDOW_RESERVED0_INTEL

SubgroupAvcMotionEstimationINTEL

0x7

AVC_ME_SEARCH_WINDOW_RESERVED1_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

SAD adjustment mode:

-
- ----- - - - - - - - - - - - - - - - - - - - -
SAD adjustment values | Enabling Capabilities

0x0

AVC_ME_SAD_ADJUST_MODE_NONE_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_SAD_ADJUST_MODE_HAAR_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Pixel resolution values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - -
Pixel resolution values | Enabling Capabilities

0x0

AVC_ME_SUBPIXEL_MODE_INTEGER_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_SUBPIXEL_MODE_HPEL_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_SUBPIXEL_MODE_QPEL_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Cost precision values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cost precision values | Enabling Capabilities

0x0

AVC_ME_COST_PRECISION_QPEL_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_COST_PRECISION_HPEL_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_COST_PRECISION_PEL_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_COST_PRECISION_DPEL_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter bidirectional weights:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Inter bidirectional weight values | Enabling Capabilities

0x10

AVC_ME_BIDIR_WEIGHT_QUARTER_INTEL

SubgroupAvcMotionEstimationINTEL

0x15

AVC_ME_BIDIR_WEIGHT_THIRD_INTEL

SubgroupAvcMotionEstimationINTEL

0x20

AVC_ME_BIDIR_WEIGHT_HALF_INTEL

SubgroupAvcMotionEstimationINTEL

0x2B

AVC_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL

SubgroupAvcMotionEstimationINTEL

0x30

AVC_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter border reached values

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Inter border reached values | Enabling Capabilities

0x0

AVC_ME_BORDER_REACHED_LEFT_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_BORDER_REACHED_RIGHT_INTEL

SubgroupAvcMotionEstimationINTEL

0x4

AVC_ME_BORDER_REACHED_TOP_INTEL

SubgroupAvcMotionEstimationINTEL

0x8

AVC_ME_BORDER_REACHED_BOTTOM_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Intra macro-block shape values

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - -
Intra macro-block shape values | Enabling Capabilities

0x0

AVC_ME_INTRA_16x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x1

AVC_ME_INTRA_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_INTRA_4x4_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter skip block partition type:

-
- ----- - - - - - - - - - - - - - - - - - - - -
Inter skip block partition type values | Enabling Capabilities

0x0

AVC_ME_SKIP_BLOCK_PARTITION_16x16_INTEL

SubgroupAvcMotionEstimationINTEL

0x04000

AVC_ME_SKIP_BLOCK_PARTITION_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Inter skip motion vector mask:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Inter skip motion vector values | Enabling Capabilities

(0x1<<24)

AVC_ME_SKIP_BLOCK_16x16_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x2<<24)

AVC_ME_SKIP_BLOCK_16x16_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x3<<24)

AVC_ME_SKIP_BLOCK_16x16_DUAL_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x55<<24)

AVC_ME_SKIP_BLOCK_8x8_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0xAA<<24)

AVC_ME_SKIP_BLOCK_8x8_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0xFF<<24)

AVC_ME_SKIP_BLOCK_8x8_DUAL_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x1<<24)

AVC_ME_SKIP_BLOCK_8x8_0_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x2<<24)

AVC_ME_SKIP_BLOCK_8x8_0_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x1<<26)

AVC_ME_SKIP_BLOCK_8x8_1_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x2<<26)

AVC_ME_SKIP_BLOCK_8x8_1_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x1<<28)

AVC_ME_SKIP_BLOCK_8x8_2_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x2<<28)

AVC_ME_SKIP_BLOCK_8x8_2_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x1<<30)

AVC_ME_SKIP_BLOCK_8x8_3_FORWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

(0x2<<30)

AVC_ME_SKIP_BLOCK_8x8_3_BACKWARD_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL

-
-
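The 8x8 skip mask values in the table above can be read as two bits per 8x8 block (forward = 0x1, backward = 0x2), with block n placed at bit 24 + 2n. The sketch below rebuilds the aggregate `8x8_FORWARD`, `8x8_BACKWARD` and `8x8_DUAL` values from the per-block values to show that they agree. The Python names are illustrative only.

```python
# Per-block enable bits, taken from the table: block n sits at bit 24 + 2*n.
FORWARD = [0x1 << (24 + 2 * n) for n in range(4)]
BACKWARD = [0x2 << (24 + 2 * n) for n in range(4)]


def or_all(values):
    """Bitwise OR of all values in an iterable."""
    result = 0
    for v in values:
        result |= v
    return result


all_forward = or_all(FORWARD)       # expected to equal (0x55 << 24)
all_backward = or_all(BACKWARD)     # expected to equal (0xAA << 24)
all_dual = all_forward | all_backward  # expected to equal (0xFF << 24)

print(hex(all_forward), hex(all_backward), hex(all_dual))
```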

Block based skip type values:

-
- ----- - - - - - - - - - - - - - - - - - - - -
Block based skip type values | Enabling Capabilities

0x0

AVC_ME_BLOCK_BASED_SKIP_4x4_INTEL

SubgroupAvcMotionEstimationINTEL

0x80

AVC_ME_BLOCK_BASED_SKIP_8x8_INTEL

SubgroupAvcMotionEstimationINTEL

-
-

Luma intra partition mask values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Luma intra partition mask values | Enabling Capabilities

0x0

AVC_ME_INTRA_LUMA_PARTITION_MASK_ALL_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x6

AVC_ME_INTRA_LUMA_PARTITION_MASK_16x16_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x5

AVC_ME_INTRA_LUMA_PARTITION_MASK_8x8_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x3

AVC_ME_INTRA_LUMA_PARTITION_MASK_4x4_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-
-

Intra neighbor availability mask values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Intra neighbor availability mask values | Enabling Capabilities

0x60

AVC_ME_INTRA_NEIGHBOR_LEFT_MASK_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x10

AVC_ME_INTRA_NEIGHBOR_UPPER_MASK_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x8

AVC_ME_INTRA_NEIGHBOR_UPPER_RIGHT_MASK_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x4

AVC_ME_INTRA_NEIGHBOR_UPPER_LEFT_MASK_ENABLE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-
-

Luma intra modes:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Luma intra mode values | Enabling Capabilities

0x0

AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x1

AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x2

AVC_ME_LUMA_PREDICTOR_MODE_DC_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x3

AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x4

AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x4

AVC_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x5

AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x6

AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x7

AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

0x8

AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-
-

Chroma intra modes:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Chroma intra mode values | Enabling Capabilities

0x0

AVC_ME_CHROMA_PREDICTOR_MODE_DC_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

0x1

AVC_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

0x2

AVC_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

0x3

AVC_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL

SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

-
-

Reference image select values:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - -
Reference image select values | Enabling Capabilities

0x1

AVC_ME_FRAME_FORWARD_INTEL

SubgroupAvcMotionEstimationINTEL

0x2

AVC_ME_FRAME_BACKWARD_INTEL

SubgroupAvcMotionEstimationINTEL

0x3

AVC_ME_FRAME_DUAL_INTEL

SubgroupAvcMotionEstimationINTEL

-
-
-
-

Instructions

-
-

Modify Section 3.32.7, Constant-Creation Instructions, amending the description of OpConstantNull as follows:

-
- --- - - - - - -
-

OpConstantNull
-
-Declare a new null constant value.

-
-
-

The null value is type dependent, defined as follows:

-
-
-
    -
  • -

    Scalar Boolean: false

    -
  • -
  • -

    Scalar integer: 0

    -
  • -
  • -

    Scalar floating point: +0.0 (all bits 0)

    -
  • -
  • -

    All other scalars: Abstract

    -
  • -
  • -

    Composites: Members are set recursively to the null constant according to the null value of their constituent types.

    -
  • -
  • -

    IME/REF/SIC payload & result types: Abstract

    -
  • -
-
-
-

Result Type must be one of the following types:

-
-
-
    -
  • -

    Scalar or vector Boolean type

    -
  • -
  • -

    Scalar or vector integer type

    -
  • -
  • -

    Scalar or vector floating-point type

    -
  • -
  • -

    Pointer type

    -
  • -
  • -

    Event type

    -
  • -
  • -

    Device side event type

    -
  • -
  • -

    Reservation id type

    -
  • -
  • -

    Queue type

    -
  • -
  • -

    Composite type

    -
  • -
  • -

    IME/REF/SIC payload or result type

    -
  • -
-
- ------ - - - - - - - - -

3

46

<id> Result Type

Result <id>

-
-

Modify Section 3.32.10, Image Instructions, amending the description of OpImage as follows:

-
- --- - - - - - -
-

OpImage
-
-Extract the image from a sampled or VME image.

-
-
-

Result Type must be OpTypeImage.

-
-
-

Sampled Image must have type OpTypeSampledImage or OpTypeVmeImageINTEL whose Image Type is the same as Result Type.

-
- ------- - - - - - - - - - -

4

100

<id> Result Type

Result <id>

<id> VME Image

-
-

Modify Section 3.32.10, Image Instructions, adding the description of OpVmeImageINTEL as follows:

-
- --- - - - - - -
-

OpVmeImageINTEL
-
-Create a VME image, containing both a (media) sampler and an image.

-
-
-

Result Type must be the OpTypeVmeImageINTEL type.

-
-
-

Image is an object whose type is an OpTypeImage, whose Sampled operand is 0 or 1, and whose Dim operand is not SubpassData.

-
-
-

Sampler must be an object whose type is OpTypeSampler.

-
- -------- - - - - - - - - - - -

5

5699

<id> Result Type

Result <id>

<id> Image Type

<id> Sampler

-
-

Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions the following MCE instructions:

-
-
-
-

MCE instructions

-
-

A set of generic MCE operations which may be called for IME, REF, or SIC operations with the restrictions as stated in their descriptions. They can be called only during specific phases of these operations as indicated in the description of the instructions.

-
-
-

These instructions are only guaranteed to work correctly if placed strictly within uniform control flow within the Subgroup execution scope. This ensures that if any invocation executes it, all invocations will execute it. If placed elsewhere, the results are undefined.

-
-
-
Multi-reference cost configuration instructions:
-
-

These instructions enable multi-reference image costing. They allow the payloads to be configured to favor major partitions coming from reference images that are closer to the source image over those coming from reference images that are further away. The distance of each reference image from the source image, in timing order, is implied by the order in which the reference images are declared in the kernel parameter operand list.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultInterBaseMultiReferencePenaltyINTEL
-
-Get the default base multi-reference cost penalty in U4U4 format when HW assisted multi-reference search is used.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness in U4U4 format.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5713

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp
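Each instruction table in this section ends with a word count, an opcode, and the operand list. In the core SPIR-V binary form, the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits. The sketch below encodes the instruction above under that rule. The `encode_instruction` helper and the `<id>` numbers are hypothetical and used only for illustration.

```python
def encode_instruction(opcode, operands):
    """Return the 32-bit words of one SPIR-V instruction.

    The first word packs (word count << 16) | opcode, where the word count
    includes the first word itself.
    """
    word_count = 1 + len(operands)
    if word_count > 0xFFFF or opcode > 0xFFFF:
        raise ValueError("word count and opcode must each fit in 16 bits")
    return [(word_count << 16) | opcode] + list(operands)


# OpSubgroupAvcMceGetDefaultInterBaseMultiReferencePenaltyINTEL (opcode 5713).
# Operands: Result Type, Result <id>, Slice Type, Qp (ids here are made up).
words = encode_instruction(5713, [10, 11, 12, 13])
print([hex(w) for w in words])
```

The resulting word count is 5, matching the value listed in the table for this instruction.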

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcMceSetInterBaseMultiReferencePenaltyINTEL
-
-Set the multi-reference base penalty when HW assisted multi-reference search is performed.

-
-
-

Each reference major partition is associated with a penalty based on its distance from the source image. The Reference Base Penalty is scaled using a scaling factor based on the implied distance of the reference image from the source image as shown below.

-
- ---- - - - - - - - - - - - - - - - - - - -

0

0x

[1, 2]

1x

[3, 6]

2x

[7, 15]

3x

-
-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-
-
-

Reference Base Penalty must be an 8-bit scalar integer type in U4U4 format and the decoded integer value must fit within 12 bits. It is treated as an unsigned value.

-
-
-

Payload must be the OpTypeAvcMcePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5714

<id> Result Type

Result <id>

<id> Reference Base Penalty

<id> Payload
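The distance-to-scale table in the description of OpSubgroupAvcMceSetInterBaseMultiReferencePenaltyINTEL can be sketched as a small lookup. The function names below are hypothetical, and treating "scaled" as a plain multiplication of the decoded base penalty is an assumption; the extension text only gives the scale factors.

```python
def reference_penalty_scale(distance):
    """Scale factor for a reference image at the given implied distance."""
    if distance == 0:
        return 0
    if 1 <= distance <= 2:
        return 1
    if 3 <= distance <= 6:
        return 2
    if 7 <= distance <= 15:
        return 3
    raise ValueError("distance must be in [0, 15]")


def scaled_reference_penalty(decoded_base_penalty, distance):
    """Assumption: the decoded base penalty is multiplied by the scale factor."""
    return decoded_base_penalty * reference_penalty_scale(distance)


for d in (0, 1, 3, 7, 15):
    print(d, reference_penalty_scale(d))
```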

-
-
-
Inter shape and direction cost configuration instructions
-
-

These instructions enable shape costing for inter estimation. They allow for the configuration of payloads for the biasing of certain shapes over others based on the configured parameters.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultInterShapePenaltyINTEL
-
-Get the default packed shape cost for inter estimation in U4U4 format.

-

Result Type must be an OpTypeInt with 64-bit Width and 0 Signedness.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5715

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcMceSetInterShapePenaltyINTEL
-
-Set the shape penalty for inter motion estimation.

-
-
-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-
-
-

Packed Shape Penalty must be a 64-bit scalar integer type. It is treated as an unsigned value. The following bits specify the shape penalty in U4U4 format:

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - -

7:0

16x8 and 8x16

15:8

8x8

23:16

8x4 and 4x8

31:24

4x4

39:32

16x16

63:40

Must be zero

-
-

The U4U4 decoded integer values for byte 0 and byte 4 must fit within 12 bits, while the U4U4 decoded integer values for the other bytes must fit within 10 bits.

-
-
-

Payload must be the OpTypeAvcMcePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5716

<id> Result Type

Result <id>

<id> Packed Shape Penalty

<id> Payload
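The bit layout given for Packed Shape Penalty can be sketched as follows. The extension text does not define the U4U4 encoding in this section, so the decoder below assumes a common reading: the low nibble is a base, the high nibble is a shift, and the decoded value is `base << shift`. That decoding rule, and every function and parameter name here, is an assumption for illustration.

```python
def decode_u4u4(byte):
    """Assumed U4U4 decoding: low nibble is the base, high nibble is the shift."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("U4U4 value must fit in one byte")
    return (byte & 0xF) << (byte >> 4)


def pack_shape_penalty(p16x8_8x16, p8x8, p8x4_4x8, p4x4, p16x16):
    """Pack five U4U4 bytes into the 64-bit layout from the table.

    Bits 7:0 = 16x8 and 8x16, 15:8 = 8x8, 23:16 = 8x4 and 4x8,
    31:24 = 4x4, 39:32 = 16x16, 63:40 = zero.
    """
    fields = [p16x8_8x16, p8x8, p8x4_4x8, p4x4, p16x16]
    # Byte 0 and byte 4 must decode within 12 bits, the others within 10 bits.
    limits = [12, 10, 10, 10, 12]
    packed = 0
    for index, (value, bits) in enumerate(zip(fields, limits)):
        if decode_u4u4(value) >= (1 << bits):
            raise ValueError(f"byte {index} decodes beyond {bits} bits")
        packed |= value << (8 * index)
    return packed


print(hex(pack_shape_penalty(0x01, 0x02, 0x03, 0x04, 0x05)))
```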

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultInterDirectionPenaltyINTEL
-
-Get the default direction penalty for inter estimation in U4U4 format.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5717

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceSetInterDirectionPenaltyINTEL
-
-Set the direction penalty for backward images used in inter motion estimation.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Direction Cost must be an 8-bit scalar integer type in U4U4 format and the decoded integer value must fit within 12 bits. It is treated as an unsigned value.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5718

<id> Result Type

Result <id>

<id> Direction Cost

<id> Payload

-
-
-
Intra shape cost configuration phase instructions
-
-

These instructions enable shape costing for intra estimation. They allow for the configuration of payloads for biasing of certain shapes over others based on the configured parameters. Only the instruction providing the default shape penalty is specified as an MCE instruction. The instruction which actually configures the payload for the intra estimation operation is specified as a SIC instruction.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultIntraLumaShapePenaltyINTEL
-
-Get the default packed luma intra penalty estimation in U4U4 format.

-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

5

5719

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp

-
-
-
Inter motion vector cost configuration phase instructions
-
-

These instructions enable motion vector costing for inter estimation. The distortion measure is augmented to favor motion vectors closer to the cost-center considered in conjunction with the primary objective of minimizing the SAD between the source and reference blocks.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultInterMotionVectorCostTableINTEL
-
-Get the default inter motion vector cost table for the pre-defined control points in U4U4 format for the input Qp and slice type.

-

Result Type must be a 2-component vector of 32-bit integer type with 0 Signedness.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5720

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp

- ------ - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultHighPenaltyCostTableINTEL
-
-Get the default predefined packed U4U4 format high cost table for high Qp. This may be more appropriate for frame sequences with high motion.

-

Result Type must be a 2-component vector of 32-bit integer type with 0 Signedness.

Capability:
-SubgroupAvcMotionEstimationINTEL

3

5721

<id> Result Type

Result <id>

- ------ - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultMediumPenaltyCostTableINTEL
-
-Get the default predefined packed U4U4 format medium cost table for medium Qp. This may be more appropriate for frame sequences with high motion.

-

Result Type must be a 2-component vector of 32-bit integer type with 0 Signedness.

Capability:
-SubgroupAvcMotionEstimationINTEL

3

5722

<id> Result Type

Result <id>

- ------ - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultLowPenaltyCostTableINTEL
-
-Get the default predefined packed U4U4 format low cost table for high Qp. This may be more appropriate for frame sequences with high motion.

-

Result Type must be a 2-component vector of 32-bit integer type with 0 Signedness.

Capability:
-SubgroupAvcMotionEstimationINTEL

3

5723

<id> Result Type

Result <id>

- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcMceSetMotionVectorCostFunctionINTEL
-
-Update the input payload to set the cost precision along with the cost center and cost table and return it.

-
-
-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-
-
-

Packed Cost Center Delta must be an OpTypeInt with 64-bit Width and 0 Signedness. It is the packed bidirectional cost center delta value relative to the source macroblock, which specifies the 4 bidirectional cost centers of each of the 8x8 partitions of the reference image. If only unidirectional search is performed, then the values of the backward reference cost centers must be zero. Work-item n provides the value of cost center n. It is specified in QPEL units. For 16x16 partitions, work-item 0 provides the cost center. For 8x16 partitions, work-items 0 and 1 provide the cost centers. For 16x8 partitions, work-items 0 and 2 provide the cost centers. The X and Y coordinates of each cost center delta must be in the ranges [-2048, 2047] and [-512.00, 511.75] respectively; otherwise the results are undefined.

-
-
-

Packed Cost Table must be a 2-component vector of 32-bit integer type with 0 Signedness. It specifies the cost penalties for pre-defined control points in U4U4 format in the cost function curve. The first 7 bytes specify 7 control points representing consecutive powers-of-two delta units (2^0 to 2^6). Each delta unit, dx, is the distance of a motion vector, mv, from the specified cost center, cc (dx = abs(mv - cc)). The cost penalty values between control points are linearly interpolated. The range of the cost function is defined to be from 2^0 to 2^6 delta units. The 8th byte of the packed cost table specifies the penalty base factor (over_cost) for dx distances that are out of range. The penalty for out-of-range dx distances is computed as min(over_cost + int(dx) - 64, 255).

-
-
-

Cost Precision must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid cost precision value as per Section 3, Binary Form, and specifies the precision of the delta units from the cost center, dx. This effectively can be used to control the range of the cost function as follows:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Precision | Deltas | Pixel Range

PEL

pixel

0-64

DPEL

dual pixel

0-127

HALF

half pixel

0-31

QUARTER

quarter pixel

0-15

-
-

The inter distortion for a block can be described by the following formula:

-
-
-
-
Distortion =
-    SAD(or HAAR) + MV_Cost_Penalty +
-    Shape_Penalty + Direction_Cost +
-    Multi_Reference_Penalty
-
-
-
-

Payload must be the OpTypeAvcMcePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

7

-

5724

<id> Result Type

Result <id>

<id> Packed Cost Center Delta

<id> Packed Cost Table

<id> Cost Precision

<id> Payload
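The cost function described for Packed Cost Table can be sketched as a piecewise-linear curve over seven control points placed at dx = 1, 2, 4, 8, 16, 32 and 64 delta units, with the stated formula applied beyond 64. The function below takes already-decoded control point penalties. Its name, its parameters, and the choice to return the first control point for dx below 1 are assumptions, since the text does not say what happens there.

```python
def mv_cost_penalty(dx, control_points, over_cost):
    """Penalty for a motion vector at distance dx from the cost center.

    control_points: 7 decoded penalties at dx = 2**0 .. 2**6 delta units.
    over_cost: decoded penalty base factor for out-of-range distances.
    """
    if len(control_points) != 7:
        raise ValueError("expected exactly 7 control points")
    if dx > 64:
        # Out-of-range formula quoted from the description above.
        return min(over_cost + int(dx) - 64, 255)
    if dx <= 1:
        # Assumption: distances below the first control point use its penalty.
        return control_points[0]
    for i in range(6):
        lo, hi = 1 << i, 1 << (i + 1)
        if dx <= hi:
            # Linear interpolation between neighbouring control points.
            t = (dx - lo) / (hi - lo)
            return control_points[i] + t * (control_points[i + 1] - control_points[i])
    raise AssertionError("unreachable for dx in (1, 64]")


table = [0, 2, 4, 8, 16, 32, 64]  # made-up decoded penalties for illustration
for dx in (0.5, 3, 64, 100, 1000):
    print(dx, mv_cost_penalty(dx, table, over_cost=10))
```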

-
-
-
Intra mode cost configuration phase instructions
-
-

These instructions enable mode costing for intra estimation. They allow for the configuration of payloads to bias the computed intra modes to be closer to their configured neighbor modes. This form of costing is similar to the inter motion vector costing. Only the instructions providing the default mode costs are specified as MCE instructions. The remaining instructions, which actually configure the payload for the intra estimation operation, are specified as SIC instructions.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultIntraLumaModePenaltyINTEL
-
-Get the default luma intra mode penalty in U4U4 format for the input Qp and slice type.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness in U4U4 format.

-

Slice Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid slice type value as per Section 3, Binary Form.

-

Qp must be an 8-bit scalar integer type and a valid quantization parameter value between 0 and 51. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

5

5725

<id> Result Type

Result <id>

<id> Slice Type

<id> Qp

- ------ - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultNonDcLumaIntraPenaltyINTEL
-
-Get the default intra non-dc cost penalty for intra luma estimation in packed 32-bit integer format.

-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

3

5726

<id> Result Type

Result <id>

- ------ - - - - - - - - - - - - -

OpSubgroupAvcMceGetDefaultIntraChromaModeBasePenaltyINTEL
-
-Get the default chroma mode base penalty in U4U4 format.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness in U4U4 format.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

3

5727

<id> Result Type

Result <id>

-
-
-
Miscellaneous property configuration phase instructions
-
-

These instructions enable miscellaneous MCE properties settings.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceSetAcOnlyHaarINTEL
-
-Update the input payload to enable an AC-only HAAR SAD mode and return it. It overrides any previous setting for SAD adjustment. This feature is mainly intended for improved block matching in frame-rate conversion (FRC) kernels.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5728

<id> Result Type

Result <id>

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceSetSourceInterlacedFieldPolarityINTEL
-
-Update the input payload to specify the field polarities for interlaced source images used for inter or intra operations.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Source Field Polarity must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid field polarity value as per Section 3, Binary Form indicating the field polarity for the source image.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5729

<id> Result Type

Result <id>

<id> Source Field Polarity

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcMceSetSingleReferenceInterlacedFieldPolarityINTEL
-
-Update the input payload to specify the field polarities for interlaced reference images used for single reference inter search or check operation.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Reference Field Polarity must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid field polarity value as per Section 3, Binary Form indicating the field polarity for the reference image.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5730

<id> Result Type

Result <id>

<id> Reference Field Polarity

<id> Payload

- --------- - - - - - - - - - - - - - - - -

OpSubgroupAvcMceSetDualReferenceInterlacedFieldPolaritiesINTEL
-
-Update the input payload to specify the field polarities for interlaced reference images used for dual reference inter search or check operation.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Forward Reference Field Polarity must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid field polarity value as per Section 3, Binary Form indicating the field polarity for the forward reference image.

-

Backward Reference Field Polarity must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid field polarity value as per Section 3, Binary Form indicating the field polarity for the backward reference image.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

6

5731

<id> Result Type

Result <id>

<id> Forward Reference Field Polarity

<id> Backward Reference Field Polarity

<id> Payload

-
-
-
Result processing phase instructions
-
-

These instructions extract individual components of the result returned by the VME unit.

-
- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcMceGetMotionVectorsINTEL
-
-Get the MCE packed BMVs result.

-
-
-

Up to 16 packed BMVs are returned, one per work-item. If the MCE search operation’s payload was set up for unidirectional search then only the forward packed MV will be valid in each BMV, otherwise both packed MVs will be valid. The BMVs have to be selected by their respective work-items based on the result block major and minor shapes.

-
-
-

If the major shape is:

-
-
-
    -
  • -

    16x16, then one BMV is returned by work-item 0

    -
  • -
  • -

    16x8, or 8x16, then two BMVs are returned by work-items 0 and 8

    -
  • -
  • -

    8x8, then four sets of BMVs corresponding to the four partitions in traditional Z-order are returned by work-items in the ranges [0, 3], [4, 7], [8, 11], and [12, 15]; the minor shape determines exactly which work-items in the reserved inclusive range for the partition return the BMVs for that partition

    -
  • -
-
-
-

If the range of work-items for the 8x8 major partition is [n, n+3] and the minor shape is:

-
-
-
    -
  • -

    8x8, then work-item n returns the BMV for each minor partition

    -
  • -
  • -

    8x4 or 4x8, then work-items n and n+2 return the BMVs for each minor partition

    -
  • -
  • -

    4x4, then all work-items in [n, n+3] return the BMVs for each minor partition in traditional Z-order

    -
  • -
-
-
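The selection rules above can be sketched as a small lookup. This is only an illustration: the function name `bmv_work_items` is hypothetical, and shapes are passed as plain strings rather than the encoded values from Section 3, Binary Form, which are not reproduced here.

```python
def bmv_work_items(major_shape, minor_shapes=None):
    """Return the work-items that hold valid BMVs for a result block.

    major_shape: "16x16", "16x8", "8x16" or "8x8".
    minor_shapes: for an 8x8 major shape, four strings (one per 8x8
    partition, in Z-order), each "8x8", "8x4", "4x8" or "4x4".
    """
    if major_shape == "16x16":
        return [0]
    if major_shape in ("16x8", "8x16"):
        return [0, 8]
    if major_shape != "8x8":
        raise ValueError(f"unknown major shape: {major_shape}")
    if minor_shapes is None or len(minor_shapes) != 4:
        raise ValueError("an 8x8 major shape needs four minor shapes")

    items = []
    for partition, minor in enumerate(minor_shapes):
        n = 4 * partition  # partition owns work-items [n, n+3]
        if minor == "8x8":
            items.append(n)
        elif minor in ("8x4", "4x8"):
            items.extend([n, n + 2])
        elif minor == "4x4":
            items.extend([n, n + 1, n + 2, n + 3])
        else:
            raise ValueError(f"unknown minor shape: {minor}")
    return items
```

For example, an 8x8 major shape with minor shapes 8x8, 8x4, 4x8 and 4x4 selects work-items 0, 4, 6, 8, 10 and 12 through 15.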
- - - - - -
- - -
-
    -
  1. -

    All sub-block BMVs get replicated for each partition. For example, for a 16x16 partition, all smaller sub-block BMVs are replicated to the same BMV, and for an 8x8 partition, each 8x8 must have its respective sub-block BMVs replicated. This is not important for extracting the component BMVs themselves, but is needed if the result of this instruction is used to initialize the input motion vectors of a REF initialization instruction.

    -
  2. -
  3. -

    With interlaced images, the top field MBs are considered as logically overlapping with the bottom MBs.

    -
  4. -
-
-
-
-
-

Result Type must be an OpTypeInt with 64-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5738

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceGetInterDistortionsINTEL
-
-Get the MCE inter distortions result corresponding to the BMVs returned by OpSubgroupAvcMceGetMotionVectorsINTEL. The MCE inter directions result returned by OpSubgroupAvcMceGetInterDirectionsINTEL will specify if the distortion corresponds to the forward MV, backward MV, or the bidirectional MV in the BMV. Up to 16 distortions are returned, one per work-item.

-

The distortions have to be selected by their respective work-items based on the result block major and minor shapes just as for the result MVs as described above.

-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5739

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceGetBestInterDistortionsINTEL
-
-Get the best inter distortion for the whole MB.

-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5740

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceGetInterMajorShapeINTEL
-
-Get the MCE inter MB major partition shape.

-

The returned values are the inter-MB major shape values defined in Section 3, Binary Form.

-

This can only be called as part of an IME or REF operation evaluation.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5741

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceGetInterMinorShapeINTEL
-
-Get the MCE inter MB minor partition shapes.

-

It returns a bit field with the minor shapes for the four 8x8 sub-partitions in traditional Z order. Two bits are reserved for each of the four sub-partitions in row-major order. The returned 2-bit values are the inter-MB minor shape values defined in Section 3, Binary Form.

-

This instruction returns valid results only if the major shape is 8x8, otherwise the results are undefined.

-

This can only be called as part of an IME or REF operation evaluation.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5742

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcMceGetInterDirectionsINTEL
-
-Get the MCE inter MB major partition directions.

-
-
-

It returns a bit field with the direction for up to four major sub-partitions in traditional Z order. Two bits are reserved for each of the four sub-partitions. The returned 2-bit values are the inter-MB major shape direction values defined in Section 3, Binary Form.

-
-
-

If the major partition is:

-
-
-
    -
  • -

    16x16, then bits in the range [0, 1] contain the direction

    -
  • -
  • -

    16x8 or 8x16, then bits in the ranges [0, 1] and [2, 3] contain the two partitions' directions

    -
  • -
  • -

    8x8, then bits in the ranges [0, 1], [2, 3], [4, 5], and [6, 7] contain the four partitions' directions

    -
  • -
-
-
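A minimal sketch of reading this bit field, assuming the major shape is already known and passed as a plain string (the function name and the string keys are hypothetical; the meaning of each 2-bit value is defined in Section 3, Binary Form and is not interpreted here):

```python
# Number of major partitions for each major shape.
PARTITION_COUNT = {"16x16": 1, "16x8": 2, "8x16": 2, "8x8": 4}


def decode_inter_directions(directions, major_shape):
    """Split the 8-bit directions field into one 2-bit value per partition."""
    count = PARTITION_COUNT[major_shape]
    # Partition i occupies bits [2i, 2i+1].
    return [(directions >> (2 * i)) & 0x3 for i in range(count)]
```

Only the partitions that exist for the given major shape are returned; the remaining bits are not inspected.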
-

The returned values are as per the inter direction values as per Section 3, Binary Form.

-
-
-

This can only be called as part of an IME or REF operation evaluation.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5743

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceGetInterMotionVectorCountINTEL
-
-Get the count of motion vectors (based on the partitioning decision) returned by the search operation.

-

This can only be called as part of an IME or REF operation evaluation.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5744

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcMceGetInterReferenceIdsINTEL
-
-Get the MCE inter MB reference identifiers in a packed integer format, with the following bits specifying the reference identifiers for the major partitions.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0

Fwd reference block 0

7:4

Bwd reference block 0

11:8

Fwd reference block 1

15:12

Bwd reference block 1

19:16

Fwd reference block 2

23:20

Bwd reference block 2

27:24

Fwd reference block 3

31:28

Bwd reference block 3

-
-

The values of each individual 4-bit reference identifier range from 0 to 15, with each value identifying the distance of ordered pair of forward/backward reference images as declared in the VME kernel parameter operand list.
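The bit layout in the table above can be unpacked as follows. This is a sketch only, and `unpack_reference_ids` is a hypothetical helper name:

```python
def unpack_reference_ids(packed):
    """Split the 32-bit packed value into (fwd, bwd) id pairs for blocks 0..3."""
    pairs = []
    for block in range(4):
        fwd = (packed >> (8 * block)) & 0xF      # bits 3:0, 11:8, 19:16, 27:24
        bwd = (packed >> (8 * block + 4)) & 0xF  # bits 7:4, 15:12, 23:20, 31:28
        pairs.append((fwd, bwd))
    return pairs
```

For a single-reference search, only the first element of each pair is meaningful.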

-
-
-

If the dual-reference evaluation instructions are not used, then the values of the backward reference identifiers are undefined.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference identifier pairs are replicated. For example, for a 16x16 block all four pairs of reference identifiers are replicated to the value of the first pair for block 0.

-
-
- - - - - -
- - -
-

Unless HW-assisted multi-reference search was performed using the IME streamin/streamout evaluation instructions, the individual 4-bit reference identifier pair values will all be the same (pointing to the same pair of forward/backward reference images).

-
-
-
-
-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5745

<id> Result Type

Result <id>

<id> Payload

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcMceGetInterReferenceInterlacedFieldPolaritiesINTEL
-
-Get the MCE inter MB reference field polarities for the corresponding reference identifiers returned by OpSubgroupAvcMceGetInterReferenceIdsINTEL in a packed integer format, with the following bits specifying the reference field polarities for the major partitions.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

0

Fwd reference block 0

1

Fwd reference block 1

2

Fwd reference block 2

3

Fwd reference block 3

4

Bwd reference block 0

5

Bwd reference block 1

6

Bwd reference block 2

7

Bwd reference block 3

-
-

If the dual-reference evaluation instructions are not used, then the values of the backward reference field polarities are undefined.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference field polarities are replicated. For example, for a 16x16 block all four pairs of reference field polarities are replicated to the value of the first pair for block 0.
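A sketch of unpacking the returned polarity bits, assuming bits 0 to 3 hold the forward polarities and bits 4 to 7 hold the backward polarities of the four major blocks in Z order (the helper name is hypothetical):

```python
def unpack_field_polarities(packed):
    """Return (fwd, bwd) polarity bits for major blocks 0..3."""
    # Assumption: block b uses bit b (forward) and bit b + 4 (backward).
    return [((packed >> b) & 1, (packed >> (b + 4)) & 1) for b in range(4)]
```

As with the reference identifiers, the backward bits carry no meaning for a single-reference search.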

-
-
- - - - - -
- - -
-

An important restriction is that when multiple IME operations are performed for a HW-assisted multi-reference search operation using the streamin/streamout capabilities, the same reference image parameter cannot be used with different polarities in the sequence of IME operations. In other words, the field polarities for reference image parameters must be used consistently across the IME operations used in a HW-assisted multi-reference search operation.

-
-
-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Packed Reference Ids must be an OpTypeInt with 32-bit Width and 0 Signedness, and is as defined by the return value of OpSubgroupAvcMceGetInterReferenceIdsINTEL.

-
-
-

Packed Reference Parameter Field Polarities must be an OpTypeInt with 32-bit Width and 0 Signedness, and specifies the packed bit field of field polarities for each of the (up to 16) forward/backward interleaved pairs of reference images in the same order as specified in the kernel parameter operand list, as used for the inter search operation. If less than 16 pairs are used then the corresponding bit field values are ignored.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5746

<id> Result Type

Result <id>

<id> Packed Reference Ids

<id> Packed Reference Parameter Field Polarities

<id> Payload

-
-
-
-
-

IME instructions

-
-

An ordered sequence of instruction phases must be called to evaluate an integer motion estimation result.

-
-
-

These instructions are only guaranteed to work correctly if placed strictly within uniform control flow within the Subgroup execution scope. This ensures that if any invocation executes it, all invocations will execute it. If placed elsewhere, the results are undefined.

-
-
-
Initialization instructions
-
-

These instructions create a properly initialized payload that can be further configured for evaluating IME operations. This is a required initial phase.

-
- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeInitializeINTEL
-
-Return an initialized payload for a VME integer search (IME) operation.

-
-
-

The payload is initialized for progressive frame operations, and the cost configuration values and the miscellaneous property values are all initialized to zero. The cost configuration and the miscellaneous property configuration instructions must be used to override the initial configurations in the payload.

-
-
- - - - - -
- - -
-

If the source image is an interlaced scan image, then the bottom field lines are considered as logically overlapping with the top field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the Src Coord value.

-
-
-
-
-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-
-
-

Src Coord must be a vector(2) of i16 values and 0 Signedness. It represents the 2D offset of the top left corner of the source MB in pixel units in the source image. Source MBs at the image borders are allowed to be partial, but the top-left corner must be within the image.

-
-
-

Partition Mask must be an OpTypeInt with 8-bit Width and 0 Signedness. The legal values can be composed by setting the appropriate bit fields specified by partition mask values as per Section 3, Binary Form using OpBitwiseAnd.

-
-
-

SAD Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid SAD adjustment mode as per Section 3, Binary Form. If it is set to AVC_ME_SAD_ADJUST_MODE_HAAR_INTEL, a simple wavelet transform, Haar transform, is used to refine the distortion measure of SAD. Haar transform here is used as a coarse estimation of the integer transform.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5747

<id> Result Type

Result <id>

<id> Src Coord

<id> Partition Mask

<id> SAD Adjustment

-
-
-
Configuration instructions
-
-

These instructions allow for configuration of the search window. A call to either OpSubgroupAvcImeSetSingleReferenceINTEL or OpSubgroupAvcImeSetDualReferenceINTEL is required. This is a required phase immediately following the initialization phase.

-
- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeSetSingleReferenceINTEL
-
-Update the input payload for a VME single-reference search with the configuration for the reference window search region, and return it.

-
-
- - - - - -
- - -
-

If the reference image is an interlaced scan image, then the top field lines are considered as logically overlapping with the bottom field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the Ref Offset value.

-
-
-
-
-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-
-
-

Ref Offset specifies the 2D reference window offset, and must be a vector(2) of i16 values interpreted as signed values. The X and Y coordinates must be in the range [-2048, 2047], otherwise the results are undefined. The reference window is allowed to be partially outside the image. Pixel replication is applied to generate out-of-bound reference pixels. It is specified in PEL units. Results are undefined if the reference region is completely outside the image.
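A producer of this operand may want to check the coordinate range before emitting it. The sketch below covers only the [-2048, 2047] range rule; it does not test whether the reference region lies completely outside the image, since that depends on the window and image sizes. The names are hypothetical.

```python
REF_OFFSET_MIN = -2048
REF_OFFSET_MAX = 2047


def ref_offset_in_range(offset):
    """Return True if both coordinates of an (x, y) offset are in range."""
    x, y = offset
    return all(REF_OFFSET_MIN <= v <= REF_OFFSET_MAX for v in (x, y))
```

An offset that fails this check yields undefined results, so a caller would typically clamp or reject it before building the payload.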

-
-
-

Search Window Config must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid unreserved search window configuration value as per Section 3, Binary Form.

-
-
-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5748

<id> Result Type

Result <id>

<id> Ref Offset

<id> Search Window Config

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeSetDualReferenceINTEL
-
-Update the input payload for a VME dual-reference search with the configurations for the -reference window search regions, and return it.

-
-
- - - - - -
- - -
-

If a reference image is an interlaced scan image, then the top field lines are considered as logically overlapping with the bottom field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the corresponding Fwd Ref Offset and/or Bwd Ref Offset values.

-
-
-
-
-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-
-
-

Fwd Ref Offset/Bwd Ref Offset specify the 2D forward/backward reference window offset, and must be a vector(2) of i16 values interpreted as signed values. The X and Y coordinates must be in the range [-2048, 2047], otherwise the results are undefined. The reference window is allowed to be partially outside the image. Pixel replication is applied to generate out-of-bound reference pixels. It is specified in PEL units. Results are undefined if the reference region is completely outside the image.

-
-
-

Search Window Config must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid unreserved search window configuration value as per Section 3, Binary Form.

-
-
-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

7

-

5749

<id> Result Type

Result <id>

<id> Fwd Ref Offset

<id> Bwd Ref Offset

<id> Search Window Config

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcImeRefWindowSizeINTEL
-
-Get the 2D size of the reference window in pixel units.

-

Result Type must be a vector(2) of i16 values and 0 Signedness.

-

Search Window Config must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid unreserved search window configuration value as per Section 3, Binary Form.

-

Dual Ref must be an 8-bit scalar integer type and must evaluate to zero for a single-reference search window and one for a dual-reference search window. It is treated as an unsigned value.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5750

<id> Result Type

Result <id>

<id> Search Window Config

<id> Dual Ref

- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeAdjustRefOffsetINTEL
-
-If the input 2D reference window offset, Ref Offset, causes the reference window to be fully out-of-bound of the reference image, adjust it such that the reference window is within bounds of the reference image.

-
-
- - - - - -
- - -
-

If the reference image is an interlaced scan image, then the bottom field lines are considered as logically overlapping with the top field lines for the purpose of specifying the image size value. Since the actual layout of the top and bottom fields in the reference image is interleaved, the height of the top or bottom field should be exactly half of the actual reference image height.

-
-
-
-
- - - - - -
- - -
-

A call to OpSubgroupAvcImeAdjustRefOffsetINTEL is optional. It is required only if the reference window offset inputs to OpSubgroupAvcImeSetSingleReferenceINTEL or OpSubgroupAvcImeSetDualReferenceINTEL are potentially out-of-bounds and need to be adjusted.

-
-
-
-
-

Result Type must be a vector(2) of i16 values and interpreted as signed values.

-
-
-

Ref Offset must be a vector(2) of i16 values and interpreted as signed values. It specifies the 2D reference window offset. The X and Y coordinates must be in the range [-2048, 2047], otherwise the results are undefined.

-
-
-

Src Coord must be a vector(2) of i16 values and 0 Signedness. It represents the 2D offset of the top left corner of the source MB in pixel units in the source image. Source MBs at the image borders are allowed to be partial, but the top-left corner must be within the image.

-
-
-

Ref Window Size must be a vector(2) of i16 values and 0 Signedness. It specifies the 2D size of the reference window in pixel units.

-
-
-

Image Size must be a vector(2) of i16 values and 0 Signedness. It specifies the 2D size of the progressive scan, or top or bottom fields, of the interlaced scan image in pixel units.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

7

-

5751

<id> Result Type

Result <id>

<id> Ref Offset

<id> Src Coord

<id> Ref Window Size

<id> Image Size

-
-
-
Payload type conversion instructions
-
-

These are optional instructions that may be called following the search configuration phase to convert IME payloads to MCE payloads and vice versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeConvertToMcePayloadINTEL
-
-Convert the IME payload to a generic MCE payload.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5752

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToImePayloadINTEL
-
-Convert the generic MCE payload to an IME payload.

-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5732

<id> Result Type

Result <id>

<id> Payload

-
-
-
Miscellaneous property configuration instructions
-
-

These are optional instructions that may be called following the search configuration phase to enable miscellaneous properties setting in the payload.

-
- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeSetMaxMotionVectorCountINTEL
-
-Specify the maximum number of motion vectors allowed for the current MB. The default setting is 32. Any other value may alter the MB partitioning decision. The IME operation will compute the best allowed partitioning such that the number of sub-block motion vectors will not exceed Max Motion Vector Count.

-
-
- - - - - -
- - -
-

This can be used to handle the restriction for certain profiles for AVC in that the maximum number of motion vectors allowed for two consecutive MBs can only be 16.

-
-
-
-
-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-
-
-

Max Motion Vector Count must be an OpTypeInt with 8-bit Width and 0 Signedness, and specifies the maximum number of motion vectors allowed for the current MB. It must be in the range [1, 32], otherwise the results are undefined.

-
-
-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5753

<id> Result Type

Result <id>

<id> Max Motion Vector Count

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeSetUnidirectionalMixDisableINTEL
-
-Update the input payload to disable a mix of forward and backward MVs in the result.

-

Default is to enable it.

-

Result Type must be the OpTypeAvcImePayloadINTEL type.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5754

<id> Result Type

Result <id>

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcImeSetEarlySearchTerminationThresholdINTEL
-
-Specifies the threshold value of a distortion compute of a 16x16 partition of an MB for a single-reference search, below which no more searching is performed for the MB.

-

The input payload must have been configured for a single-reference search with the 16x16 partition enabled for this threshold to be set, or else the results are undefined. Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Threshold must be an OpTypeInt with 8-bit Width and 0 Signedness, and is specified in U4U4 format. Additionally, the integer value must fit within 14 bits.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5755

<id> Result Type

Result <id>

<id> Threshold

<id> Payload

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeSetWeightedSadINTEL
-
-Set the (16) SAD weights for each 4x4 sub-block.

-
-
-

These values are used to decrease the SAD magnitude of each 4x4 sub-block by dividing the SAD of each 4x4 sub-block of the source MB by its mapped weight. It requires a Partition Mask of 16x16 and forward Search Window Configuration.

-
-
-

The weighting pattern used is the traditional Z order for each 4x4 block. Weighted-SAD Control Mapping:

-
-
-
-
0 1 4 5
-2 3 6 7
-8 9 C D
-A B E F
-
-
-
-

A prior call to OpSubgroupAvcImeSetSingleReferenceINTEL to set up the forward reference image is required.

-
-
- - - - - -
- - -
-

This feature is mainly intended for improved block matching in frame-rate conversion (FRC) kernels.

-
-
-
-
-

Packed Sad Weights must be an OpTypeInt with 32-bit Width and 0 Signedness. Each weight is of 2 bits represented in a packed format for each of the 4x4 blocks in Z order.
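One way to build the packed operand from a raster-ordered 4x4 grid of weights is sketched below, using the Weighted-SAD Control Mapping shown earlier. It assumes that the weight for Z-order index i occupies bits [2i, 2i+1] of the 32-bit value; the text above states only that the weights are packed in Z order, so the exact bit placement is an assumption. The names are hypothetical.

```python
# Z-order index of each 4x4 sub-block, addressed by raster (row, column).
Z_ORDER = [
    [0x0, 0x1, 0x4, 0x5],
    [0x2, 0x3, 0x6, 0x7],
    [0x8, 0x9, 0xC, 0xD],
    [0xA, 0xB, 0xE, 0xF],
]


def pack_sad_weights(weights_raster):
    """Pack a 4x4 grid of 2-bit weights into one 32-bit integer."""
    packed = 0
    for row in range(4):
        for col in range(4):
            weight = weights_raster[row][col]
            if not 0 <= weight <= 3:
                raise ValueError("each weight must fit in 2 bits")
            # Assumption: Z-order index i maps to bits [2i, 2i+1].
            packed |= weight << (2 * Z_ORDER[row][col])
    return packed
```

Keeping the mapping in a table makes it easy to adjust if the actual bit placement differs from the assumption.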

-
-
-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5756

<id> Result Type

Result <id>

<id> Packed Sad Weights

<id> Payload

-
-
-
Evaluation instructions
-
-

These instructions perform the evaluation of the IME operation configured in the payload with a VME media sampler and return the results.

-
- --------- - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithSingleReferenceINTEL
-
-Evaluate the basic IME operation with a single reference and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetSingleReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

6

5757

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithDualReferenceINTEL
-
-Evaluate the basic IME operation with dual references and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetDualReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5758

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

- --------- - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithSingleReferenceStreamoutINTEL
-
-Evaluate the single reference IME operation with streamout and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetSingleReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultSingleReferenceStreamoutINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

6

5761

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithDualReferenceStreamoutINTEL
-
-Evaluate the basic IME operation with dual references with streamout and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetDualReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5762

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithSingleReferenceStreaminINTEL
-
-Evaluate the single reference IME operation with streamin and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetSingleReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Streamin Components must be the OpTypeAvcImeSingleReferenceStreaminINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5759

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

<id> Streamin Components

- ----------- - - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithDualReferenceStreaminINTEL
-
-Evaluate the dual reference IME operation with streamin and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetDualReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Streamin Components must be the OpTypeAvcImeDualReferenceStreaminINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

8

5760

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

<id> Streamin Components

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithSingleReferenceStreaminoutINTEL
-
-Evaluate the single reference IME operation with streamin/streamout and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetSingleReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultSingleReferenceStreamoutINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Streamin Components must be the OpTypeAvcImeSingleReferenceStreaminINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5763

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

<id> Streamin Components

- ----------- - - - - - - - - - - - - - - - - - -

OpSubgroupAvcImeEvaluateWithDualReferenceStreaminoutINTEL
-
-Evaluate the dual reference IME operation with streamin/streamout and return its results. The IME payload must have been configured with OpSubgroupAvcImeSetDualReferenceINTEL.

-

Result Type must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcImePayloadINTEL type.

-

Streamin Components must be the OpTypeAvcImeDualReferenceStreaminINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

8

5764

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

<id> Streamin Components

-
-
-
Result type conversion instructions
-
-

These are optional instructions that may be called following the evaluation phase to convert IME results to MCE results and vice-versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeConvertToMceResultINTEL
-
-Convert the IME result to a generic MCE result.

-

Result Type must be the OpTypeAvcMceResultINTEL type.

-

Payload must be the OpTypeAvcImeResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5765

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToImeResultINTEL
-
-Convert the generic MCE result to an IME result.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Payload must be the OpTypeAvcMceResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5733

<id> Result Type

Result <id>

<id> Payload

-
-
-
Result processing instructions
-
-

These instructions are called following the evaluation phase to extract the various result components from an IME evaluation result.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeGetSingleReferenceStreaminINTEL
-
-Return the streamed out BMVs and distortions from the input result from a single reference streamout IME operation that can be used as streamin input for a subsequent IME operation.

-

Result Type must be the OpTypeAvcImeSingleReferenceStreaminINTEL type.

-

Payload must be the OpTypeAvcImeResultSingleReferenceStreamoutINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5766

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeGetDualReferenceStreaminINTEL
-
-Return the streamed out BMVs and distortions from the input result from a dual reference streamout IME operation that can be used as streamin input for a subsequent IME operation.

-

Result Type must be the OpTypeAvcImeDualReferenceStreaminINTEL type.

-

Payload must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5767

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeStripSingleReferenceStreamoutINTEL
-
-Strip out the single reference streamout BMVs and distortions from the streamout results and return the rest.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Payload must be the OpTypeAvcImeResultSingleReferenceStreamoutINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5768

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcImeStripDualReferenceStreamoutINTEL
-
-Strip out the dual reference streamout BMVs and distortions from the streamout results and return the rest.

-

Result Type must be the OpTypeAvcImeResultINTEL type.

-

Payload must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5769

<id> Result Type

Result <id>

<id> Payload

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL
-
-Get the packed motion vectors for the input major shape from the IME single reference streamout results.

-
-
-

Up to 4 packed MVs are returned, one per work-item. If the major shape is:

-
-
-
    -
  • -

    16x16, then one packed MV is returned by work-item 0

    -
  • -
  • -

    16x8, or 8x16, then two packed MVs are returned by work-items 0 and 1

    -
  • -
  • -

    8x8, then four packed MVs are returned by work-items 0 to 3.

    -
  • -
-
-
-
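
The mapping from major shape to the work-items that receive a packed MV can be sketched as a small lookup. This is illustrative only: the string labels are hypothetical stand-ins for the enumerant values defined in Section 3, Binary Form, and the first shape is taken to be 16x16.

```python
# Hypothetical labels for the inter major shapes; the real enumerant values
# are defined in Section 3, Binary Form, and are not reproduced here.
PACKED_MVS_PER_MAJOR_SHAPE = {
    "16x16": 1,  # one MV, returned by work-item 0
    "16x8": 2,   # two MVs, returned by work-items 0 and 1
    "8x16": 2,   # two MVs, returned by work-items 0 and 1
    "8x8": 4,    # four MVs, returned by work-items 0 to 3
}


def work_items_returning_mvs(major_shape):
    """Return the subgroup work-item indices that receive a packed MV."""
    if major_shape not in PACKED_MVS_PER_MAJOR_SHAPE:
        raise ValueError("unknown major shape: %r" % (major_shape,))
    return list(range(PACKED_MVS_PER_MAJOR_SHAPE[major_shape]))


if __name__ == "__main__":
    for shape in PACKED_MVS_PER_MAJOR_SHAPE:
        print(shape, work_items_returning_mvs(shape))
```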

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-
- -
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5770

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeMotionVectorsINTEL
-
-Get the packed motion vectors for the input major shape from the IME dual reference streamout results.

-
-
-

Up to 4 packed MVs are returned, one per work-item, in the same format as for motion vectors for single reference streamout, as described in OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL.

-
-
-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

-
-
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-
-
-

Direction must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major direction value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5773

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

<id> Direction

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeDistortionsINTEL
-
-Get the distortions for the input major shape from the IME single reference streamout results.

-
-
-

Up to 4 distortions are returned, one per work-item, in the same format as for motion vectors, as described in OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL.

-
-
-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-
- -
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5771

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeDistortionsINTEL
-
-Get the distortions for the input major shape from the IME dual reference streamout results.

-
-
-

Up to 4 distortions are returned, one per work-item, in the same format as for motion vectors for single reference streamout, as described in OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL.

-
-
-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

-
-
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-
-
-

Direction must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major direction value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5774

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

<id> Direction

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeReferenceIdsINTEL
-
-Get the reference identifiers for the input major shape from the IME single reference streamout results.

-
-
-

Up to 4 reference identifiers are returned, one per work-item, in the same format as for motion vectors, as described in OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
- -
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5772

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetStreamoutDualReferenceMajorShapeReferenceIdsINTEL
-
-Get the reference identifiers for the input major shape and direction from the IME dual reference streamout results.

-
-
-

Up to 4 reference identifiers are returned, one per work-item in the same format as for motion vectors for single reference streamout as described in OpSubgroupAvcImeGetStreamoutSingleReferenceMajorShapeMotionVectorsINTEL.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultDualReferenceStreamoutINTEL type.

-
-
-

Major Shape must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major shape value as per Section 3, Binary Form.

-
-
-

Direction must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is a valid inter macro-block major direction value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5775

<id> Result Type

Result <id>

<id> Payload

<id> Major Shape

<id> Direction

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetBorderReachedINTEL
-
-Get the bitmask indicating whether any border of the forward or backward reference image is reached by one or more MVs in the winning inter shape. The bitmask values are the inter border reached values defined in Section 3, Binary Form.

-
-
-

The search window must have been configured for a forward reference if image_select is set as AVC_ME_FRAME_FORWARD_INTEL, and for a backward reference if image_select is set as AVC_ME_FRAME_BACKWARD_INTEL.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Image Select must be an OpTypeInt with 8-bit Width and 0 Signedness, and must come from a constant instruction of an integer-type scalar whose value is either AVC_ME_FRAME_FORWARD_INTEL or AVC_ME_FRAME_BACKWARD_INTEL.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5776

<id> Result Type

Result <id>

<id> Image Select

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetTruncatedSearchIndicationINTEL
-
-Get the indication that the search operation was prevented from providing the lowest distortion solution due to the tighter constraints on the maximum number of MB motion vectors configured for the search operation.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5777

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetUnidirectionalEarlySearchTerminationINTEL
-
-Get the indication that the unidirectional search operation terminated early because the configured distortion threshold was met.

-
-
-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5778

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetWeightingPatternMinimumMotionVectorINTEL
-
-Get the 16x16 motion vector corresponding to the minimum 16x16 distortion when applying the traditional Z-order SAD weighting pattern.

-
-
- - - - - -
- - -
-

This can only be called if a SAD weighting pattern was set prior to evaluation using OpSubgroupAvcImeSetWeightedSadINTEL.

-
-
-
-
-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5779

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcImeGetWeightingPatternMinimumDistortionINTEL
-
-Get the minimum 16x16 distortion when applying the traditional Z-order SAD weighting pattern.

-
-
-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcImeResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5780

<id> Result Type

Result <id>

<id> Payload

-
-
-
-
-

REF instructions

-
-

These instructions are only guaranteed to work correctly if placed strictly within uniform control flow within the Subgroup execution scope. This ensures that if any invocation executes an instruction, all invocations will execute it. If placed elsewhere, the results are undefined.

-
-
-
Initialization instructions
-
-

These instructions create a properly initialized payload that can be further configured for evaluating REF operations. This is a required initial phase. A call to either OpSubgroupAvcFmeInitializeINTEL or OpSubgroupAvcBmeInitializeINTEL is required.

-
- ------------- - - - - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcFmeInitializeINTEL
-
-Return an initialized payload for a VME fractional motion estimation operation (FME).

-
-
-

The payload is initialized for progressive frame operations, and the cost configuration values and the miscellaneous property values are all initialized to zero. The cost configuration and the miscellaneous property configuration instructions must be used to override the initial configurations in the payload.

-
-
-

Result Type must be the OpTypeAvcRefPayloadINTEL type.

-
-
-

Src Coord must be a vector(2) of i16 values and 0 Signedness, and represents the 2D offset of the top left corner of the source MB in pixel units in the source image.

-
-
- - - - - -
- - -
-

If the source image is an interlaced scan image, then the bottom field lines are considered as logically overlapping with the top field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the Src Coord value.

-
-
-
-
-

Motion Vectors must be an OpTypeInt with 64-bit Width and 0 Signedness. It represents the BMVs returned by an IME in the same format as returned by OpSubgroupAvcMceGetMotionVectorsINTEL. The MVs are in QPEL units. The X and Y coordinates of each MV must be in the range [-2048.00, 2047.75]; otherwise the results are undefined.

-
-
- - - - - -
- - -
-

If this operand’s value is manually composed, all sub-block MVs must be replicated per its format for each partition. For example for 16x16 partition, all sub-block MVs must be replicated to the same MV, and for 8x8 partition, each 8x8 must have its respective sub-block MVs replicated.

-
-
-
-
-

Major Shapes must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterMajorShapeINTEL.

-
-
-

Minor Shapes must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterMinorShapeINTEL.

-
-
-

Directions must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterDirectionsINTEL.

-
-
-

Pixel Resolution must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values for it are either AVC_ME_SUBPIXEL_MODE_HPEL_INTEL or AVC_ME_SUBPIXEL_MODE_QPEL_INTEL.

-
-
-

Sad Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid sad adjustment value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

10

-

5781

<id> Result Type

Result <id>

<id> Src Coord

<id> Motion Vectors

<id> Major Shapes

<id> Minor Shapes

<id> Direction

<id> Pixel Resolution

<id> Sad Adjustment

- -------------- - - - - - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcBmeInitializeINTEL
-
-Return an initialized payload for a VME bidirectional motion estimation (BME) operation.

-
-
-

The payload is initialized for progressive frame operations, and the cost configuration values and the miscellaneous property values are all initialized to zero. The cost configuration and the miscellaneous property configuration instructions must be used to override the initial configurations in the payload.

-
-
-

Result Type must be the OpTypeAvcRefPayloadINTEL type.

-
-
-

Src Coord must be a vector(2) of i16 values and 0 Signedness, and represents the 2D offset of the top left corner of the source MB in pixel units in the source image.

-
-
- - - - - -
- - -
-

If the source image is an interlaced scan image, then the bottom field lines are considered as logically overlapping with the top field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the Src Coord value.

-
-
-
-
-

Motion Vectors must be an OpTypeInt with 64-bit Width and 0 Signedness. It represents the BMVs returned by an IME in the same format as returned by OpSubgroupAvcMceGetMotionVectorsINTEL. The MVs are in QPEL units. The X and Y coordinates of each MV must be in the range [-2048.00, 2047.75]; otherwise the results are undefined.

-
-
- - - - - -
- - -
-

If this operand’s value is manually composed, all sub-block MVs must be replicated per its format for each partition. For example for 16x16 partition, all sub-block MVs must be replicated to the same MV, and for 8x8 partition, each 8x8 must have its respective sub-block MVs replicated.

-
-
-
-
-

Major Shapes must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterMajorShapeINTEL.

-
-
-

Minor Shapes must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterMinorShapeINTEL.

-
-
-

Directions must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values and format for it are as per the return value of OpSubgroupAvcMceGetInterDirectionsINTEL.

-
-
-

Pixel Resolution must be an OpTypeInt with 8-bit Width and 0 Signedness. Legal values for it are either AVC_ME_SUBPIXEL_MODE_HPEL_INTEL or AVC_ME_SUBPIXEL_MODE_QPEL_INTEL.

-
-
-

Bidirectional Weight must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid bidirectional weight value as per Section 3, Binary Form.

-
-
-

Sad Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid sad adjustment value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

11

-

5782

<id> Result Type

Result <id>

<id> Src Coord

<id> Motion Vectors

<id> Major Shapes

<id> Minor Shapes

<id> Direction

<id> Pixel Resolution

<id> Bidirectional Weight

<id> Sad Adjustment

-
-
-
Payload type conversion instructions
-
-

These are optional instructions that may be called following the initialization phase to convert REF payloads to MCE payloads and vice-versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcRefConvertToMcePayloadINTEL
-
-Convert the REF payload to a generic MCE payload.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Payload must be the OpTypeAvcRefPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5783

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToRefPayloadINTEL
-
-Convert the generic MCE payload to a REF payload.

-

Result Type must be the OpTypeAvcRefPayloadINTEL type.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5734

<id> Result Type

Result <id>

<id> Payload

-
-
-
Miscellaneous property configuration instructions
-
-

These are optional instructions that may be called following the initialization phase to enable miscellaneous properties setting in the payload.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcRefSetBidirectionalMixDisableINTEL
-
-Update the input payload to disable a mix of bidirectional and unidirectional MVs in the result.

-

Default is to enable it.

-

Result Type must be the OpTypeAvcRefPayloadINTEL type.

-

Payload must be the OpTypeAvcRefPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5784

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcRefSetBilinearFilterEnableINTEL
-
-Update the input payload to enable bilinear filter interpolation instead of 4-tap filter interpolation. The default is 4-tap filter interpolation.

-
-
- - - - - -
- - -
-

This should not be called if the payload was initialized with integer pixel resolution.

-
-
-
-
-

Default is to enable it.

-
-
-

Result Type must be the OpTypeAvcRefPayloadINTEL type.

-
-
-

Payload must be the OpTypeAvcRefPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5785

<id> Result Type

Result <id>

<id> Payload

-
-
-
Evaluation instructions
-
-

These instructions perform the evaluation of the REF operation configured in the payload with a VME media sampler and return the results.

-
- --------- - - - - - - - - - - - - - - - -

OpSubgroupAvcRefEvaluateWithSingleReferenceINTEL
-
-Evaluate the basic REF operation with a single reference and return its results.

-

Result Type must be the OpTypeAvcRefResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcRefPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

6

5786

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcRefEvaluateWithDualReferenceINTEL
-
-Evaluate the basic REF operation with dual references and return its results.

-

Result Type must be the OpTypeAvcRefResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcRefPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5787

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcRefEvaluateWithMultiReferenceINTEL
-
-Evaluate the basic REF operation with multi references and return its results.

-
-
-

Result Type must be the OpTypeAvcRefResultINTEL type.

-
-
-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-
-
-

Packed Reference Ids must be an OpTypeInt with 32-bit Width and 0 Signedness, with the following bits specifying the values for the pair of reference images for each major partition.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0      Fwd reference block 0
7:4      Bwd reference block 0
11:8     Fwd reference block 1
15:12    Bwd reference block 1
19:16    Fwd reference block 2
23:20    Bwd reference block 2
27:24    Fwd reference block 3
31:28    Bwd reference block 3

-
-

A forward[backward] reference identifier value of n indicates the forward[backward] image from the nth pair of forward/backward reference images, with the value of n ranging from 0 to 15.

-
-
-

If the REF operation is configured with only forward reference images, then the values of the backward reference identifiers are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference identifier pairs must be replicated. For example, for a 16x16 block, all four pairs of reference identifiers must be replicated to the value of the first pair for block 0.

-
-
- - - - - -
- - -
-

The value for the Packed Reference Ids is typically obtained by calling OpSubgroupAvcMceGetInterReferenceIdsINTEL for the preceding IME operation’s result.
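
When the Packed Reference Ids value is composed by hand rather than obtained from a preceding IME result, the bit layout above can be sketched as follows. This is a minimal illustration under the stated layout (4 bits per identifier, forward then backward, blocks in Z order); the helper name `pack_reference_ids` is hypothetical.

```python
def pack_reference_ids(pairs):
    """Pack four (fwd, bwd) reference identifier pairs into a 32-bit value.

    pairs: sequence of four (fwd, bwd) tuples, one per block in Z order,
    with each identifier in the range 0..15.
    """
    if len(pairs) != 4:
        raise ValueError("exactly four (fwd, bwd) pairs are required")
    packed = 0
    for block, (fwd, bwd) in enumerate(pairs):
        for ref_id in (fwd, bwd):
            if not 0 <= ref_id <= 15:
                raise ValueError("reference identifiers must be in 0..15")
        packed |= fwd << (8 * block)      # bits 3:0, 11:8, 19:16, 27:24
        packed |= bwd << (8 * block + 4)  # bits 7:4, 15:12, 23:20, 31:28
    return packed


def replicate_for_16x16(fwd, bwd):
    """For a 16x16 block, replicate the block 0 pair to all four blocks."""
    return pack_reference_ids([(fwd, bwd)] * 4)


if __name__ == "__main__":
    print(hex(replicate_for_16x16(1, 2)))
```

Replicating one pair across all four blocks mirrors the 16x16 rule stated above; for 8x8 partitions each block would carry its own pair.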

-
-
-
-
-

Payload must be the OpTypeAvcRefPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5788

<id> Result Type

Result <id>

<id> Src Image

<id> Packed Reference Ids

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcRefEvaluateWithMultiReferenceInterlacedINTEL
-
-Evaluate the basic REF operation with multi references and return its results. This is used for interlaced source and reference images.

-
-
-

Result Type must be the OpTypeAvcRefResultINTEL type.

-
-
-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-
-
-

Packed Reference Ids must be an OpTypeInt with 32-bit Width and 0 Signedness, with the following bits specifying the values for the pair of reference images for each major partition.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0      Fwd reference block 0
7:4      Bwd reference block 0
11:8     Fwd reference block 1
15:12    Bwd reference block 1
19:16    Fwd reference block 2
23:20    Bwd reference block 2
27:24    Fwd reference block 3
31:28    Bwd reference block 3

-
-

A forward[backward] reference identifier value of n indicates the forward[backward] image from the nth pair of forward/backward reference images, with the value of n ranging from 0 to 15.

-
-
-

If the REF operation is configured with only forward reference images, then the values of the backward reference identifiers are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference identifier pairs must be replicated. For example, for a 16x16 block, all four pairs of reference identifiers must be replicated to the value of the first pair for block 0.

-
-
- - - - - -
- - -
-

The value for the Packed Reference Ids is typically obtained by calling OpSubgroupAvcMceGetInterReferenceIdsINTEL for the preceding IME operation’s result.

-
-
-
-
-

Packed Reference Field Polarities must be an OpTypeInt with 8-bit Width and 0 Signedness. Reference field polarities for forward and backward reference images are specified for each of the allowed major partitions using it, with the following bits specifying the reference field polarities for the major partitions.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

0        Fwd reference block 0
1        Fwd reference block 1
2        Fwd reference block 2
3        Fwd reference block 3
4        Bwd reference block 0
5        Bwd reference block 1
6        Bwd reference block 2
7        Bwd reference block 3

-
-

If the dual-reference evaluation instructions are not used, then the values of the backward reference field polarities are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference field polarities are replicated. For example, for a 16x16 block, all four pairs of reference field polarities are replicated to the value of the first pair for block 0.
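
The 8-bit Packed Reference Field Polarities layout differs from the reference identifier layout: the four forward bits come first, followed by the four backward bits. A minimal sketch of that packing is shown below. The helper name `pack_field_polarities` is hypothetical, and the meaning of a 0 or 1 polarity bit is not assumed here.

```python
def pack_field_polarities(fwd, bwd=(0, 0, 0, 0)):
    """Pack per-block reference field polarity bits into an 8-bit value.

    fwd, bwd: sequences of four bits (0 or 1), one per block in Z order.
    Bits 0..3 hold the forward polarities, bits 4..7 the backward ones.
    """
    if len(fwd) != 4 or len(bwd) != 4:
        raise ValueError("four forward and four backward bits are required")
    packed = 0
    for block in range(4):
        for bit in (fwd[block], bwd[block]):
            if bit not in (0, 1):
                raise ValueError("polarity bits must be 0 or 1")
        packed |= fwd[block] << block        # bits 0..3
        packed |= bwd[block] << (4 + block)  # bits 4..7
    return packed


if __name__ == "__main__":
    print(bin(pack_field_polarities((1, 0, 0, 1), (0, 1, 0, 0))))
```

The backward bits default to zero because, as noted above, they are not used when the dual-reference evaluation instructions are not used.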

-
-
-

Payload must be the OpTypeAvcRefPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

7

-

5789

<id> Result Type

Result <id>

<id> Src Image

<id> Packed Reference Ids

<id> Packed Reference Field Polarities

<id> Payload

-
-
-
Result type conversion instructions
-
-

These are optional instructions that may be called following the evaluation phase to convert REF results to MCE results and vice-versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcRefConvertToMceResultINTEL
-
-Convert the REF result to a generic MCE result.

-

Result Type must be the OpTypeAvcMceResultINTEL type.

-

Payload must be the OpTypeAvcRefResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5790

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToRefResultINTEL
-
-Convert the generic MCE result to a REF result.

-

Result Type must be the OpTypeAvcRefResultINTEL type.

-

Payload must be the OpTypeAvcMceResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5735

<id> Result Type

Result <id>

<id> Payload

-
-
-
-
-

SIC instructions

-
-

These instructions are only guaranteed to work correctly if placed strictly within uniform control flow within the Subgroup execution scope. This ensures that if any invocation executes an instruction, all invocations will execute it. If placed elsewhere, the results are undefined.

-
-
-
Initialization instructions
-
-

These instructions create a properly initialized payload that can be further configured for evaluating SIC operations. This is a required initial phase.

-
- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcSicInitializeINTEL
-
-Return an initialized payload for a VME SIC operation.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Src Coord must be a vector(2) of i16 values and 0 Signedness. It represents the 2D offset of the top left corner of the source MB in pixel units in the source image. Source MBs at the image borders are allowed to be partial, but the top-left corner must be within the image.

-
-
- - - - - -
- - -
-
    -
  1. -

    If the source image is an interlaced scan image, then the bottom field lines are considered as logically overlapping with the top field lines (i.e. the top field MBs are considered as logically overlapping with the bottom MBs) for the purpose of specifying the Src Coord value.

    -
  2. -
  3. -

    If the SIC operation is being configured for chroma based intra estimation, then the x and y coordinates of Src Coord must be multiples of 2.

    -
  4. -
-
-
-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5791

<id> Result Type

Result <id>

<id> Src Coord

-
-
-
Configuration instructions
- ------------ - - - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicConfigureSkcINTEL
-
-Configure the SIC payload for (uni or bi-directional) skip checks.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Skip Block Partition Type must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to one of the specified partition mask values as per Section 3, Binary Form.

-
-
-

Skip Motion Vector Mask must be an OpTypeInt with 32-bit Width and 0 Signedness. Legal values for it can be composed using the OpBitwiseOr instruction from the values defined for it as per Section 3, Binary Form; both unidirectional and bidirectional skip vectors can be specified uniquely for each major partition (16x16 or 8x8) by an appropriate selection of the skip motion vector mask enumeration values. If the 16x16 Skip Block Partition Type is specified, then only the 16x16 enumeration values may be used, else only the 8x8 enumeration values may be used.

-
-
- - - - - -
- - -
-

The instruction OpSubgroupAvcSicGetMotionVectorMaskINTEL may be used to set this operand’s value.

-
-
-
-
-

Motion Vectors must be an OpTypeInt with 64-bit Width and 0 Signedness, and specifies the input packed BMVs. Either the forward or the backward motion vector is ignored if the setting in Skip Motion Vector Mask is backward or forward respectively. If the setting is bidirectional, then both the forward and backward motion vectors will be used. If the Skip Block Partition Type is 16x16, work-item 0 in the subgroup provides the BMV, and if the Skip Block Partition Type is 8x8, work-items 0 to 3 in the subgroup provide the four BMVs. The MVs are in QPEL units. The X and Y coordinates of each MV must be in the ranges [-2048.00, 2047.75] and [-512.00, 511.75] respectively; otherwise the results are undefined.
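
The coordinate limits above are given in pixels, while the MVs themselves are in QPEL (quarter-pixel) units, so each pixel corresponds to 4 units. The following sketch checks a skip-check MV against those limits before it is packed. The helper names are hypothetical, and the packing of the 64-bit BMV itself is deliberately not shown because its format is not described in this passage.

```python
QPEL_PER_PIXEL = 4  # quarter-pixel units per pixel

# Limits from the text, converted to QPEL units.
X_MIN_QPEL, X_MAX_QPEL = -2048 * QPEL_PER_PIXEL, 2047 * QPEL_PER_PIXEL + 3
Y_MIN_QPEL, Y_MAX_QPEL = -512 * QPEL_PER_PIXEL, 511 * QPEL_PER_PIXEL + 3


def to_qpel(pixels):
    """Convert a pixel offset to QPEL units; it must be a multiple of 0.25."""
    scaled = pixels * QPEL_PER_PIXEL
    if scaled != int(scaled):
        raise ValueError("offset is not a multiple of a quarter pixel")
    return int(scaled)


def skc_mv_in_range(x_pixels, y_pixels):
    """Return True if the MV lies within the documented X and Y ranges."""
    x, y = to_qpel(x_pixels), to_qpel(y_pixels)
    return X_MIN_QPEL <= x <= X_MAX_QPEL and Y_MIN_QPEL <= y <= Y_MAX_QPEL


if __name__ == "__main__":
    print(skc_mv_in_range(2047.75, 511.75))
    print(skc_mv_in_range(0.0, -512.25))
```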

-
-
-

Bidirectional Weight must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid bidirectional weight value as per Section 3, Binary Form. If the setting is unidirectional, then this parameter value is ignored and can be set to the value 0.

-
-
-

Sad Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid sad adjustment value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

9

-

5792

<id> Result Type

Result <id>

<id> Skip Block Partition Type

<id> Skip Motion Vector Mask

<id> Motion Vectors

<id> Bidirectional Weight

<id> Sad Adjustment

<id> Payload
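The motion vector range rule above can be checked before a payload is configured. The following is a minimal sketch, assuming the MVs are already available as separate signed integers in QPEL units; the helper name `mv_in_range` is hypothetical, and the packing of a BMV into the 64-bit operand is not shown because this section does not define it.

```python
# Sketch only: range check for motion vectors expressed in quarter-pixel (QPEL) units.
# mv_in_range is a hypothetical helper, not part of any SPIR-V tooling.
QPEL_PER_PIXEL = 4

# Pixel ranges quoted in the text, converted to integer QPEL units.
X_MIN_QPEL = int(-2048.00 * QPEL_PER_PIXEL)
X_MAX_QPEL = int(2047.75 * QPEL_PER_PIXEL)
Y_MIN_QPEL = int(-512.00 * QPEL_PER_PIXEL)
Y_MAX_QPEL = int(511.75 * QPEL_PER_PIXEL)


def mv_in_range(x_qpel: int, y_qpel: int) -> bool:
    """Return True if an MV (in QPEL units) lies inside the documented range."""
    return (X_MIN_QPEL <= x_qpel <= X_MAX_QPEL
            and Y_MIN_QPEL <= y_qpel <= Y_MAX_QPEL)
```

A caller could reject or clamp out-of-range vectors up front, since the results are otherwise undefined.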

- -------------- - - - - - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicConfigureIpeLumaINTEL
-
-Return an initialized payload for a VME luma intra prediction estimation (IPE) operation.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Luma Intra Partition Mask must be an OpTypeInt with 8-bit Width and 0 Signedness, which can be composed from their respective values as per Section 3, Binary Form using the OpBitwiseAnd instruction.

-
-
-

Intra Neighbour Availabilty must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid intra neighbour availability value as per Section 3, Binary Form.

-
-
-

Left Edge Luma Pixels, Upper Left Corner Luma Pixel, Upper Edge Luma Pixels, Upper Right Edge Luma Pixels must be an OpTypeInt with 8-bit Width and 0 Signedness, and specify the neighbor edge pixels for the left, top-left corner, top, and top-right edges, with each work-item providing one pixel value. These pixel values are used to perform the intra mode estimation.

-
-
-

For the left and top edge pixels, successive subgroup work-items 0 to 15 provide the successive edge pixels. For the top-right edge, successive work-items 0 to 7 provide the successive edge pixels; the pixel values in work-items 8 to 15 are ignored. The top-left corner pixel is a uniform pixel value with each work-item providing the same corner pixel.

-
-
-

Sad Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid sad adjustment value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-

11

-

5793

<id> Result Type

Result <id>

<id> Luma Intra Partition Mask

<id> Intra Neighbour Availabilty

<id> Left Edge Luma Pixels

<id> Upper Left Corner Luma Pixel

<id> Upper Edge Luma Pixels

<id> Upper Right Edge Luma Pixels

<id> Sad Adjustment

<id> Payload

- ----------------- - - - - - - - - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicConfigureIpeLumaChromaINTEL
-
-Return an initialized payload for a VME luma and chroma intra prediction estimation (IPE) operation.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Luma Intra Partition Mask must be an OpTypeInt with 8-bit Width and 0 Signedness, which can be composed from their respective values as per Section 3, Binary Form using the OpBitwiseAnd instruction.

-
-
-

Intra Neighbour Availabilty must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid intra neighbour availability value as per Section 3, Binary Form.

-
-
-

Left Edge Luma Pixels, Upper Left Corner Luma Pixel, Upper Edge Luma Pixels, Upper Right Edge Luma Pixels, Left Edge Chroma Pixels, Upper Left Corner Chroma Pixel, Upper Edge Chroma Pixels must be an OpTypeInt with 8-bit Width and 0 Signedness, and specify the neighbor luma and chroma edge pixels for the left, top-left corner, top, and top-right (luma only) edges, with each work-item providing one pixel value. These pixel values are used to perform the intra mode estimation.

-
-
-

For the left and top edge pixels, successive subgroup work-items 0 to 15 provide the successive edge pixels. For the top-right edge, successive work-items 0 to 7 provide the successive edge pixels; the pixel values in work-items 8 to 15 are ignored. The top-left corner pixel is a uniform pixel value with each work-item providing the same corner pixel.

-
-
-

For the left and top chroma CbCr pixels, successive subgroup work-items 0 to 7 provide the successive CbCr pixels; the pixel values in work-items 8 to 15 are ignored. The top-left corner pixel is a uniform CbCr pixel value with each work-item providing the same corner CbCr pixel.

-
-
-

Sad Adjustment must be an OpTypeInt with 8-bit Width and 0 Signedness that must evaluate to a valid sad adjustment value as per Section 3, Binary Form.

-

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

-

14

-

5794

<id> Result Type

Result <id>

<id> Luma Intra Partition Mask

<id> Intra Neighbour Availabilty

<id> Left Edge Luma Pixels

<id> Upper Left Corner Luma Pixel

<id> Upper Edge Luma Pixels

<id> Upper Right Edge Luma Pixels

<id> Left Edge Chroma Pixels

<id> Upper Left Corner Chroma Pixel

<id> Upper Edge Chroma Pixels

<id> Sad Adjustment

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcSicGetMotionVectorMaskINTEL
-
-Compose the value for the input argument Skip Motion Vector Mask for the input Skip Block Partition Type and direction.

-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-

Skip Block Partition Type must be an OpTypeInt with 32-bit Width and 0 Signedness, which must evaluate to valid partition mask values for either 16x16 or 8x8 as per Section 3, Binary Form.

-

Direction must be an OpTypeInt with 8-bit Width and 0 Signedness. It is a bit field with the directions for the four 8x8 sub-partitions in traditional Z order, or for only the 16x16 partition. Two bits are reserved for each of the four sub-partitions in row-major order. The 2-bit values are as per the inter macro-block major direction values. If the Skip Block Partition Type indicates a 16x16 shape, then only the first 2 bits contain the direction, and the other bits must be zeroed.

Capability:
-SubgroupAvcMotionEstimationINTEL

5

5795

<id> Result Type

Result <id>

<id> Skip Block Partition Type

<id> Direction
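The Direction bit field described above can be sketched as follows. This is an illustration under stated assumptions: sub-partition 0 is assumed to occupy the two least significant bits, the helper name `pack_direction` is hypothetical, and the numeric direction codes are placeholders for the inter macro-block major direction values defined in Section 3, Binary Form.

```python
def pack_direction(directions, is_16x16):
    """Pack 2-bit direction codes into the 8-bit Direction operand.

    directions: one code for a 16x16 partition, or four codes for 8x8.
    ASSUMPTION: sub-partition 0 sits in the lowest two bits. The codes are
    placeholders; real values come from the Binary Form enumeration.
    """
    expected = 1 if is_16x16 else 4
    if len(directions) != expected:
        raise ValueError(f"expected {expected} direction code(s)")
    packed = 0
    for index, code in enumerate(directions):
        if not 0 <= code <= 3:
            raise ValueError("each direction code must fit in 2 bits")
        packed |= code << (2 * index)
    # For 16x16 only the first two bits are set; the remaining bits stay zero.
    return packed
```

For a 16x16 shape the function leaves bits 2 to 7 cleared, which matches the requirement that the other bits be zeroed.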

-
-
-
Payload type conversion instructions
-
-

These are optional instructions that may be called following the search configuration phase to convert SIC payloads to MCE payloads and vice versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicConvertToMcePayloadINTEL
-
-Convert the SIC payload to a generic MCE payload.

-

Result Type must be the OpTypeAvcMcePayloadINTEL type.

-

Payload must be the OpTypeAvcSicPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5796

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToSicPayloadINTEL
-
-Convert the generic MCE payload to a SIC payload.

-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-

Payload must be the OpTypeAvcMcePayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5736

<id> Result Type

Result <id>

<id> Payload

-
-
-
Intra shape cost configuration instructions
- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetIntraLumaShapePenaltyINTEL
-
-Set the luma shape penalty for intra estimation.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Packed Shape Penalty must be a 32-bit scalar integer type. It is treated as an unsigned value. The following bits specify the shape penalty in U4U4 format:

-
- ---- - - - - - - - - - - - - - - - - - - -

7:0      Must be zero
15:8     16x16
23:16    8x8
31:24    4x4

-
-

The U4U4 decoded integer values for each of the bytes must fit within 12 bits.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5797

<id> Result Type

Result <id>

<id> Packed Shape Penalty

<id> Payload
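The Packed Shape Penalty layout can be sketched as below. The U4U4 encoding is not defined in this section, so the decoder here is an assumption (high nibble as a shift, low nibble as a base, decoded as `base << shift`); both function names are hypothetical.

```python
def decode_u4u4(byte):
    """Decode a U4U4 byte. ASSUMPTION: high nibble = shift, low nibble = base."""
    return (byte & 0xF) << (byte >> 4)


def pack_shape_penalty(p16x16, p8x8, p4x4):
    """Pack three U4U4 bytes: bits 15:8 = 16x16, 23:16 = 8x8, 31:24 = 4x4."""
    for byte in (p16x16, p8x8, p4x4):
        if not 0 <= byte <= 0xFF:
            raise ValueError("each penalty must be one byte")
        if decode_u4u4(byte) >= 1 << 12:
            raise ValueError("decoded penalty must fit within 12 bits")
    # Bits 7:0 are left as zero, as the table requires.
    return (p16x16 << 8) | (p8x8 << 16) | (p4x4 << 24)
```

If the actual U4U4 definition differs from the assumed one, only `decode_u4u4` needs to change; the bit placement follows the table directly.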

-
-
-
Intra mode cost configuration instructions
- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetIntraLumaModeCostFunctionINTEL
-
-Update the payload to configure the luma mode cost function to be applied to the computed luma mode for SIC intra operations.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Luma Mode Penalty specifies the penalty to be applied to the estimated luma mode if it differs from its predicted luma mode (based on its neighbor intra modes). It is specified in U4U4 format and must fit in 10 bits.

-
-
-

Luma Packed Neighbor Modes specifies the values of the already computed top and left neighbor modes for the bordering 4x4 blocks, with the 4x4 blocks numbered in the traditional Z-order as shown below.

-
-
-
-
0 1 4 5
-2 3 6 7
-8 9 C D
-A B E F
-
-
-
-

The following bits specify the neighbor modes.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0      Left neighbor block 5
7:4      Left neighbor block 7
11:8     Left neighbor block D
15:12    Left neighbor block F
19:16    Top neighbor block A
23:20    Top neighbor block B
27:24    Top neighbor block E
31:28    Top neighbor block F

-
-
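The neighbor mode bit layout can be illustrated with a small packing helper. This is a sketch: the name `pack_neighbor_modes` is hypothetical, the input order simply follows the bit ranges listed above from least to most significant, and the mode values themselves are whatever valid intra luma mode codes the Binary Form section defines.

```python
# Field order follows the bit ranges above, lowest bits first.
NEIGHBOR_FIELDS = [
    ("left", "5"), ("left", "7"), ("left", "D"), ("left", "F"),
    ("top", "A"), ("top", "B"), ("top", "E"), ("top", "F"),
]


def pack_neighbor_modes(modes):
    """Pack eight 4-bit neighbor modes, in NEIGHBOR_FIELDS order, into 32 bits."""
    if len(modes) != len(NEIGHBOR_FIELDS):
        raise ValueError("expected eight neighbor modes")
    packed = 0
    for index, mode in enumerate(modes):
        if not 0 <= mode <= 0xF:
            raise ValueError("each mode must fit in 4 bits")
        packed |= mode << (4 * index)
    return packed
```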

Luma Packed Non Dc Penalty specifies the penalty to be applied for any computed non-DC luma mode for each of the 16x16, 8x8, and 4x4 shapes, with the following bits specifying the penalties.

-
- ---- - - - - - - - - - - - - - - - - - - -

7:0      Intra16x16
15:8     Intra8x8
23:16    Intra4x4
31:24    Must be zero

-
-

The component byte values are specified in 8-bit integer format.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

The intra distortion for each intra luma block can be described by the following formulas:

-
-
-
-
Intra_4x4_SAD(or Haar) +
-Luma_Shape_Penalty_4x4 +
-Luma_Non_Dc_4x4_Penalty(if not  DC) +
-Luma_Mode_Penalty(if computed mode is not
-the same predicted mode from neighbor modes)
-
-Intra_8x8_SAD(or Haar) +
-Luma_Shape_Penalty_8x8 +
-Luma_Non_Dc_8x8_Penalty(if not  DC) +
-Luma_Mode_Penalty(if computed mode is not
-the same predicted mode from neighbor modes)
-
-Intra_16x16_SAD(or Haar) +
-Luma_Shape_Penalty_16x16 +
-Luma_Non_Dc_16x16_Penalty(if not DC)
-
-
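The three distortion formulas above share one structure, which can be sketched as a single function. This is an illustration only: the function name and parameters are hypothetical, and all penalties are assumed to be already decoded into plain integers.

```python
def intra_luma_distortion(sad, shape_penalty, non_dc_penalty, mode_penalty,
                          is_dc, matches_predicted_mode, is_16x16=False):
    """Combine SAD (or Haar) with the penalties, following the formulas above.

    All inputs are assumed to be decoded integer values (hypothetical API).
    """
    cost = sad + shape_penalty
    if not is_dc:
        cost += non_dc_penalty      # applied only to non-DC modes
    if not is_16x16 and not matches_predicted_mode:
        cost += mode_penalty        # the 16x16 formula has no mode penalty term
    return cost
```

Note that the mode penalty is skipped for 16x16 because the 16x16 formula lists only the SAD, shape penalty, and non-DC penalty terms.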

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-

7

-

5798

<id> Result Type

Result <id>

<id> Luma Mode Penalty

<id> Luma Packed Neighbor Modes

<id> Luma Packed Non Dc Penalty

<id> Payload

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetIntraChromaModeCostFunctionINTEL
-
-Update the payload to configure the intra chroma mode cost function by specifying the penalty to be applied to the computed chroma mode for SIC intra operations.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Chroma Mode Base Penalty is the base penalty to be applied to the computed intra chroma modes. This penalty is in U4U4 format.

-
-
-

The U4U4 decoded integer value must fit in 12 bits.

-
-
-

The base penalty is scaled based on the computed mode as defined below.

-
- ---- - - - - - - - - - - - - - - - - - - -

DC       0x
HORZ     1x
VERT     1x
PLANE    2x

-
-

The component byte values are specified in 8-bit integer format.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

The intra distortion for each intra 8x8 chroma block can be described by the following -formula:

-
-
-
-
Distortion =
-    SAD(or Haar) +
-    Chroma_Mode_Base_Penalty
-    (scaled based on computed mode)
-
-
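The chroma distortion formula and its mode scaling can be sketched as follows. The function name is hypothetical, the mode is passed as a string purely for illustration, and the base penalty is assumed to be the already decoded integer value rather than the raw U4U4 byte.

```python
# Scale factors taken from the mode table in this instruction's description.
CHROMA_MODE_SCALE = {"DC": 0, "HORZ": 1, "VERT": 1, "PLANE": 2}


def intra_chroma_distortion(sad, base_penalty, mode):
    """Distortion = SAD (or Haar) + base penalty scaled by the computed mode."""
    return sad + base_penalty * CHROMA_MODE_SCALE[mode]
```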

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

-

7

-

5799

<id> Result Type

Result <id>

<id> Chroma Mode Base Penalty

<id> Payload

-
-
-
Miscellaneous property configuration instructions
-
-

These are optional instructions that may be called following the configuration phase to enable miscellaneous properties setting in the payload.

-
- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetBilinearFilterEnableINTEL
-
-Update the input payload to enable bilinear filter interpolation instead of 4-tap filter interpolation. Default is 4-tap filter interpolation.

-
-
- - - - - -
- - -
-

This should not be called if the payload was initialized with integer pixel resolution.

-
-
-
-
-

Default is to enable it.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

4

-

5800

<id> Result Type

Result <id>

<id> Payload

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetSkcForwardTransformEnableINTEL
-
-Enable skip check forward transform with the specified SAD coefficient thresholds in the frequency domain to approximate the effects of forward quantization.

-
-
-

The skip decision will be enhanced to include an accurate AVC forward transform for skip estimation. This feature is in addition to the previous SAD or HAAR skip estimation. The results of the forward transform are compared one coefficient at a time against a user-specified threshold, in the input argument Packed Sad Coefficients, to emulate quantization’s zeroing effect. The user is returned the count of coefficients that exceeded their threshold along with a sum of the amount exceeded, both grouped at the 8x8 block level (i.e. for each 8x8 block).

-
-
-

This is valid only for SKC operations.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Packed Sad Coefficients must be an OpTypeInt with 64-bit Width and 0 Signedness, and specifies the SAD coefficient threshold matrix. The SAD coefficient threshold matrix for a 4x4 transform, as shown in the table below, is packed into a 64-bit integer. The low 16 bits contain the larger DC threshold. The thresholds for the remaining 6 AC coefficients, in order of increasing frequency, are provided by the successive 8-bit bit ranges.

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - -

0 (DC)    1 (AC)    2 (AC)    3 (AC)
1 (AC)    2 (AC)    3 (AC)    4 (AC)
2 (AC)    3 (AC)    4 (AC)    5 (AC)
3 (AC)    4 (AC)    5 (AC)    6 (AC)

-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5801

<id> Result Type

Result <id>

<id> Packed Sad Coefficients

<id> Payload
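The Packed Sad Coefficients layout can be sketched as below. Two points are assumptions: the AC thresholds are taken to occupy ascending 8-bit ranges starting at bit 16 in order of increasing frequency, and both function names are hypothetical. The matrix helper reflects the 4x4 table, where the entry at row r and column c uses threshold index r + c.

```python
def pack_sad_coefficients(dc_threshold, ac_thresholds):
    """Pack one 16-bit DC threshold and six 8-bit AC thresholds into 64 bits.

    ASSUMPTION: AC threshold k (1..6) occupies bits [16 + 8*(k-1), 23 + 8*(k-1)].
    """
    if not 0 <= dc_threshold <= 0xFFFF:
        raise ValueError("DC threshold must fit in 16 bits")
    if len(ac_thresholds) != 6:
        raise ValueError("expected six AC thresholds")
    packed = dc_threshold
    for index, threshold in enumerate(ac_thresholds):
        if not 0 <= threshold <= 0xFF:
            raise ValueError("each AC threshold must fit in 8 bits")
        packed |= threshold << (16 + 8 * index)
    return packed


def threshold_matrix(dc_threshold, ac_thresholds):
    """Expand the seven thresholds into the 4x4 matrix shown in the table."""
    values = [dc_threshold] + list(ac_thresholds)
    return [[values[row + col] for col in range(4)] for row in range(4)]
```

The DC field uses all of the low 16 bits, so 16 + 6 * 8 = 64 bits are consumed with no padding.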

- -------- - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicSetBlockBasedRawSkipSadINTEL
-
-The raw skip SAD computed during the evaluation phase will be the maximal SAD of individual 4x4 (or 8x8) blocks, instead of the sum of all individual 4x4 block SADs of the MB.

-
-
-

It is valid to call this function only if the payload is configured for a skip check operation by a prior call to OpSubgroupAvcSicConfigureSkcINTEL.

-
-
-

Result Type must be the OpTypeAvcSicPayloadINTEL type.

-
-
-

Block Based Skip Type must be an OpTypeInt with 8-bit Width and 0 Signedness, that must evaluate to a valid block based skip type value as per Section 3, Binary Form.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

5

-

5802

<id> Result Type

Result <id>

<id> Block Based Skip Type

<id> Payload

-
-
-
Evaluation instructions
-
-

These instructions perform the evaluation of the SIC operation configured in the payload with a VME media sampler and return the results.

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupAvcSicEvaluateIPEINTEL
-
-Evaluate the SIC IPE operation and return its results.

-

Result Type must be the OpTypeAvcSicResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Payload must be the OpTypeAvcSicPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

5

5803

<id> Result Type

Result <id>

<id> Src Image

<id> Payload

- --------- - - - - - - - - - - - - - - - -

OpSubgroupAvcSicEvaluateWithSingleReferenceINTEL
-
-Evaluate the SIC operation with single reference and return its results.

-

Result Type must be the OpTypeAvcSicResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Payload must be the OpTypeAvcSicPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

6

5804

<id> Result Type

Result <id>

<id> Src Image

<id> Ref Image

<id> Payload

- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupAvcSicEvaluateWithDualReferenceINTEL
-
-Evaluate the SIC operation with dual references and return its results.

-

Result Type must be the OpTypeAvcSicResultINTEL type.

-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-

Fwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a forward reference image.

-

Bwd Ref Image is an object whose type is an OpTypeVmeImageINTEL, and specifies a backward reference image.

-

Payload must be the OpTypeAvcSicPayloadINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

7

5805

<id> Result Type

Result <id>

<id> Src Image

<id> Fwd Ref Image

<id> Bwd Ref Image

<id> Payload

- --------- - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicEvaluateWithMultiReferenceINTEL
-
-Evaluate the SIC operation with multiple references and return its results.

-
-
-

Result Type must be the OpTypeAvcSicResultINTEL type.

-
-
-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-
-
-

Packed Reference Ids must be an OpTypeInt with 32-bit Width and 0 Signedness, with the following bits specifying the values for the pair of reference images for each major partition.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0      Fwd reference block 0
7:4      Bwd reference block 0
11:8     Fwd reference block 1
15:12    Bwd reference block 1
19:16    Fwd reference block 2
23:20    Bwd reference block 2
27:24    Fwd reference block 3
31:28    Bwd reference block 3

-
-

A forward (or backward) reference identifier value of n indicates the forward (or backward) image from the nth pair of forward/backward reference images, with the value of n ranging from 0 to 15.

-
-
-

If the SIC operation is configured with only forward reference images, then the values of the backward reference identifiers are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference identifier pairs must be replicated. For example, for a 16x16 block, all four pairs of reference identifiers must be replicated to the value of the first pair for block 0.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

6

-

5806

<id> Result Type

Result <id>

<id> Src Image

<id> Packed Reference Ids

<id> Payload
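The Packed Reference Ids layout and the replication rule for larger blocks can be sketched as follows. The function names are hypothetical; the bit positions follow the table in the description, with the forward identifier in the low nibble and the backward identifier in the high nibble of each block's byte.

```python
def pack_reference_ids(pairs):
    """Pack four (fwd, bwd) reference id pairs, blocks 0..3 in Z order."""
    if len(pairs) != 4:
        raise ValueError("expected four (fwd, bwd) pairs")
    packed = 0
    for block, (fwd, bwd) in enumerate(pairs):
        for ref in (fwd, bwd):
            if not 0 <= ref <= 15:
                raise ValueError("reference ids range from 0 to 15")
        packed |= fwd << (8 * block)        # bits 3:0, 11:8, 19:16, 27:24
        packed |= bwd << (8 * block + 4)    # bits 7:4, 15:12, 23:20, 31:28
    return packed


def pack_reference_ids_16x16(fwd, bwd):
    """For a 16x16 block, replicate the block 0 pair into all four slots."""
    return pack_reference_ids([(fwd, bwd)] * 4)
```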

- ---------- - - - - - - - - - - - - - - - - -
-

OpSubgroupAvcSicEvaluateWithMultiReferenceInterlacedINTEL
-
-Evaluate the SIC operation with multiple references and return its results. This is used for interlaced source and reference images.

-
-
-

Result Type must be the OpTypeAvcSicResultINTEL type.

-
-
-

Src Image is an object whose type is an OpTypeVmeImageINTEL, and specifies the source image.

-
-
-

Packed Reference Ids must be an OpTypeInt with 32-bit Width and 0 Signedness, with the following bits specifying the values for the pair of reference images for each major partition.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

3:0      Fwd reference block 0
7:4      Bwd reference block 0
11:8     Fwd reference block 1
15:12    Bwd reference block 1
19:16    Fwd reference block 2
23:20    Bwd reference block 2
27:24    Fwd reference block 3
31:28    Bwd reference block 3

-
-

A forward (or backward) reference identifier value of n indicates the forward (or backward) image from the nth pair of forward/backward reference images, with the value of n ranging from 0 to 15.

-
-
-

If the SIC operation is configured with only forward reference images, then the values of the backward reference identifiers are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference identifier pairs must be replicated. For example, for a 16x16 block, all four pairs of reference identifiers must be replicated to the value of the first pair for block 0.

-
-
- - - - - -
- - -
-

The value for the Packed Reference Ids is typically obtained by calling OpSubgroupAvcMceGetInterReferenceIdsINTEL for the preceding IME operation’s result.

-
-
-
-
-

Packed Reference Field Polarities must be an OpTypeInt with 8-bit Width and 0 Signedness. Reference field polarities for forward and backward reference images are specified for each of the allowed major partitions using it, with the following bits specifying the reference field polarities for the major partitions.

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

0    Fwd reference block 0
1    Fwd reference block 1
2    Fwd reference block 2
3    Fwd reference block 3
4    Bwd reference block 0
5    Bwd reference block 1
6    Bwd reference block 2
7    Bwd reference block 3

-
-

If the dual-reference evaluation instructions are not used, then the values of the backward reference field polarities are not used.

-
-
-

The blocks are numbered using the traditional Z order. For larger block sizes, the sub-block reference field polarities are replicated. For example, for a 16x16 block, all four pairs of reference field polarities are replicated to the value of the first pair for block 0.

-
-
-

Payload must be the OpTypeAvcSicPayloadINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL

-

7

-

5807

<id> Result Type

Result <id>

<id> Src Image

<id> Packed Reference Ids

<id> Packed Reference Field Polarities

<id> Payload
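The Packed Reference Field Polarities layout differs from the reference id layout: all four forward bits come first, followed by the four backward bits. A minimal sketch, with a hypothetical function name and with the meaning of a 0 or 1 polarity left to the Binary Form section:

```python
def pack_field_polarities(fwd, bwd):
    """Pack per-block field polarity bits: bits 0..3 = fwd, bits 4..7 = bwd.

    fwd, bwd: sequences of four 0/1 values for blocks 0..3 in Z order.
    """
    if len(fwd) != 4 or len(bwd) != 4:
        raise ValueError("expected four polarities per direction")
    packed = 0
    for block in range(4):
        if fwd[block] not in (0, 1) or bwd[block] not in (0, 1):
            raise ValueError("polarities must be 0 or 1")
        packed |= fwd[block] << block
        packed |= bwd[block] << (4 + block)
    return packed
```

For a 16x16 block, the caller would pass the block 0 polarity four times in each list, mirroring the replication rule above.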

-
-
-
Result type conversion instructions
-
-

These are optional instructions that may be called following the evaluation phase to convert SIC results to MCE results and vice versa.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicConvertToMceResultINTEL
-
-Convert the SIC result to a generic MCE result.

-

Result Type must be the OpTypeAvcMceResultINTEL type.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5808

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcMceConvertToSicResultINTEL
-
-Convert the generic MCE result to a SIC result.

-

Result Type must be the OpTypeAvcSicResultINTEL type.

-

Payload must be the OpTypeAvcMceResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5737

<id> Result Type

Result <id>

<id> Payload

-
-
-
Result processing instructions
-
-

These instructions are called following the evaluation phase to extract the various result components from an SIC evaluation result.

-
- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicGetIpeLumaShapeINTEL
-
-Return the best intra shape from the SIC result.

-

The returned values are valid intra-MB shapes as per Section 3, Binary Form.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

4

5809

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicGetBestIpeLumaDistortionINTEL
-
-Return the best intra luma distortion for the shape returned by OpSubgroupAvcSicGetIpeLumaShapeINTEL from the SIC result.

-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

4

5810

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicGetBestIpeChromaDistortionINTEL
-
-Return the best intra chroma distortion for the 8x8 shape from the SIC result.

-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5811

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcSicGetPackedIpeLumaModesINTEL
-

-
-
-

Return the packed intra luma modes for all blocks from the SIC result.

-
-
-

There are four bits per luma mode for a block, and legal values are valid intra luma modes as per Section 3, Binary Form.

-
-
-

The number of blocks is based on the result of OpSubgroupAvcSicGetIpeLumaShapeINTEL.

-
-
-

If the luma shape is:

-
-
-
    -
  • -

    16x16, then one mode is returned in bits [0, 3]

    -
  • -
  • -

    8x8, then four modes corresponding to the four partitions are returned in the bit ranges [0, 3], [16, 19], [32, 35], and [48, 51]; the four partitions are in the traditional Z-order

    -
  • -
  • -

    4x4, then 16 modes (4 bits per mode), one for each of the 16 partitions, are returned using all the bits; the 16 partitions are in the traditional Z-order as shown below:

    -
    -
    -
    0 1 4 5
    -2 3 6 7
    -8 9 C D
    -A B E F
    -
    -
    -
  • -
-
-
-

Result Type must be an OpTypeInt with 64-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcSicResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-

4

-

5812

<id> Result Type

Result <id>

<id> Payload
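Decoding the packed luma modes depends on the shape, as the list in the description explains. The sketch below uses a hypothetical function name and string shape labels purely for illustration; the real shape comes from OpSubgroupAvcSicGetIpeLumaShapeINTEL as an enumerated value. For 4x4 it assumes that partition i in Z-order occupies nibble i, counting from the least significant bits.

```python
def unpack_luma_modes(packed, shape):
    """Extract the 4-bit luma modes from the 64-bit packed result.

    shape: "16x16", "8x8", or "4x4" (illustrative labels, not spec values).
    """
    if shape == "16x16":
        offsets = [0]                              # bits [0, 3]
    elif shape == "8x8":
        offsets = [0, 16, 32, 48]                  # one nibble per 16-bit lane
    elif shape == "4x4":
        offsets = [4 * index for index in range(16)]  # ASSUMED nibble order
    else:
        raise ValueError("unknown luma shape")
    return [(packed >> offset) & 0xF for offset in offsets]
```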

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicGetIpeChromaModeINTEL
-
-Return the intra chroma mode for the 8x8 block from the SIC result.

-

The returned values are valid intra chroma modes as per Section 3, Binary Form.

-

Result Type must be an OpTypeInt with 8-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationChromaINTEL

4

5813

<id> Result Type

Result <id>

<id> Payload

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcSicGetPackedSkcLumaCountThresholdINTEL
-
-Return the packed count of luma coefficient components that exceeded their transform thresholds from the SIC result for each 8x8 partition in traditional Z-order.

-
-
-

The format of the results is as follows:

-
-
-
    -
  • -

    count of coefficients that exceeded their respective threshold for block 8x8_0 is returned in bit [0, 7]

    -
  • -
  • -

    count of coefficients that exceeded their respective threshold for block 8x8_1 is returned in bit [8, 15]

    -
  • -
  • -

    count of coefficients that exceeded their respective threshold for block 8x8_2 is returned in bit [16, 23]

    -
  • -
  • -

    count of coefficients that exceeded their respective threshold for block 8x8_3 is returned in bit [24, 31]

    -
  • -
-
-
-

The results are only valid if the SIC operation was configured with frequency domain SAD transform coefficients using OpSubgroupAvcSicSetSkcForwardTransformEnableINTEL.

-
-
-

Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcSicResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-

4

-

5814

<id> Result Type

Result <id>

<id> Payload
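The packed count format can be unpacked with a few shifts. This is a sketch with hypothetical function names; the second helper shows one possible use of the counts and is not a rule stated by the extension.

```python
def unpack_skc_luma_counts(packed):
    """Return the four per-8x8-block counts (Z order) from the 32-bit result."""
    return [(packed >> (8 * block)) & 0xFF for block in range(4)]


def no_coefficient_exceeded(packed):
    """Illustrative use: True if no block had a coefficient over its threshold."""
    return all(count == 0 for count in unpack_skc_luma_counts(packed))
```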

- ------- - - - - - - - - - - - - - -
-

OpSubgroupAvcSicGetPackedSkcLumaSumThresholdINTEL
-
-Return the packed sum of luma coefficient components that exceeded their transform thresholds from the SIC result for each 8x8 partition in traditional Z-order.

-
-
-

The format of the results is as follows:

-
-
-
    -
  • -

    sum of coefficients that exceeded their respective threshold for block 8x8_0 is returned in bit [0, 15]

    -
  • -
  • -

    sum of coefficients that exceeded their respective threshold for block 8x8_1 is returned in bit [16,31]

    -
  • -
  • -

    sum of coefficients that exceeded their respective threshold for block 8x8_2 is returned in bit [32,47]

    -
  • -
  • -

    sum of coefficients that exceeded their respective threshold for block 8x8_3 is returned in bit [48, 63]

    -
  • -
-
-
-

The results are only valid if the SIC operation was configured with frequency domain SAD transform coefficients using OpSubgroupAvcSicSetSkcForwardTransformEnableINTEL.

-
-
-

Result Type must be an OpTypeInt with 64-bit Width and 0 Signedness.

-
-
-

Payload must be the OpTypeAvcSicResultINTEL type.

-

Capability:
-SubgroupAvcMotionEstimationINTEL, SubgroupAvcMotionEstimationIntraINTEL

-

4

-

5815

<id> Result Type

Result <id>

<id> Payload
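The packed sum format uses 16-bit fields instead of 8-bit fields. A minimal sketch, with a hypothetical function name:

```python
def unpack_skc_luma_sums(packed):
    """Return the four per-8x8-block sums (Z order) from the 64-bit result."""
    return [(packed >> (16 * block)) & 0xFFFF for block in range(4)]
```

Each entry pairs with the count for the same block returned by OpSubgroupAvcSicGetPackedSkcLumaCountThresholdINTEL.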

- ------- - - - - - - - - - - - - - -

OpSubgroupAvcSicGetInterRawSadsINTEL
-
-Return the skip check raw SAD (i.e. without any mode or shape costs included) for the entire MB if the input payload was not configured for block based skip checks; otherwise return the maximal SAD of the individual 4x4 (or 8x8, if the block size for block based skip checking was configured as 8x8) blocks within the MB.

-

Result Type must be an OpTypeInt with 16-bit Width and 0 Signedness.

-

Payload must be the OpTypeAvcSicResultINTEL type.

Capability:
-SubgroupAvcMotionEstimationINTEL

4

5816

<id> Result Type

Result <id>

<id> Payload

-
-
-
-
-
-
-
-

Validation Rules

-
-
-

Modify Section 2.16.1, Universal Validation Rules, adding the following under the "Data Rules" bullet:

-
-
-

All OpVmeImageINTEL instructions must be in the same block in which their Result <id> are consumed. Result <id> from OpVmeImageINTEL instructions must not appear as operands to OpPhi instructions or OpSelect instructions, or any instructions other than the image lookup and query instructions specified to take an operand whose type is OpTypeVmeImageINTEL.

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev  Date  Author  Changes

1

2018-10-29

Biju George

Initial version

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html + + +

extensions/INTEL/SPV_INTEL_device_side_avc_motion_estimation.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html b/extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html index 70a1037..35588ae 100644 --- a/extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html +++ b/extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html @@ -1,284 +1,12 @@ - - - - - - - -SPV_INTEL_FP_FAST_MATH_MODE - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fp_fast_math_mode

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020-2021 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-05-21

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds two new bit masks to the FPFastMathMode decoration, allowing floating-point operations to be annotated as permitting reassociation and contraction.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fp_fast_math_mode"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPFastMathModeINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

FPFastMathModeINTEL

5837

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.15, FP Fast Math Mode, adding the following rows to the fp fast math mode table.

-
- ----- - - - - - - - - - - - - -

0x10000

AllowContractFastINTEL
-Allow contraction of floating-point expressions even if it may violate the language standard. Overrides the ContractionOff execution mode. Implied by the Fast FP Fast Math Mode.

FPFastMathModeINTEL

0x20000

AllowReassocINTEL
-Allow algebraic transformations according to real-number associative algebra, even if it may violate the language standard. Implied by the Fast FP Fast Math Mode.

FPFastMathModeINTEL

-
-
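As a rough illustration of how the two new mask bits combine in an FPFastMathMode operand, here is a minimal Python sketch. Only the bit values `0x10000` and `0x20000` and the two bit names come from the table above; the enum class and the `describe` helper are hypothetical names introduced for this example.

```python
from enum import IntFlag


class FPFastMathModeINTEL(IntFlag):
    # Bit values taken from the FP Fast Math Mode rows added by this extension.
    AllowContractFastINTEL = 0x10000
    AllowReassocINTEL = 0x20000


def describe(mask: int) -> list[str]:
    """Return the names of the INTEL bits that are set in an FPFastMathMode mask."""
    return [flag.name for flag in FPFastMathModeINTEL if mask & flag.value]


# Request both contraction and reassociation on one decoration operand.
mask = FPFastMathModeINTEL.AllowContractFastINTEL | FPFastMathModeINTEL.AllowReassocINTEL
print(hex(int(mask)), describe(int(mask)))
```

A module that uses either bit would also need to declare the FPFastMathModeINTEL capability listed in the rightmost column.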

Capability

-
-

Modify section 3.31, Capability, adding a row to the Capability table:

-
- ----- - - - - - - - - - - - -

Capability

Implicitly Declares

5837

FPFastMathModeINTEL

-

Allows control over floating point optimizations including contraction and reassociation.

Kernel

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-04-22

Jessica Davies

Initial public release

2

2021-05-21

Jessica Davies

Clarify interaction with FP fast math mode

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html + + +

extensions/INTEL/SPV_INTEL_fp_fast_math_mode.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fp_max_error.html b/extensions/INTEL/SPV_INTEL_fp_max_error.html index 02f710d..11bad93 100644 --- a/extensions/INTEL/SPV_INTEL_fp_max_error.html +++ b/extensions/INTEL/SPV_INTEL_fp_max_error.html @@ -1,326 +1,12 @@ - - - - - - - -SPV_INTEL_fp_max_error - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fp_max_error

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Shuo Niu, Intel

    -
  • -
  • -

    Daniel Zhang, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a decoration that may be attached to any instruction of floating-point type. It can be used to express the maximum acceptable relative error in the result of that instruction, in ULPs.

-
-
-

Note that this takes precedence over accuracy requirements from the environment specification. Please refer to the client environment specification for the list of instructions that the decoration can be attached to. The client environment specification will also provide a list of valid values for the maximum acceptable relative error.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fp_max_error"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPMaxErrorINTEL
-
-
-
-
-
-

New Decorations

-
-
-

This extension adds the following decoration under the FPMaxErrorINTEL capability:

-
-
-
-
FPMaxErrorDecorationINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

FPMaxErrorINTEL

6169

FPMaxErrorDecorationINTEL

6170

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding the following row to the Decoration table:

-
-
-
- ------ - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

6170

FPMaxErrorDecorationINTEL
-Apply to a floating-point instruction to express the maximum acceptable relative error in the result of that instruction, in ULPs. ULP is defined as follows:

-

If x is a real number that lies between two finite consecutive floating-point numbers a and b, without being equal to one of them, then ulp(x) = |b - a|, otherwise ulp(x) is the distance between the two non-equal finite floating-point numbers nearest to x. Moreover, ulp(NaN) is NaN. -To give attribution, this description referenced Jean-Michel Muller’s definition of ulp(x) with slight clarification for behaviour at zero. For details, please refer to https://hal.inria.fr/inria-00070503/document.

-

Max Error is a positive 32-bit float type number representing the maximum acceptable relative error.

-

Note that this decoration can increase or decrease the allowable error. It overrides the accuracy requirement from the environment specification, so it can express either more or less permitted error than that specification allows.

Literal
-Max Error

FPMaxErrorINTEL

-
-
-
-
-
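To make the ULP-based bound concrete, the following Python sketch checks whether a computed result lies within Max Error ULPs of a reference value. It is an approximation under stated assumptions: it uses Python's `math.ulp`, which works on binary64 values and is only a stand-in for the definition quoted above, and the function name `within_max_error` is hypothetical.

```python
import math


def within_max_error(result: float, exact: float, max_error_ulps: float) -> bool:
    """Check |result - exact| <= max_error_ulps * ulp(exact) in binary64.

    Assumption: math.ulp stands in for the ulp(x) definition in the text.
    """
    if math.isnan(result) or math.isnan(exact):
        return False  # ulp(NaN) is NaN, so no finite bound can be met
    return abs(result - exact) <= max_error_ulps * math.ulp(exact)


exact = 1.0
one_ulp_off = math.nextafter(exact, 2.0)  # the next double above 1.0

print(within_max_error(one_ulp_off, exact, 1.0))
print(within_max_error(one_ulp_off, exact, 0.5))
```

A real check would use the ULP of the instruction's own floating-point type (for example 16-bit or 32-bit) rather than binary64.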

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6169

FPMaxErrorINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-03-29

Shuo Niu

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fp_max_error.html + + +

extensions/INTEL/SPV_INTEL_fp_max_error.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html b/extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html index cf1fe9d..df1df92 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html +++ b/extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html @@ -1,394 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_argument_interfaces - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_argument_interfaces

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-

Abhishek Tiwari, Intel
-Joe Garvey, Intel

-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-12-04

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds kernel argument decorations that influence the interfaces built for Field Programmable Gate Array (FPGA) kernel arguments.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_argument_interfaces"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the following new capability:

-
-
-
-
FPGAArgumentInterfacesINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

FPGAArgumentInterfacesINTEL

6174

ConduitKernelArgumentINTEL

6175

RegisterMapKernelArgumentINTEL

6176

MMHostInterfaceAddressWidthINTEL

6177

MMHostInterfaceDataWidthINTEL

6178

MMHostInterfaceLatencyINTEL

6179

MMHostInterfaceReadWriteModeINTEL

6180

MMHostInterfaceMaxBurstINTEL

6181

MMHostInterfaceWaitRequestINTEL

6182

StableKernelArgumentINTEL

6183

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

6175

ConduitKernelArgumentINTEL
-Must be applied only to an OpFunctionParameter of a function that is an entry point. Indicates that dedicated input wires should be created for this argument.

FPGAArgumentInterfacesINTEL

6176

RegisterMapKernelArgumentINTEL
-Must be applied only to an OpFunctionParameter of a function that is an entry point. Indicates that this argument is stored in registers in the kernel that are accessed through a common interface shared between this argument, other RegisterMapKernelArgumentINTEL arguments, and possibly kernel control signals.

FPGAArgumentInterfacesINTEL

6177

MMHostInterfaceAddressWidthINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates the size, in bits, of the address bus for the Memory Mapped Interface created for this pointer argument.

Literal Number (32-bit signed integer)
-AddressWidth

FPGAArgumentInterfacesINTEL

6178

MMHostInterfaceDataWidthINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates the size, in bits, of the data bus for the Memory Mapped Interface created for this pointer argument.

Literal Number (32-bit signed integer)
-DataWidth

FPGAArgumentInterfacesINTEL

6179

MMHostInterfaceLatencyINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates the latency in cycles of the Memory Mapped Interface created for this pointer argument. If this decoration is present it guarantees that the latency is fixed.

Literal Number (32-bit signed integer)
-Latency

FPGAArgumentInterfacesINTEL

6180

MMHostInterfaceReadWriteModeINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates the read-write mode of the Memory Mapped Interface created for this pointer argument.

Access Qualifier
-ReadWriteMode

FPGAArgumentInterfacesINTEL

6181

MMHostInterfaceMaxBurstINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates the maximum burst count of the Memory Mapped Interface created for this pointer argument.

Literal Number (32-bit signed integer)
-MaxBurstCount

FPGAArgumentInterfacesINTEL

6182

MMHostInterfaceWaitRequestINTEL
-Must be applied only to a pointer OpFunctionParameter of a function that is an entry point. Indicates whether the Memory Mapped Interface created for this pointer argument should accept a waitrequest signal.

-

A setting of 1 requests that a waitrequest signal be built; a setting of 0 requests that it not be built.

Literal Number (32-bit signed integer)
-Waitrequest

FPGAArgumentInterfacesINTEL

6183

StableKernelArgumentINTEL
-Must be applied only to an OpFunctionParameter of a function that is an entry point. Indicates that this input will not change during the execution of pipelined kernel invocations. Input can change once all active invocations have finished.

FPGAArgumentInterfacesINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6174

FPGAArgumentInterfacesINTEL

-
-
-
-
-

Validation Rules

-
-

It is invalid to specify both ConduitKernelArgumentINTEL and RegisterMapKernelArgumentINTEL decorations on the same OpFunctionParameter.

-
-
-
-
-
-
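The validation rule above can be sketched as a small check over the decorations collected for each OpFunctionParameter. The two decoration names come from this extension; the dictionary layout, the `%arg` ids, and the function name `find_conflicts` are hypothetical and only serve the illustration.

```python
# The two decorations that must not appear together on one parameter.
EXCLUSIVE = {"ConduitKernelArgumentINTEL", "RegisterMapKernelArgumentINTEL"}


def find_conflicts(decorations: dict[str, set[str]]) -> list[str]:
    """Return the parameter ids that carry both mutually exclusive decorations."""
    return sorted(pid for pid, decs in decorations.items() if EXCLUSIVE <= decs)


# Hypothetical module contents: parameter id -> decorations applied to it.
params = {
    "%arg0": {"ConduitKernelArgumentINTEL", "StableKernelArgumentINTEL"},
    "%arg1": {"ConduitKernelArgumentINTEL", "RegisterMapKernelArgumentINTEL"},
}
print(find_conflicts(params))
```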

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-12-04

Abhishek Tiwari, Brox Chen

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html + + +

extensions/INTEL/SPV_INTEL_fpga_argument_interfaces.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_buffer_location.html b/extensions/INTEL/SPV_INTEL_fpga_buffer_location.html index b5ad2f8..a06fe0e 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_buffer_location.html +++ b/extensions/INTEL/SPV_INTEL_fpga_buffer_location.html @@ -1,305 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_buffer_location - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_buffer_location

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Daniel Zhang, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-02-01

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a pointer decoration that is useful for FPGA targets. This decoration indicates that a particular global memory pointer can only access a particular physical memory location. Knowing this information at compile time can allow FPGA compilers to generate load store units of lower area for accesses done through such a pointer.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_buffer_location"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGABufferLocationINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

FPGABufferLocationINTEL

5920

BufferLocationINTEL

5921

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------ - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5921

BufferLocationINTEL
-Apply only to a pointer. Indicates that the pointer must only point into the physical memory identified by the subsequent literal number operand. The exact semantics of the literal number are implementation defined.

Literal
-Buffer Location ID

FPGABufferLocationINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5920

FPGABufferLocationINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-02-01

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_buffer_location.html + + +

extensions/INTEL/SPV_INTEL_fpga_buffer_location.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html b/extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html index 655d0a0..0cb652f 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html +++ b/extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html @@ -1,351 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_cluster_attributes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_cluster_attributes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020, 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-04-11

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds decorations to request that statically-scheduled clusters use a stall-enable signal, or an exit FIFO, on an FPGA target.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_cluster_attributes"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces two new capabilities:

-
-
-
-
FPGAClusterAttributesINTEL
-FPGAClusterAttributesV2INTEL
-
-
-
-
-
-

New Decorations

-
-
-

This extension adds the following decoration under the FPGAClusterAttributesINTEL capability:

-
-
-
-
StallEnableINTEL
-
-
-
-

This extension adds the following decoration under the FPGAClusterAttributesV2INTEL capability:

-
-
-
-
StallFreeINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - -

FPGAClusterAttributesINTEL

5904

StallEnableINTEL

5905

FPGAClusterAttributesV2INTEL

6150

StallFreeINTEL

6151

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5905

StallEnableINTEL
-Only valid on OpFunction. Request, to the extent possible, that statically-scheduled clusters should handle stalls using a stall-enable signal to freeze computation within the cluster.

FPGAClusterAttributesINTEL

6151

StallFreeINTEL
-Only valid on OpFunction. Request, to the extent possible, that statically-scheduled clusters should handle stalls by using an exit FIFO to hold any output data until the cluster is no longer stalled.

FPGAClusterAttributesV2INTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding the following rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5904

FPGAClusterAttributesINTEL

6150

FPGAClusterAttributesV2INTEL

FPGAClusterAttributesINTEL

-
-
-
-
-

Validation Rules

-
-

At most one of the StallEnableINTEL and StallFreeINTEL decorations can appear on an OpFunction.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-10-13

Jessica Davies

Initial public release

2

2023-04-11

Jessica Davies

Add stall-free cluster decoration

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html + + +

extensions/INTEL/SPV_INTEL_fpga_cluster_attributes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_dsp_control.html b/extensions/INTEL/SPV_INTEL_fpga_dsp_control.html index 8d25d62..cdca29f 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_dsp_control.html +++ b/extensions/INTEL/SPV_INTEL_fpga_dsp_control.html @@ -1,340 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_dsp_control - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_dsp_control

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Ajaykumar Kannan, Intel

    -
  • -
  • -

    Mike Kinsner, Intel

    -
  • -
  • -

    Shuo Niu, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-03-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a decoration to request that math operations be implemented using -Digital Signal Processing (DSP) blocks or soft logic, on an FPGA target.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_dsp_control"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGADSPControlINTEL
-
-
-
-
-
-

New Decorations

-
-
-

This extension adds the following decoration under the FPGADSPControlINTEL capability:

-
-
-
-
MathOpDSPModeINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

FPGADSPControlINTEL

5908

MathOpDSPModeINTEL

5909

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding the following rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5909

MathOpDSPModeINTEL
-Only valid on OpFunction. -Request, to the extent possible, that math operations in the function be implemented according to Mode.

-

Mode is a 32-bit unsigned integer type scalar. Propagate is a 32-bit unsigned integer type scalar.

-

If Mode is equal to 0 it indicates a request that math operations be implemented using soft logic and/or DSP blocks according to default implementation-defined heuristics.

-

If Mode is equal to 1 it indicates a request that math operations be implemented using soft logic.

-

If Mode is equal to 2 it indicates a request that math operations be implemented using DSP blocks.

-

If Propagate is equal to 0, the Mode request applies to math operations in this function F only, and does not extend to math operations executed as part of function calls made by F.

-

If Propagate is equal to 1, the Mode request applies to math operations in this function F, and to all math operations executed as part of functions called (transitively) by F, unless a called function G has a MathOpDSPModeINTEL decoration. The decoration on G takes precedence for G and all functions called (transitively) by G, i.e., the Mode request from F does not apply to G nor functions called (transitively) by G.

-

If Propagate is equal to 2, the Mode request applies to all math operations in this function F, and to all math operations executed as part of function calls made (transitively) by F, overriding any MathOpDSPModeINTEL on the called functions.

Literal
-Mode

Literal
-Propagate

FPGADSPControlINTEL

-
-
-
-
-
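The Propagate rules are easier to follow with a worked example. The sketch below computes which Mode request applies to each function reachable from a root function. It is one reading of the text, not a normative algorithm: it assumes the call graph is a tree (no recursion, no function reached by two paths), and the names `effective_modes`, `calls`, and `decorations` are hypothetical.

```python
def effective_modes(root, calls, decorations):
    """Map each function reachable from root to the Mode that applies (or None).

    calls:       function name -> list of callee names (assumed to form a tree)
    decorations: function name -> (Mode, Propagate) for decorated functions
    """
    result = {}

    def visit(fn, inherited, forced):
        # inherited: Mode flowing down from a caller, or None if none applies
        # forced:    True when some caller above used Propagate == 2
        own = decorations.get(fn)
        if forced:
            # Propagate == 2 on a caller overrides any decoration on fn.
            mode, down, down_forced = inherited, inherited, True
        elif own is not None:
            # fn's own decoration takes precedence for fn and its callees.
            mode, propagate = own
            down = mode if propagate in (1, 2) else None
            down_forced = propagate == 2
        else:
            # Undecorated function: keep whatever flows down from the caller.
            mode, down, down_forced = inherited, inherited, False
        result[fn] = mode
        for callee in calls.get(fn, ()):
            visit(callee, down, down_forced)

    visit(root, None, False)
    return result


calls = {"F": ["G", "H"], "G": ["K"]}
# F asks for DSP blocks (Mode 2) with Propagate 1; G asks for soft logic, Propagate 0.
print(effective_modes("F", calls, {"F": (2, 1), "G": (1, 0)}))
# Same graph, but F uses Propagate 2, which overrides G's decoration.
print(effective_modes("F", calls, {"F": (2, 2), "G": (1, 0)}))
```

In the first call, G keeps its own Mode and its callee K receives no request, because G's decoration stops F's request and G itself does not propagate.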

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5908

FPGADSPControlINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-03-12

Jessica Davies

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_dsp_control.html + + +

extensions/INTEL/SPV_INTEL_fpga_dsp_control.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html b/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html index 4243d6f..bbb4aed 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html +++ b/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html @@ -1,352 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_invocation_pipelining_attributes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_invocation_pipelining_attributes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Mike Kinsner, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-05-21

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

Some FPGA devices and toolchains can support customizable levels or implementation of pipeline parallelism when mapping a SPIR-V module to hardware. Through pipeline parallelism, multiple invocations of a kernel or function can execute concurrently.

-
-
-

This extension adds decorations to request that a kernel or function support invocations at a specified initiation interval, that multiple invocations are forbidden from executing concurrently, or that the kernel or function is limited to a maximum number of concurrent invocations.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_invocation_pipelining_attributes"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGAInvocationPipeliningAttributesINTEL
-
-
-
-
-
-

New Decorations

-
-
-

This extension adds the following decorations under the FPGAInvocationPipeliningAttributesINTEL capability:

-
-
-
-
InitiationIntervalINTEL
-MaxConcurrencyINTEL
-PipelineEnableINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - -

FPGAInvocationPipeliningAttributesINTEL

5916

InitiationIntervalINTEL

5917

MaxConcurrencyINTEL

5918

PipelineEnableINTEL

5919

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5917

InitiationIntervalINTEL
-Only valid on OpFunction. Strong request, to the extent possible, for this function to support an initiation interval of Cycles clock cycles. Cycles is a 32-bit unsigned integer type scalar. The value of Cycles must be non-zero.

Literal
-Cycles

FPGAInvocationPipeliningAttributesINTEL

5918

MaxConcurrencyINTEL
-Only valid on OpFunction. Strong request, to the extent possible, to allow no more than a fixed number Invocations of invocations to execute the function concurrently. Invocations is a 32-bit unsigned integer type scalar. If Invocations is equal to zero, it indicates no limit on the number of concurrent invocations.

Literal
-Invocations

FPGAInvocationPipeliningAttributesINTEL

5919

PipelineEnableINTEL
-Only valid on OpFunction. Strong request, to the extent possible, to either support pipelining or to not pipeline invocations of this function. Enable is a 32-bit unsigned integer type scalar. If Enable is equal to 0, it indicates a request not to pipeline, while a non-zero value indicates a request to pipeline.

Literal
-Enable

FPGAInvocationPipeliningAttributesINTEL

-
-
-
-
-
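The operand rules for the three decorations can be summarized in a short Python sketch. The operand names (Cycles, Invocations, Enable) and their meanings come from the table above; the helper functions are hypothetical and only show how a tool might interpret the literals.

```python
from typing import Optional

UINT32_MAX = 0xFFFFFFFF


def _check_u32(name: str, value: int) -> None:
    # Each operand is described as a 32-bit unsigned integer scalar.
    if not 0 <= value <= UINT32_MAX:
        raise ValueError(f"{name} must fit in a 32-bit unsigned integer")


def initiation_interval(cycles: int) -> int:
    """InitiationIntervalINTEL: Cycles must be non-zero."""
    _check_u32("Cycles", cycles)
    if cycles == 0:
        raise ValueError("Cycles must be non-zero")
    return cycles


def concurrency_limit(invocations: int) -> Optional[int]:
    """MaxConcurrencyINTEL: zero means no limit, returned here as None."""
    _check_u32("Invocations", invocations)
    return None if invocations == 0 else invocations


def pipelining_requested(enable: int) -> bool:
    """PipelineEnableINTEL: zero asks not to pipeline, non-zero asks to pipeline."""
    _check_u32("Enable", enable)
    return enable != 0
```

For example, `concurrency_limit(0)` reports no limit, while `initiation_interval(0)` is rejected because the table requires a non-zero Cycles value.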

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5916

FPGAInvocationPipeliningAttributesINTEL

Kernel

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-05-21

Jessica Davies

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html + + +

extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_latency_control.html b/extensions/INTEL/SPV_INTEL_fpga_latency_control.html index f0cd501..6a6514b 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_latency_control.html +++ b/extensions/INTEL/SPV_INTEL_fpga_latency_control.html @@ -1,342 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_latency_control - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_latency_control

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-

Shuo Niu, Intel

-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-11-28

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension specifies interaction with the SPV_INTEL_blocking_pipes extension.

-
-
-
-
-

Overview

-
-
-

This extension adds two decorations to represent latency controls on the pointer accessed by load, store, pipe read and pipe write instructions.

-
-
-

The behavior is implementation-defined if the combination of constraints specified by the decorations cannot be satisfied. For example, if one constraint specifies instruction A should be scheduled after instruction B, while another constraint specifies instruction B should be scheduled after instruction A then that set of constraints is unsatisfiable.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_latency_control"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGALatencyControlINTEL
-
-
-
-
-
-

New Decorations

-
-
-

Decorations added under the FPGALatencyControlINTEL capability:

-
-
-
-
LatencyControlLabelINTEL
-LatencyControlConstraintINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - -

FPGALatencyControlINTEL

6171

LatencyControlLabelINTEL

6172

LatencyControlConstraintINTEL

6173

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- -------- - - - - - - - - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

6172

LatencyControlLabelINTEL
-Apply to an object of type OpTypePointer or OpTypePipe. If that object is used as the pointer operand by an OpLoad or OpStore instruction (for OpTypePointer) or the pipe operand by an OpReadPipe, OpWritePipe, OpReadPipeBlockingINTEL, or OpWritePipeBlockingINTEL instruction (for OpTypePipe) then this decoration conveys latency control information about that instruction. Any such instructions will be referred to as the "labeled instructions" corresponding to the decoration.

-

Latency Label is a 32-bit signed integer type scalar that labels the labeled instruction so that it may be referred to in LatencyControlConstraintINTEL decorations.

Literal Number
-Latency Label

FPGALatencyControlINTEL

6173

LatencyControlConstraintINTEL
-Apply to an object of type OpTypePointer or OpTypePipe. If that object is used as the pointer operand by an OpLoad or OpStore instruction (for OpTypePointer) or the pipe operand by an OpReadPipe, OpWritePipe, OpReadPipeBlockingINTEL, or OpWritePipeBlockingINTEL instruction (for OpTypePipe) then this decoration conveys latency control information about that instruction. Any such instructions will be referred to as the "constrained instructions" corresponding to the decoration.

-

Relative To, Control Type, and Relative Cycle constrain the cycle on which the constrained instruction can be scheduled.

-

Relative To is a 32-bit signed integer type scalar that identifies the labeled instruction relative to which the constrained instruction associated with this decoration is being constrained. It corresponds to the Latency Label operand of a LatencyControlLabelINTEL decoration.

-

Relative Cycle is a 32-bit signed integer type scalar whose meaning depends on Control Type.

-

Control Type is a 32-bit signed integer type scalar that represents the type of the constraint.

-

If Control Type is equal to 1, it indicates that the latency between the labeled instruction and the constrained instruction should be exactly Relative Cycle cycles.

-

If Control Type is equal to 2, it indicates that the latency between the labeled instruction and the constrained instruction should be at most Relative Cycle cycles.

-

If Control Type is equal to 3, it indicates that the latency between the labeled instruction and the constrained instruction should be at least Relative Cycle cycles.

Literal Number
-Relative To

Literal Number
-Control Type

Literal Number
-Relative Cycle

FPGALatencyControlINTEL

-
-
-
-

Note that both of these decorations are ignored for target devices that are not FPGA.

-
-
-
-
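The three Control Type values can be illustrated with a small Python check. The Control Type encoding (1 exact, 2 at most, 3 at least) comes from the table above. Everything else is an assumption made for the example: latency is taken to be the constrained instruction's cycle minus the labeled instruction's cycle, and the names `satisfies`, `check_schedule`, and the schedule layout are hypothetical.

```python
def satisfies(control_type: int, relative_cycle: int, latency: int) -> bool:
    """Check one scheduled latency (in cycles) against one constraint."""
    if control_type == 1:  # exactly Relative Cycle cycles
        return latency == relative_cycle
    if control_type == 2:  # at most Relative Cycle cycles
        return latency <= relative_cycle
    if control_type == 3:  # at least Relative Cycle cycles
        return latency >= relative_cycle
    raise ValueError(f"unknown Control Type {control_type}")


def check_schedule(label_cycles, constraints):
    """Return the constraints that a candidate schedule violates.

    label_cycles: Latency Label -> cycle of the labeled instruction
    constraints:  list of (constrained_cycle, relative_to, control_type, relative_cycle)
    """
    violated = []
    for constrained_cycle, relative_to, control_type, relative_cycle in constraints:
        # Assumption: latency is measured from the labeled instruction.
        latency = constrained_cycle - label_cycles[relative_to]
        if not satisfies(control_type, relative_cycle, latency):
            violated.append((constrained_cycle, relative_to, control_type, relative_cycle))
    return violated


labels = {0: 10}  # the instruction labeled 0 is scheduled at cycle 10
constraints = [
    (15, 0, 1, 5),  # exactly 5 cycles after label 0
    (18, 0, 2, 6),  # at most 6 cycles after label 0
]
print(check_schedule(labels, constraints))
```

An unsatisfiable set, such as the circular ordering mentioned in the Overview, would leave at least one violated constraint for every candidate schedule; the extension leaves the behavior in that case implementation-defined.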

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6171

FPGALatencyControlINTEL

-
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-11-28

Shuo Niu

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_latency_control.html + + +

extensions/INTEL/SPV_INTEL_fpga_latency_control.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_loop_controls.html b/extensions/INTEL/SPV_INTEL_fpga_loop_controls.html index 69c42a2..806a06f 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_loop_controls.html +++ b/extensions/INTEL/SPV_INTEL_fpga_loop_controls.html @@ -1,407 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_loop_controls - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_loop_controls

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Artem Chikin, Intel

    -
  • -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Mark Mendell, Intel

    -
  • -
  • -

    Ci Tian, Intel

    -
  • -
  • -

    Bowen Xue, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-10-13

Revision

J

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension introduces additional loop controls for FPGA targets.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_loop_controls"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGALoopControlsINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

FPGALoopControlsINTEL

5888

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Loop Control

-
-

In section 3.23, Loop Control, add the following entries to the table:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Loop Control

Enabling Capabilities

0x10000

InitiationIntervalINTEL
-Strong request, to the extent possible, to implement this loop with an initiation interval specified as a subsequent literal-number operand to the instruction.

FPGALoopControlsINTEL

0x20000

MaxConcurrencyINTEL
-Strong request, to the extent possible, to allow no more than a fixed number of threads or loop iterations to execute the loop concurrently as specified by a subsequent literal-number operand to the instruction.

FPGALoopControlsINTEL

0x40000

DependencyArrayINTEL
-Guarantees that there are no dependencies on a particular variable between a number of loop iterations. -Can be applied to multiple variables, the number of which is specified as a subsequent literal-number operand to the instruction. Following that, for each variable, an <id> and literal number pair are provided indicating the variable and number of loop iterations. A number of loop iterations of 0 indicates that there are no loop-carried dependences on that variable.

FPGALoopControlsINTEL

0x80000

PipelineEnableINTEL
-Strong request, to the extent possible, to either pipeline iterations of this loop or to not pipeline iterations of this loop depending on the value of the subsequent literal number operand. A value of 0 indicates a request not to pipeline while a value of 1 indicates a request to pipeline.

FPGALoopControlsINTEL

0x100000

LoopCoalesceINTEL
-Request to combine the loops nested within this loop into a single loop. A subsequent 32-bit integer literal operand specifies the number of nested loop levels to coalesce. A value of 0 indicates that all loop levels should be coalesced.

FPGALoopControlsINTEL

0x200000

MaxInterleavingINTEL
-Request to limit the number of pipelined interleaved invocations of this loop that can be executed simultaneously to the number specified subsequently as a 32-bit integer literal operand.

FPGALoopControlsINTEL

0x400000

SpeculatedIterationsINTEL
-Request to limit the number of iterations launched before the loop exit condition has been evaluated to the number specified subsequently as a 32-bit integer literal operand.

FPGALoopControlsINTEL

0x800000

NoFusionINTEL
-Strong request, to the extent possible, that this loop not be fused with any adjacent loop.

FPGALoopControlsINTEL

0x1000000

LoopCountINTEL
-Specify minimum, maximum and expected iteration counts of the loop. There are three 64-bit integer literal operands. The first operand is the minimum iteration count, the second is the maximum iteration count, and the third is the expected iteration count. A negative literal operand value specifies that the respective loop iteration bound or expectation is not defined. The behavior is undefined if the minimum iteration operand is non-negative and the loop iterates fewer times than that minimum. The behavior is also undefined if the maximum iteration operand is non-negative and the loop iterates more times than that maximum.

FPGALoopControlsINTEL

0x2000000

MaxReinvocationDelayINTEL
-Request to implement this loop with a maximum limit on the delay between launching the last iteration of a loop invocation and launching the first iteration of the next loop invocation. A subsequent positive 32-bit integer literal operand specifies the budget for the maximum reinvocation delay allowed. A value of 1 indicates that the first iteration of the next invocation should start immediately following the start of the last iteration of the previous loop invocation.

FPGALoopControlsINTEL

-
-
-
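As a rough illustration of how the bits in the table above combine, the following Python sketch builds a Loop Control mask and the list of literal operands that follow it. The helper name `encode_loop_controls` is hypothetical. The sketch assumes that operands are emitted in order of increasing bit value, as the core specification describes for loop control parameters, and it covers only the controls that take zero or one 32-bit literal; DependencyArrayINTEL and LoopCountINTEL are left out for brevity.

```python
# Bit values copied from the Loop Control table above.
LOOP_CONTROL_BITS = {
    "InitiationIntervalINTEL": 0x10000,
    "MaxConcurrencyINTEL": 0x20000,
    "PipelineEnableINTEL": 0x80000,
    "LoopCoalesceINTEL": 0x100000,
    "MaxInterleavingINTEL": 0x200000,
    "SpeculatedIterationsINTEL": 0x400000,
    "NoFusionINTEL": 0x800000,
    "MaxReinvocationDelayINTEL": 0x2000000,
}
NO_OPERAND = {"NoFusionINTEL"}  # controls without a literal operand


def encode_loop_controls(requests):
    """Map {control name: literal or None} to (mask, [literal operands]).

    Assumption: operands follow the mask in order of increasing bit value.
    """
    mask = 0
    operands = []
    for name in sorted(requests, key=LOOP_CONTROL_BITS.__getitem__):
        value = requests[name]
        mask |= LOOP_CONTROL_BITS[name]
        if name in NO_OPERAND:
            if value is not None:
                raise ValueError(f"{name} takes no operand")
        else:
            if not isinstance(value, int) or not 0 <= value <= 0xFFFFFFFF:
                raise ValueError(f"{name} needs a 32-bit literal operand")
            operands.append(value)
    return mask, operands
```

Calling `encode_loop_controls({"PipelineEnableINTEL": 1, "NoFusionINTEL": None, "InitiationIntervalINTEL": 2})` sets three bits and yields the operands for the initiation interval and the pipeline request, in that order.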

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5888

FPGALoopControlsINTEL

-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

A

2019-05-06

Joe Garvey

Initial public release

B

2019-05-07

Michael Kinsner

Update overview wording

C

2019-06-02

Michael Kinsner

Use loop control bits directly, as allocated in SPIRV-Headers spir-v.xml

D

2020-02-11

Artem Chikin

Add PipelineDisableINTEL

E

2020-02-12

Ci Tian

Add LoopCoalesceINTEL, MaxInterleavingINTEL and SpeculatedIterationsINTEL

F

2020-10-27

Jessica Davies

Add NoFusionINTEL

G

2020-11-17

Joe Garvey

Made LoopCoalesceINTEL argument mandatory

H

2021-05-03

Mark Mendell

Add LoopCountINTEL

I

2022-08-18

Bowen Xue

Add MaxReinvocationDelayINTEL

J

2022-10-13

Bowen Xue

Update wording of MaxReinvocationDelayINTEL

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_loop_controls.html + + +

extensions/INTEL/SPV_INTEL_fpga_loop_controls.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html b/extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html index afb04d7..ec6de07 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html +++ b/extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html @@ -1,363 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_memory_accesses - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_memory_accesses

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Mohammad Fawaz, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

First draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-02-20

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds decorations, useful for FPGA targets, that explicitly request that the implementation of a memory access be configured in a certain way.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_memory_accesses"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGAMemoryAccessesINTEL
-
-
-
-
-
-

New Decorations

-
-
-

Decorations added under the FPGAMemoryAccessesINTEL capability:

-
-
-
-
BurstCoalesceINTEL
-CacheSizeINTEL
-DontStaticallyCoalesceINTEL
-PrefetchINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - - - - - -

FPGAMemoryAccessesINTEL

5898

BurstCoalesceINTEL

5899

CacheSizeINTEL

5900

DontStaticallyCoalesceINTEL

5901

PrefetchINTEL

5902

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

5899

BurstCoalesceINTEL
-Apply to a pointer. Request, to the extent possible, that a dynamic burst coalescer be implemented when the memory pointed to by the pointer is accessed using OpLoad or OpStore.

FPGAMemoryAccessesINTEL

5900

CacheSizeINTEL
-Apply to a pointer. Request, to the extent possible, that a read-only cache of the specified size be implemented when the memory pointed to by the pointer is accessed using OpLoad.

Literal Number
-Cache Size in bytes

FPGAMemoryAccessesINTEL

5901

DontStaticallyCoalesceINTEL
-Apply to a pointer. Request, to the extent possible, that accesses to the pointer, using OpLoad or OpStore, should not be statically coalesced with other memory accesses at compile time.

FPGAMemoryAccessesINTEL

5902

PrefetchINTEL
-Apply to a pointer. Request, to the extent possible, that a prefetcher of the specified size be implemented when the memory pointed to by the pointer is accessed using OpLoad.

Literal Number
-Prefetcher Size in bytes

FPGAMemoryAccessesINTEL

-
-
-
-
-
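To make the operand layout in the table above concrete, here is a minimal Python sketch that builds the words of an `OpDecorate` instruction for these decorations. The helper `encode_decorate` is hypothetical. The opcode value 71 for `OpDecorate` and the first-word layout (word count in the high 16 bits, opcode in the low 16 bits) come from the core SPIR-V specification rather than from this extension.

```python
OP_DECORATE = 71  # OpDecorate opcode in the core SPIR-V specification

# name -> (decoration token, takes a size-in-bytes literal), from the table above
DECORATIONS = {
    "BurstCoalesceINTEL": (5899, False),
    "CacheSizeINTEL": (5900, True),
    "DontStaticallyCoalesceINTEL": (5901, False),
    "PrefetchINTEL": (5902, True),
}


def encode_decorate(target_id, name, size_bytes=None):
    """Return the 32-bit words of OpDecorate <target_id> <name> [size_bytes]."""
    token, takes_size = DECORATIONS[name]
    if takes_size != (size_bytes is not None):
        raise ValueError(f"{name}: unexpected or missing size operand")
    words = [target_id, token]
    if takes_size:
        words.append(size_bytes)
    word_count = len(words) + 1  # +1 for the leading opcode word
    return [(word_count << 16) | OP_DECORATE] + words
```

For instance, `encode_decorate(7, "CacheSizeINTEL", 1024)` produces a four-word instruction that decorates the pointer with id 7 with a 1024-byte cache request.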

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5898

FPGAMemoryAccessesINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2020-02-20

Mohammad Fawaz

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html + + +

extensions/INTEL/SPV_INTEL_fpga_memory_accesses.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html b/extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html index 925a22a..f3e5f6c 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html +++ b/extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html @@ -1,575 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_memory_attributes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_memory_attributes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Mohammad Fawaz, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Julian Packer, Intel

    -
  • -
  • -

    Artem Radzikhovskyy, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-10-03

Revision

I

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds decorations that influence compiler generation of memory structures on an FPGA target.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_memory_attributes"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGAMemoryAttributesINTEL
-
-
-
-
-
-

New Decorations

-
-
-

Decorations added under the FPGAMemoryAttributesINTEL capability:

-
-
-
-
RegisterINTEL
-MemoryINTEL
-NumbanksINTEL
-BankwidthINTEL
-MaxPrivateCopiesINTEL
-SinglepumpINTEL
-DoublepumpINTEL
-MaxReplicatesINTEL
-SimpleDualPortINTEL
-MergeINTEL
-BankBitsINTEL
-ForcePow2DepthINTEL
-StridesizeINTEL
-WordsizeINTEL
-TrueDualPortINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

FPGAMemoryAttributesINTEL

5824

RegisterINTEL

5825

MemoryINTEL

5826

NumbanksINTEL

5827

BankwidthINTEL

5828

MaxPrivateCopiesINTEL

5829

SinglepumpINTEL

5830

DoublepumpINTEL

5831

MaxReplicatesINTEL

5832

SimpleDualPortINTEL

5833

MergeINTEL

5834

BankBitsINTEL

5835

ForcePow2DepthINTEL

5836

StridesizeINTEL

5883

WordsizeINTEL

5884

TrueDualPortINTEL

5885

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

5825

RegisterINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in logic and carried through the datapath.

FPGAMemoryAttributesINTEL

5826

-

MemoryINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in memory of the specified type.

-
-
-
Supported strings:
-
    -
  • -

    DEFAULT: It is implementation-defined which memory resource is used to implement the variable

    -
  • -
  • -

    MLAB: data is stored in special Adaptive Logic Modules (ALMs), called memory-logic array blocks

    -
  • -
  • -

    BLOCK_RAM: data is stored in dedicated block RAM modules

    -
  • -
-

Literal String
-Memory Type

FPGAMemoryAttributesINTEL

5827

NumbanksINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory with the specified number of banks.

Literal Number
-Banks

FPGAMemoryAttributesINTEL

5828

BankwidthINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory whose banks have the specified width in bytes.

Literal Number
-Bank Width

FPGAMemoryAttributesINTEL

5829

MaxPrivateCopiesINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that no more than the specified number of independent copies of the memory synthesized for the variable or structure member should be created for the purpose of enabling concurrent thread or loop iteration accesses.

Literal Number
-Maximum Copies

FPGAMemoryAttributesINTEL

5830

SinglepumpINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory that is clocked at the same rate as accesses to it.

FPGAMemoryAttributesINTEL

5831

DoublepumpINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory that is clocked at twice the rate of accesses to it.

FPGAMemoryAttributesINTEL

5832

MaxReplicatesINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that each copy of the memory synthesized for the variable or structure member should be replicated no more than the specified number of times for the purpose of enabling simultaneous accesses from different load/store sites in the program.

Literal Number
-Maximum Replicates

FPGAMemoryAttributesINTEL

5833

SimpleDualPortINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory that is configured such that no memory port services both stores and loads.

FPGAMemoryAttributesINTEL

5834

MergeINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory that is merged with any memories synthesized from arrays or structure members that are decorated with this decoration and the same specified merge key. The mechanism of this merging is specified as a subsequent literal string.

Literal String
-Merge Key

Literal String
-Merge Type

FPGAMemoryAttributesINTEL

5835

BankBitsINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a banked memory system, where the bits specified determine the pointer address bits to bank on.

Literal Number, Literal Number, …​
-Bank Bits

FPGAMemoryAttributesINTEL

5836

ForcePow2DepthINTEL
-Apply to a variable or a structure-type member. Request that the variable or structure member should be implemented in a memory that is a power-of-2 deep. This option is enabled if the subsequent literal number specified is 1, and disabled if the subsequent literal number specified is 0.

Literal Number
-Force Power-Of-2 Depth

FPGAMemoryAttributesINTEL

5883

StridesizeINTEL
-Apply to a variable or a structure-type member of array type. Request, to the extent possible, that Stride Size worth of consecutive array elements be placed in the same memory bank.

Literal Number
-Stride Size

FPGAMemoryAttributesINTEL

5884

WordsizeINTEL
-Apply to a variable or a structure-type member of array type. Request, to the extent possible, that a single memory transaction have the specified size, measured in array elements.

Literal Number
-Word Size

FPGAMemoryAttributesINTEL

5885

TrueDualPortINTEL
-Apply to a variable or a structure-type member. Request, to the extent possible, that the variable or structure member should be implemented in a memory that is configured such that all memory ports can service both stores and loads.

FPGAMemoryAttributesINTEL

-
-
-
-
-
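The MemoryINTEL row above takes a literal string operand restricted to three supported values. The Python sketch below shows one way a producer might validate the string and pack it into words. The helper names are hypothetical, and the packing rule (UTF-8, nul-terminated, four bytes per word with the first byte in the lowest-order position, zero-padded) is taken from the core SPIR-V specification's description of literal strings, not from this extension.

```python
import struct

MEMORY_INTEL = 5826  # decoration token from the table above
SUPPORTED_MEMORY_TYPES = ("DEFAULT", "MLAB", "BLOCK_RAM")


def encode_literal_string(text):
    """Pack a string into 32-bit words: UTF-8, nul-terminated, zero-padded."""
    data = text.encode("utf-8") + b"\x00"
    data += b"\x00" * (-len(data) % 4)  # pad to a whole number of words
    return list(struct.unpack("<%dI" % (len(data) // 4), data))


def memory_intel_operands(memory_type):
    """Return the decoration token followed by the packed Memory Type string."""
    if memory_type not in SUPPORTED_MEMORY_TYPES:
        raise ValueError(f"unsupported Memory Type: {memory_type!r}")
    return [MEMORY_INTEL] + encode_literal_string(memory_type)
```

These words would follow the target id in an `OpDecorate` or the member index in an `OpMemberDecorate` instruction.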

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5824

FPGAMemoryAttributesINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

A

2019-02-27

Joe Garvey

Initial public release

B

2019-03-18

Joe Garvey

Added MaxconcurrencyINTEL decoration. Fixed NumbanksINTEL capitalization

C

2019-04-23

Joe Garvey

Added SinglepumpINTEL and DoublepumpINTEL decorations

D

2019-06-06

Joe Garvey

Changed the name of MaxconcurrencyINTEL to MaxPrivateCopiesINTEL

E

2019-06-18

Joe Garvey

Added the MaxReplicatesINTEL, SimpleDualPortINTEL, and MergeINTEL decorations

F

2019-12-18

Julian Packer

Added the BankBitsINTEL decoration

G

2020-02-06

Mohammad Fawaz

Added the ForcePow2DepthINTEL decoration

H

2023-07-26

Artem Radzikhovskyy

Added StridesizeINTEL, WordsizeINTEL, TrueDualPortINTEL decorations

I

2023-10-03

Artem Radzikhovskyy

Definition clarifications; Defined supported strings in MemoryINTEL

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html + + +

extensions/INTEL/SPV_INTEL_fpga_memory_attributes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_fpga_reg.html b/extensions/INTEL/SPV_INTEL_fpga_reg.html index 4e6cdeb..f02386f 100644 --- a/extensions/INTEL/SPV_INTEL_fpga_reg.html +++ b/extensions/INTEL/SPV_INTEL_fpga_reg.html @@ -1,316 +1,12 @@ - - - - - - - -SPV_INTEL_fpga_reg - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_fpga_reg

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-07-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds an instruction which explicitly requests that a pipelining register be introduced at a particular point in a program (on a specific assignment). The instruction is useful for FPGA targets, to separate regions of the program that are expected to end up in geographically distant regions of a device. This instruction is purely an optimization hint, and is functionally equivalent to an assignment.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_fpga_reg"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FPGARegINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the FPGARegINTEL capability:

-
-
-
-
OpFPGARegINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

FPGARegINTEL

5948

OpFPGARegINTEL

5949

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5948

FPGARegINTEL

-
-
-
-
-

Instructions

-
-

In section 3.32.1, Miscellaneous Instructions, add a new instruction, OpFPGARegINTEL, as follows:

-
- ------- - - - - - - - - - - - - - -

OpFPGARegINTEL

-

Used to indicate to FPGA backends that pipelining registers should be inserted between the definition of Input and uses of Result. The value passed in as Input is returned in Result. This instruction is strictly an optimization hint and thus it would be functionally correct for a consumer to treat it as an assignment.

-

Result Type can be any type and is the type of both Result and Input.

Capability: -FPGARegINTEL

4

5949

<id>
-Result Type

Result <id>

Input <id>

-
-
-
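The instruction layout above (word count 4, opcode 5949, then Result Type, Result, and Input) can be sketched in Python as follows. Both function names are hypothetical. The first-word layout, with the word count in the high 16 bits and the opcode in the low 16 bits, is the general SPIR-V instruction encoding from the core specification.

```python
OP_FPGA_REG_INTEL = 5949  # opcode from the instruction table above
WORD_COUNT = 4            # opcode word + Result Type + Result + Input


def encode_fpga_reg(result_type_id, result_id, input_id):
    """Return the four 32-bit words of an OpFPGARegINTEL instruction."""
    first_word = (WORD_COUNT << 16) | OP_FPGA_REG_INTEL
    return [first_word, result_type_id, result_id, input_id]


def evaluate_fpga_reg(input_value):
    """Functional meaning of the instruction: Result is simply Input."""
    return input_value
```

A consumer that does not target an FPGA could, as the description permits, treat the instruction exactly like `evaluate_fpga_reg` and forward the input unchanged.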

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-07-12

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_fpga_reg.html + + +

extensions/INTEL/SPV_INTEL_fpga_reg.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html b/extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html index bf3d4f4..3542bdd 100644 --- a/extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html +++ b/extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html @@ -1,424 +1,12 @@ - - - - - - - -SPV_INTEL_global_variable_fpga_decorations - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_global_variable_fpga_decorations

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Artem Radzikhovskyy, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Mohammad Fawaz, Intel

    -
  • -
  • -

    Gregory Lueck, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021-2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-10-27

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds decorations that can be applied to global (module scope) -variables. These decorations are intended to help code generation for -FPGA devices; they can be ignored by all other consumers of this extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_global_variable_fpga_decorations"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
GlobalVariableFPGADecorationsINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - -

GlobalVariableFPGADecorationsINTEL

6189

InitModeINTEL

6190

ImplementInRegisterMapINTEL

6191

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Initialization Mode Qualifier

-
-

After Section 3.18, add a new section "3.18a Initialization Mode Qualifier" as follows:

-
-
-

Defines how the initialization should be triggered.

-
-
-

Used by InitModeINTEL.

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Initialization Mode Qualifier | Enabling Capabilities

0

InitOnDeviceReprogramINTEL

-

Initialization is performed by reprogramming - the device. This may require more frequent reprogramming but may reduce - area.

GlobalVariableFPGADecorationsINTEL

1

InitOnDeviceResetINTEL

-

Initialization is performed by sending a reset - signal to the device. This may increase area but may reduce reprogramming - frequency.

GlobalVariableFPGADecorationsINTEL

-
-
-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

6190

-

InitModeINTEL
-Only valid on global (module scope) OpVariable which has an Initializer -operand.

-
-
-

This decoration only has an effect when the consumer is an FPGA or similar -device. The Trigger value tells how the global variable should be -initialized.

-
-
-

If a global OpVariable with an Initializer operand is not decorated with -InitModeINTEL, the method by which the variable’s value is initialized is -implementation defined.

-

Initialization Mode Qualifier
-Trigger

GlobalVariableFPGADecorationsINTEL

6191

-

ImplementInRegisterMapINTEL
-Only valid on global (module scope) OpVariable.

-
-
-

This decoration only has an effect when the consumer is an FPGA or similar -device. The Value value controls the interface of this global variable with -hardware outside the boundary of the SPIR-V module.

-
-
-

Legal values of Value:

-
-
-
    -
  • -

    0 [False] - Access to this memory is through a dedicated interface.

    -
  • -
  • -

    1 [True] - Access to this memory is through a common register map interface that may be shared by other control or data inputs and outputs.

    -
  • -
-
-
-

If a global OpVariable is not decorated with ImplementInRegisterMapINTEL, the -interface for the variable is implementation defined.

-

Literal Number
-Value

GlobalVariableFPGADecorationsINTEL

-
-
-
-
-
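The constraints stated in the two decoration rows above can be expressed as a small Python check. This is a sketch under stated assumptions: the `GlobalVariable` model and the `check_decorations` function are invented here for illustration and do not correspond to any validator API. Only the enumerant values and the two rules (InitModeINTEL needs an Initializer; Value is 0 or 1) come from the text.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class InitMode(IntEnum):
    """Initialization Mode Qualifier values from the table above."""
    InitOnDeviceReprogramINTEL = 0
    InitOnDeviceResetINTEL = 1


@dataclass
class GlobalVariable:  # hypothetical model of a module-scope OpVariable
    name: str
    has_initializer: bool
    init_mode: Optional[InitMode] = None
    implement_in_register_map: Optional[int] = None


def check_decorations(var):
    """Return a list of problems; an empty list means nothing was found."""
    problems = []
    if var.init_mode is not None and not var.has_initializer:
        problems.append("InitModeINTEL requires an Initializer operand")
    if var.implement_in_register_map not in (None, 0, 1):
        problems.append("ImplementInRegisterMapINTEL Value must be 0 or 1")
    return problems
```

A variable that carries neither decoration passes the check; in that case the initialization method and the interface are implementation-defined, as noted above.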

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

6189

GlobalVariableFPGADecorationsINTEL

-
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2022-11-01

Gregory Lueck

Initial revision

2

2023-04-25

Artem Radzikhovskyy

Separated the FPGA-specific decorations from the generic ones

3

2023-10-27

Artem Radzikhovskyy

Reverting Capability ID

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html + + +

extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_global_variable_host_access.html b/extensions/INTEL/SPV_INTEL_global_variable_host_access.html index 971c977..b9ad1ba 100644 --- a/extensions/INTEL/SPV_INTEL_global_variable_host_access.html +++ b/extensions/INTEL/SPV_INTEL_global_variable_host_access.html @@ -1,419 +1,12 @@ - - - - - - - -SPV_INTEL_global_variable_host_access - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_global_variable_host_access

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Artem Radzikhovskyy, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Mohammad Fawaz, Intel

    -
  • -
  • -

    Gregory Lueck, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021-2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-10-27

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a decoration that can be applied to global (module scope) -variables. This decoration explicitly asserts that the global variable can be accessed outside the SPIR-V module.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_global_variable_host_access"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
GlobalVariableHostAccessINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

GlobalVariableHostAccessINTEL

6187

HostAccessINTEL

6188

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Host Access Qualifier

-
-

After Section 3.18, add a new section "3.18a Host Access Qualifier" as follows:

-
-
-

Defines the host system access permissions.

-
-
-

Used by HostAccessINTEL.

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Host Access Qualifier | Enabling Capabilities

0

NoneINTEL

-

The execution environment may neither read nor write the variable -from the host. On an FPGA device, no memory port is exposed.

GlobalVariableHostAccessINTEL

1

ReadINTEL

-

The execution environment may read the variable from the host but -will never write it. On an FPGA device, only a read memory port is exposed.

GlobalVariableHostAccessINTEL

2

WriteINTEL

-

The execution environment may write the variable from the host - but will never read it. On an FPGA device, only a write memory port is - exposed.

GlobalVariableHostAccessINTEL

3

ReadWriteINTEL

-

The execution environment may read or write the variable - from the host. On an FPGA device, a read/write memory port is exposed.

GlobalVariableHostAccessINTEL

-
-
-
-
-
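The four qualifier values in the table above map naturally onto an enumeration. The Python sketch below is illustrative only: the class and helper names are assumptions, while the enumerant names and values are copied from the table.

```python
from enum import IntEnum


class HostAccess(IntEnum):
    """Host Access Qualifier values from the table above."""
    NoneINTEL = 0
    ReadINTEL = 1
    WriteINTEL = 2
    ReadWriteINTEL = 3


def host_may_read(access):
    """True if the execution environment may read the variable from the host."""
    return access in (HostAccess.ReadINTEL, HostAccess.ReadWriteINTEL)


def host_may_write(access):
    """True if the execution environment may write the variable from the host."""
    return access in (HostAccess.WriteINTEL, HostAccess.ReadWriteINTEL)
```

A consumer could use such helpers to decide, for example, whether a variable that the host never writes can be treated as unchanging between kernel launches; that particular optimization is an example chosen here, not one named by the extension.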

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

6188

-

HostAccessINTEL
-Only valid on global (module scope) OpVariable.

-
-
-

The client API’s execution environment may provide a way to access a global -variable’s value from the host system. If it does, this decoration provides -two pieces of information. Access is an assertion by the producer about the -types of these accesses, which may allow the consumer to perform certain -optimizations. Name is a name which the client -API’s execution environment may use to identify this variable.

-
-
-

If a global OpVariable is not decorated with HostAccessINTEL, the default behavior is defined by the client API specification.

-

Host Access Qualifier
-Access

Literal String
-Name

GlobalVariableHostAccessINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6187

GlobalVariableHostAccessINTEL

-
-
-
-
-

Validation Rules

-
-
    -
  • -

    It is invalid for two HostAccessINTEL decorations in the same module to have the same Name operand.

    -
  • -
-
-
-
-
-
-
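The uniqueness rule above could be checked along the following lines. This is a minimal sketch: the tuple layout `(target_id, access, name)` and the function name `duplicate_host_access_names` are assumptions made for illustration, not an API of any real validator.

```python
from collections import Counter


def duplicate_host_access_names(decorations):
    """Return the Name operands used by more than one HostAccessINTEL decoration.

    `decorations` is a hypothetical list of (target_id, access, name) tuples,
    one per HostAccessINTEL decoration found in a module.
    """
    counts = Counter(name for _target, _access, name in decorations)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means the module satisfies the rule; any returned name identifies a conflict to report.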

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-11-1

Gregory Lueck

Initial revision

2

2023-04-25

Artem Radzikhovskyy

Address default behavior

3

2023-06-30

Artem Radzikhovskyy

Typo in capability

4

2023-10-27

Artem Radzikhovskyy

Reverting Capability ID

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_global_variable_host_access.html + + +

extensions/INTEL/SPV_INTEL_global_variable_host_access.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_io_pipes.html b/extensions/INTEL/SPV_INTEL_io_pipes.html index 6881350..2683f84 100644 --- a/extensions/INTEL/SPV_INTEL_io_pipes.html +++ b/extensions/INTEL/SPV_INTEL_io_pipes.html @@ -1,320 +1,12 @@ - - - - - - - -SPV_INTEL_io_pipes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_io_pipes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Dmitry Sidorov, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-02-25

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a decoration to identify pipes that correspond to hardware peripherals. This can be a useful programming model for any target which may directly interact with I/O.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_io_pipes"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
IOPipesINTEL
-
-
-
-
-
-

New Decorations

-
-
-

Decorations added under the IOPipesINTEL capability:

-
-
-
-
IOPipeStorageINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

IOPipesINTEL

5943

IOPipeStorageINTEL

5944

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding these rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5944

IOPipeStorageINTEL
-Apply to a pipe-storage object created from OpConstantPipeStorage. Indicates that the pipe storage object provides access to a hardware peripheral identified by the specified ID.

Literal Number
-IO Pipe ID

IOPipesINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5943

IOPipesINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-02-25

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_io_pipes.html + + +

extensions/INTEL/SPV_INTEL_io_pipes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_kernel_attributes.html b/extensions/INTEL/SPV_INTEL_kernel_attributes.html index 41e299c..0439723 100644 --- a/extensions/INTEL/SPV_INTEL_kernel_attributes.html +++ b/extensions/INTEL/SPV_INTEL_kernel_attributes.html @@ -1,458 +1,12 @@ - - - - - - - -SPV_INTEL_kernel_attributes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_kernel_attributes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joseph Garvey, Intel

    -
  • -
  • -

    Ajaykumar Kannan, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Ryan Murray, Intel

    -
  • -
  • -

    Abhishek Tiwari, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-12-05

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a variety of new execution modes, both general and target-specific. The target-specific execution modes are guarded by separate capabilities.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_kernel_attributes"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
KernelAttributesINTEL
-FPGAKernelAttributesINTEL
-FPGAKernelAttributesv2INTEL
-
-
-
-
-
-

New Execution Modes

-
-
-
-
MaxWorkgroupSizeINTEL
-MaxWorkDimINTEL
-NoGlobalOffsetINTEL
-NumSIMDWorkitemsINTEL
-SchedulerTargetFmaxMhzINTEL
-StreamingInterfaceINTEL
-RegisterMapInterfaceINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

KernelAttributesINTEL

5892

MaxWorkgroupSizeINTEL

5893

MaxWorkDimINTEL

5894

NoGlobalOffsetINTEL

5895

NumSIMDWorkitemsINTEL

5896

FPGAKernelAttributesINTEL

5897

FPGAKernelAttributesv2INTEL

6161

SchedulerTargetFmaxMhzINTEL

5903

StreamingInterfaceINTEL

6154

RegisterMapInterfaceINTEL

6160

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Execution Mode

-
-

Modify Section 3.6, Execution Mode, adding these rows to the Execution Mode table:

-
-
-
- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

5893

MaxWorkgroupSizeINTEL
-Indicates the maximum possible work-group size in the x, y, and z dimensions. If a LocalSize execution mode is applied to the same entry point, it is invalid for max_i_size to be less than the LocalSize value in dimension i. Only valid with the Kernel Execution Model.

Literal Number
-max_x_size

Literal Number
-max_y_size

Literal Number
-max_z_size

KernelAttributesINTEL

5894

MaxWorkDimINTEL
-Indicates the maximum number of work dimensions. Legal values range from 0 to 3. A maximum dimensionality of 0 indicates that the kernel can only be launched with a single work-item. If a LocalSize execution mode is applied to the same entry point, the size of each dimension beyond max_dimensions must be 1. Only valid with the Kernel Execution Model.

Literal Number
-max_dimensions

KernelAttributesINTEL

5895

NoGlobalOffsetINTEL
-Indicates that the global offset is always (0, 0, 0). Only valid with the Kernel Execution Model.

KernelAttributesINTEL

5896

NumSIMDWorkitemsINTEL
-Indicates that the kernel should be vectorized with the provided vector width. Only valid with the Kernel Execution Model.

Literal Number
-vector_width

FPGAKernelAttributesINTEL

5903

SchedulerTargetFmaxMhzINTEL
-Indicates the target clock frequency (Fmax) for the kernel, in MHz. Only valid with the Kernel Execution Model.

Literal Number
-target_fmax

FPGAKernelAttributesINTEL

6154

StreamingInterfaceINTEL
-Indicates that the kernel has a streaming interface, in which invocation of and return from the kernel is synchronized by a flow control handshaking protocol. StallFreeReturn is a 32-bit unsigned integer type scalar. If StallFreeReturn is equal to zero, it indicates that the return interface of the kernel can input a stall control flow signal from downstream logic, while a non-zero value indicates that it will not accept a stall control flow signal from downstream logic.

Literal
-StallFreeReturn

FPGAKernelAttributesINTEL

6160

RegisterMapInterfaceINTEL
-Indicates that the kernel has a single register based interface that is shared across all kernel control signals and kernel arguments. WaitForDoneWrite is a boolean type scalar. If WaitForDoneWrite is true, it indicates that the kernel interface will contain a stall register that can be used to back-pressure the kernel, while if it is false, it indicates that it will not.

Literal
-WaitForDoneWrite

FPGAKernelAttributesv2INTEL

-
-
-
-
-
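The constraints that MaxWorkgroupSizeINTEL and MaxWorkDimINTEL place on a LocalSize execution mode can be sketched as two small checks. The function names and the tuple representation of sizes are hypothetical; only the rules themselves are taken from the table above.

```python
def check_max_workgroup_size(local_size, max_size):
    """True when no LocalSize dimension exceeds its declared maximum.

    Both arguments are (x, y, z) tuples; the rule is invalid when
    max_i_size is smaller than the LocalSize value in dimension i.
    """
    return all(maximum >= size for size, maximum in zip(local_size, max_size))


def check_max_work_dim(local_size, max_dimensions):
    """True when max_dimensions is legal and every dimension beyond it is 1."""
    if not 0 <= max_dimensions <= 3:
        return False  # legal values range from 0 to 3
    return all(size == 1 for size in local_size[max_dimensions:])
```

For instance, a LocalSize of (64, 1, 1) is compatible with `max_dimensions` of 1, while (8, 8, 1) is not, because the y dimension lies beyond the declared dimensionality and is not 1.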

Capability

-
-

Modify Section 3.31, Capability, adding the following rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5892

KernelAttributesINTEL

5897

FPGAKernelAttributesINTEL

6161

FPGAKernelAttributesv2INTEL

FPGAKernelAttributesINTEL

-
-
-
-
-

Validation Rules

-
-

It is illegal to specify both StreamingInterfaceINTEL and RegisterMapInterfaceINTEL modes on the same entry point.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-12-18

Joe Garvey

Initial public release

2

2020-04-22

Jessica Davies

Added one new execution mode, SchedulerTargetFmaxMhzINTEL.

3

2021-09-14

Ajaykumar Kannan

Added one new execution mode, StreamingInterfaceINTEL.

4

2022-12-05

Abhishek Tiwari

Added one new execution mode, RegisterMapInterfaceINTEL, under a new capability.

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_kernel_attributes.html + + +

extensions/INTEL/SPV_INTEL_kernel_attributes.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_long_composites.html b/extensions/INTEL/SPV_INTEL_long_composites.html index ccdebbe..50c795f 100644 --- a/extensions/INTEL/SPV_INTEL_long_composites.html +++ b/extensions/INTEL/SPV_INTEL_long_composites.html @@ -1,431 +1,12 @@ - - - - - - - -SPV_INTEL_long_composites - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_long_composites

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Mariya Podchishchaeva, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Alexey Sachkov, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Shipping.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-22

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 2, Unified

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new capability and instructions that allow representing composites whose number of Constituents is greater than the maximum possible WordCount.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_long_composites"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6089

CapabilityLongCompositesINTEL
-Allows the use of the OpTypeStructContinuedINTEL, OpConstantCompositeContinuedINTEL, OpCompositeConstructContinuedINTEL and OpSpecConstantCompositeContinuedINTEL instructions.

-
-
-
-
-

Instructions

-
-

In section 3.42.6. Type-Declaration Instructions add the new instruction

-
- ----- - - - - - - - - - - - -

OpTypeStructContinuedINTEL

-

Continue specifying an OpTypeStruct whose number of Member types is greater than the maximum possible WordCount.

-

The previous instruction must be an OpTypeStruct or an OpTypeStructContinuedINTEL instruction.

-

Member types follow the same rules as defined for Member types of OpTypeStruct.

Capability: -CapabilityLongCompositesINTEL

2 + variable

6090

<id>, <id>, …​ Member N type, member N + 1 type

-
-

Modify the description of the OpTypeStruct instruction, adding the following sentence to the end: If it is not possible to specify all the member types of the structure with one OpTypeStruct instruction, i.e. if the number of members of the Result type is greater than the maximum possible WordCount, the remaining member types are specified by the following OpTypeStructContinuedINTEL instructions.

-
-
-

In section 3.42.7. Constant-Creation Instructions, add the new instructions

-
- ----- - - - - - - - - - - - -

OpConstantCompositeContinuedINTEL

-

Continue specifying an OpConstantComposite instruction whose number of Constituents is greater than the maximum possible WordCount.

-

The previous instruction must be an OpConstantComposite or an OpConstantCompositeContinuedINTEL instruction.

-

Constituents follow the same rules as defined for Constituents of the OpConstantComposite instruction and specify members of a structure, or elements of an array, or components of a vector, or columns of a matrix.

Capability: -CapabilityLongCompositesINTEL

2 + variable

6091

<id>, <id>, …​ Constituents

- ----- - - - - - - - - - - - -

OpSpecConstantCompositeContinuedINTEL

-

Continue specifying an OpSpecConstantComposite instruction whose number of Constituents is greater than the maximum possible WordCount.

-

The previous instruction must be an OpSpecConstantComposite or an OpSpecConstantCompositeContinuedINTEL instruction.

-

Constituents follow the same rules as defined for Constituents of the OpSpecConstantComposite instruction and specify members of a structure, or elements of an array, or components of a vector, or columns of a matrix.

-

This instruction will be specialized to an OpConstantCompositeContinuedINTEL instruction.

-

See Specialization.

Capability: -CapabilityLongCompositesINTEL

2 + variable

6092

<id>, <id>, …​ Constituents

-
-

Modify the description of the OpConstantComposite instruction, adding the following sentence to the end: If it is not possible to specify all the Constituents with one OpConstantComposite instruction, i.e. if the number of members of the Result type and corresponding Constituents is greater than the maximum possible WordCount, the remaining Constituents are specified by the following OpConstantCompositeContinuedINTEL instructions.

-
-
-

Modify the description of the OpSpecConstantComposite instruction, adding the following sentence to the end: If it is not possible to specify all the Constituents with one OpSpecConstantComposite instruction, i.e. if the number of members of the Result type and corresponding Constituents is greater than the maximum possible WordCount, the remaining Constituents are specified by the following OpSpecConstantCompositeContinuedINTEL instructions.

-
-
-

Modify the description of the OpCompositeConstruct instruction, adding the following sentence to the end: If it is not possible to specify all the Constituents with one OpCompositeConstruct instruction, i.e. if the number of members of the Result type and corresponding Constituents is greater than the maximum possible WordCount, the remaining Constituents are specified by the following OpCompositeConstructContinuedINTEL instructions.

-
-
-

In section 3.42.12. Composite Instructions, add the new instruction

-
- ----- - - - - - - - - - - - -

OpCompositeConstructContinuedINTEL

-

Continue specifying an OpCompositeConstruct instruction whose number of Constituents is greater than the maximum possible WordCount.

-

The previous instruction must be an OpCompositeConstruct or an OpCompositeConstructContinuedINTEL instruction.

-

Constituents follow the same rules as defined for Constituents of the OpCompositeConstruct instruction and specify members of a structure, or elements of an array, or components of a vector, or columns of a matrix.

Capability: -CapabilityLongCompositesINTEL

2 + variable

6096

<id>, <id>, …​ Constituents

-
-
-
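The continuation scheme described above amounts to splitting one long operand list across a leading instruction and zero or more continuation instructions. In core SPIR-V the word count occupies the high 16 bits of an instruction's first word, so a single instruction holds at most 65535 words. The sketch below is a hypothetical producer-side helper: `split_operands` and its parameters are illustrative names, and the number of fixed (non-operand) words for each instruction is left as a parameter to be taken from the instruction tables rather than asserted here.

```python
MAX_WORD_COUNT = 0xFFFF  # WordCount is a 16-bit field in the first instruction word


def split_operands(operands, first_fixed_words, continued_fixed_words):
    """Split operand ids into one leading chunk plus continuation chunks.

    first_fixed_words: words the leading instruction spends on anything other
        than the variable operands (assumed to be supplied by the caller).
    continued_fixed_words: the same for each ...ContinuedINTEL instruction.
    """
    first_capacity = MAX_WORD_COUNT - first_fixed_words
    continued_capacity = MAX_WORD_COUNT - continued_fixed_words
    chunks = [operands[:first_capacity]]
    rest = operands[first_capacity:]
    for start in range(0, len(rest), continued_capacity):
        chunks.append(rest[start:start + continued_capacity])
    return chunks
```

The first chunk would be emitted with the base instruction and each later chunk with the matching continuation instruction, in order, so that concatenating the chunks reproduces the original operand list.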

Validation Rules

-
-

Previous instruction to OpTypeStructContinuedINTEL must be OpTypeStruct or OpTypeStructContinuedINTEL.
-Previous instruction to OpConstantCompositeContinuedINTEL must be OpConstantComposite or OpConstantCompositeContinuedINTEL.
-Previous instruction to OpCompositeConstructContinuedINTEL must be OpCompositeConstruct or OpCompositeConstructContinuedINTEL.
-Previous instruction to OpSpecConstantCompositeContinuedINTEL must be OpSpecConstantComposite or OpSpecConstantCompositeContinuedINTEL.

-
-
-
-
-
-
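The four ordering rules above share one shape: a continuation instruction must directly follow either its base instruction or another continuation of the same kind. A minimal sketch of such a check follows; representing a module as a flat list of opcode names, and the name `misplaced_continuations`, are assumptions made for illustration.

```python
# Each continuation opcode mapped to the base instruction it extends.
CONTINUATION_OF = {
    "OpTypeStructContinuedINTEL": "OpTypeStruct",
    "OpConstantCompositeContinuedINTEL": "OpConstantComposite",
    "OpCompositeConstructContinuedINTEL": "OpCompositeConstruct",
    "OpSpecConstantCompositeContinuedINTEL": "OpSpecConstantComposite",
}


def misplaced_continuations(opcodes):
    """Return indices of continuation instructions with an invalid predecessor."""
    errors = []
    previous = None
    for index, opcode in enumerate(opcodes):
        base = CONTINUATION_OF.get(opcode)
        # A continuation may follow its base opcode or the same continuation.
        if base is not None and previous not in (base, opcode):
            errors.append(index)
        previous = opcode
    return errors
```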

Issues

-
-
-

1) Do we need to define additional validation rules?

-
-
-

Resolution:

-
-
-

Yes, added the validation rules for the new instructions.

-
-
-

2) Do we need modifications of the OpConstantComposite/OpSpecConstantComposite instruction description?

-
-
-

Resolution:

-
-
-

Yes. The description of these instructions defines a one-to-one match between composite type members and Constituents through the sentence: "There must be exactly one Constituent for each top-level member/element/component/column of the result." Done.

-
-
-

3) We also might want to modify OpAccessChain to clarify how it works on large constants.

-
-
-

Resolution:

-
-
-

No. The existing description of OpAccessChain in the core SPIR-V spec is good enough.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-03-22

Mariya Podchishchaeva

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_long_composites.html + + +

extensions/INTEL/SPV_INTEL_long_composites.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_loop_fuse.html b/extensions/INTEL/SPV_INTEL_loop_fuse.html index 4f32787..e5dc40b 100644 --- a/extensions/INTEL/SPV_INTEL_loop_fuse.html +++ b/extensions/INTEL/SPV_INTEL_loop_fuse.html @@ -1,319 +1,12 @@ - - - - - - - -SPV_INTEL_loop_fuse - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_loop_fuse

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-11-24

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a function decoration to request that loops meeting defined conditions be fused with each other.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_loop_fuse"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
LoopFuseINTEL
-
-
-
-
-
-

New Decorations

-
-
-

This extension adds the following decoration under the LoopFuseINTEL capability:

-
-
-
-
FuseLoopsInFunctionINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

LoopFuseINTEL

5906

FuseLoopsInFunctionINTEL

5907

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Decoration

-
-

Modify Section 3.20, Decoration, adding the following row to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

5907

FuseLoopsInFunctionINTEL
-Only valid on OpFunction. Request, to the extent possible, that loops in the function be fused if they are contained in strictly fewer than Depth other loops in the function. Depth is a 32-bit unsigned integer type scalar. Independent is a 32-bit unsigned integer type scalar. If Independent is non-zero, it guarantees that fusing loops in the function that are contained in strictly fewer than Depth other loops within the function does not change the order of any dependent memory accesses.

Literal
-Depth

Literal
-Independent

LoopFuseINTEL

-
-
-
-
-
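The phrase "contained in strictly fewer than Depth other loops" can be made concrete with a small filter. In this sketch each loop is described by the number of other loops in the function that enclose it; the mapping layout and the name `loops_eligible_for_fusion` are hypothetical, and the sketch only selects candidates without performing any fusion.

```python
def loops_eligible_for_fusion(enclosing_loop_counts, depth):
    """Return the loops that FuseLoopsInFunctionINTEL asks to be fused.

    enclosing_loop_counts maps a loop name to the number of other loops in the
    function that contain it (0 for an outermost loop).
    """
    return [name for name, enclosing in enclosing_loop_counts.items()
            if enclosing < depth]
```

With a Depth of 2, loops nested inside zero or one other loop qualify, while a loop nested inside two others does not.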

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5906

LoopFuseINTEL

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-11-24

Jessica Davies

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_loop_fuse.html + + +

extensions/INTEL/SPV_INTEL_loop_fuse.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_masked_gather_scatter.html b/extensions/INTEL/SPV_INTEL_masked_gather_scatter.html index bc2613b..edf8e94 100644 --- a/extensions/INTEL/SPV_INTEL_masked_gather_scatter.html +++ b/extensions/INTEL/SPV_INTEL_masked_gather_scatter.html @@ -1,504 +1,12 @@ - - - - - - - -SPV_INTEL_masked_gather_scatter - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_masked_gather_scatter

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Arvind Sudarsanam, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Shipping.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-09-05

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension allows OpTypeVector to have a physical pointer type Component Type and introduces gather/scatter instructions. These are important operations for many explicitly vectorized kernels.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_masked_gather_scatter"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
MaskedGatherScatterINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the MaskedGatherScatterINTEL capability:

-
-
-
-
OpMaskedGatherINTEL
-OpMaskedScatterINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - -

MaskedGatherScatterINTEL

6427

OpMaskedGatherINTEL

6428

OpMaskedScatterINTEL

6429

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

2.2.2. Types

-
-

Update the definition of Vector, adding pointers to the set of supported component types: An ordered homogeneous collection of two or more scalars or pointers of physical pointer type. Vector sizes are quite restrictive and dependent on the execution model.

-
-
-
-

2.16.1. Universal Validation Rules

-
-

Modify Data rules section, replacing following segment:

-
-
-
    -
  • -

    Vector types must be parameterized only with numerical types or the OpTypeBool type.

    -
  • -
-
-
-

with:

-
-
-
    -
  • -

    Vector types must be parameterized only with numerical types or the OpTypeBool type. They can also be parameterized with physical pointer types under the MaskedGatherScatterINTEL capability.

    -
  • -
-
-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6427

MaskedGatherScatterINTEL
-
-Allow OpTypeVector to have a physical pointer type Component Type.
-
-See also extension: SPV_INTEL_masked_gather_scatter

Addresses

-
-
-
-
-

3.42.6. Type-Declaration Instructions

-
-

Modify OpTypeVector, changing the description of Component Type to: Component Type is the type of each component in the resulting type. It must be a scalar type or physical pointer type.

-
-
-
-

3.42.7. Constant-Creation Instructions

-
-

Modify OpConstantNull, allowing Result Type to be a vector of physical pointer type.

-
-
-
-

3.42.8. Memory Instructions

-
-

Allow a vector of physical pointer type to be used by the OpVariable, OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain, OpInBoundsPtrAccessChain, OpPtrEqual, OpPtrNotEqual and OpPtrDiff instructions. Because a vector of physical pointer type is allowed for OpVariable, it can implicitly also be used by OpStore and OpLoad, which can store/load through a pointer to this vector.

-
-
-

Change the Overview of OpVariable as follows: Allocate an object or a vector of objects in memory, resulting in a pointer or, correspondingly, a vector of pointers to it, which can be used with OpLoad and OpStore. Change the Result Type of OpVariable as follows: Result Type must be an OpTypePointer or a vector with physical pointer type Component Type. Its Type operand is the type of object or vector of objects in memory.

-
-
-

Modify OpAccessChain (this implicitly modifies the OpInBoundsAccessChain, OpPtrAccessChain and OpInBoundsPtrAccessChain instructions). Change the Base as follows: Base must be a pointer, pointing to the base of a composite object or a vector of physical pointer type.

-
-
-

Allow vector of physical pointer type to be the type of Operand 1 and Operand 2 of OpPtrEqual, OpPtrNotEqual and OpPtrDiff instructions. If operands are vectors of pointers, then the Result Type of OpPtrEqual and OpPtrNotEqual is a vector with boolean Component Type and Result Type of OpPtrDiff is a vector with integer Component Type.

-
-
-

Add the following new entries:

-
- ---------- - - - - - - - - - - - - - - - - -

OpMaskedGatherINTEL
-
-Reads values from a vector of pointers, gathering them into one vector. Returns the gathered vector. Memory access is specified by a mask instruction parameter.
-
-Result Type is the type of the gathered vector. Its Component Type must be the same as the base type of PtrVector.
-PtrVector is a vector with physical pointer type Component Type, containing addresses from where the instruction reads.
-
-Alignment is an unsigned 32-bit integer literal whose value is either 0 or a power of two. When the value is not 0, it is an assertion that each pointer value in PtrVector has this alignment. The behavior is undefined if any pointer value in PtrVector does not have this alignment.
-
-Mask is a vector of boolean values with the same number of elements as the Result Type. It specifies which elements of PtrVector should be gathered.
-
-FillEmpty is used to fill the masked-off lanes of the result. It must be of the same type as the Component Type of Result Type.

Capability:
-MaskedGatherScatterINTEL

7

6428

<id>
-Result Type

Result <id>

<id>
-PtrVector

<literal>
-Alignment

<id>
-Mask

<id>
-FillEmpty

- -------- - - - - - - - - - - - - - - -

OpMaskedScatterINTEL
-
-Writes values from a vector to the corresponding memory address of the given vector of pointers. Memory access is specified by a mask instruction parameter.
-
-InputVector is a vector of values to scatter.
-
-PtrVector is a vector with physical pointer type Component Type, containing addresses where the instruction stores the scattered values.
-
-Alignment is an unsigned 32-bit integer literal whose value is either 0 or a power of two. When the value is not 0, it is an assertion that each pointer value in PtrVector has this alignment. The behavior is undefined if any pointer value in PtrVector does not have this alignment.
-
-Mask is a vector of boolean values with the same number of elements as the InputVector. It specifies which elements of InputVector should be scattered.

Capability:
-MaskedGatherScatterINTEL

5

6429

<id>
-InputVector

<id>
-PtrVector

<literal>
-Alignment

<id>
-Mask

-
-
-

3.42.11. Conversion Instructions

-
-

Allow OpTypeVector to be the Result Type and the type of an input for the OpConvertPtrToU and OpConvertUToPtr instructions. Change the Result Type of OpConvertPtrToU as follows: Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.

-
-
-

Change the Pointer of OpConvertPtrToU as follows: Pointer must be a physical pointer type or a vector with physical pointer type Component Type. If the bit width of Pointer is smaller than that of Result Type, the conversion zero-extends Pointer. If the bit width of Pointer is larger than that of Result Type, the conversion truncates Pointer. For same bit width Pointer and Result Type, this is the same as OpBitcast.

-
-
-

Change the Result Type of OpConvertUToPtr as follows: Result Type must be a physical pointer type or a vector with physical pointer type Component Type.

-
-
-

Change the Integer Value of OpConvertUToPtr as follows: Integer Value must be a scalar or vector of integer type, whose Signedness operand is 0. If the bit width of Integer Value is smaller than that of Result Type, the conversion zero-extends Integer Value. If the bit width of Integer Value is larger than that of Result Type, the conversion truncates Integer Value. For same width Integer Value and Result Type, this is the same as OpBitcast.

-
-
-
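The zero-extend and truncate rules for these conversions reduce to masking an unsigned value to the destination width, applied per lane when the operands are vectors. The helper names below are hypothetical, and pointers are modeled as plain unsigned integers for the sake of the sketch.

```python
def convert_unsigned(value, src_bits, dst_bits):
    """Zero-extend or truncate an unsigned value from src_bits to dst_bits."""
    value &= (1 << src_bits) - 1          # interpret the input at its own width
    return value & ((1 << dst_bits) - 1)  # keep only the bits that fit the result


def convert_unsigned_vector(values, src_bits, dst_bits):
    """Apply the same conversion independently to every lane of a vector."""
    return [convert_unsigned(v, src_bits, dst_bits) for v in values]
```

When the two widths are equal the value passes through unchanged, which matches the statement that the conversion is then the same as OpBitcast.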

Allow vector of physical pointer type to be Result Type and type of a Pointer for OpPtrCastToGeneric, OpGenericCastToPtr and OpGenericCastToPtrExplicit instructions.

-
-
-

Allow vector of physical pointer type to be Result Type and type of an Operand for OpBitcast instruction.

-
-
-
-

3.42.12. Composite Instructions

-
-

Most of the Composite Instructions that are supposed to work with vector type do not have any restrictions about Component Type. This extension allows these instructions to operate on vector of physical pointer type.

-
-
-

Allow physical pointer type to be a Result Type of OpVectorExtractDynamic.

-
-
-
-

Issues

-
-

None

-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-09-05

Dmitry Sidorov

Prepare to ship

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_masked_gather_scatter.html + + +

extensions/INTEL/SPV_INTEL_masked_gather_scatter.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_maximum_registers.html b/extensions/INTEL/SPV_INTEL_maximum_registers.html index 5a6a007..f4bde2c 100644 --- a/extensions/INTEL/SPV_INTEL_maximum_registers.html +++ b/extensions/INTEL/SPV_INTEL_maximum_registers.html @@ -1,401 +1,12 @@ - - - - - - - -SPV_INTEL_maximum_registers - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_maximum_registers

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Greg Lueck, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-02-05

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds an execution mode to specify the maximum number of registers a SPIR-V consumer should use when compiling an entry point. This is only a hint: it does not modify the functional behavior of the program, but it can change its performance characteristics.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_maximum_registers"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Validation Rules

-
-

Add validation rules to section 2.16.1 Universal Validation Rules under Entry Point:

-
-
-
    -
  • -

    Each OpEntryPoint must contain at most one of the -MaximumRegistersINTEL, MaximumRegistersIdINTEL, or -NamedMaximumRegistersINTEL execution modes.

    -
  • -
-
-
-
-
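The validation rule above can be sketched as a small check. This is an illustrative sketch only, not part of the extension: the function name `register_limit_modes_valid` and the flat array of execution mode tokens are assumptions, and the rule is read here as "at most one register-limit execution mode in total per entry point". The token values come from the Execution Modes table in this extension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Execution mode token values from the Execution Modes table. */
enum {
    EXEC_MODE_MAXIMUM_REGISTERS_INTEL       = 6461,
    EXEC_MODE_MAXIMUM_REGISTERS_ID_INTEL    = 6462,
    EXEC_MODE_NAMED_MAXIMUM_REGISTERS_INTEL = 6463
};

/*
 * Hypothetical helper: `modes` holds the execution mode tokens attached to a
 * single entry point. Returns true when at most one of the three
 * register-limit modes is present.
 */
bool register_limit_modes_valid(const uint32_t *modes, size_t count)
{
    size_t seen = 0;

    for (size_t i = 0; i < count; ++i) {
        switch (modes[i]) {
        case EXEC_MODE_MAXIMUM_REGISTERS_INTEL:
        case EXEC_MODE_MAXIMUM_REGISTERS_ID_INTEL:
        case EXEC_MODE_NAMED_MAXIMUM_REGISTERS_INTEL:
            ++seen;
            break;
        default:
            /* Unrelated execution modes do not affect this rule. */
            break;
        }
    }
    return seen <= 1;
}
```

A real validator would first collect the OpExecutionMode and OpExecutionModeId instructions that target each OpEntryPoint; that step is omitted here.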

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6460

RegisterLimitsINTEL
-Specifies the maximum number of registers that may be used by an entry point.

-
-
-
-
-

Execution Modes

-
-

Modify Section 3.6, Execution Mode, adding rows to the Execution Mode table:

-
- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

6461

MaximumRegistersINTEL
-Specifies the maximum number of registers to be allocated to a single -invocation. -This is a performance hint only. -If the specified number of registers is not supported then the compiler may -choose the closest number of supported registers or may ignore the request.

Literal
-Number of Registers

RegisterLimitsINTEL

6462

MaximumRegistersIdINTEL
-Same as the MaximumRegistersINTEL execution mode but using an <id> -operand instead of a literal. -The operand must be an integer type scalar and is interpreted as an unsigned -value.

<id>
-Number of Registers

RegisterLimitsINTEL

6463

NamedMaximumRegistersINTEL
-Specifies the maximum number of registers to be allocated to a single invocation -using a named policy rather than a specific numeric number of registers. -This is a performance hint only. -If the named policy is not supported then the compiler may ignore the request.

Named Maximum Number of Registers
-Named Maximum Number of Registers

RegisterLimitsINTEL

-
-
-

Named Maximum Number of Registers

-
-

Add a new Section 3.XX, "Named Maximum Number of Registers":

-
-
-

Specify the maximum number of registers using a named policy. -A named maximum number of registers policy is a symbolic name describing -desired properties that may influence the maximum number of registers allocated -to a single invocation.

-
-
-
- ----- - - - - - - - - - - - - - -
Named Maximum Number of RegistersEnabling Capabilities

0

AutoINTEL
-Choose the maximum number of registers automatically to minimize register spills.

RegisterLimitsINTEL

-
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Do we need to support both the literal and <id> execution modes?

    -
    -
    -
    -

    RESOLVED: Because different devices may support differently sized register -files it is valuable to support specifying the maximum number of registers -using a specialization constant.

    -
    -
    -
    -
  2. -
  3. -

    Should we support other "performance tuning directives" in addition to the -maximum number of registers?

    -
    -
    -
    -

    RESOLVED: Not in this extension.

    -
    -
    -
    -
  4. -
  5. -

    What should behavior be when no maximum number of registers is specified for -an entry point?

    -
    -
    -
    -

    RESOLVED: This is outside of the scope of this extension, but for informative -purposes: behavior should be considered implementation-defined when no explicit -maximum number of registers is specified for an entry point. Some possible -valid implementations could be: the compiler chooses a fixed number of registers -for simplicity and predictability, or the compiler chooses a number of registers -based on heuristics to balance parallelism and register spills.

    -
    -
    -
    -
  6. -
  7. -

    What should the named maximum number of registers policy be in the initial -version of this extension?

    -
    -
    -
    -

    RESOLVED: The name is colloquially known as "auto" therefore it is the name -that is used currently.

    -
    -
    -

    Note that the behavior is implementation-defined both with this named policy and -when the entry point does not describe any specific maximum number of registers, -although it is a different implementation, at least for current Intel GPUs.

    -
    -
    -
    -
  8. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-02-05

Ben Ashbaugh

Initial public revision.

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_maximum_registers.html + + +

extensions/INTEL/SPV_INTEL_maximum_registers.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_media_block_io.html b/extensions/INTEL/SPV_INTEL_media_block_io.html index 3755810..2c0ebdf 100644 --- a/extensions/INTEL/SPV_INTEL_media_block_io.html +++ b/extensions/INTEL/SPV_INTEL_media_block_io.html @@ -1,369 +1,12 @@ - - - - - - - -SPV_INTEL_media_block_io - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_media_block_io

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Biju George, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Final Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-10-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds additional subgroup block read and write functionality that allows applications to flexibly specify the width and height of the block to read from or write to a 2D image.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_media_block_io"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
SubgroupImageMediaBlockIOINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the SubgroupImageMediaBlockIOINTEL capability:

-
-
-
-
OpSubgroupImageMediaBlockReadINTEL
-OpSubgroupImageMediaBlockWriteINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - -

SubgroupImageMediaBlockIOINTEL

5579

OpSubgroupImageMediaBlockReadINTEL

5580

OpSubgroupImageMediaBlockWriteINTEL

5581

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
- ------ - - - - - - - - - - - - - - - -
CapabilityImplicitly DeclaresEnabled by Extension

5579

SubgroupImageMediaBlockIOINTEL

SPV_INTEL_media_block_io

-
-
-

Instructions

-
-

Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions:

-
- ---------- - - - - - - - - - - - - - - - - -

OpSubgroupImageMediaBlockReadINTEL
-
-Reads a block of data from a 2D region of the specified Image.

-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2.

-

The Result Type is used to compute the maximum size of the 2D region that may be read, and may be a scalar or a vector type. For this description, the block read Component Type is defined as the Result Type if Result Type is a scalar type, or as the vector Component Type if Result Type is a vector type.

-

The size of the 2D region to read is specified by the Width and Height operands. The Width of the 2D region is expressed in units of the block read Component Type, and the Height directly describes the number of rows of the 2D region to read.

-

The Coordinate describes where in the Image to start reading the 2D region, and must be a vector of integer type. The first component of the Coordinate is a byte coordinate into a row of the Image and is not expressed in units of the block read Component Type. Remaining coordinates are non-normalized texel coordinates.

-

The data read from the 2D region is assigned to Result for each invocation in the subgroup by reorganizing the data read from the 2D region into a different 2D destination region in row major order. The width of the destination 2D region is equivalent to SubgroupMaxSize in units of the Component Type, and the height of the destination 2D region is equivalent to one if Result Type is a scalar type, or to the vector Component Count if Result Type is a vector type. The reorganization assigns rows of the source 2D region to the destination 2D region in row-major order. If the source 2D region row byte width is not a power-of-two then the source 2D region row is assigned to the destination 2D region with undefined padding values to make a power-of-two row byte width. Each invocation in the subgroup is then assigned a Component Type column vector of the reorganized 2D region, i.e. each invocation's subsequent data element's index is strided by the SubgroupMaxSize in the 2D destination region.

-

If the size of the requested 2D region to read is smaller than the 2D destination region then some tail components of some Result values will not be assigned values. If the size of the 2D region to read is larger than the 2D destination region then some parts of the 2D region to read will not be assigned and will be dropped.

Capability:
-SubgroupImageMediaBlockIOINTEL

7

5580

<id> Result Type

<id> Result

<id> Image

<id> Coordinate

<id> Width

<id> Height

- --------- - - - - - - - - - - - - - - - -

OpSubgroupImageMediaBlockWriteINTEL
-
-Writes a block of data into a 2D region of the specified Image.

-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2.

-

The type of Data is used to compute the maximum size of the 2D region that may be written, and may be a scalar or a vector type. For this description, the block write Component Type is defined as the type of Data if it is a scalar type, or the vector Component Type if it is a vector type.

-

The size of the 2D region to write is specified by the Width and Height operands. The Width of the 2D region is expressed in units of the block write Component Type, and the Height directly describes the number of rows of the 2D region to write.

-

The Coordinate describes where in the Image to start writing the 2D region, and must be a vector of integer type. The first component of the Coordinate is a byte coordinate into a row of the Image and is not expressed in units of the Component Type. Remaining coordinates are non-normalized texel coordinates.

-

The Data for each invocation in the subgroup collectively forms a 2D source region, where the width of the 2D source region is equivalent to the SubgroupMaxSize in units of the block write Component Type, and the height of the 2D source region is equivalent to one if the type of Data is a scalar type, or to the vector Component Count if it is a vector type. This 2D source region is then reorganized into a different 2D region to write. The reorganization assigns data from the 2D source region to rows of the 2D region to write in row-major order. If the row byte width of the 2D region to write is not a power-of-two, then some values from the 2D source region are skipped during assignment, so that each row of the 2D region to write begins at a byte offset that is a power-of-two.

-

If the size of the 2D source region is greater than the size of the 2D region to write then some tail components of some Data values will not be written. This SPIR-V extension does not require any specific behavior when the size of the 2D source region is smaller than the size of the 2D region to write, but some environments may define behavior for this case.

Capability:
-SubgroupImageMediaBlockIOINTEL

6

5581

<id> Image

<id> Coordinate

<id> Width

<id> Height

<id> Data

-
-
-
-
-
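Two pieces of arithmetic in the block read and write descriptions above can be sketched in C: the power-of-two padding of a row's byte width, and the SubgroupMaxSize stride between successive components of one invocation. This is an informal sketch under stated assumptions, not a definition of the instructions: the helper names are hypothetical, and the component size is passed in bytes because this extension does not name a fixed type.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n. Assumes 1 <= n <= 2^31. */
uint32_t round_up_pow2(uint32_t n)
{
    assert(n >= 1 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Row byte width after padding. `width` is the Width operand, expressed in
 * units of the block Component Type; `component_bytes` is the size of that
 * type in bytes (an assumption of this sketch).
 */
uint32_t padded_row_bytes(uint32_t width, uint32_t component_bytes)
{
    return round_up_pow2(width * component_bytes);
}

/*
 * Row-major index, in Component Type units, of the element that invocation
 * `invocation` holds in vector component `component`. The reorganized region
 * is SubgroupMaxSize elements wide, so successive components of one
 * invocation are SubgroupMaxSize elements apart.
 */
uint32_t dest_element_index(uint32_t invocation, uint32_t component,
                            uint32_t subgroup_max_size)
{
    return component * subgroup_max_size + invocation;
}
```

For example, a row of three 4-byte components occupies 12 bytes and is padded to 16, so one component's worth of each padded row holds undefined values on a read.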

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-10-29

Ben Ashbaugh

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_media_block_io.html + + +

extensions/INTEL/SPV_INTEL_media_block_io.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_runtime_aligned.html b/extensions/INTEL/SPV_INTEL_runtime_aligned.html index eef3cdf..a0521b5 100644 --- a/extensions/INTEL/SPV_INTEL_runtime_aligned.html +++ b/extensions/INTEL/SPV_INTEL_runtime_aligned.html @@ -1,294 +1,12 @@ - - - - - - - -SPV_INTEL_runtime_aligned - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_runtime_aligned

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-06-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension introduces a new function parameter attribute that can be applied to pointers to indicate that the pointer was allocated with an implementation-specific alignment.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_runtime_aligned"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
RuntimeAlignedAttributeINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

RuntimeAlignedAttributeINTEL

5939

RuntimeAlignedINTEL

5940

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5 Revision 5

-
-
-

Function Parameter Attribute

-
-

Modify Section 3.19, Function Parameter Attribute, adding a row to the table:

-
-
-
- ----- - - - - - - - - - - - - - -
Function Parameter AttributeEnabling Capabilities

5940

RuntimeAlignedINTEL
-Indicates that the pointer comes directly from a runtime allocation and was not offset in any way. This pointer can thus be assumed to have the implementation-defined alignment with which the corresponding runtime is known to allocate pointers. Only valid for pointer parameters. Not valid on return values.

RuntimeAlignedAttributeINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5939

RuntimeAlignedAttributeINTEL

Kernel

-
-
-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-06-29

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_runtime_aligned.html + + +

extensions/INTEL/SPV_INTEL_runtime_aligned.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_shader_integer_functions2.html b/extensions/INTEL/SPV_INTEL_shader_integer_functions2.html index cf9c62b..c9c78cb 100644 --- a/extensions/INTEL/SPV_INTEL_shader_integer_functions2.html +++ b/extensions/INTEL/SPV_INTEL_shader_integer_functions2.html @@ -1,817 +1,12 @@ - - - - - - - -SPV_INTEL_shader_integer_functions2 - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_shader_integer_functions2

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Ian Romanick, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Final Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-01-22

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides the functionality of the INTEL_shader_integer_functions2 OpenGL Shading Language extension in SPIR-V.

-
-
-

This extension introduces several new integer instructions to SPIR-V for use in graphics shaders. Many of these instructions have pre-existing counterparts in the Kernel environment.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_shader_integer_functions2"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
IntegerFunctions2INTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the IntegerFunctions2INTEL capability:

-
-
-
-
OpUCountLeadingZeros
-OpUCountTrailingZeros
-OpAbsISub
-OpAbsUSub
-OpIAddSat
-OpUAddSat
-OpIAverage
-OpUAverage
-OpIAverageRounded
-OpUAverageRounded
-OpISubSat
-OpUSubSat
-OpIMul32x16
-OpUMul32x16
-
-
-
-
-
-

Token Number Assignments

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NameValueUsage

IntegerFunctions2INTEL

5584

Capability

OpUCountLeadingZeros

5585

Opcode

OpUCountTrailingZeros

5586

Opcode

OpAbsISub

5587

Opcode

OpAbsUSub

5588

Opcode

OpIAddSat

5589

Opcode

OpUAddSat

5590

Opcode

OpIAverage

5591

Opcode

OpUAverage

5592

Opcode

OpIAverageRounded

5593

Opcode

OpUAverageRounded

5594

Opcode

OpISubSat

5595

Opcode

OpUSubSat

5596

Opcode

OpIMul32x16

5597

Opcode

OpUMul32x16

5598

Opcode

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

5584

IntegerFunctions2INTEL
-Allow this new functionality…​

-
-
-
-

(Add to the table in 3.32.13, Arithmetic Instructions)

-
- ------- - - - - - - - - - - - - -

OpUCountLeadingZeros
-
-Returns the number of leading 0-bits, starting at the most significant bit, in the binary representation of value. If value is zero, the size in bits of the type of value or component type of value (if value is a vector) will be returned.
-
-Result Type must be a scalar or vector of integer type, whose Width operand is 32 and whose Signedness operand is 0.
-
-The type of Operand must be the same as Result Type.

4

5585

<id>
-Result Type

Result <id>

<id>
-Operand

- ------- - - - - - - - - - - - - -

OpUCountTrailingZeros
-
-Returns the number of trailing 0-bits, starting at the least significant bit, in the binary representation of value. If value is zero, the size in bits of the type of value or component type of value (if value is a vector) will be returned.
-
-Result Type must be a scalar or vector of integer type, whose Width operand is 32 and whose Signedness operand is 0.
-
-The type of Operand must be the same as Result Type.

4

5586

<id>
-Result Type

Result <id>

<id>
-Operand

- -------- - - - - - - - - - - - - - -

OpAbsISub
-
-Returns |x - y| clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be a scalar or vector of integer type. They must have the same number of components as Result Type. They must have the same component width as Result Type.

5

5587

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpAbsUSub
-
-Returns |x - y| clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5588

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpIAddSat
-
-Returns x + y clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5589

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpUAddSat
-
-Returns x + y clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5590

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpIAverage
-
-Returns (x+y) >> 1. The intermediate sum does not modulo overflow.
-
-Result Type must be a scalar or vector of integer type.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5591

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpUAverage
-
-Returns (x+y) >> 1. The intermediate sum does not modulo overflow.
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5592

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpIAverageRounded
-
-Returns (x+y+1) >> 1. The intermediate sum does not modulo overflow.
-
-Result Type must be a scalar or vector of integer type.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5593

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpUAverageRounded
-
-Returns (x+y+1) >> 1. The intermediate sum does not modulo overflow.
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5594

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpISubSat
-
-Returns x - y clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5595

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpUSubSat
-
-Returns x - y clamped to the range of Result Type (instead of modulo overflowing).
-
-Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same as Result Type.

5

5596

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpIMul32x16
-
-Integer multiplication of Operand 1 and Operand 2. The low 16-bits of Operand 2 are sign extended to 32-bits before performing the multiplication.
-
-Result Type must be a scalar or vector of integer type, whose Width operand is 32.
-
-The type of Operand 1 and Operand 2 must be the same type as Result Type.

5

5597

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

- -------- - - - - - - - - - - - - - -

OpUMul32x16
-
-Integer multiplication of Operand 1 and Operand 2. The high 16-bits of Operand 2 are replaced with 0x0000 before performing the multiplication.
-
-Result Type must be a scalar or vector of integer type, whose Width operand is 32 and whose Signedness operand is 0.
-
-The type of Operand 1 and Operand 2 must be the same type as Result Type.

5

5598

<id>
-Result Type

Result <id>

<id>
-Operand 1

<id>
-Operand 2

-
-
-
-
-
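The instruction descriptions above can be illustrated with scalar reference functions. The following is a minimal sketch for unsigned 32-bit scalars only. The function names are hypothetical, and the code is one reading of the prose above rather than a normative definition. Vector operands would apply the same operation per component.

```c
#include <stdint.h>

/* OpUAddSat: x + y clamped to the unsigned range instead of wrapping. */
uint32_t u_add_sat(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;            /* wraps modulo 2^32 on overflow */
    return sum < x ? UINT32_MAX : sum;
}

/* OpUSubSat: x - y clamped at zero instead of wrapping. */
uint32_t u_sub_sat(uint32_t x, uint32_t y)
{
    return x > y ? x - y : 0;
}

/* OpAbsUSub: |x - y| computed without an intermediate underflow. */
uint32_t abs_u_sub(uint32_t x, uint32_t y)
{
    return x > y ? x - y : y - x;
}

/* OpUAverage: (x + y) >> 1 with a wider intermediate sum. */
uint32_t u_average(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x + y) >> 1);
}

/* OpUAverageRounded: (x + y + 1) >> 1 with a wider intermediate sum. */
uint32_t u_average_rounded(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x + y + 1) >> 1);
}

/* OpUCountLeadingZeros: returns 32 when the operand is zero. */
uint32_t u_count_leading_zeros(uint32_t v)
{
    uint32_t n = 0;
    if (v == 0)
        return 32;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        ++n;
    }
    return n;
}

/* OpUCountTrailingZeros: returns 32 when the operand is zero. */
uint32_t u_count_trailing_zeros(uint32_t v)
{
    uint32_t n = 0;
    if (v == 0)
        return 32;
    while (!(v & 1u)) {
        v >>= 1;
        ++n;
    }
    return n;
}

/* OpUMul32x16: the high 16 bits of the second operand are replaced by zero. */
uint32_t u_mul_32x16(uint32_t a, uint32_t b)
{
    return a * (b & 0xFFFFu);
}
```

The averages widen to 64 bits because the descriptions state that the intermediate sum does not overflow; an implementation could equally use a carry-free bit trick.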

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-09-10

idr

Initial revision

2

2019-01-22

idr

Remove all references to Signedness being 1

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_shader_integer_functions2.html + + +

extensions/INTEL/SPV_INTEL_shader_integer_functions2.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_split_barrier.html b/extensions/INTEL/SPV_INTEL_split_barrier.html index 864da00..6ecc419 100644 --- a/extensions/INTEL/SPV_INTEL_split_barrier.html +++ b/extensions/INTEL/SPV_INTEL_split_barrier.html @@ -1,316 +1,12 @@ - - - - - - - -SPV_INTEL_split_barrier - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_split_barrier

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Shipping

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-02-24

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds SPIR-V instructions to split a control barrier (OpControlBarrier) into two separate operations: -the first indicates that an invocation has "arrived" at the barrier but should continue executing, -and the second indicates that an invocation should "wait" for other invocations to arrive at the barrier before executing further.

-
-
-

Splitting a barrier operation may improve performance and may provide a closer match to "latch" or "barrier" operations in other parallel languages such as C++20.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_split_barrier"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the new capability:

-
-
-
-
SplitBarrierINTEL
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6141

SplitBarrierINTEL

-
-
-
-

Add to Section 3.42.20, Barrier Instructions:

-
- ------- - - - - - - - - - - - - - -

OpControlBarrierArriveINTEL
-
-Indicates that an invocation has arrived at a split control barrier. -This may allow other invocations waiting on the split control barrier to continue executing.
-
-When Execution is Workgroup or larger, behavior is undefined unless all invocations within Execution execute the same dynamic instance of this instruction. -When Execution is Subgroup or Invocation, the behavior of this instruction in non-uniform control flow is defined by the client API.
-
-If Semantics is not None, this instruction also serves as the start of a memory barrier similar to an OpMemoryBarrier instruction with the same Memory and Semantics operands. -This allows atomically specifying both a control barrier and a memory barrier (that is, without needing two instructions). If Semantics is None, Memory is ignored.

Capability:
-SplitBarrierINTEL

4

6142

Scope <id>
-Execution

Scope <id>
-Memory

Memory Semantics <id>
-Semantics

- ------- - - - - - - - - - - - - - -

OpControlBarrierWaitINTEL
-
-Waits for other invocations of this module to arrive at a split control barrier.
-
-When Execution is Workgroup or larger, behavior is undefined unless all invocations within Execution execute the same dynamic instance of this instruction. -When Execution is Subgroup or Invocation, the behavior of this instruction in non-uniform control flow is defined by the client API.
-
-If Semantics is not None, this instruction also serves as the end of a memory barrier similar to an OpMemoryBarrier instruction with the same Memory and Semantics operands. -This ensures that memory accesses issued before arriving at the split barrier are observed before memory accesses issued after this instruction. -This control is ensured only for memory accesses issued by this invocation and observed by another invocation executing within Memory scope. -This allows atomically specifying both a control barrier and a memory barrier (that is, without needing two instructions). If Semantics is None, Memory is ignored.

Capability:
-SplitBarrierINTEL

4

6143

Scope <id>
-Execution

Scope <id>
-Memory

Memory Semantics <id>
-Semantics

-
-
-
-
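The arrive and wait split described above can be modeled on a host CPU with C11 atomics. This is a loose analogy offered as a sketch, not how any SPIR-V consumer implements these instructions: the `split_barrier` type and its functions are hypothetical, the barrier is single-use, and the release and acquire orderings only approximate the "start" and "end" of the memory barrier that the Semantics operand describes.

```c
#include <stdatomic.h>

/* Hypothetical single-use split barrier for `expected` participants. */
typedef struct {
    atomic_uint arrived;   /* number of participants that have arrived */
    unsigned    expected;  /* number of participants in the scope */
} split_barrier;

void split_barrier_init(split_barrier *b, unsigned expected)
{
    atomic_init(&b->arrived, 0u);
    b->expected = expected;
}

/*
 * Analogue of OpControlBarrierArriveINTEL: signal arrival and return
 * immediately, so the caller can keep doing independent work.
 */
void split_barrier_arrive(split_barrier *b)
{
    atomic_fetch_add_explicit(&b->arrived, 1u, memory_order_release);
}

/*
 * Analogue of OpControlBarrierWaitINTEL: block until every participant has
 * arrived. A spin loop keeps the sketch short.
 */
void split_barrier_wait(split_barrier *b)
{
    while (atomic_load_explicit(&b->arrived, memory_order_acquire) < b->expected) {
        /* spin until all participants have arrived */
    }
}
```

Each participant would call `split_barrier_arrive`, perform work that does not depend on the other participants, and then call `split_barrier_wait` before touching shared results.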

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-02-24

Ben Ashbaugh

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_split_barrier.html + + +

extensions/INTEL/SPV_INTEL_split_barrier.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html b/extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html index 5b79d28..e8953e1 100644 --- a/extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html +++ b/extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html @@ -1,394 +1,12 @@ - - - - - - - -SPV_INTEL_subgroup_buffer_prefetch - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_subgroup_buffer_prefetch

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Greg Lueck, Intel

    -
  • -
  • -

    Andrzej Ratajewski, Intel

    -
  • -
  • -

    Grzegorz Wawiorko, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-05-30

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension extends the SPV_INTEL_subgroups extension and interacts with the SPV_INTEL_cache_controls and SPV_KHR_untyped_pointers extensions.

-
-
-
-
-

Overview

-
-
-

This extension extends the SPV_INTEL_subgroups extension by adding support for prefetching data from buffers. -The functionality added by this extension can improve the performance of some kernels by prefetching data into a cache, so future reads of the data are from a fast cache rather than slower memory.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_subgroup_buffer_prefetch"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6220

SubgroupBufferPrefetchINTEL

-
-
-
-
-

Instructions

-
-

Modify Section 3.49.21, Group and Subgroup Instructions, adding to the end of the list of instructions:

-
- ------- - - - - - - - - - - - - - -

OpSubgroupBlockPrefetchINTEL
-
-Prefetches one or more bytes from Ptr for each invocation in the subgroup as a block operation, where the number of bytes to prefetch per invocation is specified by NumBytes. -The total number of bytes that is collectively prefetched is therefore NumBytes times SubgroupSize. -Prefetching does not affect the functionality of a module but may change its performance characteristics.
-
-Ptr must be a pointer into the CrossWorkgroup Storage Class. -If it is an OpTypePointer pointer, it must point to an integer type scalar type.
-
-NumBytes must be a 32-bit integer type scalar whose Signedness operand is 0, and must come from a constant instruction. -The prefetch operation may be silently ignored unless NumBytes is a power of two between one and 64 bytes, inclusive.
-
-If present, any Memory Operands must begin with a memory operand literal. -If not present, it is the same as specifying the memory operand None.
-
-Behavior is undefined unless Ptr and NumBytes are dynamically uniform for all invocations in the subgroup.

Capability:
-SubgroupBufferPrefetchINTEL

3 + variable

6221

<id> Ptr

<id> NumBytes

Optional Memory Operands

-
-
-
-
-
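The size rules for OpSubgroupBlockPrefetchINTEL can be sketched in a few lines of Python. This is only an illustrative model, not part of the extension: the helper names `total_prefetch_bytes` and `is_expected_prefetch_size` are hypothetical, and the subgroup size is passed in as a plain integer.

```python
def total_prefetch_bytes(num_bytes: int, subgroup_size: int) -> int:
    """Bytes collectively prefetched by the subgroup: NumBytes times SubgroupSize."""
    return num_bytes * subgroup_size


def is_expected_prefetch_size(num_bytes: int) -> bool:
    """True if NumBytes is a power of two between 1 and 64, inclusive.

    For any other value the prefetch may be silently ignored.
    """
    return 1 <= num_bytes <= 64 and (num_bytes & (num_bytes - 1)) == 0


if __name__ == "__main__":
    for n in (1, 3, 16, 64, 128):
        print(n, is_expected_prefetch_size(n), total_prefetch_bytes(n, 8))
```

With a hypothetical subgroup size of 8, a NumBytes of 16 prefetches 128 bytes in total, while 3 and 128 fall outside the expected sizes.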

Validation Rules

-
-
-

None.

-
-
-
-
-

Interactions with Other Extensions

-
-
-

If the SPV_INTEL_cache_controls extension is supported, the CacheControlLoadINTEL decoration may be used to control which cache levels the data will be prefetched into.

-
-
-

If the SPV_KHR_untyped_pointers extension is supported, the Ptr operand to OpSubgroupBlockPrefetchINTEL may be an OpTypeUntypedPointerKHR pointer.

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Do we also need to support prefetching data from images?

    -
    -
    -
    -

    RESOLVED: We do not currently have a use-case for prefetching data from images, so this extension will only support prefetching data from buffers. -The extension is written so support for prefetching data from images could be added by a future extension, if desired.

    -
    -
    -
    -
  2. -
  3. -

    Should the prefetch specify the number of elements to prefetch or the number of bytes to prefetch?

    -
    -
    -
    -

    RESOLVED: The prefetch instruction will specify the number of bytes to prefetch, per invocation. -Specifying the number of bytes rather than the number of components works best for opaque (also known as un-typed) pointers, where the type of data that the pointer points to is not necessarily known.

    -
    -
    -

    For completeness, note that the LLVM prefetch intrinsic only specifies the address to prefetch and does not specify the number of elements or bytes to prefetch, but this probably is not what we want to do.

    -
    -
    -
    -
  4. -
  5. -

    Which storage classes (address spaces) should we support for block prefetches?

    -
    -
    -
    -

    RESOLVED: The OpenCL C prefetch function and the prefetch instruction in the OpenCL Extended Instruction Set only support prefetching from the global address space, or equivalently, from the CrossWorkgroup storage class.

    -
    -
    -

    The same is also true for the subgroup block reads added by cl_intel_subgroups and cl_intel_spirv_subgroups.

    -
    -
    -

    Therefore, we will follow this precedent and only support prefetching from the CrossWorkgroup storage class, or equivalently, from the global address space.

    -
    -
    -
    -
  6. -
  7. -

    What type should be used for the amount of data to prefetch?

    -
    -
    -
    -

    RESOLVED: Because we only expect to see a small set of prefetch sizes we can use a 32-bit integer to specify the amount of data to prefetch. -This is different than the OpenCL C prefetch function and the prefetch instruction in the OpenCL Extended Instruction Set, which use a size_t to describe the amount of data to prefetch, though it is sufficient for our use-cases and it is a simpler specification to use a 32-bit integer type unconditionally.

    -
    -
    -

    We will document this requirement in this SPIR-V specification and not in a client API environment specification.

    -
    -
    -
    -
  8. -
  9. -

    Should the amount of data to prefetch be an <id> and hence have the ability to be specialized, or should it be a compile-time Literal instead?

    -
    -
    -
    -

    RESOLVED: We will specify the amount of data to prefetch as an <id>. -Although there is no known use-case that requires specializing the amount of data to prefetch, specifying the amount of data to prefetch as an <id> allows this functionality, if necessary. -This is also consistent with the number of elements to prefetch for the prefetch instruction in the OpenCL Extended Instruction Set.

    -
    -
    -
    -
  10. -
  11. -

    What should the behavior be if the amount of data to prefetch is excessively large or some other unexpected value?

    -
    -
    -
    -

    RESOLVED: If the amount of data to prefetch is unexpected or otherwise unsupported, it will silently be ignored. -The expected amounts of data to prefetch will be: 1, 2, 4, 8, 16, 32, or 64 bytes per invocation. -We do not expect to prefetch three-component vectors. -We also do not expect to prefetch 16-component vectors, except for very small data types, so we do not expect to prefetch 128 bytes per invocation.

    -
    -
    -
    -
  12. -
  13. -

    Should we require Ptr to point to any specific type?

    -
    -
    -
    -

    RESOLVED: Yes, the pointer Ptr must point to an integer-type scalar. -Passing a pointer to a concrete type provides alignment information that would not be present for a pointer to OpTypeVoid.

    -
    -
    -
    -
  14. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-05-30

Ben Ashbaugh

Initial version

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html + + +

extensions/INTEL/SPV_INTEL_subgroup_buffer_prefetch.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_subgroups.html b/extensions/INTEL/SPV_INTEL_subgroups.html index a5a9396..6faee96 100644 --- a/extensions/INTEL/SPV_INTEL_subgroups.html +++ b/extensions/INTEL/SPV_INTEL_subgroups.html @@ -1,649 +1,12 @@ - - - - - - - -SPV_INTEL_subgroups - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_subgroups

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Biju George, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Mariusz Merecki, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017-2018 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Final Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-10-22

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.2 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

The goal of this extension is to allow programmers to improve the performance of their applications by taking advantage of the fact that some work items in a work group execute together as a group (a "subgroup"), and that work items in a subgroup can use hardware features that are not available to all work items in a work group. Specifically, this extension is designed to allow work items in a subgroup to share data without the use of local memory and work group barriers, and to utilize specialized hardware to load and store blocks of data from images or buffers.

-
-
-

This extension builds upon "subgroups" functionality that is already in core SPIR-V, so this extension reuses many of the names, concepts, and instructions already described in SPIR-V. The key additions in this extension are:

-
-
-
    -
  • -

    Intel subgroups adds "shuffle" instructions to allow data interchange between work items within a subgroup without the use of local memory or barriers.

    -
  • -
  • -

    Intel subgroups adds "block read and write" instructions to take advantage of specialized hardware to read or write blocks of data from or to buffers or images.

    -
  • -
-
-
-

This extension has a source language counterpart extension for the OpenCL-C kernel language, cl_intel_subgroups, which can be used for online compilation in an OpenCL environment.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_subgroups"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
SubgroupShuffleINTEL
-SubgroupBufferBlockIOINTEL
-SubgroupImageBlockIOINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the SubgroupShuffleINTEL capability:

-
-
-
-
OpSubgroupShuffleINTEL
-OpSubgroupShuffleDownINTEL
-OpSubgroupShuffleUpINTEL
-OpSubgroupShuffleXorINTEL
-
-
-
-

Instructions added under the SubgroupBufferBlockIOINTEL capability:

-
-
-
-
OpSubgroupBlockReadINTEL
-OpSubgroupBlockWriteINTEL
-
-
-
-

Instructions added under the SubgroupImageBlockIOINTEL capability:

-
-
-
-
OpSubgroupImageBlockReadINTEL
-OpSubgroupImageBlockWriteINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

SubgroupShuffleINTEL

5568

SubgroupBufferBlockIOINTEL

5569

SubgroupImageBlockIOINTEL

5570

OpSubgroupShuffleINTEL

5571

OpSubgroupShuffleDownINTEL

5572

OpSubgroupShuffleUpINTEL

5573

OpSubgroupShuffleXorINTEL

5574

OpSubgroupBlockReadINTEL

5575

OpSubgroupBlockWriteINTEL

5576

OpSubgroupImageBlockReadINTEL

5577

OpSubgroupImageBlockWriteINTEL

5578

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.2

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly DeclaresEnabled by Extension

5568

SubgroupShuffleINTEL

SPV_INTEL_subgroups

5569

SubgroupBufferBlockIOINTEL

SPV_INTEL_subgroups

5570

SubgroupImageBlockIOINTEL

SPV_INTEL_subgroups

-
-
-

Instructions

-
-

Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions:

-
- -------- - - - - - - - - - - - - - - -

OpSubgroupShuffleINTEL
-
-Allows data to be arbitrarily transferred between invocations in a subgroup. The data that is returned for this invocation is the value of Data for the invocation identified by InvocationId.

-

InvocationId need not be the same value for all invocations in the subgroup.

-

Result Type may be a scalar or vector type.

-

The type of Data must be the same as Result Type.

-

InvocationId must be a 32-bit integer type scalar.

Capability:
-SubgroupShuffleINTEL

5

5571

<id> Result Type

<id> Result

<id> Data

<id> InvocationId

- --------- - - - - - - - - - - - - - - - -

OpSubgroupShuffleDownINTEL
-
-Allows data to be transferred from an invocation in the subgroup with a higher SubgroupLocalInvocationId down to an invocation in the subgroup with a lower SubgroupLocalInvocationId.

-

There are two data sources to this built-in function: Current and Next. To determine the result of this built-in function, first let the unsigned shuffle index be equivalent to the sum of this invocation’s SubgroupLocalInvocationId plus the specified Delta:

-

If the shuffle index is less than the SubgroupMaxSize, the result of this built-in function is the value of the Current data source for the invocation with SubgroupLocalInvocationId equal to the shuffle index.

-

If the shuffle index is greater than or equal to the SubgroupMaxSize but less than twice the SubgroupMaxSize, the result of this built-in function is the value of the Next data source for the invocation with SubgroupLocalInvocationId equal to the shuffle index minus the SubgroupMaxSize.

-

All other values of the shuffle index are considered to be out-of-range.

-

Delta need not be the same value for all invocations in the subgroup.

-

Result Type may be a scalar or vector type.

-

The type of Current and Next must be the same as Result Type.

-

Delta must be a 32-bit integer type scalar.

Capability:
-SubgroupShuffleINTEL

6

5572

<id> Result Type

<id> Result

<id> Current

<id> Next

<id> Delta
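The index rules for OpSubgroupShuffleDownINTEL can be illustrated with a small Python model. This is a sketch under stated assumptions: the function name `shuffle_down` is hypothetical, each list holds one value per invocation indexed by SubgroupLocalInvocationId, Delta values are non-negative, and an out-of-range shuffle index is represented as `None` because the text does not define the value returned in that case.

```python
def shuffle_down(current, nxt, delta, max_size):
    """Return the per-invocation results of a shuffle-down.

    current, nxt: per-invocation values of the Current and Next data sources.
    delta:        per-invocation Delta (may differ between invocations).
    max_size:     SubgroupMaxSize.
    """
    results = []
    for lane in range(max_size):
        index = lane + delta[lane]  # unsigned shuffle index
        if index < max_size:
            results.append(current[index])
        elif index < 2 * max_size:
            results.append(nxt[index - max_size])
        else:
            results.append(None)  # out-of-range (assumed placeholder)
    return results


if __name__ == "__main__":
    print(shuffle_down([0, 1, 2, 3], [10, 11, 12, 13], [1, 1, 1, 1], 4))
```

In this model the last invocation reads from the Next data source as soon as its shuffle index reaches SubgroupMaxSize.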

- --------- - - - - - - - - - - - - - - - -

OpSubgroupShuffleUpINTEL
-
-Allows data to be transferred from an invocation in the subgroup with a lower SubgroupLocalInvocationId up to an invocation in the subgroup with a higher SubgroupLocalInvocationId.

-

There are two data sources to this built-in function: Previous and Current. To determine the result of this built-in function, first let the signed shuffle index be equivalent to this invocation’s SubgroupLocalInvocationId minus the specified Delta:

-

If the shuffle index is greater than or equal to zero and less than the SubgroupMaxSize, the result of this built-in function is the value of the Current data source for the invocation with SubgroupLocalInvocationId equal to the shuffle index.

-

If the shuffle index is less than zero but greater than or equal to the negative SubgroupMaxSize, the result of this built-in function is the value of the Previous data source for the invocation with SubgroupLocalInvocationId equal to the shuffle index plus the SubgroupMaxSize.

-

All other values of the shuffle index are considered to be out-of-range.

-

Delta need not be the same value for all invocations in the subgroup.

-

Result Type may be a scalar or vector type.

-

The type of Previous and Current must be the same as Result Type.

-

Delta must be a 32-bit integer type scalar.

Capability:
-SubgroupShuffleINTEL

6

5573

<id> Result Type

<id> Result

<id> Previous

<id> Current

<id> Delta
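A matching Python model for OpSubgroupShuffleUpINTEL follows. As before this is only a sketch: `shuffle_up` is a hypothetical name, the lists are indexed by SubgroupLocalInvocationId, and `None` stands in for an out-of-range shuffle index, whose result the text does not define.

```python
def shuffle_up(previous, current, delta, max_size):
    """Return the per-invocation results of a shuffle-up.

    previous, current: per-invocation values of the Previous and Current data sources.
    delta:             per-invocation Delta (may differ between invocations).
    max_size:          SubgroupMaxSize.
    """
    results = []
    for lane in range(max_size):
        index = lane - delta[lane]  # signed shuffle index
        if 0 <= index < max_size:
            results.append(current[index])
        elif -max_size <= index < 0:
            results.append(previous[index + max_size])
        else:
            results.append(None)  # out-of-range (assumed placeholder)
    return results


if __name__ == "__main__":
    print(shuffle_up([10, 11, 12, 13], [0, 1, 2, 3], [1, 1, 1, 1], 4))
```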

- -------- - - - - - - - - - - - - - - -

OpSubgroupShuffleXorINTEL
-
-Allows data to be transferred between invocations in a subgroup as a function of the invocation’s SubgroupLocalInvocationId. The data that is returned for this invocation is the value of Data for the invocation with SubgroupLocalInvocationId equal to this invocation’s SubgroupLocalInvocationId XORed with the specified Value. If the result of the XOR is greater than SubgroupMaxSize then it is considered out-of-range.

-

Value need not be the same for all invocations in the subgroup.

-

Result Type may be a scalar or vector type.

-

The type of Data must be the same as Result Type.

-

Value must be a 32-bit integer type scalar.

Capability:
-SubgroupShuffleINTEL

5

5574

<id> Result Type

<id> Result

<id> Data

<id> Value
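The XOR shuffle can be modeled the same way. This sketch makes two assumptions that are not in the text: the name `shuffle_xor` is hypothetical, and any index that does not identify an invocation (index greater than or equal to SubgroupMaxSize) is conservatively reported as `None`.

```python
def shuffle_xor(data, value, max_size):
    """Return the per-invocation results of an XOR shuffle.

    data:     per-invocation values of Data, indexed by SubgroupLocalInvocationId.
    value:    per-invocation Value (may differ between invocations).
    max_size: SubgroupMaxSize.
    """
    results = []
    for lane in range(max_size):
        index = lane ^ value[lane]
        # Indices that do not name an invocation are treated as out-of-range
        # (assumption: the model cannot read a lane that does not exist).
        results.append(data[index] if index < max_size else None)
    return results


if __name__ == "__main__":
    print(shuffle_xor([0, 1, 2, 3], [1, 1, 1, 1], 4))
```

A Value of 1 swaps neighboring invocations, which is the usual building block for butterfly-style reductions.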

- ------- - - - - - - - - - - - - - -

OpSubgroupBlockReadINTEL
-
-Reads one or more components of Result data for each invocation in the subgroup from the specified Ptr as a block operation.

-

The data is read strided, so the first value read is:

-

Ptr[ SubgroupLocalInvocationId ]

-

and the second value read is:

-

Ptr[ SubgroupLocalInvocationId + SubgroupMaxSize ]

-

etc.

-

Result Type may be a scalar or vector type, and its component type must be equal to the type pointed to by Ptr.

-

The type of Ptr must be a pointer type, and must point to a scalar type.

Capability:
-SubgroupBufferBlockIOINTEL

4

5575

<id> Result Type

<id> Result

<id> Ptr
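The strided addressing used by OpSubgroupBlockReadINTEL can be sketched in Python as follows. The function name `block_read` and the use of a flat list as the buffer are assumptions made for illustration only.

```python
def block_read(buffer, max_size, components):
    """Return, for each invocation, the list of components it reads.

    Component k of the invocation with SubgroupLocalInvocationId `lane`
    is read from buffer[lane + k * max_size], where max_size is SubgroupMaxSize.
    """
    return [
        [buffer[lane + k * max_size] for k in range(components)]
        for lane in range(max_size)
    ]


if __name__ == "__main__":
    # Four invocations, each reading a two-component result from eight elements.
    print(block_read(list(range(8)), max_size=4, components=2))
```

Consecutive invocations read consecutive elements, and consecutive components of one invocation are SubgroupMaxSize elements apart.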

- ------ - - - - - - - - - - - - -

OpSubgroupBlockWriteINTEL
-
-Writes one or more components of Data for each invocation in the subgroup to the specified Ptr as a block operation.

-

The data is written strided, so the first value is written to:

-

Ptr[ SubgroupLocalInvocationId ]

-

and the second value is written to:

-

Ptr[ SubgroupLocalInvocationId + SubgroupMaxSize ]

-

etc.

-

The type of Ptr must be a pointer type, and must point to a scalar type.

-

The component type of Data must be equal to the type pointed to by Ptr.

Capability:
-SubgroupBufferBlockIOINTEL

3

5576

<id> Ptr

<id> Data

- -------- - - - - - - - - - - - - - - -

OpSubgroupImageBlockReadINTEL
-
-Reads one or more components of Result data for each invocation in the subgroup from the specified Image at the specified Coordinate as a block operation. Note that the Coordinate is a byte coordinate, not a texel coordinate. Also note that the image data is read without format conversion, so each invocation may read multiple image elements.

-

The data is read row-by-row, so the first value read is from the row specified by the y-component of the provided Coordinate, the second value is read from the row specified by the y-component of the provided Coordinate plus one, etc.

-

Result Type may be a scalar or vector type.

-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2. If the Sampled operand is 2, then some dimensions require a capability.

-

Coordinate is an integer scalar or vector. The x-component is a byte coordinate into rows of the image and remaining coordinates are non-normalized texel coordinates.

Capability:
-SubgroupImageBlockIOINTEL

5

5577

<id> Result Type

<id> Result

<id> Image

<id> Coordinate

- ------- - - - - - - - - - - - - - -

OpSubgroupImageBlockWriteINTEL
-
-Writes one or more components of Data for each invocation in the subgroup to the specified Image at the specified Coordinate as a block operation. Note that the Coordinate is a byte coordinate, not a texel coordinate. Also note that the image data is written without format conversion, so each invocation may write multiple image elements.

-

The data is written row-by-row, so the first value is written to the row specified by the y-component of the provided Coordinate, the second value is written to the row specified by the y-component of the provided Coordinate plus one, etc.

-

Image must be an object whose type is OpTypeImage with a Sampled operand of 0 or 2. If the Sampled operand is 2, then some dimensions require a capability.

-

Coordinate is an integer scalar or vector. The x-component is a byte coordinate into rows of the image and remaining coordinates are non-normalized texel coordinates.

-

Result Type may be a scalar or vector type.

Capability:
-SubgroupImageBlockIOINTEL

4

5578

<id> Image

<id> Coordinate

<id> Data

-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-09-29

Ben Ashbaugh

Initial revision

2

2018-10-22

Ben Ashbaugh

Minor formatting updates.

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_subgroups.html + + +

extensions/INTEL/SPV_INTEL_subgroups.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_task_sequence.html b/extensions/INTEL/SPV_INTEL_task_sequence.html index 764183c..37649df 100644 --- a/extensions/INTEL/SPV_INTEL_task_sequence.html +++ b/extensions/INTEL/SPV_INTEL_task_sequence.html @@ -1,829 +1,12 @@ - - - - - - - -SPV_INTEL_task_sequence - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_task_sequence

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jessica Davies, Intel

    -
  • -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Robert Ho, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Abhishek Tiwari, Intel

    -
  • -
  • -

    Bowen Xue, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022-2024 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Complete

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-03-06

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

A task sequence is an abstraction of a sequence of calls to a function that can -execute asynchronously from the caller and each other. This extension introduces -four new instructions that support task sequence execution.

-
-
-

The OpTaskSequenceCreateINTEL instruction creates a task sequence to which -asynchronous function calls can be submitted through the -OpTaskSequenceAsyncINTEL instruction. The results of those function calls can -be queried with the OpTaskSequenceGetINTEL instruction.

-
-
-

Task Sequence and Task Threads

-
-

A task sequence object can be created by calling OpTaskSequenceCreateINTEL. -The OpTaskSequenceAsyncINTEL, OpTaskSequenceGetINTEL, and -OpTaskSequenceReleaseINTEL instructions take a task sequence object as an -argument. The OpTaskSequenceAsyncINTEL instruction creates an invocation which -will be referred to as a task thread in this document. This task thread is -said to belong to the task sequence specified to the OpTaskSequenceAsyncINTEL -instruction. The OpTaskSequenceGetINTEL instruction returns the result of a -task thread in the specified task sequence. Results are returned from the task -sequence in the same order as the OpTaskSequenceAsyncINTEL calls are made to -the task sequence.

-
-
-

An OpFunction f is passed as an argument to the OpTaskSequenceCreateINTEL -instruction. The task threads belonging to a task sequence asynchronously -execute f and they may run in parallel with the caller and with any other -task threads. The implementation is not required to run these task threads in -parallel except in so far as is necessary to meet the forward progress -guarantees outlined in the section below.

-
-
-
-

Forward Progress Guarantees and Execution Model

-
-

A task thread is a new Invocation which has a LocalInvocationId, -GlobalInvocationId, and WorkgroupId of 0, WorkgroupSize and GlobalSize -of 1 and LocalSize of 1, 1, 1. It does not share Workgroup storage class -memory or Function storage class memory with the caller or with other task -threads. It can access memory from CrossWorkgroup storage class. A task thread -cannot synchronize with the caller or with other task threads using a barrier.

-
-
- - - - - -
- - -Calling OpTaskSequenceAsyncINTEL is analogous to enqueuing an OpenCL -kernel with global_work_offset, global_work_size, local_work_size set to -0, 1, 1, i.e., a task kernel. This extension does not support -any instruction which would be analogous to enqueuing a kernel with a different -geometry. -
-
-
-

An OpTaskSequenceAsyncINTEL call is guaranteed to not block the caller as long -as the number of task threads in the task sequence is strictly less than the -AsyncCapacity of the sequence.

-
-
-

A task thread executes f and then writes its completion status and -results to an output data structure D associated with the sequence. The -task thread can only write into D if there is space available in it and the -task thread ceases to exist after writing its results. The implementation must -ensure that at least GetCapacity task threads can store their outputs to D. -Results are removed from D when they are retrieved by OpTaskSequenceGetINTEL -calls. An OpTaskSequenceGetINTEL call is guaranteed to block the caller if -there are no results stored in D.

-
-
-

C++ defines a framework for describing the -forward progress of -individual threads of execution in a multi-threaded program. Here are the terms -and definitions from the C++ specification that we will use to define -progress guarantees for task threads:

-
-
-
    -
  1. -

    Weakly parallel forward progress guarantee: the implementation does not -ensure that the thread will eventually make progress.

    -
  2. -
  3. -

    Concurrent forward progress guarantee: the implementation ensures -that the thread will eventually make progress for as long as it has not -terminated.

    -
  4. -
  5. -

    Blocking with forward progress guarantee delegation: When a thread of -execution A is specified to block with forward progress guarantee delegation -on the completion of a set M of threads of execution, then throughout the -whole time of A being blocked on M, the implementation shall ensure that the -forward progress guarantees provided by at least one thread of execution in M -is at least as strong as A's forward progress guarantees. It is unspecified -which thread or threads of execution in M are chosen and for which number of -execution steps. The strengthening is not necessarily in place for the rest of -the lifetime of the affected thread of execution. -Using the above definitions, the progress guarantees for task threads are -defined as follows:

    -
    -
      -
    • -

      When a task sequence object O is created by OpTaskSequenceCreateINTEL, a -task sequence object thread is also created.

      -
    • -
    • -

      At any point in time, the progress guarantee of all task sequence object -threads created by a work item WI matches that of WI. For example, -if WI is strengthened to have a stronger progress guarantee than its -initial guarantee, all of the task sequence object threads created by WI -are also strengthened.

      -
    • -
    • -

      A call to OpTaskSequenceAsyncINTEL(O, …​) will result in creation of a -task thread. OpTaskSequenceAsyncINTEL(O, …​) can be called multiple times -to create multiple task threads for O. A task thread has weakly parallel -forward progress guarantee.

      -
    • -
    • -

      Upon creation, a task sequence object thread P immediately blocks on the -set S of task threads that belong to O with forward progress guarantee -delegation.

      -
    • -
    • -

      If a task thread with concurrent forward progress guarantee has finished -executing f and if it can write its results to the output data structure D, -then it does so and some other task thread in S is strengthened to have -concurrent forward progress guarantee. If a task thread cannot write its -results to D, the task thread blocks until space is available.

      -
    • -
    -
    -
  6. -
-
-
-

The two examples below, respectively, show the following:

-
-
-
    -
  1. -

    How strengthening of a work item strengthens the task threads.

    -
  2. -
  3. -

    How a task thread delegates its progress guarantee to other task threads in -the same task sequence object.

    -
  4. -
-
-
-

Example 1 uses the following pseudo-code program:

-
-
-
-
// A work item WI
-{
-  ...
-  TaskSeqObject1 = OpTaskSequenceCreateINTEL(SomeFunction, ...); // Object_1_Thread
-  OpTaskSequenceAsyncINTEL(TaskSeqObject1, ...); // Task_1_1
-  OpTaskSequenceAsyncINTEL(TaskSeqObject1, ...); // Task_1_2
-  ...
-  TaskSeqObject2 = OpTaskSequenceCreateINTEL(SomeFunction, ...); // Object_2_Thread
-  OpTaskSequenceAsyncINTEL(TaskSeqObject2, ...); // Task_2_1
-  OpTaskSequenceAsyncINTEL(TaskSeqObject2, ...); // Task_2_2
-}
-
-
-
-

The OpTaskSequenceCreateINTEL calls create task object threads -Object_1_Thread and Object_2_Thread. The first two -OpTaskSequenceAsyncINTEL calls create task threads Task_1_1 and Task_1_2. -Similarly the next two calls create Task_2_1 and Task_2_2.

-
-
-

The table below provides a view of the hierarchy of task threads that will be -generated.

-
- - ------- - - - - - - - - - - - - - - - - - - -
Table 1. Hierarchy of task threads.

Work Item

WI

Task Sequence Object Thread

Object_1_Thread

Object_2_Thread

Task Thread

Task_1_1

Task_1_2

Task_2_1

Task_2_2

-
-

At some initial stage, all task threads have weakly parallel forward progress -guarantee. If WI is strengthened to have concurrent forward progress -guarantee, then all of the object threads are also strengthened. Next, in this -example one task thread for each task sequence is also strengthened. This is -depicted in the table below (progress guarantee for each thread is in -parentheses):

-
- - ------- - - - - - - - - - - - - - - - - - - -
Table 2. Possible Progress Guarantees at some time after WI is strengthened.

Work Item

WI (concurrent)

Task Sequence Object Thread

Object_1_Thread (concurrent)

Object_2_Thread (concurrent)

Task Thread

Task_1_1 (weakly parallel)

Task_1_2 (concurrent)

Task_2_1 (concurrent)

Task_2_2 (weakly parallel)

-
-

The next example shows how a task thread delegates its progress -guarantee to another task thread:

-
-
-

Assume that we have a task sequence TS with GetCapacity of 1 and -AsyncCapacity of 5. Four OpTaskSequenceAsyncINTEL calls create the -following task threads: T1, T2, T3 and T4, for TS. T1 has -concurrent forward progress guarantee after getting strengthened, while -T2, T3 and T4 have weakly parallel forward progress guarantees. The -task threads go through the following execution flow:

-
-
-
    -
  • -

    T1 finishes executing the function f associated with TS.

    -
  • -
  • -

    For TS, the output data structure D can store the output of only one -task thread since GetCapacity is one. T1 writes its output.

    -
  • -
  • -

    Any task thread can now be picked to be strengthened to have concurrent -forward progress guarantee. Let’s say T2 is picked.

    -
  • -
  • -

    At some point T2 finishes executing f. T1's results are still in the -output data structure.

    -
  • -
  • -

    T2 cannot write its results until space is available in D. Hence, -none of the other task threads can be picked to be strengthened to the -stronger progress guarantee.

    -
  • -
  • -

    OpTaskSequenceGetINTEL is invoked. T1's results get removed from -D.

    -
  • -
  • -

    T2 can write its results and some other task thread can be picked to be -strengthened.

    -
  • -
-
-
-
-
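The capacity rules in the second example can be approximated with a small sequential Python model. This is a hedged sketch, not the extension's implementation: the class `TaskSequenceModel` and its methods are hypothetical, task threads run one at a time in creation order, and a call that would block raises an error instead of waiting.

```python
from collections import deque


class TaskSequenceModel:
    """Sequential model of a task sequence with bounded capacities."""

    def __init__(self, function, get_capacity, async_capacity):
        self.function = function
        self.get_capacity = get_capacity      # result slots in the output structure D
        self.async_capacity = async_capacity  # task threads allowed before async may block
        self.pending = deque()                # task threads that have not written results
        self.results = deque()                # the output data structure D

    def async_call(self, *args):
        """Model OpTaskSequenceAsyncINTEL: create a task thread."""
        if len(self.pending) >= self.async_capacity:
            raise RuntimeError("async call would block")
        self.pending.append(args)

    def step(self):
        """Let the oldest task thread finish if D has room; report whether it did."""
        if not self.pending or len(self.results) >= self.get_capacity:
            return False
        args = self.pending.popleft()
        self.results.append(self.function(*args))
        return True

    def get(self):
        """Model OpTaskSequenceGetINTEL: remove the oldest result from D."""
        if not self.results:
            raise RuntimeError("get would block")
        return self.results.popleft()


if __name__ == "__main__":
    ts = TaskSequenceModel(lambda x: x * 2, get_capacity=1, async_capacity=5)
    for value in (1, 2, 3, 4):  # T1..T4
        ts.async_call(value)
    print(ts.step())  # T1 writes its result
    print(ts.step())  # T2 cannot write: D is full
    print(ts.get())   # removes T1's result
    print(ts.step())  # T2 can now write
```

The model keeps results in creation order, mirroring the rule that results are retrieved in the same order in which the task threads were created.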

Memory Order Semantics

-
-
    -
  • -

    OpTaskSequenceAsyncINTEL is a Release operation scoped to include the work -item that called it and the task thread that the OpTaskSequenceAsyncINTEL call -creates.

    -
  • -
  • -

    The beginning of a task thread T is an Acquire operation scoped to include -the work item that called OpTaskSequenceAsyncINTEL to create T and the -task thread T.

    -
  • -
  • -

    The end of a task thread T is a Release operation scoped to include T -and the work item that called OpTaskSequenceAsyncINTEL to create T.

    -
  • -
  • -

    OpTaskSequenceGetINTEL is an Acquire operation scoped to include the task -thread that is being retrieved by OpTaskSequenceGetINTEL and the work item -that is calling OpTaskSequenceGetINTEL.

    -
  • -
-
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_task_sequence"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
TaskSequenceINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the TaskSequenceINTEL capability:

-
-
-
-
OpTaskSequenceCreateINTEL
-OpTaskSequenceAsyncINTEL
-OpTaskSequenceGetINTEL
-OpTaskSequenceReleaseINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - -

TaskSequenceINTEL

6162

OpTaskSequenceCreateINTEL

6163

OpTaskSequenceAsyncINTEL

6164

OpTaskSequenceGetINTEL

6165

OpTaskSequenceReleaseINTEL

6166

OpTypeTaskSequenceINTEL

6199

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6, Revision 2

-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6162

TaskSequenceINTEL

-
-
-
-
-

Type Declaration Instruction

-
-

Add a new subsection, 3.42.26, Task Sequence Type Declaration Instruction, and -add one new instruction in this subsection as follows:

-
- ----- - - - - - - - - - - - -

OpTypeTaskSequenceINTEL

-

Declare a task sequence type.

Capability: -TaskSequenceINTEL

2

6199

Result
-<id>

-
-
-

Instructions

-
-

Add a new subsection, 3.42.27, Task Sequence Instructions, and add four new -instructions in this subsection as follows:

-
- ----------- - - - - - - - - - - - - - - - - - -

OpTaskSequenceCreateINTEL

-

Create and return an instance of a task sequence with type - OpTypeTaskSequenceINTEL. All calls to OpTaskSequenceAsyncINTEL with - Result passed in as an argument will execute the function Function.

-

Result Type must be OpTypeTaskSequenceINTEL.

-

Function is an OpFunction.

-

Pipelined is a literal 32-bit signed integer and it represents the following -based on the value:

-

0 - Do not pipeline the task sequence data path.

-

N - (N > 0), Pipeline the data path such that a new invocation of the task -sequence can be launched every N cycles (also known as the Initiation Interval).

-

-1 - Pipeline the task sequence with a compiler determined Initiation Interval.

-

This argument is only meaningful on FPGA devices.

-

ClusterMode is a literal 32-bit signed integer and it is a request -for the method that statically-scheduled clusters should handle stalls: using an -exit FIFO to drain computations from the cluster or using a stall-enable signal -to freeze computations within the cluster.

-

The valid values are:

-

0 - Direct the compiler to use stall-free clusters.

-

1 - Direct the compiler to use stall-enable clusters.

-

-1 - Let the compiler decide which type of cluster to use.

-

This argument is only meaningful on FPGA devices.

-

GetCapacity is a literal 32-bit unsigned integer. A task thread that has -finished executing Function is guaranteed to write its results to the results -data structure of the task sequence as long as there is space to do so. The -implementation must ensure that at least the oldest GetCapacity task threads -can write their results and completion status. Only task threads that have -written their results are counted against this limit.

-

AsyncCapacity is a literal 32-bit unsigned integer. OpTaskSequenceAsyncINTEL -calls for Result are guaranteed to not block as long as the number of task -threads in Result is strictly less than this limit.

Capability: -TaskSequenceINTEL

8

6163

<id>
-Result Type

Result
-<id>

<id>
-Function

Literal
-Pipelined

Literal
-UseStallEnableClusters

Literal
-GetCapacity

Literal
-AsyncCapacity
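
The operand layout above can be sketched as a small Python encoder. This is only an illustration under stated assumptions: it relies on the general SPIR-V convention that the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits, the function and parameter names are hypothetical, and the default capacities are arbitrary placeholders rather than values taken from the extension.

```python
OP_TASK_SEQUENCE_CREATE_INTEL = 6163  # from the Token Number Assignments table


def literal_word(value):
    """Encode a 32-bit literal (signed or unsigned) as one unsigned word."""
    if not -(1 << 31) <= value < (1 << 32):
        raise ValueError("literal does not fit in 32 bits")
    return value & 0xFFFFFFFF


def encode_task_sequence_create(result_type, result, function,
                                pipelined=-1, cluster_mode=-1,
                                get_capacity=1, async_capacity=1):
    """Return the 8 words of a hypothetical OpTaskSequenceCreateINTEL.

    The three ids are plain integers chosen by the caller; the default
    capacities are illustrative assumptions only.
    """
    if pipelined < -1:
        raise ValueError("Pipelined must be -1, 0, or a positive interval N")
    if cluster_mode not in (-1, 0, 1):
        raise ValueError("ClusterMode must be -1, 0, or 1")
    operands = [
        result_type,
        result,
        function,
        literal_word(pipelined),      # -1 becomes 0xFFFFFFFF
        literal_word(cluster_mode),
        literal_word(get_capacity),
        literal_word(async_capacity),
    ]
    word_count = 1 + len(operands)    # 8, matching the table above
    return [(word_count << 16) | OP_TASK_SEQUENCE_CREATE_INTEL, *operands]
```

The signed literals are stored in two's-complement form, which is why a compiler-chosen Initiation Interval (-1) shows up as an all-ones word in this sketch.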

- ------ - - - - - - - - - - - - -

OpTaskSequenceAsyncINTEL

-

Asynchronously invoke the OpFunction f associated with the task sequence -Sequence.

-

Sequence must have type OpTypeTaskSequenceINTEL.

-

This instruction is guaranteed to not block as long as the number of task -threads in Sequence is strictly less than the AsyncCapacity of Sequence. -The call may return before the asynchronous call to f completes execution, and -potentially before f even begins executing.

-

Argument N is the object to pass as the N th parameter of the function f. -If f cannot be called with N arguments the behavior is undefined.

Capability: -TaskSequenceINTEL

2+variable

6164

<id>
-Sequence

<id>, <id>, …​
-Argument 0,
-Argument 1,
-…​

- ------- - - - - - - - - - - - - - -

OpTaskSequenceGetINTEL
-Retrieve the result of a task thread in the task sequence Sequence. If there -are multiple task threads, the results are retrieved in the same order in which -the threads were created. -Sequence must have type OpTypeTaskSequenceINTEL. -This instruction will block if there are no results to return. -Result Type is the same as the return type of the OpFunction associated with -Sequence.

Capability: -TaskSequenceINTEL

4

6165

<id>
-Result Type

Result
-<id>

<id>
-Sequence
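
The ordering and blocking rules for the asynchronous and get instructions can be pictured with a toy Python model. It is a sketch only, not how any implementation works: the class and method names are hypothetical, the function runs eagerly instead of asynchronously, and the model assumes that a task thread counts against AsyncCapacity until its result has been retrieved.

```python
from collections import deque


class TaskSequenceModel:
    """Toy, single-threaded stand-in for a task sequence (illustrative only)."""

    def __init__(self, function, async_capacity):
        self.function = function
        self.async_capacity = async_capacity
        self._results = deque()  # results kept in task creation order

    def would_block_on_async(self):
        # The guarantee is non-blocking only while the number of task
        # threads is strictly less than AsyncCapacity.
        return len(self._results) >= self.async_capacity

    def async_call(self, *args):
        if self.would_block_on_async():
            raise RuntimeError("an implementation may block here")
        # Simplification: run the function immediately.
        self._results.append(self.function(*args))

    def get(self):
        if not self._results:
            raise RuntimeError("an implementation would block here")
        # Results come back in the order the task threads were created.
        return self._results.popleft()
```

Raising an exception where hardware would stall keeps the model deterministic; the point is only to show the first-in, first-out retrieval order and the strict capacity comparison.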

- ----- - - - - - - - - - - - -

OpTaskSequenceReleaseINTEL
-Release the memory allocated for the task sequence uniquely identified by the -id Sequence. -Sequence must have type OpTypeTaskSequenceINTEL.

Capability: -TaskSequenceINTEL

2

6166

<id>
-Sequence

-
-
-
-
-

SPIR-V Representation in LLVM IR

-
-
-

This is a non-normative section. OpTypeTaskSequenceINTEL can be mapped to LLVM -opaque type spirv.TaskSequenceINTEL and mangled as -__spirv_TaskSequenceINTEL__.

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2023-03-06

Abhishek Tiwari

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_task_sequence.html + + +

extensions/INTEL/SPV_INTEL_task_sequence.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html b/extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html index 63bf64e..969e9bb 100644 --- a/extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html +++ b/extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html @@ -1,325 +1,12 @@ - - - - - - - -SPV_INTEL_unstructured_loop_controls - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_unstructured_loop_controls

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
  • -

    Michael Kinsner, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-06-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension introduces a new instruction that allows loop control parameters to be applied to unstructured loops. This instruction can be used in place of an OpLoopMerge on such loops. As both must be the second-to-last instruction in a loop header block, they can’t co-exist and there can’t be multiple instances of either in the same loop.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_unstructured_loop_controls"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
UnstructuredLoopControlsINTEL
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the UnstructuredLoopControlsINTEL capability:

-
-
-
-
OpLoopControlINTEL
-
-
-
-
-
-

Token Number Assignments

-
-
-
- ---- - - - - - - - - - - -

UnstructuredLoopControlsINTEL

5886

OpLoopControlINTEL

5887

-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Loop Control

-
-

In Section 3.23, Loop Control, add OpLoopControlINTEL to the set of instructions that can use Loop Controls:

-
-
-

Used by OpLoopMerge and OpLoopControlINTEL.

-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5886

UnstructuredLoopControlsINTEL

-
-
-
-
-

Instructions

-
-

In section 3.32.17, Control-Flow Instructions, add a new instruction, OpLoopControlINTEL, as follows:

-
- ------ - - - - - - - - - - - - -

OpLoopControlINTEL

-

Declare a loop.

-

This instruction must be in a block that is the target of a back edge and that block must dominate the back-edge block from which that edge originated.

-

This instruction must immediately precede either an OpBranch or OpBranchConditional instruction. That is, it must be the second-to-last instruction in its block.

-

Loop Control Parameters appear in Loop Control-table order for any Loop Control setting that requires such a parameter.

Capability: -UnstructuredLoopControlsINTEL

2 + variable

5887

Loop Control

Literal, Literal, …​
-Loop Control Parameters
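
As a rough sketch of the "2 + variable" word count and the table-order rule for parameters, the Python below assembles the words of an OpLoopControlINTEL. It assumes the standard first-word layout (word count in the high 16 bits, opcode in the low 16 bits) and only knows the parameterized Loop Control bits of core SPIR-V 1.4; the function name and the dictionary-based interface are hypothetical.

```python
OP_LOOP_CONTROL_INTEL = 5887  # from the Token Number Assignments table

# Core SPIR-V 1.4 Loop Control bits that take one literal parameter,
# listed in Loop Control table order.
PARAMETERIZED_BITS = [
    0x8,    # DependencyLength
    0x10,   # MinIterations
    0x20,   # MaxIterations
    0x40,   # IterationMultiple
    0x80,   # PeelCount
    0x100,  # PartialCount
]


def encode_loop_control_intel(mask, params=None):
    """Return the words of a hypothetical OpLoopControlINTEL.

    `params` maps each parameterized bit set in `mask` to its literal.
    """
    params = params or {}
    expected = [bit for bit in PARAMETERIZED_BITS if mask & bit]
    if set(params) != set(expected):
        raise ValueError("parameters must match the parameterized bits in mask")
    # Parameters are emitted in table order, not in dictionary order.
    operands = [mask] + [params[bit] & 0xFFFFFFFF for bit in expected]
    word_count = 1 + len(operands)  # 2 when there are no parameters
    return [(word_count << 16) | OP_LOOP_CONTROL_INTEL, *operands]
```

For instance, a mask combining Unroll (0x1), DependencyLength (0x8), and PeelCount (0x80) yields a four-word instruction whose last two words are the DependencyLength and PeelCount literals, in that order.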

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-06-12

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html + + +

extensions/INTEL/SPV_INTEL_unstructured_loop_controls.html

+ + diff --git a/extensions/INTEL/SPV_INTEL_usm_storage_classes.html b/extensions/INTEL/SPV_INTEL_usm_storage_classes.html index 8a1dc5a..7703a63 100644 --- a/extensions/INTEL/SPV_INTEL_usm_storage_classes.html +++ b/extensions/INTEL/SPV_INTEL_usm_storage_classes.html @@ -1,378 +1,12 @@ - - - - - - - -SPV_INTEL_usm_storage_classes - - - - - -
-
-

Name Strings

-
-
-

SPV_INTEL_usm_storage_classes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Joe Garvey, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 Intel Corporation. All rights reserved.

-
-
-
-
-

Status

-
-
-

Final Draft

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-04-30

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension introduces two new storage classes that are subclasses of the CrossWorkgroup storage class. -Using these more specific storage classes provides additional information that can enable optimization. -The extension also introduces two new conversion instructions to enable converting pointers from and to these storage classes.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_INTEL_usm_storage_classes"
-
-
-
-
-
-

New capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
USMStorageClassesINTEL
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - -

OpPtrCastToCrossWorkgroupINTEL

5934

USMStorageClassesINTEL

5935

DeviceOnlyINTEL

5936

HostOnlyINTEL

5937

OpCrossWorkgroupCastToPtrINTEL

5938

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6 Revision 2

-
-
-

Storage Class

-
-

Modify Section 3.7, Storage Class, adding these rows to the table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Storage Class | Enabling Capabilities

5936

DeviceOnlyINTEL
-A subset of the CrossWorkgroup storage class. Memory that is resident on the device. SYCL or OpenCL device unified shared memory.

USMStorageClassesINTEL

5937

HostOnlyINTEL
-A subset of the CrossWorkgroup storage class. Memory that is resident on the host. SYCL or OpenCL host unified shared memory.

USMStorageClassesINTEL

-
-
-
-
-

Capability

-
-

Modify Section 3.31, Capability, adding a row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5935

USMStorageClassesINTEL

Kernel

-
-
-
-
-

Instructions

-
-

Modify Section 3.36.11, Conversion Instructions, adding two new instructions as follows:

-
- ------- - - - - - - - - - - - - - -

OpPtrCastToCrossWorkgroupINTEL

-

Converts a pointer’s Storage Class from a more specific class to CrossWorkgroup.

-

Result Type must be an OpTypePointer. Its Storage Class must be CrossWorkgroup.

-

Pointer must point to the DeviceOnlyINTEL or HostOnlyINTEL Storage Class.

-

Pointer must have the same type as Result Type aside from the Storage Class.

Capability:
-USMStorageClassesINTEL

4

5934

<id>
-Result Type

Result <id>

<id>
-Pointer

- ------- - - - - - - - - - - - - - -

OpCrossWorkgroupCastToPtrINTEL

-

Convert a pointer’s Storage Class from CrossWorkgroup to a more specific class.

-

Result Type must be an OpTypePointer. Result Type's Storage Class must be DeviceOnlyINTEL or HostOnlyINTEL.

-

Pointer must point to the CrossWorkgroup Storage Class.

-

Pointer must have the same type as Result Type aside from the Storage Class.

Capability:
-USMStorageClassesINTEL

4

5938

<id>
-Result Type

Result <id>

<id>
-Pointer
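
The operand rules for the two conversion instructions can be written down as a pair of checks. The Python below is a minimal sketch: the `PointerType` record and both function names are hypothetical, the pointee type is represented by a plain string, and the value 5 for CrossWorkgroup is taken from the core SPIR-V Storage Class table rather than from this extension.

```python
from dataclasses import dataclass

CROSS_WORKGROUP = 5        # core SPIR-V Storage Class value
DEVICE_ONLY_INTEL = 5936   # from the Token Number Assignments table
HOST_ONLY_INTEL = 5937
USM_CLASSES = {DEVICE_ONLY_INTEL, HOST_ONLY_INTEL}


@dataclass(frozen=True)
class PointerType:
    """Hypothetical stand-in for an OpTypePointer."""
    storage_class: int
    pointee: str


def ptr_cast_to_cross_workgroup_ok(result_type, pointer_type):
    """Check the rules listed for OpPtrCastToCrossWorkgroupINTEL."""
    return (result_type.storage_class == CROSS_WORKGROUP
            and pointer_type.storage_class in USM_CLASSES
            and result_type.pointee == pointer_type.pointee)


def cross_workgroup_cast_to_ptr_ok(result_type, pointer_type):
    """Check the rules listed for OpCrossWorkgroupCastToPtrINTEL."""
    return (result_type.storage_class in USM_CLASSES
            and pointer_type.storage_class == CROSS_WORKGROUP
            and result_type.pointee == pointer_type.pointee)
```

Both checks compare everything except the Storage Class, mirroring the requirement that Pointer and Result Type match aside from it.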

-
-
-

Validation Rules

-
-

None.

-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2022-11-28

Joe Garvey

Initial public release

-
-
-
- - \ No newline at end of file + + + + + + extensions/INTEL/SPV_INTEL_usm_storage_classes.html + + +

extensions/INTEL/SPV_INTEL_usm_storage_classes.html

+ + diff --git a/extensions/KHR/SPV_KHR_16bit_storage.html b/extensions/KHR/SPV_KHR_16bit_storage.html index 3df6d17..a60d1c9 100644 --- a/extensions/KHR/SPV_KHR_16bit_storage.html +++ b/extensions/KHR/SPV_KHR_16bit_storage.html @@ -1,573 +1,12 @@ - - - - - - - -SPV_KHR_16bit_storage - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_16bit_storage

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alexander Galazin, ARM

    -
  • -
  • -

    Jan-Harald Fredriksen, ARM

    -
  • -
  • -

    Joerg Wagner, ARM

    -
  • -
  • -

    Neil Henning, Codeplay

    -
  • -
  • -

    Jeff Bolz, Nvidia

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016-2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2017-01-11

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-02-24

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-06-11

Revision

9

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_KHR_storage_buffer_storage_class.

-
-
-
-
-

Overview

-
-
-

This extension adds new StorageUniformBufferBlock16 and StorageUniform16 -capabilities that allow access to 16-bit data in objects in the Uniform storage -class with BufferBlock and Block decorations.

-
-
-

If the SPV_KHR_storage_buffer_storage_class extension is supported, it also -allows use of the StorageBuffer16BitAccess and the UniformAndStorageBuffer16BitAccess -capabilities, providing the same functionality as the -StorageUniformBufferBlock16 and the StorageUniform16 capabilities.

-
-
-

It also adds a new StoragePushConstant16 capability that allows access to 16-bit -data in objects in the PushConstant storage class.

-
-
-

Finally, this extension adds a new StorageInputOutput16 capability that allows -access to 16-bit data in objects in the Input and Output storage classes.

-
-
-

To facilitate stores of 32-bit data to 16-bit storage, this extension enables -use of the FP Rounding Mode decoration in graphics shaders.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_16bit_storage"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
StorageUniformBufferBlock16
-StorageUniform16
-StoragePushConstant16
-StorageInputOutput16
-
-
-
-

If the SPV_KHR_storage_buffer_storage_class extension is supported, the following capabilities are also supported:

-
-
-
-
StorageBuffer16BitAccess
-UniformAndStorageBuffer16BitAccess
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - -

StorageUniformBufferBlock16

4433

StorageBuffer16BitAccess

4433

StorageUniform16

4434

UniformAndStorageBuffer16BitAccess

4434

StoragePushConstant16

4435

StorageInputOutput16

4436

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
Modify Section 3.16, FP Rounding Mode, amend the Enabling Capabilities column to say:
-
-
-

Any of Kernel, StorageUniformBufferBlock16, StorageUniform16, -StoragePushConstant16, StorageInputOutput16.

-
-
-
Modify Section 3.20, Decoration, amend the Enabling Capabilities column for the FPRoundingMode decoration to say:
-
-
-

Any of Kernel, StorageUniformBufferBlock16, StorageUniform16, -StoragePushConstant16, StorageInputOutput16.

-
-
-
Modify Section 3.31, Capability, adding the following rows to the Capability table:
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Capability | Depends On

4433

StorageUniformBufferBlock16
-Allows 16-bit OpTypeFloat and OpTypeInt -instructions for creating scalar, vector, and composite types that become members of a block -residing in the Uniform Storage Class. -A type that is or contains such a 16-bit type can be used only as an operand of an -OpTypePointer instruction. -The block must be decorated with BufferBlock.

-

Other uses of 16-bit types are not enabled by this capability.

4434

StorageUniform16
-Allows 16-bit OpTypeFloat and OpTypeInt -instructions for creating scalar, vector, and composite types that become members of a block -residing in the Uniform Storage Class. -A type that is, or contains, such a 16-bit type can be used only as an operand of an -OpTypePointer instruction. -The block can have any supported decoration, including BufferBlock.

-

Other uses of 16-bit types are not enabled by this capability.

StorageUniformBufferBlock16

4435

StoragePushConstant16
-Allows 16-bit OpTypeFloat and OpTypeInt -instructions for creating scalar, vector, and composite types that become members of a block -residing in the PushConstant Storage Class. -A type that is, or contains, such a 16-bit type can be used only as an operand of an -OpTypePointer instruction.

-

Other uses of 16-bit types are not enabled by this capability.

4436

StorageInputOutput16
-Allows 16-bit OpTypeFloat and OpTypeInt -instructions for creating scalar, vector, and composite types that become members of a block -residing in the Output Storage Class. -A type that is, or contains, such a 16-bit type can be used only as an operand of an -OpTypePointer instruction.

-

Other uses of 16-bit types are not enabled by this capability.

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

If the StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, or StorageInputOutput16 capability is declared:

-
-
-
    -
  • -

    An OpTypePointer pointing to a 16-bit scalar, a 16-bit vector, -or a composite containing a 16-bit member can be used as the result type of an OpVariable, -OpAccessChain, or OpInBoundsAccessChain.

    -
  • -
  • -

    OpLoad can only load 16-bit scalars, 16-bit vectors, and 16-bit matrices.

    -
  • -
  • -

    OpStore can only store 16-bit scalars, 16-bit vectors, and 16-bit matrices.

    -
  • -
  • -

    OpCopyObject can be used for 16-bit scalars or composites containing 16-bit members.

    -
  • -
  • -

    16-bit scalars or 16-bit vectors can be used as operands to a width-only conversion -instruction to a 32-bit type (OpFConvert, OpSConvert, -or OpUConvert), and can be produced as results of a width-only conversion instruction -from a 32-bit type.

    -
  • -
  • -

    A structure containing a 16-bit member can be an operand to OpArrayLength.

    -
  • -
  • -

    Any other instructions not explicitly listed by the capabilities or allowed by the validation rules -cannot operate on variables with 16-bit scalar, 16-bit vector, or 16-bit composite types.

    -
  • -
-
-
-

A FPRoundingMode decoration can be applied only to:

-
-
-
    -
  • -

    a width-only conversion instruction that is used as the object argument of an -OpStore storing through a pointer to a 16-bit floating-point -object in Uniform, or PushConstant, or Input, or Output -Storage Classes.

    -
  • -
-
-
-
-
-

Interactions with SPV_KHR_storage_buffer_storage_class

-
-
-
-
-If SPV_KHR_storage_buffer_storage_class is supported,
-
-
-
-
modify the description of the StorageUniformBufferBlock16 capability, adding the following sentence to the first paragraph of the description:
-
-
-
-
-
-
-

The object can also be in the StorageBuffer Storage Class and have any decorations supported for this Storage Class.

-
-
-
-
modify the description of the StorageUniform16 capability, adding the following sentence to the first paragraph of the description:
-
-

The object can also be in the StorageBuffer Storage Class and have any decorations supported for this Storage Class.

-
-
-
Modify Section 3.31, Capability, adding the following rows to the Capability table:
-
-
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Capability | Depends On

4433

StorageBuffer16BitAccess
-Same as StorageUniformBufferBlock16

4434

UniformAndStorageBuffer16BitAccess
-Same as StorageUniform16

StorageBuffer16BitAccess
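
The "Depends On" column of the two capability tables can be read as a small dependency graph. The Python sketch below, with hypothetical names throughout, expands a set of declared capabilities into everything they pull in; it encodes only the token numbers and dependencies listed in the tables above.

```python
# Token numbers from the tables above; two pairs of names share a token.
CAPABILITY_TOKENS = {
    "StorageUniformBufferBlock16": 4433,
    "StorageBuffer16BitAccess": 4433,
    "StorageUniform16": 4434,
    "UniformAndStorageBuffer16BitAccess": 4434,
    "StoragePushConstant16": 4435,
    "StorageInputOutput16": 4436,
}

# "Depends On" column: capability -> capabilities it depends on.
DEPENDS_ON = {
    "StorageUniform16": ["StorageUniformBufferBlock16"],
    "UniformAndStorageBuffer16BitAccess": ["StorageBuffer16BitAccess"],
}


def implied_capabilities(declared):
    """Return the declared capabilities plus everything they depend on."""
    result = set()
    stack = list(declared)
    while stack:
        name = stack.pop()
        if name not in CAPABILITY_TOKENS:
            raise KeyError(f"unknown capability: {name}")
        if name not in result:
            result.add(name)
            stack.extend(DEPENDS_ON.get(name, []))
    return result
```

Declaring StorageUniform16 therefore also covers StorageUniformBufferBlock16 in this model, while StoragePushConstant16 and StorageInputOutput16 stand alone.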

-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-11-22

Alexander Galazin

Initial revision

2

2016-11-28

Alexander Galazin

Address first round of feedback

3

2016-12-01

Alexander Galazin

Removed combined Load/Store and Convert instructions. -Renamed capabilities and described them in terms of storage classes.

4

2016-12-08

David Neto

Assigned token numbers

5

2016-12-14

Alexander Galazin

Renamed the extension. Removed changes to the default rounding modes. Made StorageUniform16 dependent on StorageUniformBufferBlock16

6

2017-02-22

JohnK

Clarified that conversions for changing width can only change the width, not the fundamental type domain.

7

2017-03-15

Alexander Galazin

Clarified that FP Rounding mode can be used only if the capabilities from this extension are enabled

8

2017-03-23

Alexander Galazin

Added interactions with SPV_KHR_uniform_buffer_storage_class

9

2018-06-11

Alexander Galazin

Added clarifications for SPIR-V issue 319

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_16bit_storage.html + + +

extensions/KHR/SPV_KHR_16bit_storage.html

+ + diff --git a/extensions/KHR/SPV_KHR_8bit_storage.html b/extensions/KHR/SPV_KHR_8bit_storage.html index f31fbe5..6d151e7 100755 --- a/extensions/KHR/SPV_KHR_8bit_storage.html +++ b/extensions/KHR/SPV_KHR_8bit_storage.html @@ -1,331 +1,12 @@ - - - - - - - -SPV_KHR_8bit_storage - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_8bit_storage

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alexander Galazin, Arm

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016-2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2018-03-28

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-06-10

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension requires SPV_KHR_storage_buffer_storage_class.

-
-
-

This extension interacts with SPV_KHR_16bit_storage.

-
-
-
-
-

Overview

-
-
-

This extension adds new StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, -and StoragePushConstant8 capabilities to allow accesses to 8-bit integer types in -the StorageBuffer, Uniform, and PushConstant storage classes.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_8bit_storage"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
Modify Section 3.31, Capability, adding the following rows to the Capability table:
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Capability | Depends On

4448

StorageBuffer8BitAccess
-Allows an OpTypePointer to an 8-bit OpTypeInt -type used as a member of a structure in the StorageBuffer storage class.

-

The 8-bit object pointed to by such a pointer may be the result of a width-only -conversion instruction (OpSConvert, -or OpUConvert) from another type or of -an OpLoad, and may be used as an operand to a width-only conversion -instruction to another type or as the Object operand of an -OpStore.

-

Other uses of 8-bit types are not enabled by this capability.

4449

UniformAndStorageBuffer8BitAccess
-Allows an OpTypePointer to an 8-bit OpTypeInt -type used as a member of a structure in either of the Uniform or StorageBuffer -storage classes.

-

The 8-bit object pointed to by such a pointer may be the result of a width-only -conversion instruction (OpSConvert, -or OpUConvert) from another type or of -an OpLoad, and may be used as an operand to a width-only conversion -instruction to another type or as the Object operand of an -OpStore.

-

Other uses of 8-bit types are not enabled by this capability.

StorageBuffer8BitAccess

4450

StoragePushConstant8
-Allows an OpTypePointer to an 8-bit OpTypeInt -type used as a member of a structure in the PushConstant storage class.

-

The 8-bit object pointed to by such a pointer may be the result of a width-only -conversion instruction (OpSConvert, -or OpUConvert) from another type or of -an OpLoad.

-

Other uses of 8-bit types are not enabled by this capability.

-
-
-
-
-
-
-
-
-

Interactions with optional types

-
-
-

If the Int8 capability is declared, then the 8-bit OpTypeInt -instruction mentioned earlier can be used as an operand or a result to any supported instruction -with an 8-bit result type or an 8-bit operand type.

-
-
-

If the Int16 or the Float16 capability is declared, then the 8-bit OpTypeInt -instruction mentioned earlier can be used as an operand or a result to any supported conversion -instruction with a 16-bit result type or a 16-bit operand type.

-
-
-

If the Int64 or the Float64 capability is declared, then the 8-bit OpTypeInt -instruction mentioned earlier can be used as an operand or result to any supported conversion -instruction with a 64-bit result type or a 64-bit operand type.

-
-
-
-
-

Interactions with SPV_KHR_16bit_storage

-
-
-

If any capability is declared from the SPV_KHR_16bit_storage extension, -then the object produced by dereferencing a pointer pointing to 8-bit data can be used -as an operand or a result to a width-only conversion instruction with -a 16-bit result type or a 16-bit operand type, and in addition, -the object produced by dereferencing a pointer pointing to 16-bit data mentioned -in the Capability section of the SPV_KHR_16bit_storage extension -can be used as an operand or a result to a width-only conversion instruction with -an 8-bit result type or an 8-bit operand type.

-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2017-10-05

Alexander Galazin

Initial revision

2

2017-11-01

Alexander Galazin

Assigned token numbers

3

2018-03-28

David Neto

Record approval by SPIR Working Group

4

2019-06-10

John Kessenich

Rationalize and clean up

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_8bit_storage.html + + +

extensions/KHR/SPV_KHR_8bit_storage.html

+ + diff --git a/extensions/KHR/SPV_KHR_bit_instructions.html b/extensions/KHR/SPV_KHR_bit_instructions.html index cc17fc6..27f015d 100644 --- a/extensions/KHR/SPV_KHR_bit_instructions.html +++ b/extensions/KHR/SPV_KHR_bit_instructions.html @@ -1,395 +1,12 @@ - - - - - - - -SPV_KHR_bit_instructions - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_bit_instructions

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2021-03-10

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2021-04-23

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-06-23

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This enables the following bit instructions to be used by SPIR-V modules without requiring the Shader capability:

-
-
-
    -
  • -

    OpBitFieldInsert

    -
  • -
  • -

    OpBitFieldSExtract

    -
  • -
  • -

    OpBitFieldUExtract

    -
  • -
  • -

    OpBitReverse

    -
  • -
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_bit_instructions"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the new capability:

-
-
-
-
BitInstructions
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.31, "Capability", adding this row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

6025

BitInstructions
-Uses the bit reverse, bitfield insert, and bitfield extract instructions.

-
-
-
-

Modify Section 3.37.14, "Bit Instructions", adding the BitInstructions capability to the following instructions:

-
- ---------- - - - - - - - - - - - - - - - - -

OpBitFieldInsert
-
-(The description of this instruction is unchanged.)

Capability:
-Shader
-BitInstructions

7

201

<id>
-Result Type

Result <id>

<id>
-Base

<id>
-Insert

<id>
-Offset

<id>
-Count

- --------- - - - - - - - - - - - - - - - -

OpBitFieldSExtract
-
-(The description of this instruction is unchanged.)

Capability:
-Shader
-BitInstructions

6

202

<id>
-Result Type

Result <id>

<id>
-Base

<id>
-Offset

<id>
-Count

- --------- - - - - - - - - - - - - - - - -

OpBitFieldUExtract
-
-(The description of this instruction is unchanged.)

Capability:
-Shader
-BitInstructions

6

203

<id>
-Result Type

Result <id>

<id>
-Base

<id>
-Offset

<id>
-Count

- ------- - - - - - - - - - - - - - -

OpBitReverse
-
-(The description of this instruction is unchanged.)

Capability:
-Shader
-BitInstructions

4

204

<id>
-Result Type

Result <id>

<id>
-Base
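
Since the instruction descriptions are left unchanged here, the following Python reference functions may help as a reminder of what the four instructions compute. They are a sketch based on the core SPIR-V descriptions, restricted to 32-bit scalars; the function names are hypothetical, and the range check stands in for cases the core specification leaves undefined.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def _check_range(offset, count):
    if offset < 0 or count < 0 or offset + count > WIDTH:
        raise ValueError("Offset + Count must not exceed the bit width")


def bit_field_insert(base, insert, offset, count):
    """Replace Count bits of Base, starting at Offset, with the low bits of Insert."""
    _check_range(offset, count)
    field = ((1 << count) - 1) << offset
    return ((base & ~field) | ((insert << offset) & field)) & MASK


def bit_field_u_extract(base, offset, count):
    """Extract Count bits starting at Offset, zero-extended."""
    _check_range(offset, count)
    return ((base & MASK) >> offset) & ((1 << count) - 1)


def bit_field_s_extract(base, offset, count):
    """Extract Count bits starting at Offset, sign-extended to a Python int."""
    value = bit_field_u_extract(base, offset, count)
    if count and value & (1 << (count - 1)):
        value -= 1 << count
    return value


def bit_reverse(base):
    """Reverse the order of the 32 bits of Base."""
    return int(f"{base & MASK:032b}"[::-1], 2)
```

The signed extract returns an ordinary negative Python integer rather than a two's-complement word, which keeps the sign extension visible in the result.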

-
-
-
-

Validation Rules

-
-
-

No new validation rules are required.

-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2021-06-23

Ben Ashbaugh

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_bit_instructions.html + + +

extensions/KHR/SPV_KHR_bit_instructions.html

+ + diff --git a/extensions/KHR/SPV_KHR_compute_shader_derivatives.html b/extensions/KHR/SPV_KHR_compute_shader_derivatives.html index 1fb7219..abcb0e8 100644 --- a/extensions/KHR/SPV_KHR_compute_shader_derivatives.html +++ b/extensions/KHR/SPV_KHR_compute_shader_derivatives.html @@ -1,488 +1,12 @@ - - - - - - - -SPV_KHR_compute_shader_derivatives - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_compute_shader_derivatives

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jean-Noe Morissette, Epic Games

    -
  • -
  • -

    Daniel Koch, Nvidia

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Stu Smith, AMD

    -
  • -
  • -

    Samuel (Sheng-Wen) Huang, MediaTek

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2024-06-26

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-08-16

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-06-26

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a capability to enable derivatives in any -Execution Model that has defined workgroups. There are two new execution -modes added which specify which four shader invocations are grouped together.

-
-
-

The new ComputeDerivativeGroupQuadsKHR and ComputeDerivativeGroupLinearKHR -capabilities enable the use of OpImageQueryLod, the ImplicitLod instructions, -and the Derivative instructions in Execution Models with defined workgroups -(GLCompute, Mesh, and Task at the time of writing).

-
-
-

This SPIR-V extension provides support for the GLSL -GL_KHR_compute_shader_derivatives extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_compute_shader_derivatives"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 2.2.4, Control Flow)
-
-

(Modify the definition of Derivative Group, to include more Execution Models)

-
-
-
-

Derivative Group: Defined only for the Fragment Execution Model and any Execution Model that has defined workgroups. -In the Fragment execution model this is the set of invocations collectively -processing a single point, line, or triangle, including any helper invocations. -In other execution models, this is a single workgroup.

-
-
-
-
-
(Modify Section 2.19, Derivatives)
-
-

(Replace the first sentence:)

-
-
-
-

Derivatives appear only in the Fragment Execution Model.

-
-
-
-
-

(with the following:)

-
-
-
-
-

Derivatives appear in the Fragment and any Execution Model that has defined workgroups.

-
-
-
-
-
(Modify Section 3.6, Execution Mode)
-
-
-
-
-

(add new rows to the Execution Mode table)

-
- -------- - - - - - - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

5289

DerivativeGroupQuadsKHR
-Specifies that shader derivatives are evaluated over 2x2 -groups of invocations. -See the Vulkan or OpenGL API specifications for more detail. -Valid with any Execution Model that has defined workgroups.

ComputeDerivativeGroupQuadsKHR

5290

DerivativeGroupLinearKHR
-Specifies that shader derivatives are evaluated over groups -of four invocations with consecutive LocalInvocationIndex values. -See the Vulkan or OpenGL API specifications for more detail. -Valid with any Execution Model that has defined workgroups.

ComputeDerivativeGroupLinearKHR

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly DeclaresEnabled by Extension

5288

ComputeDerivativeGroupQuadsKHR
-Uses the DerivativeGroupQuadsKHR execution mode.

Shader

SPV_KHR_compute_shader_derivatives

5350

ComputeDerivativeGroupLinearKHR
-Uses the DerivativeGroupLinearKHR execution mode.

Shader

SPV_KHR_compute_shader_derivatives

-
-
-
-
(Modify Section 3.32.10, Image Instructions)
-
-

(Modify the description of the following instructions to allow them in -any Execution Model that has defined workgroups in addition to the Fragment -Execution Model)

-
-
-
-
    -
  • -

    OpImageSampleImplicitLod

    -
  • -
  • -

    OpImageSampleDrefImplicitLod

    -
  • -
  • -

    OpImageSampleProjImplicitLod

    -
  • -
  • -

    OpImageSampleProjDrefImplicitLod

    -
  • -
  • -

    OpImageQueryLod

    -
  • -
  • -

    OpImageSparseSampleImplicitLod

    -
  • -
  • -

    OpImageSparseSampleDrefImplicitLod

    -
    -
      -
    • -

      This instruction is only valid in the Fragment and any Execution Model -that has defined workgroups. In addition, it consumes an implicit derivative -that can be affected by code motion.

      -
    • -
    -
    -
  • -
-
-
-
-
-
(Modify Section 3.32.16, Derivative Instructions)
-
-

(Modify the description of the following instructions to allow them in any - Execution Model that has defined workgroups in addition to the Fragment - Execution Model)

-
-
-
-
    -
  • -

    OpDPdx

    -
  • -
  • -

    OpDPdy

    -
  • -
  • -

    OpFwidth

    -
  • -
  • -

    OpDPdxFine

    -
  • -
  • -

    OpDPdyFine

    -
  • -
  • -

    OpFwidthFine

    -
  • -
  • -

    OpDPdxCoarse

    -
  • -
  • -

    OpDPdyCoarse

    -
  • -
  • -

    OpFwidthCoarse

    -
    -
      -
    • -

      This instruction is only valid in the Fragment and any Execution Model that -has defined workgroups.

      -
    • -
    -
    -
  • -
-
-
-

(Modify the existing descriptions of OpDPd{x,y}{Fine,Coarse}, prefacing the - existing language that talks about partial derivatives relative to the window - x or y coordinate with "In the Fragment Execution Model:")

-
-
-

(Add the following to the descriptions of OpDPd{x,y}{Fine,Coarse}, describing - how partial derivatives work in any Execution Model that has defined workgroups)

-
-
-

In any Execution Model that has defined workgroups:
-Result is the partial derivative of P evaluated over groups of four invocations. -Selection of the four invocations is determined by the DerivativeGroup*KHR -execution mode that was specified for the entry point. For these instructions to be -used, one of the derivative group modes must be specified.
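The two grouping rules can be sketched as follows. This is only an illustration: the function names are hypothetical, and the exact 2x2 selection is defined by the client API specification, so the quad mapping below (built from the x and y components of the local invocation ID, with even workgroup dimensions) is an assumption rather than text from this extension.

```python
def linear_group(local_invocation_index: int) -> int:
    """DerivativeGroupLinearKHR: four invocations with consecutive
    LocalInvocationIndex values share one derivative group."""
    return local_invocation_index // 4


def quad_group(local_id: tuple, workgroup_size: tuple) -> int:
    """DerivativeGroupQuadsKHR: a 2x2 block of invocations shares one group.

    Assumption: the block is formed from the x and y components of the local
    invocation ID, and the x and y workgroup sizes are even.
    """
    x, y, z = local_id
    size_x, size_y, _ = workgroup_size
    quads_per_row = size_x // 2
    quads_per_slice = quads_per_row * (size_y // 2)
    return z * quads_per_slice + (y // 2) * quads_per_row + (x // 2)
```

Under these assumptions, invocations (0,0,0) and (1,1,0) of a 4x4x1 workgroup fall into the same quad, while (2,0,0) starts the next one.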

-
-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_compute_shader_derivatives"
-
-
-
-
    -
  • -

    An entry point cannot have both the DerivativeGroupQuadsKHR and -DerivativeGroupLinearKHR execution modes specified.

    -
  • -
  • -

    The DerivativeGroupQuadsKHR and DerivativeGroupLinearKHR execution modes -can be used on entry points with any Execution Model that has defined workgroups.
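A minimal sketch of how a checker might apply the two rules above. The function name and the string-based representation of execution models and modes are hypothetical; the set of models with defined workgroups is taken from the overview ("GLCompute, Mesh, and Task at the time of writing") and is not exhaustive.

```python
# Execution models with defined workgroups, as named in the overview.
# Illustrative only: real modules use the concrete enumerant names.
WORKGROUP_MODELS = {"GLCompute", "Mesh", "Task"}

DERIVATIVE_MODES = {"DerivativeGroupQuadsKHR", "DerivativeGroupLinearKHR"}


def check_entry_point(execution_model: str, execution_modes: set) -> list:
    """Return a list of rule violations for one entry point (empty if none)."""
    errors = []
    used = DERIVATIVE_MODES & set(execution_modes)
    if len(used) == 2:
        errors.append("both derivative group modes specified")
    if used and execution_model not in WORKGROUP_MODELS:
        errors.append("derivative group mode on a model without workgroups")
    return errors
```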

    -
  • -
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-02-28

Jean-Noe Morissette

Internal revisions

2

2024-06-26

Daniel Koch

Update overview to clarify supported execution models

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_compute_shader_derivatives.html + + +

extensions/KHR/SPV_KHR_compute_shader_derivatives.html

+ + diff --git a/extensions/KHR/SPV_KHR_cooperative_matrix.html b/extensions/KHR/SPV_KHR_cooperative_matrix.html index ed20a68..ee7b676 100644 --- a/extensions/KHR/SPV_KHR_cooperative_matrix.html +++ b/extensions/KHR/SPV_KHR_cooperative_matrix.html @@ -1,1141 +1,12 @@ - - - - - - - -SPV_KHR_cooperative_matrix - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_cooperative_matrix

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Markus Tavenrath, NVIDIA

    -
  • -
  • -

    Kevin Petit, Arm Ltd.

    -
  • -
  • -

    Lei Zhang, Google

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Nicolai Hahnle, AMD

    -
  • -
  • -

    Mariusz Merecki, Intel

    -
  • -
  • -

    Pedro Olsen Ferreira, Arm Ltd.

    -
  • -
  • -

    Ni Hui, Tencent

    -
  • -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Dong Wang, AMD

    -
  • -
  • -

    Ruimin Zhao, AMD

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Lin Qun, AMD

    -
  • -
  • -

    Wooyoung Kim, Qualcomm

    -
  • -
  • -

    Krystian Andrzejewski, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020-2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2023-05-03

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2023-06-16

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-08-07

Revision

6

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6, Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.6.

-
-
-

This extension interacts with SPV_EXT_physical_storage_buffer.

-
-
-
-
-

Overview

-
-
-

This extension adds a new set of types known as "cooperative matrix" types, -where the storage for and computations performed on the matrix are spread -across a set of invocations such as a subgroup. These types give the -implementation freedom in how to optimize matrix multiplies.

-
-
-

This extension introduces the types and instructions, but does not specify -rules about what sizes/combinations are valid. This is left to the -client API specs, and it is expected that different implementations may -support different sizes. To help accommodate this, the dimensions of the -cooperative types can be specialized via specialization constants. Since -the scope parameter is also something that could potentially be specialized, -this extension allows all scope ids to be specialization constants.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_cooperative_matrix"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

2.2 Terms

-
-

Add new terms to section 2.2.2 Types:

-
-
-

Cooperative Matrix: A two-dimensional ordered -collection of scalars, whose storage is spread across multiple shader -invocations.

-
-
-

Add Cooperative Matrix to the definition of Abstract Type.

-
-
-

Add Cooperative Matrix to the definition of Composite. A cooperative matrix -is a composite with an implementation-dependent number of components -(which can be queried with OpCooperativeMatrixLengthKHR). It can be used as a -composite for all operations that act on composite types. The mapping -of components to invocations and indexes is implementation-dependent.

-
-
-
-

2.16 Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
    -
  • -

    Add OpCooperativeMatrixLoadKHR and OpCooperativeMatrixStoreKHR to the list -of instructions under "It is invalid for a pointer to be an operand to any -instruction other than:", when the Logical addressing model is selected and -neither the VariablePointers nor VariablePointersStorageBuffer capability -is declared.

    -
  • -
  • -

    Cooperative matrix types (or types containing them) can only be allocated -in Function or Private storage classes.

    -
  • -
  • -

    The Matrix{A,B,C,Result}SignedComponentsKHR Cooperative Matrix Operand can only be -used when the type of the corresponding matrix is an integer type.

    -
  • -
-
-
-
-

Modify section 2.16.2. Validation Rules for Shader Capabilities:

-
-

Replace:

-
-
-
    -
  • -

    All <id> used for Scope <id> and Memory Semantics <id> must be of an OpConstant.

    -
  • -
-
-
-

with:

-
-
-
    -
  • -

    All <id> used for Scope <id> must be the result of a constant instruction.

    -
  • -
  • -

    All <id> used for Memory Semantics <id> must be of an OpConstant.

    -
  • -
-
-
-

Add:

-
-
-
    -
  • -

    If the CooperativeMatrixKHR capability is declared then the VulkanMemoryModel -capability must be declared as well.

    -
  • -
-
-
-
-
-

3.26 Memory Operands

-
-

Modify Section 3.26, "Memory Operands":

-
-
-

In the description of MakePointerAvailable, change "Not valid with OpLoad" -to "Not valid with OpLoad or OpCooperativeMatrixLoadKHR".

-
-
-

In the description of MakePointerVisible, change "Not valid with OpStore" -to "Not valid with OpStore or OpCooperativeMatrixStoreKHR".

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

6022

CooperativeMatrixKHR
-Enables cooperative matrix types and instructions operating on them.

-
-
-
-
-

3.X Cooperative Matrix Operands

-
-

New section in 3 "Binary Form".

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cooperative Matrix OperandsEnabling Capabilities

0x0

NoneKHR

0x1

MatrixASignedComponentsKHR
-The components of matrix A are treated as signed.

0x2

MatrixBSignedComponentsKHR
-The components of matrix B are treated as signed.

0x4

MatrixCSignedComponentsKHR
-The components of matrix C are treated as signed.

0x8

MatrixResultSignedComponentsKHR
-The components of matrix Result are treated as signed.

0x10

SaturatingAccumulationKHR
-The accumulation of A x B and C performed by OpCooperativeMatrixMulAddKHR is saturating.
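Because the values in this table are single bits, the operands combine as a bitmask. The sketch below mirrors the table with a Python `IntFlag`; the class name and the `describe` helper are hypothetical, and only the enumerant names and values come from the table.

```python
from enum import IntFlag


class CooperativeMatrixOperands(IntFlag):
    NoneKHR = 0x0
    MatrixASignedComponentsKHR = 0x1
    MatrixBSignedComponentsKHR = 0x2
    MatrixCSignedComponentsKHR = 0x4
    MatrixResultSignedComponentsKHR = 0x8
    SaturatingAccumulationKHR = 0x10


def describe(mask: int) -> list:
    """List the names of the bits set in a Cooperative Matrix Operands mask."""
    return [
        flag.name
        for flag in CooperativeMatrixOperands
        if flag.value and mask & flag.value
    ]
```

For example, a mask of `0x11` marks matrix A as signed and requests saturating accumulation.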

-
-
-
-
-

3.X Cooperative Matrix Layout

-
-

New section in 3 "Binary Form".

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Cooperative Matrix LayoutEnabling Capabilities

0x0

RowMajorKHR
-Elements in rows of the matrix are laid out in contiguous memory locations. Rows -are laid out with a fixed stride communicated via the Stride operand to -OpCooperativeMatrixLoadKHR or OpCooperativeMatrixStoreKHR which must be -provided. Elements (row,*) of the result of load or store operations are taken -in order from contiguous locations starting at Pointer[row*Stride] where -Pointer is the Pointer operand to OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR. Stride must be greater than 0 when passed to -OpCooperativeMatrixStoreKHR and must be greater than or equal to 0 when passed -to OpCooperativeMatrixLoadKHR.

0x1

ColumnMajorKHR
-Elements in columns of the matrix are laid out in contiguous memory locations. Columns -are laid out with a fixed stride communicated via the Stride operand to -OpCooperativeMatrixLoadKHR or OpCooperativeMatrixStoreKHR which must be -provided. Elements (*,col) of the result of load or store operations are taken -in order from contiguous locations starting at Pointer[col*Stride] where -Pointer is the Pointer operand to OpCooperativeMatrixLoadKHR or -OpCooperativeMatrixStoreKHR. Stride must be greater than 0 when passed to -OpCooperativeMatrixStoreKHR and must be greater than or equal to 0 when passed -to OpCooperativeMatrixLoadKHR.
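The two layouts can be illustrated by computing where element (row, col) lands relative to Pointer. This is a sketch with hypothetical helper names. It assumes the pointee type is the same scalar type as the matrix component, so that one matrix element occupies one unit of Stride; when the pointee type differs (which the extension allows), the mapping of elements within a row or column is not this simple.

```python
def element_offset(layout: str, row: int, col: int, stride: int) -> int:
    """Offset from Pointer, in elements, of matrix element (row, col)."""
    if layout == "RowMajorKHR":
        # Row `row` starts at Pointer[row * Stride]; columns are contiguous.
        return row * stride + col
    if layout == "ColumnMajorKHR":
        # Column `col` starts at Pointer[col * Stride]; rows are contiguous.
        return col * stride + row
    raise ValueError(f"unknown layout: {layout}")


def stride_is_valid(stride: int, is_store: bool) -> bool:
    """Stride must be > 0 for stores and >= 0 for loads."""
    return stride > 0 if is_store else stride >= 0
```

A Stride of 0 on a load makes every row (or column) read the same memory locations, which is why it is permitted for loads but not for stores.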

-
-
-
-
-

3.X Cooperative Matrix Use

-
-

New section in 3 "Binary Form".

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Cooperative Matrix UseEnabling Capabilities

0

MatrixAKHR

1

MatrixBKHR

2

MatrixAccumulatorKHR

-
-
-
-
-

3.42.6 Type-Declaration Instructions

- ---------- - - - - - - - - - - - - - - - - -

OpTypeCooperativeMatrixKHR
-
-Declare a new cooperative matrix type with Rows rows and Columns columns, -where all invocations in Scope cooperate to compute and store the matrix.
-
-Component Type must be a scalar numerical type.
-
-Scope must be a constant instruction with scalar 32-bit integer type.
-
-Rows must be a constant instruction with scalar 32-bit integer type.
-
-Columns must be a constant instruction with scalar 32-bit integer type.
-
-Use must be a constant instruction with scalar 32-bit integer type whose -value corresponds to a Cooperative Matrix Use.
-
-All dynamic instances of an instruction with an operand or result that is an -object of this type must be executed such that all the invocations in the -Scope instance are active or none of them are. -

Capability:
-CooperativeMatrixKHR

7

4456

Result <id>

<id>
-Component Type

Scope <id>
-Scope

<id>
-Rows

<id>
-Columns

<id>
-Use

-
-
-

3.42.7 Constant-Creation Instructions

-
-

Modify OpConstantComposite to make an exception for cooperative matrix types: -"If the Result Type is a cooperative matrix type, then there must be only one -Constituent and it is used to initialize all members."

-
-
-
-

3.42.8 Memory Instructions

- ---------- - - - - - - - - - - - - - - - - -

OpCooperativeMatrixLoadKHR
-
-Load a cooperative matrix through a pointer.
-
-Result Type is the type of the loaded object. It must be a cooperative matrix -type.
-
-Pointer is a pointer. Its type must be an OpTypePointer whose Type operand -is a scalar or vector type. If the Shader capability was declared, Pointer -must point into an array and any ArrayStride decoration on Pointer is ignored.
-
-MemoryLayout specifies how matrix elements are laid out in memory. It must come -from a 32-bit integer constant instruction whose value corresponds to a -Cooperative Matrix Layout. See the Cooperative Matrix Layout table for -a description of the layouts and detailed layout-specific rules.
-
-Stride further qualifies how matrix elements are laid out in memory. It must be a -scalar integer type and its exact semantics depend on MemoryLayout.
-
-Memory Operand, if present, must begin with a Memory Operand literal. If not -present, it is the same as specifying the Memory Operand None.
-
-All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix. -

Capability:
-CooperativeMatrixKHR

5+variable

4457

<id>
-Result Type

Result <id>

<id>
-Pointer

<id>
-MemoryLayout

Optional <id>
-Stride

Optional
-Memory Operand

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixStoreKHR
-
-Store a cooperative matrix through a pointer.
-
-Pointer is a pointer. Its type must be an OpTypePointer whose Type operand -is a scalar or vector type. If the Shader capability was declared, Pointer -must point into an array and any ArrayStride decoration on Pointer is ignored.
-
-Object is the object to store. Its type must be an -OpTypeCooperativeMatrixKHR.
-
-MemoryLayout specifies how matrix elements are laid out in memory. It must come -from a 32-bit integer constant instruction whose value corresponds to a -Cooperative Matrix Layout. See the Cooperative Matrix Layout table for -a description of the layouts and detailed layout-specific rules.
-
-Stride further qualifies how matrix elements are laid out in memory. It must be a -scalar integer type and its exact semantics depend on MemoryLayout.
-
-Memory Operand, if present, must begin with a Memory Operand literal. If not -present, it is the same as specifying the Memory Operand None.
-
-All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix. -

Capability:
-CooperativeMatrixKHR

4+variable

4458

<id>
-Pointer

<id>
-Object

<id>
-MemoryLayout

Optional <id>
-Stride

Optional
-Memory Operand

- ------- - - - - - - - - - - - - - -

OpCooperativeMatrixLengthKHR
-
-Number of components of a cooperative matrix type accessible to the current -invocation when treated as a composite.
-
-Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.
-
-Type is a cooperative matrix type.

Capability:
-CooperativeMatrixKHR

4

4460

<id>
-Result Type

Result <id>

<id>
-Type

-
-
-

3.42.11 Conversion Instructions

-
-

Allow values of cooperative matrix type for the following conversion instructions -(if the component types are appropriate): OpConvertFToU, OpConvertFToS, -OpConvertSToF, OpConvertUToF, OpUConvert, OpSConvert, OpFConvert. -Allow the use of OpBitcast on objects of cooperative matrix type whose -Component Types are integer types with the same Width. -The result type and value type must have the same Scope, number of Rows, -number of Columns, and Use.

-
-
-

All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix.

-
-
-
-

3.42.12 Composite Instructions

-
-

Modify OpCompositeConstruct to make an exception for cooperative matrix types: -"If the Result Type is a cooperative matrix type, then there must be only one -Constituent and it is used to initialize all members. The Constituent must -be dynamically uniform within the Scope of the cooperative matrix type."

-
-
-
-

3.42.13 Arithmetic Instructions

- ---------- - - - - - - - - - - - - - - - - -

OpCooperativeMatrixMulAddKHR
-
-Linear-algebraic matrix multiply of A by B and then component-wise -add C. The order of the operations is implementation-dependent. The -internal precision of floating-point operations is defined by the client -API. If any of the Matrix{A,B,C}SignedComponentsKHR operands are present, -elements of the corresponding matrix operands are sign-extended to the -precision of Result Type, otherwise they are zero-extended. -Integer operations used in the multiplication of A by B are -performed at the precision of the Result Type and the resulting value -will equal the low-order N bits of the correct result R, where N is the -result width and R is computed with enough precision to avoid overflow -and underflow if the SaturatingAccumulation Cooperative Matrix Operand -is not present. If the SaturatingAccumulation Cooperative Matrix Operand -is present and overflow or underflow occurs as part of calculating that -intermediate result, the result of the instruction is undefined. Integer -additions of the elements of that intermediate -result with those of C are performed at the precision of Result Type, -are exact, and are saturating if the SaturatingAccumulation -Cooperative Matrix Operand is present, with the signedness of the saturation -being that of the components of Result Type. If the SaturatingAccumulation -Cooperative Matrix Operand is not present then the resulting value will equal -the low-order N bits of the correct result R, where N is the result width and -R is computed with enough precision to avoid overflow and underflow.
-
-Result Type must be a cooperative matrix type with M rows and N columns -whose Use must be MatrixAccumulatorKHR.
-
-A is a cooperative matrix with M rows and K columns whose Use must be MatrixAKHR.
-
-B is a cooperative matrix with K rows and N columns whose Use must be MatrixBKHR.
-
-C is a cooperative matrix with M rows and N columns whose Use must be MatrixAccumulatorKHR.
-
-The values of M, N, and K must be consistent across the result and operands. -This is referred to as an MxNxK matrix multiply.
-
-A, B, C, and Result Type must have the same scope, and this defines -the scope of the operation. A, B, C, and Result Type need not -necessarily have the same component type; this is defined by the client API.
-
-If the Component Type of any matrix operand is an integer type, then its -components are treated as signed if the Matrix{A,B,C,Result}SignedComponentsKHR -Cooperative Matrix Operand is present and are treated as unsigned otherwise.
-
-Cooperative Matrix Operands is an optional Cooperative Matrix Operand literal. If -not present, it is the same as specifying the Cooperative Matrix Operand None.
-
-All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix. -

Capability:
-CooperativeMatrixKHR

6+variable

4459

<id>
-Result Type

Result <id>

<id>
-A

<id>
-B

<id>
-C

Optional
-Cooperative Matrix Operands
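The difference between the wrapping and saturating forms of the final integer addition can be sketched for a single element. The helper names are hypothetical, and the sketch covers only the addition of the intermediate A x B element to the C element; it assumes the intermediate product itself did not overflow, since that case is undefined when saturation is requested.

```python
def wrap(value: int, bits: int, signed: bool) -> int:
    """Keep the low-order `bits` bits of an exact result."""
    value &= (1 << bits) - 1
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def saturate(value: int, bits: int, signed: bool) -> int:
    """Clamp an exact result to the range of the result component type."""
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    return max(low, min(high, value))


def accumulate(product: int, c: int, bits: int, signed: bool, saturating: bool) -> int:
    """Add one element of the A x B intermediate result to one element of C."""
    exact = product + c  # Python integers are unbounded, so this is exact
    return saturate(exact, bits, signed) if saturating else wrap(exact, bits, signed)
```

With an 8-bit signed result type, adding 100 to 100 yields 127 when saturating and -56 when wrapping.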

-
-

Allow cooperative matrix types for the following arithmetic instructions:

-
-
-
    -
  • -

    OpSNegate and OpFNegate

    -
  • -
  • -

    OpIAdd, OpFAdd, OpISub, OpFSub, OpFMul, OpIMul, -OpFDiv, OpSDiv, and OpUDiv.

    -
  • -
-
-
-

if their Component Type is appropriate:

-
-
-
    -
  • -

    OpF instructions can be used with cooperative matrix types whose -Component Type is a floating-point type.

    -
  • -
  • -

    OpI, OpS, and OpU instructions can be used with cooperative -matrix types whose Component Type is an integer type.

    -
  • -
-
-
-

Unary arithmetic instructions operate on the individual elements of the cooperative -matrices.

-
-
-

Binary arithmetic instructions operate on the individual elements of a pair -of cooperative matrices whose type must match.

-
-
-

Allow cooperative matrix types for OpMatrixTimesScalar.

-
-
-

All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix.

-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Should cooperative operations imply a fixed scope (e.g. Subgroup) or be more -flexible?

    -
    -
    -
    -

    Discussion: Some hardware (e.g. NVIDIA Volta) uses a smaller scope than the typical -Subgroup size, and it is plausible that other implementations could also want a -different scope.

    -
    -
    -

    RESOLVED: Allow a specialization constant scope.

    -
    -
    -
    -
  2. -
  3. -

    Should we have capabilities for each MxNxK matrix multiply "size" that is -supported?

    -
    -
    -
    -

    Discussion: It’s nice for validation if the shader instructions can be -validated solely based on the OpCapability instructions. But that already -breaks down for spec-constant-defined cooperative matrix types.

    -
    -
    -

    RESOLVED: Just one capability for the overall feature.

    -
    -
    -
    -
  4. -
  5. -

    Should strides be in bytes or elements?

    -
    -
    -
    -

    Discussion: Using elements helps avoid the unsupportable (or more difficult -to support) cases.

    -
    -
    -

    RESOLVED: Stride is in elements of the pointee type (which can be different -than the matrix component type).

    -
    -
    -
    -
  6. -
  7. -

    Should we allow matrices to be stored in an opaque layout in shared -memory?

    -
    -
    -
    -

    Discussion: Some implementations need opaque layouts for optimal performance.

    -
    -
    -

    RESOLVED: Load/store instructions accept a layout operand that vendors can -use to select custom layouts.

    -
    -
    -
    -
  8. -
  9. -

    Should the MemoryLayout operand be a literal constant, or a constant -instruction?

    -
    -
    -
    -

    Discussion: Constant instructions are more general, and easier for code -generation.

    -
    -
    -

    RESOLVED: Constant instruction.

    -
    -
    -
    -
  10. -
  11. -

    Should we allow OpTranspose on cooperative matrix types?

    -
    -
    -
    -

    Discussion: Most implementations are expected to support a restricted set of -sizes where the transpose of a matrix will sometimes not be a valid type; it’s -unclear if this is useful.

    -
    -
    -

    RESOLVED: Not supported in this extension.

    -
    -
    -
    -
  12. -
  13. -

    What should the Pointer operand to a cooperative Load/Store be?

    -
    -
    -
    -

    Discussion: The spec currently chooses to have the Pointer parameter point at -the first element of the matrix in memory, and this pointer is assumed to be -in the middle of an array. Another option would be to have the Pointer -parameter be a pointer to the whole array, and have an additional "Element" -parameter to the instructions, which indicates where the matrix starts in the -array.

    -
    -
    -

    The alternative option’s main benefit is that you don’t end up with a pointer -parameter being used to access something it does not point to. However, it -effectively splits out the last element of the access chain into the -load/store instruction, which is kind of weird. And in the first option, the -pointer to the array is still there implicitly in the access chain.

    -
    -
    -

    RESOLVED: Pointer points to the first element of the array.

    -
    -
    -
    -
  14. -
  15. -

    Should we allow the Pointer type and matrix component type to mismatch?

    -
    -
    -
    -

    RESOLVED: Yes, this makes it easier to efficiently load matrix data into -shared memory, which can be declared to use a larger type (e.g. uvec4). The -Stride parameter is interpreted in units of the pointed-to type, not in -units of the matrix’s component type.

    -
    -
    -
    -
  16. -
  17. -

    Should we make it possible to use OpMatrixTimesScalar with OpenCL?

    -
    -
    -
    -

    RESOLVED: No, this instruction is not generally supported in OpenCL -environments and the same can be achieved either via an elementwise -multiplication with a cooperative matrix object created from the scalar -using OpConstantComposite or by iterating over the elements of the -cooperative matrix to multiply each element by the scalar.

    -
    -
    -
    -
  18. -
  19. -

    Both the Stride and Memory Operand operands to OpCooperativeMatrixLoadKHR -and OpCooperativeMatrixStoreKHR are optional. Can Memory Operand be provided -alone?

    -
    -
    -
    -

    RESOLVED: No, in line with core SPIR-V rules, all optional operands that -appear before a given optional operand must be provided for it to be possible to -provide that given optional operand. Only the following combinations are valid:

    -
    -
    -
      -
    • -

      None of the optional operands are present.

      -
    • -
    • -

      Stride alone is present.

      -
    • -
    • -

      Stride and Memory Operand are both present.
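The three valid combinations listed above reduce to a single condition: Memory Operand may appear only if Stride does. A minimal sketch, with a hypothetical function name:

```python
def optional_operands_valid(has_stride: bool, has_memory_operand: bool) -> bool:
    """An optional operand may be present only if every optional operand
    before it is also present, so Memory Operand requires Stride."""
    return has_stride or not has_memory_operand
```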

      -
    • -
    -
    -
    -
    -
  20. -
  21. -

    Can elements of cooperative matrix objects treated as composites be -accessed in non-uniform control flow?

    -
    -
    -
    -

    RESOLVED: Yes, control flow uniformity requirements apply to instructions -whose operands are cooperative matrix objects but not pointers to cooperative -matrix objects. Dereferencing a pointer to an element of a cooperative matrix -object can be done in non-uniform control flow.

    -
    -
    -
    -
  22. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

6

2024-08-07

Jeff Bolz

Clarify sign/zero-extension behavior

5

2023-12-06

Kevin Petit

Clarifications, mostly of uniformity rules

4

2023-07-26

Kevin Petit

Add KHR suffixes to Cooperative Matrix Operands

3

2023-05-03

Kevin Petit

Initial revision of SPV_KHR_cooperative_matrix

2

2019-07-12

Jeff Bolz

Added details for integer operations

1

2019-01-30

Jeff Bolz

Initial revision of SPV_NV_cooperative_matrix

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_cooperative_matrix.html + + +

extensions/KHR/SPV_KHR_cooperative_matrix.html

+ + diff --git a/extensions/KHR/SPV_KHR_device_group.html b/extensions/KHR/SPV_KHR_device_group.html index 8de9ddc..cdd911b 100644 --- a/extensions/KHR/SPV_KHR_device_group.html +++ b/extensions/KHR/SPV_KHR_device_group.html @@ -1,354 +1,12 @@ - - - - - - - -SPV_KHR_device_group - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_device_group

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ashwin Kolhe, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2017-01-11

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-02-24

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2016-12-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new capability to support the Vulkan -VK_KHX_device_group and the VK_KHX_device_group_creation extensions -in SPIR-V. It provides functionality to use a logical device -that consists of multiple physical devices, as created with -the VK_KHX_device_group_creation extension.

-
-
-

The new DeviceGroup capability allows the DeviceIndex builtin -variable to be exported from all shader stages, which represents -the index of the logical device.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_device_group"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
DeviceGroup
-
-
-
-
-
-

New Builtins

-
-
-

A new builtin is added as an input for all shader stages.

-
-
-
-
DeviceIndex
-
-
-
-

Input device index of the logical device consisting of multiple -physical devices.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

DeviceGroup

4437

DeviceIndex

4438

-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add a new row to the Builtin table)

-
- ----- - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

4438

DeviceIndex
-Input device index of the logical device. See VK_KHX_device_group for more details.

DeviceGroup

-
-
-
-
(Modify Section 3.31, Capability, adding new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

4437

DeviceGroup

SPV_KHR_device_group

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_device_group"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Should the DeviceGroup have a dependency on Shader?

    -
    -
    -
    -

    RESOLVED: -SPIR WG 2016-12-21: No. It seems that this could be useful in Kernels -in a future extension, so we won’t limit it to Shaders. -The semantics are defined by the corresponding API extension so there -should be no conflicts.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2016-12-12

Ashwin Kolhe

Initial draft

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_device_group.html + + +

extensions/KHR/SPV_KHR_device_group.html

+ + diff --git a/extensions/KHR/SPV_KHR_expect_assume.html b/extensions/KHR/SPV_KHR_expect_assume.html index 4fc9385..d498bf9 100644 --- a/extensions/KHR/SPV_KHR_expect_assume.html +++ b/extensions/KHR/SPV_KHR_expect_assume.html @@ -1,324 +1,12 @@ - - - - - - - -SPV_KHR_expect_assume - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_expect_assume

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Alexey Sachkov, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Stuart Brady, ARM

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2020-06-24

    -
  • -
  • -

    Ratified by the Khronos Group: 2020-08-14

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-06-03

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds OpAssumeTrueKHR and OpExpectKHR instructions that -provide additional information to a compiler, similar to the llvm.assume and -llvm.expect intrinsics. -SPIR-V consumers may use this information to generate more efficient code.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_KHR_expect_assume"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5629

ExpectAssumeKHR
-Uses the OpAssumeTrueKHR or OpExpectKHR instructions.

-
-
-
-
-

Instructions

-
-

Modify section 3.32.1, "Miscellaneous Instructions", adding to the end of the list of instructions:

-
- ----- - - - - - - - - - - - -

OpAssumeTrueKHR

-

The instruction allows the optimizer to assume that Condition is true -whenever the control flow reaches the instruction. -Behavior is undefined if Condition is false instead.

-

Condition must be a scalar of Boolean type.

Capability: -ExpectAssumeKHR

2

5630

<id>
-Condition

- -------- - - - - - - - - - - - - - - -

OpExpectKHR

-

This instruction behaves the same as OpCopyObject by making a copy of Value, -except it also provides information to the optimizer that the most probable -value of Value is ExpectedValue.

-

Result Type must be a scalar or vector of integer type or Boolean type.

-

The type of Value and ExpectedValue must be the same as Result Type.

Capability: -ExpectAssumeKHR

5

5631

<id>
-Result Type

Result <id>

<id>
-Value

<id>
-ExpectedValue

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    What should this extension be called?

    -
    -
    -
    -

    RESOLVED: -The name of the extension will be SPV_KHR_expect_assume.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-06-03

Ben Ashbaugh

Initial KHR extension.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_expect_assume.html + + +

extensions/KHR/SPV_KHR_expect_assume.html

+ + diff --git a/extensions/KHR/SPV_KHR_float_controls.html b/extensions/KHR/SPV_KHR_float_controls.html index 9ad80e9..3930f2e 100755 --- a/extensions/KHR/SPV_KHR_float_controls.html +++ b/extensions/KHR/SPV_KHR_float_controls.html @@ -1,446 +1,12 @@ - - - - - - - -SPV_KHR_float_controls - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_float_controls

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Alexander Galazin, Arm

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016-2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2018-09-05

    -
  • -
  • -

    Ratified by Khronos: 2018-10-26

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-09-06

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides new execution modes to control floating-point -computations by overriding an implementation’s default behavior for -rounding modes, denormals, signed zero, and infinities.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_float_controls"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
Modify Section 3.6, Execution Mode, adding the following rows to the Execution Mode table:
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution modeEnabling CapabilitiesExtra Operands

4459

DenormPreserve
-Any denormalized value input into a shader or potentially generated by -any instruction in a shader must be preserved. -Denormalized values obtained via unpacking an integer into a vector of -values with smaller bit width and interpreting those values as -floating-point numbers must be preserved. -Only affects instructions operating on -a scalar or a composite type derived from a scalar floating-point type -defined by the extra operand of OpExecutionMode.

DenormPreserve

Literal Number
-Scalar Floating-Point Type Bit Width

4460

DenormFlushToZero
-Any denormalized value input into a shader or potentially generated by -any instruction in a shader must be flushed to zero. -Denormalized values obtained via unpacking an integer into a vector of -values with smaller bit width and interpreting those values as -floating-point numbers must be flushed. -Only affects instructions operating on -a scalar or a composite type derived from a scalar floating-point type -defined by the extra operand of OpExecutionMode.

DenormFlushToZero

Literal Number
-Scalar Floating-Point Type Bit Width

4461

SignedZeroInfNanPreserve
-The implementation must not perform optimizations on floating-point instructions -that do not preserve sign of a zero, or assume that operands and results are -not NaNs or infinities. -Bit patterns for NaNs might not be preserved. -Only affects instructions operating on -a scalar or a composite type derived from a scalar floating-point type -defined by the extra operand of OpExecutionMode.

SignedZeroInfNanPreserve

Literal Number
-Scalar Floating-Point Type Bit Width

4462

RoundingModeRTE
-The default rounding mode for floating-point arithmetic and conversion instructions -must be round-to-nearest-even. -Only affects instructions operating on -a scalar or a composite type derived from a scalar floating-point type -defined by the extra operand of OpExecutionMode. -If an instruction is decorated with FPRoundingMode or defines a rounding mode in its description, -that rounding mode is applied and RoundingModeRTE is ignored.

RoundingModeRTE

Literal Number
-Scalar Floating-Point Type Bit Width

4463

RoundingModeRTZ
-The default rounding mode for floating-point arithmetic and conversion instructions -must be round-towards-zero. -Only affects instructions operating on -a scalar or a composite type derived from a scalar floating-point type -defined by the extra operand of OpExecutionMode. -If an instruction is decorated with FPRoundingMode or defines a rounding mode in its description, -that rounding mode is applied and RoundingModeRTZ is ignored.

RoundingModeRTZ

Literal Number
-Scalar Floating-Point Type Bit Width

-
-
-
-
Modify Section 3.31, Capability, adding the following rows to the Capability table:
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly DeclaresEnabled by Extension

4464

DenormPreserve
-Uses the DenormPreserve Execution Mode

SPV_KHR_float_controls

4465

DenormFlushToZero
-Uses the DenormFlushToZero Execution Mode

SPV_KHR_float_controls

4466

SignedZeroInfNanPreserve
-Uses the SignedZeroInfNanPreserve Execution Mode

SPV_KHR_float_controls

4467

RoundingModeRTE
-Uses the RoundingModeRTE Execution Mode

SPV_KHR_float_controls

4468

RoundingModeRTZ
-Uses the RoundingModeRTZ Execution Mode

SPV_KHR_float_controls

-
-
-
-
Modify Section 2.16.1. Universal Validation Rules, adding the following sub-items to the "Entry point and execution model" item:
-
-
-
-
-
    -
  • -

    An OpEntryPoint can set at most one of the DenormFlushToZero, DenormPreserve -Execution Modes for the given Floating-Point Type Bit Width.

    -
  • -
  • -

    An OpEntryPoint can set at most one of the RoundingModeRTE, RoundingModeRTZ -Execution Modes for the given Floating-Point Type Bit Width.

    -
  • -
-
-
-
-
-
-
-
-
-
-
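The two exclusivity rules above can be sketched as a small checker. This is only an illustration, not part of the extension or of any validator: the function name `find_conflicts` and the flat list of `(mode_name, bit_width)` pairs are hypothetical, and a real validator would work on the `OpExecutionMode` instructions of each entry point.

```python
from collections import defaultdict

# Mutually exclusive groups, taken from the two validation rules above.
EXCLUSIVE_GROUPS = (
    frozenset({"DenormFlushToZero", "DenormPreserve"}),
    frozenset({"RoundingModeRTE", "RoundingModeRTZ"}),
)


def find_conflicts(execution_modes):
    """Return (bit_width, modes) pairs that break an exclusivity rule.

    execution_modes is a hypothetical flat list of (mode_name, bit_width)
    pairs collected for a single entry point.
    """
    by_width = defaultdict(set)
    for mode, width in execution_modes:
        by_width[width].add(mode)

    conflicts = []
    for width, modes in sorted(by_width.items()):
        for group in EXCLUSIVE_GROUPS:
            clash = modes & group
            if len(clash) > 1:
                conflicts.append((width, tuple(sorted(clash))))
    return conflicts
```

Modes for different bit widths never conflict in this sketch, which matches the "for the given Floating-Point Type Bit Width" wording of the rules.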

Interactions with the FP Rounding Mode

-
-
-

The RoundingModeRTE and the RoundingModeRTZ are applied globally per entry point. -The FPRoundingMode is applied on per-instruction basis, which takes precedence -over the RoundingModeRTE and the RoundingModeRTZ execution modes.

-
-
-
-
-

Interactions with the FP Fast Math Mode

-
-
-

The FPFastMathMode decoration is limited to the Kernel capability. -There are no interactions with this extension.

-
-
-
-
-

Default execution modes

-
-
-

Default execution modes are expected to be documented by client APIs.

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Do we need DenormUnspecified, SignedZeroInfNanIgnore, RoundingModeUnspecified?

    -
    -
    -
    -

    RESOLVED: No. These are assumed to be the default execution modes. -This must be explicitly stated in client API execution environment specifications.

    -
    -
    -
    -
  2. -
  3. -

    Why not reuse FPRoundingMode?

    -
    -
    -
    -

    RESOLVED: We would like to have an ability to specify the rounding mode on the global scope -and also for a larger set of instructions than the FPRoundingMode decoration allows.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-09-06

Alexander Galazin

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_float_controls.html + + +

extensions/KHR/SPV_KHR_float_controls.html

+ + diff --git a/extensions/KHR/SPV_KHR_float_controls2.html b/extensions/KHR/SPV_KHR_float_controls2.html index f32d22b..442ffdd 100644 --- a/extensions/KHR/SPV_KHR_float_controls2.html +++ b/extensions/KHR/SPV_KHR_float_controls2.html @@ -1,630 +1,12 @@ - - - - - - - -SPV_KHR_float_controls2 - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_float_controls2

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2021 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2023-12-06

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-01-19

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-04-03

Revision

10

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 1, Unified

-
-
-

This extension requires SPIR-V 1.2.

-
-
-

This extension interacts trivially with SPV_KHR_float_controls (which became core in SPIR-V 1.4).

-
-
-

This extension supersedes SPV_INTEL_fp_fast_math_mode.

-
-
-
-
-

Overview

-
-
-

This extension provides a single mechanism for specifying the floating-point -environment which can be used on whole modules and individual instructions. -It is designed to supersede the various existing ways of specifying different -modifications to the floating-point semantics.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_float_controls2"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Universal Validation Rules

-
-

In section 2.16.1 "Universal Validation Rules" add the rules:

-
-
-
    -
  • -

    It is not valid for an entry point using the FPFastMathDefault execution mode to:

    -
    -
      -
    • -

      Use the execution modes ContractionOff or SignedZeroInfNanPreserve.

      -
    • -
    • -

      Contain any instructions decorated with NoContraction.

      -
    • -
    • -

      Contain any FP Fast Math Mode bitmask containing Fast.

      -
    • -
    -
    -
  • -
  • -

    It is not valid for any instruction to be decorated with both NoContraction -and FPFastMathMode.

    -
  • -
  • -

    Any FP Fast Math Mode bitmask that includes the AllowTransform bit must also -include the AllowContract and AllowReassoc bits.

    -
  • -
-
-
-
-

Execution Mode

-
-

In section 3.6 "Execution Mode" add the following row to the Execution Mode table:

-
- ------- - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

6028

FPFastMathDefault
-Set the default fast math flags for instructions not themselves decorated with -FPFastMathMode. This only affects instructions operating on or resulting in a -type that is Target Type or an OpTypeMatrix or OpTypeVector derived from it. Target -Type must be a scalar, floating-point type. Fast-Math Mode must be the <id> -of a constant instruction of 32-bit integer type -containing a valid FP Fast Math Mode bitmask. -Fast-Math Mode must not be a specialization-constant instruction. -May be applied at most once per Target Type to any execution mode.

<id>
-Target Type

<id>
-Fast-Math Mode

FloatControls2

-
-
-

FP Fast Math Mode

-
-

In section 3.15 "FP Fast Math Mode", following "Enables fast math operations -which are otherwise unsafe", add:

-
-
-

If an operation is decorated with FPFastMathMode then the flags from that -decoration apply. Otherwise, if the current entry point sets any -FPFastMathDefault execution mode then all flags specified for any operand -type or for the result type of the operation apply. If the operation is not -decorated with FPFastMathMode and the entry point sets no -FPFastMathDefault execution modes then the flags to be applied are determined -by the client API and not by SPIR-V.

-
-
- - - - - -
- - -
Note
-
-

This definition implies that, if the entry point sets any FPFastMathDefault -execution mode then any type for which a default is not set uses no fast math -flags (although this can still be overridden on a per-operation basis). Modules -must not mix setting fast math modes explicitly using this extension and -relying on older API defaults.

-
-
-
-
-

Replace the text following "Only valid on …​" with:

-
-
-
    -
  • -

    All core instructions which use any floating-point type for either operands or result.

    -
  • -
  • -

    OpExtInst extended instructions, where expressly permitted by the extended -instruction set in use.

    -
  • -
-
-
-

Add the text:

-
-
-

Expressions decorated with AllowContract, AllowReassoc, or AllowTransform -may be rearranged using the appropriate mathematical properties even though this -may cause a change in the floating-point results and may involve a different -number of rounding steps than would otherwise occur. Where these operations are -not also decorated with NotInf and NotNaN then these values must be -considered in the results of the transformed expressions, but they do not -change which rearrangements are valid.

-
-
- - - - - -
- - -
Note
-
-

For example, if the expression a + b + (-a) is decorated AllowReassoc then -it may be implemented as b. This is valid whether or not it is also decorated -NotInf even though the original expression may overflow to infinity when -evaluated in floating-point.

-
-
-

If the expression a + a + (-a) is not decorated AllowReassoc then it -cannot, in general, be rearranged. However, in this case, if it is decorated -with NotInf then it may be implemented as a since the replacement is exact -for all values that do not overflow to infinity and the value is undefined if -one of the operands is infinity. If the expression is not decorated with either -AllowReassoc or NotInf then the result must be infinity for sufficiently -large but finite values of a.

-
-
-
-
-
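The two cases in the note above can be reproduced numerically. The following is a minimal sketch using Python's 64-bit floats as a stand-in for shader arithmetic; the helper `left_to_right` and the chosen values are illustrative assumptions, not anything defined by the extension.

```python
import math


def left_to_right(x, y, z):
    """Evaluate (x + y) + z with one rounding step per addition."""
    return (x + y) + z


# Case 1: a + b + (-a). Evaluated strictly, b is lost when a + b is rounded;
# a reassociated form may simply return b.
a, b = 1e16, 1.0
strict = left_to_right(a, b, -a)
reassociated = b

# Case 2: a + a + (-a) for a large but finite a. Evaluated strictly, a + a
# overflows to infinity, so the final result is infinity rather than a.
big = 1.5e308
overflowed = left_to_right(big, big, -big)
simplified = big
```

Here `strict` and `reassociated` differ, and `overflowed` is infinite while `simplified` is finite, which is why the second rewrite is only permitted when the expression is decorated with AllowReassoc or NotInf.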

Add the following rows to the FP Fast Math Mode table:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
FP Fast Math ModeEnabling Capabilities

0x10000

AllowContract
-Allows a floating-point operation to be contracted with any operation(s) -producing its operands. Rounding steps may be eliminated or may preserve higher -bit-depth than the specified types. The instructions producing the operands do -not need to be decorated to allow this transformation.

FloatControls2

0x20000

AllowReassoc
-Allows a floating-point operation to be reordered with any operation(s) -producing its operands according to real-number associativity rules. The -instructions producing the operands do not need to be decorated to allow this -transformation.

FloatControls2

0x40000

AllowTransform
-Allows a floating-point operation to be transformed with any operation(s) -producing its operands according to real-number rules. This is a superset of -AllowContract and AllowReassoc and those bits must be set whenever this bit -is set. The instructions producing the operands do not need to be decorated to -allow this transformation, but note that non-trivial transformations may -require multiple instructions to be decorated.

FloatControls2

-
-
-
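The bit values in the table above, together with the rule that AllowTransform requires AllowContract and AllowReassoc, can be sketched as a mask check. The constant and function names below are hypothetical; only the three bit values come from the table.

```python
# Bit values from the FP Fast Math Mode rows added by this extension.
ALLOW_CONTRACT = 0x10000
ALLOW_REASSOC = 0x20000
ALLOW_TRANSFORM = 0x40000


def transform_bits_consistent(mask):
    """Return False only if AllowTransform is set without both other bits."""
    if mask & ALLOW_TRANSFORM:
        required = ALLOW_CONTRACT | ALLOW_REASSOC
        return (mask & required) == required
    return True
```

For instance, `0x70000` passes the check while `0x40000` and `0x50000` do not. The sketch deliberately ignores all other bits in the mask.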

Decoration

-
-

In section 3.20 "Decoration" modify row 40 of the Decoration table to add the enabling capability -FloatControls2:

-
- ------- - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

40

FPFastMathMode
-Indicates a floating-point fast math flag.

FP Fast Math Mode
-Fast-Math Mode

Kernel, FloatControls2

-
-
-

Capability

-
-

In section 3.31 "Capability" add the following row to the capability table:

-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6029

FloatControls2
-Uses FPFastMathDefault execution mode or uses FPFastMathMode decoration (unless enabled with the Kernel capability).

-
-
-
-
-

Modifications to the GLSL.std.450 Extended Instruction Set

-
-
-

Introduction

-
-

Following the introduction, add "For environments that allow use of -FPFastMathMode decorations on OpExtInst instructions, FPFastMathMode -decorations may be applied to any instruction which uses any floating-point -type for either operands or result".

-
-
-
-
-
-

Deprecation

-
-
-

This extension deprecates the following features:

-
-
-
    -
  • -

    The execution modes ContractionOff and SignedZeroInfNanPreserve. Use -FPFastMathDefault with the appropriate flags instead.

    -
  • -
  • -

    The decoration NoContraction. Use the FPFastMathMode decoration instead.

    -
  • -
  • -

    The FPFastMathMode mode bit Fast. Set all the other FPFastMathMode bits instead.

    -
  • -
  • -

    Enabling the FPFastMathMode decoration using the Kernel capability. All uses should -declare the FloatControls2 capability.

    -
  • -
  • -

    The OpenCL.std instructions fmin_common, fmax_common. Use fmin, fmax with -NotInf and NotNaN instead.

    -
  • -
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this interact with SPV_INTEL_fp_fast_math_mode?

    -
    -
    -
    -

    RESOLVED: It supersedes it. This extension contains a superset of the functionality and is expected to be supported on a wider range of implementations, but applications targeting only Intel platforms can continue to use the older extension.

    -
    -
    -
    -
  2. -
  3. -

    Which operations must be decorated with AllowContract or AllowReassoc to allow the optimisation?

    -
    -
    -
    -

    RESOLVED: Only the operation consuming a value must be decorated to permit the contraction or reassociation. -This is useful when mixing precise and imprecise operations (the imprecise ones are still permitted to use the -faster, contracted computation). Optimisers (and consumers) must ensure that following any transformation, no -operation is affected by any FastMath flag where it was not affected in the input program.

    -
    -
    -
    -
  4. -
  5. -

    Are there any other fast-math flags that should be added here?

    -
    -
    -
    -

    RESOLVED: Not at the moment. The current set covers all gcc and LLVM relaxed -modes except for gcc’s sign-dependent-rounding and LLVM’s afn (approximate -function). Most SPIR-V consumers do not support rounding that is sign-dependent -so that flag is unlikely to be significant. It is envisaged that something like -afn will be added in a future extension but the accuracy of builtin functions -is outside the scope of this extension.

    -
    -
    -
    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-09-15

Graeme Leese

Initial KHR extension.

2

2021-09-24

Graeme Leese

Updated following review.

3

2022-04-06

Graeme Leese

Updated following review.

4

2023-04-26

Graeme Leese

Clarify which operations must be decorated.

5

2023-05-09

Graeme Leese

Resolve issues.

6

2023-05-17

Graeme Leese

Clarify interaction of transforms with inf/nan.

7

2023-06-08

Graeme Leese

Update deprecations, fix defaults to use IDs.

8

2023-10-02

Graeme Leese

Update required SPIR-V version, clarify deprecation of fast.

9

2024-03-15

Graeme Leese

Clarify rules for modules declaring no FPFastMathDefault.

10

2024-04-03

Graeme Leese

Fix some spellings and clarify multiple use of execution mode.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_float_controls2.html + + +

extensions/KHR/SPV_KHR_float_controls2.html

+ + diff --git a/extensions/KHR/SPV_KHR_fragment_shader_barycentric.html b/extensions/KHR/SPV_KHR_fragment_shader_barycentric.html index ce0b2bf..c1d8694 100644 --- a/extensions/KHR/SPV_KHR_fragment_shader_barycentric.html +++ b/extensions/KHR/SPV_KHR_fragment_shader_barycentric.html @@ -1,356 +1,12 @@ - - - - - - - -SPV_KHR_fragment_shader_barycentric - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_fragment_shader_barycentric

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Stu Smith, AMD

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Jan-Harald Fredriksen, Arm

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Pat Brown, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-06-24

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL -GL_EXT_fragment_shader_barycentric extension which provides -fragment shaders with access to barycentric weights vectors and -enables fragment inputs to read the raw per-vertex outputs from -the last vertex processing stage.

-
-
-

The extension adds the following functionality under the new -FragmentBarycentricKHR capability:

-
-
-
    -
  • -

    adds the PerVertexKHR decoration for fragment shader input variables

    -
  • -
  • -

    adds BaryCoordKHR and BaryCoordNoPerspKHR builtins in fragment -shaders

    -
  • -
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_fragment_shader_barycentric"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-
-
(Modify Section 3.20, Decoration, add a new row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5285

PerVertexKHR
-Must only be used on a memory object declaration or a member of a structure type. -No interpolation. Values are accessed by vertex number in the fragment input. -Only valid for the Input Storage Class.

FragmentBarycentricKHR

-
-
-
-
(Modify Section 3.21, BuiltIn, add two new rows to the Builtin table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

5286

BaryCoordKHR
-Input barycentric coordinates in the Fragment Execution Model. -These values are perspective-corrected versions of the barycentric weights. -See the Vulkan API specification for more detail.

FragmentBarycentricKHR

5287

BaryCoordNoPerspKHR
-Input barycentric coordinates in the Fragment Execution Model. -These values vary linearly in screenspace. -See the Vulkan API specification for more detail.

FragmentBarycentricKHR

-
-
-
-
(Modify Section 3.31, Capability, add a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5284

FragmentBarycentricKHR

Shader

SPV_KHR_fragment_shader_barycentric

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_fragment_shader_barycentric"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension relate to the similar functionality in -SPV_NV_fragment_shader_barycentric?

    -
    -
    -
    -

    RESOLVED: This extension provides identical functionality to that -of SPV_NV_fragment_shader_barycentric, with the decoration, -builtins, and capability being aliases.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-06-24

Stu Smith

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_fragment_shader_barycentric.html + + +

extensions/KHR/SPV_KHR_fragment_shader_barycentric.html

+ + diff --git a/extensions/KHR/SPV_KHR_fragment_shading_rate.html b/extensions/KHR/SPV_KHR_fragment_shading_rate.html index 3906f9d..d444703 100644 --- a/extensions/KHR/SPV_KHR_fragment_shading_rate.html +++ b/extensions/KHR/SPV_KHR_fragment_shading_rate.html @@ -1,395 +1,12 @@ - - - - - - - -SPV_KHR_fragment_shading_rate - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_fragment_shading_rate

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Pat Brown, NVIDIA

    -
  • -
  • -

    Matthew Netsch, Qualcomm

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-04-24

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 3, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_NV_mesh_shader.

-
-
-
-
-

Overview

-
-
-

This extension adds support for setting the fragment shading rate for a -primitive in vertex, geometry, and mesh shading stages, and querying the -shading rate in fragment shaders.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_fragment_shading_rate"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
FragmentShadingRateKHR
-
-
-
-
-
-

New Built Ins

-
-
-

This extension introduces the following new built-in values:

-
-
-
-
ShadingRateKHR
-PrimitiveShadingRateKHR
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add new rows to the Builtin table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

4432

PrimitiveShadingRateKHR
-Output primitive fragment shading rate. -Only valid in the Vertex, Geometry, and MeshNV Execution Models. -See the API specification for more detail.

FragmentShadingRateKHR

SPV_KHR_fragment_shading_rate

4444

ShadingRateKHR
-Input fragment shading rate for the current shader -invocation. -Only valid in the Fragment Execution Model. -See the API specification for more detail.

FragmentShadingRateKHR

SPV_KHR_fragment_shading_rate

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

4422

FragmentShadingRateKHR
-Uses the PrimitiveShadingRateKHR or ShadingRateKHR Builtins.

Shader

-
-
-
-
(Add a new sub-section 3.FSR, Fragment Shading Rates)
-
-
-
-
-

3.FSR, Fragment Shading Rates

-
-
-

Fragment shading rate flag values, determining how many pixels are covered -by a fragment shader invocation.

-
-
-

Valid rate combinations must not include more than 1 horizontal and 1 -vertical rate. -If no horizontal rate flags are set, it indicates a fragment shader covers one -pixel horizontally. -If no vertical rate flags are set, it indicates a fragment shader covers one -pixel vertically.

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Fragment Shading Rate FlagsEnabling Capabilities

1

Vertical2Pixels
-Fragment invocation covers 2 pixels vertically.

FragmentShadingRateKHR

2

Vertical4Pixels
-Fragment invocation covers 4 pixels vertically.

FragmentShadingRateKHR

4

Horizontal2Pixels
-Fragment invocation covers 2 pixels horizontally.

FragmentShadingRateKHR

8

Horizontal4Pixels
-Fragment invocation covers 4 pixels horizontally.

FragmentShadingRateKHR

-
-
-
-
-
-
-
-
-
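The flag table and the "at most one horizontal and one vertical rate" rule above can be sketched as a decoder. The function name `fragment_size` is hypothetical and only the four flag values come from the table. Ignoring any other bits is an assumption of this sketch.

```python
# Flag values from the Fragment Shading Rate Flags table.
VERTICAL_2_PIXELS = 1
VERTICAL_4_PIXELS = 2
HORIZONTAL_2_PIXELS = 4
HORIZONTAL_4_PIXELS = 8


def fragment_size(flags):
    """Return (width, height) in pixels covered by one fragment invocation.

    Raises ValueError when more than one horizontal or more than one
    vertical rate flag is set, since such combinations are not valid.
    """
    if flags & HORIZONTAL_2_PIXELS and flags & HORIZONTAL_4_PIXELS:
        raise ValueError("more than one horizontal rate flag set")
    if flags & VERTICAL_2_PIXELS and flags & VERTICAL_4_PIXELS:
        raise ValueError("more than one vertical rate flag set")

    # No flag set on an axis means one pixel along that axis.
    width = 4 if flags & HORIZONTAL_4_PIXELS else 2 if flags & HORIZONTAL_2_PIXELS else 1
    height = 4 if flags & VERTICAL_4_PIXELS else 2 if flags & VERTICAL_2_PIXELS else 1
    return (width, height)
```

A value of 0 decodes to a 1x1 fragment, and `HORIZONTAL_2_PIXELS | VERTICAL_4_PIXELS` decodes to a fragment 2 pixels wide and 4 pixels tall.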

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_fragment_shading_rate"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension compare to SPV_NV_shading_rate and -SPV_EXT_fragment_invocation_density?

    -
    -
    -
    -

    RESOLVED: This extension uses a different (enum based) scheme for shading -rates, and provides a way to set a rate in Vertex and Geometry Execution -Models, as well as the MeshNV Execution Model when supported.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-04-24

Tobias Hector

Initial draft

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_fragment_shading_rate.html + + +

extensions/KHR/SPV_KHR_fragment_shading_rate.html

+ + diff --git a/extensions/KHR/SPV_KHR_integer_dot_product.html b/extensions/KHR/SPV_KHR_integer_dot_product.html index 3828533..78e7c85 100644 --- a/extensions/KHR/SPV_KHR_integer_dot_product.html +++ b/extensions/KHR/SPV_KHR_integer_dot_product.html @@ -1,786 +1,12 @@ - - - - - - - -SPV_KHR_integer_dot_product - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_integer_dot_product

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kévin Petit, Arm Ltd.

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Robert Quill, Imagination Technologies

    -
  • -
  • -

    Jeff Bolz, Nvidia

    -
  • -
  • -

    Raun Krisch, Samsung

    -
  • -
  • -

    Simon Waters, Samsung

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Nicolai Hähnle, AMD

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Stuart Brady, Arm Ltd.

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2020-05-20

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2020-07-17

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-09-08

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension introduces support for dot product operations on integer vectors -with optional accumulation. The specific types accepted as inputs are -constrained by capabilities of which this extension introduces three:

-
-
-
    -
  • -

    Generic support for all input vector types

    -
  • -
  • -

    Support 4-component vectors of 8 bit integers (several implementers -only want to support this case)

    -
  • -
  • -

    Support 4-component vectors of 8 bit integers packed into 32-bit integers -(for devices that don’t have generic Int8 support)

    -
  • -
-
-
-

This extension introduces two groups of three instructions each. Instructions of -one of the groups perform simple dot product operations on input vectors with -signed, unsigned or mixed-signedness (one signed, one unsigned) components. -Instructions of the other group also perform a saturating addition of the -dot product result with an accumulator they accept as an operand.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_integer_dot_product"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table -(these capabilities enable specific input types):

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly declares

6019

DotProductKHR
-Uses dot product instructions

6016

DotProductInputAllKHR
-Uses vector of any integer type as input to the dot product instructions

6017

DotProductInput4x8BitKHR
-Uses vectors of four components of 8-bit integer type as inputs to the dot product instructions

Int8

6018

DotProductInput4x8BitPackedKHR
-Uses 32-bit integer scalars packing 4-component vectors of 8-bit integers as inputs to the dot product instructions

-
-
-
-
-

Packed Vector Format

-
-

New section under 3. Binary Form.

-
-
-

Specify how to interpret scalar integers as vectors.

-
-
-
- ----- - - - - - - - - - - - - - -
Packed Vector FormatEnabling Capabilities

0x0

PackedVectorFormat4x8BitKHR
-Interpret 32-bit scalar integer operands as vectors of four 8-bit components. Vector components follow byte significance order with the lowest-numbered component stored in the least significant byte.

-
-
-
-
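As an illustration of the byte significance order described for PackedVectorFormat4x8BitKHR, the following sketch unpacks a 32-bit integer into four 8-bit components. The helper name `unpack_4x8` and its `signed` parameter are hypothetical and exist only for this example.

```python
def unpack_4x8(word, signed):
    """Split a 32-bit integer into four 8-bit components.

    Component 0 comes from the least significant byte. `signed` selects
    whether each byte is read as a two's-complement signed value.
    """
    components = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        if signed and byte >= 0x80:
            byte -= 0x100  # reinterpret as a signed 8-bit value
        components.append(byte)
    return components
```

With this layout, `unpack_4x8(0x04030201, signed=False)` yields the components in the order 1, 2, 3, 4.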
-

Instructions

-
-

Add the following new instructions:

-
- --------- - - - - - - - - - - - - - - - -

OpSDotKHR
-
-Signed integer dot product of Vector 1 and Vector 2.
-
-Result Type must be an integer type whose Width must be greater than or equal -to that of the components of Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must have the same type.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type (enabled by -DotProductInput4x8BitKHR or DotProductInputAllKHR).
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of the input vectors are sign-extended to the bit width of the -result’s type. The sign-extended input vectors are then multiplied component-wise -and all components of the vector resulting from the component-wise multiplication -are added together. The resulting value will equal the low-order N bits of the -correct result R, where N is the result width and R is computed with enough -precision to avoid overflow and underflow.

Capability:
-DotProductKHR

5+

4450

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

Optional
-Packed Vector Format
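The arithmetic described for OpSDotKHR can be modelled roughly as follows. This is a sketch under stated assumptions, not a reference implementation: the function name `sdot` is hypothetical, inputs are plain Python lists of already sign-extended integers, and the low-order N bits are reinterpreted as a signed value purely for readability.

```python
def sdot(vector1, vector2, result_width):
    """Signed dot product wrapped to `result_width` bits."""
    if len(vector1) != len(vector2):
        raise ValueError("vectors must have the same number of components")
    # Python integers are unbounded, so this sum is the exact result R.
    exact = sum(a * b for a, b in zip(vector1, vector2))
    # Keep the low-order N bits of R ...
    low_bits = exact & ((1 << result_width) - 1)
    # ... and reinterpret them as a two's-complement signed value.
    if low_bits >= 1 << (result_width - 1):
        low_bits -= 1 << result_width
    return low_bits
```

A narrow result type silently wraps: four products of 127 * 127 sum to 64516, whose low 8 bits are 4.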

- --------- - - - - - - - - - - - - - - - -

OpUDotKHR
-
-Unsigned integer dot product of Vector 1 and Vector 2.
-
-Result Type must be an integer type with Signedness of 0 whose Width -must be greater than or equal to that of the components of -Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must have the same type.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type with Signedness -of 0 (enabled by DotProductInput4x8BitKHR or DotProductInputAllKHR).
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of the input vectors are zero-extended to the bit width of the -result’s type. The zero-extended input vectors are then multiplied -component-wise and all components of the vector resulting from the component-wise -multiplication are added together. The resulting value will equal the low-order -N bits of the correct result R, where N is the result width and R is computed -with enough precision to avoid overflow and underflow.

Capability:
-DotProductKHR

5+

4451

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

Optional
-Packed Vector Format

- --------- - - - - - - - - - - - - - - - -

OpSUDotKHR
-
-Mixed-signedness integer dot product of Vector 1 and Vector 2. Components of Vector 1 are treated as signed, components of Vector 2 are treated as unsigned.
-
-Result Type must be an integer type whose Width must be greater than or equal -to that of the components of Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type with the same -number of components and same component Width (enabled by DotProductInput4x8BitKHR -or DotProductInputAllKHR). When Vector 1 and Vector 2 are vectors, the components -of Vector 2 must have a Signedness of 0.
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of Vector 1 are sign-extended to the bit width of the result’s type. -All components of Vector 2 are zero-extended to the bit width of the result’s type. -The sign- or zero-extended input vectors are then multiplied component-wise and all -components of the vector resulting from the component-wise multiplication are added -together. The resulting value will equal the low-order N bits of the correct -result R, where N is the result width and R is computed with enough precision to -avoid overflow and underflow.

Capability:
-DotProductKHR

5+

4452

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

Optional
-Packed Vector Format

- ---------- - - - - - - - - - - - - - - - - -

OpSDotAccSatKHR
-
-Signed integer dot product of Vector 1 and Vector 2 and signed saturating addition of the result with Accumulator.
-
-Result Type must be an integer type whose Width must be greater than or equal -to that of the components of Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must have the same type.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type -(enabled by DotProductInput4x8BitKHR or DotProductInputAllKHR).
-
-The type of Accumulator must be the same as Result Type.
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of the input vectors are sign-extended to the bit width of the -result’s type. The sign-extended input vectors are then multiplied component-wise -and all components of the vector resulting from the component-wise multiplication -are added together. Finally, the resulting sum is added to the input accumulator. -This final addition is saturating.
-
-If any of the multiplications or additions, with the exception of the final -accumulation, overflow or underflow, the result of the instruction is undefined.

Capability:
-DotProductKHR

6+

4453

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

<id> Accumulator

Optional
-Packed Vector Format
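The saturating accumulation in OpSDotAccSatKHR can be sketched as below. The name `sdot_acc_sat` is hypothetical, and two choices are assumptions made for this illustration only: products are summed left to right, and an intermediate overflow (which the extension leaves undefined) is reported by raising `OverflowError`.

```python
def sdot_acc_sat(vector1, vector2, accumulator, result_width):
    """Signed dot product followed by a signed saturating accumulation."""
    lo = -(1 << (result_width - 1))
    hi = (1 << (result_width - 1)) - 1
    total = 0
    for a, b in zip(vector1, vector2):
        product = a * b
        total += product
        # Everything before the final accumulation must fit the result type.
        if not (lo <= product <= hi and lo <= total <= hi):
            raise OverflowError("intermediate overflow: result is undefined")
    # Only this final addition saturates.
    return max(lo, min(hi, total + accumulator))
```

When the accumulator is already at the maximum representable value, a positive dot product leaves the result clamped at that maximum instead of wrapping.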

- ---------- - - - - - - - - - - - - - - - - -

OpUDotAccSatKHR
-
-Unsigned integer dot product of Vector 1 and Vector 2 and unsigned saturating addition of the result with Accumulator.
-
-Result Type must be an integer type with Signedness of 0 whose Width -must be greater than or equal to that of the components of -Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must have the same type.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type with Signedness -of 0 (enabled by DotProductInput4x8BitKHR or DotProductInputAllKHR).
-
-The type of Accumulator must be the same as Result Type.
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of the input vectors are zero-extended to the bit width of the -result’s type. The zero-extended input vectors are then multiplied component-wise -and all components of the vector resulting from the component-wise multiplication -are added together. Finally, the resulting sum is added to the input accumulator. -This final addition is saturating.
-
-If any of the multiplications or additions, with the exception of the final -accumulation, overflow or underflow, the result of the instruction is undefined.

Capability:
-DotProductKHR

6+

4454

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

<id> Accumulator

Optional
-Packed Vector Format

- ---------- - - - - - - - - - - - - - - - - -

OpSUDotAccSatKHR
-
-Mixed-signedness integer dot product of Vector 1 and Vector 2 and signed saturating addition of the result with Accumulator. Components of Vector 1 are treated as signed, components of Vector 2 are treated as unsigned.
-
-Result Type must be an integer type whose Width must be greater than or equal -to that of the components of Vector 1 and Vector 2.
-
-Vector 1 and Vector 2 must be either 32-bit integers (enabled by -DotProductInput4x8BitPackedKHR) or vectors of integer type with the same -number of components and same component Width (enabled by DotProductInput4x8BitKHR -or DotProductInputAllKHR). When Vector 1 and Vector 2 are vectors, the components -of Vector 2 must have a Signedness of 0.
-
-The type of Accumulator must be the same as Result Type.
-
-When Vector 1 and Vector 2 are scalar integer types, Packed Vector Format must -be specified to select how the integers are to be interpreted as vectors.
-
-All components of Vector 1 are sign-extended to the bit width of the result’s type. -All components of Vector 2 are zero-extended to the bit width of the result’s type. -The sign- or zero-extended input vectors are then multiplied component-wise and -all components of the vector resulting from the component-wise multiplication -are added together. Finally, the resulting sum is added to the input accumulator. -This final addition is saturating.
-
-If any of the multiplications or additions, with the exception of the final -accumulation, overflow or underflow, the result of the instruction is undefined.

Capability:
-DotProductKHR

6+

4455

<id> Result Type

Result <id>

<id> Vector 1

<id> Vector 2

<id> Accumulator

Optional
-Packed Vector Format

-
-
-
-
-

Interactions with type capabilities

-
-
-

Support for specific input types is enabled by various capabilities as -follows.

-
-
-

Vectors of 4 8-bit integer components packed into a 32-bit integer are enabled by DotProductInput4x8BitPackedKHR.

-
-
-

Vectors of 4 8-bit integer components are enabled by DotProductInput4x8BitKHR.

-
-
-

Vectors of any other type are enabled by DotProductInputAllKHR along with other -capabilities:

-
-
-
    -
  • -

    2-, 3- or 4-component vectors require no additional capabilities

    -
  • -
  • -

    8- or 16-component vectors require Vector16

    -
  • -
  • -

    8-bit components require Int8

    -
  • -
  • -

    16-bit components require Int16

    -
  • -
  • -

    32-bit components require no additional capabilities

    -
  • -
  • -

    64-bit components require Int64

    -
  • -
-
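The capability rules listed above can be summarised in a small lookup. This is a hedged sketch: the function name `input_capabilities` is hypothetical, it returns one sufficient set of capabilities rather than every valid alternative, and it does not include DotProductKHR, which the instructions themselves require.

```python
def input_capabilities(num_components, component_width, packed=False):
    """Return one set of capabilities that enables the given input type."""
    if packed:
        # Four 8-bit components packed into a single 32-bit integer.
        return {"DotProductInput4x8BitPackedKHR"}
    if num_components == 4 and component_width == 8:
        # DotProductInput4x8BitKHR implicitly declares Int8.
        return {"DotProductInput4x8BitKHR", "Int8"}
    caps = {"DotProductInputAllKHR"}
    if num_components in (8, 16):
        caps.add("Vector16")
    elif num_components not in (2, 3, 4):
        raise ValueError("unsupported number of components")
    # 32-bit components need no additional capability.
    width_caps = {8: "Int8", 16: "Int16", 32: None, 64: "Int64"}
    if component_width not in width_caps:
        raise ValueError("unsupported component width")
    if width_caps[component_width] is not None:
        caps.add(width_caps[component_width])
    return caps
```

For instance, a 16-component vector of 16-bit integers maps to DotProductInputAllKHR, Vector16 and Int16.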
-
-
-
-

Issues

-
-
-
    -
  1. -

    How should the signedness of operations be determined?

    -
    -
    -
    -

    RESOLVED: In line with existing instructions, the signedness of operations is -carried by instructions (OpS*, OpU* and OpSU*). Using the signedness -of operands couldn’t work at all for OpenCL where signedness isn’t part of the -types. Having three separate instructions for that purpose was deemed acceptable. -The signedness of operands is constrained to be 0 for instructions that treat their -inputs as unsigned to help with validation (as a non-zero value is very likely -to be incorrect).

    -
    -
    -
    -
  2. -
  3. -

    Should there be non-saturating accumulating instructions?

    -
    -
    -
    -

    RESOLVED: No. It is simple enough to spot the dot product followed by an -addition pattern and lower it to specific instructions in consumers that have -them. There are multiple benefits to this approach:

    -
    -
    -
      -
    • -

      Consumers that have these instructions are forced to optimise the pattern -which removes the possibility that a user might use a non-accumulating -instruction followed by an addition instead of an accumulating instruction.

      -
    • -
    • -

      Keeping the addition and dot product separate may expose additional -optimisation opportunities.

      -
    • -
    • -

      Most high-level languages already have operators for addition. This reduces -the number of new built-in functions to introduce.

      -
    • -
    -
    -
    -
    -
  4. -
  5. -

    Shouldn’t the width of the result type always be large enough to accommodate -all possible values of the input vectors?

    -
    -
    -
    -

    RESOLVED: No. This prevents implementing the instructions with lower precision -arithmetic in some cases and is not consistent with other arithmetic -instructions. Programs that need the result type to be large enough to represent -the dot product of the input vectors for all possible values of the input vectors -should choose a result type that satisfies the following constraint:

    -
    -
    -
    -
    result_width >= input_component_width * 2 + ceil(log2(input_num_components))
    -
    -
    -
    -
    -
  6. -
-
-
-
-
-
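The result width constraint given in issue 3 can be evaluated directly. A minimal sketch, assuming a hypothetical helper named `min_result_width`; it uses integer arithmetic for `ceil(log2(n))` to avoid floating-point rounding.

```python
def min_result_width(input_component_width, input_num_components):
    """Smallest result width that can hold every possible dot product.

    Implements: input_component_width * 2 + ceil(log2(input_num_components)).
    """
    if input_num_components < 1:
        raise ValueError("a vector needs at least one component")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return input_component_width * 2 + (input_num_components - 1).bit_length()
```

For 4-component vectors of 8-bit integers this gives 8 * 2 + 2 = 18 bits, so a 32-bit result type is wide enough to represent every possible dot product of such inputs.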

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

3

2021-09-08

Kévin Petit

Clarify how vectors are packed into 32-bit integers

2

2021-06-09

Kévin Petit

Use a single capability to enable all instructions

1

2020-05-20

Kévin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_integer_dot_product.html + + +

extensions/KHR/SPV_KHR_integer_dot_product.html

+ + diff --git a/extensions/KHR/SPV_KHR_linkonce_odr.html b/extensions/KHR/SPV_KHR_linkonce_odr.html index 4a5e44f..9f6e0d2 100644 --- a/extensions/KHR/SPV_KHR_linkonce_odr.html +++ b/extensions/KHR/SPV_KHR_linkonce_odr.html @@ -1,258 +1,12 @@ - - - - - - - -SPV_KHR_linkonce_odr - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_linkonce_odr

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Mariya Podchishchaeva, Intel

    -
  • -
  • -

    Nicolai Hähnle, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2020-06-24

    -
  • -
  • -

    Ratified by the Khronos Group: 2020-08-14

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-05-19

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2, Unified

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new Linkage Type LinkOnceODR. -A global variable or function with such linkage type can be merged with -equivalent global variables or functions from other modules during the -linking process. -The primary use case of this linkage type is to satisfy the C++ language -One Definition Rule.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_linkonce_odr"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Linkage Type

-
-

In section 3.17 "Linkage Type" add the following row to the Linkage Type table:

-
- ----- - - - - - - - - - - - - - -
Linkage TypeEnabling Capabilities

2

LinkOnceODR
-Same as the Export linkage type, but a function or global variable with this -linkage type will be merged with other functions or global variables of the same -name when linkage occurs, i.e. the linker will just pick one of them. -Tools (such as optimizers) can assume that if the definition of a LinkOnceODR -function or global variable gets replaced during linking by one from another -module, then the replacement is equivalent, so it is correct to inline a -LinkOnceODR function.

Linkage

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Do we need to introduce more linkage types, such as weak, weak_odr etc?

    -
    -
    -
    -

    RESOLVED: We will not add additional linkage types as part of this extension.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-05-19

Ben Ashbaugh

Initial KHR extension.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_linkonce_odr.html + + +

extensions/KHR/SPV_KHR_linkonce_odr.html

+ + diff --git a/extensions/KHR/SPV_KHR_maximal_reconvergence.html b/extensions/KHR/SPV_KHR_maximal_reconvergence.html index e689544..47dea33 100644 --- a/extensions/KHR/SPV_KHR_maximal_reconvergence.html +++ b/extensions/KHR/SPV_KHR_maximal_reconvergence.html @@ -1,701 +1,12 @@ - - - - - - - -SPV_KHR_maximal_reconvergence - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_maximal_reconvergence

-
-
-
-
-

Contact

-
-
-

To report a problem with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google LLC

    -
  • -
  • -

    David Neto, Google LLC

    -
  • -
  • -

    Jeff Bolz, NVIDIA Corporation

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Nicolai Hahnle, AMD

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2023-12-06

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-01-19

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-04-18

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new execution mode -MaximallyReconvergesKHR which changes the -behavior of divergence and reconvergence of invocations executing affected -entry points. Under maximal reconvergence, invocations that diverge must -reconverge as soon as possible such that as many invocations as possible -execute the same dynamic instruction instances.

-
-
-

Maximal reconvergence provides guarantees that match shader author intuition -concerning divergence and reconvergence. Divergence and reconvergence are -understandable from the control flow graph of the IR.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_maximal_reconvergence"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Add the following new terms:

-
-
-

Tangle: The set of invocations that execute the same dynamic instance of an -instruction. The rules describing the behavior of tangles are -Maximal Reconvergence.

-
-
-

Converged: If all invocations in a particular scope instance are part of the -tangle, the tangle is said to be converged for that scope instance.

-
-
-

Dynamic Successor: Given a dynamic instance, A, of an instruction a, and -a dynamic instance, B, of instruction b, where a differs from b, B is -a dynamic successor of A if B is executed immediately after A for a given -invocation. Similarly, if B is a dynamic successor of A, then A is -a dynamic predecessor of B.

-
-
-

Changes to Section 2.11 Structured Control Flow

-
-
-

Add the following to the end of the section:

-
-
-

Any function in the static call tree of an entry point annotated with the -MaximallyReconvergesKHR execution mode must -satisfy the following requirements:

-
-
-
    -
  • -

    The only Basic Blocks that may have multiple unique predecessors are:

    -
    -
      -
    • -

      Any Basic Block containing an OpLoopMerge instruction

      -
    • -
    • -

      Any declared Merge Block

      -
    • -
    • -

      Any declared Continue Target

      -
    • -
    • -

      Any Target or Default of an OpSwitch instruction

      -
    • -
    -
    -
  • -
-
-
-

Changes to Section 3 Binary Form

-
-
-

Add the following row to the table in Section 3.6 Execution Mode:

-
-
-
- -------- - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

6023

MaximallyReconvergesKHR
-Invocations in the entry point will obey the rules in Maximal Reconvergence.

Shader

-
-
-
-

Modify the description of OpBranchConditional:

-
-
-

Change:

-
-
-
-
-

Starting with version 1.6, True Label and False Label must not be the same <id>.

-
-
-
-
-

To:

-
-
-
-
-

Starting with version 1.6, or in any function reachable from an entry -point with the execution mode -MaximallyReconvergesKHR, True Label and False -Label must not be the same <id>.

-
-
-
-
-

Add a new section to the end of Section 2 Specification titled

-
-
-
-
-

Maximal Reconvergence

-
-
-

Notation

-
-

If I is a dynamic instance of an instruction, let the tangle that executes I be T(I).

-
-
-
-

Initial State

-
-

Let F be the first dynamically executed instruction in an entry point. T(F) -will be converged.

-
-
-

Note: This is a restatement of the initial conditions in Uniform Control Flow.

-
-
-
-

Divergence

-
-

A divergence occurs when executing I if the invocations in -T(I) do not all have the same dynamic successor.

-
-
-

The tangles that execute the dynamic successors of a dynamic -instruction instance I form a partition of those invocations in T(I) that -have a dynamic successor. The tangles of the dynamic successors may include -invocations not in T(I) if that dynamic successor reconverges.

-
-
-

The only instructions that can produce a divergence are:

-
-
-
    -
  • -

    An OpBranchConditional.

    -
    -
      -
    • -

      T(I) is partitioned into up to two tangles. -All the invocations in T(I) for whom Condition evaluates to true are -members of the tangle that executes True Label and the rest are in the -tangle that executes False Label.

      -
    • -
    -
    -
  • -
  • -

    An OpSwitch.

    -
    -
      -
    • -

      T(I) is partitioned into at least one tangle per case construct, -and at most one tangle per unique Selector value. Invocations in T(I) -with the same Selector value will be partitioned into the same tangle, -executing the associated case construct. Invocations with different -Selector values executing the same case construct may be partitioned -into the same tangle. This behavior is deterministic for a given -compilation of a shader.

      -
    • -
    -
    -
  • -
  • -

    An OpDemoteToHelperInvocation or OpKill instruction executed for the -last non-demoted invocations in a quad. The newly demoted invocations may be -in a different tangle causing a divergence to appear to occur for any -instruction.

    -
  • -
-
-
-
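The partitioning performed by OpBranchConditional can be illustrated with sets of invocation ids. This is a toy model only: the function name `partition_branch_conditional`, the representation of a tangle as a Python set, and the dictionary keyed by label name are all assumptions made for the example.

```python
def partition_branch_conditional(tangle, condition):
    """Split a tangle at an OpBranchConditional.

    `tangle` is a set of invocation ids and `condition` maps each id to the
    boolean value of Condition for that invocation. Only non-empty tangles
    are returned, so the result has one or two entries.
    """
    true_tangle = {i for i in tangle if condition[i]}
    false_tangle = tangle - true_tangle
    result = {}
    if true_tangle:
        result["True Label"] = true_tangle
    if false_tangle:
        result["False Label"] = false_tangle
    return result
```

If every invocation evaluates Condition to the same value, the result holds a single tangle and no divergence occurs.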

Note: This means that invocations cannot spontaneously diverge, although demoting -an invocation to a helper invocation may look like spontaneous divergence.

-
-
-

Note: All invocations in a tangle that are not terminated during the execution -of an OpFunctionCall will remain tangled in the next dynamic instance -executed in the calling function. That is, a function call return acts as -a reconvergence point.

-
-
-

Note: When the last invocation in a quad is demoted to a helper invocation, the -whole quad may be terminated. Since the invocations in the quad may be -diverged, the termination of a quad may give the appearance of spontaneous -divergence of some tangles. The invocations that were already helper -invocations might be in vastly different points in the program execution.

-
-
-
-

Reconvergence

-
-

Invocations that diverged from each other are said to reconverge -when they rejoin a common tangle. Reconvergence occurs at certain related dynamic instances. -A dynamic instruction instance, L, is related to another -dynamic instruction instance, I, if I executed before L and the -invocations in T(I) are candidates for inclusion in T(L). The subset of -T(I) required to reconverge depends on the instructions executed as detailed -below.

-
-
-

With L related to I as above, an invocation escapes -reconvergence when that invocation is in T(I), but not in T(L). This only -occurs when:

-
-
-
    -
  • -

    The invocation executes OpTerminateInvocation or OpKill.

    -
  • -
  • -

    The last non-demoted, non-terminated invocation in the invocation’s quad -executes OpDemoteToHelperInvocation, OpTerminateInvocation, or -OpKill.

    -
  • -
  • -

    The invocation executes OpReturn or OpReturnValue. Escaping in this manner only -affects relations in the current function.

    -
  • -
  • -

    Executing OpBranch or OpBranchConditional causes an invocation -to branch to the Merge Block or Continue Target for a merge instruction -instance that strictly dominates I.

    -
  • -
-
-
-

Note: The common cases in which an invocation would escape reconvergence are breaking -from a switch or loop, or continuing in a loop.

-
-
-

Note: OpKill will behave the same as either OpTerminateInvocation or -OpDemoteToHelperInvocation depending on the implementation. It is -recommended that shader authors use OpTerminateInvocation or -OpDemoteToHelperInvocation instead of OpKill whenever possible to -produce more predictable behavior.

-
-
-

The only related instances introduced during execution are the following:

-
-
-
    -
  • -

    Given dynamic instances L of an OpLabel and M of an OpSelectionMerge, where:

    -
    -
      -
    • -

      The OpLabel is the declared Merge Block of the OpSelectionMerge, and

      -
    • -
    • -

      An invocation i executes both L and M, and

      -
    • -
    • -

      M is the last execution of the OpSelectionMerge before executing L for i, then

      -
    • -
    • -

      L is related to M, and

      -
    • -
    • -

      T(L) will include all non-escaping invocations in T(M)

      -
    • -
    -
    -
  • -
  • -

    Given dynamic instances L of an OpLabel and S of an OpSwitch, where:

    -
    -
      -
    • -

      The OpLabel is a declared Target or Default of the OpSwitch, and

      -
    • -
    • -

      An invocation i executes both L and S, and

      -
    • -
    • -

      S is the last execution of the OpSwitch before executing L for i, then

      -
    • -
    • -

      L is related to S, and

      -
    • -
    • -

      T(L) may include a subset of non-escaping invocations in T(S)

      -
    • -
    -
    -
  • -
  • -

    Given dynamic instances L of an OpLabel and M of an OpLoopMerge, where:

    -
    -
      -
    • -

      The OpLabel is the declared Merge Block of the OpLoopMerge, and

      -
    • -
    • -

      An invocation i executes both L and M, and

      -
    • -
    • -

      M is the last execution of the OpLoopMerge where i did not enter the basic block -via the loop backedge before executing L for i, then

      -
    • -
    • -

      L is related to M, and

      -
    • -
    • -

      T(L) will include all non-escaping invocations in T(M)

      -
    • -
    -
    -
  • -
  • -

    Given dynamic instances L of an OpLabel and M of an OpLoopMerge, where:

    -
    -
      -
    • -

      The OpLabel is the declared Continue Target of the OpLoopMerge, and

      -
    • -
    • -

      An invocation i executes both L and M, and

      -
    • -
    • -

      M is the last execution of the OpLoopMerge before executing L for i, then

      -
    • -
    • -

      L is related to M, and

      -
    • -
    • -

      T(L) will include all non-escaping invocations in T(M)

      -
    • -
    • -

      Note: this requires that invocations reconverge at the Continue Target of a loop. -Therefore, at the beginning of each iteration of the loop, invocations that entered -the loop together and are continuing to execute the loop will be converged.

      -
    • -
    -
    -
  • -
  • -

    Given dynamic instances I of an instruction and FC of an OpFunctionCall, where:

    -
    -
      -
    • -

      The instruction of I immediately succeeds the OpFunctionCall in binary order, and

      -
    • -
    • -

      An invocation i executes both I and FC, and

      -
    • -
    • -

      FC is the last execution of the OpFunctionCall before executing I for i, then

      -
    • -
    • -

      I is related to FC, and

      -
    • -
    • -

      T(I) will include all non-escaping invocations in T(FC)

      -
    • -
    -
    -
  • -
-
-
-
-

Non-reconvergence

-
-

Invocations will not reconverge except at Merge Blocks, -Continue Targets, and case constructs or after OpFunctionCall is -executed.

-
-
-

A tangle that executes an instance of a merge instruction, M, -represents the maximal tangle for all of the invocations in T(M). That is, -implementations will not merge tangles during execution except through -reconvergence.

-
-
-

Note: This means that the instructions in a break block will execute as if -they were still diverged according to the loop iteration. This restricts -potential transformations an implementation may perform on the IR to match -shader author expectations. Similarly, instructions in the loop construct -cannot be moved into the continue construct unless it can be proven that -invocations are always converged.

-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    What should be the behavior of an OpSwitch with multiple labels for a single case construct?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    This behavior is implementation-defined. An implementation will guarantee that -at least all invocations that have the same selector value remain tangled, but -may further include invocations up to all of those invocations that reach the -same case construct.

    -
    -
    -
    -
  2. -
  3. -

    Should any structured control flow rules be tightened for this extension?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    Yes, see the modifications to 2.11 Structured Control Flow.

    -
    -
    -
    -
  4. -
  5. -

    Should this extension make any mention of forward progress?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    No, similar to memory model synchronization, if the invocations do not -reconverge, the program may hang. Behavior is undefined if invocations -don’t make progress.

    -
    -
    -
    -
  6. -
  7. -

    Is enough said about helper invocations?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    Yes, the extension describes the behavior as specifically as it can. Quads -being terminated may look like unexpected divergence, but the behavior is -reasonable when viewed as a whole.

    -
    -
    -
    -
  8. -
-
-
-
-
-
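As a rough illustration of Issue 1 above, the sketch below contrasts the smallest grouping an implementation has to preserve (invocations that share a selector value) with the largest grouping it may choose (invocations that reach the same case construct). The function name, the dictionary layout, and the sample values are assumptions made for this example and are not part of the extension.

```python
from collections import defaultdict


def group_invocations(selectors, key):
    """Group invocation ids by key(selector value); returns sorted lists."""
    groups = defaultdict(list)
    for invocation, selector in sorted(selectors.items()):
        groups[key(selector)].append(invocation)
    return sorted(groups.values())


# Hypothetical OpSwitch: literals 1 and 2 both target case construct "A".
label_to_case = {1: "A", 2: "A", 5: "B"}
# Hypothetical selector value observed by each invocation id.
selectors = {0: 1, 1: 2, 2: 1, 3: 5}

# Guaranteed minimum: invocations with the same selector stay tangled.
minimum_tangles = group_invocations(selectors, key=lambda s: s)
# Permitted maximum: every invocation reaching the same case construct.
maximum_tangles = group_invocations(selectors, key=lambda s: label_to_case[s])
```

Any grouping between these two extremes would also appear to satisfy the resolution, which is why the behavior is described as implementation-defined.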

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - -

| Rev | Date | Author | Changes |
|-----|------|--------|---------|
| 1 | 2022-01-22 | Alan Baker | Initial Revision |
| 2 | 2024-04-18 | Alan Baker | Fix typo and resolve issue |

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_maximal_reconvergence.html + + +

extensions/KHR/SPV_KHR_maximal_reconvergence.html

+ + diff --git a/extensions/KHR/SPV_KHR_multiview.html b/extensions/KHR/SPV_KHR_multiview.html index c4193d1..eae3c1e 100644 --- a/extensions/KHR/SPV_KHR_multiview.html +++ b/extensions/KHR/SPV_KHR_multiview.html @@ -1,335 +1,12 @@ - - - - - - - -SPV_KHR_multiview - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_multiview

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Ashwin Kolhe, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working group: 2017-01-11

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-02-24

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2016-12-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new capability to support the Vulkan -VK_KHX_multiview extension in SPIR-V.

-
-
-

The new MultiView capability allows the ViewIndex builtin -variable, which represents the index of the view currently being rendered to, -to be exported from all shader stages except compute.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_multiview"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
MultiView
-
-
-
-
-
-

New Builtins

-
-
-

A new builtin is added as an input for all shader stages except compute.

-
-
-
-
ViewIndex
-
-
-
-

Input view index of the view currently being rendered to.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

MultiView

4439

ViewIndex

4440

-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add a new row to the Builtin table)

-
- ----- - - - - - - - - - - - - - -
| BuiltIn | | Enabling Capabilities |
|---------|---|-----------------------|
| 4440 | ViewIndex: Input view index of the view currently being rendered to. See VK_KHX_multiview for more details. | MultiView |

-
-
-
-
(Modify Section 3.31, Capability, adding new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
| Capability | | Depends On | Enabled by Extension |
|------------|---|------------|----------------------|
| 4439 | MultiView | Shader | SPV_KHR_multiview |

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_multiview"
-
-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
| Rev | Date | Author | Changes |
|-----|------|--------|---------|
| 1 | 2016-12-12 | Ashwin Kolhe | Initial draft |

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_multiview.html + + +

extensions/KHR/SPV_KHR_multiview.html

+ + diff --git a/extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html b/extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html index 69be554..9ec0976 100755 --- a/extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html +++ b/extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html @@ -1,423 +1,12 @@ - - - - - - - -SPV_KHR_no_integer_wrap_decoration - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_no_integer_wrap_decoration

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Pawel Jurek, Intel

    -
  • -
  • -

    Mariusz Merecki, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR Working Group: 2018-10-03

    -
  • -
  • -

    Ratified by Khronos 2018-11-16

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-09-11

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 3, Unified

-
-
-

This extension is written against the SPIR-V Extended Instructions for GLSL, -Version 1.00, Revision 7

-
-
-

This extension is written against the OpenCL Extended Instruction Set Specification, -Version 1.00, Revision 4

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new decorations to indicate that a given instruction does not cause integer wrapping to occur, in the form of overflow or underflow.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_no_integer_wrap_decoration"
-
-
-
-
-
-

New Decorations

-
-
-

This extension introduces the following new decorations:

-
-
-
-
NoSignedWrap
-NoUnsignedWrap
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

NoSignedWrap

4469

NoUnsignedWrap

4470

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3, Revision 3, Unified

-
-
-

Validation Rules

-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_no_integer_wrap_decoration"
-
-
-
-

Modify Section 2.16.1, Universal Validation Rules, adding the following rules:

-
-
-
    -
  • -

    NoSignedWrap decoration defined in this extension can be applied only to the following instructions:

    -
    -
      -
    • -

      OpIAdd

      -
    • -
    • -

      OpISub

      -
    • -
    • -

      OpIMul

      -
    • -
    • -

      OpShiftLeftLogical

      -
    • -
    • -

      OpSNegate

      -
    • -
    • -

      OpExtInst with instruction numbers specified in the extended instruction-set specifications as accepting the decoration.

      -
    • -
    -
    -
  • -
  • -

    NoUnsignedWrap decoration defined in this extension can be applied only to the following instructions:

    -
    -
      -
    • -

      OpIAdd

      -
    • -
    • -

      OpISub

      -
    • -
    • -

      OpIMul

      -
    • -
    • -

      OpShiftLeftLogical

      -
    • -
    • -

      OpExtInst with instruction numbers specified in the extended instruction-set specifications as accepting the decoration.

      -
    • -
    -
    -
  • -
-
-
-
-
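The two lists above can be read as a lookup table. The following is a minimal sketch of such a check, not the validator's actual code; the function name and the `ext_inst_accepts` flag are hypothetical, and the caller is assumed to have already consulted the relevant extended instruction-set specification for OpExtInst.

```python
# Core instructions that may carry each decoration, per the lists above.
ALLOWED_CORE_OPCODES = {
    "NoSignedWrap": {"OpIAdd", "OpISub", "OpIMul", "OpShiftLeftLogical", "OpSNegate"},
    "NoUnsignedWrap": {"OpIAdd", "OpISub", "OpIMul", "OpShiftLeftLogical"},
}


def may_decorate(decoration, opcode, ext_inst_accepts=False):
    """Return True if `decoration` may be applied to an instruction.

    `ext_inst_accepts` is a hypothetical flag: the caller sets it when the
    extended instruction-set specification says the OpExtInst instruction
    number accepts the decoration.
    """
    if decoration not in ALLOWED_CORE_OPCODES:
        raise ValueError(f"unknown decoration: {decoration}")
    if opcode == "OpExtInst":
        return ext_inst_accepts
    return opcode in ALLOWED_CORE_OPCODES[decoration]
```

Note that OpSNegate appears only in the NoSignedWrap list, so the two decorations are not interchangeable.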

Decorations

-
-

Modify Section 3.20, "Decoration", adding these rows to the Decoration table:

-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - -
| Decoration | | Enabling Capabilities | Extra Operands |
|------------|---|-----------------------|----------------|
| 4469 | NoSignedWrap: Apply to an instruction to indicate that it doesn’t cause signed integer wrapping to occur, in the form of overflow or underflow. If the instruction decorated with NoSignedWrap does overflow or underflow, the behavior is undefined. | | |
| 4470 | NoUnsignedWrap: Apply to an instruction to indicate that it doesn’t cause unsigned integer wrapping to occur, in the form of overflow or underflow. If the instruction decorated with NoUnsignedWrap does overflow or underflow, the behavior is undefined. | | |

-
-
-
-
-
-
-
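To make "wrapping" concrete, here is a small sketch for an addition such as OpIAdd. It assumes 32-bit two's-complement operands by default; the helper names are invented for this illustration. An instruction decorated with NoSignedWrap or NoUnsignedWrap promises that the corresponding predicate is never true for its operands.

```python
def unsigned_add_wraps(a, b, bits=32):
    """True if a + b does not fit in an unsigned integer of `bits` bits."""
    limit = 1 << bits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must be representable as unsigned values")
    return a + b >= limit


def signed_add_wraps(a, b, bits=32):
    """True if a + b does not fit in a two's-complement integer of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operands must be representable as signed values")
    return not (lo <= a + b <= hi)
```

The same bit pattern can wrap under one interpretation and not the other, which is presumably why the extension defines two separate decorations.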

Modifications to the SPIR-V Extended Instructions for GLSL, Version 1.00, Revision 7

-
-
-

Modify Section 2, "Binary Form", adding the following text to the SAbs instruction description in the Extended instructions table: -

-
-
-

This instruction can be decorated with NoSignedWrap decoration.

-
-
-
-
-

Modifications to the OpenCL Extended Instruction Set Specification, Version 1.00, Revision 4

-
-
-

Modify Section 2.2, "Integer Instructions", adding the following text to the s_abs instruction description: -

-
-
-

This instruction can be decorated with NoSignedWrap decoration.

-
-
-
-
-

Issues

-
-
-

1) Should we add a floating point version of the decoration?

-
-
-

RESOLVED: No. -A new decoration would provide the same information as FP Fast Math Mode.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Rev | Date | Author | Changes |
|-----|------|--------|---------|
| 1 | 2018-08-10 | Pawel Jurek | Initial revision |
| 2 | 2018-08-23 | Mariusz Merecki | Rename to KHR, remove the capability, allow new decorations on OpUDiv, OpUMod, OpSNegate, OpSDiv, OpSMod and OpSRem. |
| 3 | 2018-08-29 | Mariusz Merecki | Use the term wrap instead of overflow in the extension name and new decorations, remove OpUMod, OpSMod and OpSRem from the list of instructions allowed to be decorated with new decorations. |
| 4 | 2018-09-11 | Mariusz Merecki | Remove OpUDiv and OpSDiv from the list of instructions allowed to be decorated with new decorations. |

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html + + +

extensions/KHR/SPV_KHR_no_integer_wrap_decoration.html

+ + diff --git a/extensions/KHR/SPV_KHR_non_semantic_info.html b/extensions/KHR/SPV_KHR_non_semantic_info.html index 3353a25..4ab5fb0 100644 --- a/extensions/KHR/SPV_KHR_non_semantic_info.html +++ b/extensions/KHR/SPV_KHR_non_semantic_info.html @@ -1,318 +1,12 @@ - - - - - - - -SPV_KHR_non_semantic_info - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_non_semantic_info

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Baldur Karlsson, Valve

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Ratified by the Khronos Board 2019-12-13.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-03-25

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds the ability to declare extended instruction sets that have -no semantic impact and can be safely removed from a module.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_non_semantic_info"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Terms

-
-

Add a new term to section 2.2.1 Instructions:

-
-
-

Non-Semantic Instruction: an instruction that has no -semantic impact, and can be safely removed from the module.

-
-
-
-

Logical Layout of a Module

-
-

Modify section 9, adding:

-
-
-

This section is the first section to allow use of Non-Semantic Instructions -with OpExtInst.

-
-
-
-

Instructions

-
-

Modify the OpExtInstImport instruction:

-
-
-

(Replace the following sentence):

-
-
-

There must be an external specification defining the semantics for this extended -instruction set.

-
-
-

(with):

-
-
-

There must be an external specification defining the semantics for this extended -instruction set, unless it has a name prefixed with NonSemantic. at the -beginning of the name, including the period separating the namespace -NonSemantic from the rest of the name. In that case it is encouraged for a -specification to exist on the SPIR-V registry, but it is not required.

-
-
-

An extended set name which is prefixed with NonSemantic. is guaranteed to -contain only non-semantic instructions and all OpExtInst instructions -referencing this set can be ignored. All instructions within such a set have -only ID operands, no literal values. When literals are needed then the result ID -from an OpConstant or OpString instruction is referenced as appropriate. -Result IDs from these non-semantic instruction set instructions must only be -used in other non-semantic instructions.
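A consumer could recognize such sets with a plain prefix test on the OpExtInstImport name. The sketch below is one possible way to write that test; the function and constant names are made up for illustration.

```python
NON_SEMANTIC_PREFIX = "NonSemantic."  # the period is part of the prefix


def is_non_semantic_set(name):
    """True if an OpExtInstImport name denotes a non-semantic instruction set."""
    return name.startswith(NON_SEMANTIC_PREFIX)
```

Because the period is part of the prefix, a name such as `NonSemanticFoo` would not qualify under this reading.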

-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Should we define any specific Non-Semantic extended instruction set in this -extension?

    -
    -
    -
    -

    RESOLVED: No, this extension only defines the mechanism by which such -extended instruction sets are defined to allow consumers to ignore them. The -specific extended instruction sets will be defined elsewhere.

    -
    -
    -
    -
  2. -
  3. -

    In which category (subsection) should it be valid to use OpExtInst with -a non-semantic extended instruction set?

    -
    -
    -
    -

    RESOLVED: From section 9 (All type declarations, all constant instructions, -and all global variable declarations) onwards. Since the goal of these -instructions is to provide additional non-semantic information it is valid to -use outside of function declarations to allow attaching of information to global -declarations.

    -
    -
    -
    -
  4. -
  5. -

    Can an OpExtInst with a non-semantic extended instruction set be -intermixed with an OpPhi?

    -
    -
    -
    -

    RESOLVED: No, non-semantic extended instructions are not exempt -from the OpPhi rule: "Within a block, this instruction [OpPhi] must -appear before all non-OpPhi instructions".

    -
    -
    -
    -
  6. -
-
-
-
-
-
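The OpPhi ordering rule quoted in Issue 3 above can be checked with a single pass over a block. This is a simplified sketch: the input is assumed to be the list of opcode names that follow the block's OpLabel, and any exceptions the core specification makes (for example for debug line instructions) are deliberately not modelled.

```python
def phis_come_first(opcodes):
    """True if every OpPhi in the block precedes all non-OpPhi instructions."""
    seen_non_phi = False
    for opcode in opcodes:
        if opcode == "OpPhi":
            if seen_non_phi:
                return False  # an OpPhi after some other instruction
        else:
            seen_non_phi = True
    return True
```

A non-semantic OpExtInst counts as an ordinary non-OpPhi instruction here, so placing one between two OpPhi instructions makes the check fail.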

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Rev | Date | Author | Changes |
|-----|------|--------|---------|
| 4 | 2020-03-25 | Caio Marcelo de Oliveira Filho | Clarified that non-semantic instructions cannot be intermixed with OpPhi |
| 3 | 2019-09-09 | Baldur Karlsson | Clarified prefix contains a "."; Restrict OpExtInst parameters to only IDs |
| 2 | 2019-09-06 | Baldur Karlsson | Removed specific instruction set, defined logical layout; Target latest unified specification |
| 1 | 2019-01-10 | Baldur Karlsson | Initial revision |

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_non_semantic_info.html + + +

extensions/KHR/SPV_KHR_non_semantic_info.html

+ + diff --git a/extensions/KHR/SPV_KHR_physical_storage_buffer.html b/extensions/KHR/SPV_KHR_physical_storage_buffer.html index ab839f4..597e2df 100644 --- a/extensions/KHR/SPV_KHR_physical_storage_buffer.html +++ b/extensions/KHR/SPV_KHR_physical_storage_buffer.html @@ -1,772 +1,12 @@ - - - - - - - -SPV_KHR_physical_storage_buffer - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_physical_storage_buffer

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Mariusz Merecki, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Ratified by the Khronos Board 2019-08-23.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-09-20

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 5, Unified.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-

The result of OpConstantNull must not be a pointer into the -PhysicalStorageBufferEXT storage class.

-
-
-

When used with SPIR-V 1.4 or higher, operands to OpPtrEqual, OpPtrNotEqual, -and OpPtrDiff must not be pointers into the PhysicalStorageBufferEXT storage -class.

-
-
-
-
-

Overview

-
-
-

This extension adds a new storage class PhysicalStorageBuffer which is -similar to StorageBuffer except pointers to the PhysicalStorageBuffer -storage class are treated as physical pointer types according to a new -addressing model PhysicalStorageBuffer64. This addressing model is a -hybrid of logical and physical addressing, with only pointers to -PhysicalStorageBuffer storage class being physical, and using 64-bit -addresses. It also adds a new capability PhysicalStorageBufferAddresses -and enables a few instructions currently supported for Addresses.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_physical_storage_buffer"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

2.2 Terms

-
-

Add new terms to section 2.2.2 Types:

-
-
-

Physical Pointer Type: A pointer type is a physical -pointer type if the storage class of the type pointed to uses physical -addressing according to the addressing model.

-
-
-

Logical Pointer Type: A pointer type is a logical -pointer type if it is not a physical pointer type.

-
-
-

Modify the following definitions:

-
-
-

Concrete Type: A numerical scalar, vector, matrix type, -or physical pointer type, or any aggregate containing only these types.

-
-
-

Abstract Type: An OpTypeVoid or OpTypeBool, or logical -pointer type, or any aggregate type containing any of these.

-
-
-

Modify the definition of Memory Object Declaration:

-
-
-

Memory Object Declaration: An OpVariable, or -an OpFunctionParameter of pointer type, or the contents of an OpVariable -that holds a pointer to PhysicalStorageBuffer storage class -or holds an array of such pointers.

-
-
-

Modify the first part of the definition of Variable pointer from:

-
-
-

Variable pointer: A pointer that results from one of the -following instructions: …​

-
-
-

to:

-
-
-

Variable pointer: A pointer of logical pointer type that -results from one of the following instructions: …​

-
-
-
-

2.16 Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
-

Change:

-
-
-
-
-

If the Logical addressing model is selected and the -VariablePointers capability is not declared:

-
-
-
-
-

to:

-
-
-
-
-

If the VariablePointers capability is not declared, the -following rules apply to logical pointer types:

-
-
-
-
-

Change:

-
-
-
-
-

OpVariable cannot allocate an object whose type is a pointer type (that -is, it cannot create an object in memory that is itself a pointer and -whose result would thus be a pointer to a pointer)

-
-
-
-
-

to:

-
-
-
-
-

OpVariable cannot allocate an object whose type is a -logical pointer type (that is, it cannot create an -object in memory that is itself a logical pointer and whose result would -thus be a pointer to a logical pointer)

-
-
-
-
-

Change:

-
-
-
-
-

"If the Logical addressing model is selected and the -VariablePointers or VariablePointersStorageBuffer capability is -declared (in addition to what is allowed above by the Logical addressing model):"

-
-
-
-
-

to:

-
-
-
-
-

"If the VariablePointers or VariablePointersStorageBuffer capability -is declared, the following are allowed for logical pointer types:".

-
-
-
-
-

Change:

-
-
-
-
-

OpVariable can allocate an object whose type is a pointer type, if the -Storage Class of the OpVariable is one of the -following: …​

-
-
-
-
-

to:

-
-
-
-
-

OpVariable can allocate an object whose type is a -logical pointer type, if the -Storage Class of the OpVariable is one of the -following: …​

-
-
-
-
-

Change:

-
-
-
-
-

A variable pointer with the Logical addressing model cannot …​

-
-
-
-
-

to:

-
-
-
-
-

A variable pointer cannot …​

-
-
-
-
-

Add the following rules:

-
-
-

If the addressing model is not PhysicalStorageBuffer64, then the -PhysicalStorageBuffer storage class must not be used.

-
-
-

Add PhysicalStorageBuffer to the list of storage classes that support -atomic access.

-
-
-

OpVariable must not use a storage class of PhysicalStorageBuffer.

-
-
-

If an OpVariable's pointee type is a pointer (or array of pointers) in -PhysicalStorageBuffer storage class, then the variable must be decorated -with exactly one of AliasedPointer or RestrictPointer.

-
-
-

If an OpFunctionParameter is a pointer (or array of pointers) in -PhysicalStorageBuffer storage class, then the function parameter must be -decorated with exactly one of Aliased or Restrict.

-
-
-

If an OpFunctionParameter is a pointer (or array of pointers) and its -pointee type is a pointer in PhysicalStorageBuffer storage class, then -the function parameter must be decorated with exactly one of -AliasedPointer or RestrictPointer.
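The three "exactly one of" rules above share the same shape, so a single helper can cover them. This is only a sketch under the assumption that the decorations attached to an id are available as a list of names; the helper and constant names are hypothetical.

```python
# Decorations describing the pointer stored in a variable or parameter.
POINTER_ALIASING = {"AliasedPointer", "RestrictPointer"}
# Decorations applied directly to a pointer-typed function parameter.
DIRECT_ALIASING = {"Aliased", "Restrict"}


def exactly_one(decorations, choices):
    """True if exactly one decoration from `choices` is present."""
    return len(set(decorations) & choices) == 1
```

For example, a variable holding a PhysicalStorageBuffer pointer would be checked against `POINTER_ALIASING`, while a function parameter that is itself such a pointer would be checked against `DIRECT_ALIASING`.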

-
-
-

Any pointer value whose storage class is PhysicalStorageBuffer and that -points to a matrix or an array of matrices or a row or element of a matrix must be the result of -an OpAccessChain or OpPtrAccessChain instruction whose base is a structure type (or -recursively must be the result of a sequence of only access chains from a structure to the final -value). Such a pointer must only be used as the Pointer operand to OpLoad or OpStore.

-
-
-

Modify section 2.16.2. Validation Rules for Shader Capabilities:

-
-
-

Add PhysicalStorageBuffer to the list of storage classes in which -composite objects must be explicitly laid out.

-
-
-

Add PhysicalStorageBuffer to the list of storage classes to which the -result of a FPRoundingMode-decorated conversion instruction can be stored.

-
-
-
-

2.18 Memory Model

-
-

Modify section 2.18.2. Aliasing:

-
-
-

Replace the paragraph about Simple, GLSL, and VulkanKHR memory models:

-
-
-

The Simple, GLSL, and VulkanKHR memory models can assume that aliasing -is generally not present between the memory object declarations. -Specifically, the consumer is free to assume aliasing is not present between -memory object declarations, unless the memory object declarations explicitly -indicate they alias.

-
-
-

Aliasing is indicated by applying the Aliased decoration to a memory object -declaration’s <id>, for OpVariable and OpFunctionParameter <id>s. -Applying Restrict is allowed, but has no effect.

-
-
-

For variables holding PhysicalStorageBuffer pointers, applying the -AliasedPointer decoration on the OpVariable <id> indicates that the -PhysicalStorageBuffer pointers are potentially aliased. Applying -RestrictPointer is allowed, but has no effect. Variables holding -PhysicalStorageBuffer pointers must be decorated as either -AliasedPointer or RestrictPointer.

-
-
-

Only those memory object declarations decorated with Aliased or -AliasedPointer may alias each other.

-
-
-

Modify the Aliasing table in section 2.18.2:

-
-
-

Add a new row for PhysicalStorageBuffer that is a copy of -StorageBuffer. Add PhysicalStorageBuffer everywhere StorageBuffer is -used in the "Second Storage Classes" column.

-
-
-

Add to the description of the Aliasing table:

-
-
-

For the PhysicalStorageBuffer storage class, OpVariable is understood -to mean the PhysicalStorageBuffer pointer value(s) stored in the -variable. An Aliased PhysicalStorageBuffer pointer stored in a -Function variable can potentially alias with other variables in the same -function, or with global variables or function parameters.

-
-
-
-

3.4 Addressing Model

-
-
- ----- - - - - - - - - - - - - - -
| Addressing Model | | Enabling Capabilities |
|------------------|---|-----------------------|
| 5348 | PhysicalStorageBuffer64: Indicates pointers whose storage classes are PhysicalStorageBuffer are physical pointer types with address width equal to 64 bits, and pointers to all other storage classes are logical. | PhysicalStorageBufferAddresses |

-
-
-
-
-

3.7 Storage Class

-
-
- ----- - - - - - - - - - - - - - -
| Storage Class | | Enabling Capabilities |
|---------------|---|-----------------------|
| 5349 | PhysicalStorageBuffer: Shared externally, readable and writable, visible across all functions in all invocations in all work groups. Graphics storage buffers using physical addressing. | PhysicalStorageBufferAddresses |

-
-
-
-
-

3.20 Decorations

-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
| Decoration | | Enabling Capabilities | Extra Operands |
|------------|---|-----------------------|----------------|
| 5355 | RestrictPointer: Apply to an OpVariable, to indicate the compiler may compile as if there is no aliasing of the pointer stored in the variable. See the Aliasing section for more detail. | PhysicalStorageBufferAddresses | |
| 5356 | AliasedPointer: Apply to an OpVariable, to indicate the compiler is to generate accesses to the pointer stored in the variable that work correctly in the presence of aliasing. See the Aliasing section for more detail. | PhysicalStorageBufferAddresses | |

-
-
-
-
-

3.25 Memory Semantics <id>

-
-

Add PhysicalStorageBuffer to the list of storage classes synchronized by -UniformMemory.

-
-
-
-

3.26 Memory Access

-
-

Add to the description of Aligned:

-
-
-

Valid values are defined by the execution environment.

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
| Capability | | Enabling Capabilities |
|------------|---|-----------------------|
| 5347 | PhysicalStorageBufferAddresses | Shader |

-
-
-
-

Add PhysicalStorageBuffer to the list of storage classes for the -StorageBuffer16BitAccess, UniformAndStorageBuffer16BitAccess, -StorageBuffer8BitAccess, and UniformAndStorageBuffer8BitAccess -capabilities.

-
-
-
-

Instructions

-
-

Modify the OpTypeForwardPointer, OpConvertUToPtr, OpConvertPtrToU, and -OpPtrAccessChain instructions to add PhysicalStorageBufferAddresses to -their capability lists.

-
-
-

Modify OpConvertUToPtr to require that the result type must be a physical -pointer type.

-
-
-

Modify OpConvertPtrToU to require that the Pointer operand must have a -physical pointer type.

-
-
-

Modify OpBitcast to allow vector conversions to/from pointers, by changing -this existing rule:

-
-
-
-
-

"If Result Type is a pointer, Operand must be a pointer or integer -scalar. If Operand is a pointer, Result Type must be a pointer or -integer scalar."

-
-
-
-
-

to instead say:

-
-
-
-
-

"If either Result Type or Operand is a pointer, the other must be a -pointer, an integer scalar, or an integer vector."

-
-
-
-
-
-

Universal Validation Rules

-
-
    -
  • -

    When using OpBitcast to convert pointers to/from vectors of integers, only -vectors of 32-bit integers are supported.

    -
  • -
-
-
-
-
-
-

Issues

-
-
-

1) How can we support comparing pointers to "null"?

-
-
-

Resolution: This can be accomplished by converting the pointer to an integer -with OpConvertPtrToU or to a uvec2 with OpBitcast.

-
-
-

2) Should we define a null pointer value in memory?

-
-
-

Discussion: The environment spec can define a particular bit pattern for -NULL, the core SPIR-V spec should not.

-
-
-

Resolution: SPIR-V doesn’t define it, but Vulkan defines it to 0.

-
-
-

3) Can we reuse Aligned to specify a minimum alignment on a load/store?

-
-
-

Resolution: The SPIR-V spec will be changed to say that the meaning of -Aligned is defined by the execution environment, and Vulkan will define -it to be the minimum alignment, at least for physical storage buffer -pointers.

-
-
-

4) Which instructions from Addresses don’t we need?

-
-
-

Discussion: OpSizeOf seems unnecessary without polymorphism in the high -level language. Variable pointers doesn’t enable OpInBoundsPtrAccessChain, -do we need it? OpCopyMemorySized? MaxByteOffset(Id) decorations?

-
-
-

Resolution: Omit all of them listed above, as they are not strictly needed.

-
-
-

5) Does this extension depend on the Int64 capability?

-
-
-

Resolution: This extension can be used without Int64, but OpConvertUToPtr -and OpConvertPtrToU can’t be used in that case. However, OpBitcast can be -used to convert uvec2 <-> reference address.
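The uvec2 route mentioned in Issues 1 and 5 amounts to splitting a 64-bit address into two 32-bit words. The sketch below models that arithmetic on plain integers. It assumes component 0 holds the low-order bits, and it uses 0 as the null pattern because Issue 2 says Vulkan defines it that way; the function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF


def address_to_uvec2(address):
    """Split a 64-bit address into (low word, high word)."""
    if not 0 <= address < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    return (address & MASK32, address >> 32)


def uvec2_to_address(words):
    """Rebuild the 64-bit address from (low word, high word)."""
    low, high = words
    return (high << 32) | low


def is_null(words, null_pattern=0):
    """Compare against a null bit pattern supplied by the environment."""
    return uvec2_to_address(words) == null_pattern
```

With this representation a null comparison needs only 32-bit integer operations, which is the point of the resolution when Int64 is not available.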

-
-
-

6) How do Coherent/Volatile work?

-
-
-

Resolution: We rely on the per-instruction availability/visibility and -volatile memory access operands and image operands, many of which were added -by the SPV_KHR_vulkan_memory_model extension. So that extension must be used -to get coherent/volatile access.

-
-
-

7) What changes are needed to the Aliasing section?

-
-
-

Resolution: Pointers to the PhysicalStorageBuffer storage class don’t -quite fit the pre-existing definitions because the pointer is not created by -OpVariable, rather it is loaded from memory or generated with -OpConvertUToPtr. So we extend the definition of a memory object declaration -to include a variable that holds a PhysicalStorageBuffer pointer, and add -a way to decorate that the object in the variable is aliased/restrict rather -than just the variable itself.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
| Rev | Date | Author | Changes |
|-----|------|--------|---------|
| 1 | 2018-12-07 | Jeff Bolz | Initial revision |
| 2 | 2019-09-18 | David Neto | Interaction with OpConstantNull, and new SPIR-V 1.4 instructions |

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_physical_storage_buffer.html + + +

extensions/KHR/SPV_KHR_physical_storage_buffer.html

+ + diff --git a/extensions/KHR/SPV_KHR_post_depth_coverage.html b/extensions/KHR/SPV_KHR_post_depth_coverage.html index 3cca48d..1c15c72 100644 --- a/extensions/KHR/SPV_KHR_post_depth_coverage.html +++ b/extensions/KHR/SPV_KHR_post_depth_coverage.html @@ -1,418 +1,12 @@ - - - - - - - -SPV_KHR_post_depth_coverage - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_post_depth_coverage

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2017-05-10

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-06-30

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-07-07

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 6.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a capability to allow a new execution mode -for post depth coverage to support the -GL_ARB_post_depth_coverage and -GL_EXT_post_depth_coverage -extensions.

-
-
-

The new functionality is enabled under the SampleMaskPostDepthCoverage -capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_post_depth_coverage"
-
-
-
-
-
-

New Execution Mode

-
-
-

This extension introduces a new execution mode:

-
-
-
-
PostDepthCoverage
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
SampleMaskPostDepthCoverage
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

PostDepthCoverage

4446

SampleMaskPostDepthCoverage

4447

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.6, Execution Mode, adding a row to the Execution Mode table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
Execution ModeEnabling CapabilitiesExtra Operands

4446

PostDepthCoverage
-The input variable decorated with SampleMask will reflect the result of -the EarlyFragmentTests. Only valid with the Fragment Execution Model.

SampleMaskPostDepthCoverage

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

4447

SampleMaskPostDepthCoverage

SPV_KHR_post_depth_coverage

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_post_depth_coverage"
-
-
-
-
-
(Modify Section 2.16.2, Validation Rules for Shader Capabilities, adding a new rule)
-
-

An OpEntryPoint with the PostDepthCoverage Execution Mode must also set -the EarlyFragmentTests Execution Mode.

-
-
-
-
-
-
-
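The validation rule above can be sketched as a small check. This is only an illustrative model, not how any real validator is implemented: the function name `check_post_depth_coverage` and the dictionary layout (entry point name mapped to the set of execution mode names declared for it) are assumptions made for this example.

```python
from typing import Dict, List, Set


def check_post_depth_coverage(entry_point_modes: Dict[str, Set[str]]) -> List[str]:
    """Return one message per entry point that breaks the rule.

    `entry_point_modes` is a hypothetical, simplified view of a module:
    each entry point name maps to the execution modes declared for it.
    """
    errors = []
    for name, modes in entry_point_modes.items():
        # PostDepthCoverage is only legal together with EarlyFragmentTests.
        if "PostDepthCoverage" in modes and "EarlyFragmentTests" not in modes:
            errors.append(
                f"{name}: PostDepthCoverage requires EarlyFragmentTests"
            )
    return errors
```

An entry point that declares both modes, or neither, produces no message; only the combination of PostDepthCoverage without EarlyFragmentTests is reported.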

Issues

-
-
-
    -
  1. -

    What should we call the capability?

    -
    -
    -
    -

    DISCUSSION: Can both the execution mode and capability have the same name? -It seems like it could be confusing (documentation and code-wise) even if it is -technically possible. The Capability suffix would be redundant.

    -
    -
    -

    RESOLVED: Call it SampleMaskPostDepthCoverage, similar to -SampleMaskCoverageOverrideNV since this is modifying the semantics of -the SampleMask decorated variables. Other options considered were -PostDepthCoverage or PostDepthCoverageCapability.

    -
    -
    -
    -
  2. -
  3. -

    Should the EarlyFragmentTests Execution Mode be explicit or implicit when -PostDepthCoverage is enabled?

    -
    -
    -
    -

    RESOLVED: In GL_EXT_post_depth_coverage, both the early_fragment_test and -post_depth_coverage layouts needed to be explicitly set. In -GL_ARB_post_depth_coverage, the early_fragment_test was made implicit when -post_depth_coverage was enabled as there is no other sensible way of using -post depth coverage. However, since SPIR-V is lower-level than GLSL and more -explicit/verbose, it follows that both Execution Modes should be explicitly -declared and the GLSL front-end can ensure that both modes are specified when -either extension is used, and it should also be simpler for consumers.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-04-25

Daniel Koch

Initial revision

2

2017-05-12

David Neto

Record approval by SPIR Working Group

3

2017-07-07

Daniel Koch

Record ratification

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_post_depth_coverage.html + + +

extensions/KHR/SPV_KHR_post_depth_coverage.html

+ + diff --git a/extensions/KHR/SPV_KHR_quad_control.html b/extensions/KHR/SPV_KHR_quad_control.html index c1958d5..13b7474 100644 --- a/extensions/KHR/SPV_KHR_quad_control.html +++ b/extensions/KHR/SPV_KHR_quad_control.html @@ -1,373 +1,12 @@ - - - - - - - -SPV_KHR_quad_control - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_quad_control

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Nicolai Haehnle, AMD

    -
  • -
  • -

    Jeff Bolz, Nvidia

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2023-12-06

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-01-19

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-01-25

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6, Revision 3, Unified.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-

This extension requires SPV_KHR_maximal_reconvergence.

-
-
-
-
-

Overview

-
-
-

This extension adds new quad group operations, and two new execution modes.

-
-
-

The QuadDerivativesKHR execution mode requires that derivatives used to -determine implicit lod are always calculated on a per-quad basis. -This allows sampling from textures with ImplicitLod operations as long as -control flow is uniform within the quad - which the new quad operations can -be used to guarantee.

-
-
-

The RequireFullQuadsKHR execution mode requires that helper invocations -are spawned for fragment shader invocations, enabling users to explicitly -opt-in to helper invocations. -Invocations may still be spawned implicitly according to the client API. -This is intended to be paired with the MaximallyReconvergesKHR execution -mode in SPV_KHR_maximal_reconvergence to provide robust guarantees about -uniform control flow within a quad.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_quad_control"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6.3

-
-
-

3.6 Execution Mode

-
-

Modify Section 3.6, "Execution Mode", adding these rows to the table:

-
-
-
- -------- - - - - - - - - - - - - - - - - - - - - - -
Execution ModeExtra OperandsEnabling Capabilities

5088

QuadDerivativesKHR
-The derivative group must be equivalent to the quad group.

QuadControlKHR

5089

RequireFullQuadsKHR
-Helper invocations must be spawned such that all quad groups start with four active invocations. Only valid with the Fragment Execution Model.

QuadControlKHR

-
-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5087

QuadControlKHR
-Uses the QuadDerivativesKHR or RequireFullQuadsKHR execution modes, or the OpGroupNonUniformQuadAllKHR or OpGroupNonUniformQuadAnyKHR instructions.

-
-
-
-
-

3.42.24 Non-Uniform Instructions

-
-

Modify Section 3.42.24, "Non-Uniform Instructions", adding two new instructions:

-
- ------- - - - - - - - - - - - - - -

OpGroupNonUniformQuadAllKHR
-
-Evaluates a predicate for all active invocations in the quad, resulting in true if predicate evaluates to true for all active invocations in the quad, otherwise the result is false.
-
-Result Type must be a Boolean Type.
-
-Predicate must be a Boolean Type.

Capability:
-QuadControlKHR

4

5110

<id>
-Result Type

Result <id>

<id>
-Predicate

- ------- - - - - - - - - - - - - - -

OpGroupNonUniformQuadAnyKHR
-
-Evaluates a predicate for all active invocations in the quad, resulting in true if predicate evaluates to true for any active invocation in the quad, otherwise the result is false.
-
-Result Type must be a Boolean Type.
-
-Predicate must be a Boolean Type.

Capability:
-QuadControlKHR

4

5111

<id>
-Result Type

Result <id>

<id>
-Predicate

-
-
-
-
-
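The two instruction descriptions above reduce a Boolean predicate over the active invocations of a quad. A minimal sketch of that reduction follows, modelling a quad as two parallel sequences; the helper names `quad_all` and `quad_any` and the explicit `active` mask are assumptions made for illustration, since real hardware tracks active invocations implicitly.

```python
from typing import Sequence


def quad_all(predicate: Sequence[bool], active: Sequence[bool]) -> bool:
    """Model of OpGroupNonUniformQuadAllKHR: true only if the predicate
    is true for every active invocation in the quad."""
    return all(p for p, a in zip(predicate, active) if a)


def quad_any(predicate: Sequence[bool], active: Sequence[bool]) -> bool:
    """Model of OpGroupNonUniformQuadAnyKHR: true if the predicate is
    true for at least one active invocation in the quad."""
    return any(p for p, a in zip(predicate, active) if a)
```

Inactive invocations are ignored in both reductions, so a false predicate on an inactive lane does not affect the result. The model assumes at least one active invocation, which always holds in practice because the invocation executing the instruction is itself active.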

Issues

-
-
-

Why do the new quad operations not have execution scopes?

-
-

This parameter was deemed redundant for quad operations; the scope is always the quad.

-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-01-25

Tobias Hector

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_quad_control.html + + +

extensions/KHR/SPV_KHR_quad_control.html

+ + diff --git a/extensions/KHR/SPV_KHR_ray_cull_mask.html b/extensions/KHR/SPV_KHR_ray_cull_mask.html index 1db061d..eca7738 100644 --- a/extensions/KHR/SPV_KHR_ray_cull_mask.html +++ b/extensions/KHR/SPV_KHR_ray_cull_mask.html @@ -1,305 +1,12 @@ - - - - - - - -SPV_KHR_ray_cull_mask - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_ray_cull_mask

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Headers repository: -https://github.com/KhronosGroup/SPIRV-Headers

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Marius Bjorge, ARM

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Members of the Vulkan Ray Tracing TSG

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-02-21

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.5, Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension requires SPV_KHR_ray_tracing.

-
-
-
-
-

Overview

-
-
-

This extension adds functionality to get the ray cull mask for the current intersection.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_ray_cull_mask"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
RayCullMaskKHR
-
-
-
-
-
-

New Builtins

-
-
-

Builtins added under the RayCullMaskKHR capability

-
-
-
-
CullMaskKHR
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ----- - - - - - - - - - - - - - -
DecorationEnabling Capabilities

6021

CullMaskKHR
-Cull mask for the ray being traced in the IntersectionKHR, -AnyHitKHR, ClosestHitKHR, or MissKHR execution models.

-

Refer to the Vulkan API specification for more details.

RayCullMaskKHR

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

6020

RayCullMaskKHR
-Allows the use of CullMaskKHR builtin.

SPV_KHR_ray_cull_mask

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_ray_cull_mask"
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-02-21

Marius Bjorge

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_ray_cull_mask.html + + +

extensions/KHR/SPV_KHR_ray_cull_mask.html

+ + diff --git a/extensions/KHR/SPV_KHR_ray_query.html b/extensions/KHR/SPV_KHR_ray_query.html index 5b5c29d..fbfeedf 100644 --- a/extensions/KHR/SPV_KHR_ray_query.html +++ b/extensions/KHR/SPV_KHR_ray_query.html @@ -1,1780 +1,12 @@ - - - - - - - -SPV_KHR_ray_query - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_ray_query

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Contributors to SPV_KHR_ray_tracing

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Nicolai Haehnle, AMD

    -
  • -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Joshua Barczak, Intel

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Hans-Kristian Arntzen, Valve

    -
  • -
  • -

    David McAllister, Qualcomm

    -
  • -
  • -

    Dae Kim, Imagination

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Aleksandra Krstic, Qualcomm

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Ratified by the Khronos Board 2020-11-20

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-01-13

Revision

14

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.5, Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_KHR_ray_tracing.

-
-
-
-
-

Overview

-
-
-

This extension adds ray query objects to enable ray traversal in any shader stage.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_ray_query"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
RayQueryKHR
-RayTraversalPrimitiveCullingKHR
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the RayQueryKHR capability

-
-
-
-
OpTypeAccelerationStructureKHR
-OpTypeRayQueryKHR
-OpRayQueryInitializeKHR
-OpRayQueryTerminateKHR
-OpRayQueryGenerateIntersectionKHR
-OpRayQueryConfirmIntersectionKHR
-OpRayQueryProceedKHR
-OpRayQueryGetIntersectionTypeKHR
-OpRayQueryGetRayTMinKHR
-OpRayQueryGetRayFlagsKHR
-OpRayQueryGetWorldRayDirectionKHR
-OpRayQueryGetWorldRayOriginKHR
-OpRayQueryGetIntersectionTKHR
-OpRayQueryGetIntersectionInstanceCustomIndexKHR
-OpRayQueryGetIntersectionInstanceIdKHR
-OpRayQueryGetIntersectionInstanceShaderBindingTableRecordOffsetKHR
-OpRayQueryGetIntersectionGeometryIndexKHR
-OpRayQueryGetIntersectionPrimitiveIndexKHR
-OpRayQueryGetIntersectionBarycentricsKHR
-OpRayQueryGetIntersectionFrontFaceKHR
-OpRayQueryGetIntersectionCandidateAABBOpaqueKHR
-OpRayQueryGetIntersectionObjectRayDirectionKHR
-OpRayQueryGetIntersectionObjectRayOriginKHR
-OpRayQueryGetIntersectionObjectToWorldKHR
-OpRayQueryGetIntersectionWorldToObjectKHR
-OpConvertUToAccelerationStructureKHR
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Add the following terminology to section 2.2.2, Types)
-
-
-
-
-

Ray query type: The type returned by OpTypeRayQueryKHR.

-
-
-
-
-
(Add to the list of opaque types in section 2.2.2, Types)
-
-
-
-
-
    -
  • -

    OpTypeAccelerationStructureKHR

    -
  • -
  • -

    OpTypeRayQueryKHR

    -
  • -
-
-
-
-
-
(Modify Section 2.16.1, Universal Validation Rules)
-
-
-
-
-

Change the second bullet under "Any pointer operand to an OpFunctionCall must be" -to include OpTypeAccelerationStructureKHR:

-
-
-
    -
  • -

    a pointer to an element in an array that is a memory object declaration, -where the element type is OpTypeSampler, OpTypeImage, or -OpTypeAccelerationStructureKHR.

    -
  • -
-
-
-

Add a new bullet under "Data rules":

-
-
-
    -
  • -

    Instructions accessing a scalar acceleration structure out of a composite -must only use dynamically-uniform indexes, unless the index is decorated with -NonUniformEXT. They must be in the same block in which their Result <id> -are consumed. Such Result <id> must not appear as operands to OpPhi or -OpSelect instructions, or any instructions other than the ray tracing -instructions specified to operate on them.

    -
  • -
-
-
-
-
-
-
-
-
-
(Add a new sub-section 3.RF, Ray Flags, adding a new table)
-
-
-
-
-

3.RF, Ray Flags

-
-
-

Flags controlling the properties of an OpRayQueryInitializeKHR instruction -or for comparing against the IncomingRayFlagsKHR builtin. -See the client API specification for more details.

-
-
-

Despite being a mask and allowing multiple bits to be combined, -the following combinations are invalid:

-
-
-
    -
  • -

    if more than one of these four bits are set: -OpaqueKHR, NoOpaqueKHR, CullOpaqueKHR, CullNoOpaqueKHR.

    -
  • -
  • -

    if more than one of these three bits are set: SkipTrianglesKHR, -CullBackFacingTrianglesKHR, CullFrontFacingTrianglesKHR.

    -
  • -
  • -

    if more than one of these two bits are set: SkipTrianglesKHR, -SkipAABBsKHR.

    -
  • -
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Ray FlagsEnabling Capabilities

0

NoneKHR
-No flags specified.

RayQueryKHR

1

OpaqueKHR
-Force all intersections with the trace to be opaque.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayQueryKHR

2

NoOpaqueKHR
-Force all intersections with the trace to be non-opaque.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayQueryKHR

4

TerminateOnFirstHitKHR
-Accept the first hit discovered.
-See the Ray Closest Hit Determination section in the Vulkan API specification.

RayQueryKHR

8

SkipClosestHitShaderKHR
-Do not execute a closest hit shader.
-See the Ray Result Determination section in the Vulkan API specification.

RayQueryKHR

16

CullBackFacingTrianglesKHR
-Do not intersect with the back face of triangles.
-See the Ray Face Culling section in the Vulkan API specification.

RayQueryKHR

32

CullFrontFacingTrianglesKHR
-Do not intersect with the front face of triangles.
-See the Ray Face Culling section in the Vulkan API specification.

RayQueryKHR

64

CullOpaqueKHR
-Do not intersect with opaque geometry.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayQueryKHR

128

CullNoOpaqueKHR
-Do not intersect with non-opaque geometry.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayQueryKHR

256

SkipTrianglesKHR
-Do not intersect with any triangle geometries. -See the Ray Primitive Culling section in the Vulkan API specification.

RayTraversalPrimitiveCullingKHR

512

SkipAABBsKHR
-Do not intersect with any AABB geometries. -See the Ray Primitive Culling section in the Vulkan API specification.

RayTraversalPrimitiveCullingKHR

-
-
-
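The three invalid combinations listed for Ray Flags amount to "at most one bit set" within each of three groups. A minimal sketch of that check is shown below, using the numeric values from the Ray Flags table; the constant spellings and the function name `ray_flags_valid` are assumptions made for this example and are not taken from any header.

```python
# Ray Flags values as listed in the table (KHR suffix dropped for brevity).
OPAQUE = 1
NO_OPAQUE = 2
TERMINATE_ON_FIRST_HIT = 4
SKIP_CLOSEST_HIT_SHADER = 8
CULL_BACK_FACING_TRIANGLES = 16
CULL_FRONT_FACING_TRIANGLES = 32
CULL_OPAQUE = 64
CULL_NO_OPAQUE = 128
SKIP_TRIANGLES = 256
SKIP_AABBS = 512

# Each group may contribute at most one set bit to a valid mask.
EXCLUSIVE_GROUPS = (
    OPAQUE | NO_OPAQUE | CULL_OPAQUE | CULL_NO_OPAQUE,
    SKIP_TRIANGLES | CULL_BACK_FACING_TRIANGLES | CULL_FRONT_FACING_TRIANGLES,
    SKIP_TRIANGLES | SKIP_AABBS,
)


def ray_flags_valid(mask: int) -> bool:
    """Return True if no exclusive group has more than one bit set."""
    return all(bin(mask & group).count("1") <= 1 for group in EXCLUSIVE_GROUPS)
```

Flags outside the three groups, such as TerminateOnFirstHitKHR and SkipClosestHitShaderKHR, combine freely with anything. The sketch checks only the mutual-exclusion rules; it does not check that the enabling capability for each flag is declared.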
-
-
-
-
-
(Add a new sub-section 3.RQIntersection, Ray Query Intersection, adding a new table)
-
-
-
-
-

Identifies which intersection should be returned from a ray query.

-
- ----- - - - - - - - - - - - - - - - - - - -
Ray Query IntersectionEnabling Capabilities

0

RayQueryCandidateIntersectionKHR
-Identifies the current candidate intersection being considered, valid when OpRayQueryProceedKHR returns true.

RayQueryKHR

1

RayQueryCommittedIntersectionKHR
-Identifies the last intersection committed that is being considered, valid when OpRayQueryGetIntersectionTypeKHR, called with RayQueryCommittedIntersectionKHR, does not return RayQueryCommittedIntersectionNoneKHR.

RayQueryKHR

-
-
-
-
-
-
-
-
(Add a new sub-section 3.RQCommitted, Ray Query Committed Intersection Type, adding a new table)
-
-
-
-
-

Describes the type of the intersection currently committed in a ray query.

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Ray Query Committed Intersection TypeEnabling Capabilities

0

RayQueryCommittedIntersectionNoneKHR
-No intersection is committed.

RayQueryKHR

1

RayQueryCommittedIntersectionTriangleKHR
-An intersection with a triangle has been committed.

RayQueryKHR

2

RayQueryCommittedIntersectionGeneratedKHR
-A user-generated intersection has been committed.

RayQueryKHR

-
-
-
-
-
-
-
-
(Add a new sub-section 3.RQCandidate, Ray Query Candidate Intersection Type, adding a new table)
-
-
-
-
-

Describes the type of the intersection which is currently the candidate in a ray query.

-
- ----- - - - - - - - - - - - - - - - - - - -
Ray Query Candidate Intersection TypeEnabling Capabilities

0

RayQueryCandidateIntersectionTriangleKHR
-A potential intersection with a triangle is being considered.

RayQueryKHR

1

RayQueryCandidateIntersectionAABBKHR
-A potential intersection with an axis-aligned bounding box is being considered.

RayQueryKHR

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

4472

RayQueryKHR
-Uses Ray Query

Shader

4478

RayTraversalPrimitiveCullingKHR
-Uses SkipAABBsKHR or SkipTrianglesKHR

RayQueryKHR

-
-
-
-
(Modify Section 3.36.6, Type-Declaration Instructions, adding two new tables)
-
-
-
- ----- - - - - - - - - - - - -

OpTypeAccelerationStructureKHR
-
-Declares an acceleration structure type which is an opaque reference to -acceleration structure handle as defined in the client API -specification.

-

Consumed by OpRayQueryInitializeKHR and -OpTraceRayKHR

-

This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-RayQueryKHR

2

5341

<id> Result

- ----- - - - - - - - - - - - -

OpTypeRayQueryKHR
-
-Declares a ray query type which is an opaque object representing a ray traversal.
-
-This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-RayQueryKHR

2

4472

<id> Result

-
-
-
-
(Add the following line to the description of OpTypePointer, in Section 3.32.6, Type-Declaration Instructions)
-
-
-
-
-

If Type is OpTypeRayQueryKHR, Storage Class must be Private or Function.

-
-
-
-
-
(Add the following line to the description of OpStore and OpLoad, in Section 3.32.8, Memory Instructions)
-
-
-
-
-

The Type operand to the OpTypePointer used for Pointer must not be OpTypeRayQueryKHR.

-
-
-
-
-
(Add the following line to the description of OpCopyMemory and OpCopyMemorySized, in Section 3.32.8, Memory Instructions)
-
-
-
-
-

The Type operand to the OpTypePointer used for Target or Source must not be OpTypeRayQueryKHR.

-
-
-
-
-
(Modify Section 3.36.11, Conversion Instructions, adding a new table)
-
-
-
- ------- - - - - - - - - - - - - - -

OpConvertUToAccelerationStructureKHR
-
- Converts a 64-bit integer into an OpTypeAccelerationStructureKHR.
-
- Acceleration Structure must either be a 64-bit scalar of integer type, whose Signedness operand is 0, or a 2-component vector of 32-bit integer type, whose Signedness operand is 0.
- A vector value input behaves as-if OpBitcast converts the value to a 64-bit scalar integer first.
- Acceleration Structure represents the address of a valid acceleration structure.
- Refer to the client API specification for details.
-
-Result Type must be an OpTypeAccelerationStructureKHR. -

Capability:
-RayQueryKHR

4

4447

<id> Result Type

<id> Result

<id> Acceleration Structure

-
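The operand rule for OpConvertUToAccelerationStructureKHR allows either a 64-bit unsigned scalar or a 2-component vector of 32-bit unsigned integers that behaves as if it were bitcast to a 64-bit scalar. The sketch below models that equivalence. The function name `as_64bit_address` is hypothetical, and the component ordering (component 0 supplies the low-order 32 bits) is taken from the OpBitcast rule in the core SPIR-V specification rather than from this extension.

```python
from typing import Tuple, Union

U32_MAX = 0xFFFFFFFF
U64_MAX = 0xFFFFFFFFFFFFFFFF


def as_64bit_address(value: Union[int, Tuple[int, int]]) -> int:
    """Return the 64-bit address represented by either operand form."""
    if isinstance(value, int):
        # Scalar form: already a 64-bit unsigned integer.
        if not 0 <= value <= U64_MAX:
            raise ValueError("scalar operand must fit in 64 unsigned bits")
        return value
    low, high = value
    # Vector form: each component is a 32-bit unsigned integer.
    if not (0 <= low <= U32_MAX and 0 <= high <= U32_MAX):
        raise ValueError("vector components must fit in 32 unsigned bits")
    # Assumed OpBitcast ordering: lower-numbered component, lower-order bits.
    return (high << 32) | low
```

Both forms of the same address produce the same integer, which is the sense in which the vector input "behaves as-if" it had been converted first.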
-
-
-
(Add a new sub section 3.36.RQInstructions, Ray Query Instructions)
-
-
-
- ------------ - - - - - - - - - - - - - - - - - - -

OpRayQueryInitializeKHR
-
- Initialize a ray query object, defining parameters of traversal. After this call, a new ray trace can be performed with OpRayQueryProceedKHR. Any previous traversal state stored in the object is lost.
-
- Ray Query is a pointer to the ray query to initialize.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the Ray Flag values.
-
- Cull Mask is the mask to test against the instance mask.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Ray Flags and Cull Mask must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit floating-point type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit floating-point type scalar.

Capability:
-RayQueryKHR

9

4473

<id> Ray Query

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

- ----- - - - - - - - - - - - -

OpRayQueryTerminateKHR
-
- Terminates further execution of a ray query; further calls to OpRayQueryProceedKHR will return false. - Refer to the client API specification for more details.
-
- Ray Query is a pointer to the ray query to terminate.
-
- Behavior is undefined if the value returned by any prior execution of OpRayQueryProceedKHR with the same ray query object was not true.

Capability:
-RayQueryKHR

2

4474

<id> Ray Query

- ------ - - - - - - - - - - - - -

OpRayQueryGenerateIntersectionKHR
-
- Adds a candidate generated intersection to the ray query to be included in the determination of the closest hit for a ray query.
-
- Ray Query is a pointer to the ray query to generate an intersection candidate for.
-
- Hit T is the floating point parametric value along ray for the intersection.
-
- Hit T must be a 32-bit floating-point type scalar.
-
- Behavior is undefined if OpRayQueryProceedKHR was not executed on the same ray query object, - or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. -
- Behavior is undefined if the Ray Query Candidate Intersection Type is not RayQueryCandidateIntersectionAABBKHR.

Capability:
-RayQueryKHR

3

4475

<id> Ray Query

<id> Hit T

- ----- - - - - - - - - - - - -

OpRayQueryConfirmIntersectionKHR
-
- Confirms a triangle intersection to be included in the determination of the closest hit for a ray query.
-
- Ray Query is a pointer to the ray query to confirm the hit to.
-
- Behavior is undefined if OpRayQueryProceedKHR was not executed on the same ray query object, - or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. -
- Ray Query Candidate Intersection Type must be RayQueryCandidateIntersectionTriangleKHR.

Capability:
-RayQueryKHR

2

4476

<id> Ray Query

- ------- - - - - - - - - - - - - - -

OpRayQueryProceedKHR
-
- Allow traversal to proceed. Returns true if traversal is incomplete, and false when it has completed.
-
- Ray Query is a pointer to the ray query to continue traversal on.
-
- Behavior is undefined if a previous call to OpRayQueryProceedKHR with the same ray query object returned false. -
- Result Type must be a Boolean type.

Capability:
-RayQueryKHR

4

4477

<id> Result Type

Result <id>

<id> Ray Query

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionTypeKHR
-
- Returns the type of the current candidate or committed intersection.
-
- Result describes the type of the intersection in the ray query object.
- If Intersection is RayQueryCandidateIntersectionKHR, it returns one of the Ray Query Candidate Intersection Types.
- If Intersection is RayQueryCommittedIntersectionKHR, it returns one of the Ray Query Committed Intersection Types.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true.

Capability:
-RayQueryKHR

5

4479

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- ------- - - - - - - - - - - - - - -

OpRayQueryGetRayTMinKHR
-
- Returns the Ray Tmin value used by the ray query.
-
- Result returns the Ray Tmin value used by the ray query.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Ray Query is a pointer to the ray query object.

Capability:
-RayQueryKHR

4

6016

<id> Result Type

Result <id>

<id> Ray Query

- ------- - - - - - - - - - - - - - -

OpRayQueryGetRayFlagsKHR
-
- Returns the Ray Flags used by the ray query.
-
- Result returns the Ray Flag values used by the ray query.
-
- Result Type must be a 32-bit integer type scalar.
-
- Ray Query is a pointer to the ray query object.

Capability:
-RayQueryKHR

4

6017

<id> Result Type

Result <id>

<id> Ray Query

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionTKHR
-
- Gets the T value for the current or previous intersection considered in a ray query.
-
- Result is the returned T value.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR was not executed on - the same ray query object, if the last value returned by such an execution of OpRayQueryProceedKHR was not true, or the - current intersection candidate does not have a Ray Query Candidate Intersection Type of RayQueryCandidateIntersectionTriangleKHR. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed intersection - (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6018

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionInstanceCustomIndexKHR
-
- Gets the custom index of the instance for the current intersection considered in a ray query.
-
- Result is the returned custom instance index.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6019

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionInstanceIdKHR
-
- Gets the id of the instance for the current intersection considered in a ray query.
-
- Result is the returned instance id.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6020

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionInstanceShaderBindingTableRecordOffsetKHR
-
- Gets the shader binding table record offset for the current intersection considered in a ray query.
-
- Result is the returned shader binding table record offset.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6021

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionGeometryIndexKHR
-
- Gets the geometry index for the current intersection considered in a ray query.
-
- Result is the returned geometry index.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6022

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection
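As an illustration of how the two numbers in the instruction table above (5 and 6022) are used, the following sketch packs them into the first word of an encoded instruction. It assumes the general SPIR-V binary layout, in which the high 16 bits of the first word hold the word count and the low 16 bits hold the opcode. The function names and the example `<id>` values are hypothetical and not part of this extension.

```python
# Sketch only: helper names and <id> values are illustrative assumptions.
OP_RAY_QUERY_GET_INTERSECTION_GEOMETRY_INDEX_KHR = 6022  # opcode from the table
WORD_COUNT = 5  # header word + Result Type + Result + Ray Query + Intersection


def pack_header(word_count, opcode):
    """Pack the first instruction word: word count in the high 16 bits, opcode in the low 16."""
    if not (0 < word_count <= 0xFFFF and 0 <= opcode <= 0xFFFF):
        raise ValueError("word count and opcode must each fit in 16 bits")
    return (word_count << 16) | opcode


def unpack_header(word):
    """Split the first instruction word back into (word count, opcode)."""
    return word >> 16, word & 0xFFFF


def encode_get_geometry_index(result_type, result, ray_query, intersection):
    """Return the five 32-bit words of the instruction, operands in table order."""
    header = pack_header(WORD_COUNT, OP_RAY_QUERY_GET_INTERSECTION_GEOMETRY_INDEX_KHR)
    return [header, result_type, result, ray_query, intersection]


words = encode_get_geometry_index(result_type=7, result=42, ray_query=12, intersection=9)
```

The other ray query instructions in this section follow the same pattern, with a word count of 4 for those that take no Intersection operand.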

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionPrimitiveIndexKHR
-
- Gets the primitive index for the current intersection considered in a ray query.
-
- Result is the returned primitive index.
-
- Result Type must be a 32-bit integer type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6023

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionBarycentricsKHR
-
- Gets the second and third barycentric coordinates of the current intersection considered in a ray query against the primitive it hit.
-
- Result is the returned barycentric coordinates.
-
- Result Type must be a 32-bit floating-point type 2-component vector.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR was not executed on - the same ray query object, if the last value returned by such an execution of OpRayQueryProceedKHR was not true, or the - current intersection candidate does not have a Ray Query Candidate Intersection Type of RayQueryCandidateIntersectionTriangleKHR. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed intersection - (see OpRayQueryGetIntersectionTypeKHR), or if the Ray Query Committed Intersection Type is not RayQueryCommittedIntersectionTriangleKHR.

Capability:
-RayQueryKHR

5

6024

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection
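Because only the second and third barycentric coordinates are returned, a consumer that needs all three has to reconstruct the first. The sketch below assumes the usual convention that the three coordinates sum to 1, and that they weight the triangle's vertices in order; the vertex ordering itself is defined by the client API, and the function names are hypothetical.

```python
# Sketch only: assumes the three barycentric coordinates sum to 1.
def full_barycentrics(uv):
    """Expand the returned 2-component vector into all three weights."""
    u, v = uv
    return (1.0 - u - v, u, v)


def interpolate(attr0, attr1, attr2, uv):
    """Blend three per-vertex attributes (equal-length tuples) at the hit point."""
    w0, w1, w2 = full_barycentrics(uv)
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(attr0, attr1, attr2))


hit_value = interpolate((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 8.0, 0.0), (0.25, 0.5))
```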

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionFrontFaceKHR
-
- Gets a boolean indicating whether the current intersection considered in a ray query was with the front face or back face of a primitive.
-
- Result is true if the intersection was with the front face of a primitive, or false otherwise.
-
- Result Type must be a boolean type scalar.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR was not executed on - the same ray query object, if the last value returned by such an execution of OpRayQueryProceedKHR was not true, or the - current intersection candidate does not have a Ray Query Candidate Intersection Type of RayQueryCandidateIntersectionTriangleKHR. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed intersection - (see OpRayQueryGetIntersectionTypeKHR), or if the Ray Query Committed Intersection Type is not RayQueryCommittedIntersectionTriangleKHR.

Capability:
-RayQueryKHR

5

6025

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- ------- - - - - - - - - - - - - - -

OpRayQueryGetIntersectionCandidateAABBOpaqueKHR
-
- Gets a boolean indicating whether a candidate intersection considered in a ray query was with an opaque AABB or not.
-
- Result is true if the intersection was with an opaque AABB, or false otherwise.
-
- Result Type must be a boolean type scalar.
-
- Ray Query is a pointer to the ray query object.

Capability:
-RayQueryKHR

4

6026

<id> Result Type

Result <id>

<id> Ray Query

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionObjectRayDirectionKHR
-
- Gets the object-space ray direction for the current intersection considered in a ray query.
-
- Result is the returned ray direction.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6027

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionObjectRayOriginKHR
-
- Gets the object-space ray origin for the current intersection considered in a ray query.
-
- Result is the returned ray origin.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6028

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

- ------- - - - - - - - - - - - - - -

OpRayQueryGetWorldRayDirectionKHR
-
- Gets the world-space direction for the ray traced in a ray query.
-
- Result is the returned ray direction.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Ray Query is a pointer to the ray query object.

Capability:
-RayQueryKHR

4

6029

<id> Result Type

Result <id>

<id> Ray Query

- ------- - - - - - - - - - - - - - -

OpRayQueryGetWorldRayOriginKHR
-
- Gets the world-space origin for the ray traced in a ray query.
-
- Result is the returned ray origin.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Ray Query is a pointer to the ray query object.

Capability:
-RayQueryKHR

4

6030

<id> Result Type

Result <id>

<id> Ray Query

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionObjectToWorldKHR
-
- Gets a matrix that transforms values to world-space from the object-space of the current intersection considered in a ray query.
-
- Result is the returned matrix.
-
- Result Type must be a matrix with a Column Count of 4, and a Column Type that is a vector type with a Component Type that is a 32-bit floating-point type and a Component Count of 3.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6031

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection
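To make the matrix shape concrete, the following sketch applies a matrix with 4 columns of 3 components each to a point and to a direction. It assumes the common affine convention in which the fourth column carries the translation, so a point is treated as having an implicit fourth component of 1 and a direction a fourth component of 0. The function names are hypothetical.

```python
# Sketch only: assumes column 3 is the translation of an affine transform.
def transform_point(columns, p):
    """Apply a 4-column, 3-row matrix to a point (implicit w = 1)."""
    c0, c1, c2, c3 = columns
    x, y, z = p
    return tuple(c0[i] * x + c1[i] * y + c2[i] * z + c3[i] for i in range(3))


def transform_direction(columns, d):
    """Apply the same matrix to a direction (implicit w = 0, so no translation)."""
    c0, c1, c2, _ = columns
    x, y, z = d
    return tuple(c0[i] * x + c1[i] * y + c2[i] * z for i in range(3))


# Identity rotation with a translation of (1, 2, 3).
object_to_world = ((1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3))
```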

- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionWorldToObjectKHR
-
- Gets a matrix that transforms values from world-space to the object-space of the current intersection considered in a ray query.
-
- Result is the returned matrix.
-
- Result Type must be a matrix with a Column Count of 4, and a Column Type that is a vector type with a Component Type that is a 32-bit floating-point type and a Component Count of 3.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection values should be returned for, either the current candidate or the closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. - If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryGetIntersectionTypeKHR).

Capability:
-RayQueryKHR

5

6032

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_ray_query"
-
-
-
-
-
-

Interactions with SPV_KHR_ray_tracing

-
-
-

OpTypeAccelerationStructureKHR, RayTraversalPrimitiveCullingKHR, -OpConvertUToAccelerationStructureKHR and the Ray Flags are added by both -this extension and SPV_KHR_ray_tracing; they -are intended to have identical definitions, and can be enabled by either -extension’s capability, for use with the instructions under that same -capability. -If SPV_KHR_ray_tracing is not supported, ignore any references to -OpTraceRayKHR.

-
-
-
-
-

Issues

-
-
-

1) What are the differences between provisional and final?

-
-
-

Discussion:

-
-
-
    -
  • -

    rename OpTypeRayQueryProvisionalKHR to OpTypeRayQueryKHR as was originally -intended. Seems like it fell victim to an overreaching search and replace -when this was made provisional.

    -
  • -
  • -

    change RayQueryProvisionalKHR to RayQueryKHR and assign new -token (4472)

    -
  • -
  • -

    add OpConvertUToAccelerationStructureKHR (4447) instruction to convert -64-bit integer to OpTypeAccelerationStructureKHR to enable query by handle

    -
  • -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-12-05

Tobias Hector

First draft

2

2019-12-11

Daniel Koch

add Provisional string to capabilities.

3

2020-03-06

Ashwin Lele

Reorder operands and rename builtins (!170)

4

2020-04-14

Jeff Bolz

Fix return type of OpRayQueryGetIntersectionTKHR

5

2020-06-03

Daniel Koch

Update capabilities tables to match SPIR-V 1.5.

6

2020-06-05

Hans-Kristian Arntzen

Add conversion from 64-bit acceleration structure pointer - to OpTypeAccelerationStructureKHR

7

2020-06-12

Daniel Koch

rename OpTypeRayQueryProvisionalKHR → OpTypeRayQueryKHR - refactored common code to include files

8

2020-07-03

Daniel Koch

Remove provisional notices and update capabilities

9

2020-07-10

Tobias Hector

Clarify that subset of bits are used for cull mask

10

2020-07-22

David McAllister

Disallow querying candidate T value for AABB primitives (!191)

11

2021-02-19

Dae Kim

Fix barycentric coordinates retrieved for - OpRayQueryGetIntersectionBarycentricsKHR (!203)

12

2021-09-08

Daniel Koch

replace references to nonexistent OpRayQueryCommittedTypeKHR (GH#128)

13

2022-05-27

Daniel Koch

disallow more combinations of ray flags (vk-gl-cts#3647)

14

2023-01-13

Daniel Koch

Follow SPIR-V conventions for undefined behavior.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_ray_query.html + + +

extensions/KHR/SPV_KHR_ray_query.html

+ + diff --git a/extensions/KHR/SPV_KHR_ray_tracing.html b/extensions/KHR/SPV_KHR_ray_tracing.html index c7adbee..8557f22 100644 --- a/extensions/KHR/SPV_KHR_ray_tracing.html +++ b/extensions/KHR/SPV_KHR_ray_tracing.html @@ -1,1573 +1,12 @@ - - - - - - - -SPV_KHR_ray_tracing - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_ray_tracing

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Nicolai Haehnle, AMD

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Joshua Barczak, Intel

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
  • -

    Hans-Kristian Arntzen, Valve

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Ratified by the Khronos Board 2020-11-20

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-08-17

Revision

24

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension interacts with SPV_KHR_ray_query.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_KHR_ray_tracing_pipeline extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_ray_tracing"
-
-
-
-
-
-

New Execution Models

-
-
-

This extension introduces new execution models:

-
-
-
-
RayGenerationKHR
-IntersectionKHR
-AnyHitKHR
-ClosestHitKHR
-MissKHR
-CallableKHR
-
-
-
-

these depend on the RayTracingKHR capability.

-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
RayTracingKHR
-RayTraversalPrimitiveCullingKHR
-
-
-
-
-
-

New Storage Classes

-
-
-

Storage classes added under the RayTracingKHR capability

-
-
-
-
RayPayloadKHR
-IncomingRayPayloadKHR
-HitAttributeKHR
-CallableDataKHR
-IncomingCallableDataKHR
-ShaderRecordBufferKHR
-
-
-
-
-
-

New Builtins

-
-
-

Builtins added under the RayTracingKHR capability

-
-
-
-
LaunchIdKHR
-LaunchSizeKHR
-InstanceCustomIndexKHR
-RayGeometryIndexKHR
-WorldRayOriginKHR
-WorldRayDirectionKHR
-ObjectRayOriginKHR
-ObjectRayDirectionKHR
-RayTminKHR
-RayTmaxKHR
-ObjectToWorldKHR
-WorldToObjectKHR
-HitKindKHR
-IncomingRayFlagsKHR
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the RayTracingKHR capability

-
-
-
-
OpReportIntersectionKHR
-OpIgnoreIntersectionKHR
-OpTerminateRayKHR
-OpTraceRayKHR
-OpTypeAccelerationStructureKHR
-OpExecuteCallableKHR
-OpConvertUToAccelerationStructureKHR
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify Section 2.2.1, Instructions )
-
-
-

Shader Call Instruction: An instruction which may cause execution to -continue elsewhere by creating one or more invocations that execute -other shaders. The OpTraceRayKHR, -OpExecuteCallableKHR, and -OpReportIntersectionKHR instructions are -shader call instructions.

-
-
-
(Modify Section 2.2.2, Types )
-
-
-

add OpTypeAccelerationStructureKHR to list of opaque types

-
-
-
(Modify Section 2.2.5, Control Flow)
-
-
-

Add OpIgnoreIntersectionKHR and OpTerminateRayKHR to the list of Termination Instructions.

-
-
-
(Modify Section 2.16.1, Universal Validation Rules)
-
-

Modify the list following the statement:

-
-
-
-

A pointer operand to an OpFunctionCall must point into one of the following -storage classes:

-
-
-
-
-

to include ShaderRecordBufferKHR.

-
-
-
-
-

Change the second bullet under "Any pointer operand to an OpFunctionCall must be" -to include OpTypeAccelerationStructureKHR:

-
-
-
    -
  • -

    a pointer to an element in an array that is a memory object declaration, -where the element type is OpTypeSampler, OpTypeImage, or -OpTypeAccelerationStructureKHR.

    -
  • -
-
-
-

Add a new bullet under "Data rules":

-
-
-
    -
  • -

    Instructions accessing a scalar acceleration structure out of a composite -must only use dynamically-uniform indexes, unless the index is decorated with -NonUniformEXT. They must be in the same block in which their Result <id> -are consumed. Such Result <id> must not appear as operands to OpPhi or -OpSelect instructions, or any instructions other than the ray tracing -instructions specified to operate on them.

    -
  • -
-
-
-
-
-

Modify the item under "Memory model":

-
-
-
-
-

Memory accesses that use NonPrivatePointer must use pointers in the -Uniform, Workgroup, CrossWorkgroup, Generic, Image, or -StorageBuffer storage classes.

-
-
-
-
-

to include ShaderRecordBufferKHR.

-
-
-
(Modify Section 2.16.2, Universal Rules for Shader Capabilities)
-
-

Modify the item:

-
-
-
-
    -
  • -

    Composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, -and PushConstant Storage Classes must be explicitly laid out. …​

    -
  • -
-
-
-
-
-

to include ShaderRecordBufferKHR.

-
-
-
(Modify Section 3.3, Execution Model, adding rows to the Execution Model table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution Model | Enabling Capabilities

5313

RayGenerationKHR
-Ray generation shading stage.

RayTracingKHR

5314

IntersectionKHR
-Intersection shading stage.

RayTracingKHR

5315

AnyHitKHR
-Any hit shading stage.

RayTracingKHR

5316

ClosestHitKHR
-Closest hit shading stage.

RayTracingKHR

5317

MissKHR
-Miss shading stage.

RayTracingKHR

5318

CallableKHR
-Ray callable shading stage.

RayTracingKHR

-
-
-
-
(Modify Section 3.7, Storage Class, adding rows to the Storage Class table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Storage Class | Enabling Capabilities | Enabled by Extension

5328

CallableDataKHR
-Used for storing arbitrary data associated with a ray to pass to callables. -Visible across all functions in the current invocation. Not shared externally. Variables declared -with this storage class can be both read and written to, but cannot have initializers. -Only allowed in RayGenerationKHR, ClosestHitKHR, CallableKHR, and MissKHR execution models.

RayTracingKHR

SPV_KHR_ray_tracing

5329

IncomingCallableDataKHR
-Used for storing arbitrary data from parent sent to current callable stage invoked from -OpExecuteCallableKHR. Visible across all functions in current invocation. Not shared externally. -Variables declared with the storage class are allowed only in CallableKHR execution models. -Can be both read and written to in above execution models, but cannot have initializers.

RayTracingKHR

SPV_KHR_ray_tracing

5338

RayPayloadKHR
-Used for storing payload data associated with a ray. Visible across all functions in -the current invocation. Not shared externally. Variables declared -with this storage class can be both read and written to, but cannot have initializers. -Only allowed in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

RayTracingKHR

SPV_KHR_ray_tracing

5339

HitAttributeKHR
-Used for storing attributes of geometry intersected by a ray. Visible across all -functions in the current invocation. Not shared externally. Variables declared with this -storage class are allowed only in IntersectionKHR, AnyHitKHR and ClosestHitKHR execution models. -They can be written to only in IntersectionKHR execution model and read from only -in AnyHitKHR and ClosestHitKHR execution models. They cannot have initializers.

RayTracingKHR

SPV_KHR_ray_tracing

5342

IncomingRayPayloadKHR
-Used for storing parent payload data associated with a ray in current stage invoked from -a trace call. Visible across all functions in current invocation. Not shared externally. -Variables declared with the storage class are allowed only in AnyHitKHR, ClosestHitKHR and -MissKHR execution models. Can be both read and written to in above execution models, but -cannot have initializers.

RayTracingKHR

SPV_KHR_ray_tracing

5343

ShaderRecordBufferKHR
-Used for storing data in the shader record associated with each unique shader in a ray tracing -pipeline. Visible across all functions in current invocation. Can be initialized externally via API. -Variables declared with this storage class are allowed in RayGenerationKHR, IntersectionKHR, -AnyHitKHR, ClosestHitKHR, MissKHR and CallableKHR execution models, are read-only, -and cannot have initializers. Refer to the client API for details on shader records.

RayTracingKHR

SPV_KHR_ray_tracing
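The execution-model restrictions in the Storage Class table above can be summarized as a lookup table. The sketch below transcribes the "allowed only in" clauses from each row; the dictionary and function names are hypothetical, and the read/write restrictions (for example, HitAttributeKHR being writable only in IntersectionKHR) are not modeled.

```python
# Sketch only: transcribes the allowed execution models from the table rows above.
ALL_MODELS = {
    "RayGenerationKHR", "IntersectionKHR", "AnyHitKHR",
    "ClosestHitKHR", "MissKHR", "CallableKHR",
}

ALLOWED_MODELS = {
    "CallableDataKHR": {"RayGenerationKHR", "ClosestHitKHR", "CallableKHR", "MissKHR"},
    "IncomingCallableDataKHR": {"CallableKHR"},
    "RayPayloadKHR": {"RayGenerationKHR", "ClosestHitKHR", "MissKHR"},
    "HitAttributeKHR": {"IntersectionKHR", "AnyHitKHR", "ClosestHitKHR"},
    "IncomingRayPayloadKHR": {"AnyHitKHR", "ClosestHitKHR", "MissKHR"},
    "ShaderRecordBufferKHR": set(ALL_MODELS),
}


def storage_class_allowed(storage_class, execution_model):
    """Return True if a variable of this storage class may appear in the execution model."""
    return execution_model in ALLOWED_MODELS[storage_class]
```

A validator would typically apply such a check to every variable reachable from each entry point.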

-
-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Decoration | Enabling Capabilities

5319

LaunchIdKHR
-Index of work item being processed in current invocation of ray tracing shader stage. -Allowed in all ray tracing execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5320

LaunchSizeKHR
-Width and height dimensions passed to vkCmdTraceRaysKHR call which resulted in invocation of -current ray tracing shader stage. Allowed in all ray tracing execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5321

WorldRayOriginKHR
-World-space origin coordinates for the ray being traced in the IntersectionKHR, -AnyHitKHR, ClosestHitKHR, or MissKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5322

WorldRayDirectionKHR
-World-space direction for the ray being traced in the IntersectionKHR, -AnyHitKHR, ClosestHitKHR, or MissKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5323

ObjectRayOriginKHR
-Object-space origin coordinates for the ray being traced in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5324

ObjectRayDirectionKHR
-Object-space direction for the ray being traced in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5325

RayTminKHR
-The current Tmin parametric value for the ray being traced in the IntersectionKHR, -AnyHitKHR, ClosestHitKHR, or MissKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5326

RayTmaxKHR
-The current Tmax parametric value for the ray being traced in the IntersectionKHR, -AnyHitKHR, ClosestHitKHR, or MissKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5327

InstanceCustomIndexKHR
-Application specified value associated with the instance that was hit by the current ray in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5330

ObjectToWorldKHR
-The 4x3 object to world transformation matrix for the ray being traced in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5331

WorldToObjectKHR
-The 4x3 world to object transformation matrix for the ray being traced in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5333

HitKindKHR
-The hit kind of the hit for the ray being traced in the AnyHitKHR or -ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5351

IncomingRayFlagsKHR
-The ray flags in the current stage as passed in through the trace call in the parent. Available in the AnyHitKHR, -ClosestHitKHR, IntersectionKHR, and MissKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

5352

RayGeometryIndexKHR
-Implementation defined index corresponding to the geometry that was hit by the current ray in the IntersectionKHR, -AnyHitKHR, or ClosestHitKHR execution models.
-
-Refer to the client API specification for more details.

RayTracingKHR

-
-
-
-
(Modify the definition of following BuiltIns, allowing them to be used in IntersectionKHR, AnyHitKHR, or ClosestHitKHR Execution Models.)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities | Enabled by Extension

6

InstanceId
-Input Instance identifier. See the client API specifications -for more detail.

Instance ID in a Vertex Execution Model

Shader

Instance ID in an IntersectionKHR, AnyHitKHR, or ClosestHitKHR Execution Model

RayTracingKHR

SPV_KHR_ray_tracing

7

PrimitiveId
-Primitive identifier. See the client API specifications for more detail.

Primitive ID in a Geometry Execution Model

Geometry

Primitive ID in a Tessellation Execution Model

Tessellation

Primitive ID in an IntersectionKHR, AnyHitKHR, or ClosestHitKHR Execution Model

RayTracingKHR

SPV_KHR_ray_tracing

-
-
-
-
(Modify Section 3.25, Memory Semantics <id>)
-
-

Modify the UniformMemory cell:

-
-
-
-

Apply the memory-ordering constraints to StorageBuffer, PhysicalStorageBuffer, -or Uniform Storage Class memory.

-
-
-
-
-

to include ShaderRecordBufferKHR.

-
-
-
(Modify Section 3.27, Scope <id>, adding a new row to the Scope table)
-
-
-
- ----- - - - - - - - - - - - - - -
Scope | Enabling Capabilities

6

ShaderCallKHR
-Scope is the set of invocations that are shader-call-related in a ray tracing -Execution Model. See the client API specification for details.

RayTracingKHR

-
-
-
-
-
-
-
-
(Add a new sub-section 3.RF, Ray Flags, adding a new table)
-
-
-
-
-

3.RF, Ray Flags

-
-
-

Flags controlling the properties of an OpTraceRayKHR instruction -or for comparing against the IncomingRayFlagsKHR builtin. -See the client API specification for more details.

-
-
-

Despite being a mask and allowing multiple bits to be combined, -the following combinations are invalid:

-
-
-
    -
  • -

    if more than one of these four bits are set: -OpaqueKHR, NoOpaqueKHR, CullOpaqueKHR, CullNoOpaqueKHR.

    -
  • -
  • -

    if more than one of these three bits are set: SkipTrianglesKHR, -CullBackFacingTrianglesKHR, CullFrontFacingTrianglesKHR.

    -
  • -
  • -

    if more than one of these two bits are set: SkipTrianglesKHR, -SkipAABBsKHR.

    -
  • -
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Ray Flags | Enabling Capabilities

0

NoneKHR
-No flags specified.

RayTracingKHR

1

OpaqueKHR
-Force all intersections with the trace to be opaque.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayTracingKHR

2

NoOpaqueKHR
-Force all intersections with the trace to be non-opaque.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayTracingKHR

4

TerminateOnFirstHitKHR
-Accept the first hit discovered.
-See the Ray Closest Hit Determination section in the Vulkan API specification.

RayTracingKHR

8

SkipClosestHitShaderKHR
-Do not execute a closest hit shader.
-See the Ray Result Determination section in the Vulkan API specification.

RayTracingKHR

16

CullBackFacingTrianglesKHR
-Do not intersect with the back face of triangles.
-See the Ray Face Culling section in the Vulkan API specification.

RayTracingKHR

32

CullFrontFacingTrianglesKHR
-Do not intersect with the front face of triangles.
-See the Ray Face Culling section in the Vulkan API specification.

RayTracingKHR

64

CullOpaqueKHR
-Do not intersect with opaque geometry.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayTracingKHR

128

CullNoOpaqueKHR
-Do not intersect with non-opaque geometry.
-See the Ray Opacity Culling section in the Vulkan API specification.

RayTracingKHR

256

SkipTrianglesKHR
-Do not intersect with any triangle geometries. -See the Ray Primitive Culling section in the Vulkan API specification.

RayTraversalPrimitiveCullingKHR

512

SkipAABBsKHR
-Do not intersect with any AABB geometries. -See the Ray Primitive Culling section in the Vulkan API specification.

RayTraversalPrimitiveCullingKHR
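The invalid combinations listed before the Ray Flags table can be checked mechanically. The sketch below uses the numeric values from the table and rejects any mask that sets more than one bit within one of the three mutually exclusive groups. It also reports when a mask uses a flag whose enabling capability is RayTraversalPrimitiveCullingKHR. The constant and function names are hypothetical.

```python
# Sketch only: flag values are taken from the Ray Flags table; names are illustrative.
OPAQUE = 1
NO_OPAQUE = 2
TERMINATE_ON_FIRST_HIT = 4
SKIP_CLOSEST_HIT_SHADER = 8
CULL_BACK_FACING_TRIANGLES = 16
CULL_FRONT_FACING_TRIANGLES = 32
CULL_OPAQUE = 64
CULL_NO_OPAQUE = 128
SKIP_TRIANGLES = 256
SKIP_AABBS = 512

# At most one bit may be set within each of these groups.
EXCLUSIVE_GROUPS = (
    OPAQUE | NO_OPAQUE | CULL_OPAQUE | CULL_NO_OPAQUE,
    SKIP_TRIANGLES | CULL_BACK_FACING_TRIANGLES | CULL_FRONT_FACING_TRIANGLES,
    SKIP_TRIANGLES | SKIP_AABBS,
)


def ray_flags_valid(flags):
    """Return True if no exclusive group has more than one bit set."""
    return all(bin(flags & group).count("1") <= 1 for group in EXCLUSIVE_GROUPS)


def needs_primitive_culling_capability(flags):
    """Return True if the mask uses SkipTrianglesKHR or SkipAABBsKHR."""
    return bool(flags & (SKIP_TRIANGLES | SKIP_AABBS))
```

For example, `OPAQUE | TERMINATE_ON_FIRST_HIT` passes the check, while `OPAQUE | CULL_OPAQUE` does not.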

-
-
-
-
(Add a new sub-section 3.HK, Hit Kinds, adding a new table)
-
-
-
-
-

3.HK, Hit Kinds

-
-
-

Values returned in the variable decorated as HitKindKHR from built-in -intersections with triangle geometry. -See the Ray Face Culling section in the Vulkan API specification.

-
- ----- - - - - - - - - - - - - - - - - - - -
Hit Kind | Enabling Capabilities

0xFE

HitKindFrontFacingTriangleKHR
-The intersection was with front-facing geometry.

RayTracingKHR

0xFF

HitKindBackFacingTriangleKHR
-The intersection was with back-facing geometry.

RayTracingKHR
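A consumer of the HitKindKHR builtin might branch on the two values in the table above. The sketch below classifies a hit kind value; it assumes that any other value did not come from a built-in triangle intersection, and the function name and returned labels are hypothetical.

```python
# Sketch only: values are from the Hit Kinds table; labels are illustrative.
HIT_KIND_FRONT_FACING_TRIANGLE = 0xFE
HIT_KIND_BACK_FACING_TRIANGLE = 0xFF


def classify_hit_kind(hit_kind):
    """Map a HitKindKHR value to a descriptive label."""
    if hit_kind == HIT_KIND_FRONT_FACING_TRIANGLE:
        return "front-facing triangle"
    if hit_kind == HIT_KIND_BACK_FACING_TRIANGLE:
        return "back-facing triangle"
    # Assumption: any other value was not produced by a built-in triangle hit.
    return "other"
```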

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Capability | Implicitly Declares

4479

RayTracingKHR
-Uses the RayGenerationKHR, IntersectionKHR, AnyHitKHR, ClosestHitKHR, -MissKHR, or CallableKHR Execution Models

Shader

4478

RayTraversalPrimitiveCullingKHR
-Uses SkipAABBsKHR or SkipTrianglesKHR

RayTracingKHR

-
-
-
-
-
-

Modify the StorageBuffer16BitAccess cell -to include ShaderRecordBufferKHR.

-
-
-

Modify the UniformAndStorageBuffer16BitAccess cell -to include ShaderRecordBufferKHR.

-
-
-

Modify the StorageBuffer8BitAccess cell -to include ShaderRecordBufferKHR.

-
-
-

Modify the UniformAndStorageBuffer8BitAccess cell -to include ShaderRecordBufferKHR.

-
-
-
-
-
(Modify Section 3.36.6, Type-Declaration Instructions, adding a new table)
-
-
-
- ----- - - - - - - - - - - - -

OpTypeAccelerationStructureKHR
-
-Declares an acceleration structure type which is an opaque reference to an -acceleration structure handle as defined in the client API -specification.

-

Consumed by OpRayQueryInitializeKHR and -OpTraceRayKHR

-

This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-RayTracingKHR

2

5341

<id> Result

-
-
-
-
(Modify Section 3.36.8, Memory Instructions)
-
-

Modify the following sentence in the description of OpPtrAccessChain

-
-
-
-

For objects in the Uniform, StorageBuffer, or PushConstant storage -classes, the element’s address or location is calculated using a stride, -which will be the Base-type’s Array Stride when the Base type is -decorated with ArrayStride.

-
-
-
-
-

to include ShaderRecordBufferKHR.

-
-
-
(Modify Section 3.36.11, Conversion Instructions, adding a new table)
-
-
-
- ------- - - - - - - - - - - - - - -

OpConvertUToAccelerationStructureKHR
-
- Converts a 64-bit integer into an OpTypeAccelerationStructureKHR.
-
- Acceleration Structure must either be a 64-bit scalar of integer type, whose Signedness operand is 0, or a 2-component vector of 32-bit integer type, whose Signedness operand is 0.
- A vector value input behaves as-if OpBitcast converts the value to a 64-bit scalar integer first.
- Acceleration Structure represents the address of a valid acceleration structure.
- Refer to the client API specification for details.
-
-Result Type must be an OpTypeAccelerationStructureKHR. -

Capability:
-RayTracingKHR

4

4447

<id> Result Type

<id> Result

<id> Acceleration Structure

-
-
-
-
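The OpConvertUToAccelerationStructureKHR description says a 2-component vector input behaves as if OpBitcast first converted it to a 64-bit scalar. A minimal sketch of that packing follows, assuming the usual OpBitcast ordering in which lower-numbered components supply the lower-order bits; the function name `uvec2_to_u64` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def uvec2_to_u64(components):
    """Pack two unsigned 32-bit components into one 64-bit value.

    Component 0 supplies the low-order 32 bits and component 1 the
    high-order 32 bits (assumed OpBitcast vector-to-scalar ordering).
    """
    low, high = components
    return ((high & MASK32) << 32) | (low & MASK32)


address = uvec2_to_u64((0x89ABCDEF, 0x01234567))
```

The resulting value is what the instruction interprets as the address of an acceleration structure; whether that address is valid is defined by the client API.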
(Add a new sub section 3.36.RT, Ray Tracing Instructions, adding to end of list of instructions)
-
-
-
- --------------- - - - - - - - - - - - - - - - - - - - - - -

OpTraceRayKHR
-
- Trace a ray into the acceleration structure.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the Ray Flag values.
-
- Cull Mask is the mask to test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload is a pointer to the ray payload structure to use for this trace. Payload must be the result of an OpVariable with a storage class of RayPayloadKHR or IncomingRayPayloadKHR.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Only the 4 least-significant bits of SBT Offset and SBT Stride are used by this instruction - other bits are ignored. -
- Only the 16 least-significant bits of Miss Index are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-
- This instruction is a shader call instruction which may invoke shaders with the IntersectionKHR, AnyHitKHR, -ClosestHitKHR, and MissKHR execution models.
-

Capability:
-RayTracingKHR

12

4445

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Payload

- -------- - - - - - - - - - - - - - - -

OpReportIntersectionKHR
-
-Reports an intersection back to the traversal infrastructure.

-

If the intersection occurred within the current ray interval, the intersection confirmation is -performed (see the API specification for more details). -If the value of Hit falls outside the current ray interval, the hit is rejected.

-

Returns True if the hit was accepted by the ray interval and the intersection was confirmed. -Returns False otherwise.

-

Hit is the floating point parametric value along ray for the intersection.

-

Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.

-

Result Type must be a scalar boolean.

-

Hit must be a 32-bit float type scalar.

-

Hit Kind must be a 32-bit unsigned integer type scalar.

-

This instruction is allowed only in IntersectionKHR execution model.

-

This instruction is a shader call instruction which may invoke shaders with the -AnyHitKHR execution model.

Capability:
-RayTracingKHR

5

5334

<id> Result Type

<id> Result

<id> Hit

<id> Hit Kind

- ---- - - - - - - - - - - -

OpIgnoreIntersectionKHR
-
-Ignores the current potential intersection, terminating the invocation that executes it, and -continues the ray traversal.

-

This instruction must be the last instruction in a block.

-

This instruction is allowed only in AnyHitKHR execution model.

Capability:
-RayTracingKHR

1

4448

- ---- - - - - - - - - - - -

OpTerminateRayKHR
-
-Terminates the invocation that executes it, stops the ray traversal, accepts the current hit, -and invokes the ClosestHitKHR execution model (if active).

-

This instruction must be the last instruction in a block.

-

This instruction is allowed only in AnyHitKHR execution model.

Capability:
-RayTracingKHR

1

4449

- ------ - - - - - - - - - - - - -

OpExecuteCallableKHR
-
-Invokes a callable shader.

-

SBT Index is the index into the SBT that selects the callable shader to execute.

-

Callable Data is a pointer to the callable data to pass into the called shader. Callable Data must be the result of an OpVariable with a storage class of CallableDataKHR or IncomingCallableDataKHR.

-

SBT Index must be a 32-bit unsigned integer type scalar.

-

This instruction is allowed only in RayGenerationKHR, ClosestHitKHR, MissKHR and CallableKHR execution models.

-

This instruction is a shader call instruction which will invoke a shader with the -CallableKHR execution model.

Capability:
-RayTracingKHR

3

4446

<id> SBT Index

<id> Callable Data

-
-
-
-
-
-
-
-
-
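The OpTraceRayKHR description above notes that only some low-order bits of Cull Mask, SBT Offset, SBT Stride, and Miss Index are used. The sketch below applies those bit widths to arbitrary 32-bit inputs; the function name and dictionary keys are illustrative assumptions, not identifiers from the specification.

```python
def effective_trace_operands(cull_mask, sbt_offset, sbt_stride, miss_index):
    """Return the bits of each operand that OpTraceRayKHR actually uses."""
    return {
        "cull_mask": cull_mask & 0xFF,      # 8 least-significant bits
        "sbt_offset": sbt_offset & 0xF,     # 4 least-significant bits
        "sbt_stride": sbt_stride & 0xF,     # 4 least-significant bits
        "miss_index": miss_index & 0xFFFF,  # 16 least-significant bits
    }


used = effective_trace_operands(0x1FF, 0x13, 0x2F, 0x12345)
```

Bits outside these ranges are ignored by the instruction, so two calls whose operands differ only in the ignored bits trace identically.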

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_ray_tracing"
-
-
-
-
-
-

Interactions with SPV_KHR_ray_query

-
-
-

OpTypeAccelerationStructureKHR, RayTraversalPrimitiveCullingKHR, -OpConvertUToAccelerationStructureKHR and the Ray Flags are added by both -this extension and SPV_KHR_ray_query; they -are intended to have identical definitions, and can be enabled by either -extension’s capability, for use with the instructions under that same -capability. -If SPV_KHR_ray_query is not supported, ignore any references to -OpRayQueryInitializeKHR.

-
-
-
-
-

Issues

-
-
-

1) Should the global variables be listed in the entrypoint interface?

-
-
-

Discussion: This makes consumers' lives easier in the presence of multiple -entry points. This is already required in SPIR-V 1.4, but if using an earlier -version of SPIR-V it is actually illegal.

-
-
-

Resolved: Require SPIR-V 1.4 to make it simpler for consumers. -SPV_NV_ray_tracing needs to work both ways since it pre-dates SPIR-V 1.4, but -implementations which only support SPV_KHR_ray_tracing will benefit from -this requirement.

-
-
-

2) What are the differences between provisional and final?

-
-
-

Discussion:

-
-
-
    -
  • -

    change RayTracingProvisionalKHR to RayTracingKHR and assign new -token (4479)

    -
  • -
  • -

    change ray payloads and callable data to pointers rather than integer -locations (this resulted in new opcodes for OpTraceRayKHR (4445) and -OpExecuteCallableKHR (4446))

    -
  • -
  • -

    added OpConvertUToAccelerationStructureKHR (4447) instruction to convert -from a 64-bit acceleration structure pointer to an -OpTypeAccelerationStructureKHR to enable tracing by handle

    -
  • -
  • -

    Assign new opcodes for OpIgnoreIntersectionKHR (4448) and -OpTerminateRayKHR (4449) and specify that they are termination -instructions.

    -
  • -
-
-
-

3) Are OpReportIntersectionKHR and OpIgnoreIntersectionKHR terminators like - OpTerminateInvocation or just OpKill?

-
-
-

Resolved: They are meant to unambiguously end execution similarly to -OpTerminateInvocation. We are trying to avoid the mess caused by OpKill.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-04-30

Daniel Koch

Fork from SPV_NV_ray_tracing rev 4.

2

2019-04-30

Daniel Koch

Add Ray Flags documentation.

3

2019-06-20

Tobias Hector

Add RayGeometryIndexKHR.

4

2019-10-25

Tobias Hector

Add description to ray flags.

5

2019-11-20

Daniel Koch

OpTraceKHR → OpTraceRayKHR. - Add references to API spec for ray flags. - Add Hit Kind documentation.

6

2019-11-25

Daniel Koch

Add ShaderCallKHR scope. - Document payload for OpTraceRayKHR.

7

2019-11-27

Daniel Koch

Disallow initializers on all new storage classes.

8

2019-12-03

Tobias Hector

Add interactions with SPV_KHR_ray_query.

9

2019-12-05

Tobias Hector

Add RayTraversalPrimitiveCullingKHR capability - and the SkipAABBsKHR/SkipTrianglesKHR ray flags.

10

2019-12-05

Daniel Koch

Base on SPIR-V 1.5

11

2019-12-11

Daniel Koch

add Provisional string to capabilities, and reassign token - for RayTracingProvisionalKHR.

12

2020-02-20

Eric Werness

Miss does not have object parameters.

13

2020-02-22

Tobias Hector

Removed HitTKHR alias of RayTmaxKHR

14

2020-04-22

Daniel Koch

Require SPIR-V 1.4.

15

2020-06-03

Daniel Koch

Update capabilities tables to match SPIR-V 1.5.

16

2020-06-04

Faith Ekstrand

Make ray payloads and callable data pointers rather than - integer locations

17

2020-06-05

Hans-Kristian Arntzen

Add conversion from 64-bit acceleration structure pointer - to OpTypeAccelerationStructureKHR

18

2020-07-03

Daniel Koch

Remove provisional notices and update capabilities

19

2020-07-10

Tobias Hector

Clarify that subset of bits are used for trace operation

20

2020-09-25

Daniel Koch

Require explicit layouts for ShaderRecordBufferKHR and - otherwise just generally treat it as the StorageBuffer storage class. - Clarify OpReportIntersection behavior if out of range Hit (vulkan#2359).

21

2020-10-01

Daniel Koch

Update OpIgnoreIntersectionKHR and OpTerminateRayKHR behavior - (they are terminators) and assign new opcodes (vulkan#2374).

22

2021-05-13

Eric Werness

Fix ray payload allowed execution models to exclude any hit.

23

2022-05-27

Daniel Koch

disallow more combinations of ray flags (vk-gl-cts#3647)

24

2022-08-17

Daniel Koch

OpExecuteCallableKHR SBT Index must be 32-bit unsigned integer (#156)

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_ray_tracing.html + + +

extensions/KHR/SPV_KHR_ray_tracing.html

+ + diff --git a/extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html b/extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html index d6da55a..025bde0 100644 --- a/extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html +++ b/extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html @@ -1,426 +1,12 @@ - - - - - - - -SPV_KHR_ray_tracing_position_fetch - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_ray_tracing_position_fetch

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Headers repository: -https://github.com/KhronosGroup/SPIRV-Headers

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    2023-02-22 Approved by the SPIR Working Group

    -
  • -
  • -

    2023-04-14 Ratified by the Khronos Board

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-04-21

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.5, Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension requires SPV_KHR_ray_tracing or SPV_KHR_ray_query.

-
-
-
-
-

Overview

-
-
-

This extension adds functionality to ray pipelines and ray query to allow -fetching the vertex positions for the current hit.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_ray_tracing_position_fetch"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces two new capabilities:

-
-
-
-
RayTracingPositionFetchKHR
-RayQueryPositionFetchKHR
-
-
-
-
-
-

New Builtins

-
-
-

Builtin added under the RayTracingPositionFetchKHR capability

-
-
-
-
HitTriangleVertexPositionsKHR
-
-
-
-
-
-

New Instructions

-
-
-

Instruction added under the RayQueryPositionFetchKHR capability

-
-
-
-
OpRayQueryGetIntersectionTriangleVertexPositionsKHR
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify Section 3.21, BuiltIn, adding rows to the BuiltIn table)
-
-
-
- ----- - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

5335

HitTriangleVertexPositionsKHR
-The vertex positions for the triangle hit for the ray being traced in the AnyHitKHR or -ClosestHitKHR execution models. -Refer to the Ray Tracing chapter of the Vulkan API specification for more details.

RayTracingPositionFetchKHR

-
-
-
-
(Modify Section 3.31, Capability, adding rows to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5336

RayTracingPositionFetchKHR
-Uses the HitTriangleVertexPositionsKHR builtin.

RayTracingKHR

5391

RayQueryPositionFetchKHR
-Uses the OpRayQueryGetIntersectionTriangleVertexPositionsKHR -instruction.

RayQueryKHR

-
-
-
-
(Add to sub section 3.32.RQInstructions, Ray Query Instructions)
-
-
-
- -------- - - - - - - - - - - - - - - -

OpRayQueryGetIntersectionTriangleVertexPositionsKHR
-
- Gets the vertex positions for the triangle at the current intersection.
-
- Result is the returned vertex positions.
-
- Result Type must be an array with a Length of 3, and an Element Type that is a vector type with a Component Type that is a 32-bit floating-point type and a Component Count of 3.
-
- Intersection must be the <id> of a constant instruction with a 32-bit scalar integer type.
-
- Intersection identifies which intersection the values are returned for: either the current candidate or the - closest recorded hit so far; see Ray Query Intersection.
-
- Ray Query is a pointer to the ray query object.
-
- If Intersection is RayQueryCandidateIntersectionKHR, behavior is undefined if OpRayQueryProceedKHR - was not executed on the same ray query object, or if the last value returned by such an execution of OpRayQueryProceedKHR was not true. -
- If Intersection is RayQueryCommittedIntersectionKHR, behavior is undefined if there is no current committed - intersection (see OpRayQueryCommittedTypeKHR).

Capability:
-RayQueryPositionFetchKHR

5

5340

<id> Result Type

Result <id>

<id> Ray Query

<id> Intersection

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_ray_tracing_position_fetch"
-
-
-
-
-
-

Interactions with SPV_KHR_ray_tracing

-
-
-

The RayTracingPositionFetchKHR capability and the HitTriangleVertexPositionsKHR builtin -are only supported if SPV_KHR_ray_tracing and the RayTracingKHR capability are supported.

-
-
-
-
-

Interactions with SPV_KHR_ray_query

-
-
-

The RayQueryPositionFetchKHR capability and the OpRayQueryGetIntersectionTriangleVertexPositionsKHR -instruction are only supported if SPV_KHR_ray_query and the RayQueryKHR capability are supported.

-
-
-
-
-

Issues

-
-
-

1) Should triangle be in the name somewhere?

-
-
-

RESOLVED: Yes, though OpRayQueryGetIntersectionTriangleVertexPositionsKHR seems a bit long.

-
-
-

2) Where should the functionality of the new builtin and instruction be defined?

-
-
-

RESOLVED: Following precedent, ray tracing (pipeline) relies more on "Refer to the Ray Tracing -chapter of Vulkan API" language while ray query inlines more of the functionality definition -directly in the SPIR-V extensions.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-05-12

Eric Werness

First draft

2

2022-12-14

Daniel Koch

Use two capabilities and other spec cleanup.

3

2023-01-06

Daniel Koch

Follow SPIR-V conventions for undefined behavior.

4

2023-04-21

Daniel Koch

Add ratification status

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html + + +

extensions/KHR/SPV_KHR_ray_tracing_position_fetch.html

+ + diff --git a/extensions/KHR/SPV_KHR_relaxed_extended_instruction.html b/extensions/KHR/SPV_KHR_relaxed_extended_instruction.html index 78ed7b1..32362fb 100644 --- a/extensions/KHR/SPV_KHR_relaxed_extended_instruction.html +++ b/extensions/KHR/SPV_KHR_relaxed_extended_instruction.html @@ -1,371 +1,12 @@ - - - - - - - -SPV_KHR_relaxed_extended_instruction - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_relaxed_extended_instruction

-
-
-
-
-

Contact

-
-
-

To report a problem with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google LLC

    -
  • -
  • -

    Nathan Gauër, Google LLC

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR-V Working Group: 2024-04-03

    -
  • -
  • -

    Ratified by the Khronos Group: 2024-05-17

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-06-05

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 -Revision 3.

-
-
-

This extension requires SPIR-V 1.0 or later and "SPV_KHR_non_semantic_info".

-
-
-

This extension interacts with "NonSemantic.Shader.DebugInfo.100" extended -instruction set and "SPV_KHR_non_semantic_info".

-
-
-
-
-

Overview

-
-
-

This extension adds the ability to use forward references in some specific -non-semantic instructions. It similarly modifies both the core SPIR-V -specification and the SPV_KHR_non_semantic_info extension specification.

-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Logical Layout of a Module

-
-

Modify section 9, modifying:

-
-
-

(Replace the following sentence):

-
-
-

All operands in all these instructions must be declared before being used.

-
-
-

(with):

-
-
-

All operands in all these instructions must be declared before being used, -except when the opcode is OpExtInstWithForwardRefsKHR.

-
-
-

Modify section 9, modifying:

-
-
-

(Replace the following sentence):

-
-
-

This section is the first section to allow use of Non-semantic instructions with OpExtInst

-
-
-

(with):

-
-
-

This section is the first section to allow use of Non-semantic instructions with OpExtInst or OpExtInstWithForwardRefsKHR

-
-
-

Modify the list of locations where a forward reference is allowed, adding -the following case:

-
-
-
    -
  • -

    OpExtInstWithForwardRefsKHR can contain forward references.

    -
  • -
-
-
-
-

Instructions

-
-

Modify Section 3.49.4, "Extension Instructions", -adding one new instruction:

-
- --------- - - - - - - - - - - - - - - -

OpExtInstWithForwardRefsKHR
-
-Executes an instruction in an imported set of non-semantic extended -instructions.

-

At least one <id> operand must be a forward reference; otherwise, -OpExtInst must be used.

5 + variable

4433

<id>
-Result Type

Result <id>

<id>
-Set

Literal
-Instruction

<id>, <id>, …​
-Operand 1, Operand 2, …​

-
-
-
-
-

Modifications to the NonSemantic.Shader.DebugInfo.100 extended instruction set, Version 1.0

-
-
-

Introduction

-
-

(Replace the following sentence):

-
-
-
    -
  • -

    Forward references in any instruction are disallowed.

    -
  • -
-
-
-

(with):

-
-
-
    -
  • -

    Forward references in any instruction are disallowed, except as operands in -OpExtInstWithForwardRefsKHR.

    -
  • -
-
-
-
-

Binary Form

-
-

Modify the Forward references section:

-
-
-

(Replace the following sentence):

-
-
-

Forward references are not allowed, to be compliant with SPV_KHR_non_semantic_info.

-
-
-

(with):

-
-
-

Forward references are not allowed, except when the instruction allows them -and the instruction is used with OpExtInstWithForwardRefsKHR.

-
-
-
-
-
-

Validation Rules

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_KHR_relaxed_extended_instruction"
-
-
-
-

Each OpExtInstWithForwardRefsKHR use must have at least one forward reference -as an operand.

-
-
-

Each extended opcode used with OpExtInstWithForwardRefsKHR must belong to a -non-semantic instruction set.
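The forward-reference rule above can be sketched as a single pass over instructions in module order. This is a simplified model, assuming each instruction is a `(result_id, operand_ids)` pair and that any operand id not yet defined is a forward reference; the function name `choose_ext_inst_opcodes` is hypothetical.

```python
def choose_ext_inst_opcodes(instructions):
    """Pick the opcode each extended instruction would need.

    instructions: list of (result_id, operand_ids) tuples in module order.
    An operand whose id has not been defined yet counts as a forward
    reference (a simplifying assumption made for this sketch).
    """
    defined = set()
    opcodes = []
    for result_id, operand_ids in instructions:
        has_forward_ref = any(op not in defined for op in operand_ids)
        opcodes.append(
            "OpExtInstWithForwardRefsKHR" if has_forward_ref else "OpExtInst"
        )
        defined.add(result_id)
    return opcodes


# Instruction 2 refers to id 3 before it is defined.
result = choose_ext_inst_opcodes([(1, []), (2, [1, 3]), (3, [1])])
```

The sketch does not check that the instruction set is non-semantic, which the second validation rule also requires.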

-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-10-11

Nathan Gauër

Initial revision

2

2024-03-11

Nathan Gauër

Relaxed SPIR-V version requirements.

3

2024-05-28

Nathan Gauër

Added ratification/approval dates.

4

2024-06-05

Nathan Gauër

Add KHR to opcode name.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_relaxed_extended_instruction.html + + +

extensions/KHR/SPV_KHR_relaxed_extended_instruction.html

+ + diff --git a/extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html b/extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html index eeb73ef..0967ebc 100644 --- a/extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html +++ b/extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html @@ -1,385 +1,12 @@ - - - - - - - -SPV_KHR_shader_atomic_counter_ops - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_shader_atomic_counter_ops

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2017-05-10

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-06-30

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-07-07

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 7.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a new capability to allow additional Atomic Instructions -on the AtomicCounter Storage Class in order to support the -GL_ARB_shader_atomic_counter_ops extension in SPIR-V.

-
-
-

The new functionality is enabled under the AtomicStorageOps -capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_shader_atomic_counter_ops"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
AtomicStorageOps
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

AtomicStorageOps

4445

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

4445

AtomicStorageOps
-Uses atomic instructions: OpAtomicIAdd, OpAtomicISub, OpAtomicUMin, -OpAtomicUMax, OpAtomicAnd, OpAtomicOr, OpAtomicXor, -OpAtomicExchange, or OpAtomicCompareExchange.

AtomicStorage

SPV_KHR_shader_atomic_counter_ops

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_shader_atomic_counter_ops"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    What should we call the capability?

    -
    -
    -
    -

    RESOLVED: AtomicStorageOps. Possible alternatives included -AtomicStorageCounterOps or AtomicCounterOps, but this extends the -AtomicStorage capability and adds more operations. There are additional -operations that still aren’t enabled, but that is a problem for a future -extension.

    -
    -
    -
    -
  2. -
  3. -

    Should OpAtomicSMin and OpAtomicSMax be supported as well?

    -
    -
    -
    -

    RESOLVED: No. The corresponding GLSL built-ins only take uint parameters, -so this capability will aim to expose exactly the same set of operations.

    -
    -
    -
    -
  4. -
  5. -

    What happened to the Universal Validation Rule about AtomicCounter operations?

    -
    -
    -
    -

    RESOLVED: In early versions of SPIR-V (1.0.x for x < 11, and 1.1.y for y < 7) -there was a universal validation rule which stated:

    -
    -
    -
    -
    -

    The only instructions that can operate on a pointer to the AtomicCounter -Storage Class are

    -
    -
    -
      -
    • -

      OpAtomicLoad

      -
    • -
    • -

      OpAtomicIIncrement

      -
    • -
    • -

      OpAtomicIDecrement

      -
    • -
    -
    -
    -
    -
    -

    Starting with SPIR-V 1.0 version 11 (1.1 version 7) this was moved from a -universal validation rule into the AtomicStorage capability.

    -
    -
    -
    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2017-04-25

Daniel Koch

Initial revision

2

2017-05-12

David Neto

Record approval by SPIR Working Group

3

2017-07-07

Daniel Koch

Record ratification

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html + + +

extensions/KHR/SPV_KHR_shader_atomic_counter_ops.html

+ + diff --git a/extensions/KHR/SPV_KHR_shader_ballot.html b/extensions/KHR/SPV_KHR_shader_ballot.html index 3b9c80c..d4577d9 100644 --- a/extensions/KHR/SPV_KHR_shader_ballot.html +++ b/extensions/KHR/SPV_KHR_shader_ballot.html @@ -1,741 +1,12 @@ - - - - - - - -SPV_KHR_shader_ballot - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_shader_ballot

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Neil Henning, Codeplay

    -
  • -
  • -

    Kerch Holt, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    Rex Xu, AMD

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2016-07-12

    -
  • -
  • -

    Ratified by the Khronos Board: 2016-09-02

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2016-10-18

Revision

11

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides new builtin variable decorations and instructions -to support the OpenGL GL_ARB_shader_ballot extension in SPIR-V.

-
-
-

Unlike GL_ARB_shader_ballot the SPIR-V extension does not depend on -GL_ARB_shader_gpu_int64 because the types representing subgroup IDs -are held in a 4 component vector of integers.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_shader_ballot"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability.

-
-
-
-
SubgroupBallotKHR
-
-
-
-
-
-

New Builtins

-
-
-

Builtin variables provide a bitmask for invocations.

-
-
-
-
SubgroupEqMaskKHR
-SubgroupGeMaskKHR
-SubgroupGtMaskKHR
-SubgroupLeMaskKHR
-SubgroupLtMaskKHR
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under SubgroupBallotKHR capability.

-
-
-
-
OpSubgroupBallotKHR
-OpSubgroupFirstInvocationKHR
-OpSubgroupReadInvocationKHR
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

SubgroupEqMaskKHR

4416

SubgroupGeMaskKHR

4417

SubgroupGtMaskKHR

4418

SubgroupLeMaskKHR

4419

SubgroupLtMaskKHR

4420

OpSubgroupBallotKHR

4421

OpSubgroupFirstInvocationKHR

4422

SubgroupBallotKHR

4423

OpSubgroupReadInvocationKHR

4432

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-
-
(add the following new builtins to the table)
-
-
-
-
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

4416

SubgroupEqMaskKHR
-Provides a 4 component 32 bit integer vector bitmask -for all invocations with one bit per invocation starting with the -least significant bit in the first vector component continuing to -the last bit (less than SubgroupSize) in the last required vector -component where the bit index is equal to SubgroupLocalInvocationId.

SubgroupBallotKHR

4417

SubgroupGeMaskKHR
-Provides a 4 component 32 bit integer vector bitmask -for all invocations with one bit per invocation starting with the -least significant bit in the first vector component continuing to -the last bit (less than SubgroupSize) in the last required vector -component where the bit index is greater than or equal to -SubgroupLocalInvocationId.

SubgroupBallotKHR

4418

SubgroupGtMaskKHR
-Provides a 4 component 32 bit integer vector bitmask -for all invocations with one bit per invocation starting with the -least significant bit in the first vector component continuing to -the last bit (less than SubgroupSize) in the last required vector -component where the bit index is greater than -SubgroupLocalInvocationId.

SubgroupBallotKHR

4419

SubgroupLeMaskKHR
-Provides a 4 component 32 bit integer vector bitmask -for all invocations with one bit per invocation starting with the -least significant bit in the first vector component continuing to -the last bit (less than SubgroupSize) in the last required vector -component where the bit index is less than or equal to -SubgroupLocalInvocationId.

SubgroupBallotKHR

4420

SubgroupLtMaskKHR
-Provides a 4 component 32 bit integer vector bitmask -for all invocations with one bit per invocation starting with the -least significant bit in the first vector component continuing to -the last bit (less than SubgroupSize) in the last required vector -component where the bit index is less than SubgroupLocalInvocationId.

SubgroupBallotKHR
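To make the five mask builtins above concrete, the sketch below computes them for one invocation as four 32-bit words, with bit 0 of the first word standing for invocation 0 and no bits set at or above SubgroupSize. The function names and the dictionary layout are illustrative assumptions.

```python
def to_uvec4(bits):
    """Split a bitmask of up to 128 bits into four 32-bit words, low word first."""
    return [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


def subgroup_masks(invocation_id, subgroup_size):
    """Model the Eq/Ge/Gt/Le/Lt masks for one invocation in a subgroup."""
    assert 0 <= invocation_id < subgroup_size <= 128
    all_bits = (1 << subgroup_size) - 1  # one bit per invocation in the subgroup
    eq = 1 << invocation_id
    lt = eq - 1                          # bits below invocation_id
    le = lt | eq
    ge = all_bits & ~lt                  # bits at or above invocation_id
    gt = all_bits & ~le
    return {
        "Eq": to_uvec4(eq),
        "Ge": to_uvec4(ge),
        "Gt": to_uvec4(gt),
        "Le": to_uvec4(le),
        "Lt": to_uvec4(lt),
    }


masks = subgroup_masks(3, 8)
```

For subgroups larger than 32 invocations the higher bits spill into the second and later vector components, which is why the builtins use a 4-component vector.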

-
-
-
-

(Add the SubgroupBallotKHR capability to SubgroupSize.)

-
-
-

(Add the SubgroupBallotKHR capability to SubgroupLocalInvocationId.)

-
-
-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

4423

SubgroupBallotKHR

-
-
-
-
(Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions)
-
-
-
- ------- - - - - - - - - - - - - - -

OpSubgroupBallotKHR
-
-Computes a bitfield value combining the Predicate value from all -invocations in the current Subgroup that execute the same dynamic -instance of this instruction. The bit is set to one if the corresponding -invocation is active and the predicate is evaluated to true; otherwise, -it is set to zero.
-
-Predicate must be a Boolean type.
-
-Result Type must be a 4 component vector of 32 bit integer types.
-
-Result is a set of bitfields where the first invocation is represented -in bit 0 of the first vector component and the last invocation (up to SubgroupSize) -is represented in the highest bit used of the last vector component needed to cover all -invocations in the subgroup.

Capability:
-SubgroupBallotKHR

4

4421

<id> Result Type

<id> Result

<id> Predicate

- ------- - - - - - - - - - - - - - -

OpSubgroupFirstInvocationKHR
-
-Return the Value from the invocation in the current subgroup which has the -lowest subgroup local invocation ID, and which executes the same dynamic -instance of this instruction.
-
-The type of Value must be the same as Result Type.
-
-Result Type must be a 32-bit integer type or a 32-bit float type scalar.

Capability:
-SubgroupBallotKHR

4

4422

<id> Result Type

<id> Result

<id> Value

- -------- - - - - - - - - - - - - - - -

OpSubgroupReadInvocationKHR
-
-Return the Value from the invocation in the subgroup with an invocation ID -equal to Index. The Index must be the same for all active invocations in -the subgroup, otherwise the results are undefined.
-
-The type of Value must be the same as Result Type.
-
-Result Type must be a 32-bit integer type or a 32-bit float type scalar.

Capability:
-SubgroupBallotKHR

5

4432

<id> Result Type

<id> Result

<id> Value

<id> Index

-
-
-
-
-
-
-
-
-
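The ballot and first-invocation semantics described above can be sketched as a small host-side model. This is an illustration under stated assumptions, not an implementation of the extension: the function names are hypothetical, and a subgroup is modelled as two parallel lists, one holding each invocation's operand and one saying whether that invocation is active at this dynamic instance.

```python
def subgroup_ballot(predicates, active):
    """Model OpSubgroupBallotKHR: four 32-bit components, bit i set when
    invocation i is active and its Predicate is true."""
    wide = 0
    for index, (predicate, is_active) in enumerate(zip(predicates, active)):
        if is_active and predicate:
            wide |= 1 << index
    # Invocation 0 lands in bit 0 of the first component.
    return [(wide >> (32 * c)) & 0xFFFFFFFF for c in range(4)]


def subgroup_first_invocation(values, active):
    """Model OpSubgroupFirstInvocationKHR: the Value from the active
    invocation with the lowest subgroup local invocation ID."""
    for value, is_active in zip(values, active):
        if is_active:
            return value
    # Assumption: the model rejects a call with no active invocation.
    raise ValueError("no active invocation")
```

With predicates `[True, False, True, True]` and activity `[True, True, False, True]`, only invocations 0 and 3 contribute, so the first component of the ballot is 9.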

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    The subgroup mask is specified as a 64 bit integer type which -may artificially limit the number of subgroups.

    -
    -
    -
    -

    RESOLVED: Result type and masks now changed to 4 component vector of -32 bit integers.

    -
    -
    -
    -
  2. -
  3. -

    How are the values of Subgroup??MaskKHR defined?

    -
    -
    -
    -

    RESOLVED: Earlier versions of this specification defined a bitmask such as -"LtMask" ("less than mask") as having bits set if SubgroupLocalInvocationId < -bit index. However, this was reversed relative to the Vulkan extension and to -the GL_ARB_shader_ballot extension (see issue 1 of that spec). Fortunately, all -known implementations of this extension had implemented "wrong" behavior so the -best thing to do is change the definition in the spec.

    -
    -
    -
    -
  4. -
  5. -

    Should these instructions have a scope of Subgroup instead -of limiting them to a set of sub-groups?

    -
    -
    -
    -

    RESOLVED: The scope is Subgroup (SPIR-V WG 6/28/2016)

    -
    -
    -
    -
  6. -
  7. -

    The functionality for readInvocationARB is presumed to be -supported through the OpGroupBroadcast with Subgroup scope.

    -
    -
    -
    -

    RESOLVED: The use of OpGroupBroadcast is sufficient (SPIR-V WG 6/28/2016) -RE-RESOLVED: OpGroupBroadcast has a different semantic than what is -precisely desired. readInvocationARB may appear in dynamically non-uniform -control flow paths and doesn’t have a scope. Concluded that a new -instruction is required. -(SPIR-V WG 10/18/2016)

    -
    -
    -
    -
  8. -
  9. -

    The GL_ARB_shader_ballot extension calls out explicitly a dependency -on the int64 bit type. Does this dependency need to be called out?

    -
    -
    -
    -

    RESOLVED: Result type and mask type changed to 4 component vector and -thus removes dependency on GL_ARB_shader_gpu_int64.

    -
    -
    -
    -
  10. -
  11. -

    GL_ARB_shader_ballot allows calls to ballotARB in control flow so the -semantics of subgroup may be different than the current SPIR-V -definition of subgroup.

    -
    -
    -
    -

    RESOLVED: (Paraphrasing David Neto) A "lockstep" concept of execution -is replaced by use of the concept of "dynamic instance" (already in the -SPIR-V spec), and subgroups. This doesn’t force B=D in the following -example. It does not define pair-wise reconvergence of invocations in -the absence of completely uniform control flow.

    -
    -
    -
    -
    void foo() {
    -  const bool odd = gl_VertexID & 1;
    -  const bool odd2 = gl_VertexID & 2;
    -
    -  uint64_t A = 0;
    -  uint64_t B = 0;
    -  uint64_t C = 0;
    -  uint64_t D = 0;
    -  uint64_t E = 0;
    -
    -  A = ballotARB(true);
    -  if (odd) {
    -    B = ballotARB(true);
    -    if (odd2) {
    -      C = ballotARB(true);
    -    }
    -    D = ballotARB(true);
    -  }
    -  E = ballotARB(true);
    -}
    -
    -
    -
    -
    -
  12. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-05-10

Kerch Holt

Initial revision

2

2016-05-17

Kerch Holt

Changes as per SPIR-V WG May 17th

3

2016-05-24

Kerch Holt

Change result type and mask type to 4 component int 32 vector

4

2016-06-08

Kerch Holt

Change names to include "KHR" and update to include suggestions from reviews and SPIR-V WG.

5

2016-06-28

Kerch Holt

Filled in the remaining "UNRESOLVED" text as per SPIR-V WG. Added token number assignments

6

2016-08-02

Kerch Holt

Added wording to cover case of bit values for inactive invocations.

7

2016-09-02

Kerch Holt

Added token number for ShaderBallot capability.

8

2016-09-06

David Neto

Rename SubgroupBallot capability to SubgroupBallotKHR

9

2016-09-13

Kerch Holt

Changed status to "ratified" with date

10

2016-09-20

Daniel Koch

Improve formatting, use ISO dates, remove extension number

11

2016-10-18

Kerch Holt

Add instruction for readInvocationARB (as per Oct 18th SPIR-V meeting)

12

2018-03-15

Graeme Leese

Correct definition of mask builtins.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_shader_ballot.html + + +

extensions/KHR/SPV_KHR_shader_ballot.html

+ + diff --git a/extensions/KHR/SPV_KHR_shader_clock.html b/extensions/KHR/SPV_KHR_shader_clock.html index fef0adb..2c96f8f 100644 --- a/extensions/KHR/SPV_KHR_shader_clock.html +++ b/extensions/KHR/SPV_KHR_shader_clock.html @@ -1,399 +1,12 @@ - - - - - - - -SPV_KHR_shader_clock - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_shader_clock

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Aaron Hagan, AMD

    -
  • -
  • -

    Neil Henning, AMD

    -
  • -
  • -

    Daniel Rakos, AMD

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2019-03-06

    -
  • -
  • -

    Ratified by Khronos on 2019-05-03

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-10-30

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new capabilities to support the -GL_ARB_shader_clock -and -GL_EXT_shader_realtime_clock -GLSL extensions in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_shader_clock"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ShaderClockKHR
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the ShaderClockKHR capability:

-
-
-
-
OpReadClockKHR
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

ShaderClockKHR

5055

OpReadClockKHR

5056

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5055

ShaderClockKHR

-
-
-

Instructions

-
-
-
(Add Section 3.32.25, Time Instructions)
-
-
-
- ------- - - - - - - - - - - - - - -

OpReadClockKHR
-
-Result is the sample value of a clock as seen by the shader processor. -An idealized clock is an unbounded unsigned scalar integer tick count -increasing monotonically over time. A clock’s rate of progress may vary -within the lifetime of an invocation, may vary across different executions -of the program, and may be affected by conditions beyond the control of -the programmer. The sampled value read by this instruction consists of -the least significant bits of the idealized clock’s tick count at the -time the instruction was executed. In particular, an observer may see -sampled values wrap around zero.
-
-Result Type must be a 64-bit unsigned integer type or a vector of -two-components of 32-bit unsigned integer type with the first component -containing the 32 least significant bits and the second component containing -the 32 most significant bits.
-
-All invocations within the Scope read from the same source clock. -Scope must be a valid Scope <id> (Section 3.27).

Capability:
-ShaderClockKHR

4

5056

<id> Result Type

<id> Result

Scope <id> Scope

-
-
-
-
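The two permitted Result Type layouts and the wrap-around behaviour can be illustrated with a short sketch. The helper names below are hypothetical, and `elapsed_ticks` assumes 64-bit samples taken from the same clock with a true elapsed tick count below 2^64; the extension itself makes no promise about clock rate.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_uvec2(sample):
    """Split a 64-bit sample into (low 32 bits, high 32 bits), matching the
    two-component layout: first component least significant."""
    return (sample & MASK32, (sample >> 32) & MASK32)


def from_uvec2(low, high):
    """Rebuild the 64-bit sample from the two-component form."""
    return ((high & MASK32) << 32) | (low & MASK32)


def elapsed_ticks(start, end):
    """Difference of two samples, computed modulo 2**64 so that a sample
    that wrapped around zero still yields the expected tick count."""
    return (end - start) & MASK64
```

Because the subtraction is modular, a start sample taken shortly before the counter wraps and an end sample taken shortly after still produce a small positive difference.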
(Add to Used by: list in Section 3.27, Scope <id>)
-
-
-
-
-
OpReadClockKHR
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_shader_clock"
-
-
-
-

The Scope operand of the OpReadClockKHR instruction must be a valid -Scope <id>.

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension interact with ARB_shader_clock?

    -
    -
    -
    -

    RESOLVED: This extension purposefully does not fully implement -ARB_shader_clock, as there is no guarantee of code motion barriers.

    -
    -
    -
    -
  2. -
  3. -

    If two invocations execute the same dynamic instance of the OpReadClockKHR -instruction, do both invocations get exactly the same value?

    -
    -
    -
    -

    RESOLVED: There is no guarantee that two invocations will produce exactly -the same value.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2019-02-22

Aaron Hagan

Initial draft

2

2019-10-29

Daniel Koch

Add Op prefix to new instruction, add links to GLSL specs

3

2019-10-30

Daniel Koch

SPIR-V/issues/511 (gitlab) - stop calling the scope "Execution Scope"

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_shader_clock.html + + +

extensions/KHR/SPV_KHR_shader_clock.html

+ + diff --git a/extensions/KHR/SPV_KHR_shader_draw_parameters.html b/extensions/KHR/SPV_KHR_shader_draw_parameters.html index b7dcf39..ada8614 100644 --- a/extensions/KHR/SPV_KHR_shader_draw_parameters.html +++ b/extensions/KHR/SPV_KHR_shader_draw_parameters.html @@ -1,405 +1,12 @@ - - - - - - - -SPV_KHR_shader_draw_parameters - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_shader_draw_parameters

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Piers Daniell, NVIDIA

    -
  • -
  • -

    Kerch Holt, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Ashwin Kolhe, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete.

    -
  • -
  • -

    Approved by the SPIR-V Working group: 2016-08-02

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2016-09-30

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2016-09-20

Revision

6

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides new built-in variables: BaseVertex, BaseInstance, and -DrawIndex, to support the OpenGL GL_ARB_shader_draw_parameters extension in SPIR-V.

-
-
-

The new functionality is enabled under the DrawParameters capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_shader_draw_parameters"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
DrawParameters
-
-
-
-
-
-

New Builtins

-
-
-

Builtin IDs added:

-
-
-
-
BaseVertex
-BaseInstance
-DrawIndex
-
-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - -

BaseVertex

4424

BaseInstance

4425

DrawIndex

4426

DrawParameters

4427

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.21, BuiltIn to include new builtins)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

4424

BaseVertex
-Base vertex component of vertex ID. See OpenGL -GL_ARB_shader_draw_parameters for more detail.

DrawParameters

4425

BaseInstance
-Base instance component of instance ID. See OpenGL -GL_ARB_shader_draw_parameters for more detail.

DrawParameters

4426

DrawIndex
-Contains the index of the draw currently being processed. -See OpenGL GL_ARB_shader_draw_parameters for more detail.

DrawParameters

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Depends On

4427

DrawParameters

Shader

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_shader_draw_parameters"
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-05-31

Kerch Holt

Initial revision

2

2016-05-31

Kerch Holt

Removed "BuiltIn" from name (used in header not doc).

3

2016-08-07

Kerch Holt

Added extension number

4

2016-08-19

Daniel Koch

drawID → drawIndex, drop KHR and add DrawParameters

5

2016-09-02

Kerch Holt

Renumbered tokens as per GitLab issue #52 in SPIR-V

6

2016-09-20

Daniel Koch

Fix extension name in validation rules, standardize dates, - remove extension number, update contributors

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_shader_draw_parameters.html + + +

extensions/KHR/SPV_KHR_shader_draw_parameters.html

+ + diff --git a/extensions/KHR/SPV_KHR_storage_buffer_storage_class.html b/extensions/KHR/SPV_KHR_storage_buffer_storage_class.html index e592eb3..143ce60 100644 --- a/extensions/KHR/SPV_KHR_storage_buffer_storage_class.html +++ b/extensions/KHR/SPV_KHR_storage_buffer_storage_class.html @@ -1,475 +1,12 @@ - - - - - - - -SPV_KHR_storage_buffer_storage_class - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_storage_buffer_storage_class

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alexander Galazin, ARM

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by SPIR-V working group 2017-Mar-29.

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-03-23

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_KHR_16bit_storage.

-
-
-

This extension interacts with GL_KHR_vulkan_glsl.

-
-
-
-
-

Overview

-
-
-

This extension adds a new StorageBuffer Storage Class. -A Block-decorated object in this class is equivalent to a BufferBlock-decorated -object in the Uniform Storage Class.

-
-
-

This extension also deprecates the BufferBlock Decoration.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_storage_buffer_storage_class"
-
-
-
-
-
-

New Storage Classes

-
-
-

This extension introduces a new storage class:

-
-
-
-
StorageBuffer
-
-
-
-
-
-

New Capabilities

-
-
-

None.

-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - -

StorageBuffer

12

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Modify Section 3.7, Storage Class, adding this row to the Storage Classes table:

-
-
-
- ----- - - - - - - - - - - - - - -
Storage Class | Enabling Capabilities

12

StorageBuffer
-Shared externally, readable and writeable, visible across all functions -in all invocations in all work groups. Graphics storage buffers (buffer blocks).

Shader

-
-
-
-
-
Modify Section 2.16.1, Universal Validation Rules, change the Atomic access rules:
-
-

add StorageBuffer to the list of allowed Storage Classes.

-
-
Modify Section 2.16.2, Validation Rules for Shader Capabilities, change the first sentence in the Composite objects rules to say:
-
-

Composite objects in the StorageBuffer, UniformConstant, Uniform, and PushConstant Storage Classes must be explicitly laid out.

-
-
Modify Section 3.20, Decoration
-
-
-
-
Change the description of the BufferBlock decoration to say:
-
-
-
-
-
-
-

Apply to a structure type to establish it is an SSBO-like shader-interface block. -This decoration is deprecated. SPIR-V producers are encouraged to generate -Block-decorated objects in the StorageBuffer Storage Class instead.

-
-
-
-
Change the second sentence in the description of the Volatile decoration to say:
-
-

Can only be used for objects declared as storage images (see OpTypeImage), -in the StorageBuffer Storage Class, or in -the Uniform Storage Class with the BufferBlock Decoration.

-
-
Change the second sentence in the description of the Coherent decoration to say:
-
-

Can only be used for objects declared as storage images (see OpTypeImage), -in the StorageBuffer Storage Class, or in -the Uniform Storage Class with the BufferBlock Decoration.

-
-
Change the second sentence in the description of the NonWritable decoration to say:
-
-

Can only be used for objects declared as storage images (see OpTypeImage), -in the StorageBuffer Storage Class, or in -the Uniform Storage Class with the BufferBlock Decoration.

-
-
Change the second sentence in the description of the NonReadable decoration to say:
-
-

Can only be used for objects declared as storage images (see OpTypeImage), -in the StorageBuffer Storage Class, or in -the Uniform Storage Class with the BufferBlock Decoration.

-
-
-
Modify Section 3.25, Memory Semantics <id>
-
-
-
-
Change the description of the UniformMemory semantics to say:
-
-

Apply the memory-ordering constraints to Uniform or StorageBuffer Storage Class memory.

-
-
-
Modify Section 3.31, Capability
-
-
-
-
Change the description of the StorageBufferArrayDynamicIndexing capability to say:
-
-

Arrays in the StorageBuffer Storage Class or -BufferBlock-decorated arrays in the Uniform Storage Class use dynamically uniform indexing.

-
-
-
Modify Section 3.32.6, Type-Declaration Instructions
-
-
-
-
Change the fourth sentence of the description of the OpTypeRuntimeArray instruction to say:
-
-

Objects of this type can only be created with OpVariable -using the Uniform or the StorageBuffer Storage Classes.

-
-
-
-
-
-
-

Validation Rules

-
-
-

It is invalid to have a construct that uses the StorageBuffer Storage Class and -a construct that uses the Uniform Storage Class with the BufferBlock Decoration -in the same SPIR-V module.

-
-
-
-
-

Interactions with SPV_KHR_16bit_storage

-
-
-
-
If SPV_KHR_16bit_storage is supported,
-
-
-
-
modify the description of the StorageUniformBufferBlock16 capability, adding the following sentence to the first paragraph of the description:
-
-
-
-
-
-
-

The object can also be in the StorageBuffer Storage Class and have any decorations supported for this Storage Class.

-
-
-
-
modify the description of the StorageUniform16 capability, adding the following sentence to the first paragraph of the description:
-
-

The object can also be in the StorageBuffer Storage Class and have any decorations supported for this Storage Class.

-
-
-
Modify Section 3.31, Capability, adding the following rows to the Capability table:
-
-
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Capability | Depends On

4433

StorageBuffer16BitAccess
-Same as StorageUniformBufferBlock16

4434

UniformAndStorageBuffer16BitAccess
-Same as StorageUniform16

StorageBuffer16BitAccess

-
-
-
-

Interactions with GL_KHR_vulkan_glsl

-
-
-
-
If GL_KHR_vulkan_glsl is supported,
-
-
-
-
Modify Mapping to SPIR-V section, adding the following statement to mapping of storage classes:
-
-
-
-
-
-
-

buffer blockN { …​ } …​; → StorageBuffer, with Block decoration

-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2017-03-23

Alexander Galazin

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_storage_buffer_storage_class.html + + +

extensions/KHR/SPV_KHR_storage_buffer_storage_class.html

+ + diff --git a/extensions/KHR/SPV_KHR_subgroup_rotate.html b/extensions/KHR/SPV_KHR_subgroup_rotate.html index aee86fb..1c86944 100644 --- a/extensions/KHR/SPV_KHR_subgroup_rotate.html +++ b/extensions/KHR/SPV_KHR_subgroup_rotate.html @@ -1,327 +1,12 @@ - - - - - - - -SPV_KHR_subgroup_rotate - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_subgroup_rotate

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kevin Petit, Arm Ltd.

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Faith Ekstrand, Collabora

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Caio Oliveira, Intel

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2022-03-02

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2022-04-15

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-03-02

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-
-
-

Overview

-
-
-

This extension adds a new instruction that enables rotating values across invocations -within a subgroup. Taking the example of a subgroup of size 16, a rotation by an -amount of 2 would, when executed by the invocation identified by id 0, return the value -from the invocation identified by the id 2. The same rotation instruction, when -executed by the invocation identified by id 14, would return the value from the -invocation identified by id 0.

-
-
-

A rotation by an amount of N rotates values "down" N invocations within the subgroup.

-
-
-

A rotation by an amount of (SubgroupSize - N) rotates values "up" N invocations -within the subgroup.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_subgroup_rotate"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly declares

6026

GroupNonUniformRotateKHR
-Use OpGroupNonUniformRotateKHR instruction

GroupNonUniform

-
-
-
-
-

Instructions

-
-

Add the new instruction:

-
- ---------- - - - - - - - - - - - - - - - - -

OpGroupNonUniformRotateKHR
-
-Return the Value of the invocation whose id within the group is -calculated as follows:
-
-LocalId = SubgroupLocalInvocationId if Execution is Subgroup or LocalInvocationId if Execution is Workgroup
-RotationGroupSize = ClusterSize when ClusterSize is present, otherwise
-RotationGroupSize = SubgroupMaxSize if the Kernel capability is declared and SubgroupSize if not.
-Invocation ID = ( (LocalId + Delta) & (RotationGroupSize - 1) ) + (LocalId & ~(RotationGroupSize - 1))
-
-Result Type must be a scalar or vector of floating-point type, -integer type, or Boolean type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
- The type of Value must be the same as Result Type.
-
-Delta must be a scalar of integer type, whose Signedness operand is 0.
-Delta must be dynamically uniform within Execution.
-
-Delta is treated as unsigned and the resulting value is undefined if the selected lane is inactive.
-
-ClusterSize is the size of the cluster to use. ClusterSize must be a scalar -of integer type, whose Signedness operand is 0. ClusterSize must -come from a constant instruction. Behavior is undefined unless -ClusterSize is at least 1 and a power of 2. If ClusterSize is greater -than the declared SubgroupSize, executing this instruction results -in undefined behavior.

Capability:
-GroupNonUniformRotateKHR -

6 + variable

4431

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<id>
-Value

<id>
-Delta

Optional <id> ClusterSize

-
-
-
-
-
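The Invocation ID formula above can be checked with a direct transcription. This is a sketch for illustration only: the function name `rotate_source` is hypothetical, and the caller is assumed to pass the RotationGroupSize already chosen by the rules above (ClusterSize when present, otherwise the subgroup size).

```python
def rotate_source(local_id, delta, rotation_group_size):
    """Return the ID of the invocation whose Value is read by local_id.

    Transcribes:
      ((LocalId + Delta) & (RotationGroupSize - 1))
        + (LocalId & ~(RotationGroupSize - 1))
    """
    size = rotation_group_size
    # The formula relies on the group size being a power of two.
    if size < 1 or size & (size - 1):
        raise ValueError("rotation_group_size must be a power of 2")
    if local_id < 0 or delta < 0:
        raise ValueError("local_id and delta are treated as unsigned")
    mask = size - 1
    position_in_group = (local_id + delta) & mask  # wraps inside the group
    group_base = local_id & ~mask                  # first ID of this group
    return position_in_group + group_base
```

With a group of 16 and a delta of 2, invocation 0 reads from invocation 2 and invocation 14 reads from invocation 0, matching the Overview. With a cluster size of 4, invocation 7 rotated by 1 reads from invocation 4, so values never cross a cluster boundary.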

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2022-03-02

Kevin Petit

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_subgroup_rotate.html + + +

extensions/KHR/SPV_KHR_subgroup_rotate.html

+ + diff --git a/extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html b/extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html index dabd743..031bc63 100644 --- a/extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html +++ b/extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html @@ -1,260 +1,12 @@ - - - - - - - -SPV_KHR_subgroup_uniform_control_flow - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_subgroup_uniform_control_flow

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2020-05-13

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2020-06-26

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-05-27

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 3, Unified.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-
-
-

Overview

-
-
-

This extension adds a new execution mode SubgroupUniformControlFlowKHR to -extend the guarantees provided by the definition of Uniform Control Flow in -the SPIR-V Specification. The SubgroupUniformControlFlowKHR execution mode -requires that all invocations in each subgroup scope instance must reconverge -if they were uniform for that subgroup instance upon entry into a structured -loop or selection and they all exit via the loop’s or selection’s merge block.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension -must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_subgroup_uniform_control_flow"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Execution Mode

-
-

In section 3.6 "Execution Mode", add the following row to the table:

-
-
-
- -------- - - - - - - - - - - - - - - - -
Execution Mode | Extra Operands | Enabling Capabilities

4421

SubgroupUniformControlFlowKHR
-Extends the definition of Uniform Control Flow to apply to each Subgroup scope instance in addition to the invocation group.

Shader

-
-
-
-
-
-
-

Issues

-
-
-

1) Should this wait for the resolution of memory model issue #125?

-
-
-

RESOLVED: That issue has been resolved and does not materially impact this -extension’s changes.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2020-03-17

Alan Baker

Initial revision

2

2020-05-27

Alan Baker

Rename extension from SPV_KHR_subgroup_reconvergence

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html + + +

extensions/KHR/SPV_KHR_subgroup_uniform_control_flow.html

+ + diff --git a/extensions/KHR/SPV_KHR_subgroup_vote.html b/extensions/KHR/SPV_KHR_subgroup_vote.html index 6fa78c8..3ba9abb 100644 --- a/extensions/KHR/SPV_KHR_subgroup_vote.html +++ b/extensions/KHR/SPV_KHR_subgroup_vote.html @@ -1,625 +1,12 @@ - - - - - - - -SPV_KHR_subgroup_vote - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_subgroup_vote

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kerch Holt, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Ashwin Kolhe, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2016 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2016-10-18

    -
  • -
  • -

    Ratified by the Khronos Board: 2017-01-11

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2016-10-05

Revision

6

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds three new subgroup instructions: OpSubgroupAllKHR, -OpSubgroupAnyKHR, and OpSubgroupAllEqualKHR -to support the OpenGL GL_ARB_shader_group_vote extension in -SPIR-V.

-
-
-

Each of these instructions computes a boolean function over boolean values -contributed by the set of invocations in a subgroup.

-
-
-

The new functionality is enabled under the SubgroupVoteKHR capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_subgroup_vote"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
SubgroupVoteKHR
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

Instructions added under SubgroupVoteKHR capability:

-
-
-
-
OpSubgroupAllKHR
-OpSubgroupAnyKHR
-OpSubgroupAllEqualKHR
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - -

OpSubgroupAllKHR

4428

OpSubgroupAnyKHR

4429

OpSubgroupAllEqualKHR

4430

SubgroupVoteKHR

4431

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Add a new Section 2.21, Subgroup Vote)
-
-
-
-
-

Subgroup Vote

-
-
-

This functionality is available on the OpSubgroupAllKHR, -OpSubgroupAnyKHR, and OpSubgroupAllEqualKHR instructions. -The SubgroupVoteKHR capability must be declared in any module -where these instructions are used.

-
-
-

These operations may be executed within dynamically non-uniform control -flow. -In groups where some invocations do not execute the instruction, the -value returned is not affected by any invocation not executing the -instruction, even when Predicate is well-defined for that invocation.

-
-
-

Since these instructions depend on the values of Predicate in an -implementation-defined group of invocations (the Subgroup), the value -returned by these instructions is implementation-defined. -However, OpSubgroupAnyKHR is guaranteed to return true if -Predicate evaluates to true, and OpSubgroupAllKHR is guaranteed -to return false if Predicate evaluates to false.

-
-
-

Note: Implementations are not required to combine invocations into groups -of any specific size. -When SubgroupSize is 1, OpSubgroupAnyKHR and OpSubgroupAllKHR will -return Predicate and OpSubgroupAllEqualKHR will return true.

-
-
-

For fragment shaders, invocations in a subgroup may include -invocations corresponding to pixels that are covered by a primitive being -rasterized, as well as invocations corresponding to neighboring pixels not -covered by the primitive. -The invocations for these neighboring pixels (HelperInvocation) may be -created so that differencing can be used to evaluate derivative instructions -like OpDPdx and OpDPdy (section 3.32.16) and implicit derivatives used -by OpImageSampleImplicitLod and related functions (section 3.32.10). -The value of Predicate for such HelperInvocations contributes to the -value returned by OpSubgroupAllKHR, OpSubgroupAnyKHR, and -OpSubgroupAllEqualKHR.
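The vote semantics described in this section can be illustrated with a small host-side model. This is only a sketch, not part of the extension: the function names and the `active` mask (marking which invocations execute the same dynamic instance of the instruction) are hypothetical, and a real subgroup is formed by the implementation.

```python
from typing import Sequence


def subgroup_all(predicates: Sequence[bool], active: Sequence[bool]) -> bool:
    """Model of OpSubgroupAllKHR: true if Predicate is true for every active invocation."""
    return all(p for p, a in zip(predicates, active) if a)


def subgroup_any(predicates: Sequence[bool], active: Sequence[bool]) -> bool:
    """Model of OpSubgroupAnyKHR: true if Predicate is true for at least one active invocation."""
    return any(p for p, a in zip(predicates, active) if a)


def subgroup_all_equal(predicates: Sequence[bool], active: Sequence[bool]) -> bool:
    """Model of OpSubgroupAllEqualKHR: true if Predicate is the same for all active invocations."""
    return len({p for p, a in zip(predicates, active) if a}) <= 1


# Hypothetical subgroup of four invocations; invocation 2 took a different
# branch, so its (well-defined) predicate must not affect the result.
predicates = [True, True, False, True]
active = [True, True, False, True]

vote_all = subgroup_all(predicates, active)
vote_any = subgroup_any(predicates, active)
vote_equal = subgroup_all_equal(predicates, active)

# With a subgroup size of 1, All and Any return the predicate itself
# and AllEqual returns true.
single_all = subgroup_all([False], [True])
single_any = subgroup_any([False], [True])
single_equal = subgroup_all_equal([False], [True])
```

In this model the inactive invocation's `False` predicate is ignored, so all three votes over the remaining invocations come out true.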

-
-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

4431

SubgroupVoteKHR

-
-
-
-
(Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions)
-
-
-
- ------- - - - - - - - - - - - - - -

OpSubgroupAllKHR
-
-Evaluates a predicate for all invocations in the current Subgroup that -execute the same dynamic instance of this instruction, resulting in true -if Predicate evaluates to true for all such invocations, otherwise the -result is false. -See Subgroup Vote.
-
-Result Type must be a Boolean type.
-
-Predicate must be a Boolean type.

Capability:
-SubgroupVoteKHR

4

4428

<id>
-Result Type

Result <id>

<id> Predicate

- ------- - - - - - - - - - - - - - -

OpSubgroupAnyKHR
-
-Evaluates a predicate for all invocations in the current Subgroup that -execute the same dynamic instance of this instruction, resulting in true -if Predicate evaluates to true for any such invocations, otherwise -the result is false. -See Subgroup Vote.
-
-Result Type must be a Boolean type.
-
-Predicate must be a Boolean type.

Capability:
-SubgroupVoteKHR

4

4429

<id>
-Result Type

Result <id>

<id> Predicate

- ------- - - - - - - - - - - - - - -

OpSubgroupAllEqualKHR
-
-Evaluates a predicate for all invocations in the current Subgroup that -execute the same dynamic instance of this instruction, resulting -in true if Predicate evaluates the same for such invocations, -otherwise the result is false. -See Subgroup Vote.
-
-Result Type must be a Boolean type.
-
-Predicate must be a Boolean type.

Capability:
-SubgroupVoteKHR

4

4430

<id>
-Result Type

Result <id>

<id> Predicate

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_subgroup_vote"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    SPIR-V 1.1 already has OpGroupAny and OpGroupAll. Are these sufficient?

    -
    -
    -
    -

    RESOLVED: -OpGroupAllEqual(predicate) could be emulated in a compiler front-end -as (OpGroupAll(predicate) || !OpGroupAny(predicate)). However if -the underlying hardware’s instruction set actually has a native AllEqual -instruction this would result in either a) reduced performance since -it must execute two instructions instead of one, or b) complicated -compiler heuristics to detect the above pattern and collapse it back -to one instruction. In order to give the full expressiveness of the -higher level languages (such as GLSL), we’ll add a dedicated -instruction for this.

    -
    -
    -
    -
  2. -
  3. -

    Do we need a capability?

    -
    -
    -
    -

    RESOLVED: -Yes. We’ll add capabilities with extensions so that it’s simpler to move -them into the core without needing complicated consumer logic.

    -
    -
    -
    -
  4. -
  5. -

    Where can these instructions be executed?

    -
    -
    -
    -

    DISCUSSION: -GL_ARB_shader_group_vote says: -"These functions may be called in conditionally executed code. In groups -where some invocations do not execute the function call, the value -returned by the function is not affected by any invocation not calling the -function, even when <value> is well-defined for that invocation."

    -
    -
    -

    The existing SPIR-V OpGroup* instructions say: -"All invocations of this module within Execution must reach this point -of execution. This instruction is only guaranteed to work correctly if -placed strictly within uniform control flow within Execution. This ensures -that if any invocation executes it, all invocations will execute it. If -placed elsewhere, an invocation may stall indefinitely."

    -
    -
    -

    RESOLVED: -Due to the potentially differing semantics between the existing OpGroup* -instructions and the instructions this extension wishes to support, -we’ll add new dedicated instructions here.

    -
    -
    -
    -
  6. -
  7. -

    Should the SubgroupVoteKHR capability be dependent on the Shader -capability?

    -
    -
    -
    -

    RESOLVED: No. -There is no technical reason why it needs to be, and this enables -it to be used in Kernels, if so desired and supported.

    -
    -
    -
    -
  8. -
  9. -

    How do OpGroup{All,Any} differ from OpSubgroup{All,Any}KHR?

    -
    -
    -
    -

    RESOLVED: -The existing OpGroup instructions can only be used in uniform control -flow, and take an execution scope which can either be workgroup or subgroup. -The OpSubgroup*KHR instructions allow execution in dynamically non-uniform -control flow, and only operate at the subgroup scope.

    -
    -
    -
    -
  10. -
-
-
-
-
-
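The emulation mentioned in issue 1, `OpGroupAll(predicate) || !OpGroupAny(predicate)`, can be checked exhaustively for small group sizes. The sketch below is a plain Python model with hypothetical helper names; it says nothing about performance, which is the reason the issue gives for adding a dedicated instruction.

```python
from itertools import product


def group_all(values):
    """True if every contributed predicate is true."""
    return all(values)


def group_any(values):
    """True if at least one contributed predicate is true."""
    return any(values)


def all_equal(values):
    """Reference definition: every contributed predicate has the same value."""
    return len(set(values)) <= 1


def emulated_all_equal(values):
    """The front-end emulation quoted in issue 1."""
    return group_all(values) or not group_any(values)


def emulation_matches(max_size=6):
    """Compare both definitions for every predicate combination up to max_size invocations."""
    for size in range(1, max_size + 1):
        for values in product([False, True], repeat=size):
            if all_equal(values) != emulated_all_equal(values):
                return False
    return True


matches = emulation_matches()
```

The two definitions agree because boolean values are all equal exactly when they are all true or none is true.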

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2016-07-19

Daniel Koch

Initial draft

2

2016-08-09

Daniel Koch

Add issue 2 and 3. Require Subgroup scope. - Editorial changes.

3

2016-08-16

Daniel Koch

Add SubgroupVote capability. - Add language allowing these to be used in conditionally executed code. - Add more expository language about the functionality. - Add Validation rules.

4

2016-09-13

Daniel Koch

Add suffix to capability and beautify. - Move functional language to new section 2.21.

5

2016-09-23

Daniel Koch

Rename to KHR and assign enums. - Use dedicated instructions instead of trying to leverage existing - OpGroup instructions. - Align language with SPV_KHR_shader_ballot. Various clarifications.

6

2016-10-05

Daniel Koch

Incorporated review feedback from dneto.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_subgroup_vote.html + + +

extensions/KHR/SPV_KHR_subgroup_vote.html

+ + diff --git a/extensions/KHR/SPV_KHR_terminate_invocation.html b/extensions/KHR/SPV_KHR_terminate_invocation.html index 0f4d169..4740bfd 100644 --- a/extensions/KHR/SPV_KHR_terminate_invocation.html +++ b/extensions/KHR/SPV_KHR_terminate_invocation.html @@ -1,276 +1,12 @@ - - - - - - - -SPV_KHR_terminate_invocation - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_terminate_invocation

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google LLC

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    John Kessenich, Google LLC

    -
  • -
  • -

    David Neto, Google LLC

    -
  • -
  • -

    Kevin Petit, ARM

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2020-05-13

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2020-06-26

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-05-01

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new instruction, OpTerminateInvocation, to provide disambiguated -functionality compared to OpKill.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_terminate_invocation"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Terms

-
-

In section 2.2.5 "Control Flow", add OpTerminateInvocation to the -Termination Instruction list.

-
-
-
-

Instructions

-
-

In section 3.22.17 "Control-Flow Instructions", add the new instruction:

-
-
-
- ---- - - - - - - - - - - -

OpTerminateInvocation
-
-Fragment-shader terminate.
-
-Ceases all further processing in any invocation that executes it: Only instructions these invocations executed before OpTerminateInvocation will have observable side effects. If this instruction is executed in non-uniform control flow, all subsequent control flow is non-uniform (for invocations that continue to execute).
-
-This instruction must be the last instruction in a block.
-
-This instruction is only valid in the Fragment Execution Model.

Capability:
-Shader

1

4416

-
-
-
-
-
-
-

Issues

-
-
-

1) Why not just continue to use OpKill?

-
-
-

Discussion:

-
-
-

Historically, OpKill was mapped for both GLSL and HLSL discard builtins; however, -the behavior of the two operations differs. HLSL’s discard maps more naturally to -OpDemoteToHelperInvocationEXT (see SPV_EXT_demote_to_helper_invocation). -Implementors naturally implemented OpKill to match their hardware, which might use -either semantics. Introducing OpTerminateInvocation allows for better -disambiguation of the desired behavior by an application and more rigorous -testing of effects.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-04-08

Alan Baker

Initial revision

2

2020-05-01

Alan Baker

Rename extension

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_terminate_invocation.html + + +

extensions/KHR/SPV_KHR_terminate_invocation.html

+ + diff --git a/extensions/KHR/SPV_KHR_uniform_group_instructions.html b/extensions/KHR/SPV_KHR_uniform_group_instructions.html index f5f101c..45035c0 100644 --- a/extensions/KHR/SPV_KHR_uniform_group_instructions.html +++ b/extensions/KHR/SPV_KHR_uniform_group_instructions.html @@ -1,631 +1,12 @@ - - - - - - - -SPV_KHR_uniform_group_instructions - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_uniform_group_instructions

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Alexey Sotkin, Intel

    -
  • -
  • -

    John Pennycook, Intel

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR-V Working Group: 2021-12-08

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2022-01-21

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-11-08

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 5, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new instructions to SPIR-V to support additional group operations within uniform control flow. -Some SPIR-V consumers may only be able to support these operations within uniform control flow for some Execution Scopes, and some SPIR-V consumers may be able to generate more efficient code when control flow is known to be uniform.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_uniform_group_instructions"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6400

GroupUniformArithmeticKHR
-Uses additional group uniform arithmetic instructions.

Groups

-
-
-
-
-

Instructions

-
-

Add new instructions to Section 3.37.21, Group and Subgroup Instructions:

-
- --------- - - - - - - - - - - - - - - - -

OpGroupIMulKHR
-
-An integer multiplication group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of integer type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 1.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6401

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupFMulKHR
-
-A floating-point multiplication group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of floating-point type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 1.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6402

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupBitwiseAndKHR
-
-A bitwise And group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of integer type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is ~0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6403

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupBitwiseOrKHR
-
-A bitwise Or group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of integer type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6404

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupBitwiseXorKHR
-
-A bitwise Xor group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of integer type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6405

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupLogicalAndKHR
-
-A logical And group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of Boolean type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is ~0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6406

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupLogicalOrKHR
-
-A logical Or group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of Boolean type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6407

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

- --------- - - - - - - - - - - - - - - - -

OpGroupLogicalXorKHR
-
-A logical Xor group operation specified for all values of X -specified by invocations in the group.
-
-Behavior is undefined if not all invocations of this module within Execution -reach this point of execution.
-
-Behavior is undefined unless all invocations within Execution execute the -same dynamic instance of this instruction.
-
-Result Type must be a scalar or vector of Boolean type.
-
-Execution is a Scope. It must be either Workgroup or Subgroup.
-
-The identity I for Operation is 0.
-
-The type of X must be the same as Result Type.

Capability:
-GroupUniformArithmeticKHR

6

6408

<id>
-Result Type

Result <id>

Scope <id>
-Execution

<Group Operation>
-Operation

<id>
-X

-
-
-

Issues

-
-

None

-
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2021-11-08

Ben Ashbaugh

Converted to a KHR extension.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_uniform_group_instructions.html + + +

extensions/KHR/SPV_KHR_uniform_group_instructions.html

+ + diff --git a/extensions/KHR/SPV_KHR_untyped_pointers.html b/extensions/KHR/SPV_KHR_untyped_pointers.html index 552a3e5..a5216ad 100644 --- a/extensions/KHR/SPV_KHR_untyped_pointers.html +++ b/extensions/KHR/SPV_KHR_untyped_pointers.html @@ -1,2148 +1,12 @@ - - - - - - - -SPV_KHR_untyped_pointers - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_untyped_pointers

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Hugo Devillers, Saarland University

    -
  • -
  • -

    Tobias Hector, AMD

    -
  • -
  • -

    Caio Oliveira, Intel

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Dmitry Sidorov, Intel

    -
  • -
  • -

    Jeff Bolz, Nvidia

    -
  • -
  • -

    Victor Lomuller, Codeplay

    -
  • -
  • -

    Kevin Petit, Arm

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Provisional

-
-
-
    -
  • -

    Approved by the SPIR-V Working Group: 2024-05-29

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2024-07-12

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-08-08

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 -Revision 3.

-
-
-

This extension modifies SPV_KHR_workgroup_memory_explicit_layout.

-
-
-

This extension modifies SPV_KHR_cooperative_matrix.

-
-
-

This extension modifies the OpenCL.std extended instruction set.

-
-
-
-
-

Overview

-
-
-

This extension introduces support for untyped pointers. It allows for the -declaration and use of pointers that do not specify the type of data they point -to. It also allows memory, atomic and other instructions to reinterpret data -differently than the declared type of the variables they are used with, for -example by loading a vector of floating-point values from a variable with a -declared type of an array of integers. It provides an equivalent set of -functionality to type-punning via pointer casting in high-level languages.
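As a rough host-side analogy for the reinterpretation described in the overview, the following sketch reads the bytes of an integer array as a vector of four floats. It uses Python's `struct` module rather than SPIR-V; the array contents, the little-endian layout, and the function name are assumptions chosen for illustration.

```python
import struct

# Hypothetical storage declared as an array of four 32-bit unsigned integers.
# The values are the IEEE 754 bit patterns of 1.0, 2.0, 3.0 and 4.0.
int_array = [0x3F800000, 0x40000000, 0x40400000, 0x40800000]


def load_vec4_float(words):
    """Reinterpret four 32-bit integers as four 32-bit floats without converting values."""
    raw = struct.pack("<4I", *words)  # the memory as declared: four uint32
    return struct.unpack("<4f", raw)  # the same bytes read as four float32


vec = load_vec4_float(int_array)
print(vec)  # (1.0, 2.0, 3.0, 4.0)
```

The bytes are never converted; only the type used to interpret them at the point of the load changes, which is the behavior the overview compares to type-punning through pointer casts.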

-
-
-

This extension adds the following new instructions:

-
- -
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following OpExtension must -be present in the module:

-
-
-
-
OpExtension "SPV_KHR_untyped_pointers"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Modify Section 2.2.1 Instructions:

-
-
-

Add the following new term:

-
-
-

Variable: An OpVariable or OpUntypedVariableKHR.

-
-
-

Change the following existing terms:

-
-
-

Object: An instantiation of a non-void type, either as the Result -<id> of an operation, or created through a variable.

-
-
-

Memory Object: An object created through a -variable. Such an object exists only for the duration of a -function if it is a function variable, and otherwise exists for the duration of -the invocation.

-
-
-

Memory Object Declaration: A -variable, or an OpFunctionParameter of pointer -type, or the contents of a variable that holds either a pointer to the -PhysicalStorageBuffer storage class or an array of such pointers.

-
-
-

Intermediate Object or Intermediate Value or Intermediate Result: An -object created by an operation (not memory allocated by a -variable) and dying on its last consumption.

-
-
-

Modify Section 2.2.2 Types:

-
-
-

Add the following new term:

-
-
-

Pointer Type: An OpTypePointer or OpTypeUntypedPointerKHR.

-
-
-

Change the following existing terms:

-
-
-

Physical Pointer Type: A pointer type -whose Storage Class uses physical addressing according to the addressing -model.

-
-
-

Variable Pointer: A pointer of logical -pointer type that results from one of the following opcodes:

-
-
-
    -
  • -

    OpSelect

    -
  • -
  • -

    OpPhi

    -
  • -
  • -

    OpFunctionCall

    -
  • -
  • -

    OpPtrAccessChain

    -
  • -
  • -

    OpUntypedPtrAccessChainKHR

    -
  • -
  • -

    OpLoad

    -
  • -
  • -

    OpConstantNull

    -
  • -
-
-
-

Additionally, any OpAccessChain, OpInBoundsAccessChain, -OpUntypedAccessChainKHR, -OpUntypedInBoundsAccessChainKHR or -OpCopyObject that takes a variable pointer as an operand also produces a -variable pointer. An OpFunctionParameter of pointer type is -a variable pointer if any OpFunctionCall to the function statically passes a -variable pointer as the value of the parameter.

-
-
-

Modify Section 2.4 Logical Layout of a Module:

-
-
-

Change references to OpVariable to variable.

-
-
-

Modify Section 2.16.1 Universal Validation Rules:

-
-
-

Modify the list items under the following list item:

-
-
-
-
-

If neither the VariablePointers nor VariablePointersStorageBuffer capabilities -are declared, the following rules apply to logical pointer types:

-
-
-
-
-

Change:

-
-
-
-
-

OpVariable must not allocate an object whose type is or contains a logical pointer type.

-
-
-
-
-

To:

-
-
-
-
-

Variables must not allocate an object whose type is or contains a logical pointer type.

-
-
-
-
-

Change:

-
-
-
-
-

It is invalid for a pointer to be an operand to any instruction other than:

-
-
-
    -
  • -

    OpLoad

    -
  • -
  • -

    OpStore

    -
  • -
  • -

    OpAccessChain

    -
  • -
  • -

    OpInBoundsAccessChain

    -
  • -
  • -

    OpFunctionCall

    -
  • -
  • -

    OpImageTexelPointer

    -
  • -
  • -

    OpCopyMemory

    -
  • -
  • -

    OpCopyObject

    -
  • -
  • -

    OpArrayLength

    -
  • -
  • -

    all OpAtomic instructions

    -
  • -
  • -

    extended instruction-set instructions that are explicitly identified as taking pointer operands

    -
  • -
-
-
-
-
-

To:

-
-
-
-
-

It is invalid for a pointer to be an operand to any instruction other than:

-
-
-
    -
  • -

    OpLoad

    -
  • -
  • -

    OpStore

    -
  • -
  • -

    OpAccessChain

    -
  • -
  • -

    OpInBoundsAccessChain

    -
  • -
  • -

    OpUntypedAccessChainKHR

    -
  • -
  • -

    OpUntypedInBoundsAccessChainKHR

    -
  • -
  • -

    OpFunctionCall

    -
  • -
  • -

    OpImageTexelPointer

    -
  • -
  • -

    OpCopyMemory

    -
  • -
  • -

    OpCopyObject

    -
  • -
  • -

    OpArrayLength

    -
  • -
  • -

    OpUntypedArrayLengthKHR

    -
  • -
  • -

    OpCopyMemorySized

    -
  • -
  • -

    all OpAtomic instructions

    -
  • -
  • -

    extended instruction-set instructions that are explicitly identified as taking pointer operands

    -
  • -
-
-
-
-
-

Change:

-
-
-
-
-

It is invalid for a pointer to be the Result <id> of any instruction other than:

-
-
-
    -
  • -

    OpVariable

    -
  • -
  • -

    OpAccessChain

    -
  • -
  • -

    OpInBoundsAccessChain

    -
  • -
  • -

    OpFunctionParameter

    -
  • -
  • -

    OpImageTexelPointer

    -
  • -
  • -

    OpCopyObject

    -
  • -
-
-
-
-
-

To:

-
-
-
-
-

It is invalid for a pointer to be the Result <id> of any instruction other than:

-
-
- -
-
-
-
-

Change:

-
-
-
-
-

All indexes in OpAccessChain and OpInBoundsAccessChain that are OpConstant with -type of OpTypeInt with a signedness of 1 must not have their sign bit set.

-
-
-
-
-

To:

-
-
-
-
-

All indexes in OpAccessChain, OpInBoundsAccessChain, -OpUntypedAccessChainKHR and -OpUntypedInBoundsAccessChainKHR that are -OpConstant with type of OpTypeInt with a signedness of 1 must not have -their sign bit set.

-
-
-
-
-

Modify the list items under the following list item:

-
-
-
-
-

If the VariablePointers or VariablePointersStorageBuffer capability is -declared, the following are allowed for logical pointer types:

-
-
-
-
-

Change:

-
-
-
-
-

If OpVariable allocates an object whose type is or contains a logical pointer -type, the Storage Class operand of the OpVariable must be one of the -following:

-
-
-
    -
  • -

    Function

    -
  • -
  • -

    Private

    -
  • -
-
-
-
-
-

To:

-
-
-
-
-

If a variable allocates an object whose type is or contains a logical pointer -type, the Storage Class operand of the variable must be one of the -following:

-
-
-
    -
  • -

    Function

    -
  • -
  • -

    Private

    -
  • -
-
-
-
-
-

Change:

-
-
-
-
-

A pointer can be a variable pointer or an operand to one of:

-
-
-
    -
  • -

    OpPtrAccessChain

    -
  • -
  • -

    OpPtrEqual

    -
  • -
  • -

    OpPtrNotEqual

    -
  • -
  • -

    OpPtrDiff

    -
  • -
-
-
-
-
-

To:

-
-
-
-
-

A pointer can be a variable pointer or an operand to one of:

-
-
- -
-
-
-
-

Change:

-
-
-
-
-

The instructions OpPtrEqual and OpPtrNotEqual can be used only if the -Storage Class of the operands OpTypePointer declaration:

-
-
-
-
-

To:

-
-
-
-
-

The instructions OpPtrEqual and OpPtrNotEqual can be used only if the -Storage Class of the operands pointer type declaration:

-
-
-
-
-

Modify the list items under the following list item:

-
-
-
-
-

A variable pointer must not:

-
-
-
-
-

Change:

-
-
-
-
-

be an operand to an OpArrayLength instruction

-
-
-
-
-

To:

-
-
-
-
-

be an operand to an OpArrayLength or OpUntypedArrayLengthKHR instruction

-
-
-
-
-

Modify the list items under the following list item:

-
-
-
-
-

Physical Storage Buffer

-
-
-
-
-

Change:

-
-
-
-
-

OpVariable must not use the PhysicalStorageBuffer storage class.

-
-
-
-
-

To:

-
-
-
-
-

Variables must not use the PhysicalStorageBuffer storage class.

-
-
-
-
-

Change:

-
-
-
-
-

If the type an OpVariable points to is a pointer (or array of pointers) in the -PhysicalStorageBuffer storage class, the OpVariable must be decorated with -exactly one of AliasedPointer or RestrictPointer.

-
-
-
-
-

To:

-
-
-
-
-

If the type a variable points to is a pointer (or array of pointers) in the -PhysicalStorageBuffer storage class, the variable must be decorated with -exactly one of AliasedPointer or RestrictPointer.

-
-
-
-
-

Modify the list items under the following list item:

-
-
-
-
-

Global (Module Scope) Variables

-
-
-
-
-

Change:

-
-
-
-
-

A module-scope OpVariable with an Initializer operand must not be decorated -with the Import Linkage Type.

-
-
-
-
-

To:

-
-
-
-
-

A module-scope variable with an Initializer operand must not be decorated -with the Import Linkage Type.

-
-
-
-
-

Change the list items under the following list item:

-
-
-
-
-

The capabilities StorageBuffer16BitAccess, UniformAndStorageBuffer16BitAccess, -StoragePushConstant16, and StorageInputOutput16 do not generally add 16-bit -operations. Rather, they add only the following specific abilities:

-
-
-
-
-

Change:

-
-
-
-
-

A structure containing a 16-bit member can be an operand to OpArrayLength.

-
-
-
-
-

To:

-
-
-
-
-

A structure containing a 16-bit member can be an operand to OpArrayLength or OpUntypedArrayLengthKHR.

-
-
-
-
-

Add the following list items:

-
-
-
-
- -
-
-
-
-

Change list items under the following list item:

-
-
-
-
-

The capabilities StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess, -and StoragePushConstant8, do not generally add 8-bit operations. Rather, they -add only the following specific abilities:

-
-
-
-
-

Change:

-
-
-
-
-

A structure containing a 8-bit member can be an operand to OpArrayLength.

-
-
-
-
-

To:

-
-
-
-
-

A structure containing an 8-bit member can be an operand to OpArrayLength or OpUntypedArrayLengthKHR.

-
-
-
-
-

Add the following list items:

-
-
-
-
- -
-
-
-
-

Modify Section 2.16.2 Validation Rules for Shader Capabilities:

-
-
-

Modify the list items under the following list item:

-
-
-
-
-

Composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, and -PushConstant Storage Classes must be explicitly laid out. The following apply -to all the aggregate and matrix types describing such an object, recursively -through their nested types:

-
-
-
-
-

Add the following list items:

-
-
-
-
- -
-
-
-
-

Modify the list items under the following list item:

-
-
-
-
-

Type Rules:

-
-
-
-
-

Change:

-
-
-
-
-

All declared types are restricted to those types that are, or are contained -within, valid types for an OpVariable Result Type or an OpTypeFunction Return -Type.

-
-
-
-
-

To:

-
-
-
-
-

All declared types are restricted to those types that are, or are contained -within, valid types for an OpVariable Result Type, an -OpUntypedVariableKHR Data Type, -or an OpTypeFunction Return Type.

-
-
-
-
-

Change:

-
-
-
-
-

Aggregate types for intermediate objects are restricted to those types that are -a valid Type of an OpVariable Result Type in the global storage classes.

-
-
-
-
-

To:

-
-
-
-
-

Aggregate types for intermediate objects are restricted to those types that are -a valid Type of an OpVariable Result Type, or an -OpUntypedVariableKHR Data Type in the global -storage classes.

-
-
-
-
-

Modify Section 2.17 Universal Limits:

-
-
-

Change the table entry:

-
-
-
-
-

Indexes for OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain, -OpInBoundsPtrAccessChain, OpCompositeExtract, and OpCompositeInsert

-
-
-
-
-

To:

-
-
-
-
-

Indexes for OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain, -OpInBoundsPtrAccessChain, OpCompositeExtract, OpCompositeInsert, -OpUntypedAccessChainKHR, -OpUntypedInBoundsAccessChainKHR, -OpUntypedPtrAccessChainKHR, and -OpUntypedInBoundsPtrAccessChainKHR

-
-
-
-
-

Modify Section 2.18 Memory Model:

-
-
-

Change references to OpVariable to variable.

-
-
-

Add a new section at the end of Section 2 Specification titled Untyped Pointers:

-
-
-

OpTypePointer includes the data type of the memory that it points to as an -operand of the type-declaration. Logical pointer types of type OpTypePointer -are strongly typed. That is, the data they point to cannot be reinterpreted as -another type in memory. Physical pointer types of type OpTypePointer are -not strongly typed, as OpBitcast can be used to cast from one representation -to another. Unlike OpTypePointer, OpTypeUntypedPointerKHR -does not encode the type of data that it points to. This means that -interpretation of the data type is left to instructions that utilize the -pointer.

-
-
-

Each untyped instruction (OpUntyped…​) has an operand that specifies how the -data should be interpreted (e.g. Base Type in -OpUntypedAccessChainKHR). Also, -OpUntypedAccessChainKHR, -OpUntypedInBoundsAccessChainKHR, -OpUntypedPtrAccessChainKHR, and -OpUntypedInBoundsPtrAccessChainKHR -may take either a typed or untyped pointer as the Base operand. This -facilitates translations from high-level languages as it can localize where -untyped pointers appear in syntax evaluation.

-
-
-

When instructions that access memory (e.g. OpLoad or atomic -instructions) have a pointer operand with type -OpTypeUntypedPointerKHR, the interpreted data type is specified by the Result Type if -it exists. The interpreted data type for instructions without a Result Type -(e.g. OpStore) comes from the type of the operand of the object being stored. -OpCopyMemorySized interprets the data as an array of 8-bit integers.

-
-
-

When an instruction accesses memory via an untyped pointer for storage class -S and with interpreted data type T, the instruction behaves as if the -pointer were of type OpTypePointer having Storage Class S and Type T. -That is, the instruction will access exactly the same memory locations and -interpret the data there as if using the corresponding strongly typed pointer.

-
-
-
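The interpretation rule above can be modeled outside SPIR-V. The following is only a rough Python sketch, not part of the extension: `memory`, `FORMATS`, `load`, and `store` are hypothetical names, and little-endian byte order is an assumption made purely for illustration. It shows the same bytes being read as different types depending on the requested result type, which is the role Result Type plays for a load through an untyped pointer.

```python
import struct

# Hypothetical stand-in for memory reached through an untyped pointer.
# Little-endian layout is an assumption made only for this illustration.
memory = bytearray(struct.pack("<If", 0x3F800000, 2.5))

# Hypothetical table: interpreted data type -> struct format code.
FORMATS = {"uint32": "<I", "float32": "<f"}


def load(mem, byte_offset, result_type):
    """Model a load: the requested result type selects the interpretation."""
    return struct.unpack_from(FORMATS[result_type], mem, byte_offset)[0]


def store(mem, byte_offset, object_type, value):
    """Model a store: the type of the stored object selects the interpretation."""
    struct.pack_into(FORMATS[object_type], mem, byte_offset, value)
```

Reading offset 0 as `uint32` and as `float32` returns two different values from identical bytes, exactly as if a typed pointer to the corresponding type had been used.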

Modify Section 3.7 Storage Class:

-
-
-

Add OpTypeUntypedPointerKHR and -OpUntypedVariableKHR to the list of "Used by" -instructions.

-
-
-

Modify Section 3.20 Decoration:

-
-
-

Change references to OpVariable to variable.

-
-
-

Modify Section 3.21 BuiltIn:

-
-
-

Change references to OpVariable to variable.

-
-
-

Modify Section 3.31 Capability:

-
-
-

Change references to OpTypePointer to pointer type.

-
-
-

Add the following rows to the table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

4473

UntypedPointersKHR
-
-Enables the use of untyped pointers.

-
-
-
-

Modify Section 3.37 Instructions:

-
-
-

In the following instructions, change references to OpVariable to variable:

-
-
-
    -
  • -

    OpDecorateId

    -
  • -
  • -

    OpEntryPoint

    -
  • -
  • -

    OpTypeBool

    -
  • -
-
-
-

Change the description of Result Type in OpImageTexelPointer to:

-
-
-
-
-

Result Type must be a pointer type whose Storage Class is Image. If it is an -OpTypePointer type, its Type operand must be a numerical scalar type or OpTypeVoid.

-
-
-
-
-

Change the description of Pointer in OpLoad to:

-
-
-
-
-

Pointer is the pointer to load through. It must be a pointer type. If it is an -OpTypePointer type, its Type operand must be the same as Result Type.

-
-
-
-
-

Change the description of Pointer in OpStore to:

-
-
-
-
-

Pointer is the pointer to store through. It must be a pointer -type. If it is an OpTypePointer type, its Type operand must be the same -as the type of Object.

-
-
-
-
-

Change the description of OpCopyMemory to:

-
-
-
-
-

Copy from the memory pointed to by Source to the memory pointed to by Target. -Both operands must be pointers and at least one must be an OpTypePointer type. -If either Source or Target has a type of OpTypePointer, the <id> Type -operand must be non-void. -If both Source and Target have a type of OpTypePointer, they must have -the same <id> Type operand. -Matching Storage Class is not required. -The amount of memory copied is the size of the type pointed to by an operand -with a type of OpTypePointer. -The copied type must have a fixed size; i.e., it must not be, nor include, any -OpTypeRuntimeArray types.

-
-
-

If present, any Memory Operands must begin with a memory operand literal. If -not present, it is the same as specifying the memory operand None. Before -version 1.4, at most one memory operands mask can be provided. Starting with -version 1.4 two masks can be provided, as described in Memory Operands. If no -masks or only one mask is present, it applies to both Source and Target. If two -masks are present, the first applies to Target and must not include -MakePointerVisible, and the second applies to Source and must not include -MakePointerAvailable.

-
-
-
-
-

Add the enabling capability UntypedPointersKHR to OpCopyMemorySized.

-
-
-

Change the restrictions on Operand 1 and Operand 2 in OpPtrEqual and OpPtrNotEqual to:

-
-
-
-
-

The Storage Class operand of the type of both Operand 1 and Operand 2 must match. -If the types of Operand 1 and Operand 2 are OpTypePointer, they must be the same type.

-
-
-
-
-

Change the restriction on Operand 1 and Operand 2 in OpPtrDiff to:

-
-
-
-
-

The types of Operand 1 and Operand 2 must be the same OpTypePointer or -OpTypeUntypedPointerKHR. If the types of Operand 1 and Operand 2 are -OpTypePointer, they must point to a type that can be aggregated into an array. -For an array of length L, Operand 1 and Operand 2 can point to any -element in the range [0, L], where element L is outside the array but has a -representative address computed with the same stride as elements in the array. -Additionally, Operand 1 must be a valid Base operand of OpPtrAccessChain, -OpUntypedPtrAccessChainKHR, -OpInBoundsPtrAccessChain, or -OpUntypedInBoundsPtrAccessChainKHR. -Behavior is undefined if Operand 1 and Operand 2 are not pointers to -element numbers in [0, L] in the same array. If Operand 1 and Operand 2 are -OpTypeUntypedPointerKHR, the array is interpreted as -an array of 8-bit integers.

-
-
-
-
-
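As a rough illustration of the OpPtrDiff wording above, the sketch below models pointers as plain byte addresses. `ptr_diff` and `stride_bytes` are hypothetical names, and the assumption that the result counts elements from Operand 2 to Operand 1 reflects a reading of the base specification rather than anything stated in this extension. For untyped pointers the array is treated as 8-bit integers, so the stride is one byte.

```python
def ptr_diff(addr1, addr2, stride_bytes=1):
    """Number of elements to add to operand 2 to reach operand 1.

    Addresses are modeled as byte offsets. stride_bytes defaults to 1,
    matching the 8-bit integer interpretation used for untyped pointers.
    """
    delta = addr1 - addr2
    if delta % stride_bytes != 0:
        raise ValueError("pointers are not a whole number of elements apart")
    return delta // stride_bytes
```

With a typed pointer to 4-byte elements, `ptr_diff(0x120, 0x100, 4)` gives an element count, while the untyped form `ptr_diff(0x120, 0x100)` gives a byte count for the same two addresses.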

Change the description of Result Type in OpPtrCastToGeneric to:

-
-
-
-
-

Result Type must be a pointer type. Its Storage Class must be Generic.

-
-
-
-
-

Change the description of OpGenericCastToPtr to:

-
-
-
-
-

Convert a pointer’s Storage Class to a non-Generic class.

-
-
-

Result Type must be a pointer type. Its Storage Class must be -Workgroup, CrossWorkgroup, or Function.

-
-
-

Pointer must point to the Generic Storage Class.

-
-
-

If Result Type and the type of Pointer are OpTypePointer, they must point to the same type.

-
-
-
-
-

Change the description of OpGenericCastToPtrExplicit to:

-
-
-
-
-

Attempts to explicitly convert Pointer to Storage storage-class pointer value.

-
-
-

Result Type must be a pointer type. Its Storage Class must be Storage.

-
-
-

The type of Pointer must be a pointer type. Pointer must point to the -Generic Storage Class. If the cast fails, the instruction result is an -OpConstantNull pointer in the Storage Storage Class.

-
-
-

If Result Type and the type of Pointer are OpTypePointer, they must point to the same type.

-
-
-

Storage must be one of the following literal values from Storage Class: -Workgroup, CrossWorkgroup, or Function.

-
-
-
-
-

Change the description of OpBitcast to:

-
-
-
-
-

Bit pattern-preserving type conversion.

-
-
-

Result Type must be a pointer type, or a scalar or vector of numerical-type.

-
-
-

Operand must be a pointer type, or a scalar or vector of -numerical-type. It must be a different type than Result Type.

-
-
-

Before version 1.5: If either Result Type or Operand is a pointer, the other -must be a pointer or an integer scalar. -Starting with version 1.5: If either Result Type or Operand is a pointer, the -other must be a pointer, an integer scalar, or an integer vector.

-
-
-

If Result Type has the same number of components as Operand, they must also -have the same component width, and results are computed per component.

-
-
-

If Result Type has a different number of components than Operand, the total -number of bits in Result Type must equal the total number of bits in Operand. -Let L be the type, either Result Type or Operand’s type, that has the larger -number of components. Let S be the other type, with the smaller number of -components. The number of components in L must be an integer multiple of the -number of components in S. The first component (that is, the only or -lowest-numbered component) of S maps to the first components of L, and so on, -up to the last component of S mapping to the last components of L. Within this -mapping, any single component of S (mapping to multiple components of L) maps -its lower-ordered bits to the lower-numbered components of L.

-
-
-
-
-
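The component mapping in the OpBitcast description can be sketched in Python. This is an illustrative model only: `bitcast` is a hypothetical helper, and components are represented as unsigned integers of a fixed bit width. It follows the stated rule that lower-ordered bits map to lower-numbered components.

```python
def bitcast(components, src_width, dst_width):
    """Reinterpret unsigned integer components of src_width bits as dst_width bits."""
    total_bits = len(components) * src_width
    if total_bits % dst_width != 0:
        raise ValueError("total number of bits must match")
    dst_count = total_bits // dst_width
    larger = max(len(components), dst_count)
    smaller = min(len(components), dst_count)
    if larger % smaller != 0:
        raise ValueError("component counts must be integer multiples")

    # Pack the source so that component i occupies bits [i*w, (i+1)*w).
    bits = 0
    src_mask = (1 << src_width) - 1
    for i, value in enumerate(components):
        bits |= (value & src_mask) << (i * src_width)

    # Unpack: lower-ordered bits land in lower-numbered components.
    dst_mask = (1 << dst_width) - 1
    return [(bits >> (i * dst_width)) & dst_mask for i in range(dst_count)]
```

For example, one 32-bit value `0x04030201` splits into four 8-bit components starting from the least significant byte, and four 16-bit components combine pairwise into two 32-bit components.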

Change the description of Pointer in OpLifetimeStart and OpLifetimeStop to:

-
-
-
-
-

Pointer is a pointer to the object whose lifetime is starting/ending. Its type must be -a pointer type with Storage Class Function.

-
-
-
-
-

Change the description of Pointer in OpAtomicLoad to:

-
-
-
-
-

Pointer is the pointer to the memory to read. It must be a -pointer type. If its type is OpTypePointer, the type of the -value pointed to by Pointer must be the same as Result Type.

-
-
-
-
-

Change the description of Pointer in OpAtomicStore to:

-
-
-
-
-

Pointer is the pointer to the memory to write. It must be a -pointer type. If its type is OpTypePointer, the type it -points to must be a scalar of integer type or floating-point type.

-
-
-
-
-

Change the description of Value in OpAtomicExchange to:

-
-
-
-
-

The type of Value must be the same as Result Type. Pointer must be a -pointer type. If the type of Pointer is OpTypePointer, the -type of the value pointed to by Pointer must be the same as Result Type.

-
-
-
-
-

Change the description of Value in OpAtomicCompareExchange to:

-
-
-
-
-

The type of Value must be the same as Result Type. Pointer must be a -pointer type. If the type of Pointer is OpTypePointer, the -type of the value pointed to by Pointer must be the same as Result Type. -This type must also match the type of Comparator.

-
-
-
-
-

Change the description of Value in OpAtomicIIncrement, OpAtomicIDecrement, -OpAtomicIAdd, OpAtomicISub, OpAtomicSMin, OpAtomicUMin, OpAtomicSMax, -OpAtomicUMax, OpAtomicAnd, OpAtomicOr, and OpAtomicXor to:

-
-
-
-
-

The type of Value must be the same as Result Type. Pointer must be a -pointer type. If the type of Pointer is OpTypePointer, the -type of the value pointed to by Pointer must be the same as Result Type.

-
-
-
-
-

Add the following instruction to Section 3.37.6 Type-Declaration Instructions:

-
- ------ - - - - - - - - - - - - -

OpTypeUntypedPointerKHR
-
-Declare a new pointer type.
-
-Storage Class is the Storage Class of the memory holding object pointed to. Refer to the client API for allowed storage classes.

Capability:
-UntypedPointersKHR

3

4417

Result <id>

Storage Class

-
-

Add the following instructions to Section 3.37.8 Memory Instructions:

-
- --------- - - - - - - - - - - - - - - - -

OpUntypedVariableKHR
-
-Allocate an object in memory, resulting in a pointer to it.
-
-Result Type must be an OpTypeUntypedPointerKHR.
-
-Storage Class is the Storage Class of the memory holding the object. It must not be Generic. It must be the same storage class as the Storage Class operand of the Result Type.
-
-Data Type is optional. It is the type of the object in memory. Data Type must be specified if Storage Class is Function, Private, or Workgroup. Refer to the client API for other storage classes.
-
-Initializer is optional. If Initializer is present, it will be the initial value of the variable’s memory content. Initializer must be an <id> from a constant instruction or a global (module scope) variable. Initializer must have the same type as Data Type.

Capability:
-UntypedPointersKHR

4 + variable

4418

<id> Result Type

Result <id>

Storage Class

Optional <id> Data Type

Optional <id> Initializer

- --------- - - - - - - - - - - - - - - - -

OpUntypedAccessChainKHR
-
-Has the same semantics as OpAccessChain, with the following additions:
-- Result Type must be an OpTypeUntypedPointerKHR. Its Storage Class operand must be the same Storage Class as Base.
-- a Base Type operand. It must be a non-pointer type-declaration instruction.
-- Base must be a pointer type.
-- Indexes walk the type hierarchy of Base Type instead of Base.

Capability:
-UntypedPointersKHR

5 + variable

4419

<id> Result Type

Result <id>

<id> Base Type

<id> Base

<id>, <id>, …​
-Indexes

- --------- - - - - - - - - - - - - - - - -

OpUntypedInBoundsAccessChainKHR
-
-Has the same semantics as OpUntypedAccessChainKHR, with the addition that the resulting pointer is known to point within the base object.

Capability:
-UntypedPointersKHR

5 + variable

4420

<id> Result Type

Result <id>

<id> Base Type

<id> Base

<id>, <id>, …​
-Indexes

- ---------- - - - - - - - - - - - - - - - - -

OpUntypedPtrAccessChainKHR
-
-Has the same semantics as OpPtrAccessChain, with the following additions:
-- Result Type must be an OpTypeUntypedPointerKHR. Its Storage Class operand must be the same Storage Class as Base.
-- a Base Type operand. It must be a non-pointer type-declaration instruction.
-- Base must be a pointer type.
-- Element is used to generate an OpUntypedAccessChainKHR Base.
-- Indexes walk the type hierarchy of Base Type instead of Base.

Capability:
-UntypedPointersKHR

6 + variable

4423

<id> Result Type

Result <id>

<id> Base Type

<id> Base

<id> Element

<id>, <id>, …​
-Indexes

- ---------- - - - - - - - - - - - - - - - - -

OpUntypedInBoundsPtrAccessChainKHR
-
-Has the same semantics as OpUntypedPtrAccessChainKHR, with the addition that the resulting pointer is known to point within the base object.

Capability:
-UntypedPointersKHR

6 + variable

4424

<id> Result Type

Result <id>

<id> Base Type

<id> Base

<id> Element

<id>, <id>, …​
-Indexes

- --------- - - - - - - - - - - - - - - - -

OpUntypedArrayLengthKHR
-
-Length of a run-time array.
-
-Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.
-
-Structure must be a Block-decorated OpTypeStruct whose last member is a run-time array.
-
-Pointer must be a pointer type. Pointer must have the same value as a descriptor. That is, the value must be the same as a variable decorated with DescriptorSet and Binding or an element in such a variable when the data type is an array of Block-decorated structures.
-
-Array member is an unsigned 32-bit integer index of the last member of Structure. That member’s type must be from OpTypeRuntimeArray.

Capability:
-UntypedPointersKHR

6

4425

<id> Result Type

Result <id>

<id> Structure

<id> Pointer

Literal Array member

- --------- - - - - - - - - - - - - - - - -

OpUntypedPrefetchKHR
-
-Prefetch Num Bytes bytes of data from Pointer into the global cache. -This instruction does not affect the functionality of the module.
-
-Pointer must be a pointer whose Storage Class is CrossWorkgroup.
-
-Num Bytes is the number of bytes to prefetch. Its type must be an integer scalar.
-
-RW is optional. -If RW is present, it specifies whether the fetch should be for a read or write. -It must be a constant instruction with an integer scalar type. -The value must be either 0 (for read) or 1 (for write).
-
-Locality is optional. -If Locality is present, it specifies the temporal locality for the caching. -It must be a constant instruction with an integer scalar type. -The value must be between 0 (for no locality) and 3 (for extreme locality) inclusive.
-
-Cache Type is optional. -If Cache Type is present, it specifies the type of cache. -It must be a constant instruction with an integer scalar type. -The value must be either 0 (for instruction cache) or 1 (for data cache).
-
-The default values of all optional operands are implementation defined.

Capability:
-UntypedPointersKHR

3 + variable

4426

<id> Pointer Type

<id> Num Bytes

Optional <id> RW

Optional <id> Locality

Optional <id> Cache Type

-
-
-
-
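For OpUntypedAccessChainKHR, the statement that Indexes walk the type hierarchy of Base Type can be pictured with a small Python model. Everything here is a hypothetical sketch: the `Scalar`, `Array`, and `Struct` classes and `access_chain_offset` are invented for illustration, and the byte arithmetic assumes an explicitly laid out Base Type (member offsets and array strides), which the extension text itself does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Scalar:
    size: int  # size in bytes


@dataclass
class Array:
    element: object  # element type
    stride: int      # assumed ArrayStride, in bytes


@dataclass
class Struct:
    members: list  # list of (assumed Offset in bytes, member type)


def access_chain_offset(base_type, indexes):
    """Walk base_type with indexes; return (byte offset, resulting type)."""
    offset = 0
    current = base_type
    for index in indexes:
        if isinstance(current, Array):
            offset += index * current.stride
            current = current.element
        elif isinstance(current, Struct):
            member_offset, member_type = current.members[index]
            offset += member_offset
            current = member_type
        else:
            raise TypeError("cannot index into a scalar type")
    return offset, current


uint = Scalar(4)
block = Struct([(0, uint), (16, Array(uint, 4))])
```

Because the hierarchy comes from the Base Type operand rather than from the pointer, the same base pointer can be walked with `block` in one instruction and with, say, an array of 2-byte scalars in another.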

Modifications to the extension SPV_KHR_workgroup_memory_explicit_layout

-
-
-

Change:

-
-
-
-
-

If WorkgroupMemoryExplicitLayoutKHR capability is declared, for each entry point in the module

-
-
-
    -
  • -

    Either all or none of the Workgroup Storage Class variables in the entry point interface must point to struct types decorated with Block.

    -
  • -
  • -

    If more than one Workgroup Storage Class variable in the entry point interface point to a type decorated with Block, all of them must be decorated with Aliased.

    -
  • -
-
-
-
-
-

To:

-
-
-
-
-

If WorkgroupMemoryExplicitLayoutKHR capability is declared, for each entry point in the module

-
-
-
    -
  • -

    Either all or none of the Workgroup Storage Class variables in the entry point -interface must point to struct types decorated with Block.

    -
  • -
  • -

    If more than one Workgroup Storage Class variable in the entry point interface -point to a type decorated with Block, all of them must be decorated with Aliased, -unless the UntypedPointersKHR capability is declared. Only those variables -decorated with Aliased may alias each other.

    -
  • -
-
-
-
-
-

Change:

-
-
-
-
-

In addition to the above table, memory object declarations in the -CrossWorkgroup, Function, Input, Output or Private storage classes must also -have matching pointee types for aliasing to be present. The restriction also -applies for Workgroup storage class, except when -WorkgroupMemoryExplicitLayoutKHR capability is declared and the pointee types -are structs decorated with Block. In all other cases the decoration is ignored.

-
-
-
-
-

To:

-
-
-
-
-

In addition to the above table, memory object declarations in the -CrossWorkgroup, Function, Input, Output or Private storage classes must also -have matching pointee types for aliasing to be present. The restriction also -applies for Workgroup storage class, except when -WorkgroupMemoryExplicitLayoutKHR capability is declared and the pointee types -are structs decorated with Block or the pointer has the type -OpTypeUntypedPointerKHR. In all other cases the -decoration is ignored.

-
-
-
-
-
-
-

Modifications to the extension SPV_KHR_cooperative_matrix

-
-
-

In the descriptions of OpCooperativeMatrixLoadKHR and -OpCooperativeMatrixStoreKHR change:

-
-
-
-
-

Pointer is a pointer. -Its type must be an OpTypePointer whose Type operand is a scalar or vector -type. -If the Shader capability was declared, Pointer must point into an array and any -ArrayStride decoration on Pointer is ignored.

-
-
-
-
-

To:

-
-
-
-
-

Pointer is a pointer. -Its type must be a pointer type. -If it is an OpTypePointer, its Type operand must be a scalar or vector -type. -If the Shader capability was declared and Pointer’s type is -OpTypePointer, Pointer must point into an array and any ArrayStride -decoration on Pointer is ignored.

-
-
-
-
-

And, change:

-
-
-
-
-

Stride further qualifies how matrix elements are laid out in memory. -It must be a scalar integer type and its exact semantics depend on -MemoryLayout.

-
-
-
-
-

To:

-
-
-
-
-

Stride further qualifies how matrix elements are laid out in memory. -It must be a scalar integer type and its exact semantics depend on -MemoryLayout. -When the type of Pointer is OpTypePointer, Stride is specified in number -of elements based on the Type operand of the pointer type. -When the type of Pointer is OpTypeUntypedPointerKHR, Stride is specified -in bytes.

-
-
-
-
-
-
-
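The two Stride conventions above differ only by the element size. A minimal Python sketch, with the hypothetical helper `stride_in_bytes` and its parameter names chosen purely for illustration:

```python
def stride_in_bytes(stride, pointer_is_typed, element_size_bytes):
    """Convert a Stride operand to bytes.

    Typed pointer: Stride counts elements of the pointee type.
    Untyped pointer: Stride is already expressed in bytes.
    """
    if pointer_is_typed:
        return stride * element_size_bytes
    return stride
```

A Stride of 16 through a typed pointer to 4-byte elements and a Stride of 64 through an untyped pointer therefore describe the same distance in memory.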

Modifications to the OpenCL.std extended instruction set

-
-
-

Change the pointer naming conventions from:

-
-
-
-
-
    -
  • -

    pointer(storage) denotes an OpTypePointer which points to the storage Storage Class.

    -
    -
      -
    • -

      pointer(constant) denotes an OpTypePointer which points to the UniformConstant Storage Class.

      -
    • -
    • -

      pointer(generic) denotes an OpTypePointer which points to the Generic Storage Class.

      -
    • -
    • -

      pointer(global) denotes an OpTypePointer which points to the CrossWorkgroup Storage Class.

      -
    • -
    • -

      pointer(local) denotes an OpTypePointer which points to the Workgroup Storage Class.

      -
    • -
    • -

      pointer(private) denotes an OpTypePointer which points to the Function Storage Class.

      -
    • -
    -
    -
  • -
-
-
-
-
-

To:

-
-
-
-
-
    -
  • -

    pointer(storage) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the storage Storage Class.

    -
    -
      -
    • -

      pointer(constant) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the UniformConstant Storage Class.

      -
    • -
    • -

      pointer(generic) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the Generic Storage Class.

      -
    • -
    • -

      pointer(global) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the CrossWorkgroup Storage Class.

      -
    • -
    • -

      pointer(local) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the Workgroup Storage Class.

      -
    • -
    • -

      pointer(private) denotes an OpTypePointer or OpTypeUntypedPointerKHR which points to the Function Storage Class.

      -
    • -
    -
    -
  • -
-
-
-
-
-

In the descriptions of the extended instructions, whenever a pointer operand is described as pointer(p1, p2, …​) to data types, split the sentence into two as follows:

-
-
-
-
-

operand must be a pointer(p1, …​). -If it is a typed pointer, it must point to data types.

-
-
-
-
-

This applies to the following instructions:

-
-
-
    -
  • -

    ptr in fract

    -
  • -
  • -

    exp in frexp

    -
  • -
  • -

    signp in lgamma_r

    -
  • -
  • -

    iptr in modf

    -
  • -
  • -

    quo in remquo

    -
  • -
  • -

    cosval in sincos

    -
  • -
  • -

    p in vloadn

    -
  • -
  • -

    p in vstoren

    -
  • -
  • -

    p in vload_half

    -
  • -
  • -

    p in vload_halfn

    -
  • -
  • -

    p in vstore_half

    -
  • -
  • -

    p in vstore_half_r

    -
  • -
  • -

    p in vstore_halfn

    -
  • -
  • -

    p in vstore_halfn_r

    -
  • -
  • -

    p in vloada_halfn

    -
  • -
  • -

    p in vstorea_halfn

    -
  • -
  • -

    p in vstorea_halfn_r

    -
  • -
  • -

    format in printf

    -
  • -
-
-
-

In the above instructions, any type matching rule that applies to a pointee type is only applied to typed pointers. -For untyped pointers, the instructions behave as if the pointer were an appropriate typed pointer.

-
-
-

Note: prefetch should be replaced with OpUntypedPrefetchKHR.

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Should this extension modify any other extensions?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    This extension modifies SPV_KHR_workgroup_memory_explicit_layout and -SPV_KHR_cooperative_matrix.

    -
    -
    -
    -
  2. -
  3. -

    Should this extension include pointer access chain equivalents?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    OpUntypedPtrAccessChainKHR and -OpUntypedInBoundsPtrAccessChainKHR are not strictly necessary. -OpUntypedAccessChainKHR (or OpUntypedInBoundsAccessChainKHR) could be used -in place in all cases by changing the Base Type to be an array instead of -just the element type; however, to simplify implementation transitions these -instructions are included.

    -
    -
    -
    -
  4. -
  5. -

    Should this extension modify any extended instructions?

    -
    -
    -
    -

    Resolved

    -
    -
    -

    This extension modifies the OpenCL.std extended instruction set. -GLSL.std.450 is not modified as the interpolation instructions operate on the -Input storage class and FrexpStruct and ModfStruct should be preferred to -the versions that utilize pointers.

    -
    -
    -
    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - -

Rev

Date

Author

Changes

2

2024-08-08

Kevin Petit

Clarify OpPtrDiff support

1

2024-05-29

Alan Baker

Initial Revision

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_untyped_pointers.html + + +

extensions/KHR/SPV_KHR_untyped_pointers.html

+ + diff --git a/extensions/KHR/SPV_KHR_variable_pointers.html b/extensions/KHR/SPV_KHR_variable_pointers.html index e45bd19..a695eb5 100644 --- a/extensions/KHR/SPV_KHR_variable_pointers.html +++ b/extensions/KHR/SPV_KHR_variable_pointers.html @@ -1,676 +1,12 @@ - - - - - - - -SPV_KHR_variable_pointers - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_variable_pointers

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Neil Henning, Codeplay

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Daniel Koch, Nvidia

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Weifeng Zhang, Qualcomm

    -
  • -
  • -

    Stephen Clarke, Imagination Technologies

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2017 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Approved by the SPIR-V Working group: 2017-02-08

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2017-03-31

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-07-05

Revision

13

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 1.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension requires SPV_KHR_storage_buffer_storage_class.

-
-
-
-
-

Overview

-
-
-

This extension adds new pointer capabilities to the Logical addressing model -that keep pointers as an abstract type while allowing a variable pointer that has the -following additional features:

-
-
-
    -
  • -

    A pointer is allowed to not know statically what object (or what part of a composite object) -it points to, by being selected from multiple pointers. -E.g., it can come from an OpSelect instruction, which selects between two OpAccessChain -instructions.

    -
  • -
  • -

    Depending on the capability selected, a variable pointer might be restricted to select only -from within a single StorageBuffer object.

    -
  • -
  • -

    As with the abstract Boolean type, a pointer can be stored to non-externally visible shader -Storage Classes, but is limited to Private and Function.

    -
  • -
  • -

    Allow use of OpConstantNull as a variable pointer.

    -
  • -
-
-
-

Variable pointers still have no exposed physical bit pattern or size.

-
-
-

This extension does not add any "generic" pointer ability, or modify existing aliasing rules.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_variable_pointers"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-

Terms

-
-

Add a new term to section 2.2.2 Types:

-
-
-

Variable pointer: A pointer that results from one of the following instructions:

-
-
-
    -
  • -

    OpSelect

    -
  • -
  • -

    OpPhi

    -
  • -
  • -

    OpFunctionCall

    -
  • -
  • -

    OpPtrAccessChain

    -
  • -
  • -

    OpCopyObject

    -
  • -
  • -

    OpLoad

    -
  • -
  • -

    OpConstantNull

    -
  • -
-
-
-
-

Types

-
-

In section 2.8 "Types and Variables", in type matching, add this after discussing aggregate matching rules -under decoration:

-
-
-

Pointer types are also allowed to have multiple <id>s for the same opcode and operands, -to allow for differing ArrayStride Array Stride decoration values.

-
-
-

Also, in that paragraph, "non-aggregate types" will then generally be "non-aggregate non-pointer types."

-
-
-
-

Validation Rules

-
-

Modify the Logical Address Model list of rules, by changing:

-
-
-
    -
  • -

    If the Logical addressing model is selected:

    -
  • -
-
-
-

To:

-
-
-
    -
  • -

    If the Logical addressing model is selected and the VariablePointers -capability is not declared:

    -
  • -
-
-
-

Keep the subsequent list the same (that is, there is no change here).

-
-
-

Add another set of rules, after the above:

-
-
-
    -
  • -

    If the Logical addressing model is selected and the VariablePointers or -VariablePointersStorageBuffer capability is declared (in addition to -what is allowed above by the Logical addressing model):

    -
    -
      -
    • -

      OpVariable can allocate an object whose type is a pointer type, if -the Storage Class of the OpVariable is one of the following:

      -
      -
        -
      • -

        Function

        -
      • -
      • -

        Private

        -
      • -
      -
      -
    • -
    • -

      A pointer can be the Object operand of OpStore or result of OpLoad, if the storage class -the pointer is stored to or loaded from is one of the following:

      -
      -
        -
      • -

        Function

        -
      • -
      • -

        Private

        -
      • -
      -
      -
    • -
    • -

      A pointer type can be the:

      -
      -
        -
      • -

        Result Type of OpFunction

        -
      • -
      • -

        Result Type of OpFunctionCall

        -
      • -
      • -

        Return Type of OpTypeFunction

        -
      • -
      -
      -
    • -
    • -

      A pointer can be a variable pointer or an operand to OpPtrAccessChain.

      -
    • -
    • -

      If the VariablePointers capability is declared, -a variable pointer can be the Pointer operand of OpStore or OpLoad, -or result of OpConstantNull, if it points to one of the following storage classes:

      -
      -
        -
      • -

        StorageBuffer

        -
      • -
      • -

        Workgroup

        -
      • -
      -
      -
    • -
    • -

      If the VariablePointers capability is not declared, -a variable pointer can be the Pointer operand of OpStore or OpLoad -only if:

      -
      -
        -
      • -

        it points into the StorageBuffer storage class

        -
      • -
      • -

        it is selected from pointers pointing into the same structure, or is OpConstantNull

        -
      • -
      -
      -
    • -
    -
    -
  • -
  • -

    A variable pointer with the Logical addressing model cannot

    -
    -
      -
    • -

      be an operand to an OpArrayLength instruction

      -
    • -
    • -

      point to an object that is or contains any OpTypeMatrix types

      -
    • -
    -
    -
  • -
-
-
-

Add under the rules for "Composite objects in the UniformConstant, Uniform, and PushConstant …​":

-
-
-
    -
  • -

    Each OpPtrAccessChain must have a Base whose type is decorated with ArrayStride.

    -
  • -
  • -

    When an array-element pointer is derived from an array (e.g., using OpAccessChain), -and the resulting element-pointer type was decorated with ArrayStride, -its Array Stride must match the Array Stride of the originating array’s type.

    -
  • -
-
-
-
-

Memory Model

-
-

Add a new section:

-
-
-

2.18.3 Null pointers

-
-
-

A "null" pointer can be formed from an OpConstantNull instruction with a pointer result type. -The resulting pointer value is abstract, and will not equal the pointer value formed from any -declared object or access chain into a declared object. Behavior is undefined when loading or storing -through an OpConstantNull value.

-
-
-
-

Decorations

-
-

In section 3.20 "Decoration", update the description of what ArrayStride applies to:

-
-
-

Apply to an array type to specify the stride, in bytes, of the array’s elements. -Can also apply to a pointer type to an array element, to specify the stride of the array that the element resides in. -Must not be applied to any other type.

-
-
-
-

Capabilities

-
-

Modify Section 3.31, Capability, adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Capability | Depends On

4441

VariablePointersStorageBuffer
-Allow variable pointers, each confined to a single Block-decorated struct in the StorageBuffer storage class.

Shader

4442

VariablePointers
-Allow variable pointers

VariablePointersStorageBuffer

-
-
-
-
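The "Depends On" column above implies that declaring VariablePointers also brings in VariablePointersStorageBuffer and, through it, Shader. A minimal sketch of that closure follows; the dictionary and function names are hypothetical, and the dependencies of Shader itself are deliberately left out because they are not part of this extension's table.

```python
# Illustrative only: DEPENDS_ON and implied_capabilities are hypothetical
# names. The edges are taken from the capability table above.
DEPENDS_ON = {
    "VariablePointers": ["VariablePointersStorageBuffer"],  # 4442
    "VariablePointersStorageBuffer": ["Shader"],            # 4441
    "Shader": [],  # Shader's own dependencies are outside this extension
}


def implied_capabilities(declared):
    """Return the declared capabilities plus everything they depend on."""
    seen = set()
    stack = list(declared)
    while stack:
        cap = stack.pop()
        if cap in seen:
            continue
        seen.add(cap)
        # Unknown capabilities are treated as having no dependencies.
        stack.extend(DEPENDS_ON.get(cap, []))
    return seen
```

For example, `implied_capabilities(["VariablePointers"])` would contain all three names from the table.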

Modify section 3.32.8. "Memory Instructions"

-
-
-

Add these capabilities to the OpPtrAccessChain instruction:

-
-
-
    -
  • -

    VariablePointers

    -
  • -
  • -

    VariablePointersStorageBuffer

    -
  • -
-
-
-
-

Instructions

-
-

Modify the OpPtrAccessChain instruction. Add to the paragraph explaining that Element does an array dereference:

-
-
-

"When the type of Base is decorated with ArrayStride, -this array is dereferenced as an array whose stride is the Base-type’s Array Stride."

-
-
-

Modify the OpSelect instruction description by changing this existing text from:

-
-
-

"Select between two objects.

-
-
-

"Result Type must be a scalar or vector."

-
-
-

To:

-
-
-

"Select components from two objects.

-
-
-

"Result Type must be a pointer, scalar, or vector."

-
-
-
-
-
-

Issues

-
-
-

1) Do we need a NULL value?

-
-
-

Discussion:

-
-
-

Pro: It can be symmetric with OpTypeBool having OpConstantTrue and OpConstantFalse.

-
-
-

Con: Can be worked around.

-
-
-

Resolution: Allow use of OpConstantNull for this.

-
-
-

2) Can pointer selection be across buffers? E.g.:

-
-
-
-
  P = c ? P1 : P2; // P1 and P2 must be in the same buffer block
-
-
-
-

Discussion: It may be hardware dependent whether this is easy to implement or not.

-
-
-

Resolution: This is selected by the difference between the VariablePointers and -VariablePointersStorageBuffer capabilities.

-
-
-

3) Can pointers be to the Private storage class?

-
-
-

Discussion:

-
-
-

Con: What’s the real use case? (Alloca becomes OpVariable in Private space.) -Can subset the language: fail the high-level compile if it can’t fit in existing Vulkan rules. -(If you aggressively inline, then into-SSA fixes up most "real" code examples.)

-
-
-

Pro: The SPIR-V can more closely match the original shader intent. -If the shader had two functions, the SPIR-V can have two functions to match. -This is particularly useful when we look at things like debugger support -(something that is later in the pipe for sure, but I’d like to if possible leave the door open!).

-
-
-

Resolution: Don’t have variable pointers into the Private storage class.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-10-31

JohnK

Initial revision

2

2016-11-15

JohnK

Add VariablePointers to address model table

3

2016-11-21

JohnK

Address feedback: use a capability instead of an address model and make load/store levels of indirection more clear

4

2016-12-13

JohnK

Split into two capabilities, eliminating the need for a Vulkan extension to define the difference

5

2017-01-16

JohnK

Address editorial feedback in the overview

6

2017-01-17

JohnK

Add NULL pointer

7

2017-01-18

JohnK

Remove Private, CrossWorkGroup, and UBO storage classes

8

2017-02-07

JohnK

Don’t allow OpTypeMatrix for variable pointers

9

2017-02-08

JohnK

Disallow OpArrayLength, list OpPtrAccessChain capabilities, and make additional allowances all in the positive

10

2017-02-09

DavidN

Assign token values

11

2017-03-23

Alexander Galazin

Added interactions with SPV_KHR_storage_buffer_storage_class

12

2017-05-11

JohnK

Be explicit that OpSelect supports pointers, and record ratification date.

13

2017-07-05

JohnK

Add generator requirement to decorate OpPtrAccessChain base-pointer type with ArrayStride, - optional for driver consumption.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_variable_pointers.html + + +

extensions/KHR/SPV_KHR_variable_pointers.html

+ + diff --git a/extensions/KHR/SPV_KHR_vulkan_memory_model.html b/extensions/KHR/SPV_KHR_vulkan_memory_model.html index 1784c36..b7e2228 100644 --- a/extensions/KHR/SPV_KHR_vulkan_memory_model.html +++ b/extensions/KHR/SPV_KHR_vulkan_memory_model.html @@ -1,662 +1,12 @@ - - - - - - - -SPV_KHR_vulkan_memory_model - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_vulkan_memory_model

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2018-07-13

    -
  • -
  • -

    Approved by the Khronos Board of Promoters: 2018-08-24

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-06-13

Revision

4

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality related to the Vulkan memory model. The -definitions of the new semantics are left intentionally -brief, with more thorough specifications left to the Vulkan SPIR-V environment -and Vulkan Memory Model appendix.

-
-
-

New Memory Semantics Bits:

-
-
-

MakeAvailableKHR and MakeVisibleKHR make memory barriers and atomics -perform availability and visibility operations for an entire storage class, -as defined in the memory model.

-
-
-

OutputMemoryKHR is a new memory semantics bit that indicates the operation -synchronizes accesses to the output storage class (for tessellation control -shaders).

-
-
-

New Memory Access Bits:

-
-
-

MakePointerAvailableKHR and MakePointerVisibleKHR make memory access -instructions perform availability and visibility operations on the locations -pointed to by the pointer operand, as defined in the memory model. -NonPrivatePointerKHR makes memory access instructions obey inter-thread -ordering, as defined in the memory model.

-
-
-

New Image Operands Bits:

-
-
-

MakeTexelAvailableKHR and MakeTexelVisibleKHR make image access -instructions perform availability and visibility operations on the texel’s -memory locations, as defined in the memory model. NonPrivateTexelKHR makes -image access instructions obey inter-thread ordering, as defined in the -memory model.

-
-
-

New Scope:

-
-
-

QueueFamilyKHR is a scope that includes all invocations from queues in the -same queue family. The existing Device scope is optional in Vulkan, and use -of it with the new memory model requires a new capability -VulkanMemoryModelDeviceScopeKHR.

-
-
-

The Coherent decoration is deprecated and replaced (and extended) by -MakePointerAvailableKHR or MakePointerVisibleKHR and -MakeTexelAvailableKHR or MakeTexelVisibleKHR. Similarly, the Volatile -decoration is deprecated and replaced by the Volatile Memory Access bit for -pointers, the VolatileTexelKHR Image Operands bit for image accesses, -and the Volatile Memory Semantics bit for atomics.

-
-
-

VulkanKHR is a new Memory Model enum which indicates that a module opts into -the Vulkan Memory Model.

-
-
-

VulkanMemoryModelKHR is a capability that indicates a module uses the new -memory model. -VulkanMemoryModelDeviceScopeKHR is a capability that indicates a module -uses Device scope with the Vulkan Memory Model.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_vulkan_memory_model"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the following new capabilities:

-
-
-
-
VulkanMemoryModelKHR
-VulkanMemoryModelDeviceScopeKHR
-
-
-
-
-
-

New Decorations

-
-
-

None

-
-
-
-
-

New Builtins

-
-
-

None

-
-
-
-
-

New Instructions

-
-
-

None

-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

(Modify section 2.18.2, Aliasing):

-
-
-

Add VulkanKHR to the list of models that assume aliasing is generally not -present:

-
-
-

How aliasing is managed depends on the Memory Model:

-
-
-
    The simple, GLSL450, and VulkanKHR memory models can assume that
-    aliasing is generally not present.  …​
-
-
-

(Add to the table in 3.5, Memory Model):

-
- ----- - - - - - - - -

3

VulkanKHR
-Vulkan Memory Model, as specified by the client API. -This OpMemoryModel memory model must be used if and only if the -VulkanMemoryModelKHR capability is declared.

VulkanMemoryModelKHR

-
-

(Modify the table in 3.20, Decoration):

-
-
-

Add to the description of Coherent

-
-
-

Coherent is not allowed when the declared memory model is VulkanKHR. -The Memory Access bits MakePointerAvailableKHR and MakePointerVisibleKHR or the -Image Operands bits MakeTexelAvailableKHR and MakeTexelVisibleKHR can be -used instead.

-
-
-

Add to the description of Volatile

-
-
-

Volatile is not allowed when the declared memory model is VulkanKHR. -The Memory Access bit Volatile, the Image Operands bit VolatileTexelKHR, -or the Memory Semantics bit Volatile can be used instead.

-
-
-

(Modify Section 3.14, Image Operands, adding to the end of the table)

-
- ----- - - - - - - - - - - - - - - - - - - - - - - -

0x100

MakeTexelAvailableKHR
-Perform an availability operation on the texel locations after the store. -A following operand is the Scope <id> that controls the scope of the -availability operation. -Requires NonPrivateTexelKHR to also be set. Only valid with OpImageWrite.

VulkanMemoryModelKHR

0x200

MakeTexelVisibleKHR
-Perform a visibility operation on the texel locations before the load. -A following operand is the Scope <id> that controls the scope of the -visibility operation. -Requires NonPrivateTexelKHR to also be set. Only valid with OpImageRead and -OpImageSparseRead.

VulkanMemoryModelKHR

0x400

NonPrivateTexelKHR
-The image access obeys inter-thread ordering, as specified by the client API.

VulkanMemoryModelKHR

0x800

VolatileTexelKHR
-This access cannot be eliminated, duplicated, or combined with other -accesses.

VulkanMemoryModelKHR

-
-

(Modify Section 3.25, Memory Semantics)

-
-
-

Add to the description of SequentiallyConsistent

-
-
-

If the declared memory model is VulkanKHR, SequentiallyConsistent must not be used.

-
-
-

Add new entries to the end of the table:

-
- ----- - - - - - - - - - - - - - - - - - - - - - - -

0x1000

OutputMemoryKHR
-Apply the memory-ordering constraints to Output Storage Class memory.

VulkanMemoryModelKHR

0x2000

MakeAvailableKHR
-Perform an availability operation on all references in the selected storage -classes.

VulkanMemoryModelKHR

0x4000

MakeVisibleKHR
-Perform a visibility operation on all references in the selected storage -classes.

VulkanMemoryModelKHR

0x8000

Volatile
-This access cannot be eliminated, duplicated, or combined with other -accesses.

VulkanMemoryModelKHR

-
-
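The new Memory Semantics bits and the SequentiallyConsistent restriction can be sketched as a small flag type. This is only an illustration: the class and function names are hypothetical, the four KHR-related values come from the table above, and the value 0x10 for SequentiallyConsistent is taken from the core SPIR-V Memory Semantics table rather than from this extension.

```python
from enum import IntFlag


class MemorySemantics(IntFlag):
    # Existing bit from the core specification (not defined by this extension).
    SEQUENTIALLY_CONSISTENT = 0x10
    # Bits added by SPV_KHR_vulkan_memory_model, per the table above.
    OUTPUT_MEMORY_KHR = 0x1000
    MAKE_AVAILABLE_KHR = 0x2000
    MAKE_VISIBLE_KHR = 0x4000
    VOLATILE = 0x8000


def semantics_allowed(memory_model, semantics):
    """Hypothetical check: SequentiallyConsistent must not be used with VulkanKHR."""
    if memory_model == "VulkanKHR" and semantics & MemorySemantics.SEQUENTIALLY_CONSISTENT:
        return False
    return True
```

Only the SequentiallyConsistent rule is modelled here; a real validator would also need the remaining Memory Semantics rules from the client API.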

(Modify Section 3.26, Memory Operands)

-
-
-

Add to the end of the table:

-
- ----- - - - - - - - - - - - - - - - - - -

0x08

MakePointerAvailableKHR
-Perform an availability operation on the locations pointed to by the -pointer operand, after a store. -A following operand is a Scope <id> specifying the scope of -the availability operation. -Requires NonPrivatePointerKHR to also be set. Not valid with OpLoad.

VulkanMemoryModelKHR

0x10

MakePointerVisibleKHR
-Perform a visibility operation on the locations pointed to by the -pointer operand, before a load. -A following operand is a Scope <id> specifying the scope of -the visibility operation. -Requires NonPrivatePointerKHR to also be set. Not valid with OpStore.

VulkanMemoryModelKHR

0x20

NonPrivatePointerKHR
-The memory access obeys inter-thread ordering, as specified by the client API.

VulkanMemoryModelKHR

-
-
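The constraints stated in the Memory Operands rows above (NonPrivatePointerKHR must accompany either Make bit, MakePointerAvailableKHR is not valid with OpLoad, MakePointerVisibleKHR is not valid with OpStore) can be expressed as a short check. The sketch below is an assumption-laden illustration, not a validator implementation: `MemoryOperands` and `check_memory_operands` are hypothetical names, opcodes are passed as plain strings, and the required Scope operand is not modelled.

```python
from enum import IntFlag


class MemoryOperands(IntFlag):
    # Values from the Memory Operands table above.
    MAKE_POINTER_AVAILABLE_KHR = 0x08
    MAKE_POINTER_VISIBLE_KHR = 0x10
    NON_PRIVATE_POINTER_KHR = 0x20


def check_memory_operands(opcode, mask):
    """Return a list of rule violations for one memory access (hypothetical helper)."""
    errors = []
    make_bits = (MemoryOperands.MAKE_POINTER_AVAILABLE_KHR
                 | MemoryOperands.MAKE_POINTER_VISIBLE_KHR)
    # Both Make bits require NonPrivatePointerKHR to also be set.
    if mask & make_bits and not mask & MemoryOperands.NON_PRIVATE_POINTER_KHR:
        errors.append("Make*KHR requires NonPrivatePointerKHR")
    # Availability applies after a store, so it is not valid on a load.
    if opcode == "OpLoad" and mask & MemoryOperands.MAKE_POINTER_AVAILABLE_KHR:
        errors.append("MakePointerAvailableKHR is not valid with OpLoad")
    # Visibility applies before a load, so it is not valid on a store.
    if opcode == "OpStore" and mask & MemoryOperands.MAKE_POINTER_VISIBLE_KHR:
        errors.append("MakePointerVisibleKHR is not valid with OpStore")
    return errors
```

A store with `MAKE_POINTER_AVAILABLE_KHR | NON_PRIVATE_POINTER_KHR` yields no errors, while the same mask on a load yields one.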

(Modify Section 3.27, Scope <id>, adding to the end of the table)

-
- ----- - - - - - - - -

5

QueueFamilyKHR
-Scope is the current queue family.

VulkanMemoryModelKHR

-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - -
Capability | Implicitly Declares | Enabled by Extension

5345

VulkanMemoryModelKHR
-Uses the Vulkan Memory Model. This capability must be declared if and only if -the VulkanKHR OpMemoryModel memory model is used.

None

SPV_KHR_vulkan_memory_model

5346

VulkanMemoryModelDeviceScopeKHR
-Uses Device scope with the Vulkan Memory Model. This capability must be -declared if a scope of Device is used with any instruction and the -VulkanKHR OpMemoryModel memory model is used.

None

SPV_KHR_vulkan_memory_model

-
-
-
-
-
-
-

(Modify Section 3.32.8. Memory Instructions)

-
-
-

In OpCopyMemory and OpCopyMemorySized, if this extension is being used -with SPIR-V 1.4, replace:

-
-
-
    If two masks are present, the first applies to Target and the second
-    applies to Source.
-
-
-

with:

-
-
-
    If two masks are present, the first applies to Target and cannot include
-    MakePointerVisibleKHR, and the second applies to Source and cannot
-    include MakePointerAvailableKHR.
-
-
-

(Modify Section 3.32.20. Barrier Instructions)

-
-
-

Update the description of OpMemoryBarrier. Modify the second paragraph to -say:

-
-
-

Ensures that memory accesses issued before this instruction will be observed -before memory accesses issued after this instruction. This control is ensured -only for memory accesses issued by this invocation and observed by another -invocation executing within Memory scope. If the VulkanKHR memory model is -used, this ordering only applies to memory accesses that use the -NonPrivatePointerKHR or NonPrivateTexelKHR flags.

-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_KHR_vulkan_memory_model"
-
-
-
-

If OpLoad, OpStore, OpCopyMemory, or OpCopyMemorySized use -MakePointerAvailableKHR or MakePointerVisibleKHR, the optional scope -operand must be present.

-
-
-

If OpImageRead, OpImageSparseRead, or OpImageWrite use -MakeTexelAvailableKHR or MakeTexelVisibleKHR, the optional scope operand -must be present.

-
-
-

Memory accesses that use NonPrivatePointerKHR must use pointers in the Uniform, -Workgroup, CrossWorkgroup, Generic, Image, or StorageBuffer storage classes.

-
-
-

If OpMemoryModel memory model is VulkanKHR and any instruction uses Device -scope, VulkanMemoryModelDeviceScopeKHR must be declared.

-
-
-
-
-
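Two of the validation rules above lend themselves to a compact illustration: the storage-class restriction on NonPrivatePointerKHR and the Device-scope capability requirement. The following is a rough sketch under stated assumptions: the function names are hypothetical, and storage classes, scopes, and capabilities are represented as plain strings rather than SPIR-V enumerant values.

```python
# Storage classes permitted for NonPrivatePointerKHR accesses, per the rule above.
NON_PRIVATE_STORAGE_CLASSES = {
    "Uniform", "Workgroup", "CrossWorkgroup", "Generic", "Image", "StorageBuffer",
}


def non_private_pointer_ok(storage_class):
    """Hypothetical check: is NonPrivatePointerKHR allowed on this storage class?"""
    return storage_class in NON_PRIVATE_STORAGE_CLASSES


def device_scope_ok(memory_model, scopes_used, capabilities):
    """Hypothetical check for the Device-scope rule under the VulkanKHR memory model."""
    if memory_model != "VulkanKHR":
        return True  # the rule only applies when VulkanKHR is the memory model
    if "Device" not in scopes_used:
        return True
    return "VulkanMemoryModelDeviceScopeKHR" in capabilities
```

For instance, `device_scope_ok("VulkanKHR", {"Device"}, {"VulkanMemoryModelKHR"})` is false because the device-scope capability is missing.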

Issues

-
-
-

(1) How many capabilities do we need?

-
-
-

RESOLVED: We use a single capability for most of the functionality, even though -it is (arguably) redundant with the new OpMemoryModel enum, because we expect a -lot of tooling to rely on the existence of a capability. There is a second -capability (VulkanMemoryModelDeviceScopeKHR) tied to an optional feature.

-
-
-

(2) Can we deprecate "Coherent" and put Availability/Visibility decorations -on individual memory instructions instead?

-
-
-

RESOLVED: Yes. In many ways it is cleaner and more natural to use -per-instruction coherency. It better matches the definition in the model, -matches many hardware implementations, and is more natural when using -variable pointers. We do the same for the "Volatile" decoration.

-
-
-

(3) Should inter-thread ordering rules be opt-in (NonPrivate{Pointer,Texel}KHR) or opt-out?

-
-
-

RESOLVED: Having accesses default to private and requiring explicit opt-in to -non-private is cleaner in a few ways. It is a default that is valid for all -storage classes, including those like Private that can’t possibly be shared -between invocations. It naturally matches the default we’ll want in GLSL, -where undecorated (non-coherent) variables are usually not used for -communication between invocations, and setting the "coherent" qualifier can -implicitly make accesses non-private. And it makes it more natural to express -some of the validation rules.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2018-04-20

Jeff Bolz

Initial draft

2

2018-09-05

Jeff Bolz, David Neto

Add QueueFamilyKHR, update Memory Access Operands

3

2019-02-19

David Neto

Khronos SPIR-V Issue #413: Interaction with SPIR-V 1.4: Restrictions on memory access bits in two-operand OpCopyMemory and OpCopyMemorySized.

4

2019-06-13

Jeff Bolz

Added Volatile to Memory Semantics

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_vulkan_memory_model.html + + +

extensions/KHR/SPV_KHR_vulkan_memory_model.html

+ + diff --git a/extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html b/extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html index de3f857..ca94108 100644 --- a/extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html +++ b/extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html @@ -1,538 +1,12 @@ - - - - - - - -SPV_KHR_workgroup_memory_explicit_layout - - - - - -
-
-

Name Strings

-
-
-

SPV_KHR_workgroup_memory_explicit_layout

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Caio Marcelo de Oliveira Filho, Intel

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Graeme Leese, Broadcom

    -
  • -
  • -

    Faith Ekstrand, Intel

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2020-07-15

    -
  • -
  • -

    Ratified by Khronos on 2019-09-11

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-06-29

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5, Revision 2.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-
-
-

Overview

-
-
-

This extension provides a way for the shader author to define the -layout of Workgroup storage class memory.

-
-
-

Workgroup variables can be declared in blocks, and then use the same -explicit layout decorations (e.g. Offset, ArrayStride) as other -storage classes.

-
-
-

All the Workgroup blocks share the same underlying storage, so it is -possible to get different views of the workgroup storage. This allows -more direct and efficient manipulation of that storage by the shader -author.

-
-
-

Either all or none of the variables must be explicitly laid out.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_KHR_workgroup_memory_explicit_layout"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5, Revision 2

-
-
-

Validation Rules

-
-

In Section 2.16.1 ("Universal Validation Rules"), modify the list in -the item

-
-
-
-
-

The capabilities StorageBuffer16BitAccess, -UniformAndStorageBuffer16BitAccess, StoragePushConstant16, and -StorageInputOutput16 do not generally add 16-bit operations. Rather, -they add only the following specific abilities:

-
-
-
-
-

to also include the WorkgroupMemoryExplicitLayout16BitAccessKHR -capability. Similarly, modify the list in the item

-
-
-
-
-

The capabilities StorageBuffer8BitAccess, -UniformAndStorageBuffer8BitAccess, and StoragePushConstant8, do not -generally add 8-bit operations. Rather, they add only the following -specific abilities:

-
-
-
-
-

to also include the WorkgroupMemoryExplicitLayout8BitAccessKHR -capability.

-
-
-

In Section 2.16.2 ("Validation Rules for Shader Capabilities"), modify -the item

-
-
-
    -
  • -

    Composite objects in the StorageBuffer, PhysicalStorageBuffer, -Uniform, and PushConstant Storage Classes must be explicitly laid -out. The following apply to all the aggregate and matrix types -describing such an object, recursively through their nested types:

    -
  • -
-
-
-

to be

-
-
-
    -
  • -

    Composite objects in the StorageBuffer, PhysicalStorageBuffer, -Uniform, and PushConstant Storage Classes must be explicitly laid -out. If WorkgroupMemoryExplicitLayoutKHR capability is declared, -composite objects in the Workgroup Storage Class with types decorated -with Block also must be explicitly laid out. The following -apply to all the aggregate and matrix types describing such an object, -recursively through their nested types:

    -
  • -
-
-
-

Append the following to the same section

-
-
-
    -
  • -

    If WorkgroupMemoryExplicitLayoutKHR capability is declared, -for each entry point in the module

    -
    -
      -
    • -

      Either all or none of the Workgroup Storage Class variables in -the entry point interface must point to struct types decorated -with Block.

      -
    • -
    • -

      If more than one Workgroup Storage Class variable in the entry -point interface points to a type decorated with Block, all of -them must be decorated with Aliased.

      -
    • -
    -
    -
  • -
-
-
-
-
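The two per-entry-point rules appended above (all-or-none Block decoration, and Aliased on every Block variable when there is more than one) could be checked roughly as follows. This is a sketch only: `WorkgroupVar` and `check_workgroup_interface` are hypothetical names, and each variable is reduced to two booleans instead of real decoration data.

```python
from dataclasses import dataclass


@dataclass
class WorkgroupVar:
    """Hypothetical summary of one Workgroup variable in an entry point interface."""
    name: str
    is_block: bool    # pointee is a struct type decorated with Block
    is_aliased: bool  # variable is decorated with Aliased


def check_workgroup_interface(variables):
    """Return rule violations for one entry point's Workgroup variables."""
    errors = []
    blocks = [v for v in variables if v.is_block]
    # Rule 1: either all or none of the variables point to Block structs.
    if blocks and len(blocks) != len(variables):
        errors.append("mix of Block and non-Block Workgroup variables")
    # Rule 2: with more than one Block variable, all of them must be Aliased.
    if len(blocks) > 1 and not all(v.is_aliased for v in blocks):
        errors.append("multiple Block Workgroup variables must all be Aliased")
    return errors
```

A single Block variable without Aliased passes both rules, since the Aliased requirement applies only when more than one Block variable is present.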

Memory Model

-
-

In Section 2.18.2 ("Aliasing"), modify

-
-
-
-
-

The Aliased decoration can be used to express that certain memory -object declarations may alias. Referencing the following table, a -memory object declaration P may alias another declared pointer -Q if within a single row:

-
-
-
    -
  • -

    P is an instruction with opcode and storage class from the first -pair of columns, and

    -
  • -
  • -

    Q is an instruction with opcode and storage class from the second -pair of columns.

    -
  • -
-
-
-
-
-

to be

-
-
-
-
-

The Aliased decoration can be used to express that certain memory -object declarations may alias. Referencing the following table, a -memory object declaration P may alias another declared pointer -Q if within a single row:

-
-
-
    -
  • -

    P is an instruction with opcode and storage class from the first -pair of columns,

    -
  • -
  • -

    Q is an instruction with opcode and storage class from the second -pair of columns, and

    -
  • -
  • -

    If present, one of the enabling capabilities in the last column is -declared by the module.

    -
  • -
-
-
-
-
-

Add an extra column Enabling Capabilities to the table

-
-
-
- ------- - - - - - - - - - -

First Storage Class

First Instruction(s)

Second Instructions

Second Storage Classes

Enabling Capabilities

-
-
-
-

and append the row

-
-
-
- ------- - - - - - - - - - -

Workgroup

OpVariable

OpVariable

Workgroup

WorkgroupMemoryExplicitLayoutKHR

-
-
-
-

Modify the paragraph right after the table from

-
-
-
-
-

In addition to the above table, memory object declarations in the -CrossWorkgroup, Function, Input, Output, Private, -or Workgroup storage classes must also have matching pointee types -for aliasing to be present. In all other cases the decoration is ignored.

-
-
-
-
-

to be

-
-
-
-
-

In addition to the above table, memory object declarations in the -CrossWorkgroup, Function, Input, Output or Private storage -classes must also have matching pointee types for aliasing to be -present. The restriction also applies for Workgroup storage class, -except when WorkgroupMemoryExplicitLayoutKHR capability is declared -and the pointee types are structs decorated with Block. In all other -cases the decoration is ignored.

-
-
-
-
-
-

Capabilities

-
-

In Section 3.31 ("Capability"), add

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Capability | Implicitly Declares

4428

WorkgroupMemoryExplicitLayoutKHR
-Allows Workgroup storage class variables to be explicitly laid out in blocks.

Shader

4429

WorkgroupMemoryExplicitLayout8BitAccessKHR
-Uses 8-bit OpTypeInt instructions for creating scalar, vector, and composite types that become members of a block residing in the Workgroup storage class.

WorkgroupMemoryExplicitLayoutKHR

4430

WorkgroupMemoryExplicitLayout16BitAccessKHR
-Uses 16-bit OpTypeFloat and OpTypeInt instructions for creating scalar, vector, and composite types that become members of a block residing in the Workgroup storage class.

WorkgroupMemoryExplicitLayoutKHR

-
-
-
-
-

Instructions

-
-

In Section 3.32 ("Instructions"), modify the last sentence of the -definition of OpTypeBool from

-
-
-
-
-

If they are stored (in conjunction with OpVariable), they can only -be used with logical addressing operations, not physical, and only -with non-externally visible shader Storage Classes: Workgroup, -CrossWorkgroup, Private, Function, Input, and Output.

-
-
-
-
-

to be

-
-
-
-
-

If they are stored (in conjunction with OpVariable), they can only -be used with logical addressing operations, not physical, and only -with variables that are not required to be explicitly laid out.

-
-
-
-
-

Also in Section 3.32 ("Instructions"), modify the definition of -OpPtrAccessChain to include the following

-
-
-
-
-

When WorkgroupMemoryExplicitLayoutKHR capability is declared, for -objects in Workgroup storage class that are explicitly laid out -the element’s address or location is also calculated using a stride.

-
-
-
-
-
-
-
-

Issues

-
-
-

None yet.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2020-06-29

Caio Marcelo de Oliveira Filho

Initial KHR extension.

-
-
-
- - \ No newline at end of file + + + + + + extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html + + +

extensions/KHR/SPV_KHR_workgroup_memory_explicit_layout.html

+ + diff --git a/extensions/NV/SPV_NVX_multiview_per_view_attributes.html b/extensions/NV/SPV_NVX_multiview_per_view_attributes.html index 0c07f9a..f472552 100644 --- a/extensions/NV/SPV_NVX_multiview_per_view_attributes.html +++ b/extensions/NV/SPV_NVX_multiview_per_view_attributes.html @@ -1,378 +1,12 @@ - - - - - - - -SPV_NVX_multiview_per_view_attributes - - - - - -
-
-

Name Strings

-
-
-

SPV_NVX_multiview_per_view_attributes

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-02-20

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new capability to support the Vulkan -VK_NVX_multiview_per_view_attributes extension in SPIR-V.

-
-
-

The new PerViewAttributesNV capability adds two builtin variables, -PositionPerViewNV and ViewportMaskPerViewNV, which can be -exported from Vertex, Tessellation, or Geometry shaders. -PositionPerViewNV can be imported to Tessellation or Geometry shaders.

-
-
-

The PositionPerViewNV builtin decoration corresponds to the -gl_PositionPerViewNV[] array in GLSL and is used to specify -per-view positions.

-
-
-

The ViewportMaskPerViewNV builtin decoration corresponds to the -gl_ViewportMaskPerViewNV[] array in GLSL and is used to specify -the per-view viewport masks.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NVX_multiview_per_view_attributes"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
PerViewAttributesNV
-
-
-
-
-
-

New Decorations

-
-
-

None.

-
-
-
-
-

New Builtins

-
-
-

Two new builtins are added as outputs for the Vertex, Tessellation -and Geometry Execution Models under the PerViewAttributesNV capability:

-
-
-
-
PositionPerViewNV
-ViewportMaskPerViewNV
-
-
-
-

PositionPerViewNV can also be used as an input for the Tessellation and -Geometry Execution Models.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - -

PerViewAttributesNV

5260

PositionPerViewNV

5261

ViewportMaskPerViewNV

5262

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add two new rows to the BuiltIn table)

-
- ----- - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

5261

PositionPerViewNV
-Output vertex position for each view in Vertex, Tessellation, or -Geometry Execution Model, and input position for each view in -Tessellation and Geometry Execution Models. See Vulkan API -specification for more detail.

PerViewAttributesNV

5262

ViewportMaskPerViewNV
-Output viewport mask for each view in Vertex, Tessellation, or Geometry -Execution Model. See Vulkan API specification for more detail.

PerViewAttributesNV

-
-
-
-
(Modify Section 3.31, Capability, add a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
Capability | Depends On | Enabled by Extension

5260

PerViewAttributesNV

MultiView

SPV_NVX_multiview_per_view_attributes

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NVX_multiview_per_view_attributes"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2017-02-01

Jeff Bolz

Initial draft

2

2017-02-20

Jeff Bolz

Mark complete.

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NVX_multiview_per_view_attributes.html + + +

extensions/NV/SPV_NVX_multiview_per_view_attributes.html

+ + diff --git a/extensions/NV/SPV_NV_bindless_texture.html b/extensions/NV/SPV_NV_bindless_texture.html index ede8bd3..dd8ad7f 100644 --- a/extensions/NV/SPV_NV_bindless_texture.html +++ b/extensions/NV/SPV_NV_bindless_texture.html @@ -1,736 +1,12 @@ - - - - - - - -SPV_NV_bindless_texture - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_bindless_texture

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Headers repository: -https://github.com/KhronosGroup/SPIRV-Headers

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Pankaj Mistry, NVIDIA

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Completed

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2021-05-26

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new capabilities to support the GL_NV_bindless_texture extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_bindless_texture"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
BindlessTextureNV
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the BindlessTextureNV capability:

-
-
-
-
OpConvertUToImageNV
-OpConvertUToSamplerNV
-OpConvertUToSampledImageNV
-OpConvertImageToUNV
-OpConvertSamplerToUNV
-OpConvertSampledImageToUNV
-OpSamplerImageAddressingModeNV
-
-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

BindlessTextureNV

5390

OpConvertUToImageNV

5391

OpConvertUToSamplerNV

5392

OpConvertImageToUNV

5393

OpConvertSamplerToUNV

5394

OpConvertUToSampledImageNV

5395

OpConvertSampledImageToUNV

5396

OpSamplerImageAddressingModeNV

5397

BindlessSamplerNV

5398

BindlessImageNV

5399

BoundSamplerNV

5400

BoundImageNV

5401

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
- ----- - - - - - - - - - - - - - -
Capability | Implicitly Declares

5390

BindlessTextureNV

-
-
-
(Modify Section 2.2.2, Types)
-
-
-
-
-

Update concrete type definition to include OpTypeSampledImage, OpTypeImage and OpTypeSampler:

-
-
-

Concrete Type: A numerical scalar, vector, or matrix type, or physical pointer type, -or an image type, or sampler type or sampled image type, or any aggregate containing only these types.

-
-
-
-
-
(Modify Section 2.4, Logical layout of a Module)
-
-
-
-
-

Modify the layout to insert the addressing mode of sampler/image type variables after OpMemoryModel. This becomes the new fifth entry in the SPIR-V layout, and every subsequent entry moves down by one. The text below shows the 4th, 5th, and 6th elements of the updated layout:

-
-
-
    -
  1. -

    The single required OpMemoryModel instruction

    -
  2. -
  3. -

    Sampler/Image type variable addressing mode defined with OpSamplerImageAddressingModeNV

    -
  4. -
  5. -

    All entry point declarations, using OpEntryPoint.

    -
  6. -
-
-
-
-
-
(Modify Section 2.16.1, Universal Validation Rules)
-
-
-
-
-

Update the ninth bullet under "Data rules" to relax the restriction on OpSampledImage instructions so that their results can appear as operands of OpPhi and OpSelect instructions.

-
-
-

Removes the statement: All OpSampledImage instructions must be in the same block in which their Result <id> are consumed. Result <id> -from OpSampledImage instructions must not appear as operands to OpPhi instructions or OpSelect instructions

-
-
-

Rephrases the validation rule as follows:

-
-
-
    -
  • -

    OpSampledImage instructions must not appear as operands to any instructions other than:

    -
    -
      -
    • -

      the image lookup and image query instructions specified to take an operand whose type is OpTypeSampledImage

      -
    • -
    • -

      OpPhi and OpSelect instructions whose operand and result type can be one of OpTypeSampledImage, OpTypeImage or OpTypeSampler. The result and operand types must be the same.

      -
    • -
    -
    -
  • -
-
-
-

Update the tenth bullet under "Data rules" to relax the restriction on image or sampler type data -in a composite, to be allowed as operands to OpPhi instructions or OpSelect instructions

-
-
-

Rephrases the validation rule as follows:

-
-
-

– Instructions for extracting a scalar image or scalar sampler out of a composite must only use dynamically-uniform indexes. Result <id> extracted from such composites, of type OpTypeImage, OpTypeSampler or OpTypeSampledImage, can appear as operands to OpPhi instructions, OpSelect instructions, or other image instructions. Such Result <id> must not appear as operands to any other instructions specified to operate on them.

-
-
-
-
-
(Add New Subsection 3.32.<TBD>, Bindless Texture cast Instructions)
-
-
-
- ------- - - - - - - - - - - - - -

OpConvertUToImageNV
-
-Convert an unsigned integer to image type.

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Operand should be specified either as 64-bit unsigned integer type or -vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Operand should be specified as a 32-bit unsigned integer type.

-

Result Type must be of type OpTypeImage

4

5391

<id> Result Type

<id> Result

<id> Operand

- ------- - - - - - - - - - - - - -

OpConvertUToSamplerNV
-
-Convert an unsigned integer to sampler type.

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Operand should be specified either as 64-bit unsigned integer type or -vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Operand should be specified as a 32-bit unsigned integer type.

-

Result Type is of type OpTypeSampler

4

5392

<id> Result Type

<id> Result

<id> Operand

- ------- - - - - - - - - - - - - -

OpConvertImageToUNV
-
-Convert an image type to unsigned integer.

-

Operand is of type OpTypeImage.

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Result Type should be specified either as 64-bit unsigned integer type or vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Result Type should be specified as 32-bit unsigned integer type.

4

5393

<id> Result Type

<id> Result

<id> Operand

- ------- - - - - - - - - - - - - -

OpConvertSamplerToUNV
-
-Convert a sampler type to unsigned integer.

-

Operand is of type OpTypeSampler

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Result Type should be specified either as 64-bit unsigned integer type or vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Result Type should be specified as 32-bit unsigned integer type.

4

5394

<id> Result Type

<id> Result

<id> Operand

- ------- - - - - - - - - - - - - -

OpConvertUToSampledImageNV
-
-Convert an unsigned integer to sampled image type.

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Operand should be specified either as 64-bit unsigned integer type or -vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Operand should be specified as a 32-bit unsigned integer type.

-

Result Type is of type OpTypeSampledImage

4

5395

<id> Result Type

<id> Result

<id> Operand

- ------- - - - - - - - - - - - - -

OpConvertSampledImageToUNV
-
-Convert a sampled image type to unsigned integer.

-

Operand is of type OpTypeSampledImage

-

If OpSamplerImageAddressingModeNV has a literal value of 64, -Result Type should be specified either as 64-bit unsigned integer type or vector of 2 unsigned 32-bit integer type.

-

If OpSamplerImageAddressingModeNV has a literal value of 32, -Result Type should be specified as 32-bit unsigned integer type.

4

5396

<id> Result Type

<id> Result

<id> Operand

- ----- - - - - - - - - - - -

OpSamplerImageAddressingModeNV
-
-Sets up the addressing mode for variables of type OpTypeSampledImage, OpTypeImage and OpTypeSampler.

-

Bit Width takes either a value of 32 or 64, any other value is invalid.

-

It indicates size of the opaque type variable in memory.

2

5397

<Literal> Bit Width

-
-
-
-
(Modify Subsection 3.32.15, OpSelect Instruction)
-
-
-

As part of this extension, update the OpSelect instruction to accept OpTypeSampledImage, OpTypeImage and OpTypeSampler as operand and result types.

-
-
-
-
- --------- - - - - - - - - - - - - - - -

OpSelect
-
- Select between two objects. Before version 1.4, results are only computed per component.

-

Before version 1.4, Result Type must be a pointer, scalar, or vector. Starting with version 1.4, Result Type can - additionally be a composite type other than a vector. Starting with version 1.5, Result Type can additionally - be of type OpTypeSampledImage, OpTypeImage and OpTypeSampler.

-

The types of Object 1 and Object 2 must be the same as Result Type.

-

Condition must be a scalar or vector of Boolean type.

-

If Condition is a scalar and true, the result is Object 1. If Condition is a scalar and false, the result is Object 2.

-

If Condition is a vector, Result Type must be a vector with the same number of components as Condition and the - result is a mix of Object 1 and Object 2: When a component of Condition is true, the corresponding component in - the result is taken from Object 1, otherwise it is taken from Object 2.

6

169

<id> Result Type

Result <id>

<id> Condition

<id> Object 1

<id> Object 2
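The scalar and per-component selection rules above can be modelled with a small Python sketch. The function `op_select` is hypothetical and models values only; it does not check SPIR-V types, and opaque objects such as sampled images are represented by arbitrary Python values.

```python
def op_select(condition, object_1, object_2):
    """Model OpSelect: whole-object select for a scalar Condition,
    per-component select for a vector Condition."""
    if isinstance(condition, bool):
        # Scalar Condition: pick one object in its entirety.
        return object_1 if condition else object_2
    # Vector Condition: operands must have the same number of components.
    if not (len(condition) == len(object_1) == len(object_2)):
        raise ValueError("component counts must match Condition")
    return [a if c else b for c, a, b in zip(condition, object_1, object_2)]
```

With a scalar condition the objects can be of any kind, which is what allows image, sampler, and sampled image operands under this extension.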

-
-
- -
-
-
-
-
(Add to Decorations: list in Section 3.20)
-
-
-
-
-
DecorationBindlessSamplerNV
-DecorationBindlessImageNV
-DecorationBoundSamplerNV
-DecorationBoundImageNV
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Decoration

Extra Operands

Enabling Capabilities

5398

BindlessSamplerNV
- Applies to a sampler type variable as a layout qualifier, - indicating it is bindless. Behavior is defined - by the runtime environment.

Also see SPV_NV_bindless_texture

5399

BindlessImageNV
- Applies to an image type variable as a layout qualifier, - indicating it is bindless. Behavior is defined by - the runtime environment.

Also see SPV_NV_bindless_texture

5400

BoundSamplerNV
- Applies to a sampler type variable as a layout qualifier, - indicating it is bound. Behavior is defined by - the runtime environment.

Also see SPV_NV_bindless_texture

5401

BoundImageNV
- Applies to an image type variable as a layout qualifier, - indicating it is bound. Behavior is defined by the - runtime environment.

Also see SPV_NV_bindless_texture

-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to SPIR-V for validation layers to check legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_bindless_texture"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How does this extension interact with GL_NV_bindless_texture?

    -
    -
    -
    -

    RESOLVED: This extension defines the SPIR-V instructions and decorations -needed to implement GL_NV_bindless_texture.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2021-05-26

Pankaj Mistry

Initial version

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_bindless_texture.html + + +

extensions/NV/SPV_NV_bindless_texture.html

+ + diff --git a/extensions/NV/SPV_NV_compute_shader_derivatives.html b/extensions/NV/SPV_NV_compute_shader_derivatives.html index 96ad5b0..d5f1132 100644 --- a/extensions/NV/SPV_NV_compute_shader_derivatives.html +++ b/extensions/NV/SPV_NV_compute_shader_derivatives.html @@ -1,457 +1,12 @@ - - - - - - - -SPV_NV_compute_shader_derivatives - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_compute_shader_derivatives

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-09-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a capability to enable derivatives in the GLCompute -Execution Model. There are two new execution modes added which specify -which four compute shader invocations are grouped together.

-
-
-

The new ComputeDerivativeGroupQuadsNV and ComputeDerivativeGroupLinearNV -capabilities enable the use of OpImageQueryLod, the ImplicitLod instructions, -and the Derivative instructions in the GLCompute Execution Model.

-
-
-

This SPIR-V extension provides support for the GLSL -GL_NV_compute_shader_derivatives extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_compute_shader_derivatives"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 2.2.4, Control Flow)
-
-

(Modify the definition of Derivative Group, to include GLCompute)

-
-
-
-

Derivative Group: Defined only for the Fragment and GLCompute Execution Models. -In the Fragment execution model this is the set of invocations collectively -processing a single point, line, or triangle, including any helper invocations. -In the GLCompute execution model this is a single local workgroup.

-
-
-
-
-
(Modify Section 2.19, Derivatives)
-
-

(Replace the first sentence:)

-
-
-
-

Derivatives appear only in the Fragment Execution Model.

-
-
-
-
-

(with the following:)

-
-
-
-
-

Derivatives appear in the Fragment and GLCompute Execution Models.

-
-
-
-
-
(Modify Section 3.6, Execution Mode)
-
-
-
-
-

(add new rows to the Execution Mode table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - -
Execution Mode | Enabling Capabilities | Extra Operands

5289

DerivativeGroupQuadsNV
-Specifies that compute shader derivatives are evaluated over 2x2 -groups of invocations. -See the Vulkan or OpenGL API specifications for more detail. -Only valid with the GLCompute Execution Model.

ComputeDerivativeGroupQuadsNV

5290

DerivativeGroupLinearNV
-Specifies that compute shader derivatives are evaluated over groups -of four invocations with consecutive LocalInvocationIndex values. -See the Vulkan or OpenGL API specifications for more detail. -Only valid with the GLCompute Execution Model.

ComputeDerivativeGroupLinearNV

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - -
Capability | Depends On | Enabled by Extension

5288

ComputeDerivativeGroupQuadsNV
-Uses the DerivativeGroupQuadsNV execution mode.

Shader

SPV_NV_compute_shader_derivatives

5350

ComputeDerivativeGroupLinearNV
-Uses the DerivativeGroupLinearNV execution mode.

Shader

SPV_NV_compute_shader_derivatives

-
-
-
-
(Modify Section 3.32.10, Image Instructions)
-
-

(Modify the description of the following instructions to allow them in the - GLCompute Execution Model in addition to the Fragment Execution Model)

-
-
-
-
    -
  • -

    OpImageSampleImplicitLod

    -
  • -
  • -

    OpImageSampleDrefImplicitLod

    -
  • -
  • -

    OpImageSampleProjImplicitLod

    -
  • -
  • -

    OpImageSampleProjDrefImplicitLod

    -
  • -
  • -

    OpImageQueryLod

    -
  • -
  • -

    OpImageSparseSampleImplicitLod

    -
  • -
  • -

    OpImageSparseSampleDrefImplicitLod

    -
    -
      -
    • -

      This instruction is only valid in the Fragment and GLCompute Execution Models. -In addition, it consumes an implicit derivative that can be affected by code motion.

      -
    • -
    -
    -
  • -
-
-
-
-
-
(Modify Section 3.32.16, Derivative Instructions)
-
-

(Modify the description of the following instructions to allow them in the - GLCompute Execution Model in addition to the Fragment Execution Model)

-
-
-
-
    -
  • -

    OpDPdx

    -
  • -
  • -

    OpDPdy

    -
  • -
  • -

    OpFwidth

    -
  • -
  • -

    OpDPdxFine

    -
  • -
  • -

    OpDPdyFine

    -
  • -
  • -

    OpFwidthFine

    -
  • -
  • -

    OpDPdxCoarse

    -
  • -
  • -

    OpDPdyCoarse

    -
  • -
  • -

    OpFwidthCoarse

    -
    -
      -
    • -

      This instruction is only valid in the Fragment and GLCompute Execution Models.

      -
    • -
    -
    -
  • -
-
-
-

(Modify the existing descriptions of OpDPd{x,y}{Fine,Coarse}, prefacing the - existing language that talks about partial derivatives relative to the window - x or y coordinate with "In the Fragment Execution Model:")

-
-
-

(Add the following to the descriptions of OpDPd{x,y}{Fine,Coarse}, describing - how partial derivatives work in compute shaders)

-
-
-

In the GLCompute Execution Model:
-Result is the partial derivative of P evaluated over groups of four invocations. -Selection of the four invocations is determined by the DerivativeGroup*NV -execution mode that was specified for the entry point. If neither derivative group -mode was specified, the derivatives return zero.
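How invocations might be assigned to derivative groups under the two execution modes can be sketched in Python. The linear rule follows the text above (four invocations with consecutive LocalInvocationIndex values). The 2x2 rule is an assumption made for illustration: the extension defers the exact grouping to the Vulkan or OpenGL API specifications, and both function names are hypothetical.

```python
def linear_group(local_invocation_index):
    """DerivativeGroupLinearNV: groups of four consecutive
    LocalInvocationIndex values."""
    return local_invocation_index // 4


def quad_group(local_id_x, local_id_y):
    """DerivativeGroupQuadsNV: 2x2 groups of invocations.

    Assumption: the 2x2 tiles are aligned on the x and y components of
    the local invocation ID; see the API specification for the real rule.
    """
    return (local_id_x // 2, local_id_y // 2)
```

Invocations that map to the same group value would exchange values when a derivative is evaluated.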

-
-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_compute_shader_derivatives"
-
-
-
-
    -
  • -

    An entry point cannot have both the DerivativeGroupQuadsNV and -DerivativeGroupLinearNV execution modes specified.

    -
  • -
  • -

    The DerivativeGroupQuadsNV and DerivativeGroupLinearNV execution modes -can only be used on entry points with an execution model of GLCompute

    -
  • -
-
-
-
-
-
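The two validation rules above could be checked along the following lines. This is a minimal sketch, not validator code: `check_entry_point` and its string-based arguments are hypothetical stand-ins for whatever representation a real tool uses.

```python
DERIVATIVE_GROUP_MODES = {"DerivativeGroupQuadsNV", "DerivativeGroupLinearNV"}


def check_entry_point(execution_model, execution_modes):
    """Return a list of rule violations for one entry point."""
    errors = []
    used = DERIVATIVE_GROUP_MODES & set(execution_modes)
    if len(used) == 2:
        errors.append("both derivative group execution modes are specified")
    if used and execution_model != "GLCompute":
        errors.append("derivative group modes require the GLCompute model")
    return errors
```

An empty result means neither rule is violated for that entry point.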

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2018-09-12

Daniel Koch

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_compute_shader_derivatives.html + + +

extensions/NV/SPV_NV_compute_shader_derivatives.html

+ + diff --git a/extensions/NV/SPV_NV_cooperative_matrix.html b/extensions/NV/SPV_NV_cooperative_matrix.html index 68111a1..d1db2cb 100644 --- a/extensions/NV/SPV_NV_cooperative_matrix.html +++ b/extensions/NV/SPV_NV_cooperative_matrix.html @@ -1,730 +1,12 @@ - - - - - - - -SPV_NV_cooperative_matrix - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_cooperative_matrix

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Markus Tavenrath, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2018-2019 NVIDIA Corp.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-07-12

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 5, Unified.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-

This extension requires SPV_KHR_vulkan_memory_model.

-
-
-

This extension interacts with SPV_EXT_physical_storage_buffer.

-
-
-
-
-

Overview

-
-
-

This extension adds a new set of types known as "cooperative matrix" types, -where the storage for and computations performed on the matrix are spread -across a set of invocations such as a subgroup. These types give the -implementation freedom in how to optimize matrix multiplies.

-
-
-

This extension introduces the types and instructions, but does not specify -rules about what sizes/combinations are valid. This is left to the -client API specs, and it is expected that different implementations may -support different sizes. To help accommodate this, the dimensions of the -cooperative types can be specialized via specialization constants. Since -the scope parameter is also something that could potentially be specialized, -this extension allows all scope ids to be specialization constants.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_cooperative_matrix"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

2.2 Terms

-
-

Add new terms to section 2.2.2 Types:

-
-
-

Cooperative Matrix: A two-dimensional ordered -collection of scalars, whose storage is spread across multiple shader -invocations.

-
-
-

Add Cooperative Matrix to the definition of Concrete Type.

-
-
-

Add Cooperative Matrix to the definition of Composite. As a composite, -a cooperative matrix acts like a vector with an implementation-dependent -number of components (which can be queried with -OpCooperativeMatrixLengthNV). It can be used as a composite for all -operations that act on composite types, including OpCompositeExtract, -OpCompositeInsert, OpAccessChain, etc. The mapping of components to -invocations and indexes is implementation-dependent.

-
-
-
-

2.16 Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
-

Add OpCooperativeMatrixLoadNV and OpCooperativeMatrixStoreNV to the list -of instructions under "A pointer can only be an operand to the following -instructions:", when the Logical addressing model is selected and the -VariablePointers capability is not declared.

-
-
-

Cooperative matrix types (or types containing them) can only be allocated -in Function or Private storage classes.

-
-
-

Modify section 2.16.2. Shader Validation Rules, to relax the rules for -scopes to allow specialization constants:

-
-
-

All <id> used for Scope and Memory Semantics must be the result of a constant -instruction.

-
-
-
-

3.26 Memory Access

-
-

Modify Section 3.26, "Memory Access":

-
-
-

In the description of MakePointerAvailableKHR added by -SPV_KHR_vulkan_memory_model, change "Not valid with OpLoad" to "Not valid -with OpLoad or OpCooperativeMatrixLoadNV".

-
-
-

In the description of MakePointerVisibleKHR added by -SPV_KHR_vulkan_memory_model, change "Not valid with OpStore" to "Not valid -with OpStore or OpCooperativeMatrixStoreNV".

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Enabling Capabilities

5357

CooperativeMatrixNV
-Enables cooperative matrix types and instructions operating on them.

Shader

-
-
-
-
-

3.32.6 Type-Declaration Instructions

- --------- - - - - - - - - - - - - - - - -

OpTypeCooperativeMatrixNV
-
-Declare a new cooperative matrix type, where all invocations in Scope -cooperate to compute and store the matrix.
-
-Component Type must be a scalar numerical type.
-
-Rows must be a constant instruction with scalar integer type.
-
-Columns must be a constant instruction with scalar integer type.

Capability:
-CooperativeMatrixNV

6

5358

Result <id>

<id>
-Component Type

Scope <id>
-Scope

<id>
-Rows

<id>
-Columns

-
-
-

3.32.7 Constant-Creation Instructions

-
-

Modify OpConstantComposite to make an exception for cooperative matrix types: -"If the Result Type is a cooperative matrix type, then there must be only one -Constituent and it is used to initialize all members."

-
-
-

Modify OpSpecConstantOp to add: -If the CooperativeMatrixNV capability was declared, the following opcode is -also valid: OpCooperativeMatrixLengthNV. Relax the limitation on operands for -this instruction to allow the operand to be a cooperative matrix type.

-
-
-
-

3.32.8 Memory Instructions

- ---------- - - - - - - - - - - - - - - - - -

OpCooperativeMatrixLoadNV
-
-Load a cooperative matrix through a pointer.
-
-Result Type is the type of the loaded object. It must be a cooperative matrix -type.
-
-Pointer is a pointer into an array. Its type must be an OpTypePointer whose -Type operand is a scalar or vector type. The storage class of Pointer must be -Workgroup, StorageBuffer, or (if SPV_EXT_physical_storage_buffer is -supported) PhysicalStorageBufferEXT.
-
-Stride is the number of elements in the array in memory between the first -component of consecutive rows (or columns) in the result. It must be a scalar -integer type.
-
-ColumnMajor indicates whether the values loaded from memory are arranged in -column-major or row-major order. It must be a boolean constant instruction, with -false indicating row major and true indicating column major.
-
-Memory Access must be a Memory Access literal. If not present, it is the -same as specifying None.
-
-If ColumnMajor is false, then elements (row,*) of the result are taken in -order from contiguous locations starting at Pointer[row*Stride]. If -ColumnMajor is true, then elements (*,col) of the result are taken in order -from contiguous locations starting from Pointer[col*Stride]. Any -ArrayStride decoration on Pointer is ignored.
-
-For a given dynamic instance of this instruction, all operands of this -instruction must be the same for all invocations in a given scope instance -(where the scope is the scope the cooperative matrix type was created with). -All invocations in a given scope instance must be active or all must be -inactive.

Capability:
-CooperativeMatrixNV

6+variable

5359

<id>
-Result Type

Result <id>

<id>
-Pointer

<id>
-Stride

<id>
-ColumnMajor

Optional
-Memory Access

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixStoreNV
-
-Store a cooperative matrix through a pointer.
-
-Pointer is a pointer into an array. Its type must be an OpTypePointer whose -Type operand is a scalar or vector type. The storage class of Pointer -must be Workgroup, StorageBuffer, or (if SPV_EXT_physical_storage_buffer -is supported) PhysicalStorageBufferEXT.
-
-Object is the object to store. Its type must be an -OpTypeCooperativeMatrixNV.
-
-Stride is the number of elements in the array in memory between the first -component of consecutive rows (or columns) in the result. It must be a scalar -integer type.
-
-ColumnMajor indicates whether the values stored to memory are arranged in -column-major or row-major order. It must be a boolean constant instruction, with -false indicating row major and true indicating column major.
-
-Memory Access must be a Memory Access literal. If not present, it is the -same as specifying None.
-
-If ColumnMajor is false, then elements (row,*) of Object are stored in -order to contiguous locations starting at Pointer[row*Stride]. If -ColumnMajor is true, then elements (*,col) of Object are stored in order -to contiguous locations starting from Pointer[col*Stride]. Any -ArrayStride decoration on Pointer is ignored.
-
-For a given dynamic instance of this instruction, all operands of this -instruction must be the same for all invocations in a given scope instance -(where the scope is the scope the cooperative matrix type was created with). -All invocations in a given scope instance must be active or all must be -inactive.

Capability:
-CooperativeMatrixNV

5+variable

5360

<id>
-Pointer

<id>
-Object

<id>
-Stride

<id>
-ColumnMajor

Optional
-Memory Access

- ------- - - - - - - - - - - - - - -

OpCooperativeMatrixLengthNV
-
-Number of components of a cooperative matrix type accessible to each -invocation when treated as a composite.
-
-Result Type must be an OpTypeInt with 32-bit Width and 0 Signedness.
-
-Type is a cooperative matrix type.

Capability:
-CooperativeMatrixNV

4

5362

<id>
-Result Type

Result <id>

<id>
-Type

-
-
-
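The addressing rule shared by OpCooperativeMatrixLoadNV and OpCooperativeMatrixStoreNV can be illustrated with a Python sketch of the load side. It is a simplification under stated assumptions: memory is a flat Python list, `base` is a hypothetical index standing in for Pointer, the pointee type equals the matrix component type, and the whole matrix is gathered in one place rather than being spread across invocations.

```python
def coop_matrix_load(memory, base, rows, cols, stride, column_major):
    """Gather a rows x cols matrix from a flat array.

    Row-major: row r is read from memory[base + r*stride] onward.
    Column-major: column c is read from memory[base + c*stride] onward.
    Stride is counted in elements, not bytes.
    """
    result = [[None] * cols for _ in range(rows)]
    if not column_major:
        for r in range(rows):
            start = base + r * stride
            for c in range(cols):
                result[r][c] = memory[start + c]
    else:
        for c in range(cols):
            start = base + c * stride
            for r in range(rows):
                result[r][c] = memory[start + r]
    return result
```

For example, with `memory = list(range(12))` and a stride of 4, a 2x3 row-major load skips the fourth element of each row, while a column-major load of the same region reads each column from a separate stride-sized run.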

3.32.11 Conversion Instructions

-
-

Allow cooperative matrix types for the following conversion instructions (if -the component types are appropriate): OpConvertFToU, OpConvertFToS, -OpConvertSToF, OpConvertUToF, OpUConvert, OpSConvert, OpFConvert. -The result type and value type must have the same scope, number of rows, and -number of columns.

-
-
-
-

3.32.12 Composite Instructions

-
-

Modify OpCompositeConstruct to make an exception for cooperative matrix types: -"If the Result Type is a cooperative matrix type, then there must be only one -Constituent and it is used to initialize all members."

-
-
-
-

3.32.13 Arithmetic Instructions

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixMulAddNV
-
-Linear-algebraic matrix multiply of A by B and then component-wise -add C. The order of the operations is implementation-dependent. The -internal precision of floating-point operations is defined by the client -API. Integer operations are performed at the precision of the Result Type -and are exact unless there is overflow or underflow, in which case the -result is undefined.
-
-Result Type must be a cooperative matrix type with M rows and N columns.
-
-A is a cooperative matrix with M rows and K columns.
-
-B is a cooperative matrix with K rows and N columns.
-
-C is a cooperative matrix with M rows and N columns.
-
-The values of M, N, and K must be consistent across the result and operands. -This is referred to as an MxNxK matrix multiply.
-
-A, B, C, and Result Type must have the same scope, and this defines -the scope of the operation. A, B, C, and Result Type need not -necessarily have the same component type, this is defined by the client API.
-
-If the Component Type of any matrix operand is an integer type, then its -components are treated as signed if its Component Type has Signedness of 1 -and are treated as unsigned otherwise.
-
-For a given dynamic instance of this instruction, all invocations in a given -scope instance must be active or all must be inactive (where the scope is the -scope of the operation).

Capability:
-CooperativeMatrixNV

6

5361

<id>
-Result Type

Result <id>

<id>
-A

<id>
-B

<id>
-C

-
-

Allow cooperative matrix types for the following instructions (if the -component type is appropriate): OpSNegate, OpFNegate, OpIAdd, OpFAdd, -OpISub, OpFSub, OpFDiv, OpSDiv, and OpUDiv. Allow cooperative -matrix types for OpMatrixTimesScalar.
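The MxNxK multiply-add described for OpCooperativeMatrixMulAddNV can be sketched on ordinary nested lists. The function name is hypothetical, and the sketch fixes one evaluation order using exact Python arithmetic, whereas the extension leaves the order of operations and the floating-point precision to the implementation and the client API.

```python
def coop_matrix_mul_add(a, b, c):
    """Return A*B + C for A (M x K), B (K x N) and C (M x N)."""
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("A columns must equal B rows (K)")
    n = len(b[0])
    if len(c) != m or len(c[0]) != n:
        raise ValueError("C must be M x N")
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) + c[i][j] for j in range(n)]
        for i in range(m)
    ]
```

The dimension checks mirror the requirement that M, N, and K be consistent across the result and operands.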

-
-
-
-
-
-

Issues

-
-
-

1) Should this functionality hardwire a scope (subgroup) or be more flexible?

-
-
-

Discussion: Volta hardware used a smaller scope (8 threads), and it is -plausible that other implementations could also want a smaller scope.

-
-
-

Resolution: Allow a specialization constant scope.

-
-
-

2) Should we have capabilities for each MxNxK matrix multiply "size" that is -supported?

-
-
-

Discussion: It’s nice for validation if the shader instructions can be -validated solely based on the OpCapability instructions. But that already -breaks down for spec-constant-defined cooperative matrix types.

-
-
-

Resolution: Just one capability for the overall feature.

-
-
-

3) Should stride be in bytes or elements?

-
-
-

Discussion: Using elements helps avoid the unsupportable (or more difficult -to support) cases.

-
-
-

Resolution: Stride is in elements of the pointee type (which can be different -than the matrix component type).

-
-
-

4) Should we allow matrices to be stored in an opaque layout in shared -memory?

-
-
-

Discussion: Currently matrices can be stored to shared memory as an array of -floats, in row- or column-major order. It might be beneficial to let shaders -spill matrices to shared memory in an opaque, implementation-dependent -layout. There are a few possible ways to handle this: (1) Reuse the -existing OpCooperativeMatrixLoad/Store opcodes with a flag or a trick like -Stride=0, (2) new instructions without the stride parameter, (3) let -the cooperative matrix types be placed in shared memory and use OpLoad/OpStore.

-
-
-

Resolution: Not supported in the current extension.

-
-
-

5) Should the "column major" operand be a literal constant, or a constant -instruction?

-
-
-

Discussion: Constant instructions are more general, and easier for code -generation.

-
-
-

Resolution: Constant instruction.

-
-
-

6) Should we allow OpTranspose on cooperative matrix types?

-
-
-

Discussion: In NVIDIA’s initial implementation, we’ll support a pretty -restricted set of sizes where the transpose of a matrix will sometimes not -be a valid type. So it’s unclear if this is useful.

-
-
-

Resolved: Not supported in this extension.

-
-
-

7) What should the Pointer operand to a cooperative Load/Store be?

-
-
-

Discussion: The spec currently chooses to have the Pointer parameter point at -the first element of the matrix in memory, and this pointer is assumed to be -in the middle of an array. Another option would be to have the Pointer -parameter be a pointer to the whole array, and have an additional "Element" -parameter to the instructions, which indicates where the matrix starts in the -array.

-
-
-

The alternative option’s main benefit is that you don’t end up with a pointer -parameter being used to access something it does not point to. However, it -effectively splits out the last element of the access chain into the -load/store instruction, which is kind of weird. And in the first option, the -pointer to the array is still there implicitly in the access chain.

-
-
-

Resolution: Pointer points to the first element of the array.

-
-
-

8) Should we allow the Pointer type and matrix component type to mismatch?

-
-
-

Resolution: Yes, this makes it easier to efficiently load matrix data into -shared memory, which can be declared to use a larger type (e.g. uvec4). The -Stride parameter is interpreted in units of the pointed-to type, not in -units of the matrix’s component type.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-01-30

Jeff Bolz

Initial revision

2

2019-07-12

Jeff Bolz

Added details for integer operations

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_cooperative_matrix.html + + +

extensions/NV/SPV_NV_cooperative_matrix.html

+ + diff --git a/extensions/NV/SPV_NV_cooperative_matrix2.html b/extensions/NV/SPV_NV_cooperative_matrix2.html index 53e23c3..83cda30 100644 --- a/extensions/NV/SPV_NV_cooperative_matrix2.html +++ b/extensions/NV/SPV_NV_cooperative_matrix2.html @@ -1,1108 +1,12 @@ - - - - - - - -SPV_NV_cooperative_matrix2 - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_cooperative_matrix2

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Karthik Vaidyanathan, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 NVIDIA Corp.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-09-18

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6, Revision 3, Unified.

-
-
-

This extension requires SPIR-V 1.6.

-
-
-

This extension requires SPV_KHR_cooperative_matrix.

-
-
-

If CooperativeMatrixTensorAddressingNV is used, SPV_NV_tensor_addressing is -required.

-
-
-
-
-

Overview

-
-
-

This extension adds several new features building on the cooperative matrix -types added in SPV_KHR_cooperative_matrix. The goal is to add and accelerate -features beyond just simple GEMM kernels, including adding support for type/use -conversions, reductions, per-element operations, and tensor addressing, and -also to improve usability and out-of-the-box performance by adding support -for more flexible matrix sizes, and workgroup scope matrices with -compiler-managed staging through shared memory.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_cooperative_matrix2"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

2.16 Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
    -
  • -

    Add OpCooperativeMatrixLoadTensorNV and OpCooperativeMatrixStoreTensorNV to the list -of instructions under "It is invalid for a pointer to be an operand to any -instruction other than:", when the Logical addressing model is selected and -neither the VariablePointers nor VariablePointersStorageBuffer capability -is declared.

    -
  • -
  • -

    If an OpTypeCooperativeMatrixKHR instruction uses a Scope of Workgroup, -then the workgroup size must have already been specified in the module, -including any constant instructions used by LocalSizeId.

    -
  • -
  • -

    In any function used as a DecodeFunc parameter to OpCooperativeMatrixLoadTensorNV -or as a Func parameter to OpCooperativeMatrixPerElementOpNV or as a CombineFunc -parameter to OpCooperativeMatrixReduceNV, and any function called directly or -indirectly by those functions, tangled instructions are not allowed.

    -
  • -
-
-
-
-
-

3.26 Memory Operands

-
-

Modify Section 3.26, "Memory Operands":

-
-
-

In the description of MakePointerAvailable, change "Not valid with OpLoad" -to "Not valid with OpLoad or OpCooperativeMatrixLoadKHR or OpCooperativeMatrixLoadTensorNV".

-
-
-

In the description of MakePointerVisible, change "Not valid with OpStore" -to "Not valid with OpStore or OpCooperativeMatrixStoreKHR or OpCooperativeMatrixStoreTensorNV".

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CapabilityEnabling Capabilities

5430

CooperativeMatrixReductionsNV
-Enables cooperative matrix reduction instructions.

5431

CooperativeMatrixConversionsNV
-Enables cooperative matrix conversion/transpose instructions.

5432

CooperativeMatrixPerElementOperationsNV
-Enables cooperative matrix per-element operations.

5433

CooperativeMatrixTensorAddressingNV
-Enables cooperative matrix load/store instruction using tensor addressing -(OpCooperativeMatrixLoadTensorNV and OpCooperativeMatrixStoreTensorNV).

5434

CooperativeMatrixBlockLoadsNV
-Enables the DecodeFunc parameter for OpCooperativeMatrixLoadTensorNV.

-
-
-
-
-

3.X Tensor Layout and View

-
-

Tensor layout and tensor view types are representations of the mapping -between matrix coordinates and tensor memory layout. They each have a -number of dimensions in the range [1,5], with dimension 0 being the -outermost dimension and the last dimension being the innermost. These types -have the following logical state:

-
-
-
-
    struct tensorLayoutNV<uint32_t Dim,
-                          TensorClampMode Mode = TensorClampModeUndefined>
-    {
-      static constexpr uint32_t LDim = Dim;
-      static constexpr TensorClampMode clampMode = Mode;
-
-      uint32_t blockSize[LDim];
-      uint32_t layoutDimension[LDim];
-      uint32_t stride[LDim];
-      int32_t offset[LDim];
-      uint32_t span[LDim];
-      uint32_t clampValue;
-    };
-
-    struct tensorViewNV<uint Dim, bool hasDimensions, uint32_t p0, ..., uint32_t p<Dim-1>>
-    {
-      static constexpr uint32_t VDim = Dim;
-      static constexpr bool hasDim = hasDimensions;
-      static constexpr uint32_t permutation[VDim] = {p0, ..., p<Dim-1>};
-
-      uint32_t viewDimension[VDim];
-      uint32_t viewStride[VDim];
-      uint32_t clipRowOffset, clipRowSpan, clipColOffset, clipColSpan;
-    };
-
-
-
-

A tensor layout represents the layout of values in memory (number of -dimensions and size), along with a region being accessed (offset and span).

-
-
-
-
    ---------------------------------------------------------------------------
-    |                           layoutDimension1                              |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                        span1                                            |
-    |                  -----------------                                      |
-    |                  |               |                                      |
-    |                  |               |                                      |
-    |                  |     slice     | span0                                |
-    |                  |               |                      layoutDimension0|
-    |                  |               |                                      |
-    |      offset1     |               |                                      |
-    | ---------------> -----------------                                      |
-    |                                                                         |
-    |                  ^                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  | offset0                                              |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    ---------------------------------------------------------------------------
-    Figure: A 2D tensor layout, and a slice selecting a region within it.
-
-
-
-

A tensor view allows reinterpreting the dimensions of the region being -accessed, including changing the number of dimensions, reordering the -dimensions as they are loaded or stored, and clipping the region of the -matrix that is loaded or stored. Often the span will have the -same number of elements as the matrix, but in some more advanced uses -that may not be the case.

-
-
-

Loads and stores can either use just a tensor layout, or a tensor layout and -tensor view. The addressing starts by treating the matrix itself as a 2D -"view" and mapping the (row,col) coordinate to a 1D index. If there is only a -tensor layout parameter, then that 1D index is mapped to an N-D coordinate -within the slice. If there is both a tensor layout and a tensor view, then -the 1D index is first mapped to a coordinate within the view, the -coordinate components can be permuted, and then is converted back to a 1D -index which is then run through the tensor layout addressing calculation.

-
-
-

The tensor view dimensions and stride can be used to do more complex -addressing calculations. If the tensor view type has "hasDimensions" false, -then the dimensions of the tensor layout span are used instead.

-
-
-

The tensor view "clip" region restricts which elements of the matrix are -loaded or stored, and also affects the shape of the implicit 2D "view".

-
-
-

Unlike some other ML APIs, tensor layouts and views only describe -addressing calculations and never involve making copies of tensors. For -this reason, the functionality is slightly more limited (e.g. there’s no -way to slice, then permute, then slice again).

-
-
-

While these calculations may look expensive in their full generality, -certain calculations can be skipped when they’re not needed, and the -common cases should be quite efficient.

-
-
-

OpTensorLayout and OpTensorView instructions operate by copying -existing object state and updating the requested state and returning -that as a new result. Some of these instructions initialize multiple -related pieces of state, setting some to common default values, so -the order of the operations matters.

-
-
-

For load and store functions with no TensorView parameter, an element index -is computed according to the matrixCoordToTensorElement function for each -(row,col) of the matrix, which has M rows and N columns. This converts the (row,col) into a row-major index, -converts that index into an N-dimensional coord relative to the span, -and uses the span coordinate to compute a location within the tensor.

-
-
-
-
    constexpr uint32_t MAX_DIM = 5;
-    using Coord = array<uint32_t, MAX_DIM>;
-
-    uint32_t matrixCoordToLinear(tensorLayoutNV t, uint32_t row, uint32_t col, uint32_t N)
-    {
-        uint32_t index = row * N + col;
-        return index;
-    }
-
-    Coord linearToSpanCoord(tensorLayoutNV t, uint32_t index)
-    {
-        Coord spanCoord {};
-        for (int32_t dim = t.LDim-1; dim >= 0; --dim) {
-            spanCoord[dim] = index % t.span[dim];
-            index /= t.span[dim];
-        }
-        return spanCoord;
-    }
-
-    auto spanCoordToTensorCoord(tensorLayoutNV t, Coord spanCoord)
-    {
-        Coord blockCoord {};
-        Coord coordInBlock {};
-
-        for (uint32_t dim = 0; dim <= t.LDim-1; ++dim) {
-            int32_t c = spanCoord[dim] + t.offset[dim];
-
-            if (c < 0 || c >= t.layoutDimension[dim]) {
-
-                ClampMode clampMode = t.clampMode;
-                // For stores, other than Undefined, everything is treated as "discard"
-                if (operation is a store && clampMode != Undefined) {
-                    clampMode = Constant;
-                }
-
-                // remainders are computed as defined in OpSMod
-                switch (clampMode) {
-                case Undefined:
-                    undefined behavior;
-                case Constant:
-                    For load, set result value to t.clampValue;
-                    For store, discard the store;
-                    terminate index calculation;
-                case ClampToEdge:
-                    c = min(max(c, 0), t.layoutDimension[dim]-1);
-                    break;
-                case Repeat:
-                    c = c % t.layoutDimension[dim];
-                    break;
-                case MirrorRepeat:
-                    c = c % (2*t.layoutDimension[dim]-2);
-                    c = (c >= t.layoutDimension[dim]) ? (2*t.layoutDimension[dim]-2-c) : c;
-                    break;
-                }
-            }
-
-            coordInBlock[dim] = c % t.blockSize[dim];
-            blockCoord[dim] = c / t.blockSize[dim];
-        }
-
-        return tuple(blockCoord, coordInBlock);
-    }
-
-    uint32_t tensorCoordToLinear(tensorLayoutNV t, Coord blockCoord)
-    {
-        uint32_t index = 0;
-
-        for (uint32_t dim = 0; dim <= t.LDim-1; ++dim) {
-            index += blockCoord[dim] * t.stride[dim];
-        }
-        return index;
-    }
-
-    // map (row,col) -> linear index in span -> span coordinate -> tensor coordinate -> linear index in tensor
-    uint32_t matrixCoordToTensorElement(tensorLayoutNV t, uint32_t row, uint32_t col, uint32_t N)
-    {
-        uint32_t index = matrixCoordToLinear(t, row, col, N);
-
-        Coord spanCoord = linearToSpanCoord(t, index);
-
-        Coord blockCoord;
-        Coord coordInBlock;
-
-        tie(blockCoord, coordInBlock) = spanCoordToTensorCoord(t, spanCoord);
-
-        index = tensorCoordToLinear(t, blockCoord);
-
-        return index;
-    }
-
-
-
-

This index is then multiplied by the size of the component type of the matrix and -treated as a byte offset from the Pointer operand. The matrix element is -loaded from or stored to this location. The Pointer must be a multiple of 16B, -but the region of elements selected by the span need not be so aligned. If the -OpCooperativeMatrixLoadTensorNV instruction has a decode parameter, -then the blockCoord and coordInBlock arrays are passed to it as parameters.
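The layout-only path above can be sketched as a small, compilable C++ function. This is an illustrative sketch rather than the normative algorithm: `Layout2D`, `elementIndex`, and `byteOffset` are hypothetical names, the layout is fixed at two dimensions, and every coordinate is assumed to be in bounds, so no clamp mode is applied.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical two-dimensional stand-in for the tensorLayoutNV logical state.
constexpr uint32_t kLayoutDim = 2;

struct Layout2D {
    std::array<uint32_t, kLayoutDim> blockSize;
    std::array<uint32_t, kLayoutDim> layoutDimension;
    std::array<uint32_t, kLayoutDim> stride;
    std::array<int32_t, kLayoutDim> offset;
    std::array<uint32_t, kLayoutDim> span;
};

// (row, col) -> row-major index -> span coordinate -> tensor coordinate -> element index.
// Assumption: offsets are non-negative and all coordinates are in bounds.
uint32_t elementIndex(const Layout2D& t, uint32_t row, uint32_t col, uint32_t N) {
    uint32_t index = row * N + col;

    // Peel off the span coordinate, innermost dimension first.
    std::array<uint32_t, kLayoutDim> spanCoord{};
    for (int32_t dim = kLayoutDim - 1; dim >= 0; --dim) {
        assert(t.span[dim] != 0);
        spanCoord[dim] = index % t.span[dim];
        index /= t.span[dim];
    }

    // Shift by the slice offset, then accumulate block coordinate times stride.
    uint32_t result = 0;
    for (uint32_t dim = 0; dim < kLayoutDim; ++dim) {
        assert(t.offset[dim] >= 0 && t.blockSize[dim] != 0);
        const uint32_t c = spanCoord[dim] + static_cast<uint32_t>(t.offset[dim]);
        assert(c < t.layoutDimension[dim]);  // in-bounds assumption, no clamping
        result += (c / t.blockSize[dim]) * t.stride[dim];
    }
    return result;
}

// Byte offset from Pointer, given the matrix component size in bytes.
uint64_t byteOffset(const Layout2D& t, uint32_t row, uint32_t col, uint32_t N,
                    uint32_t componentBytes) {
    return static_cast<uint64_t>(elementIndex(t, row, col, N)) * componentBytes;
}
```

For an 8x16 tensor with stride {16,1}, block size {1,1}, offset {2,4} and span {4,4}, element (1,3) of a 4x4 matrix lands at tensor coordinate (3,7), which is element index 55, or byte offset 110 for a 16-bit component type.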

-
-
-

For load and store functions with a TensorView parameter, an element index -is computed according to the matrixCoordToTensorElementWithView function -for each (row,col) of the matrix, which has M rows and N columns. -This computes a row-major index relative to the clip region, converts that to -an N-dimensional coordinate relative to the permuted view dimensions, and -computes a linear index from the view coordinate, then runs through the tensor -layout calculation.

-
-
-
-
    uint32_t matrixCoordToLinear(tensorLayoutNV t, tensorViewNV v, uint32_t row, uint32_t col, uint32_t N)
-    {
-        if (row < v.clipRowOffset ||
-            row >= v.clipRowOffset + v.clipRowSpan ||
-            col < v.clipColOffset ||
-            col >= v.clipColOffset + v.clipColSpan) {
-
-            Load or store is skipped. For load, the matrix element is unmodified.
-            terminate index calculation;
-        }
-        row -= v.clipRowOffset;
-        col -= v.clipColOffset;
-        uint32_t width = min(N, v.clipColSpan);
-        uint32_t index = row * width + col;
-        return index;
-    }
-
-    Coord linearToViewCoord(tensorLayoutNV t, tensorViewNV v, uint32_t index)
-    {
-        auto &dimensions = v.hasDimensions ? v.viewDimension : t.span;
-
-        Coord viewCoord {};
-
-        for (int32_t dim = v.VDim-1; dim >= 0; --dim) {
-            uint32_t i = v.permutation[dim];
-
-            viewCoord[i] = index % dimensions[i];
-            index /= dimensions[i];
-        }
-
-        return viewCoord;
-    }
-
-    uint32_t viewCoordToLinear(tensorLayoutNV t, tensorViewNV v, Coord viewCoord)
-    {
-        Coord stride {};
-        if (v.hasDimensions) {
-            stride = v.viewStride;
-        } else {
-            // set stride to match t.span
-            stride[v.VDim-1] = 1;
-            for (int32_t dim = v.VDim-2; dim >= 0; --dim) {
-                stride[dim] = stride[dim+1] * t.span[dim+1];
-            }
-        }
-
-        uint32_t index = 0;
-        for (int32_t dim = v.VDim-1; dim >= 0; --dim) {
-            index += viewCoord[dim] * stride[dim];
-        }
-
-        return index;
-    }
-
-    // map (row,col) -> linear index in view -> view coordinate -> linear index in span -> span coordinate -> tensor coordinate -> linear index in tensor
-    uint32_t matrixCoordToTensorElementWithView(tensorLayoutNV t, tensorViewNV v, uint32_t row, uint32_t col, uint32_t N)
-    {
-        uint32_t index = matrixCoordToLinear(t, v, row, col, N);
-
-        Coord viewCoord = linearToViewCoord(t, v, index);
-
-        index = viewCoordToLinear(t, v, viewCoord);
-
-        Coord spanCoord = linearToSpanCoord(t, index);
-
-        Coord blockCoord;
-        Coord coordInBlock;
-
-        tie(blockCoord, coordInBlock) = spanCoordToTensorCoord(t, spanCoord);
-
-        index = tensorCoordToLinear(t, blockCoord);
-
-        return index;
-    }
-
-
-
-

The final result is then multiplied by the size of the component type of -the matrix and treated as a byte offset from Pointer. The matrix -element is loaded from or stored to this location.
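To make the effect of the view permutation concrete, here is a minimal C++ sketch of the view step alone (linear index to view coordinate and back to a linear index). `View2D` and `viewIndex` are hypothetical names. The sketch assumes a two-dimensional view with explicit dimensions and strides, and it leaves out the clip region and the later tensor layout calculation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical two-dimensional stand-in for the tensorViewNV logical state,
// restricted to the case where the view carries its own dimensions.
constexpr uint32_t kViewDim = 2;

struct View2D {
    std::array<uint32_t, kViewDim> permutation;
    std::array<uint32_t, kViewDim> viewDimension;
    std::array<uint32_t, kViewDim> viewStride;
};

// Maps a row-major matrix index to a linear index in the span, following the
// linearToViewCoord and viewCoordToLinear steps of the pseudocode above.
uint32_t viewIndex(const View2D& v, uint32_t index) {
    std::array<uint32_t, kViewDim> viewCoord{};
    for (int32_t dim = kViewDim - 1; dim >= 0; --dim) {
        const uint32_t i = v.permutation[dim];
        assert(i < kViewDim && v.viewDimension[i] != 0);
        viewCoord[i] = index % v.viewDimension[i];
        index /= v.viewDimension[i];
    }

    uint32_t linear = 0;
    for (uint32_t dim = 0; dim < kViewDim; ++dim) {
        linear += viewCoord[dim] * v.viewStride[dim];
    }
    return linear;
}
```

With a 3x4 matrix, the identity view {permutation {0,1}, dimensions {3,4}, strides {4,1}} maps index 6 to 6. The view {permutation {1,0}, dimensions {4,3}, strides {3,1}} maps index 6, which is element (1,2), to 7, which is element (2,1) of a 4x3 row-major region. Under these assumptions the permutation therefore behaves like a transpose.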

-
-
-

For OpCooperativeMatrixLoadTensorNV instructions with a DecodeFunc operand, -rather than loading a value, the function operand is invoked for each matrix -element at least once. The function’s return type must match the component -type of the result matrix type. The first parameter must be a pointer type -with storage class PhysicalStorageBuffer, -and the parameter is filled with a pointer computed by multiplying the index -returned by matrixCoordToTensorElement(WithView) by the size of the pointee type. The second and third -parameters must each be an array of 32-bit integers whose dimension matches the -tensor dimension. The second parameter is filled with the blockCoord, and the -third parameter with the coordInBlock, for the matrix element being decoded. -The return value is stored in the corresponding element of the result matrix.

-
-
-

DecodeFunc is not allowed with OpCooperativeMatrixStoreTensorNV. Similarly, -a block size larger than 1 must not be used with OpCooperativeMatrixStoreTensorNV -because it will lead to data races.

-
-
-
-

3.X Cooperative Matrix Reduce Mode

-
-

New section in 3 "Binary Form".

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Cooperative Matrix Reduce ModeEnabling Capabilities

0x1

Row
-Elements within each row of a matrix are reduced.

0x2

Column
-Elements within each column of a matrix are reduced.

0x4

2x2
-Elements within an aligned 2x2 neighborhood are reduced.

-
-
-
-

It is invalid to combine 2x2 with Row or Column. -Row and Column can be used together.
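The combination rule can be expressed as a simple bit test. The sketch below is illustrative only: the constant and function names are hypothetical, and it checks nothing beyond the rule stated above (for example, it does not decide whether an empty mask is acceptable).

```cpp
#include <cassert>
#include <cstdint>

// Bit values taken from the Cooperative Matrix Reduce Mode table.
constexpr uint32_t kReduceRow = 0x1;
constexpr uint32_t kReduceColumn = 0x2;
constexpr uint32_t kReduce2x2 = 0x4;

// Returns true when the mask combines 2x2 with Row or Column, which is invalid.
bool violatesReduceCombinationRule(uint32_t mask) {
    const bool has2x2 = (mask & kReduce2x2) != 0;
    const bool hasRowOrColumn = (mask & (kReduceRow | kReduceColumn)) != 0;
    return has2x2 && hasRowOrColumn;
}

// Small usage example: Row and Column together are fine, 2x2 must stand alone.
void reduceMaskExamples() {
    assert(!violatesReduceCombinationRule(kReduceRow | kReduceColumn));
    assert(!violatesReduceCombinationRule(kReduce2x2));
    assert(violatesReduceCombinationRule(kReduce2x2 | kReduceRow));
}
```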

-
-
-
-

3.X Tensor Addressing Operands

-
-

New section in 3 "Binary Form".

-
-
-

This is a literal mask; it can be formed by combining the bits from multiple -rows in the table below.

-
-
-

Provides additional operands to the listed memory instructions. Bits that are -set indicate whether an additional operand follows, as described by the table. -If there are multiple following operands indicated, they are ordered: Those -indicated by smaller-numbered bits appear first. An instruction needing two -masks must first provide the first mask followed by the first mask’s additional -operands, and then provide the second mask followed by the second mask’s -additional operands.

-
-
-

Used by:

-
-
-
    -
  • -

    OpCooperativeMatrixLoadTensorNV

    -
  • -
  • -

    OpCooperativeMatrixStoreTensorNV

    -
  • -
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Tensor Addressing OperandsEnabling Capabilities

0x0

None

0x1

TensorView
-Addressing calculations use a Tensor View. The <id> of a tensor view is -specified in a subsequent operand.

CooperativeMatrixTensorAddressingNV

0x2

DecodeFunc
-Addressing calculations use a decode function. The <id> of a function is -specified in a subsequent operand.

CooperativeMatrixBlockLoadsNV
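The operand ordering rule (operands indicated by smaller-numbered bits appear first) can be sketched as a decoder over the instruction words that start at the Tensor Addressing Operands literal. `DecodedTensorOperands` and `decodeTensorOperands` are hypothetical names, and treating a truncated word stream as a failure is an assumption of this sketch, not something the text above specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit values taken from the Tensor Addressing Operands table.
constexpr uint32_t kTensorViewBit = 0x1;
constexpr uint32_t kDecodeFuncBit = 0x2;

struct DecodedTensorOperands {
    bool ok = false;  // false if the word stream ended early (sketch assumption)
    bool hasTensorView = false;
    bool hasDecodeFunc = false;
    uint32_t tensorViewId = 0;
    uint32_t decodeFuncId = 0;
    std::size_t wordsConsumed = 0;
};

// words[0] is the mask; the <id> operands follow, lowest set bit first.
DecodedTensorOperands decodeTensorOperands(const std::vector<uint32_t>& words) {
    DecodedTensorOperands out;
    if (words.empty()) return out;

    const uint32_t mask = words[0];
    std::size_t next = 1;

    if (mask & kTensorViewBit) {
        if (next >= words.size()) return out;
        out.hasTensorView = true;
        out.tensorViewId = words[next++];
    }
    if (mask & kDecodeFuncBit) {
        if (next >= words.size()) return out;
        out.hasDecodeFunc = true;
        out.decodeFuncId = words[next++];
    }

    out.wordsConsumed = next;
    out.ok = true;
    return out;
}

// Usage example: mask 0x3 is followed by the tensor view <id>, then the function <id>.
void tensorOperandExamples() {
    const DecodedTensorOperands both = decodeTensorOperands({0x3, 41, 42});
    assert(both.ok && both.tensorViewId == 41 && both.decodeFuncId == 42);
}
```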

-
-
-
-
-

3.49.8 Memory Instructions

- ----------- - - - - - - - - - - - - - - - - - -

OpCooperativeMatrixLoadTensorNV
-
-Load a cooperative matrix through a pointer.
-
-Result Type is the type of the loaded object. It must be a cooperative matrix -type.
-
-Pointer is a pointer from which the matrix will be loaded. If the Shader capability was declared, Pointer -must point into an array and any ArrayStride decoration on Pointer is ignored. -Addressing calculations are performed as described in the Tensor Layout and View -section.
-
-Object is a cooperative matrix object whose values are used for clipped loads. -It must have the same type as Result Type.
-
-TensorLayout is a tensor layout that affects addressing calculations.
-
-Memory Operand must begin with a Memory Operand literal.
-
-Tensor Addressing Operands must begin with a Tensor Addressing Operands -literal. If the operands include DecodeFunc, then Pointer must point to -PhysicalStorageBuffer or StorageBuffer storage class.
-
-All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix. -

Capability:
-CooperativeMatrixTensorAddressingNV

8+variable

5367

<id>
-Result Type

Result <id>

<id>
-Pointer

<id>
-Object

<id>
-TensorLayout

Literal
-Memory Operand
-…​
-optional literals and <ids>

Literal
-Tensor Addressing Operands
-…​
-optional literals and <ids>

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixStoreTensorNV
-
-Store a cooperative matrix through a pointer.
-
-Pointer is a pointer to which the matrix will be stored. If the Shader capability was declared, Pointer -must point into an array and any ArrayStride decoration on Pointer is ignored.
-Addressing calculations are performed as described in the Tensor Layout and View -section.
-
-Object is the object to store. Its type must be an -OpTypeCooperativeMatrixKHR.
-
-TensorLayout is a tensor layout that affects addressing calculations.
-
-Memory Operand must begin with a Memory Operand literal.
-
-Tensor Addressing Operands must begin with a Tensor Addressing Operands literal.
-
-All the operands to this instruction must be dynamically uniform within every -instance of the Scope of the cooperative matrix. -

Capability:
-CooperativeMatrixTensorAddressingNV

6+variable

5368

<id>
-Pointer

<id>
-Object

<id>
-TensorLayout

Literal
-Memory Operand
-…​
-optional literals and <ids>

Literal
-Tensor Addressing Operands
-…​
-optional literals and <ids>

-
-
-

3.49.13. Arithmetic Instructions

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixReduceNV
-
-Computes a matrix where each element of the result matrix is computed from a -row, column, or neighborhood of the source matrix.
-
-Result Type must be an OpTypeCooperativeMatrixKHR type.
-
-The type of Matrix must be an OpTypeCooperativeMatrixKHR with the same -Component Type as Result Type.
-
-The type of Matrix and Result Type must each have Use of MatrixAccumulatorKHR -and must have matching Scope.
-
-If Reduce includes 2x2, the dimensions of ResultType must be half of -the dimensions of Matrix. If Reduce equals Row, then Result Type must -have the same number of rows as Matrix. If Reduce equals Column, then -Result Type must have the same number of columns as Matrix. If Reduce -includes Row and Column, Result Type can have any number of rows and -columns.
-
-CombineFunc must be an OpFunction with two parameters whose types and result -type all match the component type of Matrix. This function is called to combine -pairs of elements (or intermediate results) when computing the reduction. This -function should be mathematically commutative and associative (though in practice, with floating -point numbers, may not be exactly commutative/associative).
-

Capability:
-CooperativeMatrixReductionsNV

5

5366

<id>
-Result Type

Result <id>

<id>
-Matrix

Literal
-Reduce

<id>
-CombineFunc

-
-
-

3.49.11 Conversion Instructions

-
-

Relax the restrictions on Op{F,S,U,etc.}Convert from SPV_KHR_cooperative_matrix -if CooperativeMatrixConversionsNV is enabled to allow Use to mismatch, -where the Use of the operand can be MatrixAccumulatorKHR and the Use -of the result type can be MatrixAKHR or MatrixBKHR. The restriction on -OpBitcast is not relaxed.

-
- ------- - - - - - - - - - - - - - -

OpCooperativeMatrixConvertNV
-
-Converts a cooperative matrix to another cooperative matrix with different -Use.
-
-Result Type must be an OpTypeCooperativeMatrixKHR.
-
-The type of Matrix must be an OpTypeCooperativeMatrixKHR with the same -Component Type, Scope, Rows, and Columns as Result Type. The Use -of Result Type must be MatrixAKHR or MatrixBKHR and the Use of -Matrix must be MatrixAccumulatorKHR. For conversions that change both -Component Type and Use, use Op{F,S,U,etc.}Convert.
-

Capability:
-CooperativeMatrixConversionsNV

3

5293

<id>
-Result Type

Result <id>

<id>
-Matrix

- ------- - - - - - - - - - - - - - -

OpCooperativeMatrixTransposeNV
-
-Converts a cooperative matrix from MatrixAccumulatorKHR to MatrixBKHR -and transposes the matrix.
-
-Result Type must be an OpTypeCooperativeMatrixKHR.
-
-The type of Matrix must be an OpTypeCooperativeMatrixKHR with the same -Scope as Result Type, and with Rows, and Columns swapped relative to -Result Type. The Use of Result Type must be MatrixBKHR and the Use of -Matrix must be MatrixAccumulatorKHR.
-

Capability:
-CooperativeMatrixConversionsNV

3

5390

<id>
-Result Type

Result <id>

<id>
-Matrix

-
-
-

3.49.9 Function Instructions

- --------- - - - - - - - - - - - - - - - -

OpCooperativeMatrixPerElementOpNV
-
-Applies an operation to each element of a cooperative matrix.
-
-The type of Matrix must be an OpTypeCooperativeMatrixKHR.
-
-Result Type must match the type of Matrix.
-
-Func must be an OpFunction whose return type must match the component type -of Matrix, whose first two parameters must be 32-bit integer types, whose -third parameter type must match the component type of Matrix, and which may -have additional parameters. The function is called for each element of the -matrix where the element is passed as the third parameter to the function, -the row and column number of the matrix are passed as the first and second -parameters, and any optional operands are passed in order as the remaining -parameters. Any additional cooperative matrix elements have the corresponding -component passed to the function. The return value of that function is the -corresponding element of Result. The calls are considered unordered against -each other, and calls may occur more than once. -

Capability:
-CooperativeMatrixPerElementOperationsNV

5+variable

5369

<id>
-Result Type

Result <id>

<id>
-Matrix

<id>
-Func

Optional
-<id>, <id>, …​

-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How are matrix type conversions with Use change handled?

    -
    -
    -
    -

    Discussion: RESOLVED. We need to support conversions that change both -Component Type and Use at the same time, because there is often not a -supported intermediate type that matches one but not the other. For example, -if converting from f32 MatrixAccumulatorKHR to u8 MatrixAKHR, there may -not be support for u8 MatrixAccumulatorKHR or f32 MatrixAKHR. Conversions -that change the Component Type should use Op{F,S,U,etc.}Convert even if the -Use changes.

    -
    -
    -

    We also need to support conversions that only change the Use, for example -converting from f16 MatrixAccumulatorKHR to f16 MatrixAKHR. For this, -OpFConvert could be confusing/misleading so we add a new -OpCooperativeMatrixConvertNV instruction for this case.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2024-09-18

Jeff Bolz

Initial revision of SPV_NV_cooperative_matrix2

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_cooperative_matrix2.html + + +

extensions/NV/SPV_NV_cooperative_matrix2.html

+ + diff --git a/extensions/NV/SPV_NV_displacement_micromap.html b/extensions/NV/SPV_NV_displacement_micromap.html index 442282e..52bdf8a 100644 --- a/extensions/NV/SPV_NV_displacement_micromap.html +++ b/extensions/NV/SPV_NV_displacement_micromap.html @@ -1,546 +1,12 @@ - - - - - - - -SPV_NV_displacement_micromap - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_displacement_micromap

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Pyarelal Knowles, NVIDIA

    -
  • -
  • -

    Christoph Kubisch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Provisional

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-08-01

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension interacts with SPV_KHR_ray_tracing, SPV_NV_mesh_shader, -SPV_EXT_mesh_shader.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_NV_displacement_micromap extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_displacement_micromap"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
DisplacementMicromapNV
-RayTracingDisplacementMicromapNV
-
-
-
-
-
-

New Builtins

-
-
-

Builtins added under the RayTracingDisplacementMicromapNV capability

-
-
-
-
HitMicroTriangleVertexPositionsNV
-HitMicroTriangleVertexBarycentricsNV
-HitKindFrontFacingMicroTriangleNV
-HitKindBackFacingMicroTriangleNV
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the DisplacementMicromapNV capability

-
-
-
-
OpFetchMicroTriangleVertexPositionNV
-OpFetchMicroTriangleVertexBarycentricNV
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 2.2.2, Types )
-
-
-

add OpTypeAccelerationStructureNV to list of opaque types

-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5337

HitMicroTriangleVertexPositionsNV
-The vertex positions for a hit with a microtriangle for the ray being traced.

-

Allowed in AnyHitKHR and ClosestHitKHR stages.

RayTracingDisplacementMicromapNV

5344

HitMicroTriangleVertexBarycentricsNV
-The barycentric coordinates for vertices of a microtriangle hit for the ray being traced relative to the base triangle’s vertices.

-

Allowed in AnyHitKHR and ClosestHitKHR stages.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingDisplacementMicromapNV

5405

HitKindFrontFacingMicroTriangleNV
-A link-time constant that HitKindKHR will return when OpTraceRayKHR intersects a front facing microtriangle.

-

Allowed in all stages.

RayTracingDisplacementMicromapNV

5406

HitKindBackFacingMicroTriangleNV
-A link-time constant that HitKindKHR will return when OpTraceRayKHR intersects a back facing microtriangle.

-

Allowed in all stages.

RayTracingDisplacementMicromapNV

-
-
-
-
(Modify Section 3.HK, Hit Kinds, adding rows to the table)
-
-
-
-
-

3.HK, Hit Kinds

-
-
-

Values returned in the variable decorated as HitKindKHR from built-in -intersections with triangle geometry.

-
- ----- - - - - - - - - - - - - - - - - - - -
Hit KindEnabling Capabilities

builtin

HitKindFrontFacingMicroTriangleNV
-The intersection was with front-facing displacement micromap geometry.

RayTracingDisplacementMicromapNV

builtin

HitKindBackFacingMicroTriangleNV
-The intersection was with back-facing displacement micromap geometry.

RayTracingDisplacementMicromapNV

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
CapabilityImplicitly declares

5380

DisplacementMicromapNV
-Uses the OpFetchMicroTriangleVertexPositionNV or OpFetchMicroTriangleVertexBarycentricNV instructions

Shader

5409

RayTracingDisplacementMicromapNV
-Uses any of the HitMicroTriangleVertexPositionsNV, HitMicroTriangleVertexBarycentricsNV, HitKindFrontFacingMicroTriangleNV, or -HitKindBackFacingMicroTriangleNV builtins

Shader

-
-
-
-
(Modify Section 3.32.6, Type-Declaration Instructions, adding a new table)
-
-
-
- ----- - - - - - - - - - - - -

OpTypeAccelerationStructureKHR
-
-Declares an acceleration structure type which is an opaque reference to -acceleration structure handle as defined in the client API -specification.

-

Consumed by OpFetchMicroTriangleVertexPositionNV and -OpFetchMicroTriangleVertexBarycentricNV

-

This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-DisplacementMicromapNV

2

5341

<id> Result

-
-
-
-
-
-
-
-
-

Instructions

-
-
-
-
(Add the new instruction)
-
-
- ----------- - - - - - - - - - - - - - - - - - -

OpFetchMicroTriangleVertexPositionNV
-
- Returns the vertex position of a micro triangle in object space. -

-

Result Type must be a 3-component vector of 32-bit floating-point type.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Instance Id must be an <id> of 32-bit scalar integer type.
-
- Geometry Index must be an <id> of 32-bit scalar integer type.
-
- Primitive Index must be an <id> of 32-bit scalar integer type.
-
- Barycentrics must be an <id> of 2 component vector of integer type.
-
- This instruction is allowed only in RayGenerationKHR, MeshNV and GLCompute execution models.

Capability:
-DisplacementMicromapNV

7

5300

<id> Result Type

Result <id>

<id> Acceleration Structure

<id> Instance Id

<id> Geometry Index

<id> Primitive Index

<id> Barycentrics

- ----------- - - - - - - - - - - - - - - - - - -

OpFetchMicroTriangleVertexBarycentricNV
-
- Returns the barycentric coordinates of a micro triangle vertex relative to the base - triangle vertices. -

-

Result Type must be a 2-component vector of 32-bit floating-point type.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Instance Id must be an <id> of 32-bit scalar integer type.
-
- Geometry Index must be an <id> of 32-bit scalar integer type.
-
- Primitive Index must be an <id> of 32-bit scalar integer type.
-
- Barycentrics must be an <id> of 2 component vector of integer type.
-
- This instruction is allowed only in RayGenerationKHR, MeshNV and GLCompute execution models.

Capability:
-DisplacementMicromapNV

7

5301

<id> Result Type

Result <id>

<id> Acceleration Structure

<id> Instance Id

<id> Geometry Index

<id> Primitive Index

<id> Barycentrics

-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_displacement_micromap"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-08-01

Pyarelal Knowles

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_displacement_micromap.html + + +

extensions/NV/SPV_NV_displacement_micromap.html

+ + diff --git a/extensions/NV/SPV_NV_fragment_shader_barycentric.html b/extensions/NV/SPV_NV_fragment_shader_barycentric.html index 59156af..bbd0c55 100644 --- a/extensions/NV/SPV_NV_fragment_shader_barycentric.html +++ b/extensions/NV/SPV_NV_fragment_shader_barycentric.html @@ -1,330 +1,12 @@ - - - - - - - -SPV_NV_fragment_shader_barycentric - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_fragment_shader_barycentric

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-08-14

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL -GL_NV_fragment_shader_barycentric extension which provides -fragment shaders with access to barycentric weights vectors and -enables fragment inputs to read the raw per-vertex outputs from -the last vertex processing stage.

-
-
-

The extension adds the following functionality under the new -FragmentBarycentricNV capability:

-
-
-
    -
  • -

    adds the PerVertexNV decorations for input and/or output variables

    -
  • -
  • -

    adds BaryCoordNV and BaryCoordNoPerspNV builtins in fragment -shaders

    -
  • -
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_fragment_shader_barycentric"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 3.20, Decoration, add a new row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5285

PerVertexNV
-Must only be used on a memory object declaration or a member of a structure type. -No interpolation. Values are accessed by vertex number in the fragment input. -Only valid for the Input and Output Storage Classes.

FragmentBarycentricNV

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add new rows to the Builtin table)

-
- ----- - - - - - - - - - - - - - - - - - - -
BuiltInEnabling Capabilities

5286

BaryCoordNV
-Input barycentric coordinates in the Fragment Execution Model. -These values are perspective-corrected versions of the barycentric weights. -See the Vulkan API specification for more detail.

FragmentBarycentricNV

5287

BaryCoordNoPerspNV
-Input barycentric coordinates in the Fragment Execution Model. -These values vary linearly in screenspace. -See the Vulkan API specification for more detail.

FragmentBarycentricNV

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5284

FragmentBarycentricNV

Shader

SPV_NV_fragment_shader_barycentric

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_fragment_shader_barycentric"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-08-14

Daniel Koch

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_fragment_shader_barycentric.html + + +

extensions/NV/SPV_NV_fragment_shader_barycentric.html

+ + diff --git a/extensions/NV/SPV_NV_geometry_shader_passthrough.html b/extensions/NV/SPV_NV_geometry_shader_passthrough.html index 4dfaa1d..90a47b2 100644 --- a/extensions/NV/SPV_NV_geometry_shader_passthrough.html +++ b/extensions/NV/SPV_NV_geometry_shader_passthrough.html @@ -1,355 +1,12 @@ - - - - - - - -SPV_NV_geometry_shader_passthrough - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_geometry_shader_passthrough

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-02-15

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds a new variable decoration to support the OpenGL -GL_NV_geometry_shader_passthrough and the Vulkan -VK_NV_geometry_shader_passthrough extensions in SPIR-V.

-
-
-

The PassthroughNV decoration corresponds to the passthrough layout qualifier.

-
-
-

The new functionality is enabled under the GeometryShaderPassthroughNV capability.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_geometry_shader_passthrough"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
GeometryShaderPassthroughNV
-
-
-
-

that depends on the Geometry capability.

-
-
-
-
-

New Decorations

-
-
-

Decorations added under the GeometryShaderPassthroughNV capability

-
-
-
-
PassthroughNV
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

PassthroughNV

5250

GeometryShaderPassthroughNV

5251

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5250

PassthroughNV
-Apply to an object or a member of a structure type. Indicates a variable that -is passed through a shader stage unmodified. Only valid for the Input -Storage Class.

GeometryShaderPassthroughNV

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5251

GeometryShaderPassthroughNV

Geometry

SPV_NV_geometry_shader_passthrough

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_geometry_shader_passthrough"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2016-11-24

Daniel Koch

Initial draft

2

2017-02-15

Daniel Koch

Mark complete, mention Vulkan extension

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_geometry_shader_passthrough.html + + +

extensions/NV/SPV_NV_geometry_shader_passthrough.html

+ + diff --git a/extensions/NV/SPV_NV_mesh_shader.html b/extensions/NV/SPV_NV_mesh_shader.html index 05dc9ca..5e52943 100644 --- a/extensions/NV/SPV_NV_mesh_shader.html +++ b/extensions/NV/SPV_NV_mesh_shader.html @@ -1,1037 +1,12 @@ - - - - - - - -SPV_NV_mesh_shader - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_mesh_shader

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Christoph Kubisch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    Sahil Parmar, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-10-04

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension interacts with SPV_NV_viewport_array2.

-
-
-

This extension interacts with SPV_EXT_shader_viewport_index_layer.

-
-
-

This extension interacts with SPV_NVX_multiview_per_view_attributes.

-
-
-

This extension interacts with SPIR-V 1.2 (LocalSizeId).

-
-
-

This extension interacts with SPIR-V 1.3 and -SPV_KHR_shader_draw_parameters (DrawIndex).

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL GL_NV_mesh_shader -extension, which adds two new programmable shader types (task and mesh -shaders) that are used instead of the standard programmable vertex -processing pipeline. Both new shader types have execution environments -similar to that of compute shaders.

-
-
-

This extension enables or adds the following functionality under -the new MeshShadingNV capability:

-
-
-
    -
  • -

    adds TaskNV and MeshNV Execution Models for task and mesh shaders, -respectively

    -
  • -
  • -

    adds OutputLinesNV, OutputTrianglesNV, and OutputPrimitivesNV -Execution Modes for mesh shaders

    -
  • -
  • -

    enables LocalSize, LocalSizeId, OutputVertices, and OutputPoints -Execution Modes for mesh and/or task shaders

    -
  • -
  • -

    adds PerPrimitiveNV, PerViewNV, PerTaskNV decorations for input -and/or output variables

    -
  • -
  • -

    adds TaskCountNV, PrimitiveCountNV, PrimitiveIndicesNV, -ClipDistancePerViewNV, CullDistancePerViewNV, LayerPerViewNV, -MeshViewCountNV, and MeshViewIndicesNV builtins in mesh and/or task -shaders

    -
  • -
  • -

    enables Position, PointSize, ClipDistance, CullDistance, -PrimitiveId, Layer, ViewportIndex, WorkgroupSize, WorkgroupId, -LocalInvocationId, GlobalInvocationId, LocalInvocationIndex, -DrawIndex, ViewportMaskNV, PositionPerViewNV, and -ViewportMaskPerViewNV in mesh and/or task shaders

    -
  • -
  • -

    adds the OpWritePackedPrimitiveIndices4x8NV instruction

    -
  • -
-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_mesh_shader"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 2.16.2, Validation Rules for Shader Capabilities)
-
-
-
-
-
-
(Add new items under "Entry point and execution model")
-
-
-
    -
  • -

    Each OpEntryPoint with the MeshNV Execution Model must have an -OpExecutionMode with exactly one of OutputPoints, OutputLinesNV, -or OutputTrianglesNV execution modes.

    -
  • -
  • -

    Each OpEntryPoint with the MeshNV Execution Model must specify -both the OutputPrimitivesNV and OutputVertices execution modes.

    -
  • -
-
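The two entry-point rules above can be sketched as a small check. This is only an illustration: the function name and the idea of passing execution modes as a list of strings are assumptions made here, not part of the extension or of any real validator.

```python
# Hypothetical helper: the mode names come from the rules above, but the
# function and its input format are assumptions made for illustration.
PRIMITIVE_MODES = {"OutputPoints", "OutputLinesNV", "OutputTrianglesNV"}
REQUIRED_MODES = {"OutputPrimitivesNV", "OutputVertices"}


def check_mesh_entry_point(execution_modes):
    """Return a list of rule violations for one MeshNV entry point."""
    modes = list(execution_modes)
    errors = []

    # Rule 1: exactly one output primitive mode must be present.
    primitive_count = sum(1 for m in modes if m in PRIMITIVE_MODES)
    if primitive_count != 1:
        errors.append(
            "expected exactly one output primitive mode, found %d" % primitive_count
        )

    # Rule 2: both OutputPrimitivesNV and OutputVertices must be specified.
    for required in sorted(REQUIRED_MODES):
        if required not in modes:
            errors.append("missing required execution mode %s" % required)

    return errors
```

An entry point declaring `OutputTrianglesNV`, `OutputPrimitivesNV`, and `OutputVertices` passes both rules and yields an empty list.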
-
-
(Add new items under "Decorations")
-
-
-
    -
  • -

    The PerPrimitiveNV decoration can only be used in the Output Storage -Class in a MeshNV Execution Model and can only be used in the Input -Storage Class in the Fragment Execution Model.

    -
  • -
  • -

    The PerViewNV decoration can only be used in the Output Storage Class -in the MeshNV Execution Model.

    -
  • -
  • -

    The PerTaskNV decoration can only be used in the Output Storage Class -in the TaskNV Execution Model and can only be used in the Input Storage -Class in the MeshNV Execution Model.

    -
  • -
-
-
-
-
-
-
-
-
(Modify Section 3.3, Execution Model, adding 2 new rows to the table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - -
Execution ModelEnabling Capabilities

5267

TaskNV
-Task shading stage.

MeshShadingNV

5268

MeshNV
-Mesh shading stage.

MeshShadingNV

-
-
-
-
(Modify Section 3.6, Execution Mode)
-
-
-
-
-

(add new rows to the Execution Mode table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModeEnabling CapabilitiesExtra Operands

5269

OutputLinesNV
-Stage output primitive is lines. -Only valid with the MeshNV Execution Model.

MeshShadingNV

5298

OutputTrianglesNV
-Stage output primitive is triangles. -Only valid with the MeshNV Execution Model.

MeshShadingNV

5270

OutputPrimitivesNV
-For the mesh stage, the maximum number of primitives the shader will ever -emit for the invocation group. -Only valid with the MeshNV Execution Model.

MeshShadingNV

Literal Number
-Primitive count

-
-

(Modify the definition of LocalSize, OutputVertices, OutputPoints, - and LocalSizeId as follows, allowing them to be outputs from MeshNV and/or TaskNV shaders)

-
- -------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModeEnabling CapabilitiesExtra Operands

17

LocalSize
-Indicates the work-group size in the x, y, and z dimensions. Only valid with the GLCompute, MeshNV, TaskNV or Kernel Execution Models.

Literal Number
-x size

Literal Number
-y size

Literal Number
-z size

26

OutputVertices
-Only valid with the Geometry, TessellationControl, TessellationEvaluation, -or MeshNV Execution Models.

Literal Number
-Vertex count

For a geometry stage, the maximum number of vertices the shader will -ever emit in a single invocation.

Geometry

For a tessellation-control stage, the number of vertices in the output -patch produced by the tessellation control shader, which also specifies -the number of times the tessellation control shader is invoked.

Tessellation

For a mesh stage, the maximum number of vertices the shader will ever emit -for the invocation group.

MeshShadingNV

27

OutputPoints
-Stage output primitive is points. -Only valid with the Geometry and MeshNV Execution Models.

Geometry, MeshShadingNV

38

LocalSizeId
-Indicates the work-group size in the x, y, and z dimensions. Only valid with the GLCompute, MeshNV, TaskNV or Kernel Execution Models.
-
- Specified as Ids.

Missing before version 1.2.

<id>
-x size

<id>
-y size

<id>
-z size

-
-
-
-
(Modify Section 3.20, Decoration, adding new rows to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5271

PerPrimitiveNV
-Must only be used on a memory object declaration or a member of a structure type. -Indicates that the variable has separate instances for each primitive -in the mesh output. -Only valid for the Input and Output Storage Classes.

MeshShadingNV

5272

PerViewNV
-Must only be used on a memory object declaration or a member of a structure type. -Indicates that the variable has separate instances for each view -in the mesh output. -Only valid for the Output Storage Class.

MeshShadingNV

5273

PerTaskNV
-Must only be used on a memory object declaration or a member of a structure type. -Indicates that the variable is stored in task memory. -Only valid for the Input and Output Storage Classes.

MeshShadingNV

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add new rows to the Builtin table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

5274

TaskCountNV
-Output task count in the TaskNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5275

PrimitiveCountNV
-Output primitive count in the MeshNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5276

PrimitiveIndicesNV
-Output array of vertex index values in the MeshNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5277

ClipDistancePerViewNV
-Output array of clip distances for each view in the MeshNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5278

CullDistancePerViewNV
-Output array of cull distances for each view in the MeshNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5279

LayerPerViewNV
-Output array of layer selection for each view in the MeshNV Execution Model. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5280

MeshViewCountNV
-Input view count in the TaskNV and MeshNV Execution Models. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

5281

MeshViewIndicesNV
-Input array of view index values in the TaskNV and MeshNV Execution Models. -See the Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

-
-

(Modify the definition of following BuiltIns, allowing -them to be used in TaskNV and/or MeshNV Execution Models.)

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

0

Position
-Vertex position. See Vulkan or OpenGL API specifications -for more detail.

Position input or output from a vertex processing Execution Model.

Shader

Position output from a MeshNV Execution Model

MeshShadingNV

SPV_NV_mesh_shader

1

PointSize
-Vertex point size. See Vulkan or OpenGL API specifications for more detail.

Point size input or output from a vertex processing Execution Model.

Shader

Point size output from a MeshNV Execution Model

MeshShadingNV

SPV_NV_mesh_shader

3

ClipDistance
-Array of clip distances. See Vulkan or OpenGL API specifications for more detail.

Clip distances input or output from a vertex processing Execution Model

ClipDistance

Clip distances output from a MeshNV Execution Model

MeshShadingNV

SPV_NV_mesh_shader

4

CullDistance
-Array of cull distances. See Vulkan or OpenGL API specifications for more detail.

Cull distances input or output from a vertex processing Execution Model

CullDistance

Cull distances output from a MeshNV Execution Model

MeshShadingNV

SPV_NV_mesh_shader

7

PrimitiveId
-Primitive identifier. See Vulkan or OpenGL API specifications for more detail.

Primitive ID in a Geometry Execution Model

Geometry

Primitive ID in a Tessellation Execution Model

Tessellation

Primitive ID output in a MeshNV Execution Model

MeshShadingNV

SPV_NV_mesh_shader

9

Layer
-Layer selection for multi-layer framebuffer. See Vulkan or OpenGL API -specification for more detail.

Layer output by a Geometry Execution Model, -input to a Fragment Execution Model.

Geometry

Layer output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

SPV_EXT_shader_viewport_index_layer

Layer output by a MeshNV Execution Model.

ShaderViewportIndexLayerEXT MeshShadingNV

SPV_EXT_shader_viewport_index_layer SPV_NV_mesh_shader

10

ViewportIndex
-Viewport selection for viewport transformation when using multiple viewports. -See Vulkan or OpenGL API specification for more detail.

Viewport index output by a Geometry Execution Model, -input to a Fragment Execution Model.

MultiViewport

Viewport index output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerEXT

SPV_EXT_shader_viewport_index_layer

Viewport index output by a MeshNV Execution Model

ShaderViewportIndexLayerEXT MeshShadingNV

SPV_EXT_shader_viewport_index_layer SPV_NV_mesh_shader

25

WorkgroupSize
-Work-group size in GLCompute or Kernel Execution Models. -See OpenCL, Vulkan, or OpenGL API specifications for more detail.

Work-group size in TaskNV or MeshNV Execution Models. -See Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

26

WorkgroupId
-Work-group ID in GLCompute or Kernel Execution Models. -See OpenCL, Vulkan, or OpenGL API specifications for more detail.

Work-group ID in TaskNV or MeshNV Execution Models. -See Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

27

LocalInvocationId
-Local invocation ID in GLCompute or Kernel Execution Models. -See OpenCL, Vulkan, or OpenGL API specifications for more detail.

Local invocation ID in TaskNV or MeshNV Execution Models. -See Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

28

GlobalInvocationId
-Global invocation ID in GLCompute or Kernel Execution Models. -See OpenCL, Vulkan, or OpenGL API specifications for more detail.

Global invocation ID in TaskNV or MeshNV Execution Models.

MeshShadingNV

SPV_NV_mesh_shader

29

LocalInvocationIndex
-Local invocation index in GLCompute Execution Model. -See Vulkan or OpenGL API specifications for more detail.
-
-Work-group Linear ID in Kernel Execution Model. -See OpenCL API specification for more detail.

Local invocation index in TaskNV or MeshNV Execution Models. -See Vulkan API specification for more detail.

MeshShadingNV

SPV_NV_mesh_shader

4426

DrawIndex
-Contains the index of the draw currently being processed.
-See the Vulkan 1.1 or OpenGL 4.6 specifications for more details.

DrawParameters
-
-Missing before version 1.3.

SPV_KHR_shader_draw_parameters

Draw index in TaskNV or MeshNV Execution Models

DrawParameters MeshShadingNV

SPV_KHR_shader_draw_parameters SPV_NV_mesh_shader

5253

ViewportMaskNV

Reserved

Output viewport mask in Vertex, Tessellation, or Geometry Execution Model. -See Vulkan or OpenGL API specifications for more detail.

ShaderViewportMaskNV

SPV_NV_viewport_array2

Output viewport mask in MeshNV Execution Model. -See Vulkan API specification for more detail.

ShaderViewportMaskNV MeshShadingNV

SPV_NV_viewport_array2 SPV_NV_mesh_shader

5261

PositionPerViewNV

Reserved

Output vertex position for each view in Vertex, Tessellation, or -Geometry Execution Model, and input position for each view in -Tessellation and Geometry Execution Models. See Vulkan API -specification for more detail.

PerViewAttributesNV

SPV_NVX_multiview_per_view_attributes

Output vertex position for each view in MeshNV Execution Model. -See Vulkan API specification for more detail.

PerViewAttributesNV MeshShadingNV

SPV_NVX_multiview_per_view_attributes SPV_NV_mesh_shader

5262

ViewportMaskPerViewNV

Reserved

Output viewport mask for each view in Vertex, Tessellation, or Geometry -Execution Model. See Vulkan API specification for more detail.

PerViewAttributesNV

SPV_NVX_multiview_per_view_attributes

Output viewport mask for each view in MeshNV Execution Model. -See Vulkan API specification for more detail.

PerViewAttributesNV MeshShadingNV

SPV_NVX_multiview_per_view_attributes SPV_NV_mesh_shader

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5266

MeshShadingNV
-Uses the TaskNV or MeshNV Execution Models.

Shader

SPV_NV_mesh_shader

-
-
-
-
(Modify Section 3.32.1, Miscellaneous Instructions, adding a new row to the table)
-
-
-
- ------ - - - - - - - - - - - - -

OpWritePackedPrimitiveIndices4x8NV
-
-Interprets Packed Indices as four 8-bit unsigned integer values and -stores them into the output variable decorated with the PrimitiveIndicesNV BuiltIn -starting from the byte offset given by Index Offset. The lower bytes of -Packed Indices are stored at lower addresses in the output array variable.
-
-Index Offset must be a scalar of 32-bit integer type, whose Signedness -operand is 0, and must be a multiple of four.
-
-Packed Indices must be a scalar of 32-bit integer type, whose Signedness -operand is 0.

Capability:
-MeshShadingNV

3

5299

<id>
-Index Offset

<id>
-Packed Indices

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_mesh_shader"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Should writePackedPrimitiveIndices4x8NV be added as a new core instruction -or should it be an extended instruction?

    -
    -
    -
    -

    RESOLVED: adding it as a new core instruction as that’s simpler (doesn’t need -a new grammar file) and that seems to be what the extension guide recommends.

    -
    -
    -
    -
  2. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-09-12

Daniel Koch

Internal revisions

2

2018-10-04

Sahil Parmar

Add support for LocalSize and LocalSizeId in TaskNV shaders

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_mesh_shader.html + + +

extensions/NV/SPV_NV_mesh_shader.html

+ + diff --git a/extensions/NV/SPV_NV_raw_access_chains.html b/extensions/NV/SPV_NV_raw_access_chains.html index 000e98a..691ff51 100644 --- a/extensions/NV/SPV_NV_raw_access_chains.html +++ b/extensions/NV/SPV_NV_raw_access_chains.html @@ -1,477 +1,12 @@ - - - - - - - -SPV_NV_raw_access_chains - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_raw_access_chains

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Hans-Kristian Arntzen, Valve

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Rodrigo Locatti, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-01-17

Revision

7

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension exposes an interface similar to Direct3D structured buffers and byte address buffers, allowing shaders compiled from an HLSL source to generate more efficient code.
-This adds the instruction OpRawAccessChainNV under the RawAccessChainsNV capability. An optional operand can be provided to control bounds checking.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_raw_access_chains"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Validation Rules

-
-

Modify section 2.16.1. Universal Validation Rules:

-
-
-

In the section:

-
-
-
-
"It is invalid for a pointer to be an operand to any instruction other than:"
-
-
-
-

Add:

-
-
-
-
<<OpRawAccessChainNV,*OpRawAccessChainNV*>>
-
-
-
-

In the section:

-
-
-
-
"It is invalid for a pointer to be the Result <id> of any instruction other than:"
-
-
-
-

Add:

-
-
-
-
<<OpRawAccessChainNV,*OpRawAccessChainNV*>>
-
-
-
-

Change:

-
-
-
-
"All indexes in <<OpAccessChain,*OpAccessChain*>> and <<OpInBoundsAccessChain,*OpInBoundsAccessChain*>> that are <<OpConstant,*OpConstant*>> with type of <<OpTypeInt,*OpTypeInt*>> with a 'signedness' of 1 must not have their sign bit set."
-
-
-
-

To:

-
-
-
-
"All indexes in <<OpAccessChain,*OpAccessChain*>>, <<OpInBoundsAccessChain,*OpInBoundsAccessChain*>>, and <<OpRawAccessChainNV,*OpRawAccessChainNV*>> that are <<OpConstant,*OpConstant*>> with type of <<OpTypeInt,*OpTypeInt*>> with a 'signedness' of 1 must not have their sign bit set."
-
-
-
-

Change:

-
-
-
-
"An <<OpTypePointer, *OpTypePointer*>> pointing to a 16-bit scalar, a 16-bit vector, or a composite containing a 16-bit member can be used as the result type of <<OpVariable, *OpVariable*>>, or <<OpAccessChain, *OpAccessChain*>>, or <<OpInBoundsAccessChain, *OpInBoundsAccessChain*>>."
-
-
-
-

To:

-
-
-
-
"An <<OpTypePointer, *OpTypePointer*>> pointing to a 16-bit scalar, a 16-bit vector, or a composite containing a 16-bit member can be used as the result type of <<OpVariable, *OpVariable*>>, or <<OpAccessChain, *OpAccessChain*>>, or <<OpInBoundsAccessChain, *OpInBoundsAccessChain*>>, or <<OpRawAccessChainNV,*OpRawAccessChainNV*>>."
-
-
-
-

Change:

-
-
-
-
"An <<OpTypePointer, *OpTypePointer*>> pointing to an 8-bit scalar, an 8-bit vector, or a composite containing an 8-bit member can be used as the result type of <<OpVariable, *OpVariable*>>, or <<OpAccessChain, *OpAccessChain*>>, or <<OpInBoundsAccessChain, *OpInBoundsAccessChain*>>."
-
-
-
-

To:

-
-
-
-
"An <<OpTypePointer, *OpTypePointer*>> pointing to an 8-bit scalar, an 8-bit vector, or a composite containing an 8-bit member can be used as the result type of <<OpVariable, *OpVariable*>>, or <<OpAccessChain, *OpAccessChain*>>, or <<OpInBoundsAccessChain, *OpInBoundsAccessChain*>>, or <<OpRawAccessChainNV,*OpRawAccessChainNV*>>."
-
-
-
-

Modify section 2.6.2. Validation Rules for Shader Capabilities:

-
-
-

Add text to the decorations Restrict, Aliased, Volatile, Coherent, NonWritable, and NonReadable specifying that they can also be used on the result of OpRawAccessChainNV.

-
-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

5414

RawAccessChainsNV
-Allow the instruction OpRawAccessChainNV, and the Raw Access Chain Operands RobustnessPerComponentNV and RobustnessPerElementNV.

Shader

-
-
-
-
-

Instructions

-
-

Add the following instruction:

-
- ----------- - - - - - - - - - - - - - - - - - -

OpRawAccessChainNV
-
-Create a pointer into a composite object with a type different than Base's pointer type.
-Result type must be an OpTypePointer. Its Storage Class operand must be the same as the Storage Class of Base. -The pointee type of Result type must also not be OpTypeArray, OpTypeMatrix, or OpTypeStruct.
-Base's type must be an OpTypePointer. The storage class must be StorageBuffer, PhysicalStorageBuffer, or Uniform. -If StorageBuffer, the pointee type of Base, or if it points to an OpTypeArray, the pointee type of the array it points to, must be decorated with Block. -If Uniform, they must be decorated with BufferBlock.
-Stride must be a scalar OpConstant of integer type. It is treated as an unsigned value.
-Index and Offset must be a scalar integer type with a 32-bit width. They are treated as unsigned values.
-If the product of Stride and Index would overflow, or the addition of Offset to that result would overflow, the behavior is implementation defined but it is not allowed to fault. -If Stride is not zero, Offset plus the size of the pointee type must be less than or equal to Stride.
-The returned pointer is calculated as Base’s byte address plus Offset plus the product of Stride and Index.
-If the optional operand Raw Access Chain Operands is not provided, the default value of None is used.
-The result must only be consumed by OpLoad and OpStore.

Capability:
-RawAccessChainsNV

7+

5398

<id>
-Result Type

Result <id>

<id>
-Base

<id>
-Stride

<id>
-Index

<id>
-Offset

Optional
-Raw Access Chain
-Operands

-
-
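As a rough illustration of the address rule described for OpRawAccessChainNV above, the following Python sketch computes the byte address and checks the Stride/Offset constraint. The function name, the `pointee_size` parameter, and the use of plain integers as byte addresses are assumptions made for illustration; none of them come from the extension. Overflow is implementation-defined in the extension text, so the sketch does not model it.

```python
def raw_access_chain_address(base, stride, index, offset, pointee_size):
    """Hypothetical helper: Base + Offset + Stride * Index, in bytes."""
    # Index and Offset are 32-bit values treated as unsigned.
    for name, value in (("index", index), ("offset", offset)):
        if not 0 <= value < 2**32:
            raise ValueError(f"{name} must fit in an unsigned 32-bit integer")
    # Stride is treated as an unsigned value.
    if stride < 0:
        raise ValueError("stride is treated as an unsigned value")
    # If Stride is not zero, Offset plus the pointee size must not exceed Stride.
    if stride != 0 and offset + pointee_size > stride:
        raise ValueError("offset + pointee_size must be <= stride")
    # Python integers do not wrap, so the implementation-defined overflow
    # behavior mentioned in the text is intentionally not modeled here.
    return base + offset + stride * index
```

For example, with a 16-byte stride, element index 3, and a 4-byte value at offset 4, the sketch returns `base + 52`.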

Modify the OpLoad and OpStore instructions to add:

-
-
-

If Pointer is the result of an OpRawAccessChainNV instruction, a valid Aligned memory operand must be defined.
-This alignment must be at least the size of the component.

-
-
-
-

Raw Access Chain Operands

-
-

At the end of Section 3 "Binary Form", add:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Raw Access Chain OperandsEnabling Capabilities

0x0

None

0x1

RobustnessPerComponentNV
-Consumers of this access chain will bounds check each scalar using the computed offset.
-The behavior for out of bounds accesses is specified by the host environment.
-Must not be used on a pointer with storage class PhysicalStorageBuffer.
-Must not be used with RobustnessPerElementNV.

RawAccessChainsNV

0x2

RobustnessPerElementNV
-Consumers of this access chain will bounds check using the product of Stride and Index as access offset for the whole operation. Stride must not be zero.
-The behavior for out of bounds accesses is specified by the host environment.
-The implementation may assume any offset within Index is either all in-bounds or all out-of-bounds.
-Must not be used on a pointer with storage class PhysicalStorageBuffer.
-Must not be used with RobustnessPerComponentNV.

RawAccessChainsNV

-
-
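The restrictions listed in the Raw Access Chain Operands table can be sketched as a small checker. This is only an illustrative sketch: the Python class and function names, and the representation of a storage class as a string, are assumptions; only the mask values 0x0, 0x1, and 0x2 and the rules themselves are taken from the table above.

```python
from enum import IntFlag


class RawAccessChainOperands(IntFlag):
    # Mask values as listed in the table; the Python names are hypothetical.
    NONE = 0x0
    ROBUSTNESS_PER_COMPONENT_NV = 0x1
    ROBUSTNESS_PER_ELEMENT_NV = 0x2


def check_raw_access_chain_operands(operands, storage_class, stride):
    """Return a list of rule violations (an empty list means none were found)."""
    ops = RawAccessChainOperands(operands)
    per_component = bool(ops & RawAccessChainOperands.ROBUSTNESS_PER_COMPONENT_NV)
    per_element = bool(ops & RawAccessChainOperands.ROBUSTNESS_PER_ELEMENT_NV)
    problems = []
    # The two robustness modes must not be used together.
    if per_component and per_element:
        problems.append("per-component and per-element robustness are mutually exclusive")
    # Neither mode may be used with PhysicalStorageBuffer pointers.
    if (per_component or per_element) and storage_class == "PhysicalStorageBuffer":
        problems.append("robustness operands are not allowed on PhysicalStorageBuffer")
    # Per-element checking requires a non-zero Stride.
    if per_element and stride == 0:
        problems.append("RobustnessPerElementNV requires a non-zero Stride")
    return problems
```

Returning a list rather than raising on the first violation lets a caller report every problem at once.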
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-07-03

Rodrigo Locatti

Initial revision.

2

2023-07-06

Hans-Kristian Arntzen

Misc refinements.

3

2023-07-07

Rodrigo Locatti

Add support for memory decorations to OpRawAccessChainEXT.
-Clarify stride restrictions for bounds checking decorations.
-Define behavior for overflows. Offset must be smaller than Stride.

4

2023-07-12

Rodrigo Locatti

Define overflows on Offset.

5

2023-08-29

Rodrigo Locatti

Rename extension to SPV_NV_raw_access_chains.

6

2023-09-29

Rodrigo Locatti

Allow per-component bounds checking for non-zero strides.
-Move robustness decorations to an operand in the OpRawAccessChainNV.
-Remove support for ignoring Index if Stride is zero.
-Define enums.

7

2024-01-17

Rodrigo Locatti

Remove mentions of old decorations.
-Fix OpRawAccessChainNV pointer type rules for Base.
-Explicitly mention overflow does not allow faulting.
-Give the optional operand a default value.
-Minor wording changes.

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_raw_access_chains.html + + +

extensions/NV/SPV_NV_raw_access_chains.html

+ + diff --git a/extensions/NV/SPV_NV_ray_tracing.html b/extensions/NV/SPV_NV_ray_tracing.html index 66d48cd..8d594db 100644 --- a/extensions/NV/SPV_NV_ray_tracing.html +++ b/extensions/NV/SPV_NV_ray_tracing.html @@ -1,968 +1,12 @@ - - - - - - - -SPV_NV_ray_tracing - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_ray_tracing

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-09-17

Revision

5

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_NV_ray_tracing extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_ray_tracing"
-
-
-
-
-
-

New Execution Models

-
-
-

This extension introduces new execution models:

-
-
-
-
RayGenerationNV
-IntersectionNV
-AnyHitNV
-ClosestHitNV
-MissNV
-CallableNV
-
-
-
-

These depend on the RayTracingNV capability.

-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
RayTracingNV
-
-
-
-
-
-

New Storage Classes

-
-
-

Storage classes added under the RayTracingNV capability

-
-
-
-
RayPayloadNV
-IncomingRayPayloadNV
-HitAttributeNV
-CallableDataNV
-IncomingCallableDataNV
-ShaderRecordBufferNV
-
-
-
-
-
-

New Builtins

-
-
-

Builtins added under the RayTracingNV capability

-
-
-
-
LaunchIdNV
-LaunchSizeNV
-InstanceCustomIndexNV
-WorldRayOriginNV
-WorldRayDirectionNV
-ObjectRayOriginNV
-ObjectRayDirectionNV
-RayTminNV
-RayTmaxNV
-ObjectToWorldNV
-WorldToObjectNV
-HitTNV
-HitKindNV
-IncomingRayFlagsNV
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the RayTracingNV capability

-
-
-
-
OpReportIntersectionNV
-OpIgnoreIntersectionNV
-OpTerminateRayNV
-OpTraceNV
-OpTypeAccelerationStructureNV
-OpExecuteCallableNV
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 2.2.2, Types )
-
-
-

add OpTypeAccelerationStructureNV to list of opaque types

-
-
-
(Modify Section 3.3, Execution Model, adding rows to the Execution Model table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Execution ModelEnabling Capabilities

5313

RayGenerationNV
-Ray generation shading stage.

RayTracingNV

5314

IntersectionNV
-Intersection shading stage.

RayTracingNV

5315

AnyHitNV
-Any hit shading stage.

RayTracingNV

5316

ClosestHitNV
-Closest hit shading stage.

RayTracingNV

5317

MissNV
-Miss shading stage.

RayTracingNV

5318

CallableNV
-Ray callable shading stage.

RayTracingNV

-
-
-
-
(Modify Section 3.7, Storage Class, adding rows to the Storage Class table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Storage ClassEnabling CapabilitiesEnabled by Extension

5328

CallableDataNV
-Used for storing arbitrary data associated with a ray to pass to callables. -Visible across all functions in the current invocation. Not shared externally. Variables declared -with this storage class can be both read and written to. Only allowed in RayGenerationNV, -ClosestHitNV, CallableNV, and MissNV execution models.

RayTracingNV

SPV_NV_ray_tracing

5329

IncomingCallableDataNV
-Used for storing arbitrary data from parent sent to current callable stage invoked from -an executeCallable call. Visible across all functions in current invocation. Not shared externally. -Variables declared with the storage class are allowed only in CallableNV execution models. -Can be both read and written to in above execution models.

RayTracingNV

SPV_NV_ray_tracing

5338

RayPayloadNV
-Used for storing payload data associated with a ray. Visible across all functions in -the current invocation. Not shared externally. Variables declared -with this storage class can be both read and written to. Only allowed in RayGenerationNV, -AnyHitNV, ClosestHitNV and MissNV execution models.

RayTracingNV

SPV_NV_ray_tracing

5339

HitAttributeNV
-Used for storing attributes of geometry intersected by a ray. Visible across all -functions in the current invocation. Not shared externally. Variables declared with this -storage class are allowed only in IntersectionNV, AnyHitNV and ClosestHitNV execution models. -They can be written to only in IntersectionNV execution model and read from only -in AnyHitNV and ClosestHitNV execution models.

RayTracingNV

SPV_NV_ray_tracing

5342

IncomingRayPayloadNV
-Used for storing parent payload data associated with a ray in current stage invoked from -a trace call. Visible across all functions in current invocation. Not shared externally. -Variables declared with the storage class are allowed only in AnyHitNV, ClosestHitNV and -MissNV execution models. Can be both read and written to in above execution models.

RayTracingNV

SPV_NV_ray_tracing

5343

ShaderRecordBufferNV
-Used for storing data in shader record associated with each unique shader in ray_tracing -pipeline. Visible across all functions in current invocation. Can be initialized externally via API. -Variables declared with this storage class are allowed in RayGenerationNV, IntersectionNV, -AnyHitNV, ClosestHitNV, MissNV and CallableNV execution models and can be both read and written to -but cannot have initializers. Refer to the Ray Tracing chapter of Vulkan API specification for details on shader records.

RayTracingNV

SPV_NV_ray_tracing

-
-
-
-
(Modify Section 3.20, Decoration)
-
-
-
-
-

Modify the definition of the Location decoration to add:

-
-
-

Location also forms the linkage between:

-
-
-
    -
  • -

    OpTraceNV and the RayPayloadKHR and IncomingRayPayloadKHR Storage Classes

    -
  • -
  • -

    OpExecuteCallableNV and the CallableDataKHR and IncomingCallableDataKHR Storage Classes

    -
  • -
-
-
-

Add RayPayloadKHR, IncomingRayPayloadKHR, CallableDataKHR, and IncomingCallableDataKHR -to the list of Storage Classes that Location is valid on.

-
-
-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5319

LaunchIdNV
-Index of work item being processed in current invocation of ray tracing shader stage. -Allowed in all ray tracing execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5320

LaunchSizeNV
-Width and height dimensions passed to vkCmdTraceRaysNV call which resulted in invocation of -current ray tracing shader stage. Allowed in all ray tracing execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5321

WorldRayOriginNV
-World-space origin coordinates for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5322

WorldRayDirectionNV
-World-space direction for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5323

ObjectRayOriginNV
-Object-space origin coordinates for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5324

ObjectRayDirectionNV
-Object-space direction for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5325

RayTminNV
-The current Tmin parametric value for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5326

RayTmaxNV
-The current Tmax parametric value for the ray being traced in the IntersectionNV, -AnyHitNV, ClosestHitNV, or MissNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5327

InstanceCustomIndexNV
-Application-specified value associated with the instance that was hit by the current ray in the IntersectionNV, -AnyHitNV, or ClosestHitNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5330

ObjectToWorldNV
-The 4x3 object to world transformation matrix for the ray being traced in the IntersectionNV, -AnyHitNV, or ClosestHitNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5331

WorldToObjectNV
-The 4x3 world to object transformation matrix for the ray being traced in the IntersectionNV, -AnyHitNV, or ClosestHitNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5332

HitTNV
-The parametric value T at which the ray being traced registered a hit, in the AnyHitNV or -ClosestHitNV execution models. This is an alias for RayTmaxNV for convenience.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5333

HitKindNV
-The hit kind of the hit for the ray being traced in the AnyHitNV or -ClosestHitNV execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

5351

IncomingRayFlagsNV
-The ray flags in the current stage, as passed in through the trace call in the parent. Available in the AnyHitNV, -ClosestHitNV, IntersectionNV, and MissNV stages.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingNV

-
-
-
-
(Modify the definition of following BuiltIns, allowing them to be used in IntersectionNV, AnyHitNV, or ClosestHitNV Execution Models.)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

6

InstanceId
-Input Instance identifier. See the client API specifications -for more detail.

Instance ID in a Vertex Execution Model

Shader

Instance ID in an IntersectionNV, AnyHitNV, or ClosestHitNV Execution Model

RayTracingNV

SPV_NV_ray_tracing

7

PrimitiveId
-Primitive identifier. See the client API specifications for more detail.

Primitive ID in a Geometry Execution Model

Geometry

Primitive ID in a Tessellation Execution Model

Tessellation

Primitive ID in an IntersectionNV, AnyHitNV, or ClosestHitNV Execution Model

RayTracingNV

SPV_NV_ray_tracing

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5340

RayTracingNV
-Uses the RayGenerationNV, IntersectionNV, AnyHitNV, ClosestHitNV, -MissNV, or CallableNV Execution Models

SPV_NV_ray_tracing

-
-
-
-
(Modify Section 3.32.6, Type-Declaration Instructions, adding a new table)
-
-
-
- ----- - - - - - - - - - - - -

OpTypeAccelerationStructureNV
-
-Declares an acceleration structure type which is an opaque reference to -acceleration structure handle as defined in the Ray Tracing chapter of Vulkan API -specification.

-

Consumed by OpTraceNV

-

This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-RayTracingNV

2

5341

<id> Result

-
-
-
-
(Add a new sub section 3.32.24, Ray Tracing Instructions, adding to end of list of instructions)
-
-
-
- --------------- - - - - - - - - - - - - - - - - - - - - - -

OpTraceNV
-
- Trace a ray into the acceleration structure.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags controls the properties for the trace. See the Ray Tracing chapter of Vulkan API specification for more details.
-
- Cull Mask is the 8-bit mask for test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the Ray Tracing chapter of Vulkan API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload number matches the declared location of the payload structure to use for this trace.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationNV, ClosestHitNV and MissNV execution models.
-

Capability:
-RayTracingNV

12

5337

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Payload number

- -------- - - - - - - - - - - - - - - -

OpReportIntersectionNV
-
-Reports an intersection back to the traversal infrastructure.

-

Hit is the floating point parametric value along ray for the intersection.

-

Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.

-

Result Type must be a scalar boolean.

-

Hit must be a 32-bit float type scalar.

-

Hit Kind must be a 32-bit unsigned integer type scalar.

-

This instruction is allowed only in IntersectionNV execution model.

Capability:
-RayTracingNV

5

5334

<id> Result Type

<id> Result

<id> Hit

<id> Hit Kind

- ---- - - - - - - - - - - -

OpIgnoreIntersectionNV
-
-Ignores the current potential intersection.

-

This instruction is allowed only in AnyHitNV execution model.

Capability:
-RayTracingNV

1

5335

- ---- - - - - - - - - - - -

OpTerminateRayNV
-
-Terminates further traversal of a ray.

-

This instruction is allowed only in AnyHitNV execution model.

Capability:
-RayTracingNV

1

5336

- ------ - - - - - - - - - - - - -

OpExecuteCallableNV
-
-Invoke a callable shader.

-

SBT Index is the index into the SBT that selects the callable shader to execute.

-

Callable Data Number matches the declared location of the callable data to pass through this call.

-

This instruction is allowed only in RayGenerationNV, ClosestHitNV, MissNV and CallableNV execution models.

Capability:
-RayTracingNV

3

5344

<id> SBT Index

<id> Callable Data Number

-
-
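The per-instruction execution model restrictions stated in the instruction tables above can be collected into one lookup. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the entries simply restate the "allowed only in" sentences from this extension.

```python
# Execution models in which each SPV_NV_ray_tracing instruction is allowed,
# restating the restrictions given in the instruction descriptions.
ALLOWED_EXECUTION_MODELS = {
    "OpTraceNV": {"RayGenerationNV", "ClosestHitNV", "MissNV"},
    "OpReportIntersectionNV": {"IntersectionNV"},
    "OpIgnoreIntersectionNV": {"AnyHitNV"},
    "OpTerminateRayNV": {"AnyHitNV"},
    "OpExecuteCallableNV": {"RayGenerationNV", "ClosestHitNV", "MissNV", "CallableNV"},
}


def is_allowed(instruction, execution_model):
    """Hypothetical check: may `instruction` appear in `execution_model`?"""
    try:
        return execution_model in ALLOWED_EXECUTION_MODELS[instruction]
    except KeyError:
        # Instructions not listed here are outside the scope of this sketch.
        raise ValueError(f"unknown instruction: {instruction}") from None
```

A validator could run such a lookup for every ray tracing instruction reachable from an entry point.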
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_ray_tracing"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-09-12

Eric Werness

Internal revisions

2

2018-10-19

Ashwin Lele

Rename from NVX_raytracing to NV_ray_tracing. - Add IncomingRayFlagsNV, CallableDataNV, - IncomingCallableDataNV, OpExecuteCallableNV

3

2018-11-20

Daniel Koch

Uses InstanceId not InstanceIndex for - Intersection, Any Hit and Closest Hit shaders

4

2019-03-25

Eric Werness

Incoming ray flags shouldn’t be exposed in - Callable shaders

5

2020-09-17

Daniel Koch

Add edits for the Location decoration (SPIR #583)

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_ray_tracing.html + + +

extensions/NV/SPV_NV_ray_tracing.html

+ + diff --git a/extensions/NV/SPV_NV_ray_tracing_motion_blur.html b/extensions/NV/SPV_NV_ray_tracing_motion_blur.html index 08588a2..f76a568 100644 --- a/extensions/NV/SPV_NV_ray_tracing_motion_blur.html +++ b/extensions/NV/SPV_NV_ray_tracing_motion_blur.html @@ -1,517 +1,12 @@ - - - - - - - -SPV_NV_ray_tracing_motion_blur - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_ray_tracing_motion_blur

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-11-29

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.5 Revision 5.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension requires SPV_KHR_ray_tracing and interacts with -SPV_NV_ray_tracing.

-
-
-
-
-

Overview

-
-
-

This extension adds new functionality to support the Vulkan -VK_NV_ray_tracing_motion_blur extension in SPIR-V.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_ray_tracing_motion_blur"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
RayTracingMotionBlurNV
-
-
-
-
-
-

New Builtins

-
-
-

Builtins added under the RayTracingMotionBlurNV capability

-
-
-
-
CurrentRayTimeNV
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the RayTracingMotionBlurNV capability

-
-
-
-
OpTraceMotionNV
-OpTraceRayMotionNV
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5334

CurrentRayTimeNV
-Provides the time parameter as passed to the parent OpTraceMotionNV or -OpTraceRayMotionNV call. -Allowed only in IntersectionKHR, AnyHitKHR, ClosestHitKHR and -MissKHR ray tracing execution models.

-

Refer to the Ray Tracing chapter of Vulkan API specification for more details.

RayTracingMotionBlurNV

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5341

RayTracingMotionBlurNV
-Allows the use of OpTraceMotionNV or OpTraceRayMotionNV.

SPV_NV_ray_tracing_motion_blur

-
-
-
-
(Add a new sub section 3.32.24, Ray Tracing Instructions, adding to end of list of instructions)
-
-
-
- ---------------- - - - - - - - - - - - - - - - - - - - - - - -

OpTraceMotionNV
-
- Trace a ray into the acceleration structure.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the Ray Flag values as described in SPV_KHR_ray_tracing.
-
- Cull Mask is the mask to test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the Ray Tracing chapter of Vulkan API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload number matches the declared location of the payload structure to use for this trace.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Only the 4 least-significant bits of SBT Offset and SBT Stride are used by this instruction - other bits are ignored. -
- Only the 16 least-significant bits of Miss Index are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-
- This instruction is a shader call instruction which may invoke shaders with the IntersectionKHR, AnyHitKHR, - ClosestHitKHR, and MissKHR execution models. -

Capability:
-RayTracingMotionBlurNV

13

5338

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Current Time

<id> Payload number

- ---------------- - - - - - - - - - - - - - - - - - - - - - - -

OpTraceRayMotionNV
-
- Trace a ray into the acceleration structure.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the Ray Flag values as described in SPV_KHR_ray_tracing.
-
- Cull Mask is the mask to test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the Ray Tracing chapter of Vulkan API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload is a pointer to the ray payload structure to use for this trace. Payload must be the result of an OpVariable with a storage class of RayPayloadKHR or IncomingRayPayloadKHR.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Only the 4 least-significant bits of SBT Offset and SBT Stride are used by this instruction - other bits are ignored. -
- Only the 16 least-significant bits of Miss Index are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-
- This instruction is a shader call instruction which may invoke shaders with the IntersectionKHR, AnyHitKHR, - ClosestHitKHR, and MissKHR execution models. -

Capability:
-RayTracingMotionBlurNV

13

5339

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Current Time

<id> Payload

-
-
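Both trace instructions above state that only the low bits of some 32-bit operands are used. A minimal sketch of that masking follows, assuming the operands are available as plain Python integers; the function name and the returned dictionary layout are hypothetical, while the bit widths (8, 4, and 16) are the ones given in the instruction descriptions.

```python
def effective_trace_operands(cull_mask, sbt_offset, sbt_stride, miss_index):
    """Hypothetical helper: keep only the bits the instruction actually uses."""
    return {
        "cull_mask": cull_mask & 0xFF,      # 8 least-significant bits
        "sbt_offset": sbt_offset & 0xF,     # 4 least-significant bits
        "sbt_stride": sbt_stride & 0xF,     # 4 least-significant bits
        "miss_index": miss_index & 0xFFFF,  # 16 least-significant bits
    }
```

Bits above these widths are ignored, so two operand values that differ only in their high bits select the same behavior.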
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_ray_tracing_motion_blur"
-
-
-
-
-
-

Interactions with SPV_NV_ray_tracing

-
-
-

OpTypeAccelerationStructureKHR, RayGenerationKHR, IntersectionKHR, -AnyHitKHR, ClosestHitKHR, MissKHR, RayPayloadKHR and IncomingRayPayloadKHR -are aliases of OpTypeAccelerationStructureNV, RayGenerationNV, IntersectionNV, -AnyHitNV, ClosestHitNV, MissNV, RayPayloadNV and IncomingRayPayloadNV respectively and can be used -interchangeably in this extension.

-
-
-

OpTraceMotionNV is supported only if SPV_NV_ray_tracing is supported.

-
-
-
-
-

Issues

-
-
-

1) Why are there two separate instructions OpTraceMotionNV and OpTraceRayMotionNV added -with this extension?

-
-
-

Resolved: The OpTraceNV instruction in the SPV_NV_ray_tracing extension takes a payload number as its last argument, -whereas OpTraceRayKHR takes the id of an OpVariable. We follow the same -convention and provide two separate instructions: OpTraceMotionNV takes a payload number as its last -argument, and OpTraceRayMotionNV takes the id of an OpVariable.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-06-01

Ashwin Lele

Internal revisions

2

2023-11-29

Daniel Koch

fix typo in document title

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_ray_tracing_motion_blur.html + + +

extensions/NV/SPV_NV_ray_tracing_motion_blur.html

+ + diff --git a/extensions/NV/SPV_NV_sample_mask_override_coverage.html b/extensions/NV/SPV_NV_sample_mask_override_coverage.html index a53721f..fab1bc6 100644 --- a/extensions/NV/SPV_NV_sample_mask_override_coverage.html +++ b/extensions/NV/SPV_NV_sample_mask_override_coverage.html @@ -1,359 +1,12 @@ - - - - - - - -SPV_NV_sample_mask_override_coverage - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_sample_mask_override_coverage

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Kerch Holt, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-

Complete.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-02-15

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides a new builtin variable decoration -to support the OpenGL GL_NV_sample_mask_override_coverage and -Vulkan VK_NV_sample_mask_override_coverage extensions in SPIR-V.

-
-
-

The OverrideCoverageNV decoration corresponds to the override_coverage layout -qualifier annotation.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_sample_mask_override_coverage"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
SampleMaskOverrideCoverageNV
-
-
-
-

that depends on the SampleRateShading capability.

-
-
-
-
-

New Decorations

-
-
-

Decorations added under the SampleMaskOverrideCoverageNV capability

-
-
-
-
OverrideCoverageNV
-
-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - -

OverrideCoverageNV

5248

SampleMaskOverrideCoverageNV

5249

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationEnabling CapabilitiesExtra Operands

5248

OverrideCoverageNV
-Allows the fragment shader to control whether the -SampleMask builtin output can enable samples that were not covered -by the original primitive, or that failed the early depth/stencil tests.

SampleMaskOverrideCoverageNV

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5249

SampleMaskOverrideCoverageNV

SampleRateShading

SPV_NV_sample_mask_override_coverage

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_sample_mask_override_coverage"
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2016-05-25

Kerch Holt

Initial revision

2

2016-11-24

Daniel Koch

Add capability, assign tokens. - Improve formatting.

3

2017-02-15

Daniel Koch

Mark complete, mention Vulkan extension

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_sample_mask_override_coverage.html + + +

extensions/NV/SPV_NV_sample_mask_override_coverage.html

+ + diff --git a/extensions/NV/SPV_NV_shader_atomic_fp16_vector.html b/extensions/NV/SPV_NV_shader_atomic_fp16_vector.html index 1873240..e495999 100644 --- a/extensions/NV/SPV_NV_shader_atomic_fp16_vector.html +++ b/extensions/NV/SPV_NV_shader_atomic_fp16_vector.html @@ -1,289 +1,12 @@ - - - - - - - -SPV_NV_shader_atomic_fp16_vector - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shader_atomic_fp16_vector

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Pankaj Mistry, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-02-21

Revision

3

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, Version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension reuses instructions from extensions -SPV_EXT_shader_atomic_float_add and SPV_EXT_shader_atomic_float_min_max -but does not require them to be supported or enabled.

-
-
-
-
-

Overview

-
-
-

This extension adds support for atomic add, min, max and exchange instructions -on float16 vectors with 2 or 4 components.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shader_atomic_fp16_vector"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the new capability:

-
-
-
-
AtomicFloat16VectorNV
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.5

-
-
-

Modify Section 3.31, "Capability", adding this row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

6095

AtomicFloat16VectorNV
-Uses the OpAtomicFAddEXT, OpAtomicExchange, OpAtomicFMinEXT, and -OpAtomicFMaxEXT instructions on float16 vectors with 2 or 4 components.

-
-
-
-

Modify section 3.49.18, Atomic Instructions:

-
-
-

Add the AtomicFloat16VectorNV capability to the OpAtomicFAddEXT, -OpAtomicFMinEXT, and OpAtomicFMaxEXT instructions.

-
-
-

Modify each of the instructions OpAtomicFAddEXT, -OpAtomicFMinEXT, OpAtomicFMaxEXT, and OpAtomicExchange to allow the -Result Type to be a vector of float16 with two or four components. -Atomic operations on vectors only guarantee atomicity of each component.

-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shader_atomic_fp16_vector"
-
-
-
-
    -
  • -

    When using OpAtomicFAddEXT, OpAtomicExchange, OpAtomicFMinEXT or OpAtomicFMaxEXT, float16 vectors with 2 or 4 components are allowed.

    -
  • -
  • -

    If OpAtomicFAddEXT, OpAtomicExchange, OpAtomicFMinEXT or OpAtomicFMaxEXT is used with a float16 vector with 2 or 4 components, the AtomicFloat16VectorNV -capability must be declared.

    -
  • -
-
-
-
-
-
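The two validation rules above can be sketched as a small check. This is only an illustration, not part of the extension: the helper name `check_fp16_vector_atomic` and its argument shapes are hypothetical, and it assumes the caller has already determined that the result type is a vector of float16.

```python
# Instructions that this extension allows on float16 vectors (from the rules above).
FP16_VECTOR_ATOMIC_OPS = {
    "OpAtomicFAddEXT",
    "OpAtomicExchange",
    "OpAtomicFMinEXT",
    "OpAtomicFMaxEXT",
}


def check_fp16_vector_atomic(opcode, component_count, declared_capabilities):
    """Return a list of rule violations for an atomic whose result type is a float16 vector.

    Hypothetical helper: 'opcode' is the instruction name, 'component_count' the
    vector width, 'declared_capabilities' the set of OpCapability names in the module.
    """
    errors = []
    if opcode not in FP16_VECTOR_ATOMIC_OPS:
        errors.append(f"{opcode} is not one of the instructions extended for float16 vectors")
    if component_count not in (2, 4):
        errors.append("float16 vector must have 2 or 4 components")
    if "AtomicFloat16VectorNV" not in declared_capabilities:
        errors.append("AtomicFloat16VectorNV capability must be declared")
    return errors
```

An empty list means the use satisfies both rules as stated here; a real validator would of course also apply the base SPIR-V rules for atomic instructions.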

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

Pankaj Mistry

Internal revisions

2

2024-02-03

Jeff Bolz

Updates and simplifications

3

2024-02-21

Jeff Bolz

Fix interaction with float_add/float_min_max

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shader_atomic_fp16_vector.html + + +

extensions/NV/SPV_NV_shader_atomic_fp16_vector.html

+ + diff --git a/extensions/NV/SPV_NV_shader_image_footprint.html b/extensions/NV/SPV_NV_shader_image_footprint.html index 6955f68..c933504 100644 --- a/extensions/NV/SPV_NV_shader_image_footprint.html +++ b/extensions/NV/SPV_NV_shader_image_footprint.html @@ -1,500 +1,12 @@ - - - - - - - -SPV_NV_shader_image_footprint - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shader_image_footprint

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Chao Chen, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-08-14

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds support for a new image function for -querying information about the texel footprint of a -corresponding image sampling operation. -This extension provides SPIR-V support for the GLSL -GL_NV_shader_texture_footprint extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shader_image_footprint"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 3.14, Image Operands)
-
-

Add OpImageSampleFootprintNV to the list of opcodes that use Image Operands.

-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5282

ImageFootprintNV

Shader

SPV_NV_shader_image_footprint

-
-
-
-
(Modify Section 3.32.10, Image Instructions, adding a new row to the table)
-
-
-
- ------------ - - - - - - - - - - - - - - - - - - -

OpImageSampleFootprintNV
-
- Return the texel footprint of a corresponding sampling instruction.
-
- Result Type must be an OpTypeStruct with six members.
- Member 0 must be a Boolean type scalar. Member 0 holds a boolean value indicating - whether the corresponding image sampling operation only touches a single LOD level.
- Member 1 must be a vector of integer type, whose Signedness operand is 0, and - have the same number of components as Coordinate. Member 1 holds the footprint - anchor.
- Member 2 must be a vector of integer type, whose Signedness operand is 0, and - have the same number of components as Coordinate. Member 2 holds the footprint - offset.
- Member 3 must be a vector of integer type, whose Signedness operand is 0, and - have two components. Member 3 holds the footprint mask.
- Member 4 must be a scalar of integer type, whose Signedness operand is 0. - Member 4 holds the footprint LOD.
- Member 5 must be a scalar of integer type, whose Signedness operand is 0. - Member 5 holds the footprint granularity after coarsening. A value - of zero means that no coarsening occurred.
-
- This structure type must be explicitly declared by the module. -
-
-Sampled Image must be an object whose type is - OpTypeSampledImage. The Dim operand of the underlying - OpTypeImage must be 2D or 3D, and the Arrayed and MS operands must be 0.
-
-Coordinate must be a vector of floating-point type. -It contains (u, v[, w]) as needed by the definition of Sampled Image.
-
-Granularity must be a scalar of integer type with the granularity - of the returned image footprint mask, indicating how many texels each bit in the - bitmask corresponds to.
-
-Coarse must be a constant instruction of scalar Boolean type indicating the - low (fine) or high (coarse) LOD level for trilinear filtering, where two - footprint operations are required to retrieve both bitmasks.
-
-Image Operands encodes what operands follow, as per Image Operands.
- Supported combinations are:
- For Dim = 2D: None, Bias, MinLod, (MinLod + Bias), Lod, Grad, - (Grad + MinLod).
- For Dim = 3D: None, Bias, MinLod, (MinLod + Bias), Lod.

Capability:
-ImageFootprintNV

7 + variable

5283

<id>
-Result Type

Result <id>

<id>
-Sampled Image

<id>
-Coordinate

<id>
-Granularity

<id>
-Coarse

Optional Image Operands

Optional
-<id>, <id>, …​

-
-
-
-
-
-
-

(Add a new Section 3.Granularity, Granularity)

-
-
-

Granularity

-
-

Granularity of an image footprint mask. -Used by OpImageSampleFootprintNV.

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
GranularityEnabling Capabilities

Dim=2D

Dim=3D

0

Input: not valid
- Output: no coarsening

ImageFootprintNV

1

2x2

2x2x2

ImageFootprintNV

2

4x2

(reserved)

ImageFootprintNV

3

4x4

4x4x2

ImageFootprintNV

4

8x4

(reserved)

ImageFootprintNV

5

8x8

(reserved)

ImageFootprintNV

6

16x8

(reserved)

ImageFootprintNV

7

16x16

(reserved)

ImageFootprintNV

8

(reserved)

(reserved)

9

(reserved)

(reserved)

10

(reserved)

16x16x16

ImageFootprintNV

11

64x64

32x16x16

ImageFootprintNV

12

128x64

32x32x16

ImageFootprintNV

13

128x128

32x32x32

ImageFootprintNV

14

256x128

64x32x32

ImageFootprintNV

15

256x256

(reserved)

ImageFootprintNV

-
-
-
-
-
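As a reading aid for the Granularity table above, the following sketch turns it into a lookup. The function names `footprint_block` and `texels_per_mask_bit` are hypothetical and exist only for illustration; the block sizes are copied from the table, and reserved or invalid input values are rejected.

```python
import math

# Texel block covered by one footprint-mask bit, keyed by Granularity value.
# Values are taken from the table above; missing keys are reserved.
GRANULARITY_2D = {
    1: (2, 2), 2: (4, 2), 3: (4, 4), 4: (8, 4), 5: (8, 8),
    6: (16, 8), 7: (16, 16), 11: (64, 64), 12: (128, 64),
    13: (128, 128), 14: (256, 128), 15: (256, 256),
}
GRANULARITY_3D = {
    1: (2, 2, 2), 3: (4, 4, 2), 10: (16, 16, 16), 11: (32, 16, 16),
    12: (32, 32, 16), 13: (32, 32, 32), 14: (64, 32, 32),
}


def footprint_block(granularity, dim):
    """Return the texel block size for an input Granularity value and Dim ("2D" or "3D")."""
    tables = {"2D": GRANULARITY_2D, "3D": GRANULARITY_3D}
    if dim not in tables:
        raise ValueError(f"unsupported Dim: {dim}")
    if granularity == 0:
        # 0 is only meaningful as an output ("no coarsening"), never as an input.
        raise ValueError("Granularity 0 is not a valid input")
    block = tables[dim].get(granularity)
    if block is None:
        raise ValueError(f"Granularity {granularity} is reserved for Dim={dim}")
    return block


def texels_per_mask_bit(granularity, dim):
    """Number of texels that each bit of the footprint mask corresponds to."""
    return math.prod(footprint_block(granularity, dim))
```

For example, `footprint_block(3, "3D")` gives the 4x4x2 entry from the table, so each mask bit covers 32 texels.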

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shader_image_footprint"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    How do we handle out parameters from functions?

    -
    -
    -
    -

    RESOLVED: Op returns a structure - see ModfStruct and FrexpStruct in -the GLSL.std.450 extended instruction sets.

    -
    -
    -
    -
  2. -
  3. -

    How many variants of the "footprint" instructions do we need?

    -
    -
    -
    -

    RESOLVED: Using the existing Image Operands, we can get away with just one.

    -
    -
    -
    -
  4. -
  5. -

    Should we allow expandable arguments for future targets (like cube maps)?

    -
    -
    -
    -

    RESOLVED: Not at this time. It would likely be difficult to express footprint -for cube maps, particularly access along the seams.

    -
    -
    -
    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-08-14

Daniel Koch

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shader_image_footprint.html + + +

extensions/NV/SPV_NV_shader_image_footprint.html

+ + diff --git a/extensions/NV/SPV_NV_shader_invocation_reorder.html b/extensions/NV/SPV_NV_shader_invocation_reorder.html index a12ae75..2526de1 100644 --- a/extensions/NV/SPV_NV_shader_invocation_reorder.html +++ b/extensions/NV/SPV_NV_shader_invocation_reorder.html @@ -1,1916 +1,12 @@ - - - - - - - -SPV_NV_shader_invocation_reorder - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shader_invocation_reorder

-
-
-
-
-

Contact

-
-
-

See Issues list in the Khronos SPIRV-Registry repository: -https://github.com/KhronosGroup/SPIRV-Registry

-
-
-
-
-

Contributors

-
-
-
    -
  • -

    Ashwin Lele, NVIDIA

    -
  • -
  • -

    Eric Werness, NVIDIA

    -
  • -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Pankaj Mistry, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-12-06

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the Unified SPIR-V Specification, -Version 1.6, Revision 1.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-

This extension requires SPV_KHR_ray_tracing.

-
-
-

This extension requires SPV_EXT_physical_storage_buffer, SPV_KHR_physical_storage_buffer -or SPIR-V 1.5.

-
-
-

This extension interacts with SPV_NV_ray_tracing_motion_blur.

-
-
-
-
-

Overview

-
-
-

This extension adds hit objects and reordering builtins to provide finer -grain control over traversal during ray tracing.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shader_invocation_reorder"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
ShaderInvocationReorderNV
-
-
-
-
-
-

New Storage Classes

-
-
-

Storage classes added under the ShaderInvocationReorderNV capability

-
-
-
-
HitObjectAttributeNV
-
-
-
-
-
-

New Decorations

-
-
-

This extension introduces new decorations:

-
-
-
-
HitObjectShaderRecordBufferNV
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the ShaderInvocationReorderNV capability

-
-
-
-
OpTypeHitObjectNV
-OpReorderThreadWithHintNV
-OpReorderThreadWithHitObjectNV
-OpHitObjectIsMissNV
-OpHitObjectIsHitNV
-OpHitObjectIsEmptyNV
-OpHitObjectGetRayTMinNV
-OpHitObjectGetRayTMaxNV
-OpHitObjectGetObjectRayOriginNV
-OpHitObjectGetObjectRayDirectionNV
-OpHitObjectGetWorldRayOriginNV
-OpHitObjectGetWorldRayDirectionNV
-OpHitObjectGetObjectToWorldNV
-OpHitObjectGetWorldToObjectNV
-OpHitObjectGetInstanceCustomIndexNV
-OpHitObjectGetInstanceIdNV
-OpHitObjectGetGeometryIndexNV
-OpHitObjectGetPrimitiveIndexNV
-OpHitObjectGetHitKindNV
-OpHitObjectGetAttributesNV
-OpHitObjectGetCurrentTimeNV
-OpHitObjectGetShaderBindingTableRecordIndexNV
-OpHitObjectGetShaderRecordBufferHandleNV
-OpHitObjectExecuteShaderNV
-OpHitObjectRecordMissNV
-OpHitObjectRecordMissMotionNV
-OpHitObjectRecordHitWithIndexNV
-OpHitObjectRecordHitWithIndexMotionNV
-OpHitObjectRecordHitNV
-OpHitObjectRecordHitMotionNV
-OpHitObjectRecordEmptyNV
-OpHitObjectTraceRayNV
-OpHitObjectTraceRayMotionNV
-
-
-
-
-
-

Modifications to the SPIR-V Specification

-
-
-
-
(Modify Section 2.2.1, Instructions )
-
-
-

Add OpTraceRayWithHitObjectNV, -OpHitObjectExecuteShaderNV to the list -of shader call instructions.

-
-
-
(Add the following terminology to section 2.2.2, Types)
-
-
-
-
-

Hit object type: The type returned by OpTypeHitObjectNV.

-
-
-
-
-
(Add to the list of opaque types in section 2.2.2, Types)
-
-
-
-
-
    -
  • -

    OpTypeHitObjectNV

    -
  • -
-
-
-
-
-
(Modify Section 3.2, Decorations, adding a row to the Decoration table)
-
-
-
- ----- - - - - - - - - - - - - - -
DecorationRequires

5386

ShaderInvocationReorderNV

HitObjectShaderRecordBufferNV

-
-
-
-
(Modify Section 3.7, Storage Class, adding rows to the Storage Class table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
Storage ClassEnabling CapabilitiesEnabled by Extension

5385

HitObjectAttributeNV
-Used for storing attributes of geometry intersected by a ray to be passed on to -hit object instructions. Visible across all functions in the current invocation. -Not shared externally. Variables declared with this storage class can be both read and written to, but cannot have initializers. -Only allowed in RayGenerationKHR, ClosestHitKHR, and MissKHR execution models.

ShaderInvocationReorderNV

SPV_NV_shader_invocation_reorder

-
-
-
-
(Modify Section 3.31, Capability, adding a row to the Capability table)
-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly Declares

5383

ShaderInvocationReorderNV

RayTracingKHR

-
-
-
-
(Add the following line to the description of OpTypePointer, in Section 3.32.6, Type-Declaration Instructions)
-
-
-
-
-

If Type is OpTypeHitObjectNV, Storage Class must be Private or Function.

-
-
-
-
-
(Add the following line to the description of OpStore and OpLoad, in Section 3.32.8, Memory Instructions)
-
-
-
-
-

The Type operand to the OpTypePointer used for Pointer must not be OpTypeHitObjectNV.

-
-
-
-
-
(Add the following line to the description of OpCopyMemory and OpCopyMemorySized, in Section 3.32.8, Memory Instructions)
-
-
-
-
-

The Type operand to the OpTypePointer used for Target or Source must not be OpTypeHitObjectNV.

-
-
-
-
-
(Add a new sub section 3.36.Reorder Instructions)
-
-
-
- ------ - - - - - - - - - - - - -

OpReorderThreadWithHintNV
-
- Reorder threads based on user provided hint. Similar Hint values indicate similarity of subsequent work done after this call. Behavior is implementation-defined.
-
- Hint must be a 32-bit integer-type scalar.
-
- Bits must be a 32-bit integer-type scalar.
-
- This instruction is allowed only in RayGenerationKHR execution models.

Capability:
-ShaderInvocationReorderNV

3

5280

<id> Hint

<id> Bits

- ------- - - - - - - - - - - - - - -

OpReorderThreadWithHitObjectNV
-
- Reorder threads based on the hit object, supplemented by Hint and Bits if those optional values are provided. Behavior is implementation-defined.
-
- Hit Object must be a pointer to hit object used to reorder threads.
-
- Hint must be a 32-bit integer-type scalar.
-
- Bits must be a 32-bit integer-type scalar.
-
- Hint and Bits are optional together, i.e. either both Hint and Bits - should be provided or neither.
-
- This instruction is allowed only in RayGenerationKHR execution models.
-

Capability:
-ShaderInvocationReorderNV

2 + variable

5279

<id> Hit Object

Optional <id> Hint

Optional <id> Bits

-
-
-
-
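The "2 + variable" word count of OpReorderThreadWithHitObjectNV follows from the both-or-neither rule for Hint and Bits. A minimal sketch, assuming the standard SPIR-V layout where the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits; the function name `encode_reorder_with_hit_object` is hypothetical, and the opcode value is the one listed in the table above.

```python
OP_REORDER_THREAD_WITH_HIT_OBJECT_NV = 5279  # opcode from the table above


def encode_reorder_with_hit_object(hit_object_id, hint_id=None, bits_id=None):
    """Return the instruction as a list of 32-bit words (illustrative only)."""
    # Hint and Bits are optional together: both present or both absent.
    if (hint_id is None) != (bits_id is None):
        raise ValueError("Hint and Bits must be provided together or not at all")

    operands = [hit_object_id]
    if hint_id is not None:
        operands += [hint_id, bits_id]

    # Word count includes the leading opcode word itself: 2 or 4 here.
    word_count = 1 + len(operands)
    first_word = (word_count << 16) | OP_REORDER_THREAD_WITH_HIT_OBJECT_NV
    return [first_word, *operands]
```

With only a hit object the instruction is 2 words long; with Hint and Bits it is 4 words. A length of 3 words cannot occur under this rule.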
(Add a new sub section 3.36.Hit Object Instructions)
-
-
-

The semantics of the arguments of OpHitObjectTraceRayNV and OpHitObjectTraceRayMotionNV -are the same as those with the same names of OpTraceRayKHR and -OpTraceRayMotionNV as defined in the SPV_KHR_ray_tracing and -SPV_NV_ray_tracing_motion_blur extensions, respectively.

-
-
-
-
-
-
- ---------------- - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectTraceRayNV
-
- Traces a ray and triggers execution of any-hit or intersection shaders and populates resulting hit or miss information in the hit object.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the ray flag values. Refer to the client API specification for details. -
- Cull Mask is the mask to test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload is a pointer to the ray payload structure to use for this trace. Payload must be the result of an OpVariable with a storage class of RayPayloadKHR or IncomingRayPayloadKHR.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Only the 4 least-significant bits of SBT Offset and SBT Stride are used by this instruction - other bits are ignored. -
- Only the 16 least-significant bits of Miss Index are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-
- This instruction is a shader call instruction which may invoke shaders with the IntersectionKHR and AnyHitKHR execution models.
-

Capability:
-ShaderInvocationReorderNV

13

5260

<id> Hit Object

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Payload

- ----------------- - - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectTraceRayMotionNV
-
- Traces a ray and triggers execution of any-hit or intersection shaders and populates resulting hit or miss information in the hit object.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Ray Flags contains one or more of the ray flag values. Refer to the client API specification for details. -
- Cull Mask is the mask to test against the instance mask.
-
- SBT Offset and SBT Stride control indexing into the SBT for hit shaders called from this trace. - SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- Miss Index is the index of the miss shader to be called from this trace call.
-
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray to be traced.
-
- Payload is a pointer to the ray payload structure to use for this trace. Payload must be the result of an OpVariable with a storage class of RayPayloadKHR or IncomingRayPayloadKHR.
-
- Ray Flags, Cull Mask, SBT Offset, SBT Stride, and Miss Index must be a 32-bit integer type scalar.
-
- Only the 8 least-significant bits of Cull Mask are used by this instruction - other bits are ignored. -
- Only the 4 least-significant bits of SBT Offset and SBT Stride are used by this instruction - other bits are ignored. -
- Only the 16 least-significant bits of Miss Index are used by this instruction - other bits are ignored. -
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models with SPV_NV_ray_tracing_motion_blur extension.
-
- This instruction is a shader call instruction which may invoke shaders with the IntersectionKHR and AnyHitKHR execution models.
-

Capability:
-ShaderInvocationReorderNV, RayTracingMotionBlurNV

14

5256

<id> Hit Object

<id> Acceleration Structure

<id> Ray Flags

<id> Cull Mask

<id> SBT Offset

<id> SBT Stride

<id> Miss Index

<id> Ray Origin

<id> Ray Tmin

<id> Ray Direction

<id> Ray Tmax

<id> Current Time

<id> Payload

- ----------------- - - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectRecordHitNV
-
- Populates the hit object to represent a hit without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Instance Id refers to the index of the instance within Acceleration Structure which - is to be encoded in the hit object.
-
- Instance Id must be a 32 bit integer type scalar. -
- Primitive Id refers to the index of the primitive within Acceleration Structure which - is to be encoded in the hit object.
-
- Primitive Id must be a 32 bit integer type scalar. -
- Geometry Index refers to the index of the geometry within Acceleration Structure which - is to be encoded in the hit object.
-
- Geometry Index must be a 32 bit integer type scalar. -
- Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.
-
- Hit Kind must be a 32 bit unsigned integer type scalar. -
- SBT Record Offset and SBT Record Stride control indexing into the SBT to determine the closest-hit shader to be encoded in the hit object. - SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- SBT Record Offset and SBT Record Stride must be a 32 bit integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Hit Object Attributes contains the attributes of the hit which are to be encoded in Hit Object. This must be an OpVariable in HitObjectAttributeNV storage class.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-

Capability:
-ShaderInvocationReorderNV

14

5261

<id> Hit Object

<id> Acceleration Structure

<id> Instance Id

<id> Primitive Id

<id> Geometry Index

<id> Hit Kind

<id> SBT Record Offset

<id> SBT Record Stride

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

<id> Hit Object Attributes

- ------------------ - - - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectRecordHitMotionNV
-
- Populates the hit object to represent a hit without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
-
- Instance Id refers to the index of the instance within Acceleration Structure which - is to be encoded in the hit object.
-
- Instance Id must be a 32 bit integer type scalar. -
- Primitive Id refers to the index of the primitive within Acceleration Structure which - is to be encoded in the hit object.
-
- Primitive Id must be a 32 bit integer type scalar. -
- Geometry Index refers to the index of the geometry within Acceleration Structure which - is to be encoded in the hit object.
-
- Geometry Index must be a 32 bit integer type scalar. -
- Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.
-
- Hit Kind must be a 32 bit unsigned integer type scalar. -
- SBT Record Offset and SBT Record Stride control indexing into the SBT to determine the closest-hit shader to be encoded in the hit object. - SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- SBT Record Offset and SBT Record Stride must be a 32 bit integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- Hit Object Attributes contains the attributes of the hit which are to be encoded in Hit Object. This must be an OpVariable in HitObjectAttributeNV storage class.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models with SPV_NV_ray_tracing_motion_blur extension.
-

Capability:
-ShaderInvocationReorderNV, RayTracingMotionBlurNV

15

5249

<id> Hit Object

<id> Acceleration Structure

<id> Instance Id

<id> Primitive Id

<id> Geometry Index

<id> Hit Kind

<id> SBT Record Offset

<id> SBT Record Stride

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

<id> Current Time

<id> Hit Object Attributes

- ---------------- - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectRecordHitWithIndexNV
-
- Encodes the hit object to represent a hit without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
- Instance Id refers to the index of the instance within Acceleration Structure which - is to be encoded in the hit object.
-
- Instance Id must be a 32 bit integer type scalar. -
- Primitive Id refers to the index of the primitive within Acceleration Structure which - is to be encoded in the hit object.
-
- Primitive Id must be a 32 bit integer type scalar. -
- Geometry Index refers to the index of the geometry within Acceleration Structure which - is to be encoded in the hit object.
-
- Geometry Index must be a 32 bit integer type scalar. -
- Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.
-
- Hit Kind must be a 32 bit unsigned integer type scalar. -
- SBT Index is the record index for the closest-hit shader in the SBT to encode into the - hit object. -
- SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- SBT Index must be a 32 bit unsigned integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Hit Object Attributes contains the attributes of the hit which are to be encoded in Hit Object. This must be an OpVariable in HitObjectAttributeNV storage class.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-

Capability:
-ShaderInvocationReorderNV

13

5262

<id> Hit Object

<id> Acceleration Structure

<id> Instance Id

<id> Primitive Id

<id> Geometry Index

<id> Hit Kind

<id> SBT Index

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

<id> Hit Object Attributes

- ----------------- - - - - - - - - - - - - - - - - - - - - - - - -

OpHitObjectRecordHitWithIndexMotionNV
-
- Encodes the hit object to represent a hit without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Acceleration Structure is the descriptor for the acceleration structure to trace into.
- Instance Id refers to the index of the instance within Acceleration Structure which - is to be encoded in the hit object.
-
- Instance Id must be a 32 bit integer type scalar. -
- Primitive Id refers to the index of the primitive within Acceleration Structure which - is to be encoded in the hit object.
-
- Primitive Id must be a 32 bit integer type scalar. -
- Geometry Index refers to the index of the geometry within Acceleration Structure which - is to be encoded in the hit object.
-
- Geometry Index must be a 32 bit integer type scalar. -
- Hit Kind is the integer hit kind reported back to other shaders and accessible by the hit kind builtin.
-
- Hit Kind must be a 32 bit unsigned integer type scalar. -
- SBT Index is the record index for the closest-hit shader in the SBT to encode into the - hit object. -
- SBT stands for Shader Binding Table. Refer to the client API specification for details.
-
- SBT Index must be a 32 bit unsigned integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- Hit Object Attributes contains the attributes of the hit which are to be encoded in Hit Object. This must be an OpVariable in HitObjectAttributeNV storage class.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models with SPV_NV_ray_tracing_motion_blur extension.
-

Capability:
-ShaderInvocationReorderNV, RayTracingMotionBlurNV

14

5250

<id> Hit Object

<id> Acceleration Structure

<id> Instance Id

<id> Primitive Id

<id> Geometry Index

<id> Hit Kind

<id> SBT Index

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

<id> Current Time

<id> Hit Object Attributes

- ---------- - - - - - - - - - - - - - - - - -

OpHitObjectRecordMissNV
-
- Encodes the hit object to represent a miss without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Miss Index is the index of the miss shader to be encoded in the hit object.
-
- Miss Index must be a 32-bit unsigned integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-

Capability:
-ShaderInvocationReorderNV

7

5263

<id> Hit Object

<id> Miss Index

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

- ----------- - - - - - - - - - - - - - - - - - -

OpHitObjectRecordMissMotionNV
-
- Encodes the hit object to represent a miss without tracing a ray.
-
- Hit Object is a pointer to the hit object.
-
- Miss Index is the index of the miss shader to be encoded in the hit object.
-
- Miss Index must be a 32-bit unsigned integer type scalar. -
- Ray Origin, Ray Tmin, Ray Direction, and Ray Tmax control the basic parameters of the ray.
-
- Ray Origin and Ray Direction must be a 32-bit float type 3-component vector.
-
- Ray Tmin and Ray Tmax must be a 32-bit float type scalar.
-
- Current Time must be a 32-bit float type scalar.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models with SPV_NV_ray_tracing_motion_blur extension.
-

Capability:
-ShaderInvocationReorderNV, RayTracingMotionBlurNV

8

5251

<id> Hit Object

<id> Miss Index

<id> Ray Origin

<id> Ray TMin

<id> Ray Direction

<id> Ray TMax

<id> Current Time

- ----- - - - - - - - - - - - -

OpHitObjectRecordEmptyNV
-
- Encodes the hit object to represent an empty hit object which is neither a hit nor a miss.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

2

5259

<id> Hit Object

- ------ - - - - - - - - - - - - -

OpHitObjectExecuteShaderNV
-
- Executes the closest-hit or miss shader as encoded in the hit object.
-
- Hit Object is a pointer to the hit object.
-
- Payload is a pointer to the ray payload structure to use for this trace. Payload must be the result of an OpVariable with a storage class of RayPayloadKHR or IncomingRayPayloadKHR.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.
-
- This instruction is a shader call instruction which may invoke shaders with the -ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

3

5264

<id> Hit Object

<id> Payload

- ------- - - - - - - - - - - - - - -

OpHitObjectGetCurrentTimeNV
-
- Returns the current time value encoded in the hit object.
-
- Result is the current time value as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5265

<id> Result Type

<id> Result

<id> Hit Object

- ------ - - - - - - - - - - - - -

OpHitObjectGetAttributesNV
-
- Returns the attributes as encoded in the hit object.
-
- Hit Object is a pointer to the hit object.
-
- Hit Object Attributes contains the attributes of the hit which are to be encoded in Hit Object. This must be an OpVariable in HitObjectAttributeNV storage class.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

3

5266

<id> Hit Object

<id> Hit Object Attributes

- ------- - - - - - - - - - - - - - -

OpHitObjectGetHitKindNV
-
- Returns an unsigned integer value indicating whether the hit encoded in the hit object is with the front - face or back face of a primitive.
-
- Result is 0xFE if the hit encoded in the hit object is with a front-facing primitive, or 0xFF if it is with a back-facing primitive.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5267

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetPrimitiveIndexNV
-
- Returns the primitive index as encoded in the hit object.
-
- Result is the primitive index as encoded in the hit object.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5268

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetGeometryIndexNV
-
- Returns the geometry index as encoded in the hit object.
-
- Result is the geometry index as encoded in the hit object.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5269

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetInstanceIdNV
-
- Returns the instance id as encoded in the hit object.
-
- Result is the instance id as encoded in the hit object.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5270

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetInstanceCustomIndexNV
-
- Returns the application specified custom index value as encoded in the hit object.
-
- Result is the application specified custom index value as encoded in the hit object.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5271

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetObjectRayOriginNV
-
- Returns the object-space ray origin as encoded in the hit object.
-
- Result is the object-space ray origin as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5255

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetObjectRayDirectionNV
-
- Returns the object-space ray direction as encoded in the hit object.
-
- Result is the object-space ray direction as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5254

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetWorldRayDirectionNV
-
- Returns the world-space ray direction as encoded in the hit object.
-
- Result is the world-space ray direction as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5272

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetWorldRayOriginNV
-
- Returns the world-space ray origin as encoded in the hit object.
-
- Result is the world-space ray origin as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type 3-component vector.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5273

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetObjectToWorldNV
-
- Returns a matrix that transforms values from object-space to world-space as encoded in the hit object.
-
- Result is the matrix.
-
- Result Type must be a matrix with a Column Count of 4, and a Column Type that is a vector type with a Component Type that is a 32-bit floating-point type and a Component Count of 3.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5253

<id> Result Type

<id> Result

<id> Hit Object
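The matrix shape described above (four columns, each a 3-component vector) can be illustrated with a small host-side model. This is only a sketch in Python, not shader code: the helper names `transform_point` and `transform_direction` are hypothetical, and it assumes the fourth column carries the translation, as is conventional for a 4-column, 3-row transform.

```python
def transform_point(columns, point):
    """Apply a 4-column x 3-row matrix, given as four 3-component columns, to a point.

    Assumption: columns[3] is the translation column.
    """
    c0, c1, c2, c3 = columns
    x, y, z = point
    return tuple(c0[i] * x + c1[i] * y + c2[i] * z + c3[i] for i in range(3))


def transform_direction(columns, direction):
    """Apply only the 3x3 part, so directions are not translated."""
    c0, c1, c2, _ = columns
    x, y, z = direction
    return tuple(c0[i] * x + c1[i] * y + c2[i] * z for i in range(3))


# Identity rotation with a translation of (1, 2, 3).
object_to_world = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3)]
world_point = transform_point(object_to_world, (1, 1, 1))
world_dir = transform_direction(object_to_world, (1, 1, 1))
```

Points pick up the translation column while directions do not, which is why the two helpers are kept separate.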

- ------- - - - - - - - - - - - - - -

OpHitObjectGetWorldToObjectNV
-
- Returns a matrix that transforms values from world-space to object-space as encoded in the hit object.
-
- Result is the matrix.
-
- Result Type must be a matrix with a Column Count of 4, and a Column Type that is a vector type with a Component Type that is a 32-bit floating-point type and a Component Count of 3.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5252

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetRayTMaxNV
-
- Returns the Ray Tmax value encoded in the hit object.
- Semantics are similar to RayTMaxKHR builtin as defined in SPV_KHR_ray_tracing. -
- Result is the Ray Tmax value as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5274

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetRayTMinNV
-
- Returns the Ray Tmin value encoded in the hit object.
- Semantics are similar to RayTMinKHR builtin as defined in SPV_KHR_ray_tracing. -
- Result is the Ray Tmin value as encoded in the hit object.
-
- Result Type must be a 32-bit floating-point type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5275

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetShaderBindingTableRecordIndexNV
-
- Returns the index of the record in the shader binding table as encoded in the hit object.
-
- Result is the shader binding table record index as encoded in the hit object.
-
- Result Type must be a 32-bit integer type scalar.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5258

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectGetShaderRecordBufferHandleNV
-
- Returns the address of the shader record buffer for the hit or miss record encoded in the hit object.
-
- Result is the address of the data in the shader record as encoded in the hit object.
-
- Result Type must be a 32-bit integer type 2-component vector.
-
- Hit Object is a pointer to the hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5257

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectIsEmptyNV
-
- Returns a boolean indicating whether the hit object is an empty hit object.
-
- Result is true if hit object encodes a NOP, false otherwise.
-
- Result Type must be a boolean type scalar.
-
- Hit Object must be a pointer to hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5276

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectIsHitNV
-
- Returns a boolean indicating whether the hit object has encoded a hit.
-
- Result is true if hit object encodes a hit, false otherwise.
-
- Result Type must be a boolean type scalar.
-
- Hit Object must be a pointer to hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5277

<id> Result Type

<id> Result

<id> Hit Object

- ------- - - - - - - - - - - - - - -

OpHitObjectIsMissNV
-
- Returns a boolean indicating whether the hit object has encoded a miss.
-
- Result is true if hit object encodes a miss, false otherwise.
-
- Result Type must be a boolean type scalar.
-
- Hit Object must be a pointer to hit object.
-
- This instruction is allowed only in RayGenerationKHR, ClosestHitKHR and MissKHR execution models.

Capability:
-ShaderInvocationReorderNV

4

5278

<id> Result Type

<id> Result

<id> Hit Object

-
-
-
-
-
(Modify Section 3.36.6, Type-Declaration Instructions, adding a new table)
-
-
-
- ----- - - - - - - - - - - - -

OpTypeHitObjectNV
-
-Declares a hit object type which is an opaque object representing state during -ray tracing traversal.

-

This type is opaque: values of this type have no defined physical size or -bit pattern.

Capability:
-ShaderInvocationReorderNV

2

5281

<id> Result

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shader_invocation_reorder"
-
-
-
-
-
-

Interactions with SPV_NV_ray_tracing_motion_blur

-
-
-

If the SPV_NV_ray_tracing_motion_blur extension is not supported, the -OpHitObjectTraceRayMotionNV, OpHitObjectRecordHitMotionNV, -OpHitObjectRecordHitWithIndexMotionNV, and OpHitObjectRecordMissMotionNV -instructions are not supported.

-
-
-
-
-

Issues

-
-
-

None

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-09-01

Ashwin Lele

Internal revisions

2

2023-12-06

Daniel Koch

Remove references to non-existent SPIR-V definitions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shader_invocation_reorder.html + + +

extensions/NV/SPV_NV_shader_invocation_reorder.html

+ + diff --git a/extensions/NV/SPV_NV_shader_sm_builtins.html b/extensions/NV/SPV_NV_shader_sm_builtins.html index 8918b7e..7dbd8ff 100644 --- a/extensions/NV/SPV_NV_shader_sm_builtins.html +++ b/extensions/NV/SPV_NV_shader_sm_builtins.html @@ -1,322 +1,12 @@ - - - - - - - -SPV_NV_shader_sm_builtins - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shader_sm_builtins

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2019-05-28

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-
-
-

Overview

-
-
-

This extension adds several new shader builtins that are available -under the ShaderSMBuiltinsNV capability.

-
-
-

The new builtins provide the ability to query information about the number -of warps and streaming multiprocessors (SMs) in a system, and the ability -to identify the warp and SM that a shader invocation is executing on.

-
-
-

This SPIR-V extension provides support for the GLSL GL_NV_shader_sm_builtins -extension.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shader_sm_builtins"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.4

-
-
-

Validation Rules

-
-

An OpExtension must be added to the SPIR-V for validation layers to -check legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shader_sm_builtins"
-
-
-
-
-

Builtin

-
-
-
(Modify Section 3.21, Builtin, adding rows to the Builtin table)
-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltinEnabling Capabilities

5374

WarpsPerSMNV
-The maximum number of warps executing simultaneously on a streaming -multiprocessor.

ShaderSMBuiltinsNV

5375

SMCountNV
-The number of streaming multiprocessors on the physical device.

ShaderSMBuiltinsNV

5376

WarpIDNV
-An integer index in the range [0, WarpsPerSMNV) such that the execution of -two warps with the same warp ID and the same SM ID do not overlap.

ShaderSMBuiltinsNV

5377

SMIDNV
-An integer index in the range [0, SMCountNV) uniquely identifying which -streaming multiprocessor the invocation is executing on.

ShaderSMBuiltinsNV
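One way to read the ranges above: because WarpIDNV lies in [0, WarpsPerSMNV) and SMIDNV lies in [0, SMCountNV), the pair can be flattened into a single slot index. The following is a host-side Python model only, not shader code; the function name `global_warp_slot` and the idea of using the result as a per-warp scratch slot are assumptions for illustration, not something the extension defines.

```python
def global_warp_slot(sm_id, warp_id, warps_per_sm, sm_count):
    """Flatten (SM ID, warp ID) into one index in [0, sm_count * warps_per_sm).

    Hypothetical helper: the arguments stand in for the SMIDNV, WarpIDNV,
    WarpsPerSMNV and SMCountNV builtin values.
    """
    if not 0 <= sm_id < sm_count:
        raise ValueError("sm_id outside [0, sm_count)")
    if not 0 <= warp_id < warps_per_sm:
        raise ValueError("warp_id outside [0, warps_per_sm)")
    return sm_id * warps_per_sm + warp_id


# Every (SM, warp) pair maps to a distinct slot.
slots = {global_warp_slot(s, w, 4, 8) for s in range(8) for w in range(4)}
```

Since two warps with the same warp ID and SM ID do not execute at the same time, a slot computed this way would not be shared by two concurrently running warps under these assumptions.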

-
-
-
-
-
-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityDepends On

5373

ShaderSMBuiltinsNV
-Enables the WarpsPerSMNV, SMCountNV, WarpIDNV, and SMIDNV builtin decorations.

Shader

-
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    Does this extension really require SPIR-V 1.3?

    -
    -
    -
    -

    RESOLVED: Not directly, but it’s assuming that the rest of the subgroup -functionality is already present, otherwise it would have included more -builtins, etc. For simplicity, we’ll just say it requires SPIR-V 1.3.

    -
    -
    -
    -
  2. -
  3. -

    Why are we using the term "ID" instead of "Index"?

    -
    -
    -
    -

    RESOLVED: We choose ID to match the GLSL and GL extensions.

    -
    -
    -
    -
  4. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2019-05-28

Daniel Koch

Internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shader_sm_builtins.html + + +

extensions/NV/SPV_NV_shader_sm_builtins.html

+ + diff --git a/extensions/NV/SPV_NV_shader_subgroup_partitioned.html b/extensions/NV/SPV_NV_shader_subgroup_partitioned.html index f52c491..2de77c1 100644 --- a/extensions/NV/SPV_NV_shader_subgroup_partitioned.html +++ b/extensions/NV/SPV_NV_shader_subgroup_partitioned.html @@ -1,469 +1,12 @@ - - - - - - - -SPV_NV_shader_subgroup_partitioned - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shader_subgroup_partitioned

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-03-14

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3 Revision 1.

-
-
-

This extension requires SPIR-V 1.3.

-
-
-

This extension provides SPIR-V support for the GL_NV_shader_subgroup_partitioned -GLSL extension.

-
-
-
-
-

Overview

-
-
-

This extension adds new subgroup functionality to support the Vulkan -GL_NV_shader_subgroup_partitioned GLSL extension.

-
-
-

OpGroupNonUniformPartitionNV is a new instruction that computes a -partition (a ballot value indicating which other invocations in the -subgroup have the same value of the operand).

-
-
-

PartitionedReduceNV, PartitionedInclusiveScanNV, and -PartitionedExclusiveScanNV are new GroupOperation enum values that -select the partitioned reduce/scan functionality.

-
-
-

GroupNonUniformPartitionedNV is a capability that indicates a module -uses these new features.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shader_subgroup_partitioned"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces the following new capabilities:

-
-
-
-
GroupNonUniformPartitionedNV
-
-
-
-
-
-

New Decorations

-
-
-

None

-
-
-
-
-

New Builtins

-
-
-

None.

-
-
-
-
-

New Instructions

-
-
-
-
OpGroupNonUniformPartitionNV
-
-
-
-
-
-

Token Number Assignments

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NameValueUsage

GroupNonUniformPartitionedNV

5297

Capability

OpGroupNonUniformPartitionNV

5296

Opcode

PartitionedReduceNV

6

GroupOperation

PartitionedInclusiveScanNV

7

GroupOperation

PartitionedExclusiveScanNV

8

GroupOperation

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-

(Add to the table in 3.28, Group Operation):

-
- ----- - - - - - - - - - - - - - - - - - -

6

PartitionedReduceNV
-A reduction operation performed across the invocations in a subset of a -partition value, with a unique value computed for each subset.

GroupNonUniformPartitionedNV

7

PartitionedInclusiveScanNV
-An inclusive scan operation performed across the invocations in a subset of a -partition value, with a unique value computed for each subset.

GroupNonUniformPartitionedNV

8

PartitionedExclusiveScanNV
-An exclusive scan operation performed across the invocations in a subset of a -partition value, with a unique value computed for each subset.

GroupNonUniformPartitionedNV
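The three partitioned group operations can be modelled on the host to show what "a unique value computed for each subset" means. This is a rough Python sketch under stated assumptions: all invocations are active, each ballot is a plain integer bitmask rather than a 4-component vector, the operation is integer addition, and the function name `partitioned_add` is hypothetical.

```python
def partitioned_add(values, ballots, mode):
    """Model a partitioned add over a subgroup.

    values[i]  -- the operand of invocation i
    ballots[i] -- bitmask naming the invocations in i's subset (assumed valid)
    mode       -- "reduce", "inclusive" or "exclusive"
    """
    results = []
    for i, ballot in enumerate(ballots):
        members = [j for j in range(len(values)) if (ballot >> j) & 1]
        if mode == "reduce":
            chosen = members                          # whole subset
        elif mode == "inclusive":
            chosen = [j for j in members if j <= i]   # subset members up to and including i
        elif mode == "exclusive":
            chosen = [j for j in members if j < i]    # subset members strictly before i
        else:
            raise ValueError("unknown mode: " + mode)
        results.append(sum(values[j] for j in chosen))
    return results


# Invocations {0, 2} form one subset and {1, 3} the other.
values = [1, 2, 3, 4]
ballots = [0b0101, 0b1010, 0b0101, 0b1010]
reduced = partitioned_add(values, ballots, "reduce")
```

In this model the exclusive scan yields 0 (the identity for addition) for the first invocation of each subset.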

-
-

Add: -"The ballot parameter to the partitioned operations must form a valid -partition of the active invocations in the subgroup. The values of ballot -are a valid partition if:

-
-
-
    -
  • -

    for each active invocation i, the bit corresponding to i is -set in i's value of ballot, and

    -
  • -
  • -

    for any two active invocations i and j, if the bit -corresponding to invocation j is set in invocation i's value -of ballot, then invocation j's value of ballot must equal -invocation i's value of ballot, and

    -
  • -
  • -

    bits not corresponding to any invocation in the subgroup are -ignored.

    -
  • -
-
-
-

If two active invocations i and j have the same value of ballot, -they are said to be "in the same subset of the partition"."
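The validity conditions above can be checked mechanically. The sketch below is a host-side Python model, not part of the extension: ballots are plain integer bitmasks, and the names `is_valid_partition`, `active` and `subgroup_size` are assumptions introduced for illustration.

```python
def is_valid_partition(ballots, active, subgroup_size):
    """Check the partition rules for the active invocations.

    ballots       -- dict mapping invocation index to its ballot bitmask
    active        -- iterable of active invocation indices
    subgroup_size -- number of invocations in the subgroup
    """
    active = list(active)
    # Bits that do not correspond to any invocation in the subgroup are ignored.
    keep = (1 << subgroup_size) - 1
    masked = {i: ballots[i] & keep for i in active}
    for i in active:
        # Each active invocation must have its own bit set in its ballot.
        if not (masked[i] >> i) & 1:
            return False
        # If j's bit is set in i's ballot, both ballots must be equal.
        for j in active:
            if (masked[i] >> j) & 1 and masked[j] != masked[i]:
                return False
    return True


ok = is_valid_partition({0: 0b0101, 1: 0b1010, 2: 0b0101, 3: 0b1010}, range(4), 4)
```

Two invocations whose masked ballots compare equal are then "in the same subset of the partition" in the sense defined above.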

-
-
-
-
(Modify Section 3.32.21, Group Instructions, adding to the end of the list of instructions)
-
-
-
- ------- - - - - - - - - - - - - - -

OpGroupNonUniformPartitionNV
-
-Computes a ballot result that is a valid partition of the active invocations -such that all invocations in each subset of the partition have the same value -of value. For any two invocations in different subsets of the partition, -either their values of value must not be equal or one must be a floating-point NaN.
-
-Value must be a scalar or vector type.
-
-Result Type must be a 4 component vector of 32 bit integer types.
-
-Result is a set of bitfields where the first invocation is represented -in bit 0 of the first vector component and the last (up to SubgroupSize) -is the higher bit number of the last bitmask needed to represent all -bits of the subgroup invocations.

Capability:
-GroupNonUniformPartitionedNV

4

5296

<id> Result Type

<id> Result

<id> Value
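The behaviour described for OpGroupNonUniformPartitionNV can be modelled on the host as grouping invocations by equal value. This is an illustrative Python sketch only: the ballot is a single integer bitmask instead of a 4-component vector, and the name `partition_ballots` is hypothetical.

```python
def partition_ballots(values, active):
    """Return, for each active invocation, a bitmask of the active
    invocations that hold an equal value.

    A NaN never compares equal to anything, so an invocation holding NaN
    lands in a subset of its own, which the wording above permits.
    """
    active = list(active)
    ballots = {}
    for i in active:
        mask = 0
        for j in active:
            # j == i guarantees the invocation's own bit is set, even for NaN.
            if j == i or values[j] == values[i]:
                mask |= 1 << j
        ballots[i] = mask
    return ballots


ballots = partition_ballots([7, 9, 7, float("nan")], range(4))
```

Because equality is symmetric and transitive for non-NaN values, every invocation in a subset ends up with the same mask.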

-
-
-
-
(Modify Section 3.32.21, Group Instructions, modify each GroupNonUniformArithmetic instruction)
-
-

Add an optional operand "Optional <id> ballot".

-
-
-
-
-

Add "If Operation is PartitionedReduceNV, PartitionedInclusiveScanNV, or -PartitionedExclusiveScanNV, ballot must be specified. ballot specifies -the partition of invocations to use when computing a partitioned operation."

-
-
-

Add GroupNonUniformPartitionedNV to the capability list.

-
-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5297

GroupNonUniformPartitionedNV
-Uses partitioned subgroup operations.

Shader

SPV_NV_shader_subgroup_partitioned

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shader_subgroup_partitioned"
-
-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-03-14

Jeff Bolz

Initial draft

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shader_subgroup_partitioned.html + + +

extensions/NV/SPV_NV_shader_subgroup_partitioned.html

+ + diff --git a/extensions/NV/SPV_NV_shading_rate.html b/extensions/NV/SPV_NV_shading_rate.html index 784e28f..9160f3d 100644 --- a/extensions/NV/SPV_NV_shading_rate.html +++ b/extensions/NV/SPV_NV_shading_rate.html @@ -1,356 +1,12 @@ - - - - - - - -SPV_NV_shading_rate - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_shading_rate

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
  • -

    Pat Brown, NVIDIA

    -
  • -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2018-09-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.3, Revision 2, Unified.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension provides SPIR-V support for the GLSL GL_NV_shading_rate_image -extension.

-
-
-

In the corresponding API extensions, applications can use a texture -to control the number of fragment shader invocations that will be spawned -for a particular neighborhood of covered pixels. We refer to the density -of fragment shader invocations as the "shading rate".

-
-
-

This extension adds support for two new fragment shader built-ins under the -new ShadingRateNV capability. These built-ins can be used to determine -the shading rate used when executing the fragment shader.

-
-
-

A FragmentSizeNV decorated variable will represent the size of a rectangle -of pixels that is being shaded by this fragment shader invocation.

-
-
-

An InvocationsPerPixelNV decorated variable will represent the maximum number -of fragment shader invocations executed for each pixel.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_shading_rate"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.3

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add new rows to the Builtin table)

-
- ------ - - - - - - - - - - - - - - - - - - - - - -
BuiltInEnabling CapabilitiesEnabled by Extension

5292

FragmentSizeNV
-Input that represents the size of a rectangle of pixels corresponding to this -invocation. Only valid in the Fragment Execution Model. -See the API specification for more detail.

ShadingRateNV

SPV_NV_shading_rate

5293

InvocationsPerPixelNV
-Input that represents the maximum number of fragment shader invocations -executed for each pixel, as derived from the effective shading rate for the -fragment. Only valid in the Fragment Execution Model. -See the API specification for more detail.

ShadingRateNV

SPV_NV_shading_rate

-
-
-
-
(Modify Section 3.31, Capability, adding a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
CapabilityDepends OnEnabled by Extension

5291

ShadingRateNV
-Uses the FragmentSizeNV or InvocationsPerPixelNV Builtins.

Shader

SPV_NV_shading_rate

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_shading_rate"
-
-
-
-
-
-

Issues

-
-
-
    -
  1. -

    What are we going to do for interactions with ARB_fragment_shader_interlock? -We don’t yet have a SPV extension for that.

    -
    -
    -
    -

    RESOLVED: Deferred to be added in an interaction with a future SPV/Vulkan -extension that adds support for pixel and sample exclusive access modes. -When this extension is supported, a third shading rate exclusive access mode -will be needed.

    -
    -
    -
    -
  2. -
  3. -

    How should we name the built-in variable describing the number of pixels -covered by a given fragment?

    -
    -
    -
    -

    RESOLVED: We are using FragmentSizeNV to mirror the "gl_FragmentSizeNV" -GLSL builtin from GL_NV_shading_rate_image. In retrospect it might have -been more consistent with existing naming conventions to call it FragSizeNV -instead. There are a number of other built-ins that have "Frag" in the name -(FragCoord, FragDepth, and FragStencilRefEXT), but none that have Fragment. -A future extension which promotes this functionality may wish to rename it as -an alias.

    -
    -
    -
    -
  4. -
  5. -

    Why is the SPIR-V extension named NV_shading_rate (without "image") but -the Vulkan API and GLSL extensions are called NV_shading_rate_image?

    -
    -
    -
    -

    RESOLVED: -The API extensions add the "shading rate image" to control the fragment -shading rate, however the GLSL/SPV only add builtins, so it is strange to -include "image" in the name. Unfortunately the GLSL portion was already -baked so it didn’t get the chance to drop the "_image" in time.

    -
    -
    -
    -
  6. -
-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2018-09-12

Daniel Koch

internal revisions

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_shading_rate.html + + +

extensions/NV/SPV_NV_shading_rate.html

+ + diff --git a/extensions/NV/SPV_NV_stereo_view_rendering.html b/extensions/NV/SPV_NV_stereo_view_rendering.html index 44a61a5..726418a 100644 --- a/extensions/NV/SPV_NV_stereo_view_rendering.html +++ b/extensions/NV/SPV_NV_stereo_view_rendering.html @@ -1,432 +1,12 @@ - - - - - - - -SPV_NV_stereo_view_rendering - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_stereo_view_rendering

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-02-15

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-

This extension requires SPV_NV_viewport_array2.

-
-
-
-
-

Overview

-
-
-

This extension adds a new capability to support the OpenGL -GL_NV_stereo_view_rendering extension in SPIR-V.

-
-
-

The new ShaderStereoViewNV capability adds two builtin variables, -SecondaryPositionNV and SecondaryViewportMaskNV, which can be -exported from Vertex, Tessellation or Geometry shaders, or imported -to Tessellation or Geometry shaders.

-
-
-

The SecondaryPositionNV builtin decoration corresponds to the -gl_SecondaryPositionNV variable in GLSL and is used to specify -the position for the second view.

-
-
-

The SecondaryViewportMaskNV builtin decoration corresponds to the -gl_SecondaryViewportMaskNV[] variable in GLSL and is used to specify -the viewport mask for the second view.

-
-
-

This capability also adds the SecondaryViewportRelativeNV -decoration that corresponds to the secondary_view_offset layout -qualifier that can be applied to the gl_Layer geometry shader output -variable in GLSL.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_stereo_view_rendering"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
ShaderStereoViewNV
-
-
-
-
-
-

New Decorations

-
-
-

Decoration added under the ShaderStereoViewNV capability:

-
-
-
-
SecondaryViewportRelativeNV
-
-
-
-
-
-

New Builtins

-
-
-

Two new builtins are added as outputs for the Vertex, Tessellation -and Geometry Execution Models under the ShaderStereoViewNV capability:

-
-
-
-
SecondaryPositionNV
-SecondaryViewportMaskNV
-
-
-
-

SecondaryPositionNV can also be used as an input for the Tessellation and -Geometry Execution Models.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - -

SecondaryViewportRelativeNV

5256

SecondaryPositionNV

5257

SecondaryViewportMaskNV

5258

ShaderStereoViewNV

5259

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
Decoration | Enabling Capabilities | Extra Operands

5256

SecondaryViewportRelativeNV
-Apply to a variable. Indicates the layer offset for primitives in the second -view. If used with ViewportRelativeNV, the layer used for rendering -primitives of the second view is computed by adding the value of the -variable decorated with ViewportIndex to the value specified by -SecondaryViewportRelativeNV. Only valid for the Output Storage Class.

ShaderStereoViewNV

Literal Number Offset

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add two new rows to the BuiltIn table)

-
- ----- - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

5257

SecondaryPositionNV
-Output vertex position for secondary view in Vertex, Tessellation, or -Geometry Execution Model, and input secondary view position for -Tessellation and Geometry Execution Models. See Vulkan or OpenGL API -specifications for more detail.

ShaderStereoViewNV

5258

SecondaryViewportMaskNV
-Output secondary viewport mask in Vertex, Tessellation, or Geometry -Execution Model. See Vulkan or OpenGL API specifications for more detail.

ShaderStereoViewNV

-
-
-
-
(Modify Section 3.31, Capability, add a new row to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - -
Capability | Depends On | Enabled by Extension

5259

ShaderStereoViewNV

ShaderViewportMaskNV

SPV_NV_stereo_view_rendering

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_stereo_view_rendering"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-12-18

Daniel Koch

Initial draft

2

2017-02-15

Daniel Koch

Mark complete.

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_stereo_view_rendering.html + + +

extensions/NV/SPV_NV_stereo_view_rendering.html

+ + diff --git a/extensions/NV/SPV_NV_tensor_addressing.html b/extensions/NV/SPV_NV_tensor_addressing.html index 063e92e..593de48 100644 --- a/extensions/NV/SPV_NV_tensor_addressing.html +++ b/extensions/NV/SPV_NV_tensor_addressing.html @@ -1,878 +1,12 @@ - - - - - - - -SPV_NV_tensor_addressing - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_tensor_addressing

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA

    -
  • -
  • -

    Karthik Vaidyanathan, NVIDIA

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2024 NVIDIA Corp.

-
-
-
-
-

Status

-
-
-
    -
  • -

    Draft

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-09-18

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6, Revision 3, Unified.

-
-
-

This extension requires SPIR-V 1.6.

-
-
-
-
-

Overview

-
-
-

This extension adds tensor layout and view types which initially can be used -with SPV_NV_cooperative_matrix2. It is written as a separate extension to allow -it to potentially be used with other extensions in the future.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_tensor_addressing"
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

2.2 Terms

-
-

Add new terms to section 2.2.2 Types:

-
-
-

Tensor Layout: An opaque collection of values manipulated by -OpTensorLayout instructions, and used for tensor addressing calculations when -loading and storing cooperative matrices.

-
-
-

Tensor View: An opaque collection of values manipulated by -OpTensorView instructions, and used for tensor addressing calculations when -loading and storing cooperative matrices.

-
-
-

Add Tensor Layout and Tensor View to the list of Opaque Types.

-
-
-
-

3.31 Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
Capability | Enabling Capabilities

5439

TensorAddressingNV
-Enables tensor layout and view instructions.

-
-
-
-
-

3.X Tensor Layout and View

-
-

Tensor layout and tensor view types are representations of the mapping -between matrix coordinates and tensor memory layout. They each have a -number of dimensions in the range [1,5], with dimension 0 being the -outermost dimension and the last dimension being the innermost. These types -have the following logical state:

-
-
-
-
    struct tensorLayoutNV<uint32_t Dim,
-                          TensorClampMode Mode = TensorClampModeUndefined>
-    {
-      static constexpr uint32_t LDim = Dim;
-      static constexpr TensorClampMode clampMode = Mode;
-
-      uint32_t blockSize[LDim];
-      uint32_t layoutDimension[LDim];
-      uint32_t stride[LDim];
-      int32_t offset[LDim];
-      uint32_t span[LDim];
-      uint32_t clampValue;
-    };
-
-    struct tensorViewNV<uint Dim, bool hasDimensions, uint32_t p0, ..., uint32_t p<Dim-1>>
-    {
-      static constexpr uint32_t VDim = Dim;
-      static constexpr bool hasDim = hasDimensions;
-      static constexpr uint32_t permutation[VDim] = {p0, ..., p<Dim-1>};
-
-      uint32_t viewDimension[VDim];
-      uint32_t viewStride[VDim];
-      uint32_t clipRowOffset, clipRowSpan, clipColOffset, clipColSpan;
-    };
-
-
-
-

A tensor layout represents the layout of values in memory (number of -dimensions and size), along with a region being accessed (offset and span).

-
-
-
-
    ---------------------------------------------------------------------------
-    |                           layoutDimension1                              |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                                                                         |
-    |                        span1                                            |
-    |                  -----------------                                      |
-    |                  |               |                                      |
-    |                  |               |                                      |
-    |                  |     slice     | span0                                |
-    |                  |               |                      layoutDimension0|
-    |                  |               |                                      |
-    |      offset1     |               |                                      |
-    | ---------------> -----------------                                      |
-    |                                                                         |
-    |                  ^                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  | offset0                                              |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    |                  |                                                      |
-    ---------------------------------------------------------------------------
-    Figure: A 2D tensor layout, and a slice selecting a region within it.
-
-
-
-

A tensor view allows reinterpreting the dimensions of the region being -accessed, including changing the number of dimensions, reordering the -dimensions as they are loaded or stored, and clipping the region of the -matrix that is loaded or stored. Often the span will have the -same number of elements as the matrix, but in some more advanced uses -that may not be the case.

-
-
-

How the addressing calculations are performed is left to other extensions to -define.

-
-
-

Unlike some other ML APIs, tensor layouts and views only describe -addressing calculations and never involve making copies of tensors. For -this reason, the functionality is slightly more limited (e.g. there’s no -way to slice, then permute, then slice again).

-
-
-

OpTensorLayout and OpTensorView instructions operate by copying -existing object state and updating the requested state and returning -that as a new result. Some of these instructions initialize multiple -related pieces of state, setting some to common default values, so -the order of the operations matters.
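
The copy-and-update behaviour described above, and why instruction order matters, can be sketched in Python. This is only an illustrative model of the logical state, not an implementation of the extension: the class `TensorLayout` and the helpers `create_layout`, `set_dimension`, and `slice_layout` are hypothetical names that mirror OpCreateTensorLayoutNV, OpTensorLayoutSetDimensionNV, and OpTensorLayoutSliceNV as specified in this document.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class TensorLayout:
    """Hypothetical mirror of the tensorLayoutNV logical state."""
    block_size: Tuple[int, ...]
    layout_dimension: Tuple[int, ...]
    stride: Tuple[int, ...]
    offset: Tuple[int, ...]
    span: Tuple[int, ...]
    clamp_value: int = 0


def create_layout(dim: int) -> TensorLayout:
    # OpCreateTensorLayoutNV: all state is zero except blockSize, which is one.
    zeros = (0,) * dim
    return TensorLayout((1,) * dim, zeros, zeros, zeros, zeros, 0)


def set_dimension(layout: TensorLayout, dims: Tuple[int, ...]) -> TensorLayout:
    # OpTensorLayoutSetDimensionNV: sets layoutDimension and span to dims,
    # zeroes the offsets, and derives default strides from the innermost
    # dimension outwards.
    n = len(dims)
    stride = [1] * n
    for i in range(n - 2, -1, -1):
        blocks = -(-dims[i + 1] // layout.block_size[i + 1])  # ceiling division
        stride[i] = stride[i + 1] * blocks
    return replace(layout, layout_dimension=tuple(dims), span=tuple(dims),
                   offset=(0,) * n, stride=tuple(stride))


def slice_layout(layout: TensorLayout, offsets, spans) -> TensorLayout:
    # OpTensorLayoutSliceNV: offsets accumulate, spans are overwritten.
    new_offset = tuple(o + d for o, d in zip(layout.offset, offsets))
    return replace(layout, offset=new_offset, span=tuple(spans))


base = create_layout(2)
# Dimensions first, then slice: the slice survives.
a = slice_layout(set_dimension(base, (64, 128)), (8, 16), (4, 4))
# Slice first, then dimensions: SetDimension resets offset and span.
b = set_dimension(slice_layout(base, (8, 16), (4, 4)), (64, 128))
```

Each helper returns a new object and leaves its input untouched, matching the "copy existing state, update, return a new result" wording. In the second ordering the slice is lost because setting the dimensions reinitializes the offset and span.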

-
-
-
-

3.X Tensor Clamp Mode

-
-

New section in 3 "Binary Form".

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Tensor Clamp Mode | Enabling Capabilities

0

Undefined
-Out of bounds accesses have undefined behavior.

1

Constant
-Out of bounds loads return a constant value. Out of bounds stores are discarded.

2

ClampToEdge
-Out of bounds load coordinates are clamped to the closest in-bounds coordinate. Out of bounds stores are discarded.

3

Repeat
-Out of bounds load coordinates wrap.
- c = c % dim;
-Out of bounds stores are discarded.

4

RepeatMirrored
-Out of bounds load coordinates wrap with mirroring.
- c = c % (2*dim-2);
- c = (c >= dim) ? (2*dim-2-c) : c;
-Out of bounds stores are discarded.
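
As a rough illustration of the load-side rules in the table above, the following Python sketch maps one coordinate according to each clamp mode. The names `TensorClampMode` and `resolve_load_coordinate` are hypothetical. Two assumptions are not stated in the table: negative coordinates use Python's floor modulo (the result is always non-negative), and RepeatMirrored is only evaluated for `dim >= 2`, since `2*dim-2` is zero when `dim` is 1.

```python
from enum import IntEnum


class TensorClampMode(IntEnum):
    UNDEFINED = 0
    CONSTANT = 1
    CLAMP_TO_EDGE = 2
    REPEAT = 3
    REPEAT_MIRRORED = 4


def resolve_load_coordinate(c, dim, mode):
    """Return the in-bounds coordinate to load from.

    Returns None when the load should yield the constant value instead.
    """
    if 0 <= c < dim:
        return c  # in-bounds coordinates are never modified
    if mode == TensorClampMode.CONSTANT:
        return None
    if mode == TensorClampMode.CLAMP_TO_EDGE:
        return min(max(c, 0), dim - 1)
    if mode == TensorClampMode.REPEAT:
        return c % dim  # assumption: floor modulo for negative c
    if mode == TensorClampMode.REPEAT_MIRRORED:
        period = 2 * dim - 2  # assumption: dim >= 2
        c = c % period
        return period - c if c >= dim else c
    # TensorClampMode.UNDEFINED: the spec gives no defined result.
    raise ValueError("out-of-bounds access has undefined behavior")
```

Stores are not modelled here because the table discards out-of-bounds stores in every defined mode.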

-
-
-
-
-

3.49.6 Type-Declaration Instructions

- ------- - - - - - - - - - - - - - -

OpTypeTensorLayoutNV
-
-Dim is the number of dimensions in the tensor layout, and must be a -constant instruction with scalar 32-bit integer type. The value must -be greater than zero and less than or equal to 5.
-
-ClampMode is a Tensor Clamp Mode which controls how out of bounds -coordinates are treated, and must be a constant instruction with scalar -32-bit integer type. -

Capability:
-TensorAddressingNV

4

5370

Result <id>

<id>
-Dim

<id>
-ClampMode

- -------- - - - - - - - - - - - - - - -

OpTypeTensorViewNV
-
-Dim is the number of dimensions in the tensor view, and must be a -constant instruction with scalar 32-bit integer type. The value must -be greater than zero and less than or equal to 5.
-
-HasDimensions is a boolean indicating whether the view has its own dimensions -(reinterpreting those from the tensor layout) or if the tensor layout’s -dimensions are used. It must be a constant instruction with scalar -boolean type.
-
-p0 …​ p<Dim-1> are integer values indicating how the tensor’s coordinates -are permuted. They each must be a constant instruction with scalar 32-bit -integer type, and they must form a valid permutation of the range [0,Dim).
-

Capability:
-TensorAddressingNV

5+variable

5371

Result <id>

<id>
-Dim

<id>
-HasDimensions

<id>, <id>, …​
-p0,
-…​
-p<Dim-1>
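
The constraints on OpTypeTensorViewNV operands (Dim in the range 1 to 5, and p0 through p<Dim-1> forming a permutation of [0,Dim)) can be checked with a few lines of Python. The function name `check_tensor_view_type` is hypothetical and is not part of any validator; it is a sketch of the two rules only.

```python
def check_tensor_view_type(dim, permutation):
    """Return True when dim and permutation satisfy the stated rules."""
    # Dim must be greater than zero and less than or equal to 5.
    if not 1 <= dim <= 5:
        return False
    # p0..p<Dim-1> must contain each value in [0, dim) exactly once.
    return sorted(permutation) == list(range(dim))
```

For example, `check_tensor_view_type(3, [2, 0, 1])` accepts a view that reads the innermost layout dimension first, while `[0, 0, 1]` is rejected because it repeats a coordinate.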

-
-
-

3.X Tensor Layout and View Instructions

-
-

New section in 3 "Binary Form".

-
- ------ - - - - - - - - - - - - -

OpCreateTensorLayoutNV
-
-Create a Tensor Layout of the requested type. The layoutDimension, stride, -span, and offset elements are initialized to zero. The blockSize elements are -initialized to one. clampValue is initialized to zero.
-
-Result Type must be OpTypeTensorLayoutNV.
-

Capability:
-TensorAddressingNV

3

5372

<id>
-Result Type

Result <id>

- -------- - - - - - - - - - - - - - - -

OpTensorLayoutSetBlockSizeNV
-
-Create a copy of TensorLayout, setting the blockSize elements to -BlockSize<i>. When the blockSize is not 1, the strides are considered to be -in blocks rather than in elements.
-
-The number of BlockSize operands must match the dimension of Result Type.
-
-The BlockSize operands must each be a scalar 32-bit integer type.
-
-The type of TensorLayout must be Result Type.
-

Capability:
-TensorAddressingNV

5+variable

5384

<id>
-Result Type

Result <id>

<id>
-TensorLayout

<id>, <id>, …​
-BlockSize0,
-…​
-BlockSize<LDim-1>

- -------- - - - - - - - - - - - - - - -

OpTensorLayoutSetDimensionNV
-
-Create a copy of TensorLayout, setting the layoutDimension and span elements to Dim<i>. -Sets offset elements to zero. Sets stride[LDim-1] to 1 and sets stride[i] to -stride[i+1] * ceiling(Dim<i+1> / blockSize[i+1]).
-
-The number of Dim operands must match the dimension of Result Type.
-
-The Dim operands must each be a scalar 32-bit integer type.
-
-The type of TensorLayout must be Result Type.
-

Capability:
-TensorAddressingNV

5+variable

5373

<id>
-Result Type

Result <id>

<id>
-TensorLayout

<id>, <id>, …​
-Dim0,
-…​
-Dim<LDim-1>

- -------- - - - - - - - - - - - - - - -

OpTensorLayoutSetStrideNV
-
-Create a copy of TensorLayout, setting the stride elements to Stride<i>.
-
-Stride<i> must be at least Stride<i+1> * ceiling(layoutDimension[i+1] / blockSize[i+1]).
-
-The Stride operands must each be a scalar 32-bit integer type.
-
-The number of Stride operands must match the dimension of Result Type.
-
-The type of TensorLayout must be Result Type.
-

Capability:
-TensorAddressingNV

5+variable

5374

<id>
-Result Type

Result <id>

<id>
-TensorLayout

<id>, <id>, …​
-Stride0,
-…​
-Stride<LDim-1>
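
The lower bound on each stride stated for OpTensorLayoutSetStrideNV can be expressed as a small check. The helper `strides_are_valid` below is a hypothetical name, and the sketch assumes all three sequences have the same length and that block sizes are positive.

```python
def strides_are_valid(strides, layout_dimension, block_size):
    """Check Stride<i> >= Stride<i+1> * ceil(layoutDimension[i+1] / blockSize[i+1])."""
    for i in range(len(strides) - 1):
        # Ceiling division: number of blocks along dimension i+1.
        blocks = -(-layout_dimension[i + 1] // block_size[i + 1])
        if strides[i] < strides[i + 1] * blocks:
            return False
    return True
```

A tightly packed 64x128 layout with unit block size accepts strides `(128, 1)`; a padded row pitch such as `(160, 1)` is also accepted, while `(100, 1)` is rejected because rows would overlap.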

- -------- - - - - - - - - - - - - - - -

OpTensorLayoutSliceNV
-
-Create a copy of TensorLayout, adding Offset<i> to offset[i], and span[i] -is set to Span<i>.
-
-Stride<i> must be at least Stride<i+1> times layoutDimension[i+1].
-
-The Offset and Span operands must each be a scalar 32-bit integer type.
-
-The number of Offset and Span operands must each match the dimension of Result Type.
-
-The type of TensorLayout must be Result Type.
-

Capability:
-TensorAddressingNV

6+variable

5375

<id>
-Result Type

Result <id>

<id>
-TensorLayout

<id>, <id>, …​
-Offset0, Span0,
-…​
-Offset<LDim-1>, Span<LDim-1>

- -------- - - - - - - - - - - - - - - -

OpTensorLayoutSetClampValueNV
-
-Create a copy of TensorLayout, setting the clampValue to Value.
-
-Value must be a scalar 32-bit integer type.
-
-The type of TensorLayout must be Result Type.
-

Capability:
-TensorAddressingNV

5

5376

<id>
-Result Type

Result <id>

<id>
-TensorLayout

<id>
-Value

- ------ - - - - - - - - - - - - -

OpCreateTensorViewNV
-
-Create a Tensor View of the requested type. The viewDimension and viewStride -elements are initialized to zero. The clip values are initialized to offsets of -0, spans of 0xFFFFFFFF.
-
-Result Type must be OpTypeTensorViewNV.
-

Capability:
-TensorAddressingNV

3

5377

<id>
-Result Type

Result <id>

- -------- - - - - - - - - - - - - - - -

OpTensorViewSetDimensionNV
-
-Create a copy of TensorView, setting the viewDimension to Dim<i>. -Sets viewStride[LDim-1] to 1 and sets viewStride[i] to the -product of Dim<i+1> to Dim<LDim-1>.
-
-The number of Dim operands must match the dimension of Result Type.
-
-The Dim operands must each be a scalar 32-bit integer type.
-
-The type of TensorView must be Result Type.
-

Capability:
-TensorAddressingNV

5+variable

5378

<id>
-Result Type

Result <id>

<id>
-TensorView

<id>, <id>, …​
-Dim0,
-…​
-Dim<N-1>

- -------- - - - - - - - - - - - - - - -

OpTensorViewSetStrideNV
-
-Create a copy of TensorView, setting the viewStride to Stride<i>.
-
-The number of Stride operands must match the dimension of Result Type.
-
-The Stride operands must each be a scalar 32-bit integer type.
-
-The type of TensorView must be Result Type.
-

Capability:
-TensorAddressingNV

5+variable

5379

<id>
-Result Type

Result <id>

<id>
-TensorView

<id>, <id>, …​
-Stride0,
-…​
-Stride<N-1>

- ----------- - - - - - - - - - - - - - - - - - -

OpTensorViewSetClipNV
-
-Create a copy of TensorView, setting the clip elements to the corresponding parameters.
-
-The Clip operands must each be a scalar 32-bit integer type.
-
-The type of TensorView must be Result Type.
-

Capability:
-TensorAddressingNV

8

5382

<id>
-Result Type

Result <id>

<id>
-TensorView

<id>
-ClipRowOffset

<id>
-ClipRowSpan

<id>
-ClipColOffset

<id>
-ClipColSpan

-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2024-09-18

Jeff Bolz

Initial revision of SPV_NV_tensor_addressing

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_tensor_addressing.html + + +

extensions/NV/SPV_NV_tensor_addressing.html

+ + diff --git a/extensions/NV/SPV_NV_viewport_array2.html b/extensions/NV/SPV_NV_viewport_array2.html index 7893d3b..113798e 100644 --- a/extensions/NV/SPV_NV_viewport_array2.html +++ b/extensions/NV/SPV_NV_viewport_array2.html @@ -1,471 +1,12 @@ - - - - - - - -SPV_NV_viewport_array2 - - - - - -
-
-

Name Strings

-
-
-

SPV_NV_viewport_array2

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Daniel Koch, NVIDIA

    -
  • -
-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2017-02-15

Revision

2

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.1 Revision 4.

-
-
-

This extension requires SPIR-V 1.0.

-
-
-
-
-

Overview

-
-
-

This extension adds new capabilities to support the OpenGL -GL_ARB_shader_viewport_layer_array and -GL_NV_viewport_array2 extensions, as well as the Vulkan -VK_NV_viewport_array2 extension in SPIR-V.

-
-
-

The new ShaderViewportIndexLayerNV capability allows the -Layer and ViewportIndex builtin variables to be exported -from Vertex or Tessellation shaders, in addition to Geometry -shaders. This is functionality added by both -GL_ARB_shader_viewport_layer_array and GL_NV_viewport_array2, -and separately by GL_AMD_vertex_shader_layer, and -GL_AMD_vertex_shader_viewport_index.

-
-
-

The new ShaderViewportMaskNV capability adds a new ViewportMaskNV -builtin variable which can be exported from Vertex, Tessellation, -or Geometry shaders. This corresponds to the gl_ViewportMask -variable in GLSL. It also adds a new ViewportRelativeNV decoration -that corresponds to the viewport_relative layout qualifier that -can be applied to the gl_Layer shader output variable in GLSL.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_NV_viewport_array2"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces two new capabilities:

-
-
-
-
ShaderViewportIndexLayerNV
-ShaderViewportMaskNV
-
-
-
-
-
-

New Decorations

-
-
-

Decoration added under the ShaderViewportMaskNV capability:

-
-
-
-
ViewportRelativeNV
-
-
-
-
-
-

New Builtins

-
-
-

A new builtin is added as an output for the Vertex, Tessellation, -and Geometry Execution Models under the ShaderViewportMaskNV capability:

-
-
-
-
ViewportMaskNV
-
-
-
-

The existing Layer and ViewportIndex builtins are extended and may -also be used as outputs in the Vertex and Tessellation Execution -Models under the ShaderViewportIndexLayerNV capability.

-
-
-
-
-

New Instructions

-
-
-

None.

-
-
-
-
-

Token Number Assignments

-
- ---- - - - - - - - - - - - - - - - - - - -

ViewportRelativeNV

5252

ViewportMaskNV

5253

ShaderViewportIndexLayerNV

5254

ShaderViewportMaskNV

5255

-
-
-
-

Modifications to the SPIR-V Specification, Version 1.1

-
-
-
-
(Modify Section 3.20, Decoration, adding a row to the Decoration table)
-
-
-
- ------- - - - - - - - - - - - - - - - -
Decoration | Enabling Capabilities | Extra Operands

5252

ViewportRelativeNV
-Apply to a variable. Indicates that the value of the variable decorated with -ViewportIndex is added to this variable. Only valid for the Output -Storage Class.

ShaderViewportMaskNV

-
-
-
-
(Modify Section 3.21, BuiltIn)
-
-
-
-
-

(add a new row to the Builtin table)

-
- ----- - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

5253

ViewportMaskNV
-Output viewport mask in Vertex, Tessellation, or Geometry Execution Model. -See Vulkan or OpenGL API specifications for more detail.

ShaderViewportMaskNV

-
-

(Modify the definition of Layer and ViewportIndex as follows, allowing -them to be outputs from Vertex and Tessellation shaders)

-
- ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BuiltIn | Enabling Capabilities

9

Layer
-Layer selection for multi-layer framebuffer. See Vulkan or OpenGL API -specification for more detail.

Layer output by a Geometry Execution Model, -input to a Fragment Execution Model.

Geometry

Layer output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerNV

10

ViewportIndex
-Viewport selection for viewport transformation when using multiple viewports. -See Vulkan or OpenGL API specification for more detail.

Viewport Index output by a Geometry Execution Model, -input to a Fragment Execution Model.

MultiViewport

Viewport Index output by a Vertex or Tessellation Execution Model.

ShaderViewportIndexLayerNV

-
-
-
-
(Modify Section 3.31, Capability, adding new rows to the Capability table)
-
-
-
- ------ - - - - - - - - - - - - - - - - - - - - - -
Capability | Depends On | Enabled by Extension

5254

ShaderViewportIndexLayerNV

MultiViewport

SPV_NV_viewport_array2

5255

ShaderViewportMaskNV

ShaderViewportIndexLayerNV

SPV_NV_viewport_array2

-
-
-
-
-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-
-
OpExtension "SPV_NV_viewport_array2"
-
-
-
-
-
-

Issues

-
-
-

None yet!

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
Rev | Date | Author | Changes

1

2016-11-25

Daniel Koch

Initial draft

2

2017-02-15

Daniel Koch

Mark Complete, add mention of Vulkan extension

-
-
-
- - \ No newline at end of file + + + + + + extensions/NV/SPV_NV_viewport_array2.html + + +

extensions/NV/SPV_NV_viewport_array2.html

+ + diff --git a/extensions/QCOM/SPV_QCOM_image_processing.html b/extensions/QCOM/SPV_QCOM_image_processing.html index 2efc798..ef08b29 100644 --- a/extensions/QCOM/SPV_QCOM_image_processing.html +++ b/extensions/QCOM/SPV_QCOM_image_processing.html @@ -1,548 +1,12 @@ - - - - - - - -SPV_QCOM_image_processing - - - - - -
-
-

Name Strings

-
-
-

SPV_QCOM_image_processing

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Balaji Calidas, Qualcomm

    -
  • -
  • -

    Jeff Leger, Qualcomm

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Wooyoung Kim, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Final

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-09-21

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-
-
-

Overview

-
-
-

This extension introduces a new set of operations for image processing, along with -new capabilities and two new decorations.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_QCOM_image_processing"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces new capabilities:

-
-
-
-
TextureWeightedSampleQCOM
-TextureBoxFilterQCOM
-TextureBlockMatchQCOM
-
-
-
-
-
-

New Decorations

-
-
-

The extension adds two new decorations for texture types.

-
-
-
-
WeightTextureQCOM
-BlockMatchTextureQCOM
-
-
-
-
-
-

New Instructions

-
-
-

Instruction added under the TextureWeightedSampleQCOM capability:

-
-
-
-
OpImageSampleWeightedQCOM
-
-
-
-

Instruction added under the TextureBoxFilterQCOM capability:

-
-
-
-
OpImageBoxFilterQCOM
-
-
-
-

Instructions added under the TextureBlockMatchQCOM capability:

-
-
-
-
OpImageBlockMatchSSDQCOM
-OpImageBlockMatchSADQCOM
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding these rows to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - - - - - - - - - - - -
Capability | Implicitly declares

4484

TextureWeightedSampleQCOM
-Add weighted sample operation.

4485

TextureBoxFilterQCOM
-Add box filter operation.

4486

TextureBlockMatchQCOM
-Add block matching operations (sum of absolute/square differences).

-
-
-
-
-

Decorations

-
-

Modify Section 3.20, "Decoration", adding the following rows to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - - - - - - - -
Decoration | Extra Operands | Enabling Capabilities

4487

WeightTextureQCOM
-Apply to a texture used as Weight Image in OpImageSampleWeightedQCOM. Behavior is defined by the runtime environment.

TextureWeightedSampleQCOM

4488

BlockMatchTextureQCOM
-Apply to textures used as Target Sampled Image and Reference Sampled Image in OpImageBlockMatchSSDQCOM/OpImageBlockMatchSADQCOM.
-Behavior is defined by the runtime environment.

TextureBlockMatchQCOM

-
-
-
-
-

Instructions

-
-

Modify Section 3.42.10, "Image Instructions", adding before OpImageSampleFootprintNV:

-
- --------- - - - - - - - - - - - - - - - -

OpImageSampleWeightedQCOM
-
-Weighted sample operation

-

Result Type is the type of the result of weighted sample operation

-

Texture Sampled Image must be an object whose type is OpTypeSampledImage. The MS operand of the -underlying OpTypeImage must be 0.

-

Coordinate must be a vector of floating-point type, whose vector size is 2.

-

Weight Image must be an object whose type is OpTypeSampledImage. If the object is an interface object, -it must be decorated with WeightTextureQCOM. Otherwise, a texture object which is used to construct the object -must be decorated with WeightTextureQCOM. The MS operand of the -underlying OpTypeImage must be 0.

Capability:
-TextureSampleWeightedQCOM

6

4480

<id> Result Type

<id> Result

<id> Texture Sampled Image

<id> Coordinate

<id> Weight Image

- --------- - - - - - - - - - - - - - - - -

OpImageBoxFilterQCOM
-
-Image box filter operation.

-

Result Type is the type of the result of image box filter operation

-

Texture Sampled Image must be an object whose type is OpTypeSampledImage. The MS operand of the -underlying OpTypeImage must be 0.

-

Coordinate must be a vector of floating-point type, whose vector size is 2. -
-Box Size must be a vector of floating-point type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBoxFilterQCOM

6

4481

<id> Result Type

<id> Result

<id> Texture Sampled Image

<id> Coordinate

<id> Box Size

- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchSSDQCOM
-
-Image block match operation with sum of square differences.

-

Result Type is the type of the result of image block match sum of square differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatchQCOM

8

4482

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size

- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchSADQCOM
-
-Image block match operation with sum of absolute differences.

-

Result Type is the type of the result of image block match sum of absolute differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatchQCOM

8

4483

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size
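
For readers unfamiliar with the two block-match metrics, the sketch below computes the textbook sum of absolute differences (SAD) and sum of square differences (SSD) over two equally sized blocks of scalar values. It is not the behaviour of OpImageBlockMatchSADQCOM or OpImageBlockMatchSSDQCOM, which this extension leaves to the runtime environment. The function names and the nested-list block representation are assumptions made for illustration.

```python
def block_sad(target, reference):
    """Sum of absolute differences between two same-sized 2D blocks."""
    return sum(abs(t - r)
               for t_row, r_row in zip(target, reference)
               for t, r in zip(t_row, r_row))


def block_ssd(target, reference):
    """Sum of square differences between two same-sized 2D blocks."""
    return sum((t - r) ** 2
               for t_row, r_row in zip(target, reference)
               for t, r in zip(t_row, r_row))


target_block = [[1, 2], [3, 4]]
reference_block = [[1, 0], [5, 4]]
sad = block_sad(target_block, reference_block)
ssd = block_ssd(target_block, reference_block)
```

Both metrics are zero for identical blocks. SSD penalizes large per-texel differences more heavily than SAD.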

-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-

An object decorated with either WeightTextureQCOM or BlockMatchTextureQCOM -must be used only with the corresponding built-in functions. Such an -object must not be used with any other functions.

-
-
-
-
OpExtension "SPV_QCOM_image_processing"
-
-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

0

2021-12-07

Ruihao Zhang

Initial version

1

2022-09-21

Wooyoung Kim

Replaced "should" with "must".
-Changed the SPV version requirement from 1.0 to 1.4

-
-
-
- - \ No newline at end of file + + + + + + extensions/QCOM/SPV_QCOM_image_processing.html + + +

extensions/QCOM/SPV_QCOM_image_processing.html

+ + diff --git a/extensions/QCOM/SPV_QCOM_image_processing2.html b/extensions/QCOM/SPV_QCOM_image_processing2.html index a372075..f874158 100644 --- a/extensions/QCOM/SPV_QCOM_image_processing2.html +++ b/extensions/QCOM/SPV_QCOM_image_processing2.html @@ -1,524 +1,12 @@ - - - - - - - -SPV_QCOM_image_processing2 - - - - - -
-
-

Name Strings

-
-
-

SPV_QCOM_image_processing2

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Leger, Qualcomm

    -
  • -
  • -

    Ruihao Zhang, Qualcomm

    -
  • -
  • -

    Wooyoung Kim, Qualcomm

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2023 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Final

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-11-12

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 3.

-
-
-

This extension requires SPIR-V 1.4.

-
-
-
-
-

Overview

-
-
-

This extension introduces a new set of operations for image processing, along with -a new capability and a new decoration.

-
-
-
-
-

Extension Name

-
-
-

To use this extension within a SPIR-V module, the following -OpExtension must be present in the module:

-
-
-
-
OpExtension "SPV_QCOM_image_processing2"
-
-
-
-
-
-

New Capabilities

-
-
-

This extension introduces a new capability:

-
-
-
-
TextureBlockMatch2QCOM
-
-
-
-
-
-

New Decorations

-
-
-

The extension adds a new sampler decoration.

-
-
-
-
BlockMatchSamplerQCOM
-
-
-
-
-
-

New Instructions

-
-
-

Instructions added under the TextureBlockMatch2QCOM capability:

-
-
-
-
OpImageBlockMatchWindowSADQCOM
-OpImageBlockMatchWindowSSDQCOM
-OpImageBlockMatchGatherSADQCOM
-OpImageBlockMatchGatherSSDQCOM
-
-
-
-
-
-

Modifications to the SPIR-V Specification, Version 1.6

-
-
-

Capabilities

-
-

Modify Section 3.31, "Capability", adding the following row to the Capability table:

-
-
-
- ----- - - - - - - - - - - - - - -
CapabilityImplicitly declares

4498

TextureBlockMatch2QCOM
-Add texture block match2 operations. This capability is required to use any of the -OpImageBlockMatchWindow* or OpImageBlockMatchGather* instructions.

-
-
-
-
-

Decorations

-
-

Modify Section 3.20, "Decoration", adding the following row to the Decoration table:

-
-
-
- ------- - - - - - - - - - - - - - - - -
DecorationExtra OperandsEnabling Capabilities

4499

BlockMatchSamplerQCOM
-Apply to samplers used to create Target Sampled Image and Reference Sampled Image in -OpImageBlockMatchWindowSSDQCOM or OpImageBlockMatchWindowSADQCOM. Behavior is defined by the runtime environment.

TextureBlockMatch2QCOM

-
-
-
-
-

Instructions

-
-

Modify Section 3.49.10, "Image Instructions", adding before OpImageSampleFootprintNV:

-
- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchWindowSSDQCOM
-
-Windowed image block match operation with sum of square differences.

-

Result Type is the type of the result of windowed image block match sum of square differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM and BlockMatchSamplerQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM -and the sampler object which is used to construct the object must be decorated with BlockMatchSamplerQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM and BlockMatchSamplerQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM -and the sampler object which is used to construct the object must be decorated with BlockMatchSamplerQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatch2QCOM

8

4500

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size

- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchWindowSADQCOM
-
-Windowed image block match operation with sum of absolute differences.

-

Result Type is the type of the result of windowed image block match sum of absolute differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM and BlockMatchSamplerQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM -and the sampler object which is used to construct the object must be decorated with BlockMatchSamplerQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM and BlockMatchSamplerQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM -and the sampler object which is used to construct the object must be decorated with BlockMatchSamplerQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatch2QCOM

8

4501

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size

- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchGatherSSDQCOM
-
-Gathered image block match operation with sum of square differences.

-

Result Type is the type of the result of gathered image block match sum of square differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatch2QCOM

8

4502

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size

- ----------- - - - - - - - - - - - - - - - - - -

OpImageBlockMatchGatherSADQCOM
-
-Gathered image block match operation with sum of absolute differences.

-

Result Type is the type of the result of gathered image block match sum of absolute differences

-

Target Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Target Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Reference Sampled Image must be an object whose type is OpTypeSampledImage. -If the object is an interface object, it must be decorated with BlockMatchTextureQCOM. -Otherwise, a texture object which is used to construct the object must be decorated with BlockMatchTextureQCOM. -The MS operand of the underlying OpTypeImage must be 0.

-

Reference Coordinate must be a vector of integer type, whose vector size is 2 and signedness is 0.

-

Block Size must be a vector of integer type, whose vector size is 2 and signedness is 0.

Capability:
-TextureBlockMatch2QCOM

8

4503

<id> Result Type

<id> Result

<id> Target Sampled Image

<id> Target Coordinate

<id> Reference Sampled Image

<id> Reference Coordinate

<id> Block Size

-
-
-
-
-

Validation Rules

-
-
-

An OpExtension must be added to the SPIR-V for validation layers to check -legal use of this extension:

-
-
-

An object decorated with BlockMatchSamplerQCOM must be used only with the -corresponding built-in functions. Such an object must not be used with any other functions.

-
-
-
-
OpExtension "SPV_QCOM_image_processing2"
-
-
-
-
-
-

Issues

-
- -
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2023-11-12

Wooyoung Kim

Initial version

-
-
-
- - \ No newline at end of file + + + + + + extensions/QCOM/SPV_QCOM_image_processing2.html + + +

extensions/QCOM/SPV_QCOM_image_processing2.html

+ + diff --git a/nonsemantic/NonSemantic.ClspvReflection.html b/nonsemantic/NonSemantic.ClspvReflection.html index ffd0e33..8c8e4ee 100644 --- a/nonsemantic/NonSemantic.ClspvReflection.html +++ b/nonsemantic/NonSemantic.ClspvReflection.html @@ -1,1911 +1,12 @@ - - - - - - - -SPIR-V Non-Semantic Clspv Reflection Instructions - - - - - -
-
-
-
-

Version 6

-
-
-
-
-

Contact

-
-
-

To report problems with this extended instruction set, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Alan Baker, Google

    -
  • -
  • -

    Kévin Petit, Arm Ltd.

    -
  • -
  • -

    Callum Fare, Codeplay

    -
  • -
  • -

    Finlay Marno, Codeplay

    -
  • -
  • -

    Romaric Jodin, Google

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2020-2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-

Unratified extended instruction set.

-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2023-10-11

Revision

6

-
-
-
-

Dependencies

-
-
-

This instruction set requires SPIR-V 1.0.

-
-
-

This instruction set requires SPV_KHR_non_semantic_info.

-
-
-
-
-

1. Introduction

-
-
-

This specifies the NonSemantic.ClspvReflection extended instruction set. It -provides the instructions needed to relay the reflection -information for clspv-generated shader modules -to a runtime (e.g. clvk). It is not expected that -drivers implement this instruction set.

-
-
-

Import this extended instruction set using an OpExtInstImport -"NonSemantic.ClspvReflection.<ver>" instruction, where <ver> indicates the -version.

-
-
-
-
-

2. Version

-
-
-

This extended instruction set is versioned. The version is specified via the -import string. It must be an integer between 1 and the version specified at the -beginning of this instruction set. Differences between the versions are -specified below.
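A consumer of this instruction set typically needs to recover the version from the import string before interpreting operands. The following is a minimal sketch of that check, assuming the import name has already been read from the module as a Python string. The function name and the `LATEST_VERSION` constant are hypothetical; the value 6 is taken from the "Version 6" stated at the top of this instruction set.

```python
import re

# Highest version described by this revision of the instruction set.
LATEST_VERSION = 6

_IMPORT_RE = re.compile(r"NonSemantic\.ClspvReflection\.([0-9]+)")


def parse_reflection_version(import_name, latest=LATEST_VERSION):
    """Return the version encoded in an OpExtInstImport name.

    Returns None when the name belongs to a different instruction set and
    raises ValueError when the version is outside the range 1..latest.
    """
    match = _IMPORT_RE.fullmatch(import_name)
    if match is None:
        return None
    version = int(match.group(1))
    if not 1 <= version <= latest:
        raise ValueError(f"unsupported ClspvReflection version: {version}")
    return version
```

Returning `None` for unrelated imports lets a caller skip other extended instruction sets without treating them as errors.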

-
-
-
-
-

3. Binary Form

-
-
-

The return type for all instructions must be OpTypeVoid.

-
-
-

None of the instructions support forward references in their operands. -Therefore, the Kernel and ArgumentInfo instructions must come before their -uses.

-
- -------- - - - - - - - - - - - - - -

Kernel
-
-Declares a shader entry-point generated by clspv.
-
-Kernel must be an OpFunction that is declared as a GLCompute entry-point.
-
-Name must be an OpString specifying the name of the entry-point. Name must -match an entry-point name for Kernel.
-
-NumArguments must be a 32-bit unsigned integer OpConstant specifying the -number of arguments to the kernel function. Unused kernel arguments may not -appear in subsequent reflection instructions. This operand is missing before -version 5.
-
-Flags must be a 32-bit unsigned integer OpConstant with a value combining -one or more of the values from the Kernel Property Flags -bit field. This operand is missing before version 5.
-
-Attributes must be an OpString. If the kernel was created from OpenCL C -source, Attributes must contain a space-delimited list of attributes -specified for the entry-point using the __attribute__ OpenCL C qualifier. -Attributes must be spelled as they are declared inside the OpenCL C attribute -qualifier with any surrounding whitespace and embedded newlines removed. These -attributes include attributes described in the OpenCL C kernel language -specification and other attributes supported by the implementation. If the -kernel was not created from OpenCL C source then Attributes must be an -empty string. This operand is missing before version 5.

1

<id>
-Kernel

<id>
-Name

<id>
-NumArguments

<id>
-Flags

<id>
-Attributes

- -------- - - - - - - - - - - - - - -

ArgumentInfo
-
-The operands in this instruction are suitable for return from clGetKernelArgInfo.
-
-Name must be an OpString specifying the name of the kernel argument.
-
-TypeName must be an OpString specifying the type of the kernel argument.
-
-AddressQualifier must be a 32-bit unsigned integer OpConstant specifying the address space -enum value of the kernel argument.
-
-AccessQualifier must be a 32-bit unsigned integer OpConstant specifying the access qualifier -enum value of the kernel argument.
-
-TypeQualifier must be a 32-bit unsigned integer OpConstant specifying the type qualifier -enum value of the kernel argument.

2

<id>
-Name

Optional
-<id>
-TypeName

Optional
-<id>
-AddressQualifier

Optional
-<id>
-AccessQualifier

Optional
-<id>
-TypeQualifier

- -------- - - - - - - - - - - - - - -

ArgumentStorageBuffer
-
-Declares a StorageBuffer argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_STORAGE_BUFFER.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

3

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentUniform
-
-Declares a Uniform buffer argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

4

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- ---------- - - - - - - - - - - - - - - - -

ArgumentPodStorageBuffer
-
-Declares a StorageBuffer plain-old-data argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_STORAGE_BUFFER. This argument may share a descriptor set and binding with other -plain-old-data arguments.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the argument in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

5

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

<id>
-Offset

<id>
-Size

Optional
-<id>
-ArgInfo

- ---------- - - - - - - - - - - - - - - - -

ArgumentPodUniform
-
-Declares a Uniform buffer plain-old-data argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER. This argument may share a descriptor set and binding with other -plain-old-data arguments.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the argument in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

6

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

<id>
-Offset

<id>
-Size

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentPodPushConstant
-
-Declares a PushConstant plain-old-data argument for Kernel. This argument’s -offset and size should be included in the push constant range declared for this -kernel using the VK_SHADER_STAGE_COMPUTE_BIT flag bit.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the argument in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

7

<id>
-Kernel

<id>
-Ordinal

<id>
-Offset

<id>
-Size

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentSampledImage
-
-Declares a sampled image (OpTypeImage with Sampled operand of 1) argument for Kernel. The Vulkan -descriptor type should use VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

8

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentStorageImage
-
-Declares a storage image (OpTypeImage with Sampled operand of 2) argument for Kernel. The Vulkan -descriptor type should use VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

9

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentSampler
-
-Declares a sampler argument for Kernel. The Vulkan descriptor type should use VK_DESCRIPTOR_TYPE_SAMPLER.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

10

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentWorkgroup
-
-Declares a workgroup buffer argument for Kernel. This argument is instantiated as a Workgroup storage-class -array. It should be sized using the SpecId operand. The size of the array elements is indicated by the -ElemSize operand.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-SpecId must be a 32-bit unsigned integer OpConstant specifying the specialization -id used to size the argument.
-
-ElemSize must be a 32-bit unsigned integer OpConstant specifying the element size of -the argument in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

11

<id>
-Kernel

<id>
-Ordinal

<id>
-SpecId

<id>
-ElemSize

Optional
-<id>
-ArgInfo

- ------ - - - - - - - - - - - -

SpecConstantWorkgroupSize
-
-Declares the specialization ids used to set the WorkgroupSize builtin.
-
-X must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the x dimension.
-
-Y must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the y dimension.
-
-Z must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the z dimension.

12

<id>
-X

<id>
-Y

<id>
-Z

- ------ - - - - - - - - - - - -

SpecConstantGlobalOffset
-
-Declares the specialization ids used to specify the global offset.
-
-X must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the x dimension.
-
-Y must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the y dimension.
-
-Z must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the z dimension.

13

<id>
-X

<id>
-Y

<id>
-Z

- ---- - - - - - - - - - -

SpecConstantWorkDim
-
-Declares the specialization id used to specify the work dimensions.
-
-Dim must be a 32-bit unsigned integer OpConstant specifying the specialization id -of the dimensions.

14

<id>
-Dim

- ----- - - - - - - - - - - -

PushConstantGlobalOffset
-
-Declares a PushConstant entry to specify the global offset of a kernel. All kernels from -this module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

15

<id>
-Offset

<id>
-Size

- ----- - - - - - - - - - - -

PushConstantEnqueuedLocalSize
-
-Declares a PushConstant entry to specify the enqueued local size of a kernel. All kernels from -this module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

16

<id>
-Offset

<id>
-Size

- ----- - - - - - - - - - - -

PushConstantGlobalSize
-
-Declares a PushConstant entry to specify the global size of a kernel. All kernels from this -module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

17

<id>
-Offset

<id>
-Size

- ----- - - - - - - - - - - -

PushConstantRegionOffset
-
-Declares a PushConstant entry to specify the region offset of a kernel. All kernels from this -module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

18

<id>
-Offset

<id>
-Size

- ----- - - - - - - - - - - -

PushConstantNumWorkgroups
-
-Declares a PushConstant entry to specify the number of workgroups enqueued. All kernels from -this module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

19

<id>
-Offset

<id>
-Size

- ----- - - - - - - - - - - -

PushConstantRegionGroupOffset
-
-Declares a PushConstant entry to specify the region group offset of a kernel. All kernels from -this module should include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block -in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the push -constant in bytes.

20

<id>
-Offset

<id>
-Size

- ------ - - - - - - - - - - - -

ConstantDataStorageBuffer
-
-Declares a storage buffer to hold constant data specified by Data. All kernels from this module -should include a descriptor with the type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER that is backed by -a buffer initialized with Data.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Data must be an OpString that encodes the hexbytes of the constant data.

21

<id>
-DescriptorSet

<id>
-Binding

<id>
-Data

- ------ - - - - - - - - - - - -

ConstantDataUniform
-
-Declares a uniform buffer to hold constant data specified by Data. All kernels from this module -should include a descriptor with the type VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER that is backed by -a buffer initialized with Data.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Data must be an OpString that encodes the hexbytes of the constant data.

22

<id>
-DescriptorSet

<id>
-Binding

<id>
-Data

- ------ - - - - - - - - - - - -

LiteralSampler
-
-Declares a literal sampler used by the module. All kernels from this module should include a -descriptor with the type VK_DESCRIPTOR_TYPE_SAMPLER that has the properties encoded by Mask -(see Sampler.h).
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Mask must be a 32-bit unsigned integer OpConstant specifying the encoding of coordinate -normalization, address mode and filter mode.

23

<id>
-DescriptorSet

<id>
-Binding

<id>
-Mask

- ------- - - - - - - - - - - - - -

PropertyRequiredWorkgroupSize
-
-Declares the required workgroup size of Kernel.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-X must be a 32-bit unsigned integer OpConstant specifying the x dimension.
-
-Y must be a 32-bit unsigned integer OpConstant specifying the y dimension.
-
-Z must be a 32-bit unsigned integer OpConstant specifying the z dimension.

24

<id>
-Kernel

<id>
-X

<id>
-Y

<id>
-Z

- ---- - - - - - - - - - -

SpecConstantSubgroupMaxSize
-
- Missing before version 2.
-
-Declares the specialization id used to set the maximum size of a subgroup, -i.e. the value returned by get_max_sub_group_size().
-
-Size must be a 32-bit unsigned integer OpConstant specifying the -specialization id for the value.

25

<id>
-Size

- -------- - - - - - - - - - - - - - -

ArgumentPointerPushConstant
-
- Missing before version 3.
-
-Declares a pointer argument for Kernel passed via PushConstant. This argument’s -offset and size should be included in the push constant range declared for this -kernel using the VK_SHADER_STAGE_COMPUTE_BIT flag bit.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the pointer in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

26

<id>
-Kernel

<id>
-Ordinal

<id>
-Offset

<id>
-Size

Optional
-<id>
-ArgInfo

- ---------- - - - - - - - - - - - - - - - -

ArgumentPointerUniform
-
- Missing before version 3.
-
-Declares a pointer argument for Kernel passed in a Uniform buffer. -The Vulkan descriptor type should use VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER. This -argument may share a descriptor set and binding with other users.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the pointer in bytes.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

27

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

<id>
-Offset

<id>
-Size

Optional
-<id>
-ArgInfo

- ------ - - - - - - - - - - - -

ProgramScopeVariablesStorageBuffer
-
- Missing before version 3.
-
-Declares a storage buffer to hold program scope variables. All kernels from this module -should include a descriptor with the type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER that is backed by -a buffer initialized with Data and shared between all instances of kernels created from -this module.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Data must be an OpString that encodes the hexbytes of the data used to initialize the buffer.

28

<id>
-DescriptorSet

<id>
-Binding

<id>
-Data

- ------ - - - - - - - - - - - -

ProgramScopeVariablePointerRelocation
-
- Missing before version 3.
-
-Declares a relocation for a pointer into the program scope variables storage buffer -initialized with the address of another program scope variable.
-
-ObjectOffset must be a 32-bit unsigned integer OpConstant specifying the offset into the program -scope variable storage buffer at which the object pointed to resides.
-
-PointerOffset must be a 32-bit unsigned integer OpConstant specifying the offset into the program -scope variable storage buffer at which the pointer resides.
-
-PointerSize must be a 32-bit unsigned integer OpConstant specifying the size of the pointer -stored in the program scope variable storage buffer.

29

<id>
-ObjectOffset

<id>
-PointerOffset

<id>
-PointerSize
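As an illustrative sketch only (not part of the specification), a runtime could apply one such relocation to a host copy of the program scope variables buffer as shown below. The names `decode_hexbytes`, `apply_relocation` and `buffer_address` are hypothetical, and two details are assumptions: that Data is a plain string of hexadecimal digit pairs, and that pointers are stored little-endian.

```python
def decode_hexbytes(data: str) -> bytearray:
    """Decode the Data operand, assumed to be plain hex digit pairs."""
    return bytearray(bytes.fromhex(data))


def apply_relocation(storage: bytearray, buffer_address: int,
                     object_offset: int, pointer_offset: int,
                     pointer_size: int) -> None:
    """Store the address of the pointed-to object into the pointer slot.

    buffer_address is the (hypothetical) device address of the storage
    buffer. Assumption: pointers are encoded little-endian.
    """
    target = buffer_address + object_offset
    storage[pointer_offset:pointer_offset + pointer_size] = target.to_bytes(
        pointer_size, "little")
```

The relocation is applied after the buffer has been initialized from Data, because the pointer value depends on where the buffer ends up in memory.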

- ------- - - - - - - - - - - - - -

ImageArgumentInfoChannelOrderPushConstant
-
- Missing before version 3.
-
-Declares a PushConstant location to pass the -channel order -of the image that is argument Ordinal to Kernel. The offset and size should -be included in the push constant range declared for this kernel using the -VK_SHADER_STAGE_COMPUTE_BIT flag bit.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the image -argument ordinal using zero-based counting.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the channel order in bytes.

30

<id>
-Kernel

<id>
-Ordinal

<id>
-Offset

<id>
-Size

- ------- - - - - - - - - - - - - -

ImageArgumentInfoChannelDataTypePushConstant
-
- Missing before version 3.
-
-Declares a PushConstant location to pass the -channel data type -of the image that is argument Ordinal to Kernel. The offset and size should -be included in the push constant range declared for this kernel using the -VK_SHADER_STAGE_COMPUTE_BIT flag bit.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the image -argument ordinal using zero-based counting.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the channel data type in bytes.

31

<id>
-Kernel

<id>
-Ordinal

<id>
-Offset

<id>
-Size

- --------- - - - - - - - - - - - - - - -

ImageArgumentInfoChannelOrderUniform
-
- Missing before version 3.
-
-Declares a location in a Uniform buffer to pass the -channel order -of the image that is argument Ordinal to Kernel. The Vulkan descriptor type -should use VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER. This argument may share a -descriptor set and binding with other users.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the image argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the channel order in bytes.

32

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

<id>
-Offset

<id>
-Size

- --------- - - - - - - - - - - - - - - -

ImageArgumentInfoChannelDataTypeUniform
-
- Missing before version 3.
-
-Declares a location in a Uniform buffer to pass the -channel data type -of the image that is argument Ordinal to Kernel. The Vulkan descriptor type -should use VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER. This argument may share a -descriptor set and binding with other users.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the image argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the channel data type in bytes.

33

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

<id>
-Offset

<id>
-Size

- -------- - - - - - - - - - - - - - -

ArgumentStorageTexelBuffer
-
-Missing before version 4.
-
-Declares a storage texel buffer (OpTypeImage with Dim operand of Buffer) -argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

34

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- -------- - - - - - - - - - - - - - -

ArgumentUniformTexelBuffer
-
-Missing before version 4.
-
-Declares a uniform texel buffer (OpTypeImage with Dim operand of Buffer) -argument for Kernel. The Vulkan descriptor type should use -VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER.
-
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding.
-
-ArgInfo must be an ArgumentInfo extended instruction from the same import.

35

<id>
-Kernel

<id>
-Ordinal

<id>
-DescriptorSet

<id>
-Binding

Optional
-<id>
-ArgInfo

- ------ - - - - - - - - - - - -

ConstantDataPointerPushConstant
-
- Missing before version 5.
-
-Declares a PushConstant entry to specify the physical address of a buffer -containing constants for this module. All kernels from this module should -include a push constant range that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in -the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the -push constant in bytes.
-
-Data must be an OpString that encodes the hexbytes of the constant data.

36

<id>
-Offset

<id>
-Size

<id>
-Data

- ------ - - - - - - - - - - - -

ProgramScopeVariablePointerPushConstant
-
- Missing before version 5.
-
-Declares a PushConstant entry to specify the physical address of a buffer -containing program-scope variables for this module. The buffer must be -initialized with Data and shared between all instances of kernels created from -this module. All kernels from this module should include a push constant range -that encompasses the Offset and Size operands.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in -the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the -push constant in bytes.
-
-Data must be an OpString that encodes the hexbytes of the initialization -data.

37

<id>
-Offset

<id>
-Size

<id>
-Data

- ------ - - - - - - - - - - - -

PrintfInfo
-
- Missing before version 5.
-
-Declares a string associated with a printf builtin call, a unique ID for the -string, and an optional number of argument sizes.
-
-The string may represent the format string of a printf builtin call. In this -case ArgumentSizes must contain the storage size, in bytes, of each argument -that will be written to the buffer by the printf builtin, in the order that -they appear. The printf buffer is a buffer of unsigned 32-bit integers, so -arguments must be padded to reach a minimum storage size of 4 bytes if -necessary.
-
-The string will otherwise represent a string literal argument to another printf -call that is not the format string. A valid printf implementation should write -the value of this PrintfID to the printf buffer for a string literal argument -instead of the actual string data. ArgumentSizes has no meaning for this usage.
-
- A valid printf implementation should write the PrintfID associated with the - call to the next free location in the printf buffer, followed by the value of - each argument that appears after the format string.
-
-PrintfID must be a 32-bit unsigned integer OpConstant with a value unique to -each occurrence of the PrintfInfo instruction.
-
-FormatString must be an OpString.
-
-ArgumentSizes must be zero or more 32-bit unsigned integer OpConstants -representing the storage size of the corresponding printf arguments. The Nth -value corresponds to the Nth printf argument after the format string (i.e. -the N+1th argument to printf).

38

<id>
-PrintfID

<id>
-FormatString

Optional
-<id>, …​
-ArgumentSizes

- ------ - - - - - - - - - - - -

PrintfBufferStorageBuffer
-
- Missing before version 5.
-
-Declares a storage buffer to hold the output of the printf builtin. All -kernels from this module should include a descriptor with the type -VK_DESCRIPTOR_TYPE_STORAGE_BUFFER that is backed by a zero-initialized buffer -with a size of at least Size. The first 4 bytes of the buffer should be -zero-initialized.
-
-The buffer contains a series of 32-bit unsigned integers. The first integer of -the buffer represents the offset from the second integer to the next available -free memory. This may be incremented atomically to allocate regions of the -buffer in a thread-safe way. This can be used to determine the amount of data -written after a kernel has executed. Subsequent data written by the printf -builtin will be as described by PrintfInfo.
-
-DescriptorSet must be a 32-bit unsigned integer OpConstant specifying the -descriptor set.
-
-Binding must be a 32-bit unsigned integer OpConstant specifying the binding. -
-
-Size must be a 32-bit unsigned integer OpConstant specifying the buffer size -in bytes.

39

<id>
-DescriptorSet

<id>
-Binding

<id>
-Size
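The buffer layout described above can be sketched as a host-side reader that splits the buffer into one record per printf call. This is a minimal sketch under stated assumptions, not a reference implementation: `read_printf_records` and `arg_sizes_by_id` are hypothetical names, integers are assumed little-endian, and the unit of the leading offset (which the text does not state) is exposed as the `offset_unit` parameter, with 32-bit words assumed by default.

```python
import struct


def read_printf_records(buffer: bytes, arg_sizes_by_id: dict,
                        offset_unit: int = 4) -> list:
    """Split a printf buffer into (printf_id, [raw argument bytes]) records.

    arg_sizes_by_id maps each PrintfID to its ArgumentSizes list.
    Assumptions: little-endian integers; the leading offset counts units
    of offset_unit bytes, measured from the second integer.
    """
    (used,) = struct.unpack_from("<I", buffer, 0)
    end = min(4 + used * offset_unit, len(buffer))
    pos = 4  # written data starts at the second integer
    records = []
    while pos + 4 <= end:
        (printf_id,) = struct.unpack_from("<I", buffer, pos)
        pos += 4
        args = []
        # An unknown PrintfID raises KeyError; a real reader would handle it.
        for size in arg_sizes_by_id[printf_id]:
            args.append(buffer[pos:pos + size])
            pos += size
        records.append((printf_id, args))
    return records
```

Formatting each record against its format string is left out; the sketch only shows how the leading offset and the per-call PrintfID delimit the data.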

- ------ - - - - - - - - - - - -

PrintfBufferPointerPushConstant
-
- Missing before version 5.
-
-Declares a PushConstant entry to specify the physical address of a buffer -to hold the output of the printf builtin. All kernels from this module should -include a push constant range that encompasses the Offset and Size operands. -The buffer should have a size of at least BufferSize, and the first 4 bytes -should be zero-initialized.
-
-The usage of the buffer is as described for PrintfBufferStorageBuffer.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in -the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the -push constant in bytes.
-
-BufferSize must be a 32-bit unsigned integer OpConstant specifying the -buffer size in bytes.

40

<id>
-Offset

<id>
-Size

<id>
-BufferSize

- ------- - - - - - - - - - - - - -

NormalizedSamplerMaskPushConstant
-
- Missing before version 6.
-
-Declares a PushConstant entry to specify the sampler mask of a non-literal -sampler. It means that the kernel accesses 3D images with this sampler, -but Vulkan does not allow accessing 3D images with a sampler using unnormalized coordinates. Clspv generates code to normalize the coordinates. -Clspv chooses at runtime whether to use the original coordinates or the -normalized ones, depending on the mask of the sampler. -All kernels from this module should include a push constant range that -encompasses the Offset and Size operands. -
-Kernel must be a Kernel extended instruction from the same import.
-
-Ordinal must be a 32-bit unsigned integer OpConstant specifying the argument ordinal using -zero-based counting.
-
-Offset must be a 32-bit unsigned integer OpConstant specifying the offset in -the block in bytes.
-
-Size must be a 32-bit unsigned integer OpConstant specifying the size of the -push constant in bytes.
-

41

<id>
-Kernel

<id>
-Ordinal

<id>
-Offset

<id>
-Size

-
-

Kernel Property Flags

- ---- - - - - - - - - - - - - - - - - -
ValueFlag Name

0

None

1 << 0

MayUsePrintf

-
-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

6

2023-10-11

Romaric Jodin

Add support for sampler mask push constants.

5

2022-12-22

Callum Fare and Finlay Marno

Add support for module-scope buffer push constants and printf. Add NumArguments, Flags, and Attributes to the Kernel instruction.

4

2022-10-04

Kévin Petit

Add support for texel buffer arguments.

3

2022-06-26

Kévin Petit

Add support for pointer arguments, program scope variables and image channel order and data type queries.

2

2021-10-25

Kévin Petit

Add SpecConstantSubgroupMaxSize

1

2020-07-27

Alan Baker

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + nonsemantic/NonSemantic.ClspvReflection.html + + +

nonsemantic/NonSemantic.ClspvReflection.html

+ + diff --git a/nonsemantic/NonSemantic.DebugBreak.html b/nonsemantic/NonSemantic.DebugBreak.html index 6c3112e..8a57d74 100644 --- a/nonsemantic/NonSemantic.DebugBreak.html +++ b/nonsemantic/NonSemantic.DebugBreak.html @@ -1,217 +1,12 @@ - - - - - - - -SPIR-V NonSemantic DebugBreak Instructions - - - - - -
-
-
-
-

Version 1.00

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Qingyuan Zheng, NVIDIA Corporation

    -
  • -
  • -

    Ashwin Lele, NVIDIA Corporation

    -
  • -
  • -

    Jeff Bolz, NVIDIA Corporation

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2022 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Not Yet Approved by the SPIR-V Working Group

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2022-08-18

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.6 Revision 2.

-
-
-

This instruction set requires SPIR-V 1.0 and SPV_KHR_non_semantic_info.

-
-
-
-
-

1. Introduction

-
-
-

This specifies the NonSemantic.DebugBreak extended instruction set. It -provides a DebugBreak instruction which is a hint for any attached -shader debugger to hit a breakpoint.

-
-
-

Import this extended instruction set using an OpExtInstImport -"NonSemantic.DebugBreak" instruction.

-
-
-
-
-

2. Binary Form

-
- ------ - - - - - - - - - - - -

DebugBreak
-
-Hint the attached shader debugger to trigger a breakpoint and -pause the execution. If no debugger is available, this instruction -should be ignored.
-
-Return Type must be OpTypeVoid.

Capability:

1

1

-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2022-08-18

Qingyuan Zheng

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + nonsemantic/NonSemantic.DebugBreak.html + + +

nonsemantic/NonSemantic.DebugBreak.html

+ + diff --git a/nonsemantic/NonSemantic.DebugPrintf.html b/nonsemantic/NonSemantic.DebugPrintf.html index d68647c..1299995 100644 --- a/nonsemantic/NonSemantic.DebugPrintf.html +++ b/nonsemantic/NonSemantic.DebugPrintf.html @@ -1,220 +1,12 @@ - - - - - - - -SPIR-V NonSemantic DebugPrintf Instructions - - - - - -
-
-
-
-

Version 1.00

-
-
-
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-
-
-

Contributors

-
-
-
    -
  • -

    Jeff Bolz, NVIDIA Corporation

    -
  • -
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2020 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Not Yet Approved by the SPIR-V Working Group

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2020-02-11

Revision

1

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This instruction set requires SPIR-V 1.0.

-
-
-
-
-

1. Introduction

-
-
-

This specifies the NonSemantic.DebugPrintf extended instruction set. It -provides a DebugPrintf instruction which is intercepted by the Vulkan -validation layers and replaced with code to output the string to the debug -output log.

-
-
-

Import this extended instruction set using an OpExtInstImport -"NonSemantic.DebugPrintf" instruction.

-
-
-
-
-

2. Binary Form

-
- ------ - - - - - - - - - - - - -

DebugPrintf
-
-Writes output to an implementation-defined stream under control of the string -pointed to by Format, which can include format specifiers that control how -subsequent arguments are converted for output. If there are insufficient -arguments for the format specifiers, the behavior is undefined. If the format -is exhausted while arguments remain, the excess arguments are ignored. -Interpretation of the format specifiers is specified by the client API.
-
-Return Type must be OpTypeVoid.
-
-Format must be an OpString.

Capability:

2+variable

1

<id>
-Format

Optional -<id>, <id>, …​

-
-
-
-

Issues

-
-
-

None.

-
-
-
-
-

Revision History

-
- ------ - - - - - - - - - - - - - - - - -
RevDateAuthorChanges

1

2020-02-11

Jeff Bolz

Initial revision

-
-
-
- - \ No newline at end of file + + + + + + nonsemantic/NonSemantic.DebugPrintf.html + + +

nonsemantic/NonSemantic.DebugPrintf.html

+ + diff --git a/nonsemantic/NonSemantic.Shader.DebugInfo.100.html b/nonsemantic/NonSemantic.Shader.DebugInfo.100.html index 9747541..19eacf6 100644 --- a/nonsemantic/NonSemantic.Shader.DebugInfo.100.html +++ b/nonsemantic/NonSemantic.Shader.DebugInfo.100.html @@ -1,3330 +1,12 @@ - - - - - - - -SPIR-V NonSemantic Shader DebugInfo Instructions - - - - - -
-
-

Contact

-
-
-

To report problems with this extension, please open a new issue at:

-
- -
-

Contributors and Acknowledgments

-
-
    -
  • -

    Baldur Karlsson, Valve

    -
  • -
-
-
-

Author of original OpenCL.DebugInfo.100 specification.

-
-
-
    -
  • -

    Alexey Sotkin, Intel

    -
  • -
-
-
-

Contributors to original OpenCL.DebugInfo.100 specification.

-
-
-
    -
  • -

    Yaxun Liu, AMD

    -
  • -
  • -

    Brian Sumner, AMD

    -
  • -
  • -

    Ben Ashbaugh, Intel

    -
  • -
  • -

    Alexey Bader, Intel

    -
  • -
  • -

    Raun Krisch, Intel

    -
  • -
  • -

    Pratik Ashar, Intel

    -
  • -
  • -

    John Kessenich, Google

    -
  • -
  • -

    David Neto, Google

    -
  • -
  • -

    Neil Henning, Codeplay

    -
  • -
  • -

    Kerch Holt, Nvidia

    -
  • -
  • -

    Jaebaek Seo, Google

    -
  • -
  • -

    Spencer Fricke, LunarG

    -
  • -
-
-
-
-
-
-

Notice

-
-
-

Copyright (c) 2019-2024 The Khronos Group Inc. Copyright terms at -http://www.khronos.org/registry/speccopyright.html

-
-
-
-
-

Status

-
-
-
    -
  • -

    Complete

    -
  • -
  • -

    Approved by the SPIR Working Group: 2021-02-05

    -
  • -
  • -

    Ratified by the Khronos Group: 2021-03-19

    -
  • -
-
-
-
-
-

Version

-
- ---- - - - - - - - - - - -

Last Modified Date

2024-10-08

Revision

11

-
-
-
-

Dependencies

-
-
-

This extension is written against the SPIR-V Specification, -Version 1.4 Revision 1.

-
-
-

This instruction set requires SPIR-V 1.0.

-
-
-
-
-

Introduction

-
-
-

This is the specification of the NonSemantic.Shader.DebugInfo.100 extended instruction -set.

-
-
-

This extended instruction set is imported into a SPIR-V module in the following -manner:

-
-
-

<extinst-id> OpExtInstImport "NonSemantic.Shader.DebugInfo.100"

-
-
-

The instructions below are capable of conveying debug information about the -source program.

-
-
-

The design guidelines for these instructions are:

-
-
-
    -
  • -

    Similarity with OpenCL.DebugInfo.100, to re-use its tooling and benefit from its design -work. To aid in future compatibility, new extended instructions in this extension begin -at number 100.

    -
  • -
  • -

    Compatibility with rules regarding non-semantic instruction sets

    -
  • -
  • -

    Expansion to handle cases needed for Vulkan SPIR-V modules

    -
  • -
-
-
-

This is a non-normative list of changes to the OpenCL.DebugInfo.100 specification:

-
-
-
    -
  • -

    OpExtInst instructions can no longer appear in any place in function bodies, but only -within the valid locations inside a block (i.e. after OpPhi, before merge/branch -instructions).

    -
  • -
  • -

    Forward references in any instruction are disallowed.

    -
  • -
  • -

    As the result of the above:

    -
    - -
    -
  • -
  • -

    DebugDeclare has an Indices parameter with the same meaning as -DebugValue. This parameter is optional and so tools can treat it as -if it were present in OpenCL.DebugInfo.100 too but with no values.

    -
  • -
  • -

    All literal parameters are passed as OpConstant values.

    -
  • -
  • -

    New instructions: DebugSourceContinued, -DebugLine, DebugNoLine, -DebugBuildIdentifier, -DebugStoragePath, DebugEntryPoint, -DebugTypeMatrix.

    -
  • -
  • -

    New flag FlagUnknownPhysicalLayout to indicate that implementations may have a different -physical layout for composite types than specified.

    -
  • -
  • -

    DebugTypeBasic now takes a Flags operand to allow specifying -FlagUnknownPhysicalLayout.

    -
  • -
-
-
-

Terms

- -
-

Local variable: A variable that is invisible in some -lexical scopes. It depends on the definition of a local -variable in the high-level language.

-
-
-

DWARF: The DWARF Debugging Standard, -which is a debugging file format used by many compilers and debuggers to -support source level debugging.

-
-
-
-
-
-

Binary Form

-
-
-

This section contains the semantics of the debug info extended instructions -using the OpExtInst instruction.

-
-
-

All Name operands are the <id> of OpString instructions, which represent -the name of the entry (type, variable, function, etc.) as it appears in the -source program.
-
-Result Type of all instructions below is the <id> of OpTypeVoid.
-
-Set operand in all instructions below is the result of an OpExtInstImport - instruction.
-
-DebugScope, DebugNoScope, -DebugDeclare, DebugValue, -DebugLine, DebugNoLine, and -DebugFunctionDefinition -instructions can interleave with the instructions within a function, but must appear -within valid locations in a block as required by SPV_KHR_non_semantic_info. In -particular this means they cannot come before any OpPhi or function-level variable -declarations in a block, and they cannot come after a Merge Instruction.

-
-
-

DebugLine and DebugNoLine cannot appear outside -of a block. Line number information for global objects such as variable declarations -should be specified using the line and column values within those declarations.

-
-
-

All other instructions from this extended instruction set should be located -after the logical layout section 9 "All type declarations (OpTypeXXX instructions), -all constant instructions, and all global variable declarations …​" and before -section 10 "All function declaration" in section 2.4 -Logical Layout of a Module -of the core SPIR-V specification.
-
-Debug info for source language opaque types is represented by -DebugTypeComposite without Members operands. -Size of the composite must be DebugInfoNone and Name -must start with @ symbol to avoid clashes with user defined names.

-
-
-

Removing Instructions

-
-

All instructions in this extended set have no semantic impact and can be -safely removed. This is easily done if all debug instructions are removed -together, at once. However, when removing a subset, for example, inlining -a function, there may be dangling references to <id> that have been removed. -These can be replaced with the Result <id> of the -DebugInfoNone instruction.

-
-
-

All <id> referred to must be defined (dangling references are not allowed).

-
-
-
-

Forward references

-
-

Forward references are not allowed, to be compliant with SPV_KHR_non_semantic_info.

-
-
-
-
-
-

Enumerations

-
-
-

Instruction Enumeration

- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Instruction
- number
Instruction name

0

DebugInfoNone

1

DebugCompilationUnit

2

DebugTypeBasic

3

DebugTypePointer

4

DebugTypeQualifier

5

DebugTypeArray

6

DebugTypeVector

7

DebugTypedef

8

DebugTypeFunction

9

DebugTypeEnum

10

DebugTypeComposite

11

DebugTypeMember

12

DebugTypeInheritance

13

DebugTypePtrToMember

14

DebugTypeTemplate

15

DebugTypeTemplateParameter

16

DebugTypeTemplateTemplateParameter

17

DebugTypeTemplateParameterPack

18

DebugGlobalVariable

19

DebugFunctionDeclaration

20

DebugFunction

21

DebugLexicalBlock

22

DebugLexicalBlockDiscriminator

23

DebugScope

24

DebugNoScope

25

DebugInlinedAt

26

DebugLocalVariable

27

DebugInlinedVariable

28

DebugDeclare

29

DebugValue

30

DebugOperation

31

DebugExpression

32

DebugMacroDef

33

DebugMacroUndef

34

DebugImportedEntity

35

DebugSource

101

DebugFunctionDefinition

102

DebugSourceContinued

103

DebugLine

104

DebugNoLine

105

DebugBuildIdentifier

106

DebugStoragePath

107

DebugEntryPoint

108

DebugTypeMatrix

-
-
-

Debug Info Flags

- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ValueFlag Name

1 << 0

FlagIsProtected

1 << 1

FlagIsPrivate

1<<0 | 1<<1

FlagIsPublic

1 << 2

FlagIsLocal

1 << 3

FlagIsDefinition

1 << 4

FlagFwdDecl

1 << 5

FlagArtificial

1 << 6

FlagExplicit

1 << 7

FlagPrototyped

1 << 8

FlagObjectPointer

1 << 9

FlagStaticMember

1 << 10

FlagIndirectVariable

1 << 11

FlagLValueReference

1 << 12

FlagRValueReference

1 << 13

FlagIsOptimized

1 << 14

FlagIsEnumClass

1 << 15

FlagTypePassByValue

1 << 16

FlagTypePassByReference

1 << 17

FlagUnknownPhysicalLayout
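One consequence of the table above is that FlagIsPublic occupies both low bits, so the access specifier has to be read as a two-bit field and not tested bit by bit. The following is a small sketch of that decoding. The constant and function names are hypothetical, and the "none" result for an empty field is an assumption.

```python
# Values copied from the Debug Info Flags table.
FLAG_IS_PROTECTED = 1 << 0
FLAG_IS_PRIVATE = 1 << 1
FLAG_IS_PUBLIC = (1 << 0) | (1 << 1)
FLAG_IS_DEFINITION = 1 << 3

# The access specifier lives in the two lowest bits.
ACCESS_MASK = FLAG_IS_PUBLIC


def access_specifier(flags: int) -> str:
    """Decode the access specifier from a Debug Info Flags value."""
    names = {
        0: "none",  # assumption: no access flag set
        FLAG_IS_PROTECTED: "protected",
        FLAG_IS_PRIVATE: "private",
        FLAG_IS_PUBLIC: "public",
    }
    return names[flags & ACCESS_MASK]
```

A check such as `flags & FLAG_IS_PROTECTED` alone would also be true for public entities, which is why the whole field is compared.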

-
-
-

Build Identifier Flags

- - ----- - - - - - - - - - - - - - - -
ValueFlag NameDescription

1 << 0

IdentifierPossibleDuplicates

The same identifier may be generated for - different input sources that compile to - the same result, and so is not fully - unique. This could be e.g. multiple - different source code variations which - compile to the exact same SPIR-V binary.

-
-
-

Base Type Attribute Encodings

-
-

Used by DebugTypeBasic

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Encoding code name

0

Unspecified

1

Address

2

Boolean

3

Float

4

Signed

5

SignedChar

6

Unsigned

7

UnsignedChar

-
-
-

Composite Types

-
-

Used by DebugTypeComposite

-
- ---- - - - - - - - - - - - - - - - - - - - -
Tag code name

0

Class

1

Structure

2

Union

-
-
-

Type Qualifiers

-
-

Used by DebugTypeQualifier

-
- ---- - - - - - - - - - - - - - - - - - - - - - - - -
Qualifier tag code name

0

ConstType

1

VolatileType

2

RestrictType

3

AtomicType

-
-
-

Debug Operations

-
-

These operations are used to form a DWARF expression. -Such expressions provide information about the current location -(described by DebugDeclare) or value -(described by DebugValue) of a variable. -Operations in an expression are to be applied on a stack. -Initially, the stack contains one element: the address or value of the source variable.
-Used by DebugOperation

-
- ------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Operation encodingsNo. of OperandsDescription

0

Deref

0

Pops the top stack entry, treats it as an address, pushes the value retrieved from that address.

1

Plus

0

Pops the top two entries from the stack, adds them together and pushes the result.

2

Minus

0

Pops the top two entries from the stack, subtracts the former top entry from the former second to top entry and pushes the result.

3

PlusUconst

1

Pops the top stack entry, adds the addend operand to it, and pushes the result. - The operand must be a single 32-bit integer OpConstant.

4

BitPiece

2

Describes an object or value that may be contained in part of a register or stored in more than one location. - The first operand is the offset in bits from the location defined by the preceding operation. - The second operand is the size of the piece in bits. - The operands must each be a single 32-bit integer OpConstant.

5

Swap

0

Swaps the top two stack values.

6

Xderef

0

Pops the top two entries from the stack. - Treats the former top entry as an address and the former second to top entry as an address space. - The value retrieved from the address in the given address space is pushed.

7

StackValue

0

Describes an object that doesn’t exist in memory but whose value is known and is at the top of the DWARF expression stack.

8

Constu

1

Pushes a constant value onto the stack. The value operand must be a single 32-bit integer OpConstant.

9

Fragment

2

Has the same semantics as BitPiece, but the offset operand defines location within the source variable.
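The stack semantics in the table above can be sketched as a small evaluator. This is an illustrative sketch covering only the arithmetic and memory operations (Deref, Plus, Minus, PlusUconst, Swap, Constu). The function name `evaluate`, the tuple encoding of operations and the `read_memory` callback are assumptions made for the example, and operand values are taken as already-resolved integers where the instruction set itself uses OpConstant ids.

```python
def evaluate(expression, initial, read_memory=None):
    """Evaluate a list of (operation_name, operands) pairs on a stack.

    initial is the address or value of the source variable, which is the
    only element on the stack at the start. read_memory is a hypothetical
    callback mapping an address to the value stored there.
    """
    stack = [initial]
    for op, operands in expression:
        if op == "Deref":
            stack.append(read_memory(stack.pop()))
        elif op == "Plus":
            top, second = stack.pop(), stack.pop()
            stack.append(second + top)
        elif op == "Minus":
            # Former top entry is subtracted from the former second entry.
            top, second = stack.pop(), stack.pop()
            stack.append(second - top)
        elif op == "PlusUconst":
            stack.append(stack.pop() + operands[0])
        elif op == "Swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "Constu":
            stack.append(operands[0])
        else:
            raise NotImplementedError(op)
    return stack[-1]
```

For example, `evaluate([("PlusUconst", [4]), ("Deref", [])], base, read_memory)` would describe a value stored 4 bytes past the variable's address.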

-
-
-

Imported Entities

- - ---- - - - - - - - - - - - - - - - -
Tag code name

0

ImportedModule

1

ImportedDeclaration

-
-
-
-
-

Instructions

-
-
-

Missing Debugging Information

- -------- - - - - - - - - - - - - - -

DebugInfoNone
-
-Other instructions can refer to this one in case the debugging information is -unknown, not available, or not applicable.
-
-Result Type must be OpTypeVoid.

5

12

<id>
-Result Type

Result <id>

<id> Set

0

-
-
-

Debug Info Metadata

- ---------- - - - - - - - - - - - - - - - -

DebugBuildIdentifier
-
- A build identifier for the shader that can be used to tie debug information to a - SPIR-V module even if the two are separated, as long as the identifier is present in - both.
-
- When removing debug information from a module tools should preserve this instruction and - any DebugStoragePath, to allow users to locate the correct debug - information again.
-
- The identifier must be a lowercase hexadecimal string - digits and the characters [a-f] - - with at least 32 characters.
-
- Result Type must be OpTypeVoid.
-
- Identifier is an OpString holding the hexadecimal representation of a GUID for this - build.
-
- Flags is a 32-bit integer constant containing a value from the - BuildIdentifierFlags table.

7

12

<id>
-Result Type

Result <id>

<id> Set

105

<id> Identifier

<id> Flags
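The constraint on the identifier string (lowercase hexadecimal, at least 32 characters) is easy to check mechanically. Here is a minimal sketch of such a check. The function name `is_valid_build_identifier` is hypothetical, and the check covers only the string form, not whether the value is a usable GUID.

```python
import re

# Digits and the characters a-f, at least 32 of them, nothing else.
_BUILD_ID = re.compile(r"[0-9a-f]{32,}")


def is_valid_build_identifier(identifier: str) -> bool:
    """Return True if the string satisfies the stated identifier rules."""
    return _BUILD_ID.fullmatch(identifier) is not None
```

A tool that emits DebugBuildIdentifier could run a check like this before writing the OpString, for instance on the hexadecimal digest of a hash of the module.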

- --------- - - - - - - - - - - - - - - -

DebugStoragePath
-
- A hint for consumers as to where to store this shader’s debug information. If the debug - information has been split apart and is identified with - DebugBuildIdentifier, this path can provide a hint as to where - the debug information has been stored.
-
- It is optional, and may be automatically generated based on a common prefix and the - identifier itself.
-
- Interpretation of the path and the storage method are not specified here, but commonly - the path will be a relative path on disk, which is searched relative to externally agreed - search paths.
-
- Result Type must be OpTypeVoid.
-
- Path is an OpString holding the absolute or relative path to the stored SPIR-V - module.

6

12

<id>
-Result Type

Result <id>

<id> Set

106

<id> Path

-
-
-

Compilation Unit

- ------------ - - - - - - - - - - - - - - - - - -

DebugCompilationUnit
-
- Describe a source compilation unit. A compilation unit is the single source input to a - SPIR-V front-end after any preprocessing has occurred. Multiple compilation units can - be linked together to produce a SPIR-V module, and the same source file can be used for - multiple compilation units if different compilation settings are used each time.
-
- The Result <id> of this instruction represents a lexical scope.
-
- Result Type must be OpTypeVoid.
-
- Version is the version of the SPIR-V debug information format, stored in a 32-bit integer - OpConstant.
-
- DWARF Version is the version of the DWARF standard this specification is compatible - with, stored in a 32-bit integer OpConstant.
-
- Source is a DebugSource instruction representing the text of the initial input - file before pre-processing.
-
- Language is a 32-bit integer OpConstant. The value is the source programming language - of this particular compilation unit. Possible values of this operand are described in the - Source Language section of the core SPIR-V specification.

9

12

<id>
-Result Type

Result <id>

<id> Set

1

<id> Version

<id> DWARF version

<id> Source

<id> Language

- ---------- - - - - - - - - - - - - - - - -

DebugSource
-
- Describe the source program. It can be either the primary source file or a - file added via a #include directive.
-
- Result Type must be OpTypeVoid.
-
- File is an OpString holding the name of the source file including its full - path.
-
- Text is an OpString that contains text of the source program the SPIR-V - module is derived from.

6+

12

<id>
-Result Type

Result <id>

<id> Set

35

<id> File

Optional
- <id> Text

- --------- - - - - - - - - - - - - - - -

DebugSourceContinued
-
- Continue specifying source text from the previous instruction.
-
- The previous instruction must be a DebugSource or DebugSourceContinued instruction. - The previous instruction must use the same extended instruction set <id> as this one, - and it must contain some text string id.
-
- The text strings specified in both instructions are nul terminated, and the contents of - the string in this instruction are appended immediately before the nul in the - previous instruction’s string to form the joined text.
-
-Result Type must be OpTypeVoid.
-
- Text is an OpString that contains text to append.

6

12

<id>
-Result Type

Result <id>

<id> Set

102

<id> Text
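The joining rule for DebugSource and DebugSourceContinued can be sketched as follows. This is an illustration only: it assumes each text operand has been read out as a nul-terminated byte string, and the helper name `join_source_text` is hypothetical.

```python
from typing import Iterable


def join_source_text(fragments: Iterable[bytes]) -> bytes:
    """Join nul-terminated text fragments into one nul-terminated string.

    Each following fragment is placed before the nul of the text built so far,
    so the result carries exactly one terminating nul.
    """
    joined = b""
    for fragment in fragments:
        if not fragment.endswith(b"\x00"):
            raise ValueError("fragment is not nul terminated")
        joined += fragment[:-1]  # drop this fragment's own terminator
    return joined + b"\x00"
```

The first element is the Text of the DebugSource and the remaining elements are the Text operands of the DebugSourceContinued instructions in module order.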

- ------------ - - - - - - - - - - - - - - - - - -

DebugEntryPoint
-
- Describe the compilation environment for an OpEntryPoint.
-
- Result Type must be OpTypeVoid.
-
- Entry Point is the <id> of the DebugFunction corresponding - to the OpFunction referenced in the OpEntryPoint. This function must also have - a DebugFunctionDefinition in the first basic block - of that OpFunction.
-
- Compilation Unit is the <id> of the DebugCompilationUnit - that produced the entry point.
-
- Compiler Signature is an OpString describing the compiler and version used for - compilation.
-
- Command-line Arguments is an OpString containing the command line arguments passed to - the compiler.

9

12

<id>
-Result Type

Result <id>

<id> Set

107

<id> Entry Point

<id> Compilation Unit

<id> Compiler Signature

<id> Command-line Arguments

-
-
-

Type instructions

- ------------ - - - - - - - - - - - - - - - - - -

DebugTypeBasic
-
- Describe a basic data type.
-
- Result Type must be OpTypeVoid.
-
- Name is an OpString representing the name of the type as it appears in the - source program. May be empty.
-
- Size is an OpConstant with 32-bit or 64-bit integer type and its value is - the number of bits required to hold an instance of the type.
-
- Encoding is a 32-bit integer OpConstant describing how the base type is - encoded.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
- Note: If flags contains the FlagUnknownPhysicalLayout flag, the Size - is a placeholder value based on an assumed memory layout and may not correspond - to the exact size of the type by the implementation.

9

12

<id>
-Result Type

Result <id>

<id> Set

2

<id> Name

<id> Size

<id> Encoding

<id> Flags

- ----------- - - - - - - - - - - - - - - - - -

DebugTypePointer
-
-Describe a pointer or reference data type.
-
-Result Type must be OpTypeVoid.
-
-Base Type is the <id> of a debugging instruction that represents the pointee - type.
-
-Storage Class is a 32-bit integer OpConstant containing the class of the memory where - the object pointed to is allocated. Possible values of this operand are described in the - Storage Class section of the core SPIR-V specification.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.

8

12

<id>
-Result Type

Result <id>

<id> Set

3

<id> Base Type

<id> Storage Class

<id> Flags

- ---------- - - - - - - - - - - - - - - - -

DebugTypeQualifier
-
-Describe a const, volatile, or restrict qualified data type. -A type with multiple qualifiers is represented as a sequence of -DebugTypeQualifier instructions.
-
-Result Type must be OpTypeVoid.
-
-Base Type is a debug instruction that represents the type being qualified.
-
- Type Qualifier is a 32-bit integer constant containing a value from the - TypeQualifiers table.

7

12

<id>
-Result Type

Result <id>

<id> Set

4

<id> Base Type

<id> Type Qualifier

- ---------- - - - - - - - - - - - - - - - -

DebugTypeArray
-
- Describe an array data type.
-
-Result Type must be OpTypeVoid.
-
-Base Type is a debugging instruction that describes the element type of the - array.
-
-Component Count is the number of elements in the corresponding dimension of - the array. The number and order of Component Count operands must match with - the number and order of array dimensions as they appear in the source program. - Component Count must be a Result <id> of an OpConstant, - DebugGlobalVariable, or - DebugLocalVariable. If it is an OpConstant, its type - must be a 32-bit or 64-bit integer type. Otherwise its type must be - a DebugTypeBasic whose Size is 32 or 64 and whose - Encoding is Unsigned. If the OpConstant value is set to 0, this indicates - an array with an unknown size at compile time which is sized at runtime, - corresponding to the SPIR-V OpTypeRuntimeArray type.

7+

12

<id>
-Result Type

Result <id>

<id> Set

5

<id> Base Type

<id> Component Count, …​

- ---------- - - - - - - - - - - - - - - - -

DebugTypeVector
-
-Describe a vector data type.
-
-Result Type must be OpTypeVoid.
-
-Base Type is the <id> of a debugging instruction that describes the type of - element of the vector.
-
-Component Count is the <id> of a 32-bit integer OpConstant denoting the number of - elements in the vector.

7

12

<id>
-Result Type

Result <id>

<id> Set

6

<id> Base Type

<id>
- Component Count

- ----------- - - - - - - - - - - - - - - - - -

DebugTypeMatrix
-
-Describe a matrix data type.
-
-Result Type must be OpTypeVoid.
-
-Vector Type is the <id> of a debugging instruction that describes the type of - vector in the matrix.
-
-Vector Count is the <id> of a 32-bit integer OpConstant denoting the number of - vectors in the matrix.
-
-Column Major is the <id> of a boolean OpConstant denoting whether the matrix is -column major. If it is True then the matrix is column major with each Vector Type -representing a column and Vector Count giving the number of columns. If it is False -then correspondingly the matrix is row major with each vector being a row.

8

12

<id>
-Result Type

Result <id>

<id> Set

108

<id> Vector Type

<id>
- Vector Count

<id>
- Column Major
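How Column Major changes the reading of Vector Type and Vector Count can be sketched in a few lines. This assumes the component count of the Vector Type (taken from its DebugTypeVector) is already known; `matrix_shape` is a hypothetical helper name.

```python
def matrix_shape(vector_count: int, vector_size: int, column_major: bool):
    """Return (rows, columns) for a DebugTypeMatrix.

    vector_count -- value of the Vector Count operand
    vector_size  -- Component Count of the Vector Type
    column_major -- value of the Column Major operand
    """
    if column_major:
        # Each vector is a column, so Vector Count is the number of columns.
        return (vector_size, vector_count)
    # Each vector is a row, so Vector Count is the number of rows.
    return (vector_count, vector_size)
```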

- -------------- - - - - - - - - - - - - - - - - - - - -

DebugTypedef
-
-Describe a C/C++ typedef declaration.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString that represents a new name for the Base Type.
-
-Base Type is a debugging instruction representing the type for which a new - name is being declared.
-
-Source is a DebugSource instruction representing the text of the source program containing the typedef declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the typedef declaration.

11

12

<id>
-Result Type

Result <id>

<id> Set

7

<id> Name

<id> Base Type

<id> Source

<id> Line

<id> Column

<id> Parent

- ----------- - - - - - - - - - - - - - - - - -

DebugTypeFunction
-
-Describe a function type.
-
-Result Type must be OpTypeVoid.
-
- Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-Return Type is a debug instruction that represents the type of return value of - the function. If the function has no return value, this operand is - OpTypeVoid.
-
- Parameter Types are debug instructions that describe the type of parameters of - the function.

7+

12

<id>
-Result Type

Result <id>

<id> Set

8

<id> Flags

<id> Return Type

Optional <id>, <id>, …​ Parameter Types

- ----------------- - - - - - - - - - - - - - - - - - - - - - - -

DebugTypeEnum
-
-Describe an enumeration type.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString holding the name of the enumeration as it appears in the - source program.
-
-Underlying Type is a debugging instruction that describes the underlying type - of the enum in the source program. If the underlying type is not specified in - the source program, this operand must refer to - DebugInfoNone.
-
-Source is a DebugSource instruction representing the text of the source program containing the enum declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the enumeration declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the enumeration declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the enumeration type.
-
-Size is an OpConstant with 32-bit or 64-bit integer type and its value is - the number of bits required to hold an instance of the enumeration type.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-Enumerators are encoded as trailing pairs of Value and corresponding Name. -Values must be the <id> of OpConstant instructions, with a 32-bit integer result -type. Name must be the <id> of an OpString instruction.

13+

12

<id>
-Result Type

Result <id>

<id> Set

9

<id> Name

<id> Underlying Type

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Size

<id> Flags

<id> Value,
- <id> Name,
- <id> Value,
- <id> Name, …​
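The trailing enumerator operands alternate between Value and Name. A minimal sketch of decoding them follows, assuming the ids have already been resolved to their OpConstant values and OpString contents; `decode_enumerators` is a hypothetical name.

```python
def decode_enumerators(trailing_operands):
    """Split trailing DebugTypeEnum operands into (value, name) pairs."""
    if len(trailing_operands) % 2 != 0:
        raise ValueError("enumerators must come in Value/Name pairs")
    # Even positions hold Values, odd positions hold the matching Names.
    return [
        (trailing_operands[i], trailing_operands[i + 1])
        for i in range(0, len(trailing_operands), 2)
    ]
```

For example, the resolved operands `[0, "Red", 1, "Green"]` decode to `[(0, "Red"), (1, "Green")]`.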

- ------------------ - - - - - - - - - - - - - - - - - - - - - - - -

DebugTypeComposite
-
-Describe a structure, class, or union data type. The Result <id> of this - instruction represents a lexical scope.
-
-Result Type must be OpTypeVoid.
-
-Tag is the <id> of a 32-bit integer OpConstant with a value from the - Composite Types table that specifies the kind of the composite type.
-
-Name is an OpString holding the name of the type as it appears in the source - program.
-
-Source is a DebugSource instruction representing the text of the source program containing the type declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the type declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the type declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the composite type. It must be - one of the following: DebugCompilationUnit, - DebugFunction, - DebugLexicalBlock, or - DebugTypeComposite.
-
-Linkage Name is an OpString, holding the linkage name or mangled name of the - composite.
-
-Size is an OpConstant with 32-bit or 64-bit integer type and its value is - the number of bits required to hold an instance of the composite type.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-Members must be the <id>s of DebugTypeMember, - DebugFunction, - or DebugTypeInheritance. This could be a forward - reference.
-
- Note: If flags contains the FlagUnknownPhysicalLayout flag, the Size - is a placeholder value based on an assumed memory layout and may not correspond - to the exact size of the composite by the implementation. Size will be - greater than or equal to the highest Offset of any element in Members - plus that member's Size. The order of members in memory can be determined by - the order of their Offset parameter.
-
-Note: To represent a source language opaque type, this instruction must have no -Members operands, Size operand must be DebugInfoNone, - and Name must start with @ to avoid clashes with user defined names.

14+

12

<id>
-Result Type

Result <id>

<id> Set

10

<id> Name

Tag

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Linkage Name

<id> Size

<id> Flags

<id>, <id>, …​ Members
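The note on FlagUnknownPhysicalLayout gives two properties a consumer can rely on: members can be ordered by Offset, and Size is bounded below by the highest Offset plus that member's Size. The sketch below illustrates both, assuming member data has been collected into simple records; the `Member` type and function names are hypothetical.

```python
from typing import NamedTuple


class Member(NamedTuple):
    name: str
    offset_bits: int  # Offset operand of the DebugTypeMember
    size_bits: int    # Size operand of the DebugTypeMember


def members_in_memory_order(members):
    """Order members by their Offset, as the note above describes."""
    return sorted(members, key=lambda m: m.offset_bits)


def minimum_composite_size_bits(members):
    """Lower bound on the composite Size: highest Offset plus that member's Size."""
    if not members:
        return 0
    last = max(members, key=lambda m: m.offset_bits)
    return last.offset_bits + last.size_bits
```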

- ----------------- - - - - - - - - - - - - - - - - - - - - - - -

DebugTypeMember
-
-Describe a data member of a structure, class, or union.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString holding the name of the member as it appears in the - source program.
-
-Type is a debug type instruction that represents the type of the member.
-
-Source is a DebugSource instruction representing the text of the source program containing the member declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the member declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the member declaration appears.
-
-Offset is an OpConstant with integral type, and its value is the memory - offset in bits from the beginning of the Scope type.
-
-Size is an OpConstant with 32-bit or 64-bit integer type and its value is - the number of bits the member occupies within the Scope type.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-Value is an OpConstant representing the initialization value in the case of a - const static qualified member in C++.
-
- Note: If flags contains the FlagUnknownPhysicalLayout flag, the Size - and Offset are placeholder values based on an assumed memory layout and may - not correspond to the exact size of the composite by the implementation. Size - will be greater than zero.

13+

12

<id>
-Result Type

Result <id>

<id> Set

11

<id> Name

<id> Type

<id> Source

<id> Line

<id> Column

<id> Offset

<id> Size

<id> Flags

Optional <id> Value

- ------------ - - - - - - - - - - - - - - - - - -

DebugTypeInheritance
-
-Describe the inheritance relationship with a parent class or structure. -The Result of this instruction can be used as a member of a composite type.
-
-Result Type must be OpTypeVoid.
-
-Parent is a debug instruction representing a class or structure the - Child Type is derived from.
-
-Offset is an OpConstant with integral type and its value is the offset of the - Parent Type in bits in layout of the Child Type.
-
-Size is an OpConstant with 32-bit or 64-bit integer type and its value is - the number of bits the Parent type occupies within the Child Type.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.

9

12

<id>
-Result Type

Result <id>

<id> Set

12

<id> Parent

<id> Offset

<id> Size

<id> Flags

- ---------- - - - - - - - - - - - - - - - -

DebugTypePtrToMember
-
-Describe the type of an object that is a pointer to a structure or class member.
-
-Result Type must be OpTypeVoid.
-
-Member Type is a debug instruction representing the type of the member.
-
-Parent is a debug instruction, representing a structure or class type.

7

12

<id>
-Result Type

Result <id>

<id> Set

13

<id> Member Type

<id> Parent

-
-
-

Templates

- ---------- - - - - - - - - - - - - - - - -

DebugTypeTemplate
-
-Describe an instantiated template of class, struct, or function in C++.
-
-Result Type must be OpTypeVoid.
-
-Target is a debug instruction representing the class, struct, or function that has - template parameter(s).
-
-Parameters are debug instructions representing the template parameters for - this particular instantiation.
-

7

12

<id>
-Result Type

Result <id>

<id> Set

14

<id> Target

<id>…​ Parameters

- -------------- - - - - - - - - - - - - - - - - - - - -

DebugTypeTemplateParameter
-
-Describe a formal parameter of a C++ template instantiation.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString holding the name of the template parameter.
-
-Actual Type is a debug instruction representing the actual type of the formal - parameter for this particular instantiation.
-
- If this instruction describes a template value parameter, the Value is - represented by an OpConstant with an integer result type. For a template type - parameter, the Value operand must be the Result <id> of - DebugInfoNone.
-
-Source is a DebugSource instruction representing the text of the source program containing the template instantiation.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the template parameter declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the template parameter declaration appears.

11

12

<id>
-Result Type

Result <id>

<id> Set

15

<id> Name

<id> Actual Type

<id> Value

<id> Source

<id> Line

<id> Column

- ------------- - - - - - - - - - - - - - - - - - - -

DebugTypeTemplateTemplateParameter
-
- Describe a template template parameter of a C++ template instantiation.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString holding the name of the template template parameter.
-
-Template Name is an OpString holding the name of the template used as - template parameter in this particular instantiation.
-
-Source is a DebugSource instruction representing the text of the source program containing the template instantiation.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the template template parameter declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the template template parameter declaration appears.

10

12

<id>
-Result Type

Result <id>

<id> Set

16

<id> Name

<id> Template Name

<id> Source

<id> Line

<id> Column

- ------------- - - - - - - - - - - - - - - - - - - -

DebugTypeTemplateParameterPack
-
-Describe the expanded template parameter pack in a variadic template instantiation - in C++.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString holding the name of the template parameter pack.
-
-Source is a DebugSource instruction representing the text of the source program containing the template instantiation.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the template parameter pack declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the template parameter pack declaration appears.
-
-Template parameters are - DebugTypeTemplateParameters describing the - expanded parameter pack in the variadic template instantiation.
-

10+

12

<id>
-Result Type

Result <id>

<id> Set

17

<id> Name

<id> Source

<id> Line

<id> Column

<id>…​ Template parameters

-
-
-

Global Variables

- ------------------ - - - - - - - - - - - - - - - - - - - - - - - -

DebugGlobalVariable
-
- Describe a source global variable.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString, holding the name of the variable as it appears in the - source program.
-
-Type is a debug instruction that represents the type of the variable.
-
-Source is a DebugSource instruction representing the text of the source program containing the source global variable declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the source global variable declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the source global variable declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the source global variable - declaration. It must be one of the following: - DebugCompilationUnit, - DebugFunction, - DebugLexicalBlock, or - DebugTypeComposite.
-
-Linkage Name is an OpString, holding the linkage name of the variable.
-
-Variable can hold two kinds of values. First it can hold the <id> of the - source global variable or constant that is described by this instruction. If - the variable is optimized out, this operand can be the <id> of a - DebugExpression instruction that contains the constant - value of the variable that was optimized out. Otherwise this operand must be - DebugInfoNone.
-
-Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-If the source global variable represents a defining declaration - for a C++ static data member of a structure, class, or union, the optional - Static Member Declaration operand refers to the debugging type of the - previously declared variable, i.e. DebugTypeMember.

14+

12

<id>
-Result Type

Result <id>

<id> Set

18

<id> Name

<id> Type

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Linkage Name

<id> Variable

<id> Flags

Optional <id> Static Member Declaration

-
-
-

Functions

- ---------------- - - - - - - - - - - - - - - - - - - - - - -

DebugFunctionDeclaration
-
-Describe a function or method declaration.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString, holding the name of the function as it appears in the - source program.
-
-Type is a DebugTypeFunction instruction that - represents the type of the function.
-
-Source is a DebugSource instruction representing the text of the source program containing the function declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the function declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the function declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the function declaration.
-
-Linkage Name is an OpString, holding the linkage name of the function.
-
- Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-

13

12

<id>
-Result Type

Result <id>

<id> Set

19

<id> Name

<id> Type

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Linkage Name

<id> Flags

- ------------------ - - - - - - - - - - - - - - - - - - - - - - - -

DebugFunction
-
-Describe a function or method definition. The Result <id> of this instruction - represents a lexical scope.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString, holding the name of the function as it appears in the - source program.
-
-Type is a DebugTypeFunction instruction that - represents the type of the function.
-
-Source is a DebugSource instruction representing the text of the source program containing the function definition.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the function declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the function declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the function definition.
-
-Linkage Name is an OpString, holding the linkage name of the function.
-
- Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-Scope Line is the <id> of a 32-bit integer OpConstant denoting the line number in - the source program at which the function lexical scope begins.
-
-Declaration is a DebugFunctionDeclaration - that represents the non-defining declaration of the function.

14+

12

<id>
-Result Type

Result <id>

<id> Set

20

<id> Name

<id> Type

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Linkage Name

<id> Flags

<id> Scope Line

Optional <id> Declaration

- ---------- - - - - - - - - - - - - - - - -

DebugFunctionDefinition
-
-Describe a function definition. This instruction must appear in the entry basic block of -an OpFunction and there must be at most one such instruction.
-
- The referenced DebugFunction must not be referenced by any other - DebugFunctionDefinition.
-
-Result Type must be OpTypeVoid.
-
-Function is the <id> of a DebugFunction instruction that describes this function.
-
-Definition is the <id> of the OpFunction that this instruction is inside.

7

12

<id>
-Result Type

Result <id>

<id> Set

101

<id> Function

<id> Definition

-
-
-

Location Information

- ------------- - - - - - - - - - - - - - - - - - - -

DebugLexicalBlock
-
-Describe a lexical block in the source program. The Result <id> of this - instruction represents a lexical scope.
-
-Result Type must be OpTypeVoid.
-
-Source is a DebugSource instruction representing the text of the source program containing the lexical block.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the lexical block begins in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the lexical block begins.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope containing the lexical block. Entities - in the global lexical scope should have Parent referring to a - DebugCompilationUnit.
-
- The presence of the Name operand indicates that this instruction represents a - C++ namespace. This operand refers to an OpString holding the name of the - namespace. For anonymous C++ namespaces, the name must be an empty string.

9+

12

<id>
-Result Type

Result <id>

<id> Set

21

<id> Source

<id> Line

<id> Column

<id> Parent

Optional <id> Name

- ----------- - - - - - - - - - - - - - - - - -

DebugLexicalBlockDiscriminator
-
-Distinguish lexical blocks on a single line in the source program.
-
-Result Type must be OpTypeVoid.
-
-Source is a DebugSource instruction representing the text of the source program containing the lexical block.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope containing the lexical block.
-
-Discriminator is the <id> of a 32-bit integer OpConstant denoting a DWARF - discriminator value for instructions in the lexical block.

8

12

<id>
-Result Type

Result <id>

<id> Set

22

<id> Source

<id> Discriminator

<id> Parent

- ---------- - - - - - - - - - - - - - - - -

DebugScope
-
-Provide information about a previously declared - lexical scope. This instruction delimits the start of a - contiguous group of instructions, to be ended by any of the following: the next - end of block, the next DebugScope instruction, or the next DebugNoScope - instruction.
-
- This instruction must only appear within a block.
-
-Result Type must be OpTypeVoid.
-
- Scope is a previously declared lexical scope.
-
- Inlined is a DebugInlinedAt instruction that represents - the lexical scope and location to where Scope instructions - were inlined.

6+

12

<id>
-Result Type

Result <id>

<id> Set

23

<id> Scope

Optional
- <id> Inlined
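The delimiting rule for DebugScope and DebugNoScope can be sketched as a single pass over one block. This is an illustration under stated assumptions: instructions are modelled as `(name, operand)` tuples, the list covers exactly one block (so the end of the list stands for the end of the block), and `scopes_for_block` is a hypothetical name.

```python
def scopes_for_block(instructions):
    """Pair each non-scope instruction in one block with its active scope.

    instructions -- list of (name, operand) tuples; for "DebugScope" the
                    operand is the Scope, for other entries it is ignored.
    Returns a list of (name, scope_or_None).
    """
    current_scope = None  # no scope is active at the start of a block
    result = []
    for name, operand in instructions:
        if name == "DebugScope":
            current_scope = operand   # starts a new group, ending the previous one
        elif name == "DebugNoScope":
            current_scope = None      # ends the current group
        else:
            result.append((name, current_scope))
    return result
```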

- -------- - - - - - - - - - - - - - -

DebugNoScope
-
-Delimit the end of a contiguous group of instructions started by the - previous DebugScope.
-
- This instruction must only appear within a block.
-
- Result Type must be OpTypeVoid.

5

12

<id>
-Result Type

Result <id>

<id> Set

24

- ----------- - - - - - - - - - - - - - - - - -

DebugInlinedAt
-
-Declare where instructions grouped together by a DebugScope - instruction are inlined. When a function is inlined, a - DebugScope for the function or a part of the function can have - an Inlined operand, i.e. a DebugInlinedAt, which means the - set of instructions grouped by the DebugScope was inlined at - the Line operand of that DebugInlinedAt, inside the Scope - operand of that DebugInlinedAt.
-
-Result Type must be OpTypeVoid.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number - where the range of instructions was inlined.
-
-Scope is a lexical scope that contains Line.
-
-Inlined is a debug instruction representing the next level of inlining in case - of recursive inlining.

7+

12

<id>
-Result Type

Result <id>

<id> Set

25

<id> Line

<id> Scope

Optional <id> Inlined
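Because Inlined may refer to a further DebugInlinedAt, a consumer can recover the full inlining chain by following that operand until it is absent. A minimal sketch, assuming the instructions have been loaded into simple records; the `InlinedAt` type and `inlining_chain` function are hypothetical.

```python
from typing import NamedTuple, Optional


class InlinedAt(NamedTuple):
    line: int                               # Line operand
    scope: str                              # Scope operand (a label here)
    inlined: Optional["InlinedAt"] = None   # optional Inlined operand


def inlining_chain(at: Optional[InlinedAt]):
    """Return [(scope, line), ...] by following Inlined until it is absent."""
    chain = []
    while at is not None:
        chain.append((at.scope, at.line))
        at = at.inlined
    return chain
```

Starting from the DebugInlinedAt referenced by a DebugScope, the first entry is the immediate inline site and each later entry is the next level of inlining.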

- ------------- - - - - - - - - - - - - - - - - - - -

DebugLine
-
-Specify source-level line and column information. This information applies to all -following instructions, up to the first occurrence of any of the following: the -next end of block, the next DebugLine instruction, or the next DebugNoLine -instruction.
-
- This instruction must only appear within a block.
-
-Result Type must be OpTypeVoid.
-
- Source is a previously declared DebugSource indicating the file - containing the location.
-
- Line Start is the <id> of a 32-bit integer OpConstant denoting the source - line number where the location begins.
-
- Line End is the <id> of a 32-bit integer OpConstant denoting the source - line number where the location ends. This must be greater than or equal to Line Start.
-
- Column Start is the <id> of a 32-bit integer OpConstant denoting the source - column number where the location begins.
-
- Column End is the <id> of a 32-bit integer OpConstant denoting the source - column number where the location ends. This must be greater than or equal to Column Start - if Line Start equals Line End.

10

12

<id>
-Result Type

Result <id>

<id> Set

103

<id> Source

<id> Line Start

<id> Line End

<id> Column Start

<id> Column End
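The ordering constraints on the DebugLine operands can be expressed as a small check. This sketch assumes the Line End constraint is meant relative to Line Start, and that the four operands have been resolved to integers; `is_valid_line_range` is a hypothetical name.

```python
def is_valid_line_range(line_start: int, line_end: int,
                        column_start: int, column_end: int) -> bool:
    """Check the ordering rules for a DebugLine location."""
    # Line End must not precede Line Start.
    if line_end < line_start:
        return False
    # Columns are only compared when the location sits on a single line.
    if line_start == line_end and column_end < column_start:
        return False
    return True
```

Note that a multi-line location may legitimately end at a smaller column than it starts, which is why the column comparison is conditional.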

- -------- - - - - - - - - - - - - - -

DebugNoLine
-
-Discontinue any source-level line and column information specified by any previous -DebugLine instruction.
-
- This instruction must only appear within a block.
-
- Result Type must be OpTypeVoid.

5

12

<id>
-Result Type

Result <id>

<id> Set

104

-
-
-

Local Variables

- ---------------- - - - - - - - - - - - - - - - - - - - - - -

DebugLocalVariable
-
- Describe a local variable.
-
-Result Type must be OpTypeVoid.
-
-Name is an OpString, holding the name of the variable as it appears in the - source program.
-
-Type is a debugging instruction that represents the type of the - local variable.
-
-Source is a DebugSource instruction representing the text of the source program containing the local variable declaration.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the source line number at - which the local variable declaration appears in the Source.
-
-Column is the <id> of a 32-bit integer OpConstant denoting the column number at - which the first character of the local variable declaration appears.
-
-Parent is the <id> of a debug instruction that represents the - lexical scope that contains the - local variable declaration.
-
- Flags is the <id> of a 32-bit integer OpConstant formed by the bitwise-OR of values from the Debug Info Flags table.
-
-If ArgNumber operand is present, this instruction represents a function formal - parameter. The argument is the <id> of a 32-bit integer OpConstant.

12+

12

<id>
-Result Type

Result <id>

<id> Set

26

<id> Name

<id> Type

<id> Source

<id> Line

<id> Column

<id> Parent

<id> Flags

Optional
- <id> ArgNumber

- ---------- - - - - - - - - - - - - - - - -

DebugInlinedVariable
-
- Describe an inlined local variable.
-
-Result Type must be OpTypeVoid.
-
-Variable is a debug instruction representing a local variable that is inlined.
-
-Inlined is a DebugInlinedAt instruction representing the inline location.

7+

12

<id>
-Result Type

Result <id>

<id> Set

27

<id> Variable

<id> Inlined


DebugDeclare
-
-Define the point of declaration of a local variable.
-
-Result Type must be OpTypeVoid.
-
-Local Variable must be the <id> of a DebugLocalVariable.
-
-Variable must be the <id> of an OpVariable instruction that defines the local variable.
-
-Expression must be the <id> of a DebugExpression instruction.
-
-Indexes have the same semantics as the corresponding operand(s) of OpAccessChain, applied to the Local Variable.

8+

12

<id>
-Result Type

Result <id>

<id> Set

28

<id> Local Variable

<id> Variable

<id> Expression

<id>, <id>, …​ Indexes


DebugValue
-
-Represent a change in the value of a local variable.
-
-Result Type must be OpTypeVoid.
-
-Local Variable must be the <id> of a DebugLocalVariable.
-
-Value is the Result <id> of a non-debug instruction. The new value of Local Variable is the result of evaluating Expression on Value.
-
-Expression is the <id> of a DebugExpression instruction.
-
-Indexes have the same semantics as the corresponding operand(s) of OpAccessChain, applied to the Local Variable.

8+

12

<id>
-Result Type

Result <id>

<id> Set

29

<id> Local Variable

<id> Value

<id> Expression

<id>, <id>, …​ Indexes


DebugOperation
-
-Represent a DWARF operation that operates on a stack of values.
-
-Result Type must be OpTypeVoid.
-
-Operation is a 32-bit OpConstant specifying the DWARF operation from the Debug Operations table.
-
-Operands are zero or more 32-bit integer OpConstant <id>s.

6+

12

<id>
-Result Type

Result <id>

<id> Set

30

<id> Operation

Optional <id> Operands …


DebugExpression
-
- Represent a DWARF expression, which describes how to compute a value or name a location during debugging of a program. This is expressed in terms of DWARF operations that operate on a stack of values.
-
-Result Type must be OpTypeVoid.
-
-Operation is zero or more <id>s of DebugOperation instructions.

5+

12

<id>
-Result Type

Result <id>

<id> Set

31

Optional <id>…​ Operation
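
As an illustration only, the relationship between DebugOperation and DebugExpression can be modelled as two small records. The sketch below assumes nothing beyond the two tables above: a DebugOperation occupies at least 6 words plus one per operand, and a DebugExpression occupies at least 5 words plus one per referenced DebugOperation. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DebugOperation:
    result_id: int
    operation: int  # <id> of the OpConstant selecting the DWARF operation
    operands: List[int] = field(default_factory=list)  # OpConstant <id>s

    def word_count(self) -> int:
        # "6+" in the table: six fixed words, then one word per operand.
        return 6 + len(self.operands)


@dataclass
class DebugExpression:
    result_id: int
    operations: List[DebugOperation] = field(default_factory=list)

    def word_count(self) -> int:
        # "5+" in the table: five fixed words, then one <id> per operation.
        return 5 + len(self.operations)

    def operation_ids(self) -> List[int]:
        # The <id>s that would appear as the trailing Operation operands.
        return [op.result_id for op in self.operations]


# Placeholder <id>s; which DWARF operations they select is not modelled here.
first_op = DebugOperation(result_id=30, operation=40)
second_op = DebugOperation(result_id=31, operation=41, operands=[42])
expression = DebugExpression(result_id=32, operations=[first_op, second_op])
empty_expression = DebugExpression(result_id=33)
```

An expression with no operations is valid per the table ("zero or more") and occupies the minimum 5 words.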

-
-
-

Macros


DebugMacroDef
-
- Represents a macro definition.
-
-Result Type must be OpTypeVoid.
-
- Source is the <id> of an OpString, which contains the name of the file that contains the definition of the macro.
-
- Line is the <id> of a 32-bit integer OpConstant denoting the line number in the source file at which the macro is defined. If Line is zero, the macro definition is provided by a compiler command-line argument.
-
- Name is the <id> of an OpString, which contains the name of the macro as it appears in the source program. In the case of a function-like macro definition, no whitespace characters appear between the name of the defined macro and the following left parenthesis. Formal parameters are separated by a comma without any whitespace. A right parenthesis terminates the formal parameter list.
-
-Value is the <id> of an OpString, which contains the text of the macro definition.

7+

12

<id>
-Result Type

Result <id>

<id> Set

32

<id> Source

<id> Line

<id> Name

Optional <id> Value
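
The formatting rule for the Name string of a function-like macro can be sketched as follows. This is an illustrative helper, not part of the specification; the function name `macro_name` and its parameters are hypothetical.

```python
def macro_name(name, params=None):
    """Build the Name string for DebugMacroDef.

    params=None means an object-like macro; a list (possibly empty)
    means a function-like macro.
    """
    if params is None:
        return name
    # No whitespace before "(", parameters joined by "," with no
    # whitespace, and ")" terminating the formal parameter list.
    return f"{name}({','.join(p.strip() for p in params)})"


object_like = macro_name("PI")
function_like = macro_name("MAX", ["a", " b"])
no_params = macro_name("INIT", [])
```

For `#define MAX(a, b) ...` the helper yields `MAX(a,b)`, while an object-like macro such as `PI` is passed through unchanged.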


DebugMacroUndef
-
- Discontinue a previous macro definition.
-
-Result Type must be OpTypeVoid.
-
-Source is the <id> of an OpString, which contains the name of the file in which the macro is undefined.
-
-Line is the <id> of a 32-bit integer OpConstant denoting the line number in the source program at which the macro is rendered as undefined.
-
-Macro is the <id> of the DebugMacroDef which represents the macro to be undefined.

8

12

<id>
-Result Type

Result <id>

<id> Set

33

<id> Source

<id> Line

<id> Macro

-
-
-

Imported Entities


DebugImportedEntity
-
- Represents a C++ namespace using-directive, namespace alias, or using-declaration.
-
- Name is an OpString, holding the name or alias for the imported entity.
-
- Tag is the <id> of a 32-bit integer OpConstant with a value from the Imported Entities table which specifies the kind of the imported entity.
-
- Source is a DebugSource instruction representing the text of the source program the Entity is being imported from.
-
- Entity is a debug instruction representing a namespace or declaration that is being imported.
-
- Line is the <id> of a 32-bit integer OpConstant denoting the source line number at which the using declaration appears in the Source.
-
- Column is the <id> of a 32-bit integer OpConstant denoting the column number at which the first character of the using declaration appears.
-
- Parent is the <id> of a debug instruction that represents the lexical scope that contains the namespace or declaration.

12

12

<id>
-Result Type

Result <id>

<id> Set

34

<id> Name

<id> Tag

<id> Source

<id> Entity

<id> Line

<id> Column

<id> Parent

-
-
-
-
-

Validation Rules

-
-
-

None.

-
-
-
-
-

Issues

-
-
-
  1. Should this specification only contain references to the OpenCL.DebugInfo.100 specification with changes, or duplicate it in its entirety?

    RESOLVED: The spec is duplicated. The number of changes is significant enough that having to read two specifications to understand this one is not desirable. It’s also not guaranteed that changes to OpenCL.DebugInfo.100 should be automatically reflected in this extension.

  2. Should DebugSourceContinued exist or should DebugSource take an optional list of <id>s instead of just a single optional <id>?

    RESOLVED: We mirror OpSource and OpSourceContinued, both because it is an existing pattern for specifying strings longer than a 16-bit length allows, and for compatibility with OpenCL.DebugInfo.100, which only allows a single <id> for its DebugSource.

  3. Should we add a DebugNoLine or use OpNoLine?

    RESOLVED: We have added DebugNoLine for symmetry and to clearly separate from OpLine and OpNoLine.
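
The resolution of the DebugSourceContinued issue implies that a producer has to split long source text into pieces. A minimal sketch of such a split is shown below. It is an assumption-laden illustration: the function name `split_source` is hypothetical, and the per-string byte budget is left as a parameter because the exact limit depends on the SPIR-V literal string limits, which are not restated here. The first chunk would go in the string referenced by DebugSource and each remaining chunk in a string referenced by a DebugSourceContinued.

```python
def split_source(text, max_bytes):
    """Split text into chunks of at most max_bytes UTF-8 bytes each.

    Chunks never break in the middle of a character, so each one is
    valid UTF-8 on its own.
    """
    if max_bytes < 4:
        # A single UTF-8 character can need up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if size + n > max_bytes and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current or not chunks:
        chunks.append("".join(current))
    return chunks


ascii_chunks = split_source("abcdef", 4)
accented_chunks = split_source("\u00e9\u00e9\u00e9", 4)  # 2 bytes per character
```

Concatenating the chunks in order reproduces the original text, which is the property a consumer relies on when reassembling the source.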
-
-
-
-
-

Revision History

-
Rev

Date

Author

Changes

1.00 Rev 1

2020-11-02

Baldur Karlsson

Initial revision

1.00 Rev 2

2020-11-02

Baldur Karlsson

Changed to comply with non-semantic restrictions. Removed forward references. Converted literal operands to OpConstant ids.

1.00 Rev 3

2020-11-17

Baldur Karlsson

Added DebugSourceContinued, DebugLine/DebugNoLine, DebugBuildIdentifier, DebugStoragePath, DebugEntryPoint, DebugTypeMatrix.

1.00 Rev 4

2020-12-08

Baldur Karlsson

Grammar fixes, added FlagUnknownPhysicalLayout and Indexes parameter in DebugValue. Limited where DebugLine type instructions can appear.

1.00 Rev 5

2020-01-04

Baldur Karlsson

Add Flags parameter to DebugTypeBasic.

1.00 Rev 6

2020-01-22

Baldur Karlsson

Rename extended instruction set.

1.00 Rev 7

2021-07-01

Baldur Karlsson

Clarify runtime array sizing.

1.00 Rev 8

2021-07-27

Baldur Karlsson

Clarify that DebugFunctionDefinition can be in basic blocks.

1.00 Rev 9

2022-02-28

Baldur Karlsson

Clarify that DebugEntryPoint refers to a DebugFunction, not an OpEntryPoint.

1.00 Rev 10

2024-08-07

Victor Lomüller

Fix DebugLine so that the Column End operand can be equal to the Column Start operand.

1.00 Rev 11

2024-10-08

Spencer Fricke

Fix using Scope instead of Parent operand name.

-
-
-
- - \ No newline at end of file + + + + + + nonsemantic/NonSemantic.Shader.DebugInfo.100.html + + +

nonsemantic/NonSemantic.Shader.DebugInfo.100.html

+ +