| DXC Cookbook: HLSL Coding Patterns for SPIR-V |
| ============================================= |
| |
| Author: Steven Perron |
| |
| Date: Oct 22, 2018 |
| |
| Introduction |
| ============ |
| |
| This document provides a set of examples that demonstrate what will and |
| will not be accepted by the DXC compiler when generating SPIR-V. The |
| difficulty in defining what is acceptable is that it cannot be specified |
| by a grammar: the entire program must be taken into consideration. The |
| examples below are meant to make those limits concrete. |
| |
| We are interested in how global resources are used. For a SPIR-V shader |
| to be valid, accesses to global resources like structured buffers and |
| images must be done directly on the global resources. They cannot be |
| copied or have their address returned from functions. However, in HLSL, |
| it is possible to copy a global resource or to pass it by reference to a |
| function. Since this can be arbitrarily complex, DXC can generate valid |
| SPIR-V only if the compiler is able to remove all of these copies. |
| |
| The transformations that are used to remove the copies will be the same |
| for both structured buffers and images, so we have chosen to focus on |
| structured buffers. The process of transforming the code in this way is |
| called *legalization*. |
| |
| Support evolves over time as the optimizations in SPIRV-Tools are |
| improved. At GDC 2018, Greg Fischer from LunarG |
| `presented <http://schedule.gdconf.com/session/hlsl-in-vulkan-there-and-back-again-presented-by-khronos-group/856616>`__ |
| earlier results in this space. The DXC, Glslang, and SPIRV-Tools |
| maintainers work together to handle new HLSL code patterns. This |
| document represents the state of the DXC compiler in October 2018. |
| |
| Glslang does legalization as well. However, what it is able to legalize |
| is different from DXC because of the features it chooses to support and |
| the optimizations from SPIRV-Tools it chooses to run. For example, Glslang |
| does not support structured buffer aliasing yet, so many of these |
| examples will not work with Glslang. |
| |
| All of the examples are available in the ``legal-examples`` directory of |
| the DXC repository: |
| https://github.com/Microsoft/DirectXShaderCompiler/tree/main/tools/clang/test/CodeGenSPIRV/legal-examples |
| |
| The comments at the top of each example include a URL that opens the |
| example in Tim Jones' Shader Playground. |
| |
| Examples for structured buffers |
| =============================== |
| |
| Desired code |
| ------------ |
| |
| .. code-block:: hlsl |
| |
| // 0-copy-sbuf-ok.hlsl |
| // http://shader-playground.timjones.io/e6af2bdce0c61ed07d3a826aa8a95d45 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| void main() { |
| gRWSBuffer[i] = gSBuffer[i]; |
| } |
| |
| This example shows code that directly translates to valid SPIR-V. In |
| this case, we have two structured buffers. When one of their elements is |
| accessed, it is done by naming the resource from which to get the |
| element. |
| |
| Note that it is fine to copy an element of the structured buffer. |
| |
| Single copy to a local |
| ---------------------- |
| |
| Cases that can be easily legalized are those where there is exactly one |
| assignment to the local copy of the structured buffer. In this context, |
| a local is either a global static or a function scope symbol, that is, |
| something that can be accessed by only a single instance of the shader. |
| When you have a single copy to a local, it is obvious which global is |
| actually being used. This allows the compiler to replace a reference to |
| the local symbol with the global resource. |
| |
| Initialization of a static |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. code-block:: hlsl |
| |
| // 1-copy-global-static-ok.hlsl |
| // http://shader-playground.timjones.io/815543dc91a4e6855a8d0c6a345d4a5a |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| static StructuredBuffer<S> sSBuffer = gSBuffer; |
| |
| void main() { |
| gRWSBuffer[i] = sSBuffer[i]; |
| } |
| |
| This example shows an implicitly addressed structured buffer |
| ``gSBuffer`` assigned to a static ``sSBuffer``. This copy is treated |
| like a shallow copy. This is implemented by making ``sSBuffer`` a |
| pointer to ``gSBuffer``. |
| |
| This example can be legalized because the compiler is able to see that |
| ``sSBuffer`` points to ``gSBuffer``, which does not move, so uses of |
| ``sSBuffer`` can be replaced by ``gSBuffer``. |
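| |
| As a hedged illustration (this form is not one of the repository |
| examples), once every use of ``sSBuffer`` has been replaced, the shader |
| behaves as if it had been written with the global named directly: |
| |
| .. code-block:: hlsl |
| |
|    // Sketch of the effective result of legalization, not compiler output. |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    int i; |
| |
|    StructuredBuffer<S> gSBuffer; |
|    RWStructuredBuffer<S> gRWSBuffer; |
| |
|    void main() { |
|      // The static copy is gone; the global resource is accessed directly. |
|      gRWSBuffer[i] = gSBuffer[i]; |
|    } |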
| |
| .. code-block:: hlsl |
| |
| // 2-write-global-static-ok.hlsl |
| // http://shader-playground.timjones.io/1c65c467e395383945d219a60edbe10c |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| static RWStructuredBuffer<S> sRWSBuffer = gRWSBuffer; |
| |
| void main() { |
| sRWSBuffer[i].f = 0.0; |
| } |
| |
| This example is similar to the previous example, except in this case the |
| shallow copy becomes important. ``sRWSBuffer`` is treated like a pointer |
| to ``gRWSBuffer``. As before, the references to ``sRWSBuffer`` can be |
| replaced by ``gRWSBuffer``. This means that the write that occurs will |
| be visible outside of the shader. |
| |
| Copy to function scope |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. code-block:: hlsl |
| |
| // 3-copy-local-struct-ok.hlsl |
| // http://shader-playground.timjones.io/77dd20774e4943044c2f1b630c539f07 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| void main() { |
| CombinedBuffers cb; |
| cb.SBuffer = gSBuffer; |
| cb.RWSBuffer = gRWSBuffer; |
| cb.RWSBuffer[i] = cb.SBuffer[i]; |
| } |
| |
| It is also possible to copy a structured buffer to a function scope |
| symbol. This is similar to a copy to a static scope symbol. The local |
| copy is really a pointer to the original. This example demonstrates that |
| DXC can legalize the copy even if it is a copy to part of a structure. |
| There are no specific restrictions on the structure. The structured |
| buffers can be anywhere in the structure, and there can be any number of |
| members. Structured buffers can be in nested structures of any depth. |
| The following is a more complicated example. |
| |
| .. code-block:: hlsl |
| |
| // 4-copy-local-nested-struct-ok.hlsl |
| // http://shader-playground.timjones.io/14f59ff2a28c0a0180daf6ce4393cf6b |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| struct S2 { |
| CombinedBuffers cb; |
| }; |
| |
| struct S1 { |
| S2 s2; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| void main() { |
| S1 s1; |
| s1.s2.cb.SBuffer = gSBuffer; |
| s1.s2.cb.RWSBuffer = gRWSBuffer; |
| s1.s2.cb.RWSBuffer[i] = s1.s2.cb.SBuffer[i]; |
| } |
| |
| Function parameters |
| ~~~~~~~~~~~~~~~~~~~ |
| |
| .. code-block:: hlsl |
| |
| // 5-func-param-sbuf-ok.hlsl |
| // http://shader-playground.timjones.io/aeb06f527c5390d82d63bdb4eafc9ae7 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| void foo(StructuredBuffer<S> pSBuffer) { |
| gRWSBuffer[i] = pSBuffer[i]; |
| } |
| |
| void main() { |
| foo(gSBuffer); |
| } |
| |
| It is possible to pass a structured buffer as a parameter to a function. |
| As with the copies in the previous section, it is a pointer to the |
| structured buffer that is actually being passed to ``foo``. This is the |
| same way that arrays work in C/C++. |
| |
| .. code-block:: hlsl |
| |
| // 6-func-param-rwsbuf-ok.hlsl |
| // http://shader-playground.timjones.io/f4e0194ce78118c0a709d85080ccea93 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| void foo(RWStructuredBuffer<S> pRWSBuffer) { |
| pRWSBuffer[i] = gSBuffer[i]; |
| } |
| |
| void main() { |
| foo(gRWSBuffer); |
| } |
| |
| The same is true for RW structured buffers. So in this case, the write |
| to ``pRWSBuffer`` is changing ``gRWSBuffer``. This means that the write |
| to ``pRWSBuffer`` will be visible outside of the function, and outside |
| of the shader. |
| |
| Return values |
| ~~~~~~~~~~~~~ |
| |
| The next two examples show that structured buffers can be a function's |
| return value. As before, the return value of ``foo`` is really a pointer |
| to the global resource. |
| |
| .. code-block:: hlsl |
| |
| // 7-func-ret-tmp-var-ok.hlsl |
| // http://shader-playground.timjones.io/d6b706423f02dad58fbb01841282c6a1 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| RWStructuredBuffer<S> foo() { |
| return gRWSBuffer; |
| } |
| |
| void main() { |
| RWStructuredBuffer<S> lRWSBuffer = foo(); |
| lRWSBuffer[i] = gSBuffer[i]; |
| } |
| |
| In this case, the compiler will replace ``lRWSBuffer`` with |
| ``gRWSBuffer``. |
| |
| .. code-block:: hlsl |
| |
| // 8-func-ret-direct-ok.hlsl |
| // http://shader-playground.timjones.io/6edbbc1aa6c6b6533c5a728135f87fb9 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| StructuredBuffer<S> foo() { |
| return gSBuffer; |
| } |
| |
| void main() { |
| gRWSBuffer[i] = foo()[i]; |
| } |
| |
| This example is similar to the previous, but shows that you do not have |
| to use an explicit temporary value. |
| |
| Conditional control flow |
| ------------------------ |
| |
| The examples so far do not have any conditional control flow. This |
| makes it obvious which resources are being used. The introduction of |
| conditional control flow makes the job of the compiler much harder, and |
| in some cases impossible. Remember that the compiler is trying to |
| determine at compile time which resource will be used at run time. In |
| this section, we will look at how control flow affects the compiler's |
| ability to do this. The bottom line is that the compiler has to be able |
| to turn all of the conditional control flow that affects which resources |
| are used into straight line code. |
| |
| Inputs in if-statement |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The first example is one where the compiler cannot determine which |
| resource is actually being accessed. |
| |
| .. code-block:: hlsl |
| |
| // 9-if-stmt-select-fail.hlsl |
| // http://shader-playground.timjones.io/2896e95627fd8a6689ca96c81a5c7c68 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| if (constant > i) { // Condition can't be computed at compile time. |
| lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan. |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[i] = lSBuffer[i]; |
| } |
| |
| In this example, ``lSBuffer`` could be either ``gSBuffer1`` or |
| ``gSBuffer2``. It depends on the value of ``i``, which is a parameter to |
| the shader and cannot be known at compile time. At this time, the |
| compiler is not able to convert this code into something that drivers |
| will accept. |
| |
| If your code follows this pattern, we suggest rewriting it as |
| follows: |
| |
| .. code-block:: hlsl |
| |
| // 10-if-stmt-select-ok.hlsl |
| // http://shader-playground.timjones.io/5063d8a0a7ad1f9d0839cd34a6d94dd2 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| if (constant > i) { |
| lSBuffer = gSBuffer1; |
| gRWSBuffer[i] = lSBuffer[i]; |
| } else { |
| lSBuffer = gSBuffer2; |
| gRWSBuffer[i] = lSBuffer[i]; |
| } |
| } |
| |
| Notice that this involves replicating code. If the code that follows the |
| if-statement is long, you could consider moving it to a function, and |
| having two calls to that function. |
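| |
| A minimal sketch of that suggestion follows. It is not one of the |
| repository examples and has not been checked against DXC; the helper |
| name ``copyElement`` is hypothetical. It relies on the behavior shown |
| elsewhere in this document, where a structured buffer passed as a |
| function parameter can be legalized and each call site gets its own |
| copy of the function. |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    int i; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer; |
| |
|    #define constant 0 |
| |
|    // Hypothetical helper holding the code that used to follow the |
|    // if-statement. Each call passes exactly one global resource. |
|    void copyElement(StructuredBuffer<S> pSBuffer) { |
|      gRWSBuffer[i] = pSBuffer[i]; |
|    } |
| |
|    void main() { |
|      if (constant > i) { |
|        copyElement(gSBuffer1); |
|      } else { |
|        copyElement(gSBuffer2); |
|      } |
|    } |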
| |
| If-statements with constants |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Not all control flow is a problem. There are situations where the |
| compiler is able to determine that a condition is always true or always |
| false. For example, in the following code, the compiler evaluates |
| ``0 > 2`` and knows that it is always false. |
| |
| .. code-block:: hlsl |
| |
| // 11-if-stmt-const-ok.hlsl |
| // http://shader-playground.timjones.io/7ef5b89b3ec3d56c22e1bca45b40516a |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| if (constant > 2) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[i] = lSBuffer[i]; |
| } |
| |
| In effect, the compiler will turn this code into: |
| |
| .. code-block:: hlsl |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| gRWSBuffer[i] = gSBuffer2[i]; |
| } |
| |
| The two previous examples show that handling control flow depends on |
| what the compiler can do. This depends on the amount of optimization |
| that is done, and which optimizations are done. In general, when you are |
| writing code that will select a resource, keep the conditions as simple |
| as possible to make it as easy as possible for the compiler to determine |
| which path is taken. |
| |
| Switch statements |
| ~~~~~~~~~~~~~~~~~ |
| |
| Switch statements are similar to if-statements. If the selector is a |
| constant, then the compiler will be able to propagate the copies. |
| |
| .. code-block:: hlsl |
| |
| // 12-switch-stmt-select-fail.hlsl |
| // http://shader-playground.timjones.io/b079f878daeba5d77842725b90a476ca |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| switch(i) { // Compiler can't determine which case will run. |
| case 0: |
| lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan. |
| break; |
| default: |
| lSBuffer = gSBuffer2; |
| break; |
| } |
| gRWSBuffer[i] = lSBuffer[i]; |
| } |
| |
| The compiler is not able to remove the copies in this example because it |
| does not know the value of ``i`` at compile time. |
| |
| .. code-block:: hlsl |
| |
| // 13-switch-stmt-const-ok.hlsl |
| // http://shader-playground.timjones.io/a46dd1f1a84eba38c047439741ec08ab |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| const static int constant = 0; |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| switch(constant) { |
| case 0: |
| lSBuffer = gSBuffer1; |
| break; |
| default: |
| lSBuffer = gSBuffer2; |
| break; |
| } |
| gRWSBuffer[i] = lSBuffer[i]; |
| } |
| |
| However, if the selector is turned into a constant, the compiler can |
| replace uses of ``lSBuffer`` by ``gSBuffer1``. |
| |
| Loop Induction Variables in conditions |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Besides inputs, another type of variable that hinders the compiler is |
| the loop induction variable. These are variables that change value on |
| each iteration of the loop. Consider this example. |
| |
| .. code-block:: hlsl |
| |
| // 14-loop-var-fail.hlsl |
| // http://shader-playground.timjones.io/8df364770e3f425e6321e71f817bcd1a |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| |
| for( int j = 0; j < 2; j++ ) { |
| if (constant > j) { // Condition is different for different iterations |
| lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan. |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
| |
| In this example, ``j`` is an induction variable. It takes on the values |
| ``0`` and ``1``. The information is there to be able to determine which |
| path is taken in each iteration, but the compiler does not figure this |
| out by default. |
| |
| If you want the compiler to be able to legalize this code, then you will |
| have to direct the compiler to unroll this loop using the unroll |
| attribute. The following example can be legalized by the compiler: |
| |
| .. code-block:: hlsl |
| |
| // 15-loop-var-unroll-ok.hlsl |
| // http://shader-playground.timjones.io/3d0f6f830fc4a5102714e19c748e81c7 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| |
| [unroll] |
| for( int j = 0; j < 2; j++ ) { |
| if (constant > j) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
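| |
| As a hedged illustration (not compiler output, and not one of the |
| repository examples), full unrolling lets the condition be evaluated |
| for each iteration. With ``constant`` equal to ``0``, both ``0 > 0`` |
| and ``0 > 1`` are false, so the shader behaves as if it had been |
| written: |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer; |
| |
|    void main() { |
|      gRWSBuffer[0] = gSBuffer2[0];  // Iteration j == 0: else branch. |
|      gRWSBuffer[1] = gSBuffer2[1];  // Iteration j == 1: else branch. |
|    } |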
| |
| Variable iteration counts |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Adding the unroll attribute to loops does not guarantee that the |
| compiler is able to legalize the code. The compiler has to be able to |
| fully unroll the loop. That means the compiler will have to create a |
| copy of the body of the loop for each iteration so that there is no loop |
| anymore. That can only be done if the number of iterations can be known |
| at compile time. |
| |
| This means that the compiler must be able to determine the initial |
| value, the final value, and the step for the induction variable, ``j`` |
| in the example. None of ``foo1``, ``foo2``, or ``foo3`` can be legalized |
| because the number of iterations cannot be known at compile time. |
| |
| .. code-block:: hlsl |
| |
| // 16-loop-var-range-fail.hlsl |
| // http://shader-playground.timjones.io/376f5f985c3ceceea004ab58edb336f2 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| int i; |
| |
| #define constant 0 |
| |
| void foo1() { |
| StructuredBuffer<S> lSBuffer; |
| |
| [unroll] |
| for( int j = i; j < 2; j++ ) { // Compiler can't determine the initial value |
| if (constant > j) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
| |
| void foo2() { |
| StructuredBuffer<S> lSBuffer; |
| |
| [unroll] |
| for( int j = 0; j < i; j++ ) { // Compiler can't determine the end value |
| if (constant > j) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
| |
| void foo3() { |
| StructuredBuffer<S> lSBuffer; |
| |
| [unroll] |
| for( int j = 0; j < 2; j += i ) { // Compiler can't determine the step |
| if (constant > j) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
| |
| |
| void main() { |
| foo1(); foo2(); foo3(); |
| } |
| |
| As before, the compiler will try to simplify expressions to determine |
| their value at compile time, but it may not always be successful. We |
| would recommend that you keep the expressions for the loop bounds as |
| simple as possible to increase the chances the compiler can figure it |
| out. |
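| |
| One possible way to restructure a loop like the one in ``foo2`` is |
| sketched below. This is an assumption-laden sketch, not one of the |
| repository examples, and it has not been checked against DXC. It |
| assumes the shader can guarantee that ``i`` never exceeds a known |
| maximum, named ``MAX_ITERATIONS`` here (a hypothetical name). The loop |
| bounds become constants, and the run-time value of ``i`` only guards |
| the access instead of controlling the loop or selecting a resource. |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer; |
| |
|    int i; |
| |
|    #define constant 0 |
|    // Assumed compile-time upper bound on i (hypothetical). |
|    #define MAX_ITERATIONS 2 |
| |
|    void main() { |
|      StructuredBuffer<S> lSBuffer; |
| |
|      [unroll] |
|      for (int j = 0; j < MAX_ITERATIONS; j++) {  // Constant bounds and step. |
|        if (constant > j) { |
|          lSBuffer = gSBuffer1; |
|        } else { |
|          lSBuffer = gSBuffer2; |
|        } |
|        if (j < i) {  // Run-time guard; it does not choose a resource. |
|          gRWSBuffer[j] = lSBuffer[j]; |
|        } |
|      } |
|    } |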
| |
| Other restrictions on unrolling |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Not being able to determine the iteration count at compile time is a |
| fundamental problem. No matter how good the compiler is, it will never |
| be able to fully unroll such a loop. Other cases cannot be handled |
| because of internal details of the algorithms in the SPIRV-Tools |
| optimizer. The most notable restriction is that the induction variable |
| must be of an integral type. |
| |
| .. code-block:: hlsl |
| |
| // 17-loop-var-float-fail.hlsl |
| // http://shader-playground.timjones.io/d5d2598699378688684a4a074553dddf |
| |
| struct S { |
| float4 f; |
| }; |
| |
| struct CombinedBuffers { |
| StructuredBuffer<S> SBuffer; |
| RWStructuredBuffer<S> RWSBuffer; |
| }; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer; |
| |
| #define constant 0 |
| |
| void main() { |
| |
| StructuredBuffer<S> lSBuffer; |
| |
| [unroll] |
| for( float j = 0; j < 2; j++ ) { // Can't infer floating point induction values |
| if (constant > j) { |
| lSBuffer = gSBuffer1; |
| } else { |
| lSBuffer = gSBuffer2; |
| } |
| gRWSBuffer[j] = lSBuffer[j]; |
| } |
| } |
| |
| This example cannot be legalized because ``j`` is a ``float``. |
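| |
| A minimal sketch of a rewrite, assuming the loop only needs the |
| floating-point value inside its body: keep the induction variable |
| integral and derive the ``float`` from it. The scaling by ``fj`` is |
| purely illustrative and is not part of the original example; the sketch |
| has not been checked against DXC. |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer; |
| |
|    #define constant 0 |
| |
|    void main() { |
|      StructuredBuffer<S> lSBuffer; |
| |
|      [unroll] |
|      for (int j = 0; j < 2; j++) {  // Integral induction variable. |
|        float fj = (float)j;         // Float value derived from j. |
|        if (constant > j) { |
|          lSBuffer = gSBuffer1; |
|        } else { |
|          lSBuffer = gSBuffer2; |
|        } |
|        gRWSBuffer[j].f = lSBuffer[j].f * fj;  // Illustrative use of fj. |
|      } |
|    } |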
| |
| Other interesting cases |
| ----------------------- |
| |
| Multiple calls to a function |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. code-block:: hlsl |
| |
| // 18-multi-func-call-ok.hlsl |
| // http://shader-playground.timjones.io/e7b3ac1262a291c92902fd3f1fd3343c |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer1; |
| RWStructuredBuffer<S> gRWSBuffer2; |
| |
| |
| void foo(RWStructuredBuffer<S> pRWSBuffer) { |
| pRWSBuffer[i] = gSBuffer[i]; |
| } |
| |
| void main() { |
| foo(gRWSBuffer1); |
| foo(gRWSBuffer2); |
| } |
| |
| In this example, we see the same function is called twice. Each call has |
| a different parameter. This can look like a problem because |
| ``pRWSBuffer`` could be either ``gRWSBuffer1`` or ``gRWSBuffer2``. |
| However, the compiler is able to work around this by creating a separate |
| copy of ``foo`` for each call site. In fact, these copies will be placed |
| inline. |
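| |
| As a hedged illustration (not compiler output), once both calls have |
| been inlined and the parameter replaced by the argument at each call |
| site, the shader behaves as if it had been written: |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    int i; |
| |
|    StructuredBuffer<S> gSBuffer; |
|    RWStructuredBuffer<S> gRWSBuffer1; |
|    RWStructuredBuffer<S> gRWSBuffer2; |
| |
|    void main() { |
|      gRWSBuffer1[i] = gSBuffer[i];  // From foo(gRWSBuffer1). |
|      gRWSBuffer2[i] = gSBuffer[i];  // From foo(gRWSBuffer2). |
|    } |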
| |
| Multiple returns |
| ~~~~~~~~~~~~~~~~ |
| |
| As we have already seen, a return from a function is a copy. At this |
| point, it would be fair to ask what happens if there are multiple |
| returns. |
| |
| .. code-block:: hlsl |
| |
| // 19-multi-func-ret-fail.hlsl |
| // http://shader-playground.timjones.io/922facb688a5ba09b153d64cf1fc4557 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer; |
| RWStructuredBuffer<S> gRWSBuffer1; |
| RWStructuredBuffer<S> gRWSBuffer2; |
| |
| RWStructuredBuffer<S> foo(int l) { |
| if (l == 0) { // Compiler does not know which branch will be taken: |
| // Branch taken depends on input i. |
| return gRWSBuffer1; |
| } else { |
| return gRWSBuffer2; |
| } |
| } |
| |
| void main() { |
| RWStructuredBuffer<S> lRWSBuffer = foo(i); |
| lRWSBuffer[i] = gSBuffer[i]; |
| } |
| |
| The compiler is not able to legalize this example because it does not |
| know which value will be returned. However, if the compiler is able to |
| determine which path will be taken, then it can be legalized. |
| |
| .. code-block:: hlsl |
| |
| // 20-multi-func-ret-const-ok.hlsl |
| // http://shader-playground.timjones.io/84b093c7cf9e3932c5f0d9691533bafe |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer1; |
| RWStructuredBuffer<S> gRWSBuffer2; |
| |
| StructuredBuffer<S> foo(int l) { |
| if (l == 0) { |
| return gSBuffer1; |
| } else { |
| return gSBuffer2; |
| } |
| } |
| |
| void main() { |
| gRWSBuffer1[i] = foo(0)[i]; |
| gRWSBuffer2[i] = foo(1)[i]; |
| } |
| |
| For each call to ``foo``, the compiler is able to determine which value |
| will be returned. In this case, the code can be legalized. |
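| |
| As a hedged illustration (not compiler output), with ``l`` known at |
| each call site, ``foo(0)`` resolves to ``gSBuffer1`` and ``foo(1)`` |
| resolves to ``gSBuffer2``, so the shader behaves as if it had been |
| written: |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    int i; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer1; |
|    RWStructuredBuffer<S> gRWSBuffer2; |
| |
|    void main() { |
|      gRWSBuffer1[i] = gSBuffer1[i];  // foo(0) takes the l == 0 branch. |
|      gRWSBuffer2[i] = gSBuffer2[i];  // foo(1) takes the else branch. |
|    } |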
| |
| Combining elements |
| ~~~~~~~~~~~~~~~~~~ |
| |
| Individually, these examples are simple; however, these elements can be |
| combined in arbitrary ways. As one last example, consider this HLSL |
| source code. |
| |
| .. code-block:: hlsl |
| |
| // 21-combined-ok.hlsl |
| // http://shader-playground.timjones.io/9f00d2d359da0731cdf8d0b68520e2c4 |
| |
| struct S { |
| float4 f; |
| }; |
| |
| int i; |
| |
| StructuredBuffer<S> gSBuffer1; |
| StructuredBuffer<S> gSBuffer2; |
| RWStructuredBuffer<S> gRWSBuffer1; |
| RWStructuredBuffer<S> gRWSBuffer2; |
| |
| #define constant 0 |
| |
| StructuredBuffer<S> bar() { |
| if (constant > 2) { |
| return gSBuffer1; |
| } else { |
| return gSBuffer2; |
| } |
| } |
| |
| void foo(RWStructuredBuffer<S> pRWSBuffer) { |
| StructuredBuffer<S> lSBuffer = bar(); |
| pRWSBuffer[i] = lSBuffer[i]; |
| } |
| |
| void main() { |
| foo(gRWSBuffer1); |
| foo(gRWSBuffer2); |
| } |
| |
| The compiler will apply all of the transformations mentioned earlier |
| to identify a single resource for each load from and store to a resource. |
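| |
| As a hedged illustration (not compiler output), ``0 > 2`` is always |
| false, so ``bar`` always yields ``gSBuffer2``, and each call to ``foo`` |
| is specialized for its argument. The shader therefore behaves as if it |
| had been written: |
| |
| .. code-block:: hlsl |
| |
|    struct S { |
|      float4 f; |
|    }; |
| |
|    int i; |
| |
|    StructuredBuffer<S> gSBuffer1; |
|    StructuredBuffer<S> gSBuffer2; |
|    RWStructuredBuffer<S> gRWSBuffer1; |
|    RWStructuredBuffer<S> gRWSBuffer2; |
| |
|    void main() { |
|      gRWSBuffer1[i] = gSBuffer2[i];  // From foo(gRWSBuffer1). |
|      gRWSBuffer2[i] = gSBuffer2[i];  // From foo(gRWSBuffer2). |
|    } |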
| |
| Conclusion |
| ========== |
| |
| It is impossible to enumerate all of the code sequences that work or do |
| not work, but these examples should serve as a guide to what is |
| possible. The general rule of thumb is that there must be a |
| straightforward way to transform the code so that there are no copies of |
| global resources. |