Having my very first look at compute shaders in 099 (and at compute shaders in general), I’m guessing they’re only meant to be used with imageLoad()/imageStore() and image variables, and not shader storage buffer objects?
I’m guessing that for shader storage buffers to make sense, a means to access them when rendering geometry would be needed, though I’m not sure how that would work currently.
Is it possible for a compute shader to output to a 2D texture array? I’m a bit stuck because you can’t output to arbitrary layers (i.e.
layout(location = 0) out vec4 buffer0;
will fail to compile), shader storage buffers aren’t implemented yet, and when I switch the GLSL TOP to output a 2D texture array or 3D texture, the imageStore() call fails. The message produced is
unable to find compatible overloaded function "imageStore(struct unimage3D4x8_bindless, ivec2, vec4)"
Does this mean imageStore() for 2D texture arrays or 3D textures hasn’t been implemented yet?
Compute shaders have no notion of a direct output, so location semantics don’t work. The only output is done via imageStore(). To write to a 3D texture you need 3 coordinates, not 2. The error I see there says you are giving it an ivec2, which means you aren’t saying which slice in the 3D texture you want to write to.
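In GLSL terms, writing to a 3D texture from a compute shader looks something like this (a sketch; outTex is a placeholder name, so substitute whatever image uniform your GLSL TOP actually declares):

```glsl
layout (local_size_x = 8, local_size_y = 8, local_size_z = 1) in;

// 'outTex' is a placeholder; use the image uniform your GLSL TOP provides.
layout (rgba32f) uniform image3D outTex;

void main()
{
    // imageStore() on an image3D needs an ivec3: x, y, and the slice (z).
    ivec3 coord = ivec3(gl_GlobalInvocationID);
    imageStore(outTex, coord, vec4(1.0));
}
```

Passing an ivec2 here is what produces the "unable to find compatible overloaded function" error, since there is no imageStore(image3D, ivec2, vec4) overload.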
I don’t think they are essential. You can do lots of things with compute shaders and images without shader storage buffers. Shader storage buffers are mainly for ease of use, but functionally you can do the same thing with textures and a little more code.
That’s not to say we won’t add them, but we just haven’t yet.
Thanks for the clarification, Malcolm! I was hoping I wasn’t missing something simple like that. Storage buffers aren’t really needed then, just a matter of convenience, like you said.
Actually Malcolm, after working on this some more, I realize there is a reason to use shader storage buffers: non-normalized data. If the only way to get data out of a compute shader is an output color buffer, all data has to be normalized to [0, 1], correct? Whereas in a traditional vertex/geometry/pixel shader you’d pass arbitrary information down the pipeline within a struct.
I’m new to compute shaders, but already having data coming out of one. Super cool tools to play with! Thanks for adding them in.
Jumping back in: it seems it’s the same as with a regular fragment shader. Make sure the pixel format is 16-bit or 32-bit float, and then you can write arbitrary values.
Modifying the default example from vec4(1.0) to vec4(10.0) and monitoring the value with a TopTo, it seems fine.
Malcolm, since we still have to pack things into vec4s and multiple color buffers, a quick question for you:
How do I output to all the color buffers in a compute shader?
Vincent, you’re totally correct! I just assumed anything over 1.0 was clamped, since I hadn’t switched the TOP’s pixel format to 32-bit.
Malcolm, I take back my last post about the normalized data. Moving forward with texture buffers!
Vincent, as a stopgap for your color buffer question, what I’m doing is using a 2d texture array. Make sure that’s enabled in the glslTop with an appropriate number of layers, then try this:
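Something along these lines (a sketch; outArray is a placeholder name, so match it to the image uniform your GLSL TOP declares):

```glsl
layout (local_size_x = 8, local_size_y = 8) in;

// Placeholder uniform name; match it to your GLSL TOP's output image.
layout (rgba32f) uniform image2DArray outArray;

void main()
{
    ivec2 xy = ivec2(gl_GlobalInvocationID.xy);
    // The third coordinate of the ivec3 selects the layer, with each
    // layer standing in for one color buffer.
    imageStore(outArray, ivec3(xy, 0), vec4(1.0, 0.0, 0.0, 1.0)); // "buffer" 0
    imageStore(outArray, ivec3(xy, 1), vec4(0.0, 1.0, 0.0, 1.0)); // "buffer" 1
}
```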
I probably won’t go super deep into compute shaders right now though. I think I was mainly interested in streamlining the GPU sim → vertex-shader-lookup workflow, but it seems that would be the same with compute shaders as things stand.
Still, they seem to have some nice perks like atomic counters and shared memory, so I’ll definitely look more into them.
There was a bug with multiple buffer output for compute shaders. This will be fixed in build 2017.2580 or later. I’ll also add some more documentation right now.
I haven’t tried it myself yet and am also new to compute shaders, but I’m curious about the answer as well (or a working example), since atomic counters seem like a great feature of compute shaders.
Compute Shaders
When creating a 3D Texture or a 2D Texture Array with a compute shader, the shader is still only run once. The entire output texture is available to be written to using imageStore(), and should be filled as desired, possibly with a Z dispatch size equal to the depth of the texture.
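For instance, with the Z dispatch size set equal to the texture’s depth, each invocation can fill one texel of its own slice (a sketch; outTex is a placeholder name for the image uniform your GLSL TOP provides):

```glsl
layout (local_size_x = 8, local_size_y = 8, local_size_z = 1) in;

layout (rgba32f) uniform image3D outTex; // placeholder name

void main()
{
    // With the Z dispatch size equal to the texture depth,
    // gl_GlobalInvocationID.z indexes the slice being filled.
    ivec3 coord = ivec3(gl_GlobalInvocationID);
    imageStore(outTex, coord, vec4(vec3(coord) / 255.0, 1.0));
}
```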
I still have to get more familiar with GPU profiling, but would that mean using a compute shader to write to a 3D texture is more efficient?
doing color.xyz = gl_WorkGroupSize;
causes a fatal error:
(74) : fatal error C9999: *** exception during compilation ***
I guess I don’t really need gl_WorkGroupSize, since it’s defined with the local_size_x qualifier above and gl_GlobalInvocationID and gl_LocalInvocationID are provided (unlike with CUDA), but it seems odd.
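For what it’s worth, gl_WorkGroupSize is a uvec3 constant, so assigning it straight to a vec3 is a type mismatch in strict GLSL. The compiler crashing instead of reporting that cleanly still looks like a driver bug, but an explicit conversion may sidestep it (untested sketch):

```glsl
// gl_WorkGroupSize is a uvec3, so convert explicitly
// before assigning to float components:
color.xyz = vec3(gl_WorkGroupSize);
```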
Another note: for half a second the default dispatch size of 64x64x1 felt confusing, since the default texture size is 256x256 and the default code is
layout (local_size_x = 8, local_size_y = 8) in;
32x32x1 would make more sense.
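The arithmetic, for anyone following along: the dispatch size in each axis should be the texture size divided by the local work-group size, rounded up. A quick sketch:

```python
def dispatch_size(tex_size: int, local_size: int) -> int:
    """Number of work groups needed to cover tex_size with groups of local_size."""
    return -(-tex_size // local_size)  # ceiling division

# A 256x256 texture with local_size 8x8 needs a 32x32x1 dispatch:
print(dispatch_size(256, 8), dispatch_size(256, 8), 1)  # 32 32 1
```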