099 compute shaders with images only?

vinz99 · February 12, 2017, 10:35pm

Hi,

Having my very first look at compute shaders in 099 (and compute shaders in general), I’m guessing they’re only meant to be used with imageLoad/Store and deal with image variables, and not shader buffer objects?

Guessing for shader buffer objects to make sense a mean to access them for rendering geometry would be needed though I’m not sure how it would work currently.

I was looking at this nvidia sample docs.nvidia.com/gameworks/index … s%7C_____6

Thanks a lot!
Best,
Vincent

malcolm · February 13, 2017, 5:32pm

Currently we don’t have support for shader storage buffers. I think compute shaders could read/write to those though also if we added support.

pointshader · February 13, 2017, 5:43pm

+1 for shader storage buffers, they appear to be very useful (maybe essential?) for compute shader use.

pointshader · February 13, 2017, 11:18pm

Is it possible for a compute shader to output to a 2D texture array? I’m a bit stuck because you can’t output to arbitrary layers (aka layout(location=0) out vec4 buffer0 will fail to compile), shader storage buffers aren’t implemented yet, and then when I switch the GLSLTop to output a 2d texture array or 3d texture, the imageStore function fails. The message produced is unable to find compatible overloaded function "imageStore(struct unimage3D4x8_bindless, ivec2, vec4)" . Does this mean the implementation for imageStore accessing 2d texture arrays or 3d textures hasn’t been implemented yet?

Thanks!

malcolm · February 13, 2017, 11:25pm

Compute shaders have no notion of a direct output, so location semantics don’t work. The only output is done via imageStore(). To write to a 3D texture you need 3 coordinates, not 2. The error I see there says you are giving it an ivec2, which means you aren’t saying which slice in the 3D texture you want to write to.

malcolm · February 13, 2017, 11:26pm

I don’t think they are essential. You can do lots of things with compute shaders and images without shader storage buffers. Shader storage buffers are mainly for ease of use, but functionally you can do the same thing with textures and a little more code.

That’s not to say we won’t add them, but we just haven’t yet.

pointshader · February 13, 2017, 11:33pm

Thanks for the clarification, Malcolm! I was hoping I wasn’t missing something simple like that. Storage buffers aren’t really needed then, just a matter of convenience, like you said.

pointshader · February 14, 2017, 8:23pm

Actually Malcolm, after working on this some more, I realize there is a reason to use shader storage buffers - Non normalized data. If the only way to get data out of a compute shader is an output color buffer, all data has to be normalized to [0-1.0], correct? Where in a traditional vertex / geo / pixel shader you’d pass arbitrary information down the pipeline within a struct.

I’m new to compute shaders, but already having data coming out of one. Super cool tools to play with! Thanks for adding them in.

vinz99 · February 14, 2017, 9:13pm

Jumping back in, it seems it’s the same as a regular fragment shader, make sure the format is 16 float or 32bits floats, and then you can write arbitrary values.
Modifying the default example from vec4(1.0) to vec4(10.0) and monitoring the value with TopTo it seems fine.

Malcolm, since we still have to pack things in vec4 and multiple color buffers quick question for you :
How do I output to all the color buffers in the compute shader?

I tried

imageStore(sTDComputeOutputs[0], ivec2(gl_GlobalInvocationID.xy), vec4(0.1,0.0,0,1));
imageStore(sTDComputeOutputs[1], ivec2(gl_GlobalInvocationID.xy), vec4(0.0,1.0,0,1));

but it fills the buffer 0 with green and nothing in buffer1.
what is sTDComputeOutputs? A bit more doc, at least on TD specifics, would be great!

Thanks a lot!

pointshader · February 14, 2017, 9:44pm

Vincent, you’re totally correct! I just assumed anything over 1.0 was clamped, since I hadn’t switched the top type to 32bit.

Malcolm, I take back my last post about the normalized data. Moving forward with texture buffers!

Vincent, as a stopgap for your color buffer question, what I’m doing is using a 2d texture array. Make sure that’s enabled in the glslTop with an appropriate number of layers, then try this:

imageStore(sTDComputeOutputs[0], ivec3(gl_GlobalInvocationID.xy,0), pos);
imageStore(sTDComputeOutputs[0], ivec3(gl_GlobalInvocationID.xy,1), vel);

vinz99 · February 14, 2017, 9:49pm

Thanks Jonathan!

Probably I won’t go super deep into compute shaders right now though, I think I was interested in streamlining the gpu sim → lookup in the vertex shader workflow, but it seems it would be the same with compute shaders as is.

Still seem they have some nice perks like atomic counters and shared memory, so will definitely look more into them.

Cheers
Vincent

malcolm · February 16, 2017, 9:23pm

There was a bug with multiple buffer output for compute shaders. This will be fixed in build 2017.2580 or later. I’ll also add some more documentation right now.

vinz99 · February 16, 2017, 11:57pm

Great, thanks Malcolm!

timgerritsen · March 26, 2017, 10:40pm

Not sure it’s part of the not-yet implemented SSBOs, but am I correct the atomic counter isn’t available yet as well?

Trying to do something like the following gives me always a 0 counter:

layout(binding=0, offset=0) uniform atomic_uint ac;
layout (local_size_x = 16, local_size_y = 16) in;
void main()
{
	uint counter = atomicCounterIncrement(ac);
	imageStore(sTDComputeOutputs[0],ivec2(gl_GlobalInvocationID.xy), vec4(counter));
}

I’m quite new to compute shaders, so perhaps I’m missing something here?

Cheers,
Tim

vinz99 · March 27, 2017, 5:21pm

Haven’t tried myself yet and also new to compute shaders, but curious about the answer as well/a working example, since atomic counters seem a great feature of compute shaders.

Vincent

malcolm · March 27, 2017, 5:27pm

They are currently not supported. I’ll look into adding them though. I’m pretty swamped right now finishing off some other features.

vinz99 · June 23, 2017, 6:33pm

thanks Malcolm! Seems this page was added pretty recently which makes me hopeful
derivative.ca/wiki099/index. … ute_Shader

Will definitely start looking into them more seriously now that 099 is officially out and bother you with more questions!

pointshader · June 26, 2017, 3:18pm

Yes! Please update us when Atomic counters or SSBO’s are on the way.
:mrgreen: :mrgreen: :mrgreen:

vinz99 · June 29, 2017, 3:32pm

Hi Malcolm,

One thing on the wiki (derivative.ca/wiki099/index. … _Shaders_2) caught my eye :

Compute Shaders
When creating a 3D Texture or a 2D Texture Array with a compute shader, the shader is still only ran once. The entire output texture is available to be written to using imageStore, and should be filled as desired, possibly with a Z dispatch size equal to the depth of the texture.

Still have to get more familiar with GPU profiling, but would it mean using a compute shader to write to a 3d texture is more efficient?

Thanks!

vinz99 · June 29, 2017, 4:24pm

Another question/bug report, going through
khronos.org/opengl/wiki/Compute_Shader

doing color.xyz = gl_WorkGroupSize;
causes a fatal error :

(74) : fatal error C9999: *** exception during compilation ***

I guess I don’t really need gl_WorkGroupSize since it’s defined with the local_size_x qualifier above and gl_GlobalInvocationID and gl_LocalInvocationID are provided, unlike with CUDA, but seems odd.

Another note, for half a second the default dispatch size of 64x64x1 felt confusing, since the default texture size is 256x256 and the default code is
layout (local_size_x = 8, local_size_y = 8) in;
32x32x1 would make more sense.

Cheers
Vincent