Hello,

I've been thinking a bit about how to properly implement TCS outputs in TGSI. As a quick reminder, a TCS has both per-vertex (i.e. per-invocation) and per-patch outputs. While you can only write to the current invocation's per-vertex outputs, you can read from any invocation's. (With barrier() used to synchronize invocations.)
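
For reference, here is a minimal GLSL sketch of those rules. The variable names (vs_color, tcs_color, patch_avg) and the patch size of 4 are made up for illustration, and the tess levels are only set so the shader is complete:

```glsl
#version 400 core
layout(vertices = 4) out;          // 4 invocations per patch (assumed size)

in vec4 vs_color[];                // hypothetical per-vertex input from the VS
out vec4 tcs_color[];              // per-vertex output: one slot per invocation
patch out vec4 patch_avg;          // per-patch output: one slot for the whole patch

void main()
{
    // Writes to per-vertex outputs must be indexed by gl_InvocationID.
    tcs_color[gl_InvocationID] = vs_color[gl_InvocationID];
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Wait until every invocation in the patch has done its writes.
    barrier();

    // Reads of per-vertex outputs may use any invocation's index.
    if (gl_InvocationID == 0) {
        patch_avg = 0.25 * (tcs_color[0] + tcs_color[1] +
                            tcs_color[2] + tcs_color[3]);

        gl_TessLevelOuter[0] = 1.0;
        gl_TessLevelOuter[1] = 1.0;
        gl_TessLevelOuter[2] = 1.0;
        gl_TessLevelOuter[3] = 1.0;
        gl_TessLevelInner[0] = 1.0;
        gl_TessLevelInner[1] = 1.0;
    }
}
```

The write side is always indexed by gl_InvocationID, while the read side can use an arbitrary index. That asymmetry is what the rest of this mail is about.
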
Per-patch outputs map quite nicely onto the existing infrastructure, so the rest of the questions will be about per-vertex outputs.

One can represent per-vertex outputs as 2D output arrays. That means support for them needs to be added all over (which I've actually done, so I'm not complaining about the extra work but rather asking if it's a good idea). You might then have:

  DCL OUT[][0], GENERIC
  MOV ADDR[1].x, SV[0]            /* invocation id */
  MOV OUT[ADDR[1].x][0], TEMP[0]  /* store value */
  BARRIER
  MOV TEMP[0], OUT[3][0]          /* read output from invocation == 3 */

The advantage here is that it's all nice and consistent. The disadvantage is that we have to add a totally useless read of the invocation id and use it as a relative index for the store. At least the NVIDIA shaders have no way of writing other invocations' data even if they wanted to (without resorting to global memory accesses). So it complicates all sorts of logic for apparently no real benefit.

Another approach might be to bypass the invocation id when storing the output, but use it on reads. For example:

  DCL OUT[0], GENERIC
  MOV OUT[0], TEMP[0]
  BARRIER
  MOV TEMP[0], OUT[3][0]

This avoids having to teach TGSI about 2D outputs (especially reladdr ones). It seems a lot simpler, but it ignores the gl_InvocationID indexing that happens when writing the output. However, I don't think that's so bad. It also means that reads and writes are interpreted a little differently for OUTs, but that doesn't seem so bad either.

Thoughts?

  -ilia