Re: [Mesa-dev] [PATCH 3/7] gallium: add FBFETCH opcode to retrieve the current sample value

Nicolai Hähnle Fri, 06 Jan 2017 03:06:07 -0800

On 05.01.2017 17:59, Ilia Mirkin wrote:

On Thu, Jan 5, 2017 at 11:30 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:

On 05.01.2017 17:02, Ilia Mirkin wrote:


On Thu, Jan 5, 2017 at 10:48 AM, Nicolai Hähnle <nhaeh...@gmail.com>
wrote:


On 02.01.2017 21:41, Marek Olšák wrote:



On Mon, Jan 2, 2017 at 7:01 AM, Ilia Mirkin <imir...@alum.mit.edu>
wrote:



Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/auxiliary/tgsi/tgsi_info.c     |  2 +-
 src/gallium/docs/source/tgsi.rst           | 11 +++++++++++
 src/gallium/include/pipe/p_shader_tokens.h |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 37549aa..e34b8c7 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -106,7 +106,7 @@ static const struct tgsi_opcode_info
opcode_info[TGSI_OPCODE_LAST] =
    { 1, 3, 0, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP },
    { 1, 1, 0, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS },
    { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB },
-   { 0, 1, 0, 0, 0, 0, 1, NONE, "", 69 },      /* removed */
+   { 1, 1, 0, 0, 0, 0, 0, OTHR, "FBFETCH", TGSI_OPCODE_FBFETCH },
    { 1, 2, 0, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV },
    { 1, 2, 0, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 },
    { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL },
diff --git a/src/gallium/docs/source/tgsi.rst
b/src/gallium/docs/source/tgsi.rst
index d2d30b4..accbe1d 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2561,6 +2561,17 @@ Resource Access Opcodes
   image, while .w will contain the number of samples for multi-sampled
   images.

+.. opcode:: FBFETCH - Load data from framebuffer
+
+  Syntax: ``FBFETCH dst, output``
+
+  Example: ``FBFETCH TEMP[0], OUT[0]``
+
+  Returns the color of the current position in the framebuffer from
+  before this fragment shader invocation. Always returns the same
+  value from multiple calls for a particular output within a single
+  invocation.




I'm not a fan of this last sentence. It's true that we could somehow bend
things in the compiler to make the sentence true, but

(a) The statement is clearly false with a straight-forward implementation
of
the instruction: multiple fragment shaders can be simultaneously
in-flight
on the same pixel/sample. A second FBFETCH could happen after an earlier
invocation on the same pixel finished and get the new framebuffer value.

(b) I'm not aware of an API that actually requires this guarantee.

If the value is always the same, can it be declared as a system value
instead?




I don't know. I'd remove the statement about the value always being the
same
to begin with. And with an eye to how this actually ends up being
implemented, and possible interactions with
ARB_fragment_shader_interlock,
I'd say it makes sense (and our lives easier) for the TGSI to define
_when_
the framebuffer value is supposed to be read, and for that it makes sense
to
have an instruction for it.



In case it's not obvious, this is primarily for
KHR_blend_equation_advanced. It's illegal to use it with overdraw
without a barrier first.



Well, you get an undefined value :)

My point is that there's a plausible implementation of FBFETCH as an
instruction in which "the same value will be returned within a single
invocation" is only guaranteed if the application follows the rules
involving BlendBarrier.

There's also
KHR_blend_equation_advanced_coherent and EXT_shader_framebuffer_fetch,
which do allow overdraw. I'm not sure how the ordering is specified
(or, how it's specified for regular blending). The gl_LastFragData
stuff are a non-writable space, and as such, it makes sense that
they'd be computed once on invocation and then kept constant
throughout the shader run.



It still leaves open the question of _when_ they should be computed,
especially in the future when guarantees about the order of fragment shaders
are required (i.e. KHR_blend_equation_advanced_coherent): If you load the
data too early, you may lose some potential for parallelism because you have
to spend more time waiting for earlier invocations to finish. This obviously
depends on the details of the hardware, but ARB_fragment_shader_interlock
suggests that it is (or will be?) quite common to have synchronization that
is rather fine-grained.


This is a related to the implementation issues I had on nouveau with
the sysval - basically the code looks like

if (advanced blend is enabled) {
MOV TEMP[0], SV
IF
  use(sv)
ELSE
  use(sv)
use(sv some more)
}

And then the st_glsl_to_tgsi copy propagation logic pushes those SV's
down to the uses. This is further complicated by the fact that the
xyzw values are logically independent at the TGSI level. And now I'm
stuck either

(a) getting the value once at the top of the shader, even if advanced
blend might be disabled entirely
(b) fetching the texture once per bb
(c) figuring out some code motion heuristics which presently don't
exist in nouveau

So... the operation makes the most sense to me. How about "once you
call FBFETCH, all further invocations should return identical values"?

I don't know about "should". What about "may"? This would allow thedriver to "cache" the fbfetch result or fetch it only once at the startof the shader, but doesn't force it to do that. So things arestraightforward for the driver both when a simple texture-fetch-basedimplementation should be used _and_ when there really is fixed functionhardware. How about this:


> Example: ``FBFETCH TEMP[0], OUT[0]``
>
> Returns the current color of the current position in the framebuffer.
> The result is undefined if the current position is rendered to twice
> without a blend barrier in between.

I think that's pretty much the condition at the API level, isn't it?

Nicolai

This could still present difficulties in a situation like

if (a) {
  fbfetch();
}
do stuff
if (b) {
  fbfetch();
}

But I don't have any concrete problems with it I can identify, just
fear of the unknown.

Cheers,

  -ilia

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/7] gallium: add FBFETCH opcode to retrieve the current sample value

Reply via email to