On Wed, Jan 11, 2017 at 6:45 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote: > On Fri, Jan 6, 2017 at 6:05 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote: >> On 05.01.2017 17:59, Ilia Mirkin wrote: >>> >>> On Thu, Jan 5, 2017 at 11:30 AM, Nicolai Hähnle <nhaeh...@gmail.com> >>> wrote: >>>> >>>> On 05.01.2017 17:02, Ilia Mirkin wrote: >>>>> >>>>> >>>>> On Thu, Jan 5, 2017 at 10:48 AM, Nicolai Hähnle <nhaeh...@gmail.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> On 02.01.2017 21:41, Marek Olšák wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Jan 2, 2017 at 7:01 AM, Ilia Mirkin <imir...@alum.mit.edu> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> >>>>>>>> --- >>>>>>>> src/gallium/auxiliary/tgsi/tgsi_info.c | 2 +- >>>>>>>> src/gallium/docs/source/tgsi.rst | 11 +++++++++++ >>>>>>>> src/gallium/include/pipe/p_shader_tokens.h | 2 +- >>>>>>>> 3 files changed, 13 insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c >>>>>>>> b/src/gallium/auxiliary/tgsi/tgsi_info.c >>>>>>>> index 37549aa..e34b8c7 100644 >>>>>>>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c >>>>>>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c >>>>>>>> @@ -106,7 +106,7 @@ static const struct tgsi_opcode_info >>>>>>>> opcode_info[TGSI_OPCODE_LAST] = >>>>>>>> { 1, 3, 0, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP }, >>>>>>>> { 1, 1, 0, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS }, >>>>>>>> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB }, >>>>>>>> - { 0, 1, 0, 0, 0, 0, 1, NONE, "", 69 }, /* removed */ >>>>>>>> + { 1, 1, 0, 0, 0, 0, 0, OTHR, "FBFETCH", TGSI_OPCODE_FBFETCH }, >>>>>>>> { 1, 2, 0, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV }, >>>>>>>> { 1, 2, 0, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 }, >>>>>>>> { 1, 2, 1, 0, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL }, >>>>>>>> diff --git a/src/gallium/docs/source/tgsi.rst >>>>>>>> b/src/gallium/docs/source/tgsi.rst >>>>>>>> index d2d30b4..accbe1d 100644 >>>>>>>> --- a/src/gallium/docs/source/tgsi.rst >>>>>>>> +++ b/src/gallium/docs/source/tgsi.rst >>>>>>>> @@ -2561,6 +2561,17 @@ Resource Access Opcodes >>>>>>>> image, while .w will contain the number of samples for >>>>>>>> multi-sampled >>>>>>>> images. >>>>>>>> >>>>>>>> +.. opcode:: FBFETCH - Load data from framebuffer >>>>>>>> + >>>>>>>> + Syntax: ``FBFETCH dst, output`` >>>>>>>> + >>>>>>>> + Example: ``FBFETCH TEMP[0], OUT[0]`` >>>>>>>> + >>>>>>>> + Returns the color of the current position in the framebuffer from >>>>>>>> + before this fragment shader invocation. Always returns the same >>>>>>>> + value from multiple calls for a particular output within a single >>>>>>>> + invocation. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I'm not a fan of this last sentence. It's true that we could somehow >>>>>> bend >>>>>> things in the compiler to make the sentence true, but >>>>>> >>>>>> (a) The statement is clearly false with a straight-forward >>>>>> implementation >>>>>> of >>>>>> the instruction: multiple fragment shaders can be simultaneously >>>>>> in-flight >>>>>> on the same pixel/sample. A second FBFETCH could happen after an >>>>>> earlier >>>>>> invocation on the same pixel finished and get the new framebuffer >>>>>> value. >>>>>> >>>>>> (b) I'm not aware of an API that actually requires this guarantee. >>>>>> >>>>>> >>>>>>> If the value is always the same, can it be declared as a system value >>>>>>> instead? >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I don't know. I'd remove the statement about the value always being the >>>>>> same >>>>>> to begin with. And with an eye to how this actually ends up being >>>>>> implemented, and possible interactions with >>>>>> ARB_fragment_shader_interlock, >>>>>> I'd say it makes sense (and our lives easier) for the TGSI to define >>>>>> _when_ >>>>>> the framebuffer value is supposed to be read, and for that it makes >>>>>> sense >>>>>> to >>>>>> have an instruction for it. >>>>> >>>>> >>>>> >>>>> In case it's not obvious, this is primarily for >>>>> KHR_blend_equation_advanced. It's illegal to use it with overdraw >>>>> without a barrier first. >>>> >>>> >>>> >>>> Well, you get an undefined value :) >>>> >>>> My point is that there's a plausible implementation of FBFETCH as an >>>> instruction in which "the same value will be returned within a single >>>> invocation" is only guaranteed if the application follows the rules >>>> involving BlendBarrier. >>>> >>>> >>>>> There's also >>>>> KHR_blend_equation_advanced_coherent and EXT_shader_framebuffer_fetch, >>>>> which do allow overdraw. I'm not sure how the ordering is specified >>>>> (or, how it's specified for regular blending). The gl_LastFragData >>>>> stuff are a non-writable space, and as such, it makes sense that >>>>> they'd be computed once on invocation and then kept constant >>>>> throughout the shader run. >>>> >>>> >>>> >>>> It still leaves open the question of _when_ they should be computed, >>>> especially in the future when guarantees about the order of fragment >>>> shaders >>>> are required (i.e. KHR_blend_equation_advanced_coherent): If you load the >>>> data too early, you may lose some potential for parallelism because you >>>> have >>>> to spend more time waiting for earlier invocations to finish. This >>>> obviously >>>> depends on the details of the hardware, but ARB_fragment_shader_interlock >>>> suggests that it is (or will be?) quite common to have synchronization >>>> that >>>> is rather fine-grained. >>> >>> >>> This is a related to the implementation issues I had on nouveau with >>> the sysval - basically the code looks like >>> >>> if (advanced blend is enabled) { >>> MOV TEMP[0], SV >>> IF >>> use(sv) >>> ELSE >>> use(sv) >>> use(sv some more) >>> } >>> >>> And then the st_glsl_to_tgsi copy propagation logic pushes those SV's >>> down to the uses. This is further complicated by the fact that the >>> xyzw values are logically independent at the TGSI level. And now I'm >>> stuck either >>> >>> (a) getting the value once at the top of the shader, even if advanced >>> blend might be disabled entirely >>> (b) fetching the texture once per bb >>> (c) figuring out some code motion heuristics which presently don't >>> exist in nouveau >>> >>> So... the operation makes the most sense to me. How about "once you >>> call FBFETCH, all further invocations should return identical values"? >> >> >> I don't know about "should". What about "may"? This would allow the driver >> to "cache" the fbfetch result or fetch it only once at the start of the >> shader, but doesn't force it to do that. So things are straightforward for >> the driver both when a simple texture-fetch-based implementation should be >> used _and_ when there really is fixed function hardware. How about this: > > I like may. > >> >>> Example: ``FBFETCH TEMP[0], OUT[0]`` >>> >>> Returns the current color of the current position in the framebuffer. >>> The result is undefined if the current position is rendered to twice >>> without a blend barrier in between. >> >> I think that's pretty much the condition at the API level, isn't it? > > Yes. I just want to point out that this opcode is tied to the API operation. > > Marek, are you good with this non-SV approach? If so, I'll respin my > original series to take Nicolai's suggestions into account.
Yeah, it's OK. Marek _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev