On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez <curroje...@riseup.net> wrote: > Jason Ekstrand <ja...@jlekstrand.net> writes: > >> On Jul 9, 2015 7:57 AM, "Francisco Jerez" <curroje...@riseup.net> wrote: >>> >>> We were passing src0 alpha and oMask in reverse order. There seems to >>> be no good way to pass them in the correct order to the new-style >>> LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while >>> oMask is not. Just split src0 alpha in fixed-width registers and pass >>> them to LOAD_PAYLOAD as if they were part of the header as work-around >>> for now. >> >> Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it >> was only theoretical. I wasn't very familiar with what omask actually did >> and, since piglit didn't hit it, I wasn't sure if it was a real problem or >> not. I probably should have done more digging and written a piglit test at >> the time. My bad. >> >> One solution that I proposed at the time was to turn header_size into >> header_mask in the obvious way. We can still use 8 bits because we should >> never have a header source higher than 8. >> > > So your idea is to have one bit per source indicating whether it's > header-like or per-channel? I don't think that extends to instructions > other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the > same time header and payload.
You're right, it doesn't. We really shouldn't be conflating them. We should have header_mask and header_present be different fields. Maybe use a union to save space, but they should have different semantic meaning and different names. We should probably also have a compr4_mask and get rid of the hackery there. > One bit per 32B register would extend > easily but it would be rather ugly to deal with if you want to keep your > code SIMD width-invariant. > > I think if you go with the per-source flag you'll want it to be in its > own subclass of fs_inst. With its own subclass you could even have an > array of per-source sizes determining the number of registers read for > each source, which would be rather nice for the visitor (no need to > split vectors into components while passing them to LOAD_PAYLOAD). > > Still I think the most elegant solution would be to simply get rid of > the header/payload distinction by using force_writemask_all and, if it > proves to be necessary, fix the optimizer to get rid of redundant > force_writemask_all flags where it doesn't do it already. I really don't think that's a good long-term or short-term solution. How badly are you blocking on this? I don't really have a lot of extra time to work on this at the moment but can carve some out if needed. --jason >> Thoughts? >> --Jason >> >>> I've written a piglit test that demonstrates the problem by using >>> gl_SampleMask from a fragment shader with multiple color outputs [1]. >>> >>> [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html >>> --- >>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26 >> +++++++++++++++++--------- >>> 1 file changed, 17 insertions(+), 9 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> index 94d6a58..304ae74 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder >> &bld, >>> length++; >>> } >>> >>> + if (src0_alpha.file != BAD_FILE && color0.file != BAD_FILE) { >>> + /* Neat, we need to chop the src0 alpha component and pass it as >> part of >>> + * the header even though it has per-channel semantics, because >> the next >>> + * optional field is header-like and LOAD_PAYLOAD requires all such >>> + * fields to form a contiguous segment at the beginning of the >> message. >>> + */ >>> + for (unsigned i = 0; i < exec_size / 8; i++) { >>> + setup_color_payload(&sources[length], src0_alpha, 1, 8, >>> + use_2nd_half || i == 1); >>> + length++; >>> + } >>> + } >>> + >>> prog_data->uses_omask = >>> prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK); >>> if (prog_data->uses_omask) { >>> @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder >> &bld, >>> offset(this->outputs[0], bld, 3), >>> 1, exec_size, false); >>> length += 4; >>> - } else if (color1.file == BAD_FILE) { >>> - if (src0_alpha.file != BAD_FILE) { >>> - setup_color_payload(&sources[length], src0_alpha, 1, exec_size, >> false); >>> - length++; >>> - } >>> - >>> - setup_color_payload(&sources[length], color0, components, >>> - exec_size, use_2nd_half); >>> - length += 4; >>> } else { >>> setup_color_payload(&sources[length], color0, components, >>> exec_size, use_2nd_half); >>> length += 4; >>> + >>> + } >>> + >>> + if (color1.file != BAD_FILE) { >>> setup_color_payload(&sources[length], color1, components, >>> exec_size, use_2nd_half); >>> length += 4; >>> -- >>> 2.4.3 >>> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev