On Mon, Oct 19, 2015 at 4:19 AM, Francisco Jerez <curroje...@riseup.net> wrote: > Neil Roberts <n...@linux.intel.com> writes: > >> Just a thought, would it be better to move this check into the >> eliminate_find_live_channel optimisation? That way it could catch >> sources that become immediates through later optimisations. One problem >> with this that I've seen before is that eliminating the >> FIND_LIVE_CHANNEL doesn't cause the subsequent BROADCAST to be >> eliminated because the copy propagation doesn't work. > > I believe in this particular case the BROADCAST instruction should > already be eliminated (by opt_algebraic), because its first source is an > immediate, so this optimization would in fact be redundant with > opt_algebraic+constant propagation if it weren't because the latter is > unable to handle scalar copies.
Yes, is happens as you say: first the FIND_LIVE_CHANNEL gets eliminated because it's outside control flow (fs_visitor::eliminate_find_live_channel). Then the broadcast gets reduced to a MOV in opt_algebraic because src0 is uniform (immediate constant). The problem then is the dst of the MOV has stride 0 which can_propagate_from() then bails on. > Another possibility that would likely > be more general than this (because it would handle cases in which, as > Neil said, the argument becomes an immediate only after optimization, > and because it would also avoid the issue with liveness analysis > extending the live ranges of scalar values incorrectly [1]) would be to > extend the destination register of the BROADCAST instructions to a > vector [2] and then add a case to the switch statement of > try_constant_propagate() so that the immediate MOV resulting from > opt_algebraic is propagated into surface instructions (see attachment). I wrote exactly this code to make constant propagate work, plus adding the extra opcodes to the switch statement. It works and I could certainly send that out after this, but 1) This doesn't mean we shouldn't do the if (src.file == IMM) shortcut. It saves the compiler a bit of work in the very common case of non-indirect buffer access. 2) I'm not even sure it makes sense to extend copy-propagation to do this (which is why I went back to just the IMM test). Anything that would be an immediate at this point should be an immediate, if not we're missing something in nir. Kristian >> I made this patch a while ago but I never posted it anywhere because >> it's a of a kludge and it would probably be better to fix the copy >> propagation: >> >> https://github.com/bpeel/mesa/commit/e4c3286075f891f466fb8558106d2aaa >> > Heh, yeah, I'd rather fix copy propagation instead, which I believe will > become much easier with the use-def-chain analysis pass I'm working on. > >> Either way though I don't think it would do any harm to have Kristian's >> patch as well even if we did improve elimintate_find_live_channel so it >> is: >> >> Reviewed-by: Neil Roberts <n...@linux.intel.com> >> >> - Neil >> >> Kristian Høgsberg Kristensen <k...@bitplanet.net> writes: >> >>> An immdiate is already uniform so just return it up front. Without this, >>> brw_fs_surface_builder ends up passing immediate surface indices through >>> SHADER_OPCODE_BROADCAST. This writes to a stride 0 dst, which we can't >>> constant propagate out of, and further, we don't constant propagate into >>> the typed/untype read/write opcodes at all. The end result is that all >>> typed/untyped read/write/atomics end up as indirect sends. >>> >>> Code generation should always produce either an immediate or an actual >>> indirect surface index, so we can fix this by just special casing >>> immediates in emit_uniformize. >>> >>> Signed-off-by: Kristian Høgsberg Kristensen <k...@bitplanet.net> >>> --- >>> src/mesa/drivers/dri/i965/brw_fs_builder.h | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h >>> b/src/mesa/drivers/dri/i965/brw_fs_builder.h >>> index df10a9d..98ce71e 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h >>> +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h >>> @@ -394,6 +394,9 @@ namespace brw { >>> const dst_reg chan_index = component(vgrf(BRW_REGISTER_TYPE_UD), >>> 0); >>> const dst_reg dst = component(vgrf(src.type), 0); >>> >>> + if (src.file == IMM) >>> + return src; >>> + >>> ubld.emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, chan_index); >>> ubld.emit(SHADER_OPCODE_BROADCAST, dst, src, chan_index); >>> >>> -- >>> 2.6.2 >>> >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > [1] http://lists.freedesktop.org/archives/mesa-dev/2015-October/097085.html > [2] > http://lists.freedesktop.org/archives/mesa-dev/attachments/20151014/25dd38dc/attachment.patch > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev