On 28/09/15 17:05, Jason Ekstrand wrote:
>
> On Sep 28, 2015 2:09 AM, "Alejandro Piñeiro" <apinhe...@igalia.com> wrote:
> >
> > Hi,
> >
> > TL;DR:
> >
> > As there are several people working on improving the shader quality at
> > vec4 using NIR, to avoid overlapping, I'm explicitly announcing that this
> > week I will work on implementing an equivalent to fs_cmod_propagation but
> > for the vec4 case.
> >
> > More details:
> >
> > While checking shader-db HURT regressions, shaders like these:
> >   unity/15.shader_test
> >   warzone2100/1.shader_test
> >   humus-celshading/4.shader_test
> >
> > are emitting extra MOVs when conditions are involved. Writing the
> > equivalent fs shader, I found those are optimized by
> > opt_cmod_propagation. I vaguely remembered that Jason mentioned it some
> > months ago, and I found this email [1], where he suggested implementing
> > that pass. So just in case someone else was already doing that, I'm
> > sending this email.
>
> Hey Alejandro,
>
> First off, thanks for working on this. Now that we've fixed the type
> issues in copy propagation and register coalesce, I think this is
> probably the last major back end issue required for getting decent
> results out of NIR. Not that more work can't be done (it always can)
> but this solves the last known NIR->backend translation issue.
>
> I do have one comment for you to think about. In the fs backend we
> never move a flag result *to* a GRF. We only ever use a CMP with a
> GRF destination. In vec4, for our implementation of nir_op_banyN and
> nir_op_ballN, we do something like this:
>
>     CMP null a b
>     MOV reg 0ud
>     MOV(+f0.0) reg 0xffffffffud
>
> Where we use the ANY4H or ALL4H predicate on the second MOV. We should
> pick up on this as a CMP that generates a special predicate so that a
> MOV.nz that moves it to the flag actually turns into a use of the ANY
> or ALL predicate.
>
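For reference, the basic transformation under discussion (without the ANY4H/ALL4H handling) can be sketched on a toy IR. This is only an illustration in C++, under loud assumptions: the Inst struct, the Op and Cmod enums, NULL_REG, IMM_ZERO and the cmod_propagation function below are all hypothetical names made up for this sketch. They are not Mesa's instruction classes and not the real brw_fs_cmod_propagation code. The sketch also ignores types, writemasks and swizzles, which a real vec4 pass would have to check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR for illustration only. None of these names exist in Mesa.
enum class Op { ADD, MOV, CMP, SEL };
enum class Cmod { NONE = 0, Z, NZ };

const int NULL_REG = -1;  // stands in for the null register
const int IMM_ZERO = -2;  // stands in for an immediate 0 source

struct Inst {
    Op op;
    int dst;
    int src0;
    int src1;
    Cmod cmod;        // conditional modifier; anything but NONE writes the flag
    bool reads_flag;  // true for predicated instructions and flag-based SEL
};

static bool writes_flag(const Inst &inst) { return inst.cmod != Cmod::NONE; }

// Fold "CMP.cmod null, rX, 0" into the earlier instruction that wrote rX,
// provided nothing in between reads or writes the flag register.
// Returns true if any instruction was removed.
bool cmod_propagation(std::vector<Inst> &prog)
{
    bool progress = false;
    for (std::size_t i = 0; i < prog.size(); ++i) {
        const Inst cmp = prog[i];  // copy: prog may be modified below
        if (cmp.op != Op::CMP || cmp.dst != NULL_REG ||
            cmp.src1 != IMM_ZERO || cmp.cmod == Cmod::NONE)
            continue;

        // Scan backwards for the instruction that produced cmp.src0.
        for (std::size_t j = i; j-- > 0;) {
            Inst &scan = prog[j];
            if (scan.dst == cmp.src0) {
                // Only fold if the producer does not already use the flag.
                if (!writes_flag(scan) && !scan.reads_flag) {
                    scan.cmod = cmp.cmod;          // producer now sets the flag
                    prog.erase(prog.begin() + i);  // the CMP is redundant
                    --i;                           // revisit the slot we freed
                    progress = true;
                }
                break;
            }
            // An intervening flag access makes moving the flag write unsafe.
            if (writes_flag(scan) || scan.reads_flag)
                break;
        }
    }
    return progress;
}
```

With this toy model, an ADD followed by "CMP.nz null, result, 0" collapses into a single ADD.nz, while a sequence with another flag write between the producer and the CMP is left alone. A unit test for a real pass would presumably check both the positive and the negative cases in the same way.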
Ok, thanks for the hints. I will probably first try to get the basic
functionality working, based on the current brw_fs_cmod_propagation, and
then try to be smarter based on your comments.

BTW, I realized that there is a unit test, test_fs_cmod_propagation. I
assume that a vec4 equivalent is expected, and I will work on that more or
less at the same time as I work on the optimization pass. Just saying in
case I'm wrong.

Best regards

-- 
Alejandro Piñeiro (apinhe...@igalia.com)
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev