Tested on Haswell. Patch works well for me, thanks!

Tested-by: Vadym Shovkoplias <vadym.shovkopl...@globallogic.com>
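For anyone skimming the thread, the reordering hazard described in the quoted commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_vec4`, `run_original`, `run_coalesced`, and `should_try_coalesce` are hypothetical names that do not exist in Mesa, and the elided vec4 sources ("...") are assumed here to be the matching components of ssa_1. Only the guard condition mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy 4-component register. Hypothetical; not a NIR data structure. */
typedef struct { float c[4]; } toy_vec4;

/* Original program order:
 *    ssa_1 = fadd r1, r2
 *    r3.x  = fneg(r2)
 *    r3    = vec4(ssa_1.x, ssa_1.y, ssa_1.z, ssa_1.w)   <- assumed components
 * The vec is the last writer of r3, so the fneg result is overwritten.
 */
static toy_vec4 run_original(toy_vec4 r1, toy_vec4 r2)
{
   toy_vec4 ssa_1, r3 = {{0}};
   for (int i = 0; i < 4; i++)
      ssa_1.c[i] = r1.c[i] + r2.c[i];
   r3.c[0] = -r2.c[0];
   for (int i = 0; i < 4; i++)
      r3.c[i] = ssa_1.c[i];
   return r3;
}

/* Incorrectly coalesced order: the fadd writes r3 directly, so the
 * write from the fneg now lands AFTER it and survives in r3.x.
 */
static toy_vec4 run_coalesced(toy_vec4 r1, toy_vec4 r2)
{
   toy_vec4 r3 = {{0}};
   for (int i = 0; i < 4; i++)
      r3.c[i] = r1.c[i] + r2.c[i];
   r3.c[0] = -r2.c[0];
   return r3;
}

/* Same shape as the patched condition: only attempt coalescing for
 * component i when the original vecN had an SSA destination and the
 * component has not been written yet.
 */
static bool should_try_coalesce(bool vec_had_ssa_dest,
                                unsigned finished_write_mask,
                                unsigned i)
{
   assert(i < 4);
   return vec_had_ssa_dest && !(finished_write_mask & (1u << i));
}
```

With r1 = (1, 2, 3, 4) and r2 = (10, 20, 30, 40), the two orderings disagree in the x component, which is the kind of difference the patch avoids by skipping coalescing when the vec destination is a register.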

On Fri, Mar 23, 2018 at 8:35 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:

> Otherwise we may end up trying to coalesce in a case such as
>
>     ssa_1 = fadd r1, r2
>     r3.x = fneg(r2);
>     r3 = vec4(ssa_1, ssa_1.y, ...)
>
> and that would cause us to move the writes to r3 from the vec to the
> fadd which would re-order them with respect to the write from the fneg.
> In order to solve this, we just don't coalesce if the destination of the
> vec is not SSA.  We could try to get clever and still coalesce if there
> are no writes to the destination of the vec between the vec and the ALU
> source.  However, since registers only come from phi webs and indirects,
> the chances of having a vec with a register destination that is actually
> coalescable into its source is very slim.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
> Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when
> possible"
> Reported-by: Vadym Shovkoplias <vadym.shovkopl...@globallogic.com>
> Cc: Andriy Khulap <andriy.khu...@globallogic.com>
> Cc: Vadym Shovkoplias <vadym.shovkopl...@globallogic.com>
> ---
>  src/compiler/nir/nir_lower_vec_to_movs.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_lower_vec_to_movs.c b/src/compiler/nir/nir_lower_vec_to_movs.c
> index 711ddd3..8b24376 100644
> --- a/src/compiler/nir/nir_lower_vec_to_movs.c
> +++ b/src/compiler/nir/nir_lower_vec_to_movs.c
> @@ -230,6 +230,7 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
>           continue; /* The loop */
>        }
>
> +      bool vec_had_ssa_dest = vec->dest.dest.is_ssa;
>        if (vec->dest.dest.is_ssa) {
>           /* Since we insert multiple MOVs, we have a register destination. */
>           nir_register *reg = nir_local_reg_create(impl);
> @@ -263,7 +264,11 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
>           if (!(vec->dest.write_mask & (1 << i)))
>              continue;
>
> -         if (!(finished_write_mask & (1 << i)))
> +         /* Coalescing moves the register writes from the vec up to the ALU
> +          * instruction in the source.  We can only do this if the original
> +          * vecN had an SSA destination.
> +          */
> +         if (vec_had_ssa_dest && !(finished_write_mask & (1 << i)))
>              finished_write_mask |= try_coalesce(vec, i);
>
>           if (!(finished_write_mask & (1 << i)))
> --
> 2.5.0.400.gff86faf
>
>

-- 
Vadym Shovkoplias | Senior Software Engineer
GlobalLogic
P +380.57.766.7667 M +3.8050.931.7304 S vadym.shovkoplias
www.globallogic.com <http://www.globallogic.com/>
http://www.globallogic.com/email_disclaimer.txt
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev