https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115629

Richard Sandiford <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #8 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Perhaps I'm missing the point, but I think one of the issues here is that we
(still) don't model that MASK_LOAD sets inactive elements to zero.  Inactive
elements are currently undefined instead.  (I think Robin mentioned that
assuming zero is problematic for RVV, so we might need an explicit MASK_LOAD
argument for inactive elements, like for COND_ADD etc.)
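
To illustrate the difference, here is a rough scalar model in C. It is not GCC code: mask_load, cmp_ne_zero and ne_mask_within_loop_mask are hypothetical helpers, masks are plain 8-bit integers, and a fill value of -1 merely stands in for "undefined" (a real undefined lane could hold anything). The sketch checks whether a "!= 0" compare on the loaded vector can set lanes outside the load mask.

```c
#include <stdint.h>

#define LANES 8

/* Hypothetical model of a masked load: active lanes read memory,
   inactive lanes receive FILL.  */
static void mask_load(int32_t *dst, const int32_t *src,
                      uint8_t mask, int32_t fill)
{
  for (int i = 0; i < LANES; i++)
    dst[i] = ((mask >> i) & 1) ? src[i] : fill;
}

/* Lane-wise V != 0, returned as a bitmask.  */
static uint8_t cmp_ne_zero(const int32_t *v)
{
  uint8_t m = 0;
  for (int i = 0; i < LANES; i++)
    if (v[i] != 0)
      m |= (uint8_t) (1u << i);
  return m;
}

/* Return 1 if (loaded != 0) sets no lane outside the loop mask,
   i.e. if a later "& loop_mask" would be redundant for this input.  */
static int ne_mask_within_loop_mask(int32_t fill)
{
  const int32_t mem[LANES] = { 1, 0, 2, 0, 3, 0, 4, 0 };
  const uint8_t loop_mask = 0x0f;   /* only the low four lanes are active */
  int32_t v[LANES];

  mask_load(v, mem, loop_mask, fill);
  uint8_t ne = cmp_ne_zero(v);
  return (ne & (uint8_t) ~loop_mask) == 0;
}
```

In this model ne_mask_within_loop_mask(0) returns 1, while ne_mask_within_loop_mask(-1) returns 0, so the extra AND can only be dropped once the zero fill is something the compiler is allowed to rely on.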

So quoting the IL in comment 4:

  # loop_mask_63 = PHI <next_mask_95(10), max_mask_94(20)>
  vect__4.10_64 = .MASK_LOAD (vectp_a.8_53, 32B, loop_mask_63);
  mask__31.11_66 = vect__4.10_64 != { 0, ... };
  mask__56.12_67 = ~mask__31.11_66;
  vec_mask_and_70 = mask__56.12_67 & loop_mask_63;
  vect__7.15_71 = .MASK_LOAD (vectp_c.13_68, 32B, vec_mask_and_70);
  mask__22.16_73 = vect__7.15_71 == { 0, ... };
  mask__34.17_75 = vec_mask_and_70 & mask__22.16_73;

I think this and...

  vect_iftmp.20_78 = .MASK_LOAD (vectp_d.18_76, 32B, mask__34.17_75);
  vect__61.21_79 = vect__4.10_64 | vect__7.15_71;
  mask__35.22_81 = vect__61.21_79 != { 0, ... };
  vec_mask_and_84 = mask__35.22_81 & loop_mask_63;

...this have to be kept until we model inactive elements.

  vect_iftmp.25_85 = .MASK_LOAD (vectp_b.23_82, 32B, vec_mask_and_84);
  _86 = mask__34.17_75 & loop_mask_63;

This one is really curious though :)  Why does the code think that the loop
mask is needed here?  Does the code think the mask is needed for correctness,
or is the scalar_cond_masked_set optimisation misfiring?
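
As pure mask arithmetic, at least, the extra AND looks redundant: in the IL quoted above, mask__34.17_75 is vec_mask_and_70 & mask__22.16_73, and vec_mask_and_70 is already mask__56.12_67 & loop_mask_63. A small brute-force check in C over 4-lane masks (a sketch only; second_and_is_redundant is a hypothetical helper, and the variable names just mirror the SSA names):

```c
#include <stdint.h>

/* Return 1 if, for every combination of 4-lane masks, ANDing
   mask__34 with the loop mask a second time changes nothing.  */
static int second_and_is_redundant(void)
{
  for (unsigned loop_mask = 0; loop_mask < 16; loop_mask++)
    for (unsigned mask_56 = 0; mask_56 < 16; mask_56++)
      for (unsigned mask_22 = 0; mask_22 < 16; mask_22++)
        {
          unsigned vec_mask_and_70 = mask_56 & loop_mask;
          unsigned mask_34 = vec_mask_and_70 & mask_22;
          unsigned v86 = mask_34 & loop_mask;   /* models _86 */
          if (v86 != mask_34)
            return 0;
        }
  return 1;
}
```

This only says that the two masks are equal as bit patterns. It says nothing about why the vectoriser emitted the second AND.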

  vect_iftmp.26_87 = VEC_COND_EXPR <_86, vect_iftmp.20_78, vect_iftmp.25_85>;
  .MASK_STORE (vectp_res.27_88, 32B, loop_mask_63, vect_iftmp.26_87);
