> Pan, can you confirm what path we take through extract_low_bits? Thanks Jeff for comments, will have a try soon and keep you posted.
Pan -----Original Message----- From: Jeff Law <jeffreya...@gmail.com> Sent: Tuesday, February 27, 2024 11:03 PM To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; Wang, Yanzhang <yanzhang.w...@intel.com>; rdapp....@gmail.com; Liu, Hongtao <hongtao....@intel.com> Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val On 2/26/24 07:22, pan2...@intel.com wrote: > From: Pan Li <pan2...@intel.com> > > We allowed vector type for get_stored_val when read is less than or > equal to store in previous. Unfortunately, we missed to adjust the > validate_subreg part accordingly. When the vector type's size is > less than vector register, it will be considered as invalid in the > validate_subreg. > > Consider the validate_subreg is kind of a can with worms and we are > in stage 4. We will fix the issue from the DES side, and make sure > the subreg is valid for both the read_mode and store_mode before > perform the real gen_lowpart. > > The below test are passed for this patch: > > * The x86 bootstrap test. > * The x86 regression test. > * The riscv regression test. > * The aarch64 regression test. > > gcc/ChangeLog: > > * dse.cc (get_stored_val): Add validate_subreg check before > perform the gen_lowpart for rtl. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-fre-44.c: Add compile option to trigger > the ICE. > * gcc.target/riscv/rvv/base/bug-6.c: New test. > > Signed-off-by: Pan Li <pan2...@intel.com> > --- > gcc/dse.cc | 4 +++- > gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c | 2 +- > .../gcc.target/riscv/rvv/base/bug-6.c | 22 +++++++++++++++++++ > 3 files changed, 26 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-6.c > > diff --git a/gcc/dse.cc b/gcc/dse.cc > index edc7a1dfecf..1596da91da0 100644 > --- a/gcc/dse.cc > +++ b/gcc/dse.cc > @@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode > read_mode, > copy_rtx (store_info->const_rhs)); > else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode) > && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE > (store_mode)) > - && targetm.modes_tieable_p (read_mode, store_mode)) > + && targetm.modes_tieable_p (read_mode, store_mode) > + && validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs), > + subreg_lowpart_offset (read_mode, store_mode))) > read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs)); > else > read_reg = extract_low_bits (read_mode, store_mode, So we're just changing whether or not we call gen_lowpart directly or go through extract_low_bits, which may in turn generate subreg, call gen_lowpart itself and a few other things. I'm guessing that extract_low_bits is going to return NULL in this case via this code (specifically the second test). > if (!targetm.modes_tieable_p (src_int_mode, src_mode)) > return NULL_RTX; > if (!targetm.modes_tieable_p (int_mode, mode)) > return NULL_RTX; Pan, can you confirm what path we take through extract_low_bits? One might argue that we should just call into extract_low_bits unconditionally since it'll ultimately call gen_lowpart when it safely can. The downside is that's a bigger change than I'd like at this stage in our development cycle. I wouldn't be surprised if other direct uses of gen_lowpart have similar problems. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c > index f79b4c142ae..624a00a4f32 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O -fdump-tree-fre1" } */ > +/* { dg-options "-O -fdump-tree-fre1 -O3 -ftree-vectorize" } */ > > struct A { float x, y; }; > struct B { struct A u; }; So this may compromise the original intent of this test. What I would suggest instead is to create a new test with the dg-do & dg-options you want with a #include "ssa-fre-44.c". So to move forward. Let's confirm the path we take through extract_low_bits matches expectations and fixup the testsuite change. Jeff