I have posted v4 at the link below; please ignore v3.

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636216.html

Sorry for the inconvenience.

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Sunday, November 12, 2023 10:31 AM
To: Richard Sandiford <richard.sandif...@arm.com>; Jeff Law 
<jeffreya...@gmail.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
<yanzhang.w...@intel.com>; kito.ch...@gmail.com; richard.guent...@gmail.com
Subject: RE: [PATCH v2] DSE: Allow vector type for get_stored_val when read < 
store

Thanks Richard S and Jeff for comments.

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

I used known_lt in v2 so that the equal-size case would still go through the
original code path.
I have just tried known_le and got a number of ICEs during testing. I suspect
they are related to the latent bug that Richard S mentioned.

> instead.  Alternatively, we could remove the is_constant condition
> and fix PR87815 in a different way, e.g. by protecting the
> smallest_int_mode_for_size with a tighter condition.  That might
> allow a similar DSE optimisation to this patch for nonzero offsets,
> thanks to:

So it looks like we should first fix PR87815 in the way Richard S suggested,
before we switch to known_le for vectors here.

I will give it a try soon and keep you posted.

Pan

-----Original Message-----
From: Richard Sandiford <richard.sandif...@arm.com> 
Sent: Saturday, November 11, 2023 11:23 PM
To: Jeff Law <jeffreya...@gmail.com>
Cc: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang <yanzhang.w...@intel.com>; 
kito.ch...@gmail.com; richard.guent...@gmail.com
Subject: Re: [PATCH v2] DSE: Allow vector type for get_stored_val when read < 
store

Jeff Law <jeffreya...@gmail.com> writes:
> On 11/8/23 23:08, pan2...@intel.com wrote:
>> From: Pan Li <pan2...@intel.com>
>> 
>> Update in v2:
>> * Move vector type support to get_stored_val.
>> 
>> Original log:
>> 
>> This patch would like to allow the vector mode in the
>> get_stored_val in the DSE. It is valid for the read
>> rtx if and only if the read bitsize is less than the
>> stored bitsize.
>> 
>> Given below example code with
>> --param=riscv-autovec-preference=fixed-vlmax.
>> 
>> vuint8m1_t test () {
>>    uint8_t arr[32] = {
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>      1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>>    };
>> 
>>    return __riscv_vle8_v_u8m1(arr, 32);
>> }
>> 
>> Before this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    sp,sp,-32
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a3,32
>>    vl2re64.v       v2,0(a5)
>>    vsetvli zero,a3,e8,m1,ta,ma
>>    vs2r.v  v2,0(sp)             <== Unnecessary store to stack
>>    vle8.v  v1,0(sp)             <== Ditto
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> After this patch:
>> test:
>>    lui     a5,%hi(.LANCHOR0)
>>    addi    a5,a5,%lo(.LANCHOR0)
>>    li      a4,32
>>    addi    sp,sp,-32
>>    vsetvli zero,a4,e8,m1,ta,ma
>>    vle8.v  v1,0(a5)
>>    vs1r.v  v1,0(a0)
>>    addi    sp,sp,32
>>    jr      ra
>> 
>> Below tests are passed within this patch:
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression test.
>> 
>>      PR target/111720
>> 
>> gcc/ChangeLog:
>> 
>>      * dse.cc (get_stored_val): Allow vector mode if the read
>>      bitsize is less than stored bitsize.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>      * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
>>      * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
> We're always getting the lowpart here AFAICT and it appears that all the 
> right thing should happen if gen_lowpart_common fails (it returns NULL, 
> which bubbles up and is the right return value from get_stored_val if it 
> can't be optimized).

Yeah, we should always be operating on the lowpart, but it looks
like there's a latent bug.  This check:

  if (gap.is_constant () && maybe_ne (gap, 0))
    {
      ...
    }
  else ...

means that we ignore the gap if it's a nonzero runtime value.
I guess it should be:

  if (maybe_ne (gap, 0))
    {
      if (!gap.is_constant ())
        return NULL_RTX;
      ...
    }

instead.  Alternatively, we could remove the is_constant condition
and fix PR87815 in a different way, e.g. by protecting the
smallest_int_mode_for_size with a tighter condition.  That might
allow a similar DSE optimisation to this patch for nonzero offsets,
thanks to:

      if (multiple_p (shift, GET_MODE_BITSIZE (new_mode))
          && known_le (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)))
        {
          /* Try to implement the shift using a subreg.  */
          ...

> Did you want to use known_le so that you'd pick up the case when the two 
> modes are the same size?  Or was known_lt the test you really wanted 
> (and if so, why).

Agree it should be known_le FWIW.
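
For illustration only, here is a rough sketch of the two points above using a
toy model rather than GCC's real poly_int.  Everything in it is an assumption
made for the example: toy_poly, the branch enum, quoted_check, suggested_check
and check_examples are hypothetical names, and the sizes and gaps are made-up
values.  The helpers follow the usual meaning of maybe_ne, known_le and
known_lt for a value of the form c0 + c1 * x with a runtime x >= 0.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient poly_int (NOT GCC's implementation):
// the value is c0 + c1 * x, where x is a runtime quantity known only to be >= 0.
struct toy_poly {
  int64_t c0;  // compile-time constant part
  int64_t c1;  // multiplier of the runtime quantity x
  bool is_constant() const { return c1 == 0; }
};

// True if a and b can differ for some x >= 0.
bool maybe_ne(toy_poly a, toy_poly b) { return a.c0 != b.c0 || a.c1 != b.c1; }
// True if a <= b for every x >= 0.
bool known_le(toy_poly a, toy_poly b) { return a.c0 <= b.c0 && a.c1 <= b.c1; }
// True if a < b for every x >= 0.
bool known_lt(toy_poly a, toy_poly b) { return a.c0 < b.c0 && a.c1 <= b.c1; }

// Hypothetical labels for which arm of the check is taken.
enum class branch { gap, no_gap, give_up };

// Shape of the check as quoted: a nonzero runtime gap is not constant,
// so it falls through to the "else" arm and the gap is ignored.
branch quoted_check(toy_poly gap) {
  const toy_poly zero{0, 0};
  if (gap.is_constant() && maybe_ne(gap, zero))
    return branch::gap;
  return branch::no_gap;
}

// Shape of the suggested check: any possibly-nonzero gap enters the first
// arm, and a non-constant one makes the function give up.
branch suggested_check(toy_poly gap) {
  const toy_poly zero{0, 0};
  if (maybe_ne(gap, zero)) {
    if (!gap.is_constant())
      return branch::give_up;
    return branch::gap;
  }
  return branch::no_gap;
}

void check_examples() {
  const toy_poly runtime_gap{0, 16};  // 16 * x bytes, nonzero when x > 0
  const toy_poly const_gap{4, 0};     // 4 bytes
  assert(quoted_check(runtime_gap) == branch::no_gap);     // gap ignored
  assert(suggested_check(runtime_gap) == branch::give_up);
  assert(quoted_check(const_gap) == branch::gap);
  assert(suggested_check(const_gap) == branch::gap);

  // Two modes of the same (made-up) size 16 + 16 * x:
  const toy_poly vec{16, 16};
  assert(!known_lt(vec, vec));  // known_lt rejects the equal-size case
  assert(known_le(vec, vec));   // known_le accepts it
}
```

Calling check_examples () exercises both cases: the runtime gap is the one the
two checks disagree on, and the equal-size pair is the one known_lt and
known_le disagree on.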

Thanks,
Richard
