On Sun, Oct 29, 2023 at 10:44 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 10/20/23 03:53, Christoph Muellner wrote:
> > From: Christoph Müllner <christoph.muell...@vrull.eu>
> >
> > The XTheadMemIdx ISA extension provides a additional load and store
> > instructions with new addressing modes.
> >
> > The following memory accesses types are supported:
> > * load: b,bu,h,hu,w,wu,d
> > * store: b,h,w,d
> >
> > The following addressing modes are supported:
> > * immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
> >    l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib
> > * register offset with additional immediate offset (11 instructions):
> >    lr<ltype>, sr<stype>
> > * zero-extended register offset with additional immediate offset
> >    (11 instructions): lur<ltype>, sur<stype>
> >
> > The RISC-V base ISA does not support index registers, so the changes
> > are kept separate from the RISC-V standard support as much as possible.
> >
> > To combine the shift/multiply instructions into the memory access
> > instructions, this patch comes with a few insn_and_split optimizations
> > that allow the combiner to do this task.
> >
> > Handling the different cases of extensions results in a couple of INSNs
> > that look redundant on first view, but they are just the equivalence
> > of what we already have for Zbb as well. The only difference is, that
> > we have much more load instructions.
> >
> > We already have a constraint with the name 'th_f_fmv', therefore,
> > the new constraints follow this pattern and have the same length
> > as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').
> >
> > The added tests ensure that this feature won't regress without notice.
> > Testing: GCC regression test suite, GCC bootstrap build, and
> > SPEC CPU 2017 intrate (base&peak) on C920.
> >
> > Signed-off-by: Christoph Müllner <christoph.muell...@vrull.eu>
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/constraints.md (th_m_mia): New constraint.
> >       (th_m_mib): Likewise.
> >       (th_m_mir): Likewise.
> >       (th_m_miu): Likewise.
> >       * config/riscv/riscv-protos.h (enum riscv_address_type):
> >       Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
> >       and ADDRESS_REG_WB and their documentation.
> >       (struct riscv_address_info): Add new field 'shift' and
> >       document the field usage for the new address types.
> >       (riscv_valid_base_register_p): New prototype.
> >       (th_memidx_legitimate_modify_p): Likewise.
> >       (th_memidx_legitimate_index_p): Likewise.
> >       (th_classify_address): Likewise.
> >       (th_output_move): Likewise.
> >       (th_print_operand_address): Likewise.
> >       * config/riscv/riscv.cc (riscv_index_reg_class):
> >       Return GR_REGS for XTheadMemIdx.
> >       (riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
> >       (riscv_classify_address): Call th_classify_address() on top.
> >       (riscv_output_move): Call th_output_move() on top.
> >       (riscv_print_operand_address): Call th_print_operand_address()
> >       on top.
> >       * config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
> >       (HAVE_PRE_MODIFY_DISP): Likewise.
> >       * config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable
> >       for XTheadMemIdx.
> >       (*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand,
> >       create INSN with same name and disable it for XTheadMemIdx.
> >       (extendsidi2): Likewise.
> >       (*extendsidi2_internal): Disable for XTheadMemIdx.
> >       * config/riscv/thead.cc (valid_signed_immediate): New helper
> >       function.
> >       (th_memidx_classify_address_modify): New function.
> >       (th_memidx_legitimate_modify_p): Likewise.
> >       (th_memidx_output_modify): Likewise.
> >       (is_memidx_mode): Likewise.
> >       (th_memidx_classify_address_index): Likewise.
> >       (th_memidx_legitimate_index_p): Likewise.
> >       (th_memidx_output_index): Likewise.
> >       (th_classify_address): Likewise.
> >       (th_output_move): Likewise.
> >       (th_print_operand_address): Likewise.
> >       * config/riscv/thead.md (*th_memidx_operand): New splitter.
> >       (*th_memidx_zero_extendqi<SUPERQI:mode>2): New INSN.
> >       (*th_memidx_extendsidi2): Likewise.
> >       (*th_memidx_zero_extendsidi2): Likewise.
> >       (*th_memidx_zero_extendhi<GPR:mode>2): Likewise.
> >       (*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise.
> >       (*th_memidx_bb_zero_extendsidi2): Likewise.
> >       (*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise.
> >       (*th_memidx_bb_extendhi<GPR:mode>2): Likewise.
> >       (*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise.
> >       (TH_M_ANYI): New mode iterator.
> >       (TH_M_NOEXTI): Likewise.
> >       (*th_memidx_I_a): New combiner optimization.
> >       (*th_memidx_I_b): Likewise.
> >       (*th_memidx_I_c): Likewise.
> >       (*th_memidx_US_a): Likewise.
> >       (*th_memidx_US_b): Likewise.
> >       (*th_memidx_US_c): Likewise.
> >       (*th_memidx_UZ_a): Likewise.
> >       (*th_memidx_UZ_b): Likewise.
> >       (*th_memidx_UZ_c): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/riscv/xtheadmemidx-helpers.h: New test.
> >       * gcc.target/riscv/xtheadmemidx-index-update.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-index.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-modify.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-uindex-update.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test.
> >       * gcc.target/riscv/xtheadmemidx-uindex.c: New test.
> > ---
>
> >
>
> );
> > @@ -5581,6 +5594,9 @@ riscv_print_operand_address (FILE *file, machine_mode 
> > mode ATTRIBUTE_UNUSED, rtx
> >   {
> >     struct riscv_address_info addr;
> >
> > +  if (th_print_operand_address (file, mode, x))
> > +    return;
> > +
> >     if (riscv_classify_address (&addr, x, word_mode, true))
> >       switch (addr.type)
> >         {
> Check indentation in this ^^^ hunk, specifically the initial indentation
> of the IF and its THEN arm.

I think the indentation looks ok (and check_GNU_style did not complain).
The structure of the function is as follows:

static void
f (...)
{
  if (th_print_operand_address ())
    return;

  if (foo (bar))
    switch
      {
      case A:
        baz ();
        return;
      default:
        gcc_unreachable ();
      }
  gcc_unreachable ()
}

So the then-arm just consists of a switch statement.
Would you prefer braces around the switch statement?
Or, is there something that I fail to see?

> > +}
> > +
> > +/* Provide a buffer for a th.lXia/th.lXib/th.sXia/th.sXib instruction
> > +   for the given MODE. If LOAD is true, a load instruction will be
> > +   provided (otherwise, a store instruction). If X is not suitable
> > +   return NULL.  */
> > +
> > +static const char *
> > +th_memidx_output_modify (rtx x, machine_mode mode, bool load)
> > +{
> > +  static char buf[128] = {0};
> A bit icky, though we don't have a threaded compiler so it's not
> catastrophic.  Consider having the caller pass in a buffer rather than
> having it be static in the callee.  It looks like you might need to do
> that 2 callers up, passing it through the intermediate functions.

In my defense, I am not the first to use this pattern
(allocating a static char array, filling it with snprintf and
returning a pointer to the array).
Some examples from other backends:
* aarch64.cc: aarch64_output_simd_mov_immediate
* arm/vfp.md: push_fpsysreg_insn
* i386/i386.cc: output_387_ffreep
* rs6000/darwin.md: @reload_macho_picbase_<mode>

I think your main concern is about returning a static-allocated-array.
If we would allocate the buffer in th_output_move, then we would
still have to use "static", because also this function returns the string.
Therefore, I will do the following:
* drop the "static" (allocate on the stack)
* call output_asm_insn()
* return ""


> > +
> > +static const char *
> > +th_memidx_output_index (rtx x, machine_mode mode, bool load)
> > +{
> > +  struct riscv_address_info info;
> > +  static char buf[128] = {0};
> Similarly here.
>
>
>
> > @@ -387,4 +387,429 @@ (define_insn "*th_mempair_load_zero_extendsidi2"
> >      (set_attr "mode" "DI")
> >      (set_attr "length" "8")])
> >
> > +;; XTheadMemIdx
> > +
> > +;; Help reload to add a displacement for the base register.
> > +;; In the case `zext(*(uN*)(base+(zext(rN)<<1)))` LRA splits
> > +;; off two new instructions: a) `new_base = base + disp`, and
> > +;; b) `index = zext(rN)<<1`.  The index calculation has no
> > +;; corresponding instruction pattern and needs this insn_and_split
> > +;; to recover.
> > +
> > +(define_insn_and_split "*th_memidx_operand"
> > +  [(set (match_operand:DI 0 "register_operand" "=r")
> > +     (ashift:DI
> > +       (zero_extend:DI (subreg:SI (match_operand:DI 1 "register_operand" 
> > "r") 0))
> > +       (match_operand 2 "const_int_operand" "n")))]
> > +  "TARGET_64BIT && TARGET_XTHEADMEMIDX"
> > +  "#"
> > +  ""
> > +  [(set (match_dup 0) (zero_extend:DI (subreg:SI (match_dup 1) 0)))
> > +   (set (match_dup 0) (ashift:DI (match_dup 0) (match_dup 2)))]
> > +  ""
> > +  [(set_attr "type" "bitmanip")])
> > +
> Interesting.  I'd be curious if this triggers outside the reload context
> and if it's useful to allow that.   Obviously not critical for this
> patch, but more of a curiosity.

I'm quite certain that this issue is specific to reload/LRA.
At least I'm not aware of other passes that change existing insns
such that they need to be split.
As such I will gate this with lra_in_progress.

> > +;;
> > +;; Note, that SHIFTs will be converted to MULTs during combine.
> More correctly, it's a canonicalization.  It really doesn't have much to
> do with combine.  From the canonicalization section in the manual:
>
> > @item
> > Within address computations (i.e., inside @code{mem}), a left shift is
> > converted into the appropriate multiplication by a power of two.
>
>
>
> > +
> > +/* If the shifted value is used later, we cannot eliminate it.  */
> > +/* { dg-final { scan-assembler-times "slli" 5 { target { rv32 } } } } */
> > +/* { dg-final { scan-assembler-times "slli" 8 { target { rv64 } } } } */
> Note we've started wrapping instructions in \M \m so that we don't get
> accidential matches with LTO output.  It may not matter for these
> specific instances, but we might as well get into good habits.  There
> should be numerous examples in the tree now.

Will do.

> I think the only things that really need to be fixed to move forward are
> in the indention check and the wrapping of instructions in the
> scan-assembler directives.  THe static buffer thing isn't critical --
> whichever way you think the code is cleaner is fine with me.  I don't
> think any of these changes necessitate waiting for an approval.  Just
> post for archival purposes and commit.

Thanks!

>
> jeff

Reply via email to