Hongtao Liu <crazy...@gmail.com> writes:
> On Tue, Oct 27, 2020 at 7:13 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > Hi:
>> >   For inline asm, there could be an operand like (not (mem:)), it's
>> > not a valid operand for normal memory constraint.
>> >   Bootstrap is ok, regression test is ok for make check
>> > RUNTESTFLAGS="--target_board='unix{-m32,}'"
>> >
>> > gcc/ChangeLog
>> >         PR target/97540
>> >         * ira.c: (ira_setup_alts): Extract memory from operand only
>> >         for special memory constraint.
>> >         * recog.c (asm_operand_ok): Ditto.
>> >         * lra-constraints.c (process_alt_operands): MEM_P is
>> >         required for normal memory constraint.
>> >
>> > gcc/testsuite/ChangeLog
>> >         * gcc.target/i386/pr97540.c: New test.
>>
>> Sorry to stick my oar in, but I think we should reconsider the
>> bcst_mem_operand approach.  It seems like these patches (and the
>> previous one) are fighting against the principle that operands
>> cannot be arbitrary expressions.
>>
>> This kind of thing was attempted long ago (even before my time!)
>> for SIGN_EXTEND on MIPS.  It ended up causing more problems than
>> it solved and in the end it had to be taken out.  I'm worried that
>> we might end up going through the same cycle again.
>>
>> Also, this LRA code is extremely performance-sensitive in terms
>> of compile time: it's often at the top or near the top of the profile.
>> So adding calls to new functions like extract_mem_from_operand for
>> a fairly niche case probably isn't a good trade-off.
>>
>> I think we should instead find a nice(?) syntax for generating separate
>> patterns for the two bcst_vector_operand alternatives from a single
>> .md pattern.  That would fit the existing model much more closely.
>>
>
> We have define_subst for RTL template transformations, but it's not
> suitable for this case(too many define_subst templates need to
> be added, and it doesn't show any advantage compared to adding
> separate bcst patterns.). I don't find other workable existing syntax for it.

Yeah, I think it would need to be new syntax.  I was wondering if it
would help if we had somethine like (strawman suggestion):

          (one_of 0
            [(match_operand:VI_AVX2 1 "vector_operand" "...")
             (vec_duplicate:VI_AVX2
               (match_operand:<...> 1 "..." "..."))]

where all instances of (one_of N ...) for a given N are required
to have the same number of alternatives.

This could be handled in a similar way to define_subst, with the
one_of being expanded before the main generator routines see it.

But maybe it wouldn't help that much.  E.g. for:

(define_insn "*<plusminus_insn><mode>3"
  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
        (plusminus:VI_AVX2
          (match_operand:VI_AVX2 1 "bcst_vector_operand" "<comm>0,v")
          (match_operand:VI_AVX2 2 "bcst_vector_operand" "xBm,vmBr")))]

the vec_duplicate version should only really have one alternative.
I guess we could handle that by using a:

  (one_of 0
    [(set_attr "enabled" "*,*")
     (set_attr "enabled" "0,*")])

or some variant of that that uses a derived attribute.  But it feels
a bit clunky…

Without that, I guess the only pattern that would benefit directly is:

(define_insn "avx512dq_mul<mode>3<mask_name>"
  [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v")
        (mult:VI8_AVX512VL
          (match_operand:VI8_AVX512VL 1 "bcst_vector_operand" "%v")
          (match_operand:VI8_AVX512VL 2 "bcst_vector_operand" "vmBr")))]

> So suppose I should revert my former 2 patches and add separate bcst patterns.

Are there going to more patterns that need bcst_vector_operand,
or is the current set complete?

I definitely think we should have a better way of handling this in the
.md files, and I'd be happy to hack something up on the generator side
(given that I'm being the awkward one here).  But I guess the answer to
the question above will decide whether it make things better or not.

FWIW, I think having separate patterns (whether they're produced from
one .md construct or from several) might better optimisation results.
But I guess there's a risk of combinatorial explosion, and the port has
a lot of patterns as it is.

Thanks,
Richard

Reply via email to