On 8/8/23 21:54, Lehua Ding wrote:
Hi Jeff,
> The pattern's operand 0 explicitly allows MEMs as do the constraints.
> So forcing the operand into a register just seems like it's papering
> over the real problem.
The added of force_reg code is address the problem preduced after
address the error combine.
The more restrict condtion of the pattern forbidden mem->mem pattern
which will
produced in -O0. I think the implementation forgot to do this force_reg
operation before
when doing the intrinis expansion The reason this problem isn't exposed
before is because
the reload pass will converts mem->mem to mem->reg; reg->mem based on
the constraint.
So if the core issue if mem->mem, that is a common thing to avoid.
Basically in the expander you use a force_reg and then have a test like
!(MEM_P (op0) && MEM_P (op1)) in the define_insn's condition.
But the v1 had a much more complex condition. It looks like that got
cleaned up in the v2. So I'll need to look at that one more closely.
> This comment doesn't make sense in conjuction with your earlier details.
> In particular combine doesn't run at -O0, so your earlier comment that
> combine creates the problem seems inconsistent with the comment above.
As the above says, the code addresses the problem which produced
after addressing the combine problem.
But combine doesn't run at -O0. So something is inconsistent. I
certainly believe we need to avoid the mem->mem case, but that's
independent of combine and affects all optimization levels.
> Umm, wow. I haven't thought deeply about this, but the complexity of
> that insn condition is a huge red flag that our operand predicates
> aren't correct for this pattern.
This condition is large because the vsetvl info need (compare to scalar
mov or *mov<mode>_whole pattern),
but I think this condition is enough clear to understand. Let me explain
briefly.
(register_operand (operands[0], <MODE>mode) && MEM_P (operands[3]))
|| (MEM_P (operands[0]) && register_operand(operands[3], <MODE>mode))
This two conditons mean allow mem->reg and reg->mem pattern.
I think we can simplify to just
!(MEM_P (operands[0]) && MEM_P (operands[1])
(register_operand (operands[0], <MODE>mode) &&
satisfies_constraint_Wc1 (operands[1]))
This condition mean the mask must be all trues for reg->reg_or_imm
pattern since> reg->reg insn doen't support mask operand.
I would have expected those to be handled by the constraints rather than
the pattern's condition.
Jeff