Hi Stefan,
thanks for the patch and sorry for the slow review.
On 9/18/24 19:25, Stefan Schulze Frielinghaus wrote:
Bootstrapped and regtested on s390. Both expander are constrained to
z196 because of the conditional moves. I guess this is reasonable
nowadays.
The reason for the limitation seems to be that you rely on the load on
condition pattern in order to get the carry into a GPR. What you really
want here is load on condition immediate, which has been added with z13.
If you compile your test for z196 we would end up using another alcr,
what would have been possible also before z196. But I agree, that the
limitation to z196 upwards is reasonable.
...
+(define_expand "uaddc<mode>5"
+ [(match_operand:GPR 0 "register_operand")
+ (match_operand:GPR 1 "nonimmediate_operand")
+ (match_operand:GPR 2 "nonimmediate_operand")
+ (match_operand:GPR 3 "nonimmediate_operand")
+ (match_operand:GPR 4 "general_operand")]
+ "TARGET_Z196 && (<MODE>mode != DImode || TARGET_64BIT)"
Why is the second part necessary here? Shouldn't this also work for 31
bit with zarch?
+{
+ rtx cond = gen_rtx_LTU (<MODE>mode, gen_rtx_REG (CCL1mode, CC_REGNUM),
const0_rtx);
+ if (operands[4] == const0_rtx)
+ emit_insn (gen_add<mode>3_carry1_cc (operands[0], operands[2],
operands[3]));
+ else
If we would just generate the alc pattern with a carry in of 0, wouldn't
the other optimizers be able to figure out that the a normal add would
do here? This path does not seem to get exercised by your testcases.
+ {
+ rtx tmp;
+ if (CONSTANT_P (operands[4]))
+ {
+ tmp = gen_reg_rtx (SImode);
+ emit_move_insn (tmp, operands[4]);
+ }
+ else
+ tmp = operands[4];
+ s390_emit_compare (LTU, tmp, const0_rtx);
Couldn't you just replace tmp with a force_reg (<MODE>mode, operands[4])
here to get rid of the IF statement above?
The CC handling doesn't look correct to me. The s390_emit_compare would
create a CCU mode comparison result, which is currently consumed as CCL1
by the ALC pattern. This seems to be inconsistent. s390_emit_compare
returns the condition, which I think needs to be fed into the
alc_carry1_cc pattern as input condition.
Also is LTU really the correct code here? CCU + LTU would expect CC1,
but you want CC2 or CC3 for the carry in, so GTU should be the better
choice. s390_alc_comparison should make sure that only valid
combinations are accepted, CCU + GTU would be one of them.
Similar for usubc I think.
diff --git a/gcc/testsuite/gcc.target/s390/uaddc-1.c
b/gcc/testsuite/gcc.target/s390/uaddc-1.c
new file mode 100644
index 00000000000..dfad5915690
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/uaddc-1.c
@@ -0,0 +1,80 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mzarch -save-temps -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 2 "optimized" { target
lp64 } } } */
+/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 1 "optimized" { target { !
lp64 } } } } */
+/* { dg-final { scan-assembler-times "\\talcr\\t" 1 } } */
+/* { dg-final { scan-assembler-times "\\talcgr\\t" 1 { target lp64 } } } */
Your checks seem to rely on the testcase being compiled with at least
-march=z13.
Bye,
Andreas