On 12/31/23 09:23, Roger Sayle wrote:

Very many thanks (and a Happy New Year) to the pre-commit
patch testing folks at linaro.org.   Their testing has revealed that
although my patch is clean on x86_64, it triggers some problems
on aarch64 and arm.  The issue (with the previous version of my
patch) is that these platforms require a paradoxical subreg to be
generated by the middle-end, where we were previously checking
for truly_noop_truncation.

This has been fixed (in revision 2) below.  Where previously I had:

@@ -66,7 +66,9 @@ gen_lowpart_general (machine_mode mode, rtx x)
        scalar_int_mode xmode;
        if (is_a <scalar_int_mode> (GET_MODE (x), &xmode)
           && GET_MODE_SIZE (xmode) <= UNITS_PER_WORD
-         && TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+         && (known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode))
+             ? TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+             : known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode)))
           && !reload_completed)
         return gen_lowpart_general (mode, force_reg (xmode, x));

the correct change is:

        scalar_int_mode xmode;
        if (is_a <scalar_int_mode> (GET_MODE (x), &xmode)
           && GET_MODE_SIZE (xmode) <= UNITS_PER_WORD
-         && TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+         && (known_ge (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode))
+             || TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode))
           && !reload_completed)
         return gen_lowpart_general (mode, force_reg (xmode, x));

i.e. we only call TRULY_NOOP_TRUNCATION_MODES_P when we
know we have a truncation, but the behaviour of non-truncations
is preserved (no longer depends upon unspecified behaviour) and
gen_lowpart_general is called to create the paradoxical SUBREG.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

Hopefully this revision tests cleanly on the linaro.org CI pipeline.

2023-12-31  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
         * combine.cc (make_extraction): Confirm that OUTPREC is less than
         INPREC before calling TRULY_NOOP_TRUNCATION_MODES_P.
         * expmed.cc (store_bit_field_using_insv): Likewise.
         (extract_bit_field_using_extv): Likewise.
         (extract_bit_field_as_subreg): Likewise.
         * optabs-query.cc (get_best_extraction_insn): Likewise.
         * optabs.cc (expand_parity): Likewise.
         * rtlhooks.cc (gen_lowpart_general): Likewise.
         * simplify-rtx.cc (simplify_truncation): Disallow truncations
         to the same precision.
         (simplify_unary_operation_1) <case TRUNCATE>: Move optimization
         of truncations to the same mode earlier.
So just an FYI. This should probably wait for gcc-15. But it is worth noting that this causes tbz_2.c to fail on aarch64. Basically with this patch we use the "x" registers whereas before we used a "w" register. Meaning we're doing a 64bit comparison now rather than a 32bit comparison. THe test expects a 32bit comparison.

jeff

Reply via email to