On 8/19/2021 5:18 PM, Roger Sayle wrote:
Whilst working on a backend patch, I noticed that the middle-end's
RTL optimizers weren't simplifying a truncation of a paradoxical
subreg extension, though they do transform closely related (more
complex) expressions. The main (first) part of this patch
implements this simplification, reusing much of the logic already
in place.
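As an aside, the reason a truncation of a paradoxical subreg can be simplified is visible at the bit level. The following is only a minimal sketch in Python, not the logic in simplify_truncation: it models a paradoxical subreg as the inner value with arbitrary upper bits, and checks that truncating back to the inner width or narrower never depends on those bits. All function names here are hypothetical.

```python
import random


def mask(bits):
    """All-ones mask for a mode that is `bits` wide."""
    return (1 << bits) - 1


def truncate(x, bits):
    """Model (truncate:M x): keep only the low `bits` bits."""
    return x & mask(bits)


def paradoxical_subreg(x, inner_bits, outer_bits, rng):
    """Model a paradoxical subreg: the low bits are x and the bits
    above inner_bits are unspecified (random here)."""
    if outer_bits <= inner_bits:
        raise ValueError("a paradoxical subreg must be wider than its operand")
    garbage = rng.getrandbits(outer_bits - inner_bits)
    return (garbage << inner_bits) | (x & mask(inner_bits))


def check_truncate_of_paradoxical(inner_bits, outer_bits, result_bits,
                                  trials=1000, seed=0):
    """Return True if truncating the widened value to result_bits always
    equals truncating x directly (assumes result_bits <= inner_bits)."""
    if result_bits > inner_bits:
        raise ValueError("this sketch only covers result_bits <= inner_bits")
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(inner_bits)
        wide = paradoxical_subreg(x, inner_bits, outer_bits, rng)
        if truncate(wide, result_bits) != truncate(x, result_bits):
            return False
    return True
```

For example, `check_truncate_of_paradoxical(32, 64, 32)` models truncating a DImode paradoxical subreg of an SImode value back to SImode, where the result is just the original value.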
I briefly considered suggesting that it's difficult to provide a new
testcase for this change, but then realized the reviewer's response
would be that this type of transformation should be self-tested
in simplify-rtx, so this patch adds a bunch of tests that integer
extensions and truncations are simplified as expected. No good
deed goes unpunished and I was equally surprised to see that we
don't currently simplify/check/defend (zero_extend:SI (reg:SI)),
i.e. useless no-op extensions to the same mode. So I've added
some logic to simplify (or more accurately prevent us generating
dubious RTL for) those.
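To illustrate the properties being described, here is a small Python sketch, assuming values are held as unsigned bit patterns; it is not the self-test code added to simplify-rtx, and the helper names are hypothetical. It checks exhaustively, for small widths, that an extension to the same mode is a no-op and that truncating an extension back to the original mode returns the original value.

```python
def mask(bits):
    """All-ones mask for a mode that is `bits` wide."""
    return (1 << bits) - 1


def truncate(x, bits):
    """Model (truncate:M x): keep only the low `bits` bits."""
    return x & mask(bits)


def zero_extend(x, from_bits, to_bits):
    """Model (zero_extend:TO (x:FROM)) on unsigned bit patterns."""
    if to_bits < from_bits:
        raise ValueError("cannot extend to a narrower mode")
    return x & mask(from_bits)


def sign_extend(x, from_bits, to_bits):
    """Model (sign_extend:TO (x:FROM)) on unsigned bit patterns."""
    if to_bits < from_bits:
        raise ValueError("cannot extend to a narrower mode")
    x &= mask(from_bits)
    if x >> (from_bits - 1):      # sign bit set: value is negative
        x -= 1 << from_bits
    return x & mask(to_bits)      # back to an unsigned pattern


def check_extensions(narrow_bits, wide_bits):
    """Exhaustively check the no-op and round-trip properties."""
    for x in range(1 << narrow_bits):
        # Extending to the same mode changes nothing.
        if zero_extend(x, narrow_bits, narrow_bits) != x:
            return False
        if sign_extend(x, narrow_bits, narrow_bits) != x:
            return False
        # Truncating an extension recovers the original value.
        if truncate(zero_extend(x, narrow_bits, wide_bits), narrow_bits) != x:
            return False
        if truncate(sign_extend(x, narrow_bits, wide_bits), narrow_bits) != x:
            return False
    return True
```

Calling `check_extensions(4, 8)` walks all sixteen 4-bit values; wider modes behave the same way but are too large to enumerate.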
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and "make -k check" with no new failures.
Indeed. I'd bet there are other weaknesses in here. I've got some
patches here which add overflow handling on the H8 port (attempting to
cut runtime of the builtin-arith-overflow-* tests). Those end up using
subregs and extensions fairly heavily. While looking at how the code
moves through the RTL pipeline it became pretty clear that we're
generally not doing a good job of optimizing those cases.
Thankfully I've found some sequences that allow the port to do limited
store-flag instructions and that eliminated the need to chase this stuff
down, at least for now.
Ok for mainline?
2021-08-20 Roger Sayle <ro...@nextmovesoftware.com>
gcc/ChangeLog
	* simplify-rtx.c (simplify_truncation): Generalize simplification
	of (truncate:A (subreg:B X)).
	(simplify_unary_operation_1) [FLOAT_TRUNCATE, FLOAT_EXTEND,
	SIGN_EXTEND, ZERO_EXTEND]: Handle cases where the operand
	already has the desired machine mode.
	(test_scalar_int_ops): Add tests that useless extensions and
	truncations are optimized away.
	(test_scalar_int_ext_ops): New self-test function to confirm
	that truncations of extensions are correctly simplified.
	(test_scalar_int_ext_ops2): New self-test function to check
	truncations of truncations, extensions of extensions, and
	truncations of extensions.
	(test_scalar_ops): Call the above two functions with a
	representative sampling of integer machine modes.
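The three-mode identities named in the ChangeLog (truncations of truncations, extensions of extensions, and truncations of extensions) can be illustrated with a short Python sketch. This is a bit-level model under the assumption that values are unsigned bit patterns, not a copy of the new self-test functions; `check_chains` and its helpers are hypothetical names.

```python
def mask(bits):
    """All-ones mask for a mode that is `bits` wide."""
    return (1 << bits) - 1


def truncate(x, bits):
    """Keep only the low `bits` bits."""
    return x & mask(bits)


def zero_extend(x, from_bits, to_bits):
    """Zero extension on unsigned bit patterns (to_bits >= from_bits)."""
    return x & mask(from_bits)


def sign_extend(x, from_bits, to_bits):
    """Sign extension on unsigned bit patterns (to_bits >= from_bits)."""
    x &= mask(from_bits)
    if x >> (from_bits - 1):      # sign bit set: value is negative
        x -= 1 << from_bits
    return x & mask(to_bits)


def check_chains(small, mid, large):
    """Exhaustively check nested operations across three widths."""
    if not small < mid < large:
        raise ValueError("expected small < mid < large")
    for x in range(1 << small):
        # Extension of an extension equals one direct extension.
        if (zero_extend(zero_extend(x, small, mid), mid, large)
                != zero_extend(x, small, large)):
            return False
        if (sign_extend(sign_extend(x, small, mid), mid, large)
                != sign_extend(x, small, large)):
            return False
        # Truncating a wide extension to the middle width equals
        # extending directly to the middle width.
        if (truncate(zero_extend(x, small, large), mid)
                != zero_extend(x, small, mid)):
            return False
        if (truncate(sign_extend(x, small, large), mid)
                != sign_extend(x, small, mid)):
            return False
    for y in range(1 << large):
        # Truncation of a truncation equals one direct truncation.
        if truncate(truncate(y, mid), small) != truncate(y, small):
            return False
    return True
```

Small widths such as `check_chains(4, 8, 12)` keep the exhaustive loops cheap while still exercising every sign-bit case.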
I briefly thought you were missing a subreg_lowpart check, but that's
checked in the outermost IF. The comments are somewhat misleading as
the subreg offset in a lowpart will vary based on endianness, but that's
not a big deal IMHO.
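To make the endianness point concrete, here is a minimal Python sketch of where the low part of a wider value sits in memory. It assumes byte and word endianness agree, which is a simplification (mixed-endian targets are not modeled), and `lowpart_byte_offset` is a hypothetical helper rather than a GCC function.

```python
def lowpart_byte_offset(outer_bytes, inner_bytes, big_endian):
    """Byte offset of the low outer_bytes within an inner_bytes-wide value.
    Assumes byte and word endianness agree."""
    if outer_bytes > inner_bytes:
        raise ValueError("the low part cannot be wider than the whole value")
    return inner_bytes - outer_bytes if big_endian else 0


def extract_lowpart(value, outer_bytes, inner_bytes, big_endian):
    """Lay value out in memory and read its low part at the computed offset."""
    order = "big" if big_endian else "little"
    raw = value.to_bytes(inner_bytes, order)
    off = lowpart_byte_offset(outer_bytes, inner_bytes, big_endian)
    return int.from_bytes(raw[off:off + outer_bytes], order)
```

In this model the low 2 bytes of a 4-byte value are at offset 0 on a little-endian layout and at offset 2 on a big-endian one, yet `extract_lowpart` returns the same low half either way.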
OK
jeff