On 02/26/2016 06:40 AM, Kyrill Tkachov wrote:
Hi all,
I'm looking at a case where some RTL passes create an RTL expression of
the form:
(subreg:QI (and:SI (reg:SI x1)
(const_int 31)) 0)
which I'd like to simplify to:
(and:QI (subreg:QI (reg:SI x1) 0)
(const_int 31))
I can think of cases where the first is better and other cases where the
second is better -- a lot depends on context. I don't have a good sense
for which is better in general.
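As a sanity check on the equivalence itself, here is a minimal C sketch. It assumes byte 0 is the low part, and it models a lowpart QImode subreg as truncation to uint8_t. The function names are invented for illustration and are not GCC internals.

```c
#include <stdint.h>

/* Model of (subreg:QI (and:SI x 31) 0): mask in SImode, then keep the low byte. */
uint8_t and_then_subreg(uint32_t x)
{
    return (uint8_t)(x & 31u);
}

/* Model of (and:QI (subreg:QI x 0) 31): keep the low byte, then mask in QImode. */
uint8_t subreg_then_and(uint32_t x)
{
    return (uint8_t)((uint8_t)x & 31u);
}

/* Count the sampled inputs on which the two forms disagree.  The low 16 bits
   are enumerated exhaustively under a few different upper-bit patterns. */
unsigned count_mismatches(void)
{
    static const uint32_t upper[] = {
        0x00000000u, 0xffff0000u, 0xa5a50000u, 0x80000000u
    };
    unsigned bad = 0;

    for (unsigned u = 0; u < sizeof upper / sizeof upper[0]; u++)
        for (uint32_t lo = 0; lo <= 0xffffu; lo++) {
            uint32_t x = upper[u] | lo;
            if (and_then_subreg(x) != subreg_then_and(x))
                bad++;
        }
    return bad;
}
```

Because the mask 31 fits entirely in the low byte, the truncation and the AND commute, so count_mismatches() should return 0. This says nothing about which form is cheaper on a given target.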
Note that as-written these don't trigger the subtle issues in what
happens with upper bits. That's more for extensions.
(subreg:SI (whatever:QI))
vs
({zero,sign}_extend:SI (whatever:QI))
vs
(and:SI (subreg:SI (whatever:QI)) (const_int 0xff))
The first leaves the bits beyond QI as "undefined" and sometimes (but I
doubt all that often in practice) the compiler will use the undefined
nature of those bits to enable optimizations.
The second & 3rd variants crisply define the upper bits.
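A rough C model of the three variants may make the difference concrete. The helper names and the `garbage` parameter are hypothetical. `garbage` stands in for whatever happens to sit in the upper bits of the wider register. It is a sketch of the semantics, not of anything GCC does internally.

```c
#include <stdint.h>

/* Model of a paradoxical (subreg:SI (whatever:QI)): the low 8 bits are q,
   the upper 24 bits are unspecified (here supplied by the caller). */
uint32_t paradoxical_subreg(uint8_t q, uint32_t garbage)
{
    return (garbage & ~UINT32_C(0xff)) | q;
}

/* Model of (zero_extend:SI (whatever:QI)): upper bits are always zero. */
uint32_t zero_extend_qi(uint8_t q)
{
    return (uint32_t)q;
}

/* Model of (sign_extend:SI (whatever:QI)): upper bits copy bit 7.
   The uint8_t -> int8_t conversion assumes two's complement. */
uint32_t sign_extend_qi(uint8_t q)
{
    return (uint32_t)(int32_t)(int8_t)q;
}

/* Model of (and:SI (subreg:SI (whatever:QI)) (const_int 0xff)):
   the mask clears whatever the paradoxical subreg left in the upper bits. */
uint32_t subreg_and_mask(uint8_t q, uint32_t garbage)
{
    return paradoxical_subreg(q, garbage) & UINT32_C(0xff);
}
```

In this model subreg_and_mask(q, g) equals zero_extend_qi(q) for every g, while paradoxical_subreg(q, g) varies with g. That is the sense in which only the first form leaves the upper bits open.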
It's easy enough to express in RTL but I'm trying to convince myself on
its validity.
I know there are some subtle points in this area. combine_simplify_rtx
in combine.c has a comment:
/* Note that we cannot do any narrowing for non-constants since
we might have been counting on using the fact that some bits were
zero. We now do this in the SET. */
That comment makes no sense. Unfortunately it goes back to a change
from Kenner in 1994 -- which predates having patch discussions here and
consistently adding tests to the testsuite.
The code used to do this:
- if (GET_MODE_CLASS (mode) == MODE_INT
- && GET_MODE_CLASS (GET_MODE (SUBREG_REG (x))) == MODE_INT
- && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))
- && subreg_lowpart_p (x))
- return force_to_mode (SUBREG_REG (x), mode, GET_MODE_MASK (mode),
- NULL_RTX, 0);
Which appears to check that we've got a narrowing subreg expression, and
if we do try to force the SUBREG_REG into the right mode using
force_to_mode.
But if we had a narrowing SUBREG_REG, then I can't see how anything
would have been depending on the upper bits being zero.
and if I try to implement this transformation in simplify_subreg from
simplify-rtx.c
I get some cases where combine goes into an infinite recursion in
simplify_comparison
because it tries to do:
/* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1
fits in both M1 and M2 and the SUBREG is either paradoxical
or represents the low part, permute the SUBREG and the AND
and try again. */
Right. I think you just end up ping-ponging between the two equivalent
representations. Which may indeed argue that the existing
representation is preferred and we should look deeper into why the
existing representation isn't being handled as well as it should be.
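The ping-pong can be illustrated with a toy model. It is only a sketch: the two rule functions are hypothetical stand-ins for the proposed simplify_subreg change and for the permutation described in the simplify_comparison comment, and the real code paths are far more involved.

```c
/* The two equivalent shapes under discussion. */
enum form { SUBREG_OF_AND, AND_OF_SUBREG };

/* Stand-in for the proposed rule: (subreg (and ...)) -> (and (subreg ...)). */
enum form push_subreg_inward(enum form f)
{
    return f == SUBREG_OF_AND ? AND_OF_SUBREG : f;
}

/* Stand-in for the existing permutation: (and (subreg ...)) -> (subreg (and ...)). */
enum form pull_subreg_outward(enum form f)
{
    return f == AND_OF_SUBREG ? SUBREG_OF_AND : f;
}

/* Apply at most one rule per step until nothing changes.  Returns the number
   of rewrites performed, or -1 if max_steps is reached without a fixed point. */
int rewrite_steps(enum form f, int with_new_rule, int max_steps)
{
    int steps = 0;

    while (steps < max_steps) {
        enum form next = f;
        if (with_new_rule)
            next = push_subreg_inward(f);
        if (next == f)
            next = pull_subreg_outward(f);
        if (next == f)
            return steps;          /* no rule fired: fixed point */
        f = next;
        steps++;
    }
    return -1;                     /* the rules keep undoing each other */
}
```

With only the existing rule the loop settles after at most one rewrite. With both rules enabled each one undoes the other, so the step cap is always hit. In combine there is no such cap, so the same pattern shows up as unbounded recursion.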
Performing this transformation would help a lot with recognition of some
patterns that
I'm working on, so would it be acceptable to teach combine or
simplify-rtx to do this?
How does it help recognition? What kinds of patterns are you trying to
recognize?
jeff