Ok thanks Jeff. In that case I think I should draw this to the attention of the AArch64 maintainers to check the testsuite updates are OK before I commit...?

Methinks it may be possible to get further, or at least increase our confidence, if I "mock" out try_widen_shift_mode, and/or try injecting some dubious RTL from a builtin, although this'll only give a momentary snapshot of behaviour. I may or may not have time to look into this though ;)...

Cheers, Alan

Jeff Law wrote:
On 09/18/14 03:35, Alan Lawrence wrote:
Moreover, I think we both agree that if result_mode==shift_mode, the
transformation is correct. "Just" putting that check in, achieves
what I'm trying for here, so I'd be happy to go with the attached
patch and call it a day. However, I'm a little concerned about the
other cases - i.e. where shift_mode is wider than result_mode.
Let's go ahead and get the attached patch installed. I'm pretty sure it's correct and I know you want to see something move forward here. We can iterate further if we want.

If I understand correctly (and I'm not sure about that, let's see how
far I get), this means we'll perform the shift in (say) DImode, when
we're only really concerned about the lower (say) 32-bits (for an
originally-SImode shift).
That's the class of cases I'm concerned about.


  try_widen_shift_mode will in this case
check that the result of the operation *inside* the shift (in our
case an XOR) has 33 sign bit copies (via num_sign_bit_copies), i.e.
that its *top* 32-bits are all equal to the original SImode sign bit.
<count> of these bits may be fed into the top of the desired SImode
result by the DImode shift. Right so far?
Correct.

AFAICT, num_sign_bit_copies for an XOR, conservatively returns the
minimum of the num_sign_bit_copies of its two operands. I'm not sure
whether this is behaviour we should rely on in its callers, or for
the sake of abstraction we should treat num_sign_bit_copies as a
black box (which does what it says on the, erm, tin).
Treat it as a black box. It returns the number of known sign bit copies. There may be more, but never less.


If the former, doesn't having num_sign_bit_copies >= the difference
in size between result_mode and shift_mode, of both operands to the
XOR, guarantee safety of the commutation (whether the constant is
positive or negative)? We could perform the shift (in the larger
mode) on both of the XOR operands safely, then XOR together their
lower parts.
I had convinced myself that when we flip the sign bit via the XOR and commute the XOR out that we invalidate the assumptions made when widening. But I'm not so sure anymore. Damn I hate changes made without suitable tests :(

I almost convinced myself the problem is in the adjustment of C2 in the widened case, but that's not a problem either. At least not on paper.

If, however, we want to play safe and ensure that we deal safely with
 any XOR whose top (mode size difference + 1) bits were the same,
then I think the restriction that the XOR constant is positive is
neither necessary nor sufficient; rather (mirroring
try_widen_shift_mode) I think we need that num_sign_bit_copies of the
constant in shift_mode, is more than the size difference between
result_mode and shift_mode.
But isn't that the same? Isn't the only case where it isn't the same when the constant has bits set that are outside the mode of the other operand?

Hmm, what about (xor:QI A -1)? In that case -1 will be represented with bits outside the precision of QImode.

Hmmm. I might try that patch at some point, I think it is the right
check to make. (Meta-comment: this would be *so*much* easier if we
could write unit tests more easily!) In the meantime I'd be happy to
settle for the attached...
No argument on the unit testing comment. It's a major failing in the design of GCC that we can't easily build a unit testing framework.

Jeff



Reply via email to