On Thu, 28 May 2015, Marek Polacek wrote:

This PR points out that we weren't able to optimize 1 << x == 2 to just
x == 1.

Side note: if we are looking for extra patterns to simplify, llvm has an almost unlimited supply. Here are a few we don't seem to have (there are more where those came from), of course several need constraining / generalizing, it is just a list of hints I wrote for myself.

(A|B) & ~(A&B) -> A^B
(A | B) & ((~A) ^ B) -> (A & B)
(A & (~B)) | (A ^ B) -> (A ^ B)
((B | C) & A) | B -> B | (A & C)
A | ( A ^ B) -> A |  B
A | (~A ^ B) -> A | ~B
(A ^ B) & ((B ^ C) ^ A) -> (A ^ B) & ~C
(A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C
(A & B) | (A ^ B) -> (A | B)
A | ~(A ^ B) -> A | ~B
(A & B) | ((~A) ^ B) -> (~A ^ B)
~(~X & Y) -> (X | ~Y)
~(~X >>s Y) -> (X >>s Y)
(A & B)^(A | B) -> A ^ B
(A | ~B) ^ (~A | B) -> A ^ B
(A & ~B) ^ (~A & B) -> A ^ B
(A ^ C)^(A | B) -> ((~A) & B) ^ C
(A & B) ^ (A ^ B) -> (A | B)
(A & ~B) ^ (~A) -> ~(A & B)
(A&B)+(A^B) -> A|B
(A&B)+(A|B) -> A+B
(A|B)-(A^B) -> A&B
((X | Y) - X) -> (~X & Y)
fmax(x,NaN) -> x
fmax(a,fmax(a,b)) -> fmax(a,b)
(X+2) >u X -> x <u 256-2
(1 << X) <  30 -> X <= 4
((X & ~7) == 0) -> X < 8
2 * X < 5 -> X <= 2
((1 << x)&8) == 0 -> x != 3
((1 << x)&7) == 0 -> x > 2
Y - Z < X - Z -> Y < X
3 * X == 3 * Y -> X == Y
A >> 3 == B >> 3 -> (A ^ B) < 8
(float)int <= 4.4 -> int <= 4
x unle x -> x ord x


+/* (CST1 << A) == CST2 -> A == log2 (CST2 / CST1)
+   (CST1 << A) != CST2 -> A != log2 (CST2 / CST1)
+   if CST2 is a multiple of CST1.  */
+(for cmp (ne eq)
+ (simplify
+  (cmp (lshift@3 INTEGER_CST@0 @1) INTEGER_CST@2)
+  (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3))
+       && wi::multiple_of_p (@2, @0, TYPE_SIGN (type)))

Doesn't "type" refer to the result of the EQ_EXPR here?


On Thu, 28 May 2015, Jakub Jelinek wrote:

Is CST2 a multiple of CST1 the best test though?
I mean say in
(0x8001U << x) == 0x20000U
0x20000U isn't a multiple of 0x8001U, yet there is only one
valid value of x for which it holds (17), so we could very well
optimize that to x == 17.
If popcount of the CST1 is 1, then multiple_of_p is supposedly sufficient
(have you checked if CST1 is negative that it still works?), for others
supposedly we could have a helper function that would just try
in a loop all shift counts from 0 to precision - 1, and note when
(CST1 << b) == CST2 - if for no b, then it should fold regardless of
has_single_use to false or true, if for exactly one shift count, then
use a comparison against that shift count, otherwise give up?

ctz(CST2)-ctz(CST1) should provide a single candidate without looping. ctz(CST1) is also relevant when CST2==0.

--
Marc Glisse

Reply via email to