On Thu, 28 May 2015, Marek Polacek wrote:
This PR points out that we weren't able to optimize 1 << x == 2 to just x == 1.
Side note: if we are looking for extra patterns to simplify, llvm has an almost unlimited supply. Here are a few we don't seem to have (there are more where those came from), of course several need constraining / generalizing, it is just a list of hints I wrote for myself.
(A|B) & ~(A&B) -> A^B (A | B) & ((~A) ^ B) -> (A & B) (A & (~B)) | (A ^ B) -> (A ^ B) ((B | C) & A) | B -> B | (A & C) A | ( A ^ B) -> A | B A | (~A ^ B) -> A | ~B (A ^ B) & ((B ^ C) ^ A) -> (A ^ B) & ~C (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C (A & B) | (A ^ B) -> (A | B) A | ~(A ^ B) -> A | ~B (A & B) | ((~A) ^ B) -> (~A ^ B) ~(~X & Y) -> (X | ~Y) ~(~X >>s Y) -> (X >>s Y) (A & B)^(A | B) -> A ^ B (A | ~B) ^ (~A | B) -> A ^ B (A & ~B) ^ (~A & B) -> A ^ B (A ^ C)^(A | B) -> ((~A) & B) ^ C (A & B) ^ (A ^ B) -> (A | B) (A & ~B) ^ (~A) -> ~(A & B) (A&B)+(A^B) -> A|B (A&B)+(A|B) -> A+B (A|B)-(A^B) -> A&B ((X | Y) - X) -> (~X & Y) fmax(x,NaN) -> x fmax(a,fmax(a,b)) -> fmax(a,b) (X+2) >u X -> x <u 256-2 (1 << X) < 30 -> X <= 4 ((X & ~7) == 0) -> X < 8 2 * X < 5 -> X <= 2 ((1 << x)&8) == 0 -> x != 3 ((1 << x)&7) == 0 -> x > 2 Y - Z < X - Z -> Y < X 3 * X == 3 * Y -> X == Y A >> 3 == B >> 3 -> (A ^ B) < 8 (float)int <= 4.4 -> int <= 4 x unle x -> x ord x
+/* (CST1 << A) == CST2 -> A == log2 (CST2 / CST1) + (CST1 << A) != CST2 -> A != log2 (CST2 / CST1) + if CST2 is a multiple of CST1. */ +(for cmp (ne eq) + (simplify + (cmp (lshift@3 INTEGER_CST@0 @1) INTEGER_CST@2) + (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3)) + && wi::multiple_of_p (@2, @0, TYPE_SIGN (type)))
Doesn't "type" refer to the result of the EQ_EXPR here? On Thu, 28 May 2015, Jakub Jelinek wrote:
Is CST2 a multiple of CST1 the best test though? I mean say in (0x8001U << x) == 0x20000U 0x20000U isn't a multiple of 0x8001U, yet there is only one valid value of x for which it holds (17), so we could very well optimize that to x == 17. If popcount of the CST1 is 1, then multiple_of_p is supposedly sufficient (have you checked if CST1 is negative that it still works?), for others supposedly we could have a helper function that would just try in a loop all shift counts from 0 to precision - 1, and note when (CST1 << b) == CST2 - if for no b, then it should fold regardless of has_single_use to false or true, if for exactly one shift count, then use a comparison against that shift count, otherwise give up?
ctz(CST2)-ctz(CST1) should provide a single candidate without looping. ctz(CST1) is also relevant when CST2==0.
-- Marc Glisse