Hi Wilco, On Wed, Sep 02, 2015 at 06:09:24PM +0100, Wilco Dijkstra wrote: > Combine canonicalizes certain AND masks in a comparison with zero into > extracts of the widest > register type. During matching these are expanded into a very inefficient > sequence that fails to > match. For example (x & 2) == 0 is matched in combine like this: > > Failed to match this instruction: > (set (reg:CC 66 cc) > (compare:CC (zero_extract:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0) > (const_int 1 [0x1]) > (const_int 1 [0x1])) > (const_int 0 [0])))
Yes. Some processors even have specific instructions to do this. > Failed to match this instruction: > (set (reg:CC 66 cc) > (compare:CC (and:DI (lshiftrt:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0) > (const_int 1 [0x1])) > (const_int 1 [0x1])) > (const_int 0 [0]))) This is after r223067. Combine tests only one "final" instruction; that revision rewrites zero_ext* if it doesn't match and tries again. This helps for processors that can do more generic masks (like rs6000, and I believe also aarch64?): without it, you need many more patterns to match all the zero_ext{ract,end} cases. > Neither matches the AArch64 patterns for ANDS/TST (which is just compare and > AND). If the immediate > is not a power of 2 or a power of 2 -1 then it matches correctly as expected. > > I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion It is zero_extract(x,1,1) really. This is convenient for (old and embedded) processors that have special bit-test instructions. If we now want combine to not do this, we'll have to update all backends that rely on it. > (it even uses shifts by 0 at > times which are unlikely to ever match anything). It matches fine on some targets. It certainly looks silly, and could be expressed simpler. Patch should be simple; watch this space / remind me / or file a PR. > Why does combine not try to match the obvious (x & > C) != 0 case? Single-bit and mask tests are very common, so this blocks > efficient code generation on > many targets. They are common, and many processors had instructions for them, which is (supposedly) why combine special-cased them. > It's trivial to change change_zero_ext to expand extracts always into AND and > remove the redundant > subreg. Not really trivial... Think about comparisons... int32_t x; ((x >> 31) & 1) > 0; // is not the same as (x & 0x80000000) > 0; // signed comparison (You do not easily know what the COMPARE is used for). > However wouldn't it make more sense to never special case certain AND > immediate in the first > place? Yes, but we need to make sure no targets regress (i.e. by rewriting patterns for those that do). We need to know the impact of such a change, at the least. Segher