On Wed, Apr 12, 2017 at 01:15:56PM -0500, Segher Boessenkool wrote: > Hi, > > On Wed, Apr 12, 2017 at 07:06:38PM +0200, Jakub Jelinek wrote: > > On Wed, Apr 12, 2017 at 09:29:55AM +0000, Sudi Das wrote: > > > This is a fix for PR 80131 > > > Currently the code A << (B - C) is not simplified. > > > However at least a more specific case of 1U << (C -x) where C = > > > precision(type) - 1 can be simplified to (1 << C) >> x. > > > > Is that always a win though? > > Some constants have higher costs than others on various targets, some > > significantly higher. This change might be beneficial only > > if if C is as expensive as 1, then you get rid of a one (typically cheap) > > operation. > > Which makes me wonder whether this should be done at GIMPLE time and not > > at RTL time (expansion or combine etc.) when one can compare the RTX costs. > > Yeah, either combine or simplify-rtx I'd say. > > The transform A << (B - C) ---> (A << B) >> C > only works if A << B does not overflow but A << (B + 1) does (and then
Is that really true? The A << B does not overflow is obvious precondition. But consider say A 5U, B 29 and C (not compile time known) -2. 5U << 31 is valid 0x80000000U, but (5U << 29) >> (-2) is UB. Isn't the other condition instead that either C must be non-negative, or B must be number of bits in A's type - 1, i.e. that for negative C A << (B - C) would already be always UB? But then unless we know C is non-negative, A must be really just 1, otherwise A << B overflows. > always does work afaics). Or if we know C is non-negative and A << B > does not overflow. So realistically A and B need to be constant. > > > Or do this at match.pd as canonicalization and then have RTL transformation > > to rewrite such (1U << C) >> X as 1U << (C - X) if the latter is faster (or > > shorter). > > The inverse transform only works for A=1, not for the more general case. Jakub