On 01/07/11 13:33, Paolo Bonzini wrote:
> Got it now! Casts from signed to unsigned are not value-preserving, but
> they are "bit-preserving": s32->s64 obviously is, and s32->u64 has the
> same result bit-by-bit as the s64 result. The fact that s64 has an
> implicit 1111... in front, while an u64 has an implicit 0000... does not
> matter.

But the 1111... and 0000... are not implicit. They are very real, and if
applied incorrectly will change the result, I think.

> Is this the meaning of the predicate you want? I think so, based on the
> discussion, but it's hard to say without seeing the cases enumerated
> (i.e. a patch).

The purpose of this predicate is to determine whether any type
conversions that occur between the output of a widening multiply and the
input of an addition have any bearing on the end result.

We know what the effective output type of the multiply is (the size is
2x the input type, and it is signed if either one of the inputs is
signed), and we know what the input type of the addition is, but any
amount of junk can lie in between. The problem is determining if it *is*
junk.

In an ideal world there would only be two cases to consider:

  1. No conversion needed.

  2. A single sign-extend or zero-extend (according to the type of the
     inputs) to match the input size of the addition.

Anything else would be unsuitable for optimization. Of course, it's
never that simple, but it should still be possible to boil down a list
of conversions to one of these cases, if it's valid.

The signedness of the input to the addition is not significant - the
code would be the same either way. But it is important not to try to
zero-extend something that started out signed, and not to sign-extend
something that started out unsigned.

> However, perhaps there is a catch. We can do the following thought
> experiment. What would happen if you had multiple widening multiplies?
> Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit
> unsigned? I believe in this case you couldn't optimize 8-bit signed to
> 128-bit unsigned. Would your code do it?

My code does not attempt to combine multiple multiplies. In any case, if
you have two multiplications, surely you have at least three input
values, so they can't be combined?

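To make the point about extension concrete, here is a minimal C sketch of
the two cases, assuming a 16-bit x 16-bit -> 32-bit widening multiply feeding
a 64-bit addition. The function names and the widths are hypothetical and
chosen only for illustration; none of this is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed widening multiply, 16 x 16 -> 32 bits. */
int32_t widen_mul_s16(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Hypothetical helper: unsigned widening multiply, 16 x 16 -> 32 bits. */
uint32_t widen_mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Valid: a signed product is sign-extended to the 64-bit addition width.
   The accumulator is unsigned here, but a signed one gives the same bits. */
uint64_t madd_sign_extend(int16_t a, int16_t b, uint64_t acc)
{
    return acc + (uint64_t)(int64_t)widen_mul_s16(a, b);
}

/* Invalid as a multiply-accumulate candidate: the signed product is
   zero-extended, so the upper 32 bits are 0000... rather than 1111... */
uint64_t madd_zero_extend_signed(int16_t a, int16_t b, uint64_t acc)
{
    return acc + (uint64_t)(uint32_t)widen_mul_s16(a, b);
}

/* Valid: an unsigned product is zero-extended to the addition width. */
uint64_t madd_zero_extend_unsigned(uint16_t a, uint16_t b, uint64_t acc)
{
    return acc + (uint64_t)widen_mul_u16(a, b);
}

/* Worked example: -2 * 3 = -6, accumulated onto 10. */
void extension_examples(void)
{
    /* Sign-extension gives the arithmetically expected 10 + (-6) = 4. */
    assert(madd_sign_extend(-2, 3, 10) == 4);
    /* Zero-extension gives 10 + 4294967290 instead. */
    assert(madd_zero_extend_signed(-2, 3, 10) == 4294967300ULL);
    /* Signedness of the addition itself does not change the bits. */
    assert((uint64_t)((int64_t)10 + (int64_t)widen_mul_s16(-2, 3)) == 4);
}
```

Calling extension_examples() exercises all three assertions. The second one
shows why a zero-extend sitting between a signed widening multiply and the
addition cannot be treated as junk.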
It does attempt to combine a multiply and an addition, where a suitable
madd* insn is available. (This is not new; I'm just trying to do it in
more cases.)

I have considered the case where you have "(a * b) + (c * d)", but have
not yet coded anything for it. At present, the code will simply choose
whichever multiply happens to find itself the first input operand of the
plus, and ignore the other, even if the first turns out not to be a
suitable candidate.

Andrew