>>> On 24.10.14 at 11:11, <ja...@redhat.com> wrote:
> On Fri, Oct 24, 2014 at 10:01:52AM +0100, Jan Beulich wrote:
>> > This changed because of my http://gcc.gnu.org/PR60663 fix.
>> > In your testcase the inline asm doesn't have more than one output
>> > (which IMNSHO is very much desirable not to CSE), and doesn't have explicit
>> > clobbers either, but happens to have implicit clobbers (fprs and cc),
>> > so CSE still could generate invalid code out of that without the fix
>> > (if it decided to materialize the inline asm somewhere, instead of reusing
>> > existing inline asm).
>> > So, if we e.g. weakened the PR60663 fix so that it only bails out
>> > if the inline asm contains more than one output. we'd need to fix up CSE,
>> > so that it analyzes all the clobbers and doesn't consider asms as equivalent
>> > just based on the ASM_OPERANDS, it needs to have the same clobbers too,
>> > and either doesn't try to materialize it out without preexisting insn
>> > if it has any clobbers.
>>
>> So why would clobbers in general matter? I can see memory clobbers
>> to need special care, but any others? If two asm()-s only differ in the
>
> Please start by looking at the PR the change fixed.

This is what I have been doing. But your previous reply got me into
some trouble parsing what you wrote (likely because I'm only
occasionally looking into compiler details)...

> There CSE decided (ok, with the help of not very smart costs, but as the
> testcase shows, it clearly can happen) to rematerialize the asm in a place
> where the asm wasn't originally at all. At that point it just inserted the
> single ASM_OPERANDS, without anything else, leaving the other ASM_OPERANDS
> (the testcase had asm with two outputs) and in theory anything else (like
> clobbers) out. Leaving the clobbers out completely is definitely not
> desirable.
>
> IMHO we should never CSE together asm with different clobbers, GCC
> intentionally does not try to think what exactly the asm pattern does,
> it is a black box, and if the programmer decides to use one set of clobbers
> in one case and a different in another case, he might have a reason for
> that.

Aren't these two completely different things? One being to never fold
asm()-s with different operands (with it being open whether clobbers
would count here), and the other to make sure individual pieces of a
parallel would not end up in the table on their own? I.e. by relaxing
your original fix and adding code to compare clobbers too we'd deal
with the first case, but I can't see what would prevent such a parallel
from being broken up when there's just one output, but an arbitrary
number of clobbers. The only way I can see for this not to be the case
would be if the parallel as a whole got entered into the table, but if
that was the case, why wouldn't the example I provided be properly
CSE'd without any change?

Apart from that I think it's high time for x86 to have a way to allow
the programmer to suppress adding the two default clobbers. I drafted
a corresponding (seemingly pretty non-intrusive) change, which seems to
work fine.
Would that be acceptable as a second means to at least partially gain
back what we had before (in this case of course requiring the
programmer to adjust the asm()-s or, if they're all safe, pass a new
command line option)?

Jan
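
For readers following the thread, here is a minimal sketch of the kind of asm() under discussion: a non-volatile inline asm with a single output and no explicit clobbers, used twice with the same input. It is not the testcase from the thread or from PR60663; the names rotl8 and twice are hypothetical and exist only for illustration. On x86 the statement still picks up the implicit clobbers mentioned in the quoted text, and whether GCC folds the two instances into one depends on the compiler version and on how the PR60663 fix ends up being shaped. Other targets fall back to plain C so the sketch stays buildable.

```c
#include <stdint.h>

/* Hypothetical helper: rotate left by 8 bits.
 * On x86 this is a non-volatile asm with exactly one output and no
 * explicit clobbers, i.e. the shape that used to be a CSE candidate. */
static inline uint32_t rotl8(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t r;
    /* "0"(x) ties the input to the output register. */
    __asm__("roll $8, %0" : "=r"(r) : "0"(x));
    return r;
#else
    /* Portable fallback for non-x86 targets. */
    return (x << 8) | (x >> 24);
#endif
}

/* Two asm()-s with identical operands: the compiler may, but need not,
 * compute the rotate once and reuse the result. */
uint32_t twice(uint32_t x)
{
    uint32_t a = rotl8(x);
    uint32_t b = rotl8(x);
    return a + b;
}
```

Comparing the output of "gcc -O2 -S" across compiler versions shows whether one or two rotate instructions are emitted for twice().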