>>> On 24.10.14 at 11:11, <ja...@redhat.com> wrote:
> On Fri, Oct 24, 2014 at 10:01:52AM +0100, Jan Beulich wrote:
>> > This changed because of my http://gcc.gnu.org/PR60663 fix.
>> > In your testcase the inline asm doesn't have more than one output
>> > (which IMNSHO is very much desirable not to CSE), and doesn't have explicit
>> > clobbers either, but happens to have implicit clobbers (fprs and cc),
>> > so CSE still could generate invalid code out of that without the fix
>> > (if it decided to materialize the inline asm somewhere, instead of reusing
>> > existing inline asm).
>> > So, if we e.g. weakened the PR60663 fix so that it only bails out
>> > if the inline asm contains more than one output, we'd need to fix up CSE, so
>> > that it analyzes all the clobbers and doesn't consider asms as equivalent
>> > just based on the ASM_OPERANDS, it needs to have the same clobbers too,
>> > and either doesn't try to materialize it out without preexisting insn
>> > if it has any clobbers.
>> 
>> So why would clobbers in general matter? I can see memory clobbers
>> to need special care, but any others? If two asm()-s only differ in the
> 
> Please start by looking at the PR the change fixed.

This is what I have been doing. But your previous reply got me into
some trouble parsing what you wrote (likely because I'm only
occasionally looking into compiler details)...

> There CSE decided (ok, with the help of not very smart costs, but as the
> testcase shows, it clearly can happen) to rematerialize the asm in a place
> where the asm wasn't originally at all.  At that point it just inserted the
> single ASM_OPERANDS, without anything else, leaving the other ASM_OPERANDS
> (the testcase had asm with two outputs) and in theory anything else (like
> clobbers) out.  Leaving the clobbers out completely is definitely not
> desirable.
> 
> IMHO we should never CSE together asm with different clobbers, GCC
> intentionally does not try to think what exactly the asm pattern does,
> it is a black box, and if the programmer decides to use one set of clobbers
> in one case and a different in another case, he might have a reason for
> that.

Aren't these two completely different things? One being to never fold
asm()-s with different operands (with it being open whether clobbers
would count here), and the other to make sure individual pieces of a
parallel would end up in the table? I.e. by relaxing your original fix and
adding code to compare clobbers too we'd deal with the first case, but
I can't see what would prevent such a parallel from being broken up when
there's just one output, but an arbitrary number of clobbers. The only
way I can see for that not to happen would be for the parallel as a whole
to be entered into the table, but if that were the case, why wouldn't the
example I provided be properly CSE'd without any change?
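
To make the first case concrete, here is a minimal sketch. It is an
illustration only, not the testcase from the PR nor the example I sent
earlier; the function names are made up, and the empty asm template is
used just to keep it target independent. Whether a given GCC version
actually folds the two asm()-s in twice_same() is exactly the open
question, so nothing is claimed here about the generated code.

```c
/* Hypothetical illustration: names and the empty template are assumptions.
   __asm__ is used instead of asm so this also builds with -std=c99. */

/* One output, no explicit clobbers, not volatile: two instances with the
   same input are candidates for being treated as a common subexpression. */
static inline unsigned int asm_copy(unsigned int v)
{
    unsigned int out;

    __asm__("" : "=r" (out) : "0" (v));
    return out;
}

/* Same template and operands, but with an explicit "memory" clobber. */
static inline unsigned int asm_copy_clobber(unsigned int v)
{
    unsigned int out;

    __asm__("" : "=r" (out) : "0" (v) : "memory");
    return out;
}

/* Identical asm()-s with identical input: the folding we used to get. */
unsigned int twice_same(unsigned int v)
{
    return asm_copy(v) + asm_copy(v);
}

/* asm()-s differing only in their clobbers: should these ever be folded? */
unsigned int twice_mixed(unsigned int v)
{
    return asm_copy(v) + asm_copy_clobber(v);
}
```

Comparing the output of -O2 -S for twice_same() and twice_mixed() across
compiler versions should show whether either pair gets merged.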

Apart from that I think it's high time for x86 to have a way to allow
the programmer to suppress adding the two default clobbers. I
drafted a corresponding (seemingly pretty non-intrusive) change, which
seems to work fine. Would that be acceptable as a second means to
at least partially gain back what we had before (in this case of course
requiring the programmer to adjust the asm()-s or, if they're all safe,
pass a new command line option)?

Jan
