https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #4 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Hi Andrew,

Thanks for your insights on this. Let me reply to some of your points:

(In reply to Andrew Pinski from comment #1)
> >The most important case I have observed is that the vectorizer can fail or 
> >create inferior code with more shuffles/moves when the SSA names aren't 
> >monotonically increasing.
> 
> That should not be true.

Indeed. After further cleaning up the dumps, I found that some of the
differences I was considering were just artifacts of the diff algorithm not
doing a good job, and that confused me (sigh).

So, for this example, while we're in tree form I observe only naming changes,
but no difference in the code or in the order of statements.

(In reply to Andrew Pinski from comment #2)
> Note what I had found in the past it is not exactly SSA_NAMEs that cause the
> difference but rather the RTL register pesdu # causes differences in
> register allocation which was exposed from the different in operands
> canonicalization.

Yes, I have also observed this and it looks to be the main issue.

(In reply to Andrew Pinski from comment #3)
> The first example (of assembly here) in comment #0 is extra moves due to the
> RA not handling subreg that decent for the load/store lane. There are other
> bug reports dealing with that. Why the SSA_NAMES being monotonically help is
> just by an accident really. 
> 
> 

Do you happen to recall the relevant ticket(s)? I would like to have a look but
couldn't find them so far.

Also, while I agree that in some cases changes like this 'just happen' to
improve codegen, in multiple experiments the vectorized code was superior
with sorted names and it was never worse. In most cases that I recall, the
version that used unsorted names had additional shuffles of different sorts or
moves. So, while anecdotal, the effect doesn't look accidental to me in this
case. I feel like there may be some subtle difference due to the names that
helps in this case?

> 
> Also:
> > This mostly affects all the bitmaps that use SSA_NAME_VERSION as a key.
> 
> Most use sparse bitmaps there so it is not a big deal.
> 

Agreed and that's probably why I couldn't measure any non-trivial difference in
compilation times.

I should just note that there are also places that create vectors or other data
structures sized to the number of ssa_names, so in theory this could still help
in extreme cases.
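
To make the point about structures sized to the number of ssa_names concrete,
here is a toy Python model. It is not GCC code: `dense_table_size`, `compact`
and the version numbers are made up for illustration. It only shows that a
table indexed directly by name version has to be sized by the highest version
in use, not by the number of live names, so renumbering live names can shrink
it.

```python
def dense_table_size(live_versions):
    """Slots needed for a table indexed directly by name version (toy model)."""
    return max(live_versions) + 1 if live_versions else 0


def compact(live_versions):
    """Renumber live names to 0..n-1, keeping their relative order (toy model)."""
    return {old: new for new, old in enumerate(sorted(live_versions))}


# Hypothetical function: only 4 live names, but with widely spread versions.
live = {3, 17, 250, 9001}

size_before = dense_table_size(live)
size_after = dense_table_size(set(compact(live).values()))

print(size_before, size_after)
```

In this model the table shrinks from one slot per version up to the maximum
down to one slot per live name. Whether that matters in practice depends on how
sparse the versions actually get.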

> I think this should be split up in a few different bug reports really.
> One for each case where better optimizations happen.
> 
Ok, the only cases that I found to be clearly better are the ones related to
vectorization. Would it help to create a ticket just for that now, or should I
wait for the discussion in this one to conclude first?

> Also:
> >I have seen two similar source files generating the exact same GIMPLE code 
> >up to some optimization pass but then completely diverging due to different 
> >freelists.
> 
> The only case where I have seen this happen is expand will have different
> pesdu # really. Yes I noticed this effect while I did
> r14-569-g21e2ef2dc25de3 really.

Afaik, the codegen differences that I observed were due to the same reason, but
it nonetheless felt weird that the same GIMPLE could later produce two files
that differ w.r.t. name ordering just because the freelists were different
(but invisible in the dumps). So I naturally asked 'why don't we just
flush the freelists after every pass if it's not a performance issue?'
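
The effect described above can be sketched with a toy allocator in Python. This
is only a model under stated assumptions, not GCC's implementation: the class
`NameAllocator`, its LIFO reuse order and the version numbers are all
hypothetical. It shows two states that would look identical in a dump (same
next fresh version) handing out different names for the same requests, and the
difference disappearing once the freelist is flushed.

```python
class NameAllocator:
    """Toy name allocator: reuse freed versions first, else hand out a fresh one."""

    def __init__(self, next_version, freelist=()):
        self.next_version = next_version
        self.freelist = list(freelist)  # assumption: reused in LIFO order

    def make_name(self):
        if self.freelist:
            return self.freelist.pop()
        version = self.next_version
        self.next_version += 1
        return version

    def flush(self):
        """Forget all freed versions so that only fresh ones are handed out."""
        self.freelist.clear()


def run_pass(alloc, n_new_names):
    """Stand-in for a pass that creates n_new_names new names."""
    return [alloc.make_name() for _ in range(n_new_names)]


# Same visible state, but one allocator carries hidden freed versions.
with_freelist = run_pass(NameAllocator(10, freelist=[4, 7]), 3)
without_freelist = run_pass(NameAllocator(10), 3)

flushed = NameAllocator(10, freelist=[4, 7])
flushed.flush()
after_flush = run_pass(flushed, 3)

print(with_freelist, without_freelist, after_flush)
```

With the hidden freelist the new names come out non-monotonic, while the
flushed allocator matches the one that never had a freelist.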
