On 04/19/2013 03:53 PM, Steven Bosscher wrote:
On Thu, Apr 18, 2013 at 6:22 AM, Jeff Law wrote:
On 04/17/2013 03:52 PM, Steven Bosscher wrote:

First of all: What is still important to handle?

It's clear that the expectations in reorg.c are "anything goes" but
modern RISCs (everything since the PA-8000, say) probably have some
limitations on what is helpful to have, or not have, in a delay slot.
According to the comments in pa.h about MASK_JUMP_IN_DELAY, having
jumps in delay slots of other jumps is one such thing: They don't
bring benefit to the PA-8000 and they don't work with DWARF2 CFI. As
far as I know, SPARC and MIPS don't allow jumps in delay slots, SH
looks like it doesn't allow it either, and CRIS can do it for short
branches but doesn't do so because the trade-off between benefit and
machine description complexity comes out negative.

Note that sparc and/or mips might use the adjust-the-return-pointer trick.
I know it wasn't my idea when I added it to the PA.
After further research, it was the m88k I took the idea from -- 20 years ago this summer.


This shouldn't be very difficult to support if the target models this
as a jump in the delay slot of calls only. I can let the delay slot
filler allow jumps in delay slots of calls but not in delay slots of
other jumps. But for the moment I'm going to ignore this case unless
someone knows a target in the FSF tree that would benefit from it.
I'd say drop it given we now know the only other architecture that was supporting it is also dead.

So I collected some stats myself, for a small number (31) of files from gcc
itself, mostly from libcpp and various generator files, compiled at
-O2 for sparc64:

        pass 1                  pass 2          
total   simple  eager   skip    simple  eager   skip
insns   9743    3488    22      1297    525     0
filled  5918    2980    22      21      0       0
hit%    61%     31%     0%      0%      0%      0%
                                                
total   pass 1  pass 2                          
insns   9743    1297
filled  8920    21
hit%    92%     2%
Seems like reasonable numbers. I can't say I recall fill slot statistics from the past, but those are in line with what I'd expect.


So the first fill_simple_delay_slots pass fills ~60% of the slots, and
the first fill_eager_delay_slots fills another ~30%. The second pass
is not very effective.
Certainly doesn't look terribly effective. One could certainly ask whether it's worth running at all, or whether we would be better off having relax_delay_slots record things that are worth a second look. Also note that fill_eager and optimize_skip do nothing useful in your test during the 2nd pass.

The 60% number also tells me there'd be a lot to be gained by using the scheduler's dependency information to drive filling. We'd end up looking at far fewer insns.
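
For reference, the percentages quoted above can be re-derived from the raw counts in the table. This is only a small arithmetic sketch: the variable names are made up for illustration, and it assumes that 9743 and 1297 are the slot counts seen by pass 1 and pass 2. Under that reading, the per-filler hit% row appears to be relative to the 9743 pass-1 slots, which would explain why 21 fills show as 0% in the first table but 2% in the second.

```python
# Raw counts copied from the table in this message (sparc64, -O2, 31 files).
# Assumption: 9743 and 1297 are the slot counts seen by pass 1 and pass 2.
pass1_slots = 9743
pass2_slots = 1297

pass1_filled = {"simple": 5918, "eager": 2980, "skip": 22}
pass2_filled = {"simple": 21, "eager": 0, "skip": 0}


def pct(part, whole):
    """Return part/whole as a percentage rounded to a whole number."""
    return round(100 * part / whole)


# Per-filler hit rate for pass 1, relative to all pass-1 slots.
pass1_rates = {name: pct(n, pass1_slots) for name, n in pass1_filled.items()}

# Totals per pass, as in the second (summary) table.
pass1_total = sum(pass1_filled.values())
pass2_total = sum(pass2_filled.values())

print(pass1_rates)
print(pass1_total, pct(pass1_total, pass1_slots))
print(pass2_total, pct(pass2_total, pass2_slots))
```

The same 21 pass-2 fills come out as `pct(21, pass1_slots)` or `pct(21, pass2_slots)` depending on which denominator is used, so the two tables are consistent with each other under this assumption.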

Jeff
