On 01/04/2016 07:32 AM, Ajit Kumar Agarwal wrote:
-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Wednesday, December 23, 2015 12:06 PM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida;
Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa
representation
On 12/11/2015 02:11 AM, Ajit Kumar Agarwal wrote:
Mibench/EEMBC benchmarks (Target Microblaze):
Automotive_qsort1 (4.03%), Office_ispell (4.29%), Office_stringsearch1 (3.5%),
Telecom_adpcm_d (1.37%), ospfv2_lite (1.35%).
I'm having a real tough time reproducing any of these results. In fact, I'm having a
tough time seeing cases where path splitting even applies to the Mibench/EEMBC
benchmarks mentioned above.
In the very few cases where split-paths might apply, the net resulting assembly
code I get is the same with and without split-paths.
How consistent are these results?
I am consistently getting gains for office_ispell, office_stringsearch1 and
telecom_adpcm_d. I ran the benchmarks again today and still see gains in the same
tests with the split-paths changes.
What functions are being affected that in turn impact performance?
For office_ispell, the affected functions are linit (linit, funcdef_no=0,
decl_uid=2535, cgraph_uid=0, symbol_order=2) in lookup.c, and checkfile
(checkfile, funcdef_no=1, decl_uid=2478, cgraph_uid=1, symbol_order=4),
correct (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)
and askmode (askmode, funcdef_no=24, decl_uid=2464, cgraph_uid=24,
symbol_order=27) in correct.c.
For office_stringsearch1, the affected function is bmhi_search (bmhi_search,
funcdef_no=1, decl_uid=2178, cgraph_uid=1, symbol_order=5) in bmhisrch.c.
In linit there are two path splitting opportunities. Neither of them
is a case where path splitting exposes any CSE or DCE opportunities at
the tree level. In fact, in both cases there are no operands from the
predecessors that feed into the join block (the block we duplicate to
split the path).
There's a path splitting opportunity in correct.c::givehelp which AFAICT
is totally uninteresting from a performance standpoint. However, it is
one of the few cases where path splitting actually results in something
that is better optimized at the tree level. How ironic. We'd easily get
the same result by sinking a statement down through a PHI in a manner
similar to what's been suggested for 64700.
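To make that concrete, here's roughly the shape I mean (a made-up example, not
the actual lookup.c/correct.c source, and not necessarily a block the pass
would split as written):

/* Made-up diamond whose join block becomes simplifiable once it is
   duplicated into each arm.  */
int
example (int cond, int x)
{
  int t;
  if (cond)
    t = 0;        /* arm A */
  else
    t = x;        /* arm B */
  /* Join block.  If it were duplicated into both arms, the copy reached
     from arm A sees t == 0 and folds to "return 1"; sinking "t + 1" down
     through the PHI and folding per-arm gives the same result.  */
  return t + 1;
}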
In correct.c::checkfile, correct.c::correct and correct.c::askmode
the path splitting opportunities do not lead to any further
simplifications at the gimple level.
Similarly for bmhisrch.c::bmhi_search.
So when I look across all these examples, the only one where path
splitting exposes a CSE/DCE opportunity that matters for performance is
adpcm_code.
The rest, AFAICT, benefit at a much lower level -- a diamond in the CFG
will require an unconditional branch from the end of one arm around the
other arm to the join point. With path splitting that unconditional
branch is eliminated. So there's a small gain for that. That gain may
be even larger on the microblaze because of its exposed delay slot
architecture -- one less slot to fill. It may also result in better
code layouts which help simplistic branch predictors.
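Roughly the shape I have in mind (again a made-up example, not benchmark code):

void
loop_body (int *a, int n, int flag)
{
  int i;
  for (i = 0; i < n; i++)
    {
      if (flag)
        a[i] += 1;    /* arm A: normally ends with a jump over arm B
                         to reach the join block.  */
      else
        a[i] -= 1;    /* arm B: falls through to the join block.  */
      a[i] <<= 1;     /* Join block / latch.  Duplicating it into both
                         arms lets each arm branch straight back to the
                         loop header, so the unconditional jump around
                         arm B disappears -- and on MicroBlaze that is
                         one less delay slot to fill.  */
    }
}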
So I find myself wondering if the primary effect we're looking for most
of the time is really elimination of that unconditional branch. If so,
then we're looking at a very different costing heuristic -- one that
favors very small join blocks rather than larger ones (which supposedly
help expose CSE/DCE opportunities, but in reality don't for the
benchmarks I've looked at).
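Something along these lines, as a sketch of the idea only (not a patch against
the pass; the threshold is made up):

/* Hypothetical costing sketch: if the usual win is just the eliminated
   unconditional branch, duplication only pays off when the join block is
   tiny, so copying it into both arms stays cheap.  The "4" is an arbitrary
   made-up threshold, not an existing GCC param.  */
static int
worth_splitting_p (int join_block_stmts, int exposes_cse_or_dce)
{
  if (exposes_cse_or_dce)
    return 1;                   /* the adpcm_code-style case */
  return join_block_stmts <= 4;
}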
And if that's the case, then we may really be looking at something that
belongs at the RTL level rather than at the tree/gimple level. Sadly,
it's harder to do things like duplicate blocks at the RTL level.
Anyway, I'm going to ponder some more.
jeff