On 15 December 2013 23:17, Martin Liška <marxin.li...@gmail.com> wrote:
> Dear Jan and Teresa,
>     Jan was right that I've been using changes which were committed by
> Teresa and do live in trunk. So the graph with time profile presented
> in my previous post was really made with
> -freorder-blocks-and-partition enabled. I removed the hack in varasm.c
> and now use the classic section layout. Please open the following dump
> (it includes a PDF graph and an HTML report that shows functions with
> a time profile located in the cold section, plus all -fdump-ipa-all
> dumps):
>
> https://drive.google.com/file/d/0B0pisUJ80pO1YW1QWUFkZjdqME0/edit?usp=sharing
>
> Apart from that, I also created a PDF graph
> (https://drive.google.com/file/d/0B0pisUJ80pO1aHhPWW56dXpLVTQ/edit?usp=sharing)
> that shows that the time profile is almost perfect for GIMP. I am
> missing only some examples that do not have a profile in the generate
> phase.
>
> I will merge the current trunk and prepare the final patch.
>
> Is there any other data that you would like me to prepare?
>
> Martin
>
>
> On 13 December 2013 02:13, Jan Hubicka <hubi...@ucw.cz> wrote:
>>> On Wed, Dec 11, 2013 at 1:21 AM, Martin Liška <marxin.li...@gmail.com> wrote:
>>> > Hello,
>>> >    I prepared a collection of systemtap graphs for GIMP.
>>> >
>>> > 1) just my profile-based function reordering: 550 pages
>>> > 2) just -freorder-blocks-and-partition: 646 pages
>>> > 3) just -fno-reorder-blocks-and-partition: 638 pages
>>> >
>>> > Please see attached data.
>>>
>>> Thanks for the data. A few observations/questions:
>>>
>>> With both 1) (your (time-based?) reordering) and 2)
>>> (-freorder-blocks-and-partition) there is a fair number of accesses
>>> out of the cold section. I'm not seeing so many accesses out of the
>>> cold section in the apps I am looking at with splitting enabled. In
>>
>> I see you already committed the patch, so perhaps Martin's measurements
>> assume the pass is off by default?
>>
>> I rebuilt GCC with profiledbootstrap and with the linker script unmapping
>> text.unlikely.  I get an ICE in:
>> (gdb) bt
>> #0  diagnostic_set_caret_max_width(diagnostic_context*, int) () at ../../gcc/diagnostic.c:108
>> #1  0x0000000000f68457 in diagnostic_initialize (context=0x18ae000 <global_diagnostic_context>, n_opts=n_opts@entry=1290) at ../../gcc/diagnostic.c:135
>> #2  0x000000000100050e in general_init (argv0=<optimized out>) at ../../gcc/toplev.c:1110
>> #3  toplev_main(int, char**) () at ../../gcc/toplev.c:1922
>> #4  0x00007ffff774cbe5 in __libc_start_main () from /lib64/libc.so.6
>> #5  0x0000000000f7898d in _start () at ../sysdeps/x86_64/start.S:122
>>
>> That is relatively early in the startup process. The function seems to be
>> inlined and it fails only on the second invocation. I did not have time to
>> investigate further yet, while without -fprofile-use it starts...
>>
>> On our periodic testers I see an off-noise improvement in crafty
>> (2200->2300) and a regression in Vortex (2900->2800), plus a code size
>> increase.
>>
>> Honza
