Thanks for the data. A few questions:

- Do you have the raw data used to generate your PDFs available? Since you gave me the binaries, if I have the data in terms of exactly which addresses are being plotted, I can correlate them with the specific cold functions via nm. Once I know which cold functions are being hit, I would then need the .i files and the .gcda files to reproduce the build.
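To make the nm correlation concrete, here is a minimal sketch of what I have in mind, assuming the raw data is a list of sampled addresses. The helper names (parse_nm, lookup) and the sample symbol table are hypothetical and only for illustration; real input would come from running `nm -n` on the gimp-2.8 binary.

```python
import bisect

# Hypothetical sample in the "<hex address> <type> <name>" layout printed by nm.
SAMPLE_NM = """\
0000000000401000 T main
0000000000401200 t helper_hot
0000000000480000 t helper_rarely_used
                 U printf
"""


def parse_nm(text):
    """Return a sorted list of (address, name) for text-section symbols."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        # Undefined symbols carry no address, so they split into fewer fields.
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        # 't' and 'T' mark local and global symbols in the text (code) section.
        if kind in ("t", "T"):
            symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def lookup(symbols, addr):
    """Return the name of the last symbol starting at or below addr, or None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    return symbols[i][1] if i >= 0 else None


symbols = parse_nm(SAMPLE_NM)
print(lookup(symbols, 0x480010))
```

This only attributes an address to the nearest preceding symbol; it does not check symbol sizes or which section the symbol lives in, so an address past the end of the last function would still be attributed to it.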
- I tried running the binaries, but I don't have the necessary shared library dependencies installed on my system:

  $ ldd gimp-2.8 | grep found
  libgimpwidgets-2.0.so.0 => not found
  libgimpconfig-2.0.so.0 => not found
  libgimpcolor-2.0.so.0 => not found
  libgimpmath-2.0.so.0 => not found
  libgimpthumb-2.0.so.0 => not found
  libgimpmodule-2.0.so.0 => not found
  libgimpbase-2.0.so.0 => not found
  libgegl-0.2.so.0 => not found
  libbabl-0.1.so.0 => not found

  I'll try to get these installed, but the last time I did that in an attempt to build gimp I had a lot of trouble getting the right versions and getting them to build. Any chance you could build an archive version of the gimp binary?

Thanks,
Teresa

On Sun, Dec 15, 2013 at 2:19 PM, Martin Liška <marxin.li...@gmail.com> wrote:
> On 15 December 2013 23:17, Martin Liška <marxin.li...@gmail.com> wrote:
>> Dear Jan and Teresa,
>>
>> Jan was right that I've been using changes which were committed by
>> Teresa and do live in trunk. So the graph with time profile presented
>> in my previous post was really with enabled
>> -freorder-blocks-and-partition. I removed the hack in varasm.c and I
>> do use classic section layout. Please open the following dump
>> (includes PDF graph+html report that shows functions with time profile
>> located in cold section and all -fdump-ipa-all dumps):
>>
>> https://drive.google.com/file/d/0B0pisUJ80pO1YW1QWUFkZjdqME0/edit?usp=sharing
>>
>> Apart from that, I also created a PDF graph
>> (https://drive.google.com/file/d/0B0pisUJ80pO1aHhPWW56dXpLVTQ/edit?usp=sharing)
>> that shows that time profile is almost perfect for GIMP. I miss just some
>> examples that do not have profile in generate phase.
>>
>> I will merge current trunk and prepare final patch.
>>
>> Are there any other data that you want to be prepared?
>>
>> Martin
>>
>>
>> On 13 December 2013 02:13, Jan Hubicka <hubi...@ucw.cz> wrote:
>>>> On Wed, Dec 11, 2013 at 1:21 AM, Martin Liška <marxin.li...@gmail.com>
>>>> wrote:
>>>> > Hello,
>>>> > I prepared a collection of systemtap graphs for GIMP.
>>>> >
>>>> > 1) just my profile-based function reordering: 550 pages
>>>> > 2) just -freorder-blocks-and-partition: 646 pages
>>>> > 3) just -fno-reorder-blocks-and-partition: 638 pages
>>>> >
>>>> > Please see attached data.
>>>>
>>>> Thanks for the data. A few observations/questions:
>>>>
>>>> With both 1) (your (time-based?) reordering) and 2)
>>>> (-freorder-blocks-and-partition) there is a fair amount of accesses
>>>> out of the cold section. I'm not seeing so many accesses out of the
>>>> cold section in the apps I am looking at with splitting enabled. In
>>>
>>> I see you already committed the patch, so perhaps Martin's measurements
>>> assume the pass is off by default?
>>>
>>> I rebuilt GCC with profiledbootstrap and with the linker script unmapping
>>> text.unlikely. I get an ICE in:
>>> (gdb) bt
>>> #0 diagnostic_set_caret_max_width(diagnostic_context*, int) () at ../../gcc/diagnostic.c:108
>>> #1 0x0000000000f68457 in diagnostic_initialize (context=0x18ae000 <global_diagnostic_context>, n_opts=n_opts@entry=1290) at ../../gcc/diagnostic.c:135
>>> #2 0x000000000100050e in general_init (argv0=<optimized out>) at ../../gcc/toplev.c:1110
>>> #3 toplev_main(int, char**) () at ../../gcc/toplev.c:1922
>>> #4 0x00007ffff774cbe5 in __libc_start_main () from /lib64/libc.so.6
>>> #5 0x0000000000f7898d in _start () at ../sysdeps/x86_64/start.S:122
>>>
>>> That is relatively early in the startup process. The function seems inlined
>>> and it fails only on the second invocation; I did not have time to
>>> investigate further yet, while without -fprofile-use it starts...
>>>
>>> On our periodic testers I see off-noise improvement in crafty 2200->2300
>>> and regression on Vortex, 2900->2800, plus code size increase.
>>>
>>> Honza

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413