RE: try_merge_delay_insn with delay list > 1
Great, I'll read the formatting rules more closely next time I submit something.

Regards,
Selim

-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com]
Sent: Monday, 20 April 2015 19:47
To: BELBACHIR Selim; gcc@gcc.gnu.org
Subject: Re: try_merge_delay_insn with delay list > 1

On 04/20/2015 05:08 AM, BELBACHIR Selim wrote:
> I've attached the fixed version of the patch. I've tested it on the trunk
> with my private target.
>
> I can't provide a test because apparently no backend (other than my private
> one) uses delay slots with more than 1 slot.
> I was also unable to test the behaviour of this patch for a hypothetical
> target providing delay slots with more than 1 slot AND the possibility to
> annul instructions in delay slots.
>
> It seems to me that this patch is a small enhancement anyway.
> I hope it's ok for trunk :)

Even for small enhancements or bugfixes, we try to at least do some basic testing. Unfortunately with no sparc or mips machines in the compile farm, good testing of a reorg.c change is hard.

I built mips-elf cross tools and used those to compile newlib for mips-elf. Then I applied your patch, rebuilt the compiler and used that to compile newlib again. Then I compared all the objects from the two copies of newlib and verified that the code we generated was identical.

So there's at least some degree of confidence we didn't mess anything up in the single delay slot case.

I fixed a couple more minor formatting problems and installed your change on the trunk.

jeff
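The before/after check Jeff describes (build newlib with the unpatched and the patched compiler, then compare the objects) could be automated roughly as follows. This is only a sketch under assumptions: the function name, the `*.o` pattern, and the byte-for-byte comparison are illustrative and not taken from the thread, which does not say how the objects were actually compared.

```python
import filecmp
from pathlib import Path


def differing_objects(baseline_dir, patched_dir, pattern="*.o"):
    """Return relative paths of objects that differ between two build trees.

    Hypothetical helper: an object counts as differing when it is missing
    on either side or when its bytes are not identical.
    """
    baseline_dir, patched_dir = Path(baseline_dir), Path(patched_dir)
    # Collect every object path seen in either tree, relative to its root.
    names = {p.relative_to(baseline_dir) for p in baseline_dir.rglob(pattern)}
    names |= {p.relative_to(patched_dir) for p in patched_dir.rglob(pattern)}

    changed = []
    for rel in sorted(names):
        old, new = baseline_dir / rel, patched_dir / rel
        # shallow=False forces a content comparison instead of a stat check.
        if not (old.is_file() and new.is_file()
                and filecmp.cmp(old, new, shallow=False)):
            changed.append(rel.as_posix())
    return changed
```

An empty result would correspond to the "generated code is identical" outcome reported above. Objects that embed timestamps or build paths would need those normalized first.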
[wwwdocs] PATCH for Re: GCC Plugin Announcement; CTraps - Lightweight dynamic analysis for concurrent code
Hi Brandon, On Wed, 23 Jan 2013, Brandon Lucia wrote: > I have implemented a GCC plugin that I have found useful for doing > dynamic program analysis, debugging, and performance tuning in > concurrent code. > > The plugin is called CTraps, short for Communication Traps. The main > idea behind CTraps is that a compiler pass implemented as a GCC plugin > instruments instructions that access memory locations that might be > shared between threads. The instrumentation inserts a function call > before such accesses. I added this to our extensions page at https://gcc.gnu.org/extensions.html per the patch below. If you have further updates or changes, just advise. Gerald PS: The README file on github felt a bit confusing/not as clear as your e-mail here. Index: extensions.html === RCS file: /cvs/gcc/wwwdocs/htdocs/extensions.html,v retrieving revision 1.54 diff -u -r1.54 extensions.html --- extensions.html 20 Apr 2015 22:52:58 - 1.54 +++ extensions.html 21 Apr 2015 10:10:38 - @@ -12,6 +12,14 @@ tree. Please direct feedback and bug reports to their respective maintainers, not our mailing lists. +https://github.com/blucia0a/CTraps-gcc";>CTraps plugin for GCC + +CTraps, short for Communication Traps, adds a compiler pass as +a plugin that instruments instructions that access memory locations +that might be shared between threads. It supports dynamic program +analysis, debugging, and performance tuning in concurrent code. + + http://gcc-melt.org";>GCC MELT MELT is a high-level domain specific language to ease the
Re: [wwwdocs] PATCH for Re: GCC Plugin Announcement; CTraps - Lightweight dynamic analysis for concurrent code
On Tue, Apr 21, 2015 at 12:12:59PM +0200, Gerald Pfeifer wrote: > On Wed, 23 Jan 2013, Brandon Lucia wrote: > > I have implemented a GCC plugin that I have found useful for doing > > dynamic program analysis, debugging, and performance tuning in > > concurrent code. > > > > The plugin is called CTraps, short for Communication Traps. The main > > idea behind CTraps is that a compiler pass implemented as a GCC plugin > > instruments instructions that access memory locations that might be > > shared between threads. The instrumentation inserts a function call > > before such accesses. > > I added this to our extensions page at https://gcc.gnu.org/extensions.html > per the patch below. Shouldn't we also list the GCC Python Plugin on that page? Jakub
Re: AutoFDO profile toolchain is open-sourced
ping?

On 15.04.2015 10:41, Ilya Palachev wrote:

Hi,

One more question.

Does anybody know which options perf should be run with in order to collect appropriate data for the autofdo converter? I obtain the same data for different programs, and it seems to be empty (1600 bytes). The files have the same md5sum for different programs:

# Data for a simple program with 30 lines of code:
$ md5sum ytest.gcov
d85481c9154aa606ce4893b64fe109e7 ytest.gcov

# Data for a program that constructs a 3D Delaunay triangulation of 100 points.
$ md5sum experimentCGAL_convexHullDynamic.gcov
d85481c9154aa606ce4893b64fe109e7 experimentCGAL_convexHullDynamic.gcov

We tried to collect perf data using the option --call-graph fp, but it does not help: the output gcov data is still the same.

Sometimes create_gcov reports the following error:

E0421 13:10:37.125629 8732 perf_parser.cc:209] Mapped 50% of samples, expected at least 95%

But this does not mean that too few samples were collected in the profile, because 99% of samples are mapped in the case of a very simple program (with 1 function).

I have been trying to find a working case for more than a week but have not succeeded. Can anybody show me that create_gcov works for at least one case?

--
Best regards,
Ilya Palachev
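The symptom reported here (profiles from unrelated programs with the same md5sum) can be checked for a whole batch of profiles at once. The following is a minimal sketch, not part of the autofdo tooling; the function name is made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(paths):
    """Map each MD5 digest to the files sharing it (groups of 2+ only)."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.md5(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    # A digest shared by profiles of different programs suggests the
    # converter wrote the same (probably empty) profile for each of them.
    return {digest: files for digest, files in groups.items() if len(files) > 1}
```

Calling it on `ytest.gcov` and `experimentCGAL_convexHullDynamic.gcov` from the message above would be expected to return a single group containing both files, since their md5sums match.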
Re: AutoFDO profile toolchain is open-sourced
On Tue, Apr 21, 2015 at 6:33 AM, Ilya Palachev wrote:
> ping?
>
> On 15.04.2015 10:41, Ilya Palachev wrote:
>>
>> Hi,
>>
>> One more question.
>>
> Does anybody know with which options should the perf be executed so that to
> collect appropriate data for the autofdo converter?

From the autofdo page: https://github.com/google/autofdo

[ ... ]
Inputs:

--profile: PERF_PROFILE collected using linux perf (with last branch record). In order to collect this profile, you will need to have an Intel CPU that has last branch record (LBR) support. You also need to have your linux kernel configured with LBR support. To profile:
# perf record -c PERIOD -e EVENT -b -o perf.data -- ./command
EVENT is referring to BR_INST_RETIRED:TAKEN if available. For some architectures, BR_INST_EXEC:TAKEN also works.
[ ... ]

The important one for autofdo is -b. It asks perf to use LBR registers for branch tracking (assuming your architecture supports it).

The binary you run under perf should also have line table information (compiled with -gmlt) to produce location support for autofdo.

Diego.
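To make the structure of the quoted `perf record` line explicit, here is a small sketch that assembles the same argument list. The helper name and its defaults are hypothetical; only the flags (`-c`, `-e`, `-b`, `-o`, `--`) come from the README excerpt above.

```python
def perf_record_argv(command, event, period=None, output="perf.data"):
    """Build the argv for an LBR-enabled 'perf record' run.

    Hypothetical helper: 'command' is the profiled program plus its
    arguments, 'event' is whatever branch event the local perf accepts.
    """
    argv = ["perf", "record"]
    if period is not None:
        argv += ["-c", str(period)]  # sample period (PERIOD in the README)
    # -b asks perf to record branch stacks (LBR), which autofdo relies on.
    argv += ["-e", event, "-b", "-o", output, "--", *command]
    return argv
```

The resulting list can be passed to `subprocess.run`. Whether a given event name is accepted depends on the local perf build and CPU.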
Re: AutoFDO profile toolchain is open-sourced
On 21.04.2015 14:57, Diego Novillo wrote:
> From the autofdo page: https://github.com/google/autofdo
>
> [ ... ]
> Inputs:
>
> --profile: PERF_PROFILE collected using linux perf (with last branch
> record). In order to collect this profile, you will need to have an Intel
> CPU that has last branch record (LBR) support. You also need to have your
> linux kernel configured with LBR support. To profile:
> # perf record -c PERIOD -e EVENT -b -o perf.data -- ./command
> EVENT is referring to BR_INST_RETIRED:TAKEN if available. For some
> architectures, BR_INST_EXEC:TAKEN also works.
> [ ... ]
>
> The important one for autofdo is -b. It asks perf to use LBR registers
> for branch tracking (assuming your architecture supports it).

Thanks! It worked. Now big programs produce big gcov files. Sorry for the confusing message.

But why does create_gcov not report this (that no branch events were found)? It creates an empty gcov file and says nothing :(

Moreover, the mentioned README says that perf should also be executed with the option -e BR_INST_RETIRED:TAKEN. I tried to add it, but perf said:

   invalid or unsupported event: 'BR_INST_RETIRED:TAKEN'
   Run 'perf list' for a list of valid events

For my architecture (x86_64), the perf list output contains:

   $ sudo perf list | grep -i br
     branch-instructions OR branches                    [Hardware event]
     branch-misses                                      [Hardware event]
     branch-loads                                       [Hardware cache event]
     branch-load-misses                                 [Hardware cache event]
     branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
     branch-misses OR cpu/branch-misses/                [Kernel PMU event]
     mem:[:access]                                      [Hardware breakpoint]
     syscalls:sys_enter_brk                             [Tracepoint event]
     syscalls:sys_exit_brk                              [Tracepoint event]

There is no BR_INST_RETIRED:TAKEN there. Do you use some specific configuration of perf for that?

However, I tried to use the option "-e branch-instructions". Before that, the following error was reported:

   E0421 15:57:39.308374 11551 perf_parser.cc:210] Mapped 50% of samples, expected at least 95%

and now it has disappeared (because of the option "-e branch-instructions").

However, performance decreases after adding the option "-fauto-profile=file.gcov" or "-fprofile-use=file.gcov" to the compiler options. The program becomes 10% slower than before. Can you explain that? Maybe I should configure perf so that it is able to collect BR_INST_RETIRED:TAKEN events? How can that be done?

--
Best regards,
Ilya Palachev
Re: AutoFDO profile toolchain is open-sourced
Ilya Palachev writes:
>
> But why create_gcov does not inform about that (no branch events were
> found)? It creates empty gcov file and says nothing :(
>
> Moreover, in the mentioned README it is said that perf should also be
> executed with option -e BR_INST_RETIRED:TAKEN.

Standard perf doesn't have a full event list.
This assumes a perf patched with the libpfm patch.

Also I suspect it really wants to use PEBS events, so pp should be added.

Alternatively you can use ocperf (from http://github.com/andikleen/pmu-tools) which is just a wrapper:

ocperf.py record -e br_inst_retired.near_taken:pp -b ...

or specify the event manually (depending on your CPU), like

perf record -e cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp -b ...

BTW the biggest problem with autofdo currently is that it is quite bitrotten and supports only a perf that is several years old. So all of the above will only work with old distributions, unless you compile an old perf utility first.

-Andi

--
a...@linux.intel.com -- Speaking for myself only
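The manual event specification in Andi's message follows a regular pattern, which the sketch below spells out. The helper is hypothetical and only formats a string; the event and umask values are the ones quoted above and, as the message says, depend on the CPU.

```python
def raw_pmu_event(event, umask, name, period, precise=2):
    """Format a raw CPU PMU event in the 'cpu/.../pp' style quoted above.

    Hypothetical helper. 'precise' is the number of trailing 'p'
    modifiers; two of them give the 'pp' form suggested for PEBS sampling.
    """
    modifiers = "p" * precise
    return (f"cpu/event={event:#x},umask={umask:#x},"
            f"name={name},period={period}/{modifiers}")


if __name__ == "__main__":
    # Reproduces the event string from the message.
    print(raw_pmu_event(0xC4, 0x20, "br_inst_retired_near_taken", 49))
```

The returned string is what would follow `-e` on the `perf record` command line.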
Re: AutoFDO profile toolchain is open-sourced
On Tue, Apr 21, 2015 at 6:42 AM, Ilya Palachev wrote: > On 21.04.2015 14:57, Diego Novillo wrote: >> >> >From the autofdo page: https://github.com/google/autofdo >> >> [ ... ] >> Inputs: >> >> --profile: PERF_PROFILE collected using linux perf (with last branch >> record). >> In order to collect this profile, you will need to have an Intel CPU that >> have last branch record (LBR) support. You also need to have your linux >> kernel configured with LBR support. To profile: >> # perf record -c PERIOD -e EVENT -b -o perf.data -- ./command >> EVENT is refering to BR_INST_RETIRED:TAKEN if available. For some >> architectures, BR_INST_EXEC:TAKEN also works. >> [ ... ] >> >> The important one for autofdo is -b. It asks perf to use LBR registers >> for branch tracking (assuming your architecture supports it). > > > Thanks! It worked. Now big programs produce big gcov files. Sorry for this > confusing message. > > But why create_gcov does not inform about that (no branch events were > found)? It creates empty gcov file and says nothing :( > > Moreover, in the mentioned README it is said that perf should also be > executed with option -e BR_INST_RETIRED:TAKEN. > I tried to add it but perf said that > >invalid or unsupported event: 'BR_INST_RETIRED:TAKEN' >Run 'perf list' for a list of valid events > > For my architecture x86_64 the perf list contains > >$ sudo perf list | grep -i br > branch-instructions OR branches[Hardware event] > branch-misses [Hardware event] > branch-loads [Hardware >cache event] > branch-load-misses [Hardware >cache event] > branch-instructions OR cpu/branch-instructions/[Kernel PMU event] > branch-misses OR cpu/branch-misses/[Kernel PMU event] > mem:[:access] [Hardware breakpoint] > syscalls:sys_enter_brk [Tracepoint event] > syscalls:sys_exit_brk [Tracepoint event] > > There is no BR_INST_RETIRED:TAKEN there. Do you use some specific > configuration of perf for that? > > However, I tried to use option "-e branch-instructions". 
Before that the > following error was obtained: > >E0421 15:57:39.308374 11551 perf_parser.cc:210] Mapped 50% of >samples, expected at least 95% > > and now it disappeared (because of option "-e branch-instructions"). > > Though, the performance decreases after adding option > "-fauto-profile=file.gcov" or "-fprofile-use=file.gcov" to the list of > compiler options. > The program becomes 10% slower than before. > Can you explain that? Maybe I should configure perf so that it will be able > to collect events BR_INST_RETIRED:TAKEN ? How can it be done? You can use dump_gcov to show a text version of the profile dump and check if the profile data makes sense. If your program is just a very tight single loop, the current implementation in trunk may not yield good results because it does not have discriminator support. Try the google-4_9 branch instead. Dehao > > > -- > Best regards, > Ilya Palachev
Re: AutoFDO profile toolchain is open-sourced
On Tue, Apr 21, 2015 at 7:25 AM, Andi Kleen wrote: > Ilya Palachev writes: >> >> But why create_gcov does not inform about that (no branch events were >> found)? It creates empty gcov file and says nothing :( >> >> Moreover, in the mentioned README it is said that perf should also be >> executed with option -e BR_INST_RETIRED:TAKEN. > > Standard perf doesn't have a full event list > This assumes a perf patched with the libpfm patch. > > Also I suspect it really wants to use PEBS events, so pp should be added. > > Alternatively you can use ocperf (from > http://github.com/andikleen/pmu-tools) which is just a wrapper: > > ocperf.py record -e br_inst_retired.near_taken:pp -b ... > > or specify the event manually (depending on your CPU, like) > > perf record -e > cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp > -b ... > > BTW the biggest problem with autofdo currently is that it is > quite bitrotten and supports only several years old perf. > So all of this above will only work with old distributions, > unless you compile an old perf utility first. Do you mean newer perf does not support LBR (-b) any more? Dehao > > -Andi > > -- > a...@linux.intel.com -- Speaking for myself only
Re: AutoFDO profile toolchain is open-sourced
> You can use dump_gcov to show a text version of the profile dump and > check if the profile data makes sense. If your program is just a very > tight single loop, the current implementation in trunk may not yield > good results because it does not have discriminator support. Try the > google-4_9 branch instead. Can we possibly merge the remaining patches now when stage1 is open? Honza > > Dehao > > > > > > > -- > > Best regards, > > Ilya Palachev
Re: AutoFDO profile toolchain is open-sourced
> > BTW the biggest problem with autofdo currently is that it is
> > quite bitrotten and supports only several years old perf.
> > So all of this above will only work with old distributions,
> > unless you compile an old perf utility first.
>
> Do you mean newer perf does not support LBR (-b) any more?

No.

perf extended its perf.data output format, and quipper cannot parse any of the extensions, so it just bombs out with assertion failures.

I have a patch to hack around some of this, but I still couldn't get it to actually work so far.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.
Re: AutoFDO profile toolchain is open-sourced
I'll get to it soon. When will stage1 close? OTOH, the most important patch (insn-level discriminator support) is not in yet. Cary has just retired. Do you know if anyone would be interested in porting insn-level discriminator support to trunk? Dehao On Tue, Apr 21, 2015 at 8:59 AM, Jan Hubicka wrote: >> You can use dump_gcov to show a text version of the profile dump and >> check if the profile data makes sense. If your program is just a very >> tight single loop, the current implementation in trunk may not yield >> good results because it does not have discriminator support. Try the >> google-4_9 branch instead. > > Can we possibly merge the remaining patches now when stage1 is open? > > Honza >> >> Dehao >> >> > >> > >> > -- >> > Best regards, >> > Ilya Palachev
Re: AutoFDO profile toolchain is open-sourced
In that case, we should get quipper fixed upstream to accommodate new format. (Maybe they already fixed it, I will do a batch sync to make quipper up-to-date). Dehao On Tue, Apr 21, 2015 at 10:24 AM, Andi Kleen wrote: >> > BTW the biggest problem with autofdo currently is that it is >> > quite bitrotten and supports only several years old perf. >> > So all of this above will only work with old distributions, >> > unless you compile an old perf utility first. >> >> Do you mean newer perf does not support LBR (-b) any more? > > No. > > perf extended its perf.data output format, and quipper cannot parse > any of the extensions, so it just bombs out with assertation > failures. > > I have a patch to hack around some of this, but still > couldn't get it actually to work so far. > > -Andi > -- > a...@linux.intel.com -- Speaking for myself only.
Re: AutoFDO profile toolchain is open-sourced
On Tue, Apr 21, 2015 at 10:27:49AM -0700, Dehao Chen wrote:
> In that case, we should get quipper fixed upstream to accommodate new
> format. (Maybe they already fixed it, I will do a batch sync to make
> quipper up-to-date).

From a quick look at

http://git.chromium.org/gitweb/?p=chromiumos/platform/chromiumos-wide-profiling.git;a=summary

(I assume that is what you mean with upstream), it hasn't been updated. It is still stuck in 2013.

I'm attaching what patches I have so far.

-Andi

Attachment: autofdo-newer-perf-0.tgz (application/gtar-compressed)
Re: fn spec attribute on builtin function in fortran
Hi! On Mon, 1 Dec 2014 18:58:10 +0100, Tom de Vries wrote: > On 01-12-14 09:43, Jakub Jelinek wrote: > > On Mon, Dec 01, 2014 at 09:35:25AM +0100, Tom de Vries wrote: > >> I've been adding an fn spec function attribute to some openacc builtin > >> functions: > >> ... > >> diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def > >> index 9c05a94..4e34192 100644 > >> --- a/gcc/builtin-attrs.def > >> +++ b/gcc/builtin-attrs.def > >> @@ -64,6 +64,7 @@ DEF_ATTR_FOR_INT (6) > >> DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL, \ > >> ATTR_##ENUM, ATTR_NULL) > >> DEF_ATTR_FOR_STRING (STR1, "1") > >> +DEF_ATTR_FOR_STRING (DOT_DOT_DOT_r_r_r, "...rrr") > >> #undef DEF_ATTR_FOR_STRING > >> > >> /* Construct a tree for a list of two integers. */ > >> @@ -127,6 +128,8 @@ DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LIST, ATTR_PURE,\ > >>ATTR_NULL, ATTR_NOTHROW_LIST) > >> DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LEAF_LIST, ATTR_PURE,\ > >>ATTR_NULL, ATTR_NOTHROW_LEAF_LIST) > >> +DEF_ATTR_TREE_LIST > >> (ATTR_FNSPEC_DOT_DOT_DOT_NOCLOB_NOCLOB_NOCLOB_NOTHROW_LIST,\ > >> + ATTR_FNSPEC, ATTR_LIST_DOT_DOT_DOT_r_r_r, > >> ATTR_NOTHROW_LIST) > >> DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, ATTR_NORETURN, \ > >>ATTR_NULL, ATTR_NOTHROW_LIST) > >> DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\ > >> ... > >> > >> That worked well for c. When compiling the fortran compiler, I ran into > >> this error: > >> ... > >> In file included from gcc/fortran/f95-lang.c:1194:0: > >> gcc/fortran/../oacc-builtins.def: In function 'void > >> gfc_init_builtin_functions()': > >> gcc/fortran/../oacc-builtins.def:32:1: error: > >> 'ATTR_FNSPEC_DOT_DOT_DOT_NOCLOB_NOCLOB_NOCLOB_NOTHROW_LIST' was not > >> declared > >> in this scope > >> make[2]: *** [fortran/f95-lang.o] Error 1 > > > > Fortran FE uses gfc_build_library_function_decl_with_spec to build these. > > Thanks for the pointer, that's useful. That's for library functions though, I > need a builtin. 
> > I'm now trying the approach where I specify the attributes in two formats: > ... > DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC_DATA_START, "GOACC_data_start", > BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, > ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST, > ATTR_NOTHROW_LIST, > "...rrr") > ... > > In gcc/builtins.def, we use the first format (ATTRS): > ... > #undef DEF_GOACC_BUILTIN_FNSPEC > #define DEF_GOACC_BUILTIN_FNSPEC(ENUM, NAME, TYPE, ATTRS, ATTRS2, FNSPEC) \ >DEF_GOACC_BUILTIN(ENUM, NAME, TYPE, ATTRS) > ... > > And in gcc/fortran/f95-lang.c, we use the second format (ATTRS2, FNSPEC) and > a > new function gfc_define_builtin_with_spec: > ... > #undef DEF_GOACC_BUILTIN_FNSPEC > #define DEF_GOACC_BUILTIN_FNSPEC(code, name, type, attr, attr2, fnspec) \ >gfc_define_builtin_with_spec ("__builtin_" name, builtin_types[type], \ > code, name, attr2, fnspec); > ... > > Where gfc_define_builtin_with_spec borrows from > gfc_build_library_function_decl_with_spec: > ... > +static void > +gfc_define_builtin_with_spec (const char *name, tree fntype, > + enum built_in_function code, > + const char *library_name, int attr, > + const char *fnspec) > +{ > + if (fnspec) > +{ > + tree attr_args = build_tree_list (NULL_TREE, > + build_string (strlen (fnspec), > fnspec)); > + tree attrs = tree_cons (get_identifier ("fn spec"), > + attr_args, TYPE_ATTRIBUTES (fntype)); > + fntype = build_type_attribute_variant (fntype, attrs); > +} > + > + gfc_define_builtin (name, fntype, code, library_name, attr); > +} > ... Committed to gomp-4_0-branch in r77: commit 0279da75a39a5bd3ca54b7c5f7e3e303e067cf2c Author: tschwinge Date: Tue Apr 21 19:17:14 2015 + Add DEF_GOACC_BUILTIN_FNSPEC gcc/ * builtins.def (DEF_GOACC_BUILTIN_FNSPEC): Define. gcc/fortran/ * f95-lang.c (DEF_GOACC_BUILTIN_FNSPEC): Define. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@77 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp |4 gcc/builtins.def |7 +++ gcc/fortran/ChangeLog.gomp |2 ++ gcc/fortran/f95-lang.c | 11 +++ gcc/omp-builtins.def |1 + 5 files changed, 25 insertions(+) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index b499d04..b091dd5 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,3 +1,7 @@ +2015-04-21 Tom de Vries + + * builtins.def (DEF_GOACC_BUILTIN_FNSPEC): Define. + 2015-03-21 Tom de Vries PR tree-optimization/65460 diff --git gcc/builtins.def gcc/builtins.def index 55ce9
RE: AutoFDO profile toolchain is open-sourced
After patching linux perf, this script creates a coverage file (e.g., for linpack) which can be used for fdo.

gcov=linpack-x86.gcov
MAKE='make'

# x86
x86() {
  CC=/usr/bin/gcc
  CXX=/usr/bin/g++

  export CFLAGS="-Ofast -g3 -static"
  export CPPFLAGS=$CFLAGS

  $MAKE -C $SRC/SingleSource/Benchmarks/Linpack clean

  $MAKE -C $SRC/SingleSource/Benchmarks/Linpack -k TEST=simple \
    TARGET_LLVMGCC=$CC TARGET_CXX=$CXX LLI_OPTFLAGS= TARGET_CC=$CC \
    TARGET_LLVMGXX=$CXX CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= \
    USE_REFERENCE_OUTPUT=1 CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS= \
    LLC_OPTFLAGS= ENABLE_OPTIMIZED=1 ARCH=x86_64 ENABLE_HASHED_PROGRAM_OUTPUT=1 \
    DISABLE_JIT=1

  perfdata=autofdo-linpack/perf-x86.data

  perf record -b -e branch-instructions -o $perfdata \
    $SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple

  autofdo/usr/bin/create_gcov \
    --binary=$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple \
    --profile=$perfdata --gcov=$gcov
}

hth,
-Aditya

> From: a...@firstfloor.org
> To: i.palac...@samsung.com
> CC: dnovi...@google.com; gcc@gcc.gnu.org; davi...@google.com; hubi...@ucw.cz;
> seb...@gmail.com; de...@google.com; v.bari...@samsung.com
> Subject: Re: AutoFDO profile toolchain is open-sourced
> Date: Tue, 21 Apr 2015 07:25:10 -0700
>
> Ilya Palachev writes:
>>
>> But why create_gcov does not inform about that (no branch events were
>> found)? It creates empty gcov file and says nothing :(
>>
>> Moreover, in the mentioned README it is said that perf should also be
>> executed with option -e BR_INST_RETIRED:TAKEN.
>
> Standard perf doesn't have a full event list
> This assumes a perf patched with the libpfm patch.
>
> Also I suspect it really wants to use PEBS events, so pp should be added.
>
> Alternatively you can use ocperf (from
> http://github.com/andikleen/pmu-tools) which is just a wrapper:
>
> ocperf.py record -e br_inst_retired.near_taken:pp -b ...
>
> or specify the event manually (depending on your CPU, like)
>
> perf record -e
> cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp
> -b ...
>
> BTW the biggest problem with autofdo currently is that it is
> quite bitrotten and supports only several years old perf.
> So all of this above will only work with old distributions,
> unless you compile an old perf utility first.
>
> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
Re: AutoFDO profile toolchain is open-sourced
We also needed to adjust the gcov_version in autofdo/gcov.cc to read 0x1 for dev branches of gcc (instead of the current 0x3430372a for some released version of GCC): -DEFINE_uint64(gcov_version, 0x3430372a, +DEFINE_uint64(gcov_version, 0x1, Sebastian On Tue, Apr 21, 2015 at 3:33 PM, Aditya K wrote: > After patching linux perf. This script collects creates a coverage file > (e.g., for linpack) which can be used for fdo. > > > gcov=linpack-x86.gcov > MAKE='make' > > > # x86 > x86() { > CC=/usr/bin/gcc > CXX=/usr/bin/g++ > > export CFLAGS="-Ofast -g3 -static" > export CPPFLAGS=$CFLAGS > > $MAKE -C $SRC/SingleSource/Benchmarks/Linpack clean > > $MAKE -C $SRC/SingleSource/Benchmarks/Linpack -k TEST=simple > TARGET_LLVMGCC=$CC TARGET_CXX=$CXX LLI_OPTFLAGS= TARGET_CC=$CC > TARGET_LLVMGXX=$CXX CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= > USE_REFERENCE_OUTPUT=1CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS= > LLC_OPTFLAGS= ENABLE_OPTIMIZED=1 ARCH=x86_64 ENABLE_HASHED_PROGRAM_OUTPUT=1 > DISABLE_JIT=1 > > perfdata=autofdo-linpack/perf-x86.data > > perf record -b -e branch-instructions -o $perfdata > $SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple > > autofdo/usr/bin/create_gcov > --binary=$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple > --profile=$perfdata --gcov=$gcov > > } > > > hth, > -Aditya > > >> From: a...@firstfloor.org >> To: i.palac...@samsung.com >> CC: dnovi...@google.com; gcc@gcc.gnu.org; davi...@google.com; >> hubi...@ucw.cz; seb...@gmail.com; de...@google.com; v.bari...@samsung.com >> Subject: Re: AutoFDO profile toolchain is open-sourced >> Date: Tue, 21 Apr 2015 07:25:10 -0700 >> >> Ilya Palachev writes: >>> >>> But why create_gcov does not inform about that (no branch events were >>> found)? It creates empty gcov file and says nothing :( >>> >>> Moreover, in the mentioned README it is said that perf should also be >>> executed with option -e BR_INST_RETIRED:TAKEN. 
>> >> Standard perf doesn't have a full event list >> This assumes a perf patched with the libpfm patch. >> >> Also I suspect it really wants to use PEBS events, so pp should be added. >> >> Alternatively you can use ocperf (from >> http://github.com/andikleen/pmu-tools) which is just a wrapper: >> >> ocperf.py record -e br_inst_retired.near_taken:pp -b ... >> >> or specify the event manually (depending on your CPU, like) >> >> perf record -e >> cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp >> -b ... >> >> BTW the biggest problem with autofdo currently is that it is >> quite bitrotten and supports only several years old perf. >> So all of this above will only work with old distributions, >> unless you compile an old perf utility first. >> >> -Andi >> >> -- >> a...@linux.intel.com -- Speaking for myself only >
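As a side note on the two gcov_version values mentioned in Sebastian's message: reading 0x3430372a as four ASCII bytes gives a readable version tag, while 0x1 does not decode to printable text. The sketch below only illustrates that arithmetic; the helper name is made up, and the reading of the tag as a GCC 4.7-style version string is an interpretation, not something stated in the thread.

```python
def decode_gcov_version(value):
    """Render a 32-bit gcov version word as text, escaping unprintable bytes."""
    raw = value.to_bytes(4, "big")
    # Printable ASCII bytes are shown as characters, others as \xNN escapes.
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in raw)


if __name__ == "__main__":
    print(decode_gcov_version(0x3430372A))  # the default in autofdo/gcov.cc
    print(decode_gcov_version(0x1))         # the value used for dev branches
```

The first value decodes to the four characters `407*`; the second consists of three zero bytes followed by 0x01.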
Re: AutoFDO profile toolchain is open-sourced
Andi, Thanks for the patches. Turns out that the first 3 patches are already in, the correct upstream quipper repository is: https://chromium.googlesource.com/chromiumos/platform2/+/master/chromiumos-wide-profiling/ The last 3 patches seem to be local hacks. Do you want any of them in? I just did a batch sync with quipper head. Please let me know if this solves the perf problem. Thanks, Dehao On Tue, Apr 21, 2015 at 10:36 AM, Andi Kleen wrote: > On Tue, Apr 21, 2015 at 10:27:49AM -0700, Dehao Chen wrote: >> In that case, we should get quipper fixed upstream to accommodate new >> format. (Maybe they already fixed it, I will do a batch sync to make >> quipper up-to-date). > > From a quick look at > > http://git.chromium.org/gitweb/?p=chromiumos/platform/chromiumos-wide-profiling.git;a=summary > > (I assume that is what you mean with upstream) > > it hasn't been updated. Is still stuck in 2013. > > I'm attaching what patches I have so far. > > -Andi
Re: AutoFDO profile toolchain is open-sourced
That's correct. For trunk, gcov_version is 0x1. We defined this as a flag so that you can actually change it via --gcov_version=0x1 instead of changing the code. Dehao On Tue, Apr 21, 2015 at 1:47 PM, Sebastian Pop wrote: > We also needed to adjust the gcov_version in autofdo/gcov.cc to read > 0x1 for dev branches of gcc (instead of the current 0x3430372a for > some released version of GCC): > > -DEFINE_uint64(gcov_version, 0x3430372a, > +DEFINE_uint64(gcov_version, 0x1, > > Sebastian > > On Tue, Apr 21, 2015 at 3:33 PM, Aditya K wrote: >> After patching linux perf. This script collects creates a coverage file >> (e.g., for linpack) which can be used for fdo. >> >> >> gcov=linpack-x86.gcov >> MAKE='make' >> >> >> # x86 >> x86() { >> CC=/usr/bin/gcc >> CXX=/usr/bin/g++ >> >> export CFLAGS="-Ofast -g3 -static" >> export CPPFLAGS=$CFLAGS >> >> $MAKE -C $SRC/SingleSource/Benchmarks/Linpack clean >> >> $MAKE -C $SRC/SingleSource/Benchmarks/Linpack -k TEST=simple >> TARGET_LLVMGCC=$CC TARGET_CXX=$CXX LLI_OPTFLAGS= TARGET_CC=$CC >> TARGET_LLVMGXX=$CXX CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= >> USE_REFERENCE_OUTPUT=1CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS= >> LLC_OPTFLAGS= ENABLE_OPTIMIZED=1 ARCH=x86_64 ENABLE_HASHED_PROGRAM_OUTPUT=1 >> DISABLE_JIT=1 >> >> perfdata=autofdo-linpack/perf-x86.data >> >> perf record -b -e branch-instructions -o $perfdata >> $SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple >> >> autofdo/usr/bin/create_gcov >> --binary=$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple >> --profile=$perfdata --gcov=$gcov >> >> } >> >> >> hth, >> -Aditya >> >> >>> From: a...@firstfloor.org >>> To: i.palac...@samsung.com >>> CC: dnovi...@google.com; gcc@gcc.gnu.org; davi...@google.com; >>> hubi...@ucw.cz; seb...@gmail.com; de...@google.com; v.bari...@samsung.com >>> Subject: Re: AutoFDO profile toolchain is open-sourced >>> Date: Tue, 21 Apr 2015 07:25:10 -0700 >>> >>> Ilya Palachev writes: But why create_gcov does not inform about that (no 
branch events were found)? It creates empty gcov file and says nothing :( Moreover, in the mentioned README it is said that perf should also be executed with option -e BR_INST_RETIRED:TAKEN. >>> >>> Standard perf doesn't have a full event list >>> This assumes a perf patched with the libpfm patch. >>> >>> Also I suspect it really wants to use PEBS events, so pp should be added. >>> >>> Alternatively you can use ocperf (from >>> http://github.com/andikleen/pmu-tools) which is just a wrapper: >>> >>> ocperf.py record -e br_inst_retired.near_taken:pp -b ... >>> >>> or specify the event manually (depending on your CPU, like) >>> >>> perf record -e >>> cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp >>> -b ... >>> >>> BTW the biggest problem with autofdo currently is that it is >>> quite bitrotten and supports only several years old perf. >>> So all of this above will only work with old distributions, >>> unless you compile an old perf utility first. >>> >>> -Andi >>> >>> -- >>> a...@linux.intel.com -- Speaking for myself only >>
Re: [wwwdocs] PATCH for Re: GCC Plugin Announcement; CTraps - Lightweight dynamic analysis for concurrent code
On Tue, 21 Apr 2015, Jakub Jelinek wrote: >> I added this to our extensions page at https://gcc.gnu.org/extensions.html >> per the patch below. > Shouldn't we also list the GCC Python Plugin on that page? Yes, absolutely! David, want to suggest a patch? Or just some wording and a link and I'll take care. Gerald
Re: AutoFDO profile toolchain is open-sourced
Ok, thanks for the tip about the flag.
You would also need to pass "-use_lbr=false" to create a gcov file for a
device that does not have LBR support.
We tried this on profiles collected on ARM and got the same speedup on
linpack as with profiles collected on x86.

Sebastian

On Tue, Apr 21, 2015 at 3:53 PM, Dehao Chen wrote:
> That's correct. For trunk, gcov_version is 0x1. We defined this as a
> flag so that you can actually change it via --gcov_version=0x1 instead
> of changing the code.
>
> Dehao
>
> On Tue, Apr 21, 2015 at 1:47 PM, Sebastian Pop wrote:
>> We also needed to adjust the gcov_version in autofdo/gcov.cc to read
>> 0x1 for dev branches of gcc (instead of the current 0x3430372a for
>> some released version of GCC):
>>
>> -DEFINE_uint64(gcov_version, 0x3430372a,
>> +DEFINE_uint64(gcov_version, 0x1,
>>
>> Sebastian
>>
>> On Tue, Apr 21, 2015 at 3:33 PM, Aditya K wrote:
>>> After patching linux perf, this script collects a profile and creates
>>> a coverage file (e.g., for linpack) which can be used for fdo.
>>>
>>> gcov=linpack-x86.gcov
>>> MAKE='make'
>>>
>>> # x86
>>> x86() {
>>>   CC=/usr/bin/gcc
>>>   CXX=/usr/bin/g++
>>>
>>>   export CFLAGS="-Ofast -g3 -static"
>>>   export CPPFLAGS=$CFLAGS
>>>
>>>   $MAKE -C $SRC/SingleSource/Benchmarks/Linpack clean
>>>
>>>   $MAKE -C $SRC/SingleSource/Benchmarks/Linpack -k TEST=simple \
>>>     TARGET_LLVMGCC=$CC TARGET_CXX=$CXX LLI_OPTFLAGS= TARGET_CC=$CC \
>>>     TARGET_LLVMGXX=$CXX CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= \
>>>     USE_REFERENCE_OUTPUT=1 CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS= \
>>>     LLC_OPTFLAGS= ENABLE_OPTIMIZED=1 ARCH=x86_64 \
>>>     ENABLE_HASHED_PROGRAM_OUTPUT=1 DISABLE_JIT=1
>>>
>>>   perfdata=autofdo-linpack/perf-x86.data
>>>
>>>   perf record -b -e branch-instructions -o $perfdata \
>>>     $SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple
>>>
>>>   autofdo/usr/bin/create_gcov \
>>>     --binary=$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple \
>>>     --profile=$perfdata --gcov=$gcov
>>> }
>>>
>>> hth,
>>> -Aditya
>>>
>>> From: a...@firstfloor.org
>>> To: i.palac...@samsung.com
>>> CC: dnovi...@google.com; gcc@gcc.gnu.org; davi...@google.com;
>>>     hubi...@ucw.cz; seb...@gmail.com; de...@google.com;
>>>     v.bari...@samsung.com
>>> Subject: Re: AutoFDO profile toolchain is open-sourced
>>> Date: Tue, 21 Apr 2015 07:25:10 -0700
>>>
>>> Ilya Palachev writes:
>>> >
>>> > But why create_gcov does not inform about that (no branch events were
>>> > found)? It creates empty gcov file and says nothing :(
>>> >
>>> > Moreover, in the mentioned README it is said that perf should also be
>>> > executed with option -e BR_INST_RETIRED:TAKEN.
>>>
>>> Standard perf doesn't have a full event list
>>> This assumes a perf patched with the libpfm patch.
>>>
>>> Also I suspect it really wants to use PEBS events, so pp should be added.
>>>
>>> Alternatively you can use ocperf (from
>>> http://github.com/andikleen/pmu-tools) which is just a wrapper:
>>>
>>> ocperf.py record -e br_inst_retired.near_taken:pp -b ...
>>>
>>> or specify the event manually (depending on your CPU, like)
>>>
>>> perf record -e
>>> cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp
>>> -b ...
>>>
>>> BTW the biggest problem with autofdo currently is that it is
>>> quite bitrotten and supports only several years old perf.
>>> So all of this above will only work with old distributions,
>>> unless you compile an old perf utility first.
>>>
>>> -Andi
>>>
>>> --
>>> a...@linux.intel.com -- Speaking for myself only
gcc-5-20150421 is now available
Snapshot gcc-5-20150421 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20150421/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch revision 88

You'll find:

  gcc-5-20150421.tar.bz2    Complete GCC

  MD5=9ac5ce176073d25b9ac0e01a962cb983
  SHA1=9eec67c000a3ad59b843c9242e343a7479545a3c

Diffs from 5-20150414 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
How do I set a hard register in gimple
I have a question about inserting code into a function being compiled by
GCC. Basically, I want to set a hard register at the beginning of a
function, as is done below. If I compile the program below on MIPS, the
$16 register gets set to the result of alloca, and even if I optimize the
routine and nothing else uses p ($16), the set of $16 is still done.

    register void *p asm ("$16");

    void *foo(void *a)
    {
      p = alloca(64);
      /* Rest of function. */
    }

But if I try to insert this code myself from inside GCC, the setting of
$16 keeps getting optimized away and I cannot figure out how to stop it.
My code to set the register does this:

    ptr_var = build_decl (UNKNOWN_LOCATION, VAR_DECL,
                          get_identifier ("__alloca_reg"), ptr_type);
    TREE_PUBLIC (ptr_var) = 1;
    DECL_EXTERNAL (ptr_var) = 1;
    SET_DECL_RTL (ptr_var, gen_raw_REG (Pmode, 16));
    DECL_REGISTER (ptr_var) = 1;
    DECL_HARD_REGISTER (ptr_var) = 1;
    TREE_THIS_VOLATILE (ptr_var) = 1;
    TREE_USED (ptr_var) = 1;
    varpool_node::finalize_decl (ptr_var);
    stmt = gimple_build_assign (ptr_var, build_fold_addr_expr (array_var));
    e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (fun));
    gsi_insert_on_edge_immediate (e, stmt);

I see the code during the rtl expansion, but during the first CSE pass
the set of $16 goes away. How do I mark this variable as 'volatile' so
that the assignment to it does not go away? It must be possible, because
the set does not go away in my small example program, but I can't figure
out what the front end is setting that I am not.

Steve Ellcey
sell...@imgtec.com
Re: AutoFDO profile toolchain is open-sourced
On Tue, Apr 21, 2015 at 01:52:18PM -0700, Dehao Chen wrote:
> Andi,
>
> Thanks for the patches. Turns out that the first 3 patches are already
> in, the correct upstream quipper repository is:
>
> https://chromium.googlesource.com/chromiumos/platform2/+/master/chromiumos-wide-profiling/
>
> The last 3 patches seem to be local hacks. Do you want any of them in?
>
> I just did a batch sync with quipper head. Please let me know if this
> solves the perf problem.

Still outdated:

F0421 20:13:16.221422 22297 perf_reader.cc:1614] Check failed: attr_size <= sizeof(perf_event_attr) (104 vs. 96)

-Andi
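For readers following along: the failed check above says the perf.data file carries a 104-byte perf_event_attr while the reader was built against a 96-byte one. A reader can tolerate that by copying only the prefix it understands and skipping the rest, on the assumption that the struct only ever grows at the end. The sketch below shows that idea only; it is not quipper's code, and `known_attr`, `KNOWN_ATTR_SIZE` and `read_attr_tolerant` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the reader's view of perf_event_attr:
   96 bytes, the size reported in the failed check.  */
#define KNOWN_ATTR_SIZE 96

struct known_attr {
    unsigned char bytes[KNOWN_ATTR_SIZE];
};

/* Copy an attr record of FILE_ATTR_SIZE bytes from SRC into DST.
   Bytes beyond what the reader knows are not copied; if the record
   is older and smaller, the missing tail of DST stays zeroed.
   Returns the number of trailing input bytes the caller must skip.  */
size_t read_attr_tolerant(struct known_attr *dst,
                          const unsigned char *src,
                          size_t file_attr_size)
{
    size_t n = file_attr_size < sizeof *dst ? file_attr_size : sizeof *dst;

    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, n);
    return file_attr_size - n;
}
```

With a 104-byte record the function fills all 96 known bytes and reports 8 bytes to skip. Dropping unknown fields is only safe if they do not change the meaning of the fields that were kept, which this sketch does not check.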
Re: AutoFDO profile toolchain is open-sourced
On Wed, Apr 22, 2015 at 05:15:47AM +0200, Andi Kleen wrote:
> On Tue, Apr 21, 2015 at 01:52:18PM -0700, Dehao Chen wrote:
> > Andi,
> >
> > Thanks for the patches. Turns out that the first 3 patches are already
> > in, the correct upstream quipper repository is:
> >
> > https://chromium.googlesource.com/chromiumos/platform2/+/master/chromiumos-wide-profiling/
> >
> > The last 3 patches seem to be local hacks. Do you want any of them in?
> >
> > I just did a batch sync with quipper head. Please let me know if this
> > solves the perf problem.
>
> Still outdated:
>
> F0421 20:13:16.221422 22297 perf_reader.cc:1614] Check failed: attr_size <=
> sizeof(perf_event_attr) (104 vs. 96)

It converts with the attached patches, but there's still some problem
parsing the data:

% ./create_gcov -binary loop -gcov_version 1 -gcov loop.gcda -gcov_version 0x500e
% gcc50 -O2 -fprofile-use loop.c
loop.c:1:0: warning: '/home/andi/src/autofdo/loop.gcda' is version ', expected version '500e'
%

-Andi

autofdo-patches-2.tgz
Description: application/gtar-compressed
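A note on the version mismatch in the warning above. As far as I know, gcov data files store their version as four ASCII characters packed into one 32-bit word, which would explain why the old default 0x3430372a corresponds to a released GCC and why the compiler asks for '500e'. Under that assumption, the literal value 0x500e is not the word for '500e'. The helpers below are a sketch for checking such values by hand; the function names are made up for illustration and are not part of AutoFDO or GCC.

```c
#include <stdint.h>

/* Unpack a 32-bit gcov version word into its four characters,
   most significant byte first.  OUT must hold at least 5 bytes.  */
void gcov_version_to_chars(uint32_t version, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((version >> (24 - 8 * i)) & 0xff);
    out[4] = '\0';
}

/* Pack four characters such as "500e" into a 32-bit version word.  */
uint32_t gcov_version_from_chars(const char s[4])
{
    return ((uint32_t)(unsigned char)s[0] << 24)
         | ((uint32_t)(unsigned char)s[1] << 16)
         | ((uint32_t)(unsigned char)s[2] << 8)
         |  (uint32_t)(unsigned char)s[3];
}
```

Decoding 0x3430372a gives "407*", and packing "500e" gives 0x35303065. Decoding 0x500e gives a string whose first byte is NUL, which may be why the warning prints an empty version. Whether passing the packed value is enough to make the profile load is not something this sketch shows.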
Re: PR63633: May middle-end come up width hard regs for insn expanders?
On 21/04/15 02:04 AM, Segher Boessenkool wrote:
> On Tue, Apr 21, 2015 at 12:27:40AM +0200, Steven Bosscher wrote:
>> On Mon, Apr 20, 2015 at 10:11 PM, Vladimir Makarov wrote:
>>> I might be wrong but I think you have a bloated code because you use
>>> scratches. I already told several times that usage of scratch is
>>> always a bad idea. It was a bad idea for an old RA and is still a bad
>>> idea for IRA. The usage of scratches should be prohibited, probably
>>> we should write it somewhere. It is better to use just a regular
>>> pseudo instead.
>> Thanks Vladimir, I didn't know this. Does this mean that, for example,
>> extendsidi in i386.md would be better if it did not use match_scratch?
> The combiner can add or remove clobbers of scratches whenever needed,
> but it cannot do that for clobbers of pseudos.

Yes, I think there are some pitfalls with scratches in other passes. As
for the combiner, it is probably worth considering treating clobbers of
pseudos with *one* reference as scratches too. That might improve the
code in some cases, although I am not sure about this.