Re: [PATCH] gcc parallel make check
On Wed, Sep 10, 2014 at 11:23:34PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 10, 2014 at 11:08:22PM +0200, Jakub Jelinek wrote:
> > Perhaps better approach might be if we have some way how to synchronize
> > among multiple expect processes and spawn only as many expects (of course,
> > per check target) as there are CPUs. E.g. if mkdir is atomic on all
> > hosts/filesystems we care about, we could have some shared directory that
> > make would clear before spawning all the expects, and after checking
> > runtest_file_p we could attempt to mkdir something (e.g. testcase filename
> > with $(srcdir) part removed, or *.exp filename / counter what test are we
> > considering or something similar) in the shared directory, if that would
> > succeed, it would tell us that we are the process that should run the test,
> > if that failed, we'd know some other runtest did that.
> > Or perhaps not for every single test, but every 10 or 100 tests or
> > something.
> >
> > E.g. we could just override runtest_file_p itself, so that it would first
> > call the original dejagnu version, and then do this check.
>
> Seems file mkdir in tcl doesn't error on pre-existing directory, so perhaps
> [open $path {WRONLY EXCL CREAT}] ?
> Now, does this work properly on all hosts we care about?

Here is a proof of concept on the tcl side.
To get a large seq of numbers in the Makefile, I guess we can use something like

    check_p_numbers0:=1 2 3 4 5 6 7 8 9
    check_p_numbers1:=0 $(check_p_numbers0)
    check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1)))
    check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2)
    check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3)))
    check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4)
    check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5)))
    check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6)

(and then what

    check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers))

uses, just with $(check_$*_parallelize) replaced with something to match the number of desired goals).

Looking at some of the *.exp tests, it seems only some of them (though, the majority of the time consuming ones) actually use runtest_file_p, e.g. compat.exp or struct-layout-1.exp and several others don't.

So, IMHO what we should do in the Makefile is, right inside

    @if [ -z "$(filter-out --target_board=%,$(filter-out --extra_opts%,$(RUNTESTFLAGS)))" ] \
        && [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \

first

    rm -rf $(TESTSUITEDIR)/$*-parallel; mkdir $(TESTSUITEDIR)/$*-parallel

so that we start with empty dir, compute check_p_subdirs from actual -jN number, then in check-parallel-gcc_1 etc. goals (but not in check-parallel-gcc) set

    GCC_RUNTEST_PARALLELIZE_DIR=$(TESTSUITEDIR)/$(check_p_tool)-parallel

in the environment and use RUNTESTFLAGS with selected known to be parallelizable *.exp files (dg.exp execute.exp compile.exp and the like), and use all the other *.exp files for check-parallel-gcc.

Thoughts on this?
Unfortunately, not sure how would that work with the check-subtargets stuff if people are used to parallelize testing across multiple machines (but it is unclear to me how they are merging the log/sum files from the multiple machines anyway). Not sure if this works over NFS/AFS and other networked filesystems, if it does, supposedly they could arrange for the *-parallel directories to be shared. I can't find how to query the -jN value passed to make check by the user though, both $(MFLAGS) and $(MAKEFLAGS) only contain something like --jobserver-fds=3,5 -j from which it is not possible to find out how many goals would be the upper reasonable limit. Running too many goals would waste time (once scheduled, the goal would only wildcard all the test, and for all of them find in the *-parallel directory the test has been run already), running too few could prevent good parallelization. --- gcc/testsuite/lib/gcc-defs.exp.jj 2014-09-01 09:43:28.0 +0200 +++ gcc/testsuite/lib/gcc-defs.exp 2014-09-11 08:37:43.871943270 +0200 @@ -188,6 +188,30 @@ if { [info procs runtest_file_p] == "" } } } +if { [info exists env(GCC_RUNTEST_PARALLELIZE_DIR)] \ + && [info procs runtest_file_p] != [list] \ + && [info procs gcc_parallelize_saved_runtest_file_p] == [list] } then { +rename runtest_file_p gcc_parallelize_saved_runtest_file_p +global gcc_runtest_parallelize_counter + +set gcc_runtest_parallelize_counter 0 +proc runtest_file_p { runtests testcase } { + global gcc_runtest_parallelize_counter + if ![gcc_parallelize_saved_runtest_file_p $runtests $testcase] { + return 0 + } + + set dir [getenv GCC_RUNTEST_PARALLELIZE_DIR] + set path $dir/$gcc_runtest_parallelize_counter + set gcc_runtest_parallelize_counter [expr {$gcc_runt
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 09:51:23AM +0200, Jakub Jelinek wrote:
> I can't find how to query the -jN value passed to make check by the user
> though, both $(MFLAGS) and $(MAKEFLAGS) only contain something like
> --jobserver-fds=3,5 -j
> from which it is not possible to find out how many
> goals would be the upper reasonable limit. Running too many goals would
> waste time (once scheduled, the goal would only wildcard all the test, and
> for all of them find in the *-parallel directory the test has been run
> already), running too few could prevent good parallelization.

After a little googling, it seems there is no way to do that :(, unless one e.g. attempts to find the command line of the topmost parent make and scan it through ps or something.

There is an option to touch say *-parallel/finished file once any of the check-parallel-gcc-{1,2,...} goals is done (because when it finishes, it means all the tests for the particular check-$lang that are parallelizable have either finished, or at least touched their file) and not start runtest at all if finished already exists, but guess it would be still undesirable to have tens of thousands of goals by default, so perhaps we could go with say 128 subgoals by default and have some env var to override it, so on the really highly parallel boxes you'd specify

    make -j512 -k check GCC_TEST_PARALLEL_SLOTS=512

or similar.

	Jakub
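The "128 subgoals by default, overridable through the environment" idea can be sketched in a few lines of shell. This is only an illustration, not code from the patch: the helper name test_parallel_slots is made up here, and the only thing taken from the thread is the GCC_TEST_PARALLEL_SLOTS variable and the default of 128.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the number of
# sub-goals to use. GCC_TEST_PARALLEL_SLOTS and the default of 128 come
# from the mail above; an unset or empty variable falls back to 128.
test_parallel_slots() {
    echo "${GCC_TEST_PARALLEL_SLOTS:-128}"
}
```

Treating an empty value like an unset one is a choice made for this sketch; it keeps `make check GCC_TEST_PARALLEL_SLOTS=` from asking for zero goals.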
RE: [PATCH] RE: gcc parallel make check
> could it be that the pattern in normal1 should have been '[ab]*/ de*/
> [ep]*/*' ?

I've checked that this fixes the bug in the current trunk split, i.e. files are still tested, but now only once. Consider this change added to the previously submitted patch.
Re: Frame pointer optimization issues
On 20/08/14 16:22, Wilco Dijkstra wrote:
> Hi,
>
> Various targets implement -momit-leaf-frame-pointer to avoid using a frame
> pointer in leaf functions. Currently the GCC mid-end does not provide a way
> of doing this, so targets have resorted to hacks. Typically this involves
> forcing flag_omit_frame_pointer to be true in the _option_override callback.
> The issue is that this doesn't work as it modifies the actual option
> variable. As a result the callback is not idempotent, so option save/restore
> when using function attributes fail as the callback is called multiple times
> on the modified options. Note this bug exists on all targets which override
> options in _option_override (and despite claims to the contrary in BZ 60580
> this bug exists on all targets that implement -fomit-leaf-frame-pointer).

Agreed, current GCC doesn't support finer control of the frame pointer. Currently all three targets that want finer control of the frame pointer for leaf functions (i386, aarch64, bfin) have this bug. For example, for a simple testcase:

    __attribute__ ((optimize("no-omit-frame-pointer")))
    int cal (int a)
    {
      int b = a + 0x200;
      foo(&b);
      return a + b + 1;
    }

    __attribute__ ((optimize("omit-frame-pointer")))
    int cal1 (int a)
    {
      int b = a + 0x200;
      foo(&b);
      return a + b + 1;
    }

    ./cc1-bfin -O0 hello.c -momit-leaf-frame-pointer

the attribute for "cal1" doesn't work.

> 2. Change the mid-end to call _frame_pointer_required even when
> !flag_omit_frame_pointer. This is a generic solution which allows targets to
> decide when exactly to optimize frame pointers. However it does mean all
> implementations of _frame_pointer_required must be updated (the trivial safe
> fix is to add "if (!flag_omit_frame_pointer) return true;" at the start).

IMHO, this fix makes sense; it will make the calculation of frame_pointer_needed more flexible. Remove the "! flag_omit_frame_pointer" when initializing frame_pointer_needed, and let the frame_pointer_required hook check flag_omit_frame_pointer and any other target frame pointer control flags (omit-leaf-frame-pointer etc.) to decide whether a frame pointer is needed.

    frame_pointer_needed
      = (! flag_omit_frame_pointer
         || (cfun->calls_alloca && EXIT_IGNORE_STACK)
         /* We need the frame pointer to catch stack overflow exceptions

Any comments?

Thanks.
-- Jiong
Re: [PATCH] RE: gcc parallel make check
On 11 September 2014 07:22, VandeVondele Joost wrote:
> Jakub,
>
>> First of all, the -j2 testing shows more tests tested in gcc and libstdc++:
>>
>>-# of expected passes 10133
>>+# of expected passes 10152
>>
>>+PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>>[...]
>>
>>Not sure where the bug is, could be e.g. in i386.exp for gcc, but for
>>libstdc++ less likely to be there rather than in the split.
>
> I looked into this, and believe this problem is already in current trunk, and
> not due to my patch. I.e. unmodified trunk also has these tests executed
> several times:
>
> libstdc++-v3/testsuite/normal4/libstdc++.log.sep:PASS:
> 23_containers/map/modifiers/erase/abi_tag.cc
> libstdc++-v3/testsuite/normal1/libstdc++.log.sep:PASS:
> 23_containers/map/modifiers/erase/abi_tag.cc
>
> I believe the current trunk pattern could indeed match those twice
> (Makefile.in in trunk):
> normal1) \
> dirs="`cd $$srcdir; echo [ab]* de* [ep]*/*`";; \
> normal4) \
> dirs="`cd $$srcdir; echo 23_*/[a-km-tw-z]*`";; \
>
> could it be that the pattern in normal1 should have been '[ab]*/ de*/
> [ep]*/*' ?

Yes, we are running these tests multiple times:

PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)

I'll fix that.
Using modes on parallel in vec_select
We are currently working on the implementation of MSA (SIMD) for MIPS and are implementing vector interleave instructions which have a combination of vec_select and vec_concat operators in their patterns. The selectors for the vec_select operators depend on the vector mode, so to avoid writing multiple patterns we are using this kind of structure:

    (define_insn "msa_ilvev_"
      [(set (match_operand:IMSA 0 "register_operand" "=f")
            (vec_select:IMSA
              (vec_concat:
                (match_operand:IMSA 1 "register_operand" "f")
                (match_operand:IMSA 2 "register_operand" "f"))
              (match_operand:IMSA 3 "vec_par_const_ev" "")))]

Operand 3 is a parallel which we require to have the same mode as the vector. This allows the predicate to check for the appropriate sequence of element selectors based on the mode.

The question is whether it is acceptable to require a mode on the parallel that forms the element selector, i.e. will this requirement prevent any of the standard optimisation passes (such as combine) from speculatively matching this pattern? The mode could obviously be moved into the predicate name instead, at the cost of more predicates, but would the current approach cause any problems?

Thanks,
Matthew
RE: [PATCH] RE: gcc parallel make check
>> could it be that the pattern in normal1 should have been '[ab]*/ de*/
>> [ep]*/*' ?
>
>Yes, we are running these tests multiple times:
>
>PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
>PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess
>errors)
>PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess
>errors)
>PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)
>
>I'll fix that.

Actually, the proper pattern should presumably be '[ab]*/* de*/* [ep]*/*' even though it seems to make no difference in testing. I'll have this included in yet another version of the parallel make check patch (plus some further reshuffling as requested by Jakub), so I think there is no need for you to fix this now.
Re: [PATCH] RE: gcc parallel make check
On 11 September 2014 15:45, VandeVondele Joost wrote:
>
>>> could it be that the pattern in normal1 should have been '[ab]*/ de*/
>>> [ep]*/*' ?
>>
>>Yes, we are running these tests multiple times:
>>
>>PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
>>PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess
>>errors)
>>PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess
>>errors)
>>PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>>PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)
>>
>>I'll fix that.
>
> Actually, the proper pattern should presumably be '[ab]*/* de*/* [ep]*/*'
> even though it seems to make no difference in testing.

Yes, that's what I'm testing.

> I'll have this included in yet another version of the parallel make check
> patch (plus some further reschuffling as requested by Jakub), so I think
> there is no need for you to fix this now.

This can (and should) be fixed now, without waiting for some other change.
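For readers following the pattern discussion, the difference between the two glob lists can be seen on a toy directory tree. This is a sketch only: the tree below is invented for illustration and is not the real libstdc++ testsuite layout, and the helper names expand_in and make_demo_tree are hypothetical. It shows what the shell expands each list to, nothing more.

```shell
#!/bin/sh
# expand_in DIR PATTERNS: print what the shell expands PATTERNS to when
# run inside DIR, mimicking the `cd $$srcdir; echo ...` idiom quoted above.
expand_in() {
    # $2 is deliberately unquoted so that word splitting and pathname
    # expansion happen here, inside DIR.
    (cd "$1" && echo $2)
}

# make_demo_tree DIR: build a tiny, made-up tree with one second-level
# entry per top-level directory (names chosen only for the demo).
make_demo_tree() {
    mkdir -p "$1/abi/check" "$1/backward/hash_set" "$1/decimal/binary" \
             "$1/ext/pb_ds" "$1/performance/20_util"
}
```

On this toy tree, `expand_in "$dir" '[ab]* de* [ep]*/*'` yields whole top-level directories for the first two words and second-level entries for the third, while `expand_in "$dir" '[ab]*/* de*/* [ep]*/*'` yields second-level entries throughout.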
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 10:06:40AM +0200, Jakub Jelinek wrote: > There is an option to touch say *-parallel/finished file once any of the > check-parallel-gcc-{1,2,...} goals is done (because when it finishes, it > means all the tests for the particular check-$lang that are parallelizable > have either finished, or at least touched their file) and not start runtest > at all if finished already exists, but guess it would be still undesirable to > have > tens of thousands of goals by default, so perhaps we could go with say > 128 subgoals by default and have some env var to override it, so on the > really highly parallel boxes you'd specify > make -j512 -k check GCC_TEST_PARALLEL_SLOTS=512 > or similar. Here is a patch I'm testing now: --- gcc/Makefile.in.jj 2014-09-08 22:12:56.0 +0200 +++ gcc/Makefile.in 2014-09-11 16:06:36.641219430 +0200 @@ -513,34 +513,10 @@ xm_include_list=@xm_include_list@ xm_defines=@xm_defines@ lang_checks= lang_checks_parallelized= -dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp -dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp -dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp -dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp -dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp -# This lists a couple of test files that take most time during check-gcc. -# When doing parallelized check-gcc, these can run in parallel with the -# remaining tests. Each word in this variable stands for work for one -# make goal and one extra make goal is added to handle all the *.exp -# files not handled explicitly already. If multiple *.exp files -# should be run in the same runtest invocation (usually if they aren't -# very long running, but still should be split of from the check-parallel-$lang -# remaining tests runtest invocation), they should be concatenated with commas. 
-# Note that [a-zA-Z] wildcards need to have []s prefixed with \ (needed -# by tcl) and as the *.exp arguments are mached both as is and with -# */ prefixed to it in runtest_file_p, it is usually desirable to include -# a subdirectory name. -check_gcc_parallelize=execute.exp=execute/2* \ - execute.exp=execute/\[013-9a-fA-F\]* \ - execute.exp=execute/\[pP\]*,dg.exp \ - execute.exp=execute/\[g-oq-zG-OQ-Z\]*,compile.exp=compile/2* \ - compile.exp=compile/\[9pP\]*,builtins.exp \ - compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \ - dg-torture.exp,ieee.exp \ - vect.exp,unsorted.exp \ - guality.exp \ - struct-layout-1.exp,stackalign.exp \ - $(dg_target_exps) +# Upper limit to which it is useful to parallelize this lang target. +# It doesn't make sense to try e.g. 128 goals for small testsuites +# like objc or go. +check_gcc_parallelize=1 lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt lang_specs_files=@lang_specs_files@ lang_tree_files=@lang_tree_files@ @@ -3631,27 +3607,32 @@ $(filter-out $(lang_checks_parallelized) export TCL_LIBRARY ; fi ; \ $(RUNTEST) --tool $* $(RUNTESTFLAGS)) -$(patsubst %,%-subtargets,$(filter-out $(lang_checks_parallelized),$(lang_checks))): check-%-subtargets: +$(patsubst %,%-subtargets,$(lang_checks)): check-%-subtargets: @echo check-$* check_p_tool=$(firstword $(subst _, ,$*)) -check_p_vars=$(check_$(check_p_tool)_parallelize) +check_p_count=$(check_$(check_p_tool)_parallelize) check_p_subno=$(word 2,$(subst _, ,$*)) -check_p_comma=, -check_p_subwork=$(subst $(check_p_comma), ,$(if $(check_p_subno),$(word $(check_p_subno),$(check_p_vars -check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 +check_p_numbers0:=1 2 3 4 5 6 7 8 9 +check_p_numbers1:=0 $(check_p_numbers0) +check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1))) +check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2) +check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst 
%,$(i)%,$(check_p_numbers3))) +check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4) +check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5))) +check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6) check_p_subdir=$(subst _,,$*) -check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers)) +check_p_subdirs=$(wordlist 1,$(check_p_count),$(wordlist 1,$(or $(GCC_TEST_PARALLEL_SLOTS),128),$(check_p_numbers))) # For parallelized check-% targets, this decides whether parallelization # is desirable (if -jN is used and RUNTESTFLAGS doesn't contain anything # but optional --target_board or --extra_opts arguments). If desirable, # recursive make is run with check-parallel-$lang{,1,2,3,4,5} etc
Re: [PATCH] gcc parallel make check
> "Jakub" == Jakub Jelinek writes: Jakub> I fear that is going to be too expensive, because e.g. all the Jakub> caching that dejagnu and our tcl stuff does would be gone, all Jakub> the tests for lp64 etc. would need to be repeated for each test. In gdb I arranged to have this stuff saved in a special cache directory. See gdb/testsuite/lib/cache.exp for the mechanism. Tom
RE: [PATCH] gcc parallel make check
> Here is a patch I'm testing now:

Hi Jakub,

I also tested your patch to compare timings vs a newer patch (v8) I'll send soon

== patch v8 == make -j32 -k ==
check-fortran 4m58.178s
check-c++ ~10m
check-c ~10m
check 15m29.873s

== patch Jakub
check-c++ ~20m
check-fortran 3m31.237s
check-c 8m8

On the positive side, your patch provides a further speedup, e.g. for fortran and c testing (where it splits things nicely). The libstdc++ bottleneck is not solved, but I guess that is expected.

As you have presumably found as well, your patch introduces a number of failures, because some tests seem to have additional dependencies, either explicit or implicit, e.g.

in gfortran.dg/binding_label_tests_10_main.f03

    ! { dg-do compile }
    ! This file must be compiled AFTER binding_label_tests_10.f03, which it
    ! should be because dejagnu will sort the files.
    module binding_label_tests_10_main

in gfortran.dg/class_45b.f03

    ! { dg-do link }
    ! { dg-additional-sources class_45a.f03 }

This could clearly trigger as well in the current scheme of splitting, only we have been lucky that dependencies seem to be 'well behaved' in having the same initial letter in the filename.

Joost
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 05:04:56PM +, VandeVondele Joost wrote:
> > Here is a patch I'm testing now:
>
> I also tested your patch to compare timings vs a newer patch (v8) I'll send
> soon
>
> == patch v8 == make -j32 -k ==
> check-fortran 4m58.178s
> check-c++ ~10m
> check-c ~10m
> check 15m29.873s
>
> == patch Jakub
> check-c++ ~20m
> check-fortran 3m31.237s
> check-c 8m8
>
> on the positive side, your patch provides a further speedup e.g. fortran
> and c testing (where it splits things nicely). The libstdc++ bottleneck
> is not solved, but I guess that is expected.

The same technique can be of course used for libstdc++, I just didn't want to do that until the -C gcc testing is changed.

> As you have presumably found as well, your patch introduces a number
> failures, because some tests seem to have additional dependencies, either
> explicit or implicit:

I found more issues, in particular it seemed that struct-layout-1.exp, gnu-encoding.exp, plugin.exp and some go*.exp don't call runtest_file_p in the same amounts and same arguments in all invocations. And these Fortran inter-test dependencies, which Tobias told me is PR56408.

Unfortunately my remote testing box is unreachable now and I'm still waiting for DDR4 modules to finish building better workstation, so can't test this right now. The patch below intends to serialize the content of the problematic *.exp tests (the first runtest to reach one of those will simply run all the tests from that *.exp file, others will skip it). For go I currently have no idea why does that happen, quick hack would be just disable parallelization of go temporarily and let Ian investigate. For PR56408 we need some fix.

	Jakub
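The "first runtest to reach it runs it, the others skip it" rule can be sketched with a plain mkdir in a shared directory, which is the mechanism floated earlier in the thread. This is a shell illustration under stated assumptions, not the Tcl code from the patch: the helper names claim_exp and run_or_skip are made up, and the sketch relies on mkdir (without -p) failing when the directory already exists. Whether that is reliably atomic on networked filesystems is exactly the open question raised in the thread.

```shell
#!/bin/sh
# claim_exp DIR NAME: succeed only for the first caller that claims NAME
# inside the shared directory DIR. Plain mkdir (no -p) fails if the entry
# already exists, so a later caller gets a non-zero status.
# Note: it also fails if DIR itself is missing, in which case every
# caller would skip; a real implementation would have to check for that.
claim_exp() {
    mkdir "$1/$2" 2>/dev/null
}

# run_or_skip DIR NAME: report what this (hypothetical) runtest instance
# would do with the *.exp file NAME.
run_or_skip() {
    if claim_exp "$1" "$2"; then
        echo "run $2"
    else
        echo "skip $2"
    fi
}
```

Calling `run_or_skip "$dir" plugin.exp` twice against the same directory reports "run" for the first call and "skip" for the second.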
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 07:26:37PM +0200, Jakub Jelinek wrote: > right now. The patch below intends to serialize the content of the > problematic *.exp tests (the first runtest to reach one of those will simply > run all the tests from that *.exp file, others will skip it). Forgotten patch below. BTW, something will probably need to be done about acats too, either similar approach or just splitting the chapters into little more jobs, because otherwise in make -C check -j48 acats dominated the testing time for me. --- gcc/Makefile.in.jj 2014-09-08 22:12:56.0 +0200 +++ gcc/Makefile.in 2014-09-11 16:58:01.076371437 +0200 @@ -513,34 +513,10 @@ xm_include_list=@xm_include_list@ xm_defines=@xm_defines@ lang_checks= lang_checks_parallelized= -dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp -dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp -dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp -dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp -dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp -# This lists a couple of test files that take most time during check-gcc. -# When doing parallelized check-gcc, these can run in parallel with the -# remaining tests. Each word in this variable stands for work for one -# make goal and one extra make goal is added to handle all the *.exp -# files not handled explicitly already. If multiple *.exp files -# should be run in the same runtest invocation (usually if they aren't -# very long running, but still should be split of from the check-parallel-$lang -# remaining tests runtest invocation), they should be concatenated with commas. -# Note that [a-zA-Z] wildcards need to have []s prefixed with \ (needed -# by tcl) and as the *.exp arguments are mached both as is and with -# */ prefixed to it in runtest_file_p, it is usually desirable to include -# a subdirectory name. 
-check_gcc_parallelize=execute.exp=execute/2* \ - execute.exp=execute/\[013-9a-fA-F\]* \ - execute.exp=execute/\[pP\]*,dg.exp \ - execute.exp=execute/\[g-oq-zG-OQ-Z\]*,compile.exp=compile/2* \ - compile.exp=compile/\[9pP\]*,builtins.exp \ - compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \ - dg-torture.exp,ieee.exp \ - vect.exp,unsorted.exp \ - guality.exp \ - struct-layout-1.exp,stackalign.exp \ - $(dg_target_exps) +# Upper limit to which it is useful to parallelize this lang target. +# It doesn't make sense to try e.g. 128 goals for small testsuites +# like objc or go. +check_gcc_parallelize=1 lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt lang_specs_files=@lang_specs_files@ lang_tree_files=@lang_tree_files@ @@ -3631,27 +3607,32 @@ $(filter-out $(lang_checks_parallelized) export TCL_LIBRARY ; fi ; \ $(RUNTEST) --tool $* $(RUNTESTFLAGS)) -$(patsubst %,%-subtargets,$(filter-out $(lang_checks_parallelized),$(lang_checks))): check-%-subtargets: +$(patsubst %,%-subtargets,$(lang_checks)): check-%-subtargets: @echo check-$* check_p_tool=$(firstword $(subst _, ,$*)) -check_p_vars=$(check_$(check_p_tool)_parallelize) +check_p_count=$(check_$(check_p_tool)_parallelize) check_p_subno=$(word 2,$(subst _, ,$*)) -check_p_comma=, -check_p_subwork=$(subst $(check_p_comma), ,$(if $(check_p_subno),$(word $(check_p_subno),$(check_p_vars -check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 +check_p_numbers0:=1 2 3 4 5 6 7 8 9 +check_p_numbers1:=0 $(check_p_numbers0) +check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1))) +check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2) +check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3))) +check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4) +check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5))) +check_p_numbers:=$(check_p_numbers0) 
$(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6) check_p_subdir=$(subst _,,$*) -check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers)) +check_p_subdirs=$(wordlist 1,$(check_p_count),$(wordlist 1,$(or $(GCC_TEST_PARALLEL_SLOTS),128),$(check_p_numbers))) # For parallelized check-% targets, this decides whether parallelization # is desirable (if -jN is used and RUNTESTFLAGS doesn't contain anything # but optional --target_board or --extra_opts arguments). If desirable, # recursive make is run with check-parallel-$lang{,1,2,3,4,5} etc. goals, # which can be executed in parallel, as they are run in separate directories. -# check-parallel-$lang{1,2,3,4,5} etc. goals invoke runtest with the longest -# running *.exp files from the testsuite, as determined b
RE: [PATCH] gcc parallel make check
> And these Fortran inter-test dependencies, which Tobias told me is
> PR56408.

> For PR56408 we need some fix.

BTW, is there anything special about Fortran ? There are at least 180 test files that contain 'dg-additional-sources' some in a very non-local way:

./objc.dg/foreach-2.m: /* { dg-additional-sources "../objc-obj-c++-shared/nsconstantstring-class-impl.m" } */

Joost
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 06:33:27PM +, VandeVondele Joost wrote:
> > And these Fortran inter-test dependencies, which Tobias told me is
> > PR56408.
>
> > For PR56408 we need some fix.
>
> BTW, is there anything special about Fortran ? There are at least 180 test
> files that contain 'dg-additional-sources' some in a very non-local way:
>
> ./objc.dg/foreach-2.m: /* { dg-additional-sources
> "../objc-obj-c++-shared/nsconstantstring-class-impl.m" } */

dg-additional-sources is not a problem, that is the solution if you need Fortran modules inter-TU and can do a link test, see e.g. libgomp/testsuite/libgomp.fortran/declare-simd-{2,3}.f90 for one way how to do that. With dg-additional-sources one command line of the compiler driver compiles both source files, it is a single test from dejagnu POV, so necessarily run together.

But there are some tests that want to have other TU modules and want to be dg-do compile only, currently this uses a keep-modules hack which creates inter-test dependencies. So we need something different.

	Jakub
Re: [PATCH] gcc parallel make check
On 11.09.2014 20:33, VandeVondele Joost wrote:
>> For PR56408 we need some fix.
> BTW, is there anything special about Fortran ? There are at least 180 test
> files that contain 'dg-additional-sources' some in a very non-local way:

Well, the question is what you want to do with the different files. If you just want to compile them, e.g. for linking or executing, you are fine. However, with Fortran modules there is a .mod file produced – which gives an ordering constraint: The file with the module has to be compiled first before the other file can be compiled. (If one puts the module into the same file, one has effectively one translation unit and some bugs do not pop up in this case.)

The current scheme reaches its limits in that case, mainly because the file specified in dg-additional-sources is compiled after the one in which this line is written. That can be fine for linking/run-time tests, where one disables the by-itself compilation of the second file and puts the module into the first file. However, as soon as one wants to do more, e.g. dg-error/dg-warning output, checking the dump/assembler etc., one is in trouble. See the files listed in the PR for issues.

By contrast, for C/C++, one has a header file which is included by the preprocessor (hence before the compiler), thus, there is no ordering issue as no compiler input is generated. I don't know whether one could run into issues with precompiled header files, but there one has at least the different name *.h/*.hpp and *.c/.cc. I don't know about Ada or C++'s upcoming ISO-version of precompiled headers ("modules"), maybe there one runs into similar issues?

See the PR for some attempts to fix it.

Tobias
Does -flto give gcc access to data addresses?
Hi,

I'm having trouble based on available docs like https://gcc.gnu.org/onlinedocs/gccint/LTO.html in understanding just what the gcc LTO framework is intended to be architecturally capable of.

As a concrete motivating example, I have a 32K embedded program about 5% of which consists of sequences like

    movhi r2,0
    addi r2,r2,26444
    stw r15,0(r2)

This is on a 32-bit RISC architecture (Nios2) with 16-bit immediate values in instructions where in general a sequence like

    movhi r2,high_half_of_address
    addi r2,r2,low_half_of_address

is required to assemble an arbitrary 32-bit address in registers for use. However, if the high half of the address happens to be zero (which is universally true in this program because code+data fit in 64KB -- forced by hardware constraints), we can collapse

    movhi r2,0
    addi r2,r2,26444
    stw r15,0(r2)

to just

    stw r15,26444(r0)

saving two instructions. (On this architecture R0 is hardwired to zero.)

This seems like a natural peephole optimization at linktime -- *if* data addresses are resolved in some (preliminary?) fashion during linktime code generation.

Is this a plausible optimization to implement in gcc + binutils with the current -flto support architecture? If so, what doc/mechanism/approach/sourcefile should I be studying in order to implement this? If not, is there some other productive way to tickle gcc + binutils here?

Thanks in advance,
-Jeff
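The condition under which the three-instruction sequence can collapse to a single `stw` off r0 comes down to whether the resolved address fits the instruction's immediate field. A minimal sketch of that check follows; it assumes the load/store offset is a signed 16-bit immediate (that detail is an assumption of this sketch, not something stated in the mail), and the helper name fits_simm16 is made up.

```shell
#!/bin/sh
# fits_simm16 VALUE: succeed if VALUE is representable as a signed 16-bit
# immediate, i.e. lies in [-32768, 32767].
# Assumption (not from the mail): the stw offset field is signed 16-bit.
fits_simm16() {
    [ "$1" -ge -32768 ] && [ "$1" -le 32767 ]
}
```

Under that assumption, `fits_simm16 26444` succeeds, so the address from the example could be used directly as an offset from r0. An address of 32768 or above would not pass the check even though it still lies within 64KB, so such addresses would need separate handling.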
Re: [PATCH] gcc parallel make check
On 11 September 2014 20:19:31 Jakub Jelinek wrote:
> On Thu, Sep 11, 2014 at 07:26:37PM +0200, Jakub Jelinek wrote:
> > right now. The patch below intends to serialize the content of the
> > problematic *.exp tests (the first runtest to reach one of those will simply
> > run all the tests from that *.exp file, others will skip it).
>
> Forgotten patch below. BTW, something will probably need to be done about
> acats too, either similar approach or just splitting the chapters into
> little more jobs, because otherwise in make -C check -j48 acats dominated
> the testing time for me.
>
> + if [ -n "$(check_p_subno)" \
> +      -a -n "$$GCC_RUNTEST_PARALLELIZE_DIR" \
> +      -a -f $(TESTSUITEDIR)/$(check_p_tool)-parallel/finished ]; then \

test(1) -a and -o are obsolescent, please chain [] && [] instead.

Thanks -a cheers ;)
Re: [PATCH] gcc parallel make check
On Thu, Sep 11, 2014 at 11:24:08PM +0200, Bernhard Reutner-Fischer wrote: > On 11 September 2014 20:19:31 Jakub Jelinek wrote: > > >On Thu, Sep 11, 2014 at 07:26:37PM +0200, Jakub Jelinek wrote: > >> right now. The patch below intends to serialize the content of the > >> problematic *.exp tests (the first runtest to reach one of those will > >> simply > >> run all the tests from that *.exp file, others will skip it). > > > >Forgotten patch below. BTW, something will probably need to be done about > >acats too, either similar approach or just splitting the chapters into > >little more jobs, because otherwise in make -C check -j48 acats dominated > >the testing time for me. > > > + if [ -n "$(check_p_subno)" \ > + -a -n "$$GCC_RUNTEST_PARALLELIZE_DIR" \ > + -a -f $(TESTSUITEDIR)/$(check_p_tool)-parallel/finished ]; then \ > > test(1) -a and -o are obsolescent, please chain [] && [] instead. That is news to me, but given the amount of test -a/-o uses e.g. in gcc/configure and hundreds of places, I'd say what we care is what is more portable to old shells. Jakub
gcc-4.8-20140911 is now available
Snapshot gcc-4.8-20140911 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140911/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 215195

You'll find:

 gcc-4.8-20140911.tar.bz2             Complete GCC

  MD5=c6dd3f9fb89705beed0cb41ac9b5e8e6
  SHA1=a2de3284d0a1f6ab8dd1e7f2a8f6c44806dc4dce

Diffs from 4.8-20140904 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: [PATCH] gcc parallel make check
On Sep 11, 2014, at 3:15 PM, Jakub Jelinek wrote: > That is news to me, but given the amount of test -a/-o uses e.g. in > gcc/configure and hundreds of places, I'd say what we care is what is more > portable to old shells. No, we can’t care about that. If that were true, the _ && _ in the compiler source would have been fixed. Since it has not, then trivially it is portable enough. One day someone will come along and fixup the -a and -o instances for us. && should be preferred.
How to access the virtual table?
I am trying to access the virtual table. My pass is hooked after pass_ipa_pta. Consider a class A which contains a virtual function. An object created as: A a; is translated in GIMPLE as struct A a; From variable "a" we can get its type, which is "struct A". I tried to see how the dump_vtable function accesses the table. To access the virtual table for class A, below is the code which I tried. tree t = TREE_TYPE (a); tree binfo = TYPE_BINFO (t); tree vtab = BINFO_VTABLE (binfo); FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (DECL_INITIAL (vtab)), ix, value) Using the above iterator, entries of the virtual table should be accessible. However, I am not able to do so. Kindly let me know how to access the virtual table, and also how to fetch the functions from that virtual table. Regards, Swati
[RFC] Dealing with ODR violations in GCC
Hi,
I went through the exercise of running LTO bootstrap with ODR verification on. There are some typename clashes I guess we want to fix. I wonder what approach is preferred; do we want to introduce anonymous namespaces for those?

Honza

../../gcc/tlink.c:62:16: warning: type ‘struct file_hash_entry’ violates one definition rule [-Wodr]
 typedef struct file_hash_entry
                ^
../../libcpp/files.c:143:8: note: a different type is defined in another translation unit
 struct file_hash_entry
        ^
../../gcc/tlink.c:64:15: note: the first difference of corresponding definitions is field ‘key’
 const char *key;
             ^
../../libcpp/files.c:145:27: note: a field with different name is defined in another translation unit
 struct file_hash_entry *next;
                         ^
../../gcc/ipa-devirt.c:2674:0: warning: type ‘struct type_change_info’ violates one definition rule [-Wodr]
 struct type_change_info
 ^
../../gcc/ipa-prop.c:595:0: note: a different type is defined in another translation unit
 struct type_change_info
 ^
../../gcc/ipa-devirt.c:2681:0: note: the first difference of corresponding definitions is field ‘instance’
 tree instance;
 ^
../../gcc/ipa-prop.c:602:0: note: a field with different name is defined in another translation unit
 tree object;
 ^
../../gcc/gcse.c:294:0: warning: type ‘struct occr’ violates one definition rule [-Wodr]
 struct occr
 ^
../../gcc/postreload-gcse.c:160:0: note: a different type is defined in another translation unit
 struct occr
 ^
../../gcc/gcse.c:297:0: note: the first difference of corresponding definitions is field ‘next’
 struct occr *next;
 ^
../../gcc/postreload-gcse.c:163:0: note: a field of same name but different type is defined in another translation unit
 struct occr *next;
 ^
../../gcc/gcse.c:259:0: warning: type ‘struct expr’ violates one definition rule [-Wodr]
 struct expr
 ^
../../gcc/postreload-gcse.c:92:0: note: a different type is defined in another translation unit
 struct expr
 ^
../../gcc/gcse.c:264:0: note: the first difference of corresponding definitions is field ‘bitmap_index’
 int bitmap_index;
 ^
../../gcc/postreload-gcse.c:98:0: note: a field with different name is defined in another translation unit
 hashval_t hash;
 ^
../../gcc/predict.c:2499:0: warning: type ‘struct block_info_def’ violates one definition rule [-Wodr]
 typedef struct block_info_def
 ^
../../gcc/reg-stack.c:208:0: note: a different type is defined in another translation unit
 typedef struct block_info_def
 ^
../../gcc/predict.c:2502:0: note: the first difference of corresponding definitions is field ‘frequency’
 sreal frequency;
 ^
../../gcc/reg-stack.c:210:0: note: a field with different name is defined in another translation unit
 struct stack_def stack_in; /* Input stack configuration. */
 ^
../../gcc/lra-eliminations.c:80:0: warning: type ‘struct elim_table’ violates one definition rule [-Wodr]
 struct elim_table
 ^
../../gcc/reload1.c:264:0: note: a different type is defined in another translation unit
 struct elim_table
 ^
../../gcc/lra-eliminations.c:88:0: note: the first difference of corresponding definitions is field ‘previous_offset’
 HOST_WIDE_INT previous_offset;
 ^
../../gcc/reload1.c:268:0: note: a field with different name is defined in another translation unit
 HOST_WIDE_INT initial_offset; /* Initial difference between values. */
 ^
../../gcc/tree-ssa-ccp.c:169:0: warning: type ‘struct prop_value_d’ violates one definition rule [-Wodr]
 struct prop_value_d {
 ^
../../gcc/tree-ssa-copy.c:79:0: note: a different type is defined in another translation unit
 struct prop_value_d {
 ^
../../gcc/tree-ssa-ccp.c:171:0: note: the first difference of corresponding definitions is field ‘lattice_val’
 ccp_lattice_t lattice_val;
 ^
../../gcc/tree-ssa-copy.c:81:0: note: a field with different name is defined in another translation unit
 tree value;
 ^
../../gcc/profile.h:26:0: warning: type ‘struct edge_info’ violates one definition rule [-Wodr]
 struct edge_info
 ^
../../gcc/tree-ssa-dom.c:113:0: note: a different type is defined in another translation unit
 struct edge_info
 ^
../../gcc/profile.h:28:0: note: the first difference of corresponding definitions is field ‘count_valid’
 unsigned int count_valid:1;
 ^
../../gcc/tree-ssa-dom.c:117:0: note: a field with different name is defined in another translation unit
 tree lhs;
 ^
../../gcc/tree-ssa-loop-im.c:119:0: warning: type ‘struct mem_ref’ violates one definition rule [-Wodr]
 typedef struct mem_ref
 ^
../../gcc/tree-ssa-loop-prefetch.c:271:0: note: a different type is defined in another translation unit
 struct mem_ref
 ^
../../gcc/tree-ssa-loop-im.c:121:0: note: the first difference of corresponding definitions is field ‘id’
 unsigned id; /* ID assigned to the memory reference
 ^
../../gcc/tree-ssa-loop-prefetch.c:273:0: note: a field with different name is defined in another translation unit
 gimple
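For readers unfamiliar with the anonymous-namespace option raised above, here is a minimal C++ sketch of what one of the clashing files could look like after the change. It is an illustration only: the struct name occr is taken from the warnings, but the fields and both helper functions are hypothetical and not copied from GCC.

```cpp
#include <cassert>
#include <cstddef>

// Imagine this is the body of one translation unit (for example gcse.c).
// Names declared in an unnamed namespace are local to the translation unit,
// so another file may define its own, different "struct occr" without the
// two definitions being treated as the same type.
namespace {

// Hypothetical layout, for illustration only.
struct occr {
    occr *next;    // next occurrence in the list
    int insn_uid;  // made-up payload field
};

// Prepend node to the list starting at head and return the new head.
occr *occr_push(occr *head, occr *node) {
    assert(node != nullptr);
    node->next = head;
    return node;
}

// Count the entries of a singly linked occurrence list.
std::size_t occr_count(const occr *head) {
    std::size_t n = 0;
    for (const occr *p = head; p != nullptr; p = p->next)
        ++n;
    return n;
}

}  // namespace
```

The alternative is simply renaming one of each clashing pair (say, a file-specific prefix), which keeps the types visible to debuggers under distinct names. Which of the two is preferred is the question the RFC leaves open.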
RE: [PATCH] gcc parallel make check
>>> >For PR56408 we need some fix. >> BTW, is there anything special about Fortran? There are at least 180 test >> files that contain 'dg-additional-sources' >some in a very non-local way: >The current scheme comes at its limits in that case. See the files listed in >the PR for issues. So, what about a pragmatic solution: move the tests that rely on being serialized to a subdirectory serialized/ where, as we do now, we rely on the implicit ordering? At least it makes this assumption somewhat explicit. Joost
Re: How to access the virtual table?
> > I am trying to access the virtual table. > My pass is hooked after pass_ipa_pta. > > Consider Class A which contains virtual function. > An object created as : > A a; > is translated in GIMPLE as > struct A a; > > From variable "a" we can get its type which is "struct A". > I tried to see how the dump_vtable function access the table. > > To access the virtual table for class A, Below is the code which I tried. > > tree t = TREE_TYPE (a); > tree binfo = TYPE_BINFO (t); > tree vtab = BINFO_VTABLE (binfo); > > FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (DECL_INITIAL > (vtab)), ix, value) > > Using the above iterator, entries of the virtual table should be accessible. > However, I am not able to do so. You can check dump_tree (vtab) to see whether you are looking at a VAR_DECL. BINFO_VTABLE is usually a POINTER_PLUS_EXPR, so in that case you want DECL_INITIAL of the vtable VAR_DECL underneath it: TREE_OPERAND (vtab, 0) is normally an ADDR_EXPR whose operand 0 is that VAR_DECL. If you want to access a virtual method with a given token (position in the vtable), you can just use gimple_get_virt_method_for_binfo (or borrow code from there for the things you need). Honza > > Kindly inform me the way to access the virtual table. > Also how to fetch the functions from that virtual table? > > Regards, > Swati