Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.
On Fri, Feb 13, 2015 at 5:06 PM, Richard Sandiford wrote:
> Segher Boessenkool writes:
>> On Thu, Feb 12, 2015 at 03:54:21PM +, Richard Sandiford wrote:
>>> "Hale Wang" writes:
>>> > Ping?
>>
>> It's not a regression (or is it?), so it is not appropriate for stage4.
>>
>>
>>> >> diff --git a/gcc/combine.c b/gcc/combine.c
>>> >> index 5c763b4..6901ac2 100644
>>> >> --- a/gcc/combine.c
>>> >> +++ b/gcc/combine.c
>>> >> @@ -1904,6 +1904,12 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3,
>>> >> rtx_insn *pred ATTRIBUTE_UNUSED,
>>> >>    set = expand_field_assignment (set);
>>> >>    src = SET_SRC (set), dest = SET_DEST (set);
>>> >>
>>> >> +  /* Don't combine if dest contains a user specified register, because the
>>> >> +     user specified register (same with dest) in i3 would be replaced by the
>>> >> +     src of insn which might be different with the user's expectation.  */
>>> >> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest))
>>> >> +    return 0;
>>>
>>> I suppose this is similar to Andrew's comment, but I think the rule
>>> is that it's invalid to replace a REG_USERVAR_P operand in an inline asm.
>>
>> Why not?  You probably mean register asm, not all user variables?
>
> Yeah, meant hard REG_USERVAR_P, sorry, as for the patch.
>
>>> Outside of an inline asm we make no guarantee about whether something is
>>> stored in a particular register or not.
>>>
>>> So IMO we should be checking whether either INSN or I3 is an asm as well
>>> as the above.
>>
>> [ INSN can never be an asm, that is already refused by can_combine_p. ]
>>
>> We do not guarantee things will end up in the specified reg (except for asm),
>> but will it hurt to leave things in the reg the user said it should be in,
>> even if we do not guarantee this behaviour?
>
> Whether it does or not, making the test unnecessarily wide is at best only
> going to paper over problems elsewhere.  I really think we should test
> for i3 being an asm.
>
> Thanks,
> Richard

Thanks for reviewing.  Hale wants me to continue his work because he will
be on holiday for the next ten days.  The check for asm has been added.
Is this one OK?

BR,
Terry

Attachment: pr64818-combine-user-specified-register.patch-3 (binary data)
Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.
Hi Terry,

I still think this is stage1 material.

> +  /* Don't combine if dest contains a user specified register and i3 contains
> +     ASM_OPERANDS, because the user specified register (same with dest) in i3
> +     would be replaced by the src of insn which might be different with
> +     the user's expectation.  */

"Do not eliminate a register asm in an asm input" or similar?  Text
explaining why REG_USERVAR_P && HARD_REGISTER_P works here would be
good to have, too.

> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
> +      && (GET_CODE (PATTERN (i3)) == SET
> +          && GET_CODE (SET_SRC (PATTERN (i3))) == ASM_OPERANDS))
> +    return 0;

That works only for asms with exactly one output.  You want
extract_asm_operands.

Segher
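The "exactly one output" remark can be illustrated with a toy model. Everything below is a hypothetical stand-in written for this note, not GCC's rtx representation: it only assumes that an asm with one output appears as a single SET whose source is the ASM_OPERANDS, while an asm with several outputs is wrapped in a PARALLEL of such SETs. A check that looks only at a top-level SET therefore misses the second shape, which is the gap a helper such as extract_asm_operands is meant to close.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_SET, TOY_PARALLEL, TOY_ASM_OPERANDS, TOY_REG };

struct toy_rtx {
    enum toy_code code;
    const struct toy_rtx *src;        /* TOY_SET: the source operand */
    const struct toy_rtx *const *vec; /* TOY_PARALLEL: the elements */
    size_t veclen;                    /* TOY_PARALLEL: element count */
};

/* Mirrors the shape of the check in the patch: only a top-level SET
   whose source is an ASM_OPERANDS is recognised.  */
int toy_is_single_set_asm(const struct toy_rtx *pat)
{
    return pat->code == TOY_SET
           && pat->src != NULL
           && pat->src->code == TOY_ASM_OPERANDS;
}

/* Broader check: also accepts a bare ASM_OPERANDS and looks inside
   a PARALLEL, so multi-output asms are recognised too.  */
int toy_contains_asm_operands(const struct toy_rtx *pat)
{
    size_t i;

    assert(pat != NULL);
    switch (pat->code) {
    case TOY_ASM_OPERANDS:
        return 1;
    case TOY_SET:
        return pat->src != NULL && pat->src->code == TOY_ASM_OPERANDS;
    case TOY_PARALLEL:
        for (i = 0; i < pat->veclen; i++)
            if (toy_contains_asm_operands(pat->vec[i]))
                return 1;
        return 0;
    default:
        return 0;
    }
}

/* Sample patterns: one asm with a single output, one with two.  */
static const struct toy_rtx toy_asm = { TOY_ASM_OPERANDS, NULL, NULL, 0 };
static const struct toy_rtx toy_set_a = { TOY_SET, &toy_asm, NULL, 0 };
static const struct toy_rtx toy_set_b = { TOY_SET, &toy_asm, NULL, 0 };
static const struct toy_rtx *const toy_two_sets[] = { &toy_set_a, &toy_set_b };

const struct toy_rtx toy_asm_one_output = { TOY_SET, &toy_asm, NULL, 0 };
const struct toy_rtx toy_asm_two_outputs = { TOY_PARALLEL, NULL, toy_two_sets, 2 };
```

With these samples, `toy_is_single_set_asm` accepts `toy_asm_one_output` but rejects `toy_asm_two_outputs`, whereas `toy_contains_asm_operands` accepts both.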
[PATCH] PR target/65064: Return false for COMMON symbols
Hi,

r220674 exposed a bug in ia64_in_small_data_p.  After r220674, COMMON
symbols bind locally for executables, but ia64_in_small_data_p returns
true for COMMON symbols, which are never in the small data section.
This patch fixes it.  OK for trunk?

H.J.

Since COMMON symbols are never in the small data section,
ia64_in_small_data_p should return false for COMMON symbols.

        PR target/65064
        * config/ia64/ia64.c (ia64_in_small_data_p): Return false for
        COMMON symbols.
---
 gcc/config/ia64/ia64.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 6ef22d9..3687289 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -9941,6 +9941,10 @@ ia64_in_small_data_p (const_tree exp)
   if (TARGET_NO_SDATA)
     return false;
 
+  /* COMMON symbols are never small data.  */
+  if (DECL_COMMON (exp))
+    return false;
+
   /* We want to merge strings, so we never consider them small data.  */
   if (TREE_CODE (exp) == STRING_CST)
     return false;
-- 
2.1.0
[committed] Improve handling of reloads for floating point loads and stores on PA
The attached patch fixes an ICE building the code-saturne package and
generally improves the code generated for floating point loads and stores.

With the previous implementation, it was not possible to load a LO_SUM
DLT address to a floating point register (e.g., for an integer
multiplication) when generating PIC code.  While this is unusual, it
appears perfectly valid.  We now request a secondary reload with a
general scratch register and pa_emit_move_sequence handles the rest.

Tested on hppa2.0w-hp-hpux11.11, hppa64-hp-hpux11.11 and
hppa-unknown-linux-gnu.  Committed to trunk and 4.9 branch.

Dave
--
John David Anglin  dave.ang...@bell.net

2015-02-15  John David Anglin

        * config/pa/pa.c (pa_secondary_reload): Request a secondary reload
        for all floating point loads and stores except those using a register
        index address.
        * config/pa/pa.md: Add new patterns to load a lo_sum DLT operand
        to a register.

Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c      (revision 220681)
+++ config/pa/pa.c      (working copy)
@@ -6031,18 +6031,15 @@
         {
           x = XEXP (x, 0);
 
-          /* We don't need an intermediate for indexed and LO_SUM DLT
-             memory addresses.  When INT14_OK_STRICT is true, it might
-             appear that we could directly allow register indirect
-             memory addresses.  However, this doesn't work because we
-             don't support SUBREGs in floating-point register copies
-             and reload doesn't tell us when it's going to use a SUBREG.  */
-          if (IS_INDEX_ADDR_P (x)
-              || IS_LO_SUM_DLT_ADDR_P (x))
+          /* We don't need a secondary reload for indexed memory addresses.
+
+             When INT14_OK_STRICT is true, it might appear that we could
+             directly allow register indirect memory addresses.  However,
+             this doesn't work because we don't support SUBREGs in
+             floating-point register copies and reload doesn't tell us
+             when it's going to use a SUBREG.  */
+          if (IS_INDEX_ADDR_P (x))
             return NO_REGS;
-
-          /* Request intermediate general register.  */
-          return GENERAL_REGS;
         }
 
       /* Request a secondary reload with a general scratch register
Index: config/pa/pa.md
===================================================================
--- config/pa/pa.md     (revision 220681)
+++ config/pa/pa.md     (working copy)
@@ -2673,6 +2673,29 @@
   [(set_attr "type" "binary")
    (set_attr "length" "4")])
 
+(define_insn ""
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (lo_sum:SI (match_operand:SI 1 "register_operand" "r")
+                   (unspec:SI [(match_operand 2 "" "")] UNSPEC_DLTIND14R)))]
+  "symbolic_operand (operands[2], Pmode)
+   && ! function_label_operand (operands[2], Pmode)
+   && flag_pic"
+  "ldo RT'%G2(%1),%0"
+  [(set_attr "type" "binary")
+   (set_attr "length" "4")])
+
+(define_insn ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (lo_sum:DI (match_operand:DI 1 "register_operand" "r")
+                   (unspec:DI [(match_operand 2 "" "")] UNSPEC_DLTIND14R)))]
+  "symbolic_operand (operands[2], Pmode)
+   && ! function_label_operand (operands[2], Pmode)
+   && TARGET_64BIT
+   && flag_pic"
+  "ldo RT'%G2(%1),%0"
+  [(set_attr "type" "binary")
+   (set_attr "length" "4")])
+
 ;; Always use addil rather than ldil;add sequences.  This allows the
 ;; HP linker to eliminate the dp relocation if the symbolic operand
 ;; lives in the TEXT space.
Re: patch to fix rtl documentation for new floating point comparisons
On 02/14/2015 03:26 PM, Paolo Bonzini wrote:
> On 10/02/2015 22:46, Joseph Myers wrote:
>> It may make sense to define LTGT as exactly !UNEQ, and so quiet, but the
>> choice of definition is a matter of what's convenient for the
>> implementation (and which choice you make determines which existing code
>> in GCC should be considered incorrect).
>
> It would be different from e.g. !UNLT and GE differing only in that UNLT
> is quiet and GE is signaling.  So it makes sense to me to keep LTGT as
> signaling.

While in theory your argument is correct, in practice this is not how
people use this stuff, and so I disagree.

The interrupts are there to allow legacy code that does not know about
NaNs to be "run" in a mode where the NaNs can be used to signal that
things did not run well.  Properly written floating-point-aware code
never uses the interrupts.  In that code, you want a rich set of
comparisons which allow the programmer to efficiently deal with the fact
that any comparison can go one of 4 possible ways.

To support legacy code we need ne and eq to be quiet and lt gt le ge to
be noisy.  This is what the standards call for and this is what GCC
delivers.  Going beyond that for legacy code is a waste of time.

kenny

> Paolo
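The "4 possible ways" a comparison can go (less, equal, greater, unordered) can be seen at the C source level with the quiet comparison macros from C99 `<math.h>`. This is only a source-level sketch: the function and enum names are made up for illustration, and it says nothing about how the RTL codes LTGT or UNEQ should behave with respect to exceptions, which is the point under discussion in the thread.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical names for the four outcomes of comparing x with y.  */
enum relation { REL_LESS, REL_EQUAL, REL_GREATER, REL_UNORDERED };

/* Classify x versus y using only the quiet C99 comparison macros,
   so a quiet NaN operand does not raise the invalid exception.  */
enum relation classify(double x, double y)
{
    if (isunordered(x, y))
        return REL_UNORDERED;
    if (isless(x, y))
        return REL_LESS;
    if (isgreater(x, y))
        return REL_GREATER;
    return REL_EQUAL;
}

/* "Less or greater": true only when the operands are ordered and
   unequal.  Note that this differs from x != y, which is also true
   when either operand is a NaN.  */
int quiet_ltgt(double x, double y)
{
    return islessgreater(x, y) != 0;
}

/* "Unordered or equal": the logical negation of quiet_ltgt.  */
int quiet_uneq(double x, double y)
{
    return isunordered(x, y) || x == y;
}
```

For every pair of inputs exactly one of `quiet_ltgt` and `quiet_uneq` is true, which matches the "LTGT as exactly !UNEQ" reading quoted above.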
Re: [PATCH, PR tree-optimization/65002] Disable SRA for functions wrongly marked as read-only
On Feb 13, 2015, at 11:25 AM, Jakub Jelinek wrote:
>>> 2015-02-12  Ilya Enkovich
>>>
>>>     PR tree-optimization/65002
>>>     * gcc.dg/pr65002.C: New.
>>
>> This test should have gone into g++.dg.
>
> Into g++.dg/opt or g++.dg/ipa in particular.

Pre-approved if someone wants to svn mv it.
[Patch, fortran] PR60898 premature release of entry symbols
Hello,

I propose a fix for PR60898, where a symbol is freed despite remaining
reachable in the symbol tree.  The problem comes from this code in
resolve_symbol:

>       /* If we find that a flavorless symbol is an interface in one of the
>          parent namespaces, find its symtree in this namespace, free the
>          symbol and set the symtree to point to the interface symbol.  */
>       for (ns = gfc_current_ns->parent; ns; ns = ns->parent)
>         {
>           symtree = gfc_find_symtree (ns->sym_root, sym->name);
>           if (symtree && [...])
>             {
>               this_symtree = gfc_find_symtree (gfc_current_ns->sym_root,
>                                                sym->name);
>               gfc_release_symbol (sym);
>               symtree->n.sym->refs++;
>               this_symtree->n.sym = symtree->n.sym;
>               return;
>             }
>         }

Here, the target of an element of the current namespace's name tree is
changed to point to the outer symbol, and the current symbol is freed
without checking that it really was what was in the name tree before.

In the testcase https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60898#c7 ,
the problematic symbol is an entry, which is available in the name tree
only through a mangled name (created by gfc_get_unique_symtree in
get_proc_name), so gfc_find_symtree won't find it by name lookup.  In
this case, what gfc_find_symtree finds is a symbol that is already the
outer interface symbol, so reassigning this_symtree.n.sym would be a
no-op.

The proposed patch checks that sym == this_symtree->n.sym, so that the
symbol reassignment is only made in that case.  Otherwise, the regular
symbol resolution happens normally.

This patch is a stripped down version of what I posted before in the PR,
which contained a symbol.c part which was increasing the reference count
locally in do_traverse_symtree, to delay symbol release until after all
of them have been processed.  That part was useless because if a symbol
had to be processed more than once (meaning it was available under
different names), it will have the corresponding reference count set so
that it won't be freed too early in any case.  Worse, that part was
interacting badly with the hack used to break circular references in
gfc_release_symbol, so it was better left out.

Anyway, this is regression tested[*] on x86_64-unknown-linux-gnu.  OK for
trunk/4.9/4.8?

Mikael

[*] I have a few failing testcases (also without the patch), namely the
following; does this ring a bell?
FAIL: gfortran.dg/erf_3.F90
FAIL: gfortran.dg/fmt_g0_7.f08
FAIL: gfortran.dg/fmt_en.f90
FAIL: gfortran.dg/nan_7.f90
FAIL: gfortran.dg/quad_2.f90
FAIL: gfortran.dg/quad_3.f90
FAIL: gfortran.dg/round_4.f90

2015-02-15  Mikael Morin

        PR fortran/60898
        * resolve.c (resolve_symbol): Check that the symbol found from
        name lookup really is the current symbol being resolved.

2015-02-15  Mikael Morin

        PR fortran/60898
        * gfortran.dg/entry_20.f90: New.

Index: resolve.c
===================================================================
--- resolve.c   (revision 220514)
+++ resolve.c   (working copy)
@@ -13125,10 +13125,13 @@ resolve_symbol (gfc_symbol *sym)
             {
               this_symtree = gfc_find_symtree (gfc_current_ns->sym_root,
                                                sym->name);
-              gfc_release_symbol (sym);
-              symtree->n.sym->refs++;
-              this_symtree->n.sym = symtree->n.sym;
-              return;
+              if (this_symtree->n.sym == sym)
+                {
+                  symtree->n.sym->refs++;
+                  gfc_release_symbol (sym);
+                  this_symtree->n.sym = symtree->n.sym;
+                  return;
+                }
             }
         }
 

! { dg-do compile }
!
! PR fortran/50898
! A symbol was freed prematurely during resolution,
! despite remaining reachable
!
! Original testcase from

MODULE MODULE_pmat2

IMPLICIT NONE

INTERFACE cad1b; MODULE PROCEDURE cad1b; END INTERFACE
INTERFACE csb1b; MODULE PROCEDURE csb1b; END INTERFACE
INTERFACE copbt; MODULE PROCEDURE copbt; END INTERFACE
INTERFACE conbt; MODULE PROCEDURE conbt; END INTERFACE
INTERFACE copmb; MODULE PROCEDURE copmb; END INTERFACE
INTERFACE conmb; MODULE PROCEDURE conmb; END INTERFACE
INTERFACE copbm; MODULE PROCEDURE copbm; END INTERFACE
INTERFACE conbm; MODULE PROCEDURE conbm; END INTERFACE
INTERFACE mulvb; MODULE PROCEDURE mulvb; END INTERFACE
INTERFACE madvb; MODULE PROCEDURE madvb; END INTERFACE
INTERFACE msbvb; MODULE PROCEDURE msbvb; END INTERFACE
INTERFACE mulxb; MODULE PROCEDURE mulxb; END INTERFACE
INTERFACE madxb; MODULE PROCEDURE madxb; END INTERFACE
INTERFACE msbxb; MODULE PROCEDURE msbxb; END INTERFACE

integer, parameter :: i_kind=4
integer, parameter :: r_kind=4
real(r_kind), parameter :: zero=0.0
real(r_kind), parameter :: one=1.0
real(r_kind), parameter :: two=2.0

CONTAINS

SUBROUTINE cad1b(a,m1,mah1,mah2,mirror2)
implicit none
INTEGER(i_kind), INTENT(IN ) :: m1,mah1
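The guard added to resolve_symbol in the message above can be pictured with a small reference-counting model. This is a sketch under stated assumptions, not gfortran code: `toy_symbol`, `toy_symtree`, `toy_release` and `toy_redirect_to_outer` are hypothetical names, and the real gfc_release_symbol does considerably more than decrement a counter. The only point shown is that the name-tree entry is redirected, and the old symbol released, only when the entry really holds the symbol being resolved.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; NOT the gfortran data structures.  */
struct toy_symbol { int refs; };
struct toy_symtree { struct toy_symbol *sym; };

/* Allocate a symbol owned by one reference.  */
struct toy_symbol *toy_new_symbol(void)
{
    struct toy_symbol *sym = calloc(1, sizeof *sym);

    if (sym == NULL)
        abort();
    sym->refs = 1;
    return sym;
}

/* Drop one reference; free the symbol when none remain.
   Returns 1 if the symbol was freed, 0 otherwise.  */
int toy_release(struct toy_symbol *sym)
{
    assert(sym->refs > 0);
    if (--sym->refs > 0)
        return 0;
    free(sym);
    return 1;
}

/* Guarded redirection: only when ENTRY really points at SYM do we
   take a reference on OUTER, release SYM and repoint ENTRY.
   Returns 1 if the redirection happened, 0 if it was skipped.  */
int toy_redirect_to_outer(struct toy_symtree *entry,
                          struct toy_symbol *sym,
                          struct toy_symbol *outer)
{
    if (entry->sym != sym)
        return 0;
    outer->refs++;
    toy_release(sym);
    entry->sym = outer;
    return 1;
}
```

In the skipped case the symbol keeps its reference, which is the situation the entry symbol in the PR is in: it stays reachable through its mangled name, so it must not be released here.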
Re: [Patch, fortran] PR60898 premature release of entry symbols
On 02/15/2015 09:48 AM, Mikael Morin wrote:
> [*] I have a few failing testcases (also without the patch), namely the
> following; does this ring a bell ?
> FAIL: gfortran.dg/erf_3.F90
> FAIL: gfortran.dg/fmt_g0_7.f08
> FAIL: gfortran.dg/fmt_en.f90
> FAIL: gfortran.dg/nan_7.f90
> FAIL: gfortran.dg/quad_2.f90
> FAIL: gfortran.dg/quad_3.f90
> FAIL: gfortran.dg/round_4.f90

fmt_g0_7.f08 is a new test that should be passing on x86-64 unless you
have not updated scanner.c.  Are these fails on x86-64?  I do not see
them here on mine.

Jerry
Re: [Patch, fortran] PR60898 premature release of entry symbols
Dear Mikael,

I have regstrapped revision r220715 with your patch.  It fixes the tests
in PR60898 without regression.

> [*] I have a few failing testcases (also without the patch), namely the
> following; does this ring a bell ?
> FAIL: gfortran.dg/erf_3.F90
> FAIL: gfortran.dg/fmt_g0_7.f08
> FAIL: gfortran.dg/fmt_en.f90
> FAIL: gfortran.dg/nan_7.f90
> FAIL: gfortran.dg/quad_2.f90
> FAIL: gfortran.dg/quad_3.f90
> FAIL: gfortran.dg/round_4.f90

I don't see these failures on x86_64-apple-darwin14:

Native configuration is x86_64-apple-darwin14.1.0

                === gfortran tests ===

Running target unix/-m32
FAIL: gfortran.dg/bind_c_vars.f90 -g -flto (test for excess errors)

                === gfortran Summary for unix/-m32 ===

# of expected passes            52071
# of unexpected failures        1
# of expected failures          81
# of unsupported tests          241

Running target unix/-m64
FAIL: gfortran.dg/bind_c_vars.f90 -g -flto (test for excess errors)

                === gfortran Summary for unix/-m64 ===

# of expected passes            52394
# of unexpected failures        1
# of expected failures          81
# of unsupported tests          85

                === gfortran Summary ===

# of expected passes            104465
# of unexpected failures        2
# of expected failures          162
# of unsupported tests          326
/opt/gcc/p_build/gcc/testsuite/gfortran/../../gfortran version 5.0.0 20150215 (experimental) [trunk revision 220715p2a] (GCC)

                === libgomp tests ===

Running target unix/-m32

                === libgomp Summary for unix/-m32 ===

# of expected passes            6231
# of unsupported tests          294

Running target unix/-m64

                === libgomp Summary for unix/-m64 ===

# of expected passes            6231
# of unsupported tests          294

                === libgomp Summary ===

# of expected passes            12462
# of unsupported tests          588

Which platform are you using?  (The gfortran.dg/bind_c_vars.f90 failure
is pr54852.)

Thanks for the patch,

Dominique
Re: [patch, testsuite] Fix ubsan for testing when libstdc++ isn't installed
On Feb 13, 2015, at 2:24 PM, Jack Howarth wrote:
> Mike and FX,
>    Shouldn't we also apply…

Ok.

> Author: fxcoudert
> Date: Mon Dec 22 21:57:45 2014
> New Revision: 219035
>
> URL: https://gcc.gnu.org/viewcvs?rev=219035&root=gcc&view=rev
> Log:
>     * lib/ubsan-dg.exp: Add library path for libstdc++.
>
> Modified:
>    trunk/gcc/testsuite/ChangeLog
>    trunk/gcc/testsuite/lib/ubsan-dg.exp
>
> to gcc-4_9-branch for 4.9.3?
>
> https://gcc.gnu.org/ml/gcc-testresults/2015-02/msg01535.html
>
> shows that we need it.
>           Jack
>
> On Mon, Dec 22, 2014 at 12:02 PM, Mike Stump wrote:
>> On Dec 20, 2014, at 8:58 AM, FX wrote:
>>> This patch below allows ubsan to run when libstdc++ is built but not
>>> installed (something which happens on darwin, in particular). This fixes
>>> all 658 ubsan failures when run in this particular configuration.
>>>
>>> OK to commit?
>>
>> Ok.
Re: Chromium: LTO
Hi,

> +symtab_node::iterate_direct_aliases (unsigned i, ipa_ref *&ref)
> +{
> +  ref_list.referring.iterate (i, &ref);
> +
> +  if (ref && ref->use != IPA_REF_ALIAS)
> +    return NULL;
> +
> +  return ref;
> +}

It seems a little weird that the out arg can return a non-alias, so if
you only want to look at aliases you have to check the return value.

> +
> +/* Return true if list contains an alias.  */
> +
> +inline bool
> +symtab_node::has_aliases_p (void)
> +{
> +  ipa_ref *ref = NULL;
> +  int i;
> +
> +  for (i = 0; iterate_direct_aliases (i, ref); i++)
> +    if (ref->use == IPA_REF_ALIAS)
> +      return true;
> +  return false;
> +}

Can it ever be true that there is an alias in the list but it isn't the
first thing?  The function above suggests not.

> +symtab_node::call_for_symbol_and_aliases (bool (*callback) (symtab_node *,
> +                                                             void *),
> +                                           void *data,
> +                                           bool include_overwritable)
> +{
> +  ipa_ref *ref;
> +
> +  if (callback (this, data))
> +    return true;
> +  if (iterate_direct_aliases (0, ref))

Wouldn't has_aliases_p be a little more clear?

Trev

On Sat, Feb 14, 2015 at 07:44:40PM +0100, Jan Hubicka wrote:
> Hi,
> Martin has noticed that we spend a lot of time in simple cgraph/varpool
> predicates.  The patch below reorganizes inlines so the fast paths get fast.
>
> perf report:
> > 18.79%  lto1-wpa         lto1               [.] do_estimate_growth_1(cgraph_node*, void*)
> > 12.48%  lto1-wpa         lto1               [.] cgraph_node::can_remove_if_no_direct_calls_and_refs_p()
> >  5.86%  lto1-wpa         lto1               [.] symtab_node::used_from_object_file_p_worker(symtab_node*)
> >  5.69%  lto1-wpa         lto1               [.] cgraph_node::call_for_symbol_and_aliases(bool (*)(cgraph_node*, void*), void*, bool)
> >  5.01%  lto1-wpa-stream  lto1               [.] streamer_tree_cache_lookup(streamer_tree_cache_d*, tree_node*, unsigned int*)
> >  4.84%  lto1-wpa         lto1               [.] cgraph_node::call_for_symbol_thunks_and_aliases(bool (*)(cgraph_node*, void*), void*, bool, bool)
> >  2.44%  lto1-wpa         lto1               [.] inflate_fast
> >  2.10%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree(output_block*, DFS::sccs*, tree_node*, bool, bool, bool)
> >  2.03%  lto1-wpa-stream  lto1               [.] linemap_lookup(line_maps*, unsigned int)
> >  1.30%  lto1-wpa-stream  [kernel.kallsyms]  [k] isolate_migratepages_range
> >  1.27%  lto1-wpa         lto1               [.] symtab_node::iterate_direct_aliases(unsigned int, ipa_ref*&)
> >  1.21%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree_body(output_block*, tree_node*, DFS::sccs*, bool, bool)
> >  1.19%  lto1-wpa-stream  lto1               [.] streamer_write_uhwi_stream(lto_output_stream*, unsigned long)
> >  0.85%  lto1-wpa         lto1               [.] compare_tree_sccs_1(tree_node*, tree_node*, tree_node***)
> >  0.83%  lto1-wpa         lto1               [.] streamer_read_uhwi(lto_input_block*)
> >  0.74%  lto1-wpa-stream  lto1               [.] streamer_tree_cache_insert_1(streamer_tree_cache_d*, tree_node*, unsigned int, unsigned int*, bool)
> >  0.72%  lto1-wpa         lto1               [.] ht_lookup_with_hash(ht*, unsigned char const*, unsigned long, unsigned int, ht_lookup_option)
> >  0.70%  lto1-wpa-stream  [kernel.kallsyms]  [k] compaction_alloc
> >  0.68%  lto1-wpa         lto1               [.] unify_scc(streamer_tree_cache_d*, unsigned int, unsigned int, unsigned int, unsigned int)
> >  0.66%  lto1-wpa-stream  lto1               [.] lto_output_tree(output_block*, tree_node*, bool, bool)
> >  0.66%  lto1-wpa-stream  libc-2.19.so       [.] _int_malloc
> >  0.66%  lto1-wpa-stream  lto1               [.] hash_table > default_hashmap_traits>::hash_entry, xcallocator, true>::expand()
> >  0.62%  lto1-wpa-stream  lto1               [.] streamer_write_tree_bitfields(output_block*, tree_node*)
> >  0.61%  lto1-wpa         lto1               [.] streamer_read_tree_bitfields(lto_input_block*, data_in*, tree_node*)
> >  0.53%  lto1-wpa         lto1               [.] lto_cgraph_replace_node(cgraph_node*, cgraph_node*)
> >  0.49%  lto1-wpa         [kernel.kallsyms]  [k] copy_pte_range
>
> Bootstrapped/regtested x86_64-linux, committed.
>
> Honza
>
>       * ipa-chkp.c: Use iterate_direct_aliases.
>       * symtab.c (resolution_used_from_other_file_p): Move inline.
>       (symtab_node::create_reference): Fix formatting.
>       (symtab_node::has_aliases_p): Move inline; use iterate_direct_aliases.
>       (symtab_node::iterate_reference): Move inline.
>       (symta
[PATCH, fixincludes] Fix PR 48009 53348
The stdlib.h header in AIX 4.3 does not correctly declare strtof with
a const char* argument.  Users are building the latest releases of GCC
on AIX 4.3.  The appended patch from Richard G Daniel uses fixincludes
to correct the declaration.

Okay?

Thanks, David

        PR bootstrap/48009
        PR bootstrap/53348
        * inclhack.def (aix_strtof_const): New fix.
        * fixincl.x: Regenerate.
        * tests/base/inttypes.h: New test.

Index: inclhack.def
===================================================================
--- inclhack.def        (revision 220717)
+++ inclhack.def        (working copy)
@@ -842,6 +842,18 @@
 };
 
 /*
+ * stdlib.h on AIX 4.3 declares strtof() with a non-const first argument.
+ */
+fix = {
+    hackname  = aix_strtof_const;
+    files     = stdlib.h;
+    select    = "((extern[ \t]+)?float[ \t]+strtof)\\(char \\*, char \\*\\*\\);";
+    c_fix     = format;
+    c_fix_arg = "%1(const char *, char **);";
+    test_text = "extern float strtof(char *, char **);";
+};
+
+/*
  * sys/machine.h on AIX 4.3.3 puts whitespace between a \ and a newline
  * in an otherwise harmless (and #ifed out) macro definition
  */
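What the `select` / `c_fix_arg` pair in the fix above is meant to do can be tried outside fixincludes with POSIX `regcomp`. This is a minimal sketch, assuming the "format" fix replaces the matched text with the format string and substitutes `%1` with the first parenthesised group; `fix_strtof_decl` is a hypothetical helper written for this note and is not part of fixincludes.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite one line the way the aix_strtof_const
   fix is expected to.  Returns 1 and fills OUT when the line matched,
   0 when it did not (OUT is then left untouched).  */
int fix_strtof_decl(const char *line, char *out, size_t outsz)
{
    /* Same extended regex as the "select" entry, written as a C string.  */
    static const char pattern[] =
        "((extern[ \t]+)?float[ \t]+strtof)\\(char \\*, char \\*\\*\\);";
    regex_t re;
    regmatch_t m[2];
    int rewritten = 0;

    assert(line != NULL && out != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, line, 2, m, 0) == 0) {
        int prefix_len = (int) m[0].rm_so;
        int group_len = (int) (m[1].rm_eo - m[1].rm_so);

        /* Text before the match, then group 1 followed by the new
           parameter list, then whatever followed the match.  */
        snprintf(out, outsz, "%.*s%.*s(const char *, char **);%s",
                 prefix_len, line,
                 group_len, line + m[1].rm_so,
                 line + m[0].rm_eo);
        rewritten = 1;
    }

    regfree(&re);
    return rewritten;
}
```

A declaration that already takes `const char *` does not match the pattern, so a correct header would be left alone.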
[4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
Hi,

This is a backport of the patch for PR middle-end/53623 plus all bug
fixes caused by it.  Tested on Linux/x86-32, Linux/x86-64 and x32.  OK
for 4.8 branch?

Thanks.

H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 469ee31..44bf322 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,82 @@
+2015-02-15  H.J. Lu
+
+        Backport from mainline:
+        2014-06-13  Jeff Law
+
+        PR rtl-optimization/61094
+        PR rtl-optimization/61446
+        * ree.c (combine_reaching_defs): Get the mode for the copy from
+        the extension insn rather than the defining insn.
+
+        2014-06-02  Jeff Law
+
+        PR rtl-optimization/61094
+        * ree.c (combine_reaching_defs): Do not reextend an insn if it
+        was marked as do_not_reextend.  If a copy is needed to eliminate
+        an extension, then mark it as do_not_reextend.
+
+        2014-02-14  Jeff Law
+
+        PR rtl-optimization/60131
+        * ree.c (get_extended_src_reg): New function.
+        (combine_reaching_defs): Use it rather than assuming location
+        of REG.
+        (find_and_remove_re): Verify first operand of extension is
+        a REG before adding the insns to the copy list.
+
+        2014-01-17  Jeff Law
+
+        * ree.c (combine_set_extension): Temporarily disable test for
+        changing number of hard registers.
+
+        2014-01-15  Jeff Law
+
+        PR tree-optimization/59747
+        * ree.c (find_and_remove_re): Properly handle case where a second
+        eliminated extension requires widening a copy created for elimination
+        of a prior extension.
+        (combine_set_extension): Ensure that the number of hard regs needed
+        for a destination register does not change when we widen it.
+
+        2014-01-10  Jeff Law
+
+        PR middle-end/59743
+        * ree.c (combine_reaching_defs): Ensure the defining statement
+        occurs before the extension when optimizing extensions with
+        different source and destination hard registers.
+
+        2014-01-10  Jakub Jelinek
+
+        PR rtl-optimization/59754
+        * ree.c (combine_reaching_defs): Disallow !SCALAR_INT_MODE_P
+        modes in the REGNO != REGNO case.
+
+        2014-01-08  Jeff Law
+
+        * ree.c (get_sub_rtx): New function, extracted from...
+        (merge_def_and_ext): Here.
+        (combine_reaching_defs): Use get_sub_rtx.
+
+        2014-01-07  Jeff Law
+
+        PR middle-end/53623
+        * ree.c (combine_set_extension): Handle case where source
+        and destination registers in an extension insn are different.
+        (combine_reaching_defs): Allow source and destination
+        registers in extension to be different under limited
+        circumstances.
+        (add_removable_extension): Remove restriction that the
+        source and destination registers in the extension are the
+        same.
+        (find_and_remove_re): Emit a copy from the extension's
+        destination to its source after the defining insn if
+        the source and destination registers are different.
+
+        2013-12-12  Jeff Law
+
+        * i386.md (simple LEA peephole2): Add missing mode to zero_extend
+        for zero-extended MULT simple LEA pattern.
+
 2015-02-12  Jakub Jelinek
 
         Backported from mainline
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 372ae63..aabd6ec 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17265,7 +17265,7 @@
    && REGNO (operands[0]) == REGNO (operands[1])
    && peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0)
-                   (zero_extend (ashift:SI (match_dup 1) (match_dup 2))))
+                   (zero_extend:DI (ashift:SI (match_dup 1) (match_dup 2))))
               (clobber (reg:CC FLAGS_REG))])]
   "operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2])));")
 
diff --git a/gcc/ree.c b/gcc/ree.c
index c7e106f..bc566ad 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -327,8 +327,30 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
   enum machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set));
-  rtx new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
   rtx new_set;
+  rtx cand_pat = PATTERN (cand->insn);
+
+  /* If the extension's source/destination registers are not the same
+     then we need to change the original load to reference the destination
+     of the extension.  Then we need to emit a copy from that destination
+     to the original destination of the load.  */
+  rtx new_reg;
+  bool copy_needed
+    = (REGNO (SET_DEST (cand_pat)) != REGNO (XEXP (SET_SRC (cand_pat), 0)));
+  if (copy_needed)
+    new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (cand_pat)));
+  else
+    new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
+
+#if 0
+  /* Rethinking test.  Temporarily disabled.  */
+  /* We're going to be widening the result of DEF_INSN, ensure that doing so
+     doesn't change the number of hard regi
Re: [PATCH, fixincludes] Fix PR 48009 53348
Looks good to me.

On Sun, Feb 15, 2015 at 12:49 PM, David Edelsohn wrote:
> The stdlib.h header in AIX 4.3 does not correctly declare strtof with
> a const char* argument.  Users are building the latest releases of GCC
> on AIX 4.3 The appended patch from Richard G Daniel uses fixincludes
> to correct the declaration.
>
> Okay?
>
> Thanks, David
>
>         PR bootstrap/48009
>         PR bootstrap/53348
>         * inclhack.def (aix_strtof_const): New fix.
>         * fixincl.x: Regenerate.
>         * tests/base/inttypes.h: New test.
>
> Index: inclhack.def
> ===================================================================
> --- inclhack.def        (revision 220717)
> +++ inclhack.def        (working copy)
> @@ -842,6 +842,18 @@
>  };
>
>  /*
> + * stdlib.h on AIX 4.3 declares strtof() with a non-const first argument.
> + */
> +fix = {
> +    hackname  = aix_strtof_const;
> +    files     = stdlib.h;
> +    select    = "((extern[ \t]+)?float[ \t]+strtof)\\(char \\*, char \\*\\*\\);";
> +    c_fix     = format;
> +    c_fix_arg = "%1(const char *, char **);";
> +    test_text = "extern float strtof(char *, char **);";
> +};
> +
> +/*
>   * sys/machine.h on AIX 4.3.3 puts whitespace between a \ and a newline
>   * in an otherwise harmless (and #ifed out) macro definition
>   */
Re: Chromium: LTO
> Hi,
>
> > +symtab_node::iterate_direct_aliases (unsigned i, ipa_ref *&ref)
> > +{
> > +  ref_list.referring.iterate (i, &ref);
> > +
> > +  if (ref && ref->use != IPA_REF_ALIAS)
> > +    return NULL;
> > +
> > +  return ref;
> > +}
>
> it seems a little weird the out arg can return a non alias, and so if
> you only want to look at aliases you have to check the return value.
>
> > +
> > +/* Return true if list contains an alias.  */
> > +
> > +inline bool
> > +symtab_node::has_aliases_p (void)
> > +{
> > +  ipa_ref *ref = NULL;
> > +  int i;
> > +
> > +  for (i = 0; iterate_direct_aliases (i, ref); i++)
> > +    if (ref->use == IPA_REF_ALIAS)
> > +      return true;
> > +  return false;
> > +}
>
> can it ever be true there is an alias in the list but it isn't the first
> thing? the function above suggests not.

Yeah, you are right here; I was not very careful when simplifying the
function.

>
> > +symtab_node::call_for_symbol_and_aliases (bool (*callback) (symtab_node *,
> > +                                                             void *),
> > +                                           void *data,
> > +                                           bool include_overwritable)
> > +{
> > +  ipa_ref *ref;
> > +
> > +  if (callback (this, data))
> > +    return true;
> > +  if (iterate_direct_aliases (0, ref))
>
> wouldn't has_aliases_p be a little more clear?

Indeed.  I committed the following cleanup that also cleans up
get_binfo_at_offset per Martin Jambor's comments.

        * cgraph.h (symtab_node::has_aliases_p): Simplify.
        (symtab_node::call_for_symbol_and_aliases): Use has_aliases_p
        * tree.c (lookup_binfo_at_offset): Make static.
        (get_binfo_at_offset): Do not shadow offset; add explanatory comment.

Index: cgraph.h
===================================================================
--- cgraph.h    (revision 220709)
+++ cgraph.h    (working copy)
@@ -2338,12 +2338,8 @@ inline bool
 symtab_node::has_aliases_p (void)
 {
   ipa_ref *ref = NULL;
-  int i;
 
-  for (i = 0; iterate_direct_aliases (i, ref); i++)
-    if (ref->use == IPA_REF_ALIAS)
-      return true;
-  return false;
+  return (iterate_direct_aliases (0, ref) != NULL);
 }
 
 /* Return true when RESOLUTION indicate that linker will use
@@ -2984,11 +2980,9 @@ symtab_node::call_for_symbol_and_aliases
                                           void *data,
                                           bool include_overwritable)
 {
-  ipa_ref *ref;
-
   if (callback (this, data))
     return true;
-  if (iterate_direct_aliases (0, ref))
+  if (has_aliases_p ())
     return call_for_symbol_and_aliases_1 (callback, data, include_overwritable);
   return false;
 }
@@ -3003,13 +2997,10 @@ cgraph_node::call_for_symbol_and_aliases
                                           void *data,
                                           bool include_overwritable)
 {
-  ipa_ref *ref;
-
   if (callback (this, data))
     return true;
-  if (iterate_direct_aliases (0, ref))
+  if (has_aliases_p ())
     return call_for_symbol_and_aliases_1 (callback, data, include_overwritable);
-
   return false;
 }
 
@@ -3023,13 +3014,10 @@ varpool_node::call_for_symbol_and_aliase
                                            void *data,
                                            bool include_overwritable)
 {
-  ipa_ref *ref;
-
   if (callback (this, data))
     return true;
-  if (iterate_direct_aliases (0, ref))
+  if (has_aliases_p ())
     return call_for_symbol_and_aliases_1 (callback, data, include_overwritable);
-
   return false;
 }
 
Index: tree.c
===================================================================
--- tree.c      (revision 220709)
+++ tree.c      (working copy)
@@ -11992,7 +11992,7 @@ type_in_anonymous_namespace_p (const_tre
 
 /* Lookup sub-BINFO of BINFO of TYPE at offset POS.  */
 
-tree
+static tree
 lookup_binfo_at_offset (tree binfo, tree type, HOST_WIDE_INT pos)
 {
   unsigned int i;
@@ -12045,11 +12045,13 @@ get_binfo_at_offset (tree binfo, HOST_WI
       else if (offset != 0)
         {
           tree found_binfo = NULL, base_binfo;
-          int offset = (tree_to_shwi (BINFO_OFFSET (binfo)) + pos
-                        / BITS_PER_UNIT);
+          /* Offsets in BINFO are in bytes relative to the whole structure
+             while POS is in bits relative to the containing field.  */
+          int binfo_offset = (tree_to_shwi (BINFO_OFFSET (binfo)) + pos
+                              / BITS_PER_UNIT);
 
           for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
-            if (tree_to_shwi (BINFO_OFFSET (base_binfo)) == offset
+            if (tree_to_shwi (BINFO_OFFSET (base_binfo)) == binfo_offset
                 && types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE (fld)))
               {
                 found_binfo = base_binfo;
@@ -12058,7 +12060,8 @@ get_binfo_at_offset (tree binfo, HOST_WI
           if (found_binfo)
             binfo = found_binfo;
           else
-            binfo = lookup_binfo_at_offset (binfo, TREE_TYPE
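The has_aliases_p simplification in the cleanup above can be checked on a toy model. The types and function names below are hypothetical stand-ins, not the cgraph API; the sketch only assumes, as the review exchange implies, that iterate_direct_aliases yields an entry solely when that entry is an alias reference. Under that assumption the original loop can never get past index 0 without either returning true or stopping, so testing index 0 alone is equivalent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for ipa_ref and symtab_node; NOT the GCC types.  */
enum toy_use { TOY_REF_ALIAS, TOY_REF_ADDR, TOY_REF_LOAD };
struct toy_ref { enum toy_use use; };
struct toy_node {
    const struct toy_ref *referring; /* references pointing at this node */
    size_t n;                        /* number of entries in referring */
};

/* Return entry I of the referring list if it exists and is an alias
   reference, NULL otherwise.  */
const struct toy_ref *toy_iterate_direct_aliases(const struct toy_node *node,
                                                 size_t i)
{
    assert(node != NULL);
    if (i >= node->n)
        return NULL;
    if (node->referring[i].use != TOY_REF_ALIAS)
        return NULL;
    return &node->referring[i];
}

/* Shape of the original loop-based has_aliases_p.  */
int toy_has_aliases_loop(const struct toy_node *node)
{
    const struct toy_ref *ref;
    size_t i;

    for (i = 0; (ref = toy_iterate_direct_aliases(node, i)) != NULL; i++)
        if (ref->use == TOY_REF_ALIAS)
            return 1;
    return 0;
}

/* Shape of the simplified has_aliases_p: only index 0 is examined.  */
int toy_has_aliases_p(const struct toy_node *node)
{
    return toy_iterate_direct_aliases(node, 0) != NULL;
}
```

Both versions report no aliases for a list whose first entry is not an alias, so either one is only a correct "has aliases" test if alias references are kept at the front of the referring list; that ordering is an assumption taken from the discussion, not something this sketch establishes.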
[doc, committed] use "Title Case" for section titles
I noticed a while back when searching for something in the table of contents for the GCC manual that most section names used "Title Case", but some only capitalized the first word, or not even that. This patch is the result of me going through the whole ToC and trying to fix everything to use the "Title Case" convention. It's probably still not perfect, but at least it's more consistent about capitalization than it was previously. In a few cases, I also re-worded particularly lengthy or awkward section titles or added markup. Since this is more content-free copy-editing, I've gone ahead and checked it in as "obvious". -Sandra 2015-02-15 Sandra Loosemore gcc/ * doc/bugreport.texi: Adjust section titles throughout the file to use "Title Case". * doc/extend.texi: Likewise. * doc/gcov.texi: Likewise. * doc/implement-c.texi: Likewise. * doc/implement-cxx.texi: Likewise. * doc/invoke.texi: Likewise. * doc/objc.texi: Likewise. * doc/standards.texi: Likewise. * doc/trouble.texi: Likewise. Index: gcc/doc/bugreport.texi === --- gcc/doc/bugreport.texi (revision 220721) +++ gcc/doc/bugreport.texi (working copy) @@ -82,7 +82,7 @@ suggestions for improvement of GCC are w @end itemize @node Bug Reporting -@section How and where to Report Bugs +@section How and Where to Report Bugs @cindex compiler bugs, reporting Bugs should be reported to the bug database at @value{BUGURL}. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 220721) +++ gcc/doc/extend.texi (working copy) @@ -845,7 +845,7 @@ the middle operand uses the value alread effects of recomputing it. 
 @node __int128
-@section 128-bit integers
+@section 128-bit Integers
 @cindex @code{__int128} data types

 As an extension the integer scalar type @code{__int128} is supported for
@@ -1548,7 +1548,7 @@ struct foo d[1] = @{ @{ 1, @{ 2, 3, 4 @}
 @end smallexample

 @node Empty Structures
-@section Structures With No Members
+@section Structures with No Members
 @cindex empty structures
 @cindex zero-size structures

@@ -1786,7 +1786,7 @@ The option @option{-Wpointer-arith} requ
 are used.

 @node Pointers to Arrays
-@section Pointers to arrays with qualifiers work as expected
+@section Pointers to Arrays with Qualifiers Work as Expected
 @cindex pointers to arrays
 @cindex const qualifier

@@ -8154,7 +8154,7 @@ You cannot operate between vectors of di
 signedness without a cast.

 @node Offsetof
-@section Offsetof
+@section Support for @code{offsetof}
 @findex __builtin_offsetof

 GCC implements for both C and C++ a syntactic extension to implement
@@ -8182,7 +8182,7 @@ may be dependent. In either case, @var{
 identifier, or a sequence of member accesses and array references.

 @node __sync Builtins
-@section Legacy __sync Built-in Functions for Atomic Memory Access
+@section Legacy @code{__sync} Built-in Functions for Atomic Memory Access

 The following built-in functions
 are intended to be compatible with those described
@@ -8322,7 +8322,7 @@ are not prevented from being speculated
 @end table

 @node __atomic Builtins
-@section Built-in functions for memory model aware atomic operations
+@section Built-in Functions for Memory Model Aware Atomic Operations

 The following built-in functions approximately match the requirements for
 C++11 memory model. Many are similar to the @samp{__sync} prefixed built-in
@@ -8591,7 +8591,7 @@ compiler may also ignore this parameter.
 @end deftypefn

 @node Integer Overflow Builtins
-@section Built-in functions to perform arithmetics and arithmetic overflow checking.
+@section Built-in Functions to Perform Arithmetic with Overflow Checking

 The following built-in functions allow performing simple arithmetic operations
 together with checking whether the operations overflowed.
@@ -8650,7 +8650,7 @@ functions above, except they perform mul
 @end deftypefn

 @node x86 specific memory model extensions for transactional memory
-@section x86 specific memory model extensions for transactional memory
+@section x86-Specific Memory Model Extensions for Transactional Memory

 The x86 architecture supports additional memory ordering flags
 to mark lock critical sections for hardware lock elision.
@@ -8986,7 +8986,7 @@ returns -1.
 @end deftypefn

 @node Cilk Plus Builtins
-@section Cilk Plus C/C++ language extension Built-in Functions.
+@section Cilk Plus C/C++ Language Extension Built-in Functions

 GCC provides support for the following built-in reduction funtions if Cilk Plus
 is enabled. Cilk Plus can be enabled using the @option{-fcilkplus} flag.
@@ -11178,7 +11178,7 @@ number of an IACC register. See @pxref{
 for more details.

 @node Directly-mapped Integer Functions
-@subsubsection Directly-mapped Integer Functions
+@subsubsection Directly-Mapped Integer
Re: [doc, committed] small grammar error fix
On Tuesday 2015-02-10 19:15, Sandra Loosemore wrote:
> I've checked it in as obvious.
>
> -Sandra
>
> 2015-02-10  David Wohlferd
>             Sandra Loosemore
>
> 	gcc/
> 	* doc/extend.texi (Loop-Specific Pragmas): Fix grammar error.

That's fine, just the attachment was the ChangeLog entry again, not the patch.

Gerald
[Ping] [PATCH PR64820] Fix ASan UAR detection fails on 32-bit targets if SSP is enabled.
Ping.

Original Message
Subject: [PATCH PR64820] Fix ASan UAR detection fails on 32-bit targets if SSP is enabled.
Date: Mon, 09 Feb 2015 14:03:54 +0400
From: Maxim Ostapenko
To: GCC Patches
CC: Yury Gribov, Slava Garbuzov

Hi,

while testing I noticed that if a program is compiled with both -fsanitize=address and -fstack-protector for a 32-bit architecture and run with ASAN_OPTIONS=detect_stack_use_after_return=1, libsanitizer fails with:

==7299==AddressSanitizer CHECK failed: /home/max/workspace/downloads/gcc/libsanitizer/asan/asan_poisoning.cc:25 "((AddrIsAlignedByGranularity(addr + size))) != (0)" (0x0, 0x0)
    #0 0xf72d8afc in AsanCheckFailed /home/max/workspace/downloads/gcc/libsanitizer/asan/asan_rtl.cc:68
    #1 0xf72dda89 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /home/max/workspace/downloads/gcc/libsanitizer/sanitizer_common/sanitizer_common.cc:72

This happens because SSP inserts a stack guard into the function, which keeps asan_emit_stack_protection from calculating the right size parameter for asan_stack_malloc. This tiny patch resolves the issue.

Regtested with make -j12 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}' on x86_64-unknown-linux-gnu. Bootstrapped and ASan-bootstrapped on x86_64-unknown-linux-gnu.

Ok to commit?

-Maxim

gcc/ChangeLog:

2015-02-09  Max Ostapenko

	PR sanitizer/64820
	* cfgexpand.c (align_base): New function.
	(alloc_stack_frame_space): Call it.
	(expand_stack_vars): Align prev_frame to be sure
	data->asan_vec elements aligned properly.

gcc/testsuite/ChangeLog:

2015-02-09  Max Ostapenko

	PR sanitizer/64820
	* c-c++-common/asan/pr64820.c: New test.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 7dfe1f6..7845a17 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -282,6 +282,15 @@ align_local_variable (tree decl)
   return align / BITS_PER_UNIT;
 }

+/* Align given offset BASE with ALIGN.  Truncate up if ALIGN_UP is true,
+   down otherwise.  Return truncated BASE value.  */
+
+static inline unsigned HOST_WIDE_INT
+align_base (HOST_WIDE_INT base, unsigned HOST_WIDE_INT align, bool align_up)
+{
+  return align_up ? (base + align - 1) & -align : base & -align;
+}
+
 /* Allocate SIZE bytes at byte alignment ALIGN from the stack frame.
    Return the frame offset.  */

@@ -293,17 +302,15 @@ alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
   new_frame_offset = frame_offset;
   if (FRAME_GROWS_DOWNWARD)
     {
-      new_frame_offset -= size + frame_phase;
-      new_frame_offset &= -align;
-      new_frame_offset += frame_phase;
+      new_frame_offset
+        = align_base (frame_offset - frame_phase - size,
+                      align, false) + frame_phase;
       offset = new_frame_offset;
     }
   else
     {
-      new_frame_offset -= frame_phase;
-      new_frame_offset += align - 1;
-      new_frame_offset &= -align;
-      new_frame_offset += frame_phase;
+      new_frame_offset
+        = align_base (frame_offset - frame_phase, align, true) + frame_phase;
       offset = new_frame_offset;
       new_frame_offset += size;
     }
@@ -1031,13 +1038,16 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
           base = virtual_stack_vars_rtx;
           if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK && pred)
             {
-              HOST_WIDE_INT prev_offset = frame_offset;
+              HOST_WIDE_INT prev_offset
+                = align_base (frame_offset,
+                              MAX (alignb, ASAN_RED_ZONE_SIZE),
+                              FRAME_GROWS_DOWNWARD);
               tree repr_decl = NULL_TREE;
-
               offset
                 = alloc_stack_frame_space (stack_vars[i].size
                                            + ASAN_RED_ZONE_SIZE,
                                            MAX (alignb, ASAN_RED_ZONE_SIZE));
+
               data->asan_vec.safe_push (prev_offset);
               data->asan_vec.safe_push (offset + stack_vars[i].size);
               /* Find best representative of the partition.
diff --git a/gcc/testsuite/c-c++-common/asan/pr64820.c b/gcc/testsuite/c-c++-common/asan/pr64820.c
new file mode 100644
index 000..885a662
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr64820.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fstack_protector } */
+/* { dg-options "-fstack-protector-strong" } */
+/* { dg-set-target-env-var ASAN_OPTIONS "detect_stack_use_after_return=1" } */
+/* { dg-shouldfail "asan" } */
+
+__attribute__((noinline))
+char *Ident(char *x) {
+  return x;
+}
+
+__attribute__((noinline))
+char *Func1() {
+  char local[1 << 12];
+  return Ident(local);
+}
+
+__attribute__((noinline))
+void Func2(char *x) {
+  *x = 1;
+}
+int main(int argc, char **argv) {
+  Func2(Func1());
+  return 0;
+}
+
+/* { dg-output "AddressSanitizer: stack-use-after-return on address 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "WRITE of size 1 at .* thread T0.*" } */
+/* { dg-output "#0.*(Func2)?.*pr64820.(c:21)?.*" } */
+/* { dg-output "is located in stack of thread T0 at offset.*" } */
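
As a reading aid for the cfgexpand.c hunks above, here is a minimal standalone sketch of the rounding that align_base performs. It is not the GCC code itself: it assumes ALIGN is a power of two, uses long long as a stand-in for HOST_WIDE_INT, returns a signed value for easier checking, and the names hwi, uhwi and is_aligned are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>

/* Stand-ins for GCC's HOST_WIDE_INT types (assumption: 64-bit).  */
typedef long long hwi;
typedef unsigned long long uhwi;

/* Round BASE to a multiple of ALIGN, which must be a power of two.
   Round up if ALIGN_UP is nonzero, down otherwise.  The arithmetic is
   done in an unsigned type, so negative offsets wrap modulo 2^64; the
   cast back to hwi assumes a two's-complement target.  */
static hwi
align_base (hwi base, uhwi align, int align_up)
{
  uhwi b = (uhwi) base;
  uhwi mask = -align;  /* All bits set except the low log2 (ALIGN) bits.  */
  return (hwi) (align_up ? (b + align - 1) & mask : b & mask);
}

/* Hypothetical helper: nonzero if OFFSET is a multiple of ALIGN.  */
static int
is_aligned (hwi offset, uhwi align)
{
  return ((uhwi) offset & (align - 1)) == 0;
}
```

With a downward-growing frame the offsets are negative, so, taking 32 purely as an example alignment, align_base (-20, 32, 0) gives -32 and align_base (-20, 32, 1) gives 0, while an already aligned offset such as -64 is returned unchanged in either direction. Rounding prev_offset in the frame-growth direction, as the patch does, is what keeps both values pushed onto data->asan_vec on an aligned boundary even after the stack guard has shifted frame_offset.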