Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld
On 01/27/2013 03:16 AM, David Edelsohn wrote:
> On Fri, Jan 25, 2013 at 8:55 AM, Michael Haubenwallner wrote:
>
>> Same here, building everything out-of-source. The prerequisites used are:
>> * CONFIG_SHELL=/usr/local/bin/bash 4.1.7 from bullfreeware (symlinks to
>>   /opt/freeware/bin/)
>> * /usr/bin/{gcc,g++} 4.6.1 from bullfreeware (symlinks to /opt/freeware/bin/)
>> * /usr/bin/gmake 3.82 from bullfreeware (symlinks to /opt/freeware/bin/)
>> * gmp-5.0.4: as shared library, configured with --prefix=/prereq ABI=32
>> * mpfr-3.1.1: as shared library, configured with --prefix=/prereq
>>   --with-gmp=/prereq
>> * mpfr-3.1.1: as shared library, configured with --prefix=/prereq
>>   --with-{gmp,mpfr}=/prereq
>> * gawk-3.1.7, flex-2.5.35, m4-1.4.13 from some Gentoo Prefix instance,
>>   nowhere in PATH,
>>   thus: export {AWK,FLEX}=/gentoo/prefix/usr/bin/{awk,flex} and this patch:
>>   http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00960.html
>>
>> For gcc:
>> * $CONFIG_SHELL configure --prefix=/does/not/exist/yet
>>   --with-{gmp,mpfr,mpc}=/prereq \
>>   --enable--languages=c,c++ --disable-werror --disable-nls
>> * gmake bootstrap
>
> I committed your patch.

Thank you! But I am still curious whether you've been able to reproduce the
problem, and why you didn't encounter this problem beforehand.

> By the way, NLS works if you build and install GNU libiconv (1.14) and
> add --with-libiconv-prefix=/prereq to force GCC bootstrap to use GNU
> libiconv instead of AIX libiconv.

Yes, but (you've asked) here is the situation for which I don't want to
configure extra deplib-prefixes (remember bullfreeware is listed as provider
for gcc-binaries):
* bullfreeware's libiconv-1.13.1 and gettext-0.17 are installed in /opt/freeware,
* /usr/lib/libintl.a is symlinked to /opt/freeware/lib (by bullfreeware's RPM),
* /usr/lib/libiconv.a is the original AIX one.

Now, /usr/lib/libintl.a needs /opt/freeware/lib/libiconv.a[libiconv.so.2],
and it does contain the correct RUNPATH.
But subsequent binaries linking against /usr/lib/libintl.a don't (necessarily)
know about the need to add /opt/freeware/lib as RUNPATH, so these binaries
break with libiconv.so.2 not being found as a member of /usr/lib/libiconv.a,
because AIX unfortunately stops its shared-library search at the first
archive filename found.

This is also the main reason for my filename-based-shared-library-versioning
thing. While this topic is related, it has different reasoning - but the
result does work:

[1] http://www.perzl.org/aix/index.php?n=FAQs.FAQs#toolbox-compatibility-issue

/haubi/
RFA: RL78: Allow SP to be used as a base register
Hi DJ,

  Please may I apply the patch below? It fixes the RL78 backend so that
  the stack register can be used as a base address register.

  Tested with no regressions on an rl78-elf toolchain.

Cheers
  Nick

PS. I am currently investigating allowing r8-r15 to be used as base
registers.

gcc/ChangeLog
2013-01-28  Nick Clifton

  * config/rl78/rl78.c (rl78_regno_mode_code_ok_for_base_p): Allow SP_REG.

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c (revision 195461)
+++ gcc/config/rl78/rl78.c (working copy)
@@ -769,7 +769,7 @@
                                    addr_space_t address_space ATTRIBUTE_UNUSED,
                                    int outer_code ATTRIBUTE_UNUSED, int index_code)
 {
-  if (regno < 24 && regno >= 16)
+  if (regno <= SP_REG && regno >= 16)
     return true;
   if (index_code == REG)
     return (regno == HL_REG);
Re: FW: [PATCH] [MIPS] microMIPS gcc support
"Maciej W. Rozycki" writes: > On Sat, 26 Jan 2013, Richard Sandiford wrote: > >> > How about instead of complicating this we simply add support for >> > microMIPS encoding in the PLT? I think I should be able to squeeze out >> > some time next week to dust off and retest the binutils patch I've had >> > pending far too long now. This way we won't have to maintain separate >> > cases where tail calls may or may not be made via the PLT. >> > >> > Note that we need that support sooner or later anyway due to the prospect >> > of pure-microMIPS processors. >> >> Just so I know: what does the PLT patch do for external functions >> that are jumped to by both microMIPS and non-microMIPS code? > > Two PLT entries are produced in that case. > > PLT entries are created based on the relocation type referring: R_MIPS_26 > relocations trigger a standard MIPS PLT entry, R_MICROMIPS_26_S1 > relocations trigger a microMIPS PLT entry. Other relocations reuse a PLT > entry already produced for one of the jump relocations, or if none > present, then they make an own PLT entry according to ELF file header > flags: if EF_MIPS_ARCH_ASE_MICROMIPS is set, then a microMIPS entry is > produced, otherwise a standard MIPS one. Therefore depending on > relocations seen up to two entries can be produced, encoded differently so > that there is no need to switch modes with direct jumps. > > If all the individual PLT entries ultimately produced are microMIPS code, > then the PLT header is built as microMIPS code as well, otherwise it's > standard MIPS code. This guarantees no standard MIPS code is produced in > the PLT if there's none already in the executable (and vice versa). Thanks, sounds good! In that case, yeah, let's leave the TARGET_ABICALLS_PIC0 part out (but keep the rest of mips_call_may_need_jalx_p). For avoidance of doubt: I don't think there's any need to wait for the linker patches before sending the updated GCC patch. 
The GCC patch can only go in 4.9 anyway, and the new PLT code won't be available until binutils 2.24, so there's plenty of time on both sides. Testing the GCC patch against Mentor's linker is fine with me.

Richard
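The PLT selection rules quoted above can be restated as a small C sketch. This is only an illustration of the rules as described in the thread, not the binutils implementation: the types, the function names and the `reloc_summary` structure are all hypothetical, and `file_is_micromips` stands for "EF_MIPS_ARCH_ASE_MICROMIPS is set in the ELF file header".

```c
#include <assert.h>
#include <stdbool.h>

/* Encodings a PLT entry can have; bit flags, so a symbol can have both.  */
enum plt_encoding { PLT_NONE = 0, PLT_STANDARD = 1, PLT_MICROMIPS = 2 };

/* Hypothetical per-symbol summary of the relocations seen by the linker.  */
struct reloc_summary
{
  bool saw_mips_26;          /* an R_MIPS_26 jump relocation */
  bool saw_micromips_26_s1;  /* an R_MICROMIPS_26_S1 jump relocation */
  bool saw_other;            /* any other relocation that needs the PLT */
};

/* Which PLT entries are produced for one external symbol.  Jump
   relocations pick the encoding; other relocations reuse an existing
   entry or fall back to the encoding implied by the file header.  */
unsigned
plt_entries_for (struct reloc_summary r, bool file_is_micromips)
{
  unsigned entries = PLT_NONE;

  if (r.saw_mips_26)
    entries |= PLT_STANDARD;
  if (r.saw_micromips_26_s1)
    entries |= PLT_MICROMIPS;
  if (entries == PLT_NONE && r.saw_other)
    entries = file_is_micromips ? PLT_MICROMIPS : PLT_STANDARD;
  return entries;
}

/* The PLT header is microMIPS only if every entry produced is microMIPS.  */
bool
plt_header_is_micromips (const unsigned *entries, int count)
{
  bool any_micromips = false;

  assert (count >= 0);
  for (int i = 0; i < count; i++)
    {
      if (entries[i] & PLT_STANDARD)
        return false;
      if (entries[i] & PLT_MICROMIPS)
        any_micromips = true;
    }
  return any_micromips;
}
```

A symbol jumped to from both kinds of code gets both bits set, which matches the "two PLT entries" answer in the quoted reply.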
Re: [Patch] Fix PR54814
On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher wrote:
> On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:
> The patch was originally worked out by Bernd Schmidt and fixed a problem
> introduced in
>
> http://gcc.gnu.org/r190252
>
> Ironically, this revision fixes a reload problem on x86/x86_64 --
> which doesn't use reload anymore now...
>
>
>> Does this mean the fix is rejected for 4.8?
>
> No, just that it probably helps to add a RM to the CC list.
>
> FWIW, it seems to me that this patch should go into 4.8, because the
> bug is probably not limited to AVR.

Indeed, the fix also looks quite obvious though I know nothing about the
code at all.

Thus, ok from a RM perspective if a reload-affine person approves it.

Thanks,
Richard.

> Ciao!
> Steven
Re: Cortex-A15 vfnma/vfnms test patch
[Taking gcc-help off this thread.]

Amol,

> I have tested these instruction with GCC and these instructions are
> generated.
> Please review and merge this test support patch in gcc main trunk.

Thanks for this patch and sorry about the delay in getting around to this.

This is ok and I'll take this under the 10 line rule this time.

If you intend to continue to submit patches to gcc can I ask that you start
the process for copyright assignments or confirm that you have a copyright
assignment on file?

http://gcc.gnu.org/contribute.html#legal

If you don't, send an email to g...@gcc.gnu.org with a request for copyright
assignment papers and a maintainer will send you these.

http://gcc.gnu.org/contribute.html is a good summary of the process related
to contributing patches to GCC in general. Please do read that and follow up
on g...@gcc.gnu.org if you have any more questions.

And finally don't forget to add a changelog to your patches as documented in
links from the above mentioned page. Since this is your first time I've
added the following Changelog entry for your patch and applied it.

regards
Ramana

2013-01-27  Amol Pise

  * gcc.target/arm/neon-vfnms-1.c: New test.
  * gcc.target/arm/neon-vfnma-1.c: New test.
Re: [avr,committed] Fix fixed-point conversion
Gerald Pfeifer wrote:
> On Thu, 24 Jan 2013, Georg-Johann Lay wrote:
>> Committed the following change:
>>
>> http://gcc.gnu.org/r195424
>>
>> * config/avr/avr.c (avr_out_fract): Make register numbers that
>> might be outside of source operand signed.
>
> Can you still post patches to the list, and not just the reference?
>
> Thanks,
> Gerald

Thanks for pointing this out. I will follow the guideline in the future.

Here is the change:

Index: config/avr/avr.c
===
--- config/avr/avr.c (revision 195423)
+++ config/avr/avr.c (revision 195424)
@@ -7114,13 +7114,13 @@ avr_out_fract (rtx insn, rtx operands[],
       unsigned d1 = d0 + step;

       // Current and next regno of source
-      unsigned s0 = d0 - offset;
-      unsigned s1 = s0 + step;
+      signed s0 = d0 - offset;
+      signed s1 = s0 + step;

       // Must current resp. next regno be CLRed? This applies to the low
       // bytes of the destination that have no associated source bytes.
-      bool clr0 = s0 < src.regno;
-      bool clr1 = s1 < src.regno && d1 >= dest.regno;
+      bool clr0 = s0 < (signed) src.regno;
+      bool clr1 = s1 < (signed) src.regno && d1 >= dest.regno;

       // First gather what code to emit (if any) and additional step to
       // apply if a MOVW is in use. xop[2] is destination rtx and xop[3]
@@ -7150,12 +7150,12 @@ avr_out_fract (rtx insn, rtx operands[],
                 }
             }
         }
-      else if (offset && s0 <= src.regno_msb)
+      else if (offset && s0 <= (signed) src.regno_msb)
         {
           int movw = AVR_HAVE_MOVW && offset % 2 == 0
             && d0 % 2 == (offset > 0)
             && d1 <= dest.regno_msb && d1 >= dest.regno
-            && s1 <= src.regno_msb && s1 >= src.regno;
+            && s1 <= (signed) src.regno_msb && s1 >= (signed) src.regno;

           xop[2] = all_regs_rtx[d0 & ~movw];
           xop[3] = all_regs_rtx[s0 & ~movw];
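To see why the register numbers had to become signed, here is a small stand-alone C sketch of the comparison in isolation. It is not code from avr.c: the function names are made up, the type of `offset` is assumed to be `int`, and the register numbers in the check are arbitrary. The conversion of the wrapped value back to `int` is implementation-defined in ISO C; GCC wraps modulo 2^N, which is what the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Old form: s0 is unsigned, so d0 - offset wraps around to a huge value
   whenever offset is larger than d0, and the "below the source operand"
   test is never true for those registers.  */
bool
clr_unsigned (unsigned d0, int offset, unsigned src_regno)
{
  unsigned s0 = d0 - offset;
  return s0 < src_regno;
}

/* New form: s0 may be negative, and the comparison is done on signed
   values, so a source index below the operand is detected.  */
bool
clr_signed (unsigned d0, int offset, unsigned src_regno)
{
  signed s0 = d0 - offset;   /* wraps to a negative value with GCC */
  return s0 < (signed) src_regno;
}

/* Self-check with arbitrary register numbers.  */
void
check_clr_sketch (void)
{
  /* d0 = 2, offset = 4: the source index is -2, below register 24.  */
  assert (!clr_unsigned (2, 4, 24));   /* wrapped value hides this */
  assert (clr_signed (2, 4, 24));

  /* No wrap-around: both forms agree.  */
  assert (clr_unsigned (26, 4, 24) == clr_signed (26, 4, 24));
}
```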
Re: [Patch] Fix PR54814
Richard Biener wrote:
> On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher wrote:
> > On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:
> > The patch was originally worked out by Bernd Schmidt and fixed a problem
> > introduced in
> >
> > http://gcc.gnu.org/r190252
> >
> > Ironically, this revision fixes a reload problem on x86/x86_64 --
> > which doesn't use reload anymore now...
> >
> >
> >> Does this mean the fix is rejected for 4.8?
> >
> > No, just that it probably helps to add a RM to the CC list.
> >
> > FWIW, it seems to me that this patch should go into 4.8, because the
> > bug is probably not limited to AVR.
>
> Indeed, the fix also looks quite obvious though I know nothing about the
> code at all.
>
> Thus, ok from a RM perspective if a reload-affine person approves it.

The patch was originally by Bernd, but FWIW it looks good to me as well.

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
Re: Cortex-A15 vfnma/vfnms test patch
Dear Ramana,

Thank You very much for the changelog and commit of my patch in gcc.
I will follow the steps mentioned by you.

Thank You,
Amol Pise

On Mon, Jan 28, 2013 at 4:18 PM, Ramana Radhakrishnan wrote:
>
> [Taking gcc-help off this thread.]
>
> Amol,
>
>> I have tested these instruction with GCC and these instructions are
>> generated.
>> Please review and merge this test support patch in gcc main trunk.
>
> Thanks for this patch and sorry about the delay in getting around to this.
>
> This is ok and I'll take this under the 10 line rule this time.
>
> If you intend to continue to submit patches to gcc can I ask that you start
> the process for copyright assignments or confirm that you have a copyright
> assignment on file?
>
> http://gcc.gnu.org/contribute.html#legal
>
> If you don't, send an email to g...@gcc.gnu.org with a request for copyright
> assignment papers and a maintainer will send you these.
>
> http://gcc.gnu.org/contribute.html is a good summary of the process related
> to contributing patches to GCC in general. Please do read that and follow up
> on g...@gcc.gnu.org if you have any more questions.
>
> And finally don't forget to add a changelog to your patches as documented in
> links from the above mentioned page. Since this is your first time I've
> added the following Changelog entry for your patch and applied it.
>
> regards
> Ramana
>
> 2013-01-27  Amol Pise
>
> * gcc.target/arm/neon-vfnms-1.c: New test.
> * gcc.target/arm/neon-vfnma-1.c: New test.
[PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)
Hi!

We ICE on the following testcase when using cselib, because
cselib_lookup* is never called on the PREFETCH argument, and
add_insn_mem_dependence calls cselib_subst_to_values on it, which
assumes cselib_lookup* already happened on it earlier.
For MEMs sched_analyze_2 calls cselib_lookup_from_insn, but for
PREFETCHes it didn't.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2013-01-28  Jakub Jelinek

  PR rtl-optimization/56117
  * sched-deps.c (sched_analyze_2) : For use_cselib call
  cselib_lookup_from_insn on the MEM before calling
  add_insn_mem_dependence.

  * gcc.dg/pr56117.c: New test.

--- gcc/sched-deps.c.jj 2013-01-16 19:58:42.0 +0100
+++ gcc/sched-deps.c 2013-01-28 09:43:33.248657691 +0100
@@ -2720,8 +2720,12 @@ sched_analyze_2 (struct deps_desc *deps,
          prefetch has only the start address but it is better to have
          something than nothing.  */
       if (!deps->readonly)
-        add_insn_mem_dependence (deps, true, insn,
-                                 gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0)));
+        {
+          rtx x = gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0));
+          if (sched_deps_info->use_cselib)
+            cselib_lookup_from_insn (x, Pmode, true, VOIDmode, insn);
+          add_insn_mem_dependence (deps, true, insn, x);
+        }
       break;

     case UNSPEC_VOLATILE:
--- gcc/testsuite/gcc.dg/pr56117.c.jj 2013-01-28 09:47:21.244381559 +0100
+++ gcc/testsuite/gcc.dg/pr56117.c 2013-01-28 09:46:31.0 +0100
@@ -0,0 +1,9 @@
+/* PR rtl-optimization/56117 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsched2-use-superblocks" } */
+
+void
+foo (void *p)
+{
+  __builtin_prefetch (p);
+}

  Jakub
[PATCH][RFC] Avoid excessive BLOCK associations for locations
This avoids assigning BLOCKs to things that didn't have one before
(originally I observed that the code snippets below happily
generate an UNKNOWN_LOCATION, id->block association). A previous
patch last year changed expansion in a way to not jump back to
the outermost block when observing a NULL LOCATION_BLOCK in the IL,
but similar to UNKNOWN_LOCATION locus handling just inherit the
currently active BLOCK.

Thus the patch below, instead of just avoiding the nonsensical
UNKNOWN_LOCATION, id->block association goes one step further
and never puts things in the outermost inline BLOCK if it didn't
have a BLOCK assigned before. This avoids the original nonsensical
issue and avoids excessive BLOCK associations where they are
of not much use.

What's the point of switching to the outermost scope for
unknown-BLOCK locations? Isn't inheriting the currently
active scope much more useful (it definitely is for
UNKNOWN_LOCATIONs)? If we have a non-UNKNOWN_LOCATION,
would a NULL BLOCK not be an error anyway? An error we
"hide" in the current scheme?

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Does this make sense?

Thanks,
Richard.

2013-01-28  Richard Biener

  * tree-inline.c (remap_gimple_stmt): Do not assign a BLOCK
  to a stmt that didn't have one.
  (copy_phis_for_bb): Likewise for PHI arguments.
  (copy_debug_stmt): Likewise for debug stmts.

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c (revision 195502)
+++ gcc/tree-inline.c (working copy)
@@ -1198,7 +1198,6 @@ remap_gimple_stmt (gimple stmt, copy_bod
 {
   gimple copy = NULL;
   struct walk_stmt_info wi;
-  tree new_block;
   bool skip_first = false;

   /* Begin by recognizing trees that we'll completely rewrite for the
@@ -1458,19 +1457,15 @@ remap_gimple_stmt (gimple stmt, copy_bod
     }

   /* If STMT has a block defined, map it to the newly constructed
-     block.  When inlining we want statements without a block to
-     appear in the block of the function call.  */
-  new_block = id->block;
+     block.  */
   if (gimple_block (copy))
     {
       tree *n;
       n = (tree *) pointer_map_contains (id->decl_map, gimple_block (copy));
       gcc_assert (n);
-      new_block = *n;
+      gimple_set_block (copy, *n);
     }

-  gimple_set_block (copy, new_block);
-
   if (gimple_debug_bind_p (copy) || gimple_debug_source_bind_p (copy))
     return copy;

@@ -1987,7 +1982,6 @@ copy_phis_for_bb (basic_block bb, copy_b
           edge old_edge = find_edge ((basic_block) new_edge->src->aux, bb);
           tree arg;
           tree new_arg;
-          tree block = id->block;
           edge_iterator ei2;
           location_t locus;

@@ -2015,19 +2009,18 @@ copy_phis_for_bb (basic_block bb, copy_b
                   inserted = true;
                 }
               locus = gimple_phi_arg_location_from_edge (phi, old_edge);
-              block = id->block;
               if (LOCATION_BLOCK (locus))
                 {
                   tree *n;
                   n = (tree *) pointer_map_contains (id->decl_map,
                         LOCATION_BLOCK (locus));
                   gcc_assert (n);
-                  block = *n;
+                  locus = COMBINE_LOCATION_DATA (line_table, locus, *n);
                 }
+              else
+                locus = LOCATION_LOCUS (locus);

-              add_phi_arg (new_phi, new_arg, new_edge, block ?
-                  COMBINE_LOCATION_DATA (line_table, locus, block) :
-                  LOCATION_LOCUS (locus));
+              add_phi_arg (new_phi, new_arg, new_edge, locus);
             }
         }
     }
@@ -2324,14 +2317,11 @@ copy_debug_stmt (gimple stmt, copy_body_
   tree t, *n;
   struct walk_stmt_info wi;

-  t = id->block;
   if (gimple_block (stmt))
     {
       n = (tree *) pointer_map_contains (id->decl_map, gimple_block (stmt));
-      if (n)
-        t = *n;
+      gimple_set_block (stmt, n ? *n : id->block);
     }
-  gimple_set_block (stmt, t);

   /* Remap all the operands in COPY.  */
   memset (&wi, 0, sizeof (wi));
[committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)
Hi! As discussed in the PR, this is a safer variant of a fix for 4.8, where input_location during most optimization passes is set to DECL_SOURCE_LOCATION (current_function_decl) and various parts of the gimplifier e.g. during force_gimple_operand* may end up setting gimple_location to that. For 4.9, we should revert this and set input_location to UNKNOWN_LOCATION for the optimizers. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2013-01-28 Jakub Jelinek PR tree-optimization/56094 * gimplify.c (force_gimple_operand_1): Temporarily set input_location to UNKNOWN_LOCATION while gimplifying expr. * gcc.dg/pr56094.c: New test. --- gcc/gimplify.c.jj 2013-01-25 21:02:45.0 +0100 +++ gcc/gimplify.c 2013-01-28 11:34:15.671374132 +0100 @@ -8600,6 +8600,7 @@ force_gimple_operand_1 (tree expr, gimpl { enum gimplify_status ret; struct gimplify_ctx gctx; + location_t saved_location; *stmts = NULL; @@ -8613,6 +8614,8 @@ force_gimple_operand_1 (tree expr, gimpl push_gimplify_context (&gctx); gimplify_ctxp->into_ssa = gimple_in_ssa_p (cfun); gimplify_ctxp->allow_rhs_cond_expr = true; + saved_location = input_location; + input_location = UNKNOWN_LOCATION; if (var) { @@ -8634,6 +8637,7 @@ force_gimple_operand_1 (tree expr, gimpl gcc_assert (ret != GS_ERROR); } + input_location = saved_location; pop_gimplify_context (NULL); return expr; --- gcc/testsuite/gcc.dg/pr56094.c.jj 2013-01-28 11:46:09.045221238 +0100 +++ gcc/testsuite/gcc.dg/pr56094.c 2013-01-28 11:47:54.052611852 +0100 @@ -0,0 +1,81 @@ +/* PR tree-optimization/56094 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -g -fdump-tree-optimized-lineno" } */ + +_Bool cond; + +int +fn0 (unsigned char, unsigned long long, unsigned char, + unsigned char, signed short, unsigned int, + unsigned char *); + +extern void fn3 (unsigned char, unsigned char, unsigned char, unsigned char, +unsigned char, unsigned char, unsigned char, unsigned short); +extern void fn7 (int); +extern void fn8 (int); + +static 
__inline__ __attribute__ ((always_inline)) void +fn1 (unsigned char arg0, unsigned char arg1, unsigned char arg2, + unsigned char arg3, unsigned char arg4, unsigned char arg5, + unsigned short arg6) +{ + asm volatile ("" :: "g" ((unsigned long long) arg0), "g" (arg1), + "g" (arg2), "g" (arg3), "g" (arg4), "g" (arg5), + "g" (arg6)); + if (cond) +{ + unsigned char loc0 = 0; + fn3 (loc0, arg0, arg1, arg2, arg3, arg4, arg5, arg6); +} +} + +static __inline__ __attribute__ ((always_inline)) void +fn4 (unsigned int arg0, unsigned long long arg1) +{ + asm volatile ("" :: "g" (arg0), "g" (arg1)); +} + +static __inline__ __attribute__ ((always_inline)) void +fn5 (unsigned int arg0, unsigned char arg1, unsigned int arg2, + unsigned char arg3) +{ + asm volatile ("" :: "g" (arg0), "g" (arg1), + "g" ((unsigned long long) arg2), "g" (arg3)); +} + +static __inline__ __attribute__ ((always_inline)) void +fn6 (unsigned long long arg0, unsigned char arg1, + unsigned char arg2, signed short arg3, + unsigned int arg4, unsigned char * arg5) +{ + asm volatile ("" :: "g" (arg0), "g" ((unsigned long long) arg1), + "g" ((unsigned long long) arg2), "g" (arg3), + "g" (arg4), "g" (arg5)); + if (cond) +{ + unsigned char loc0 = 0; + fn0 (loc0, arg0, arg1, arg2, arg3, arg4, arg5); +} +} + +unsigned char b[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0xa }; +unsigned int q = sizeof (b) / sizeof (b[0]); + +void +foo () +{ + int i; + for (i = 1; i <= 50; i++) +{ + fn6 (i + 0x1234, i + 1, i + 0xa, i + 0x1234, q, b); + fn5 (i + 0xabcd, i << 1, i + 0x1234, i << 2); + fn7 (i + 0xdead); + fn8 (i + 0xdead); + fn1 (i, i + 1, i + 2, i + 3, i + 4, i + 5, i << 10); + fn4 (i + 0xfeed, i); +} +} + +/* Verify no statements get the location of the foo () decl. */ +/* { dg-final { scan-tree-dump-not " : 65:1\\\]" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Jakub
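The gimplify.c change in the patch above is a save / override / restore of a global around a call that reads it. A minimal stand-alone sketch of that pattern follows; it is not GCC code. `location_t`, `UNKNOWN_LOCATION` and `input_location` are local stand-ins for the GCC names, the value 65 is an arbitrary placeholder for "location of the current function's decl", and `make_stmt_sketch` is a hypothetical stand-in for gimplifier code that stamps new statements with the current input location.

```c
#include <assert.h>

typedef unsigned int location_t;            /* stand-in for GCC's type */
#define UNKNOWN_LOCATION ((location_t) 0)   /* stand-in for GCC's macro */

/* Pretend the optimizers left this at the function decl's location.  */
location_t input_location = 65;

/* Location picked up by the most recently created statement.  */
location_t last_stmt_location;

/* Hypothetical helper: new statements inherit input_location.  */
void
make_stmt_sketch (void)
{
  last_stmt_location = input_location;
}

/* Same shape as the patched function: hide the global location while
   new statements are created, then put it back.  */
void
force_operand_sketch (void)
{
  location_t saved_location = input_location;

  input_location = UNKNOWN_LOCATION;
  make_stmt_sketch ();
  input_location = saved_location;
}

/* Self-check: the new statement gets no location and the global is
   unchanged afterwards.  */
void
check_location_sketch (void)
{
  force_operand_sketch ();
  assert (last_stmt_location == UNKNOWN_LOCATION);
  assert (input_location == 65);
}
```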
[PATCH] Fix up pow folding (PR tree-optimization/56125)
Hi!

gimple_expand_builtin_pow last two optimizations rely on earlier
optimizations in the same function to be performed, e.g.
folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only
correct for c which isn't an integer (otherwise the sqrt(x) factor would
need to be skipped), but they actually do not check this.
E.g. the pow (x, n) where n is integer is optimized only if:
      && ((n >= -1 && n <= 2)
          || (flag_unsafe_math_optimizations
              && optimize_insn_for_speed_p ()
              && powi_cost (n) <= POWI_MAX_MULTS)))
and as in the testcase the function is cold, it isn't optimized and
we fall through till the above mentioned optimization which blindly assumes
that c isn't an integer.

Fixed by both checking that c isn't an integer (and for the last
optimization also that 2c isn't an integer), and also not doing the
-> sqrt(x) * powi(x, n / 2) resp. 1.0 / sqrt(x) * powi(x, abs(n) / 2)
optimization for -Os or cold functions, at least
__attribute__((cold)) double
foo (double x, double n)
{
  return __builtin_pow (x, -1.5);
}
is smaller when expanded as pow call both on x86_64 and on powerpc (with
-Os -ffast-math). Even just the c*_is_int tests alone could be enough
to fix the bug, so if you say want to enable it for -Os even with c 1.5,
but not for negative values which add another operation, it can be adjusted.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-01-28  Jakub Jelinek

  PR tree-optimization/56125
  * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
  pow(x,c) into sqrt(x) * powi(x, n/2) or
  1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
  optimizing for size.
  Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
  1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
  integer.

  * gcc.dg/pr56125.c: New test.

--- gcc/tree-ssa-math-opts.c.jj 2013-01-11 09:02:48.0 +0100
+++ gcc/tree-ssa-math-opts.c 2013-01-28 10:56:40.105950483 +0100
@@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
   HOST_WIDE_INT n;
   tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, powi_cbrt_x;
   enum machine_mode mode;
-  bool hw_sqrt_exists;
+  bool hw_sqrt_exists, c_is_int, c2_is_int;

   /* If the exponent isn't a constant, there's nothing of interest
      to be done.  */
@@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i
   c = TREE_REAL_CST (arg1);
   n = real_to_integer (&c);
   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
+  c_is_int = real_identical (&c, &cint);

-  if (real_identical (&c, &cint)
+  if (c_is_int
       && ((n >= -1 && n <= 2)
           || (flag_unsafe_math_optimizations
               && optimize_insn_for_speed_p ()
@@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i
       return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0);
     }

-  /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into
+  /* Optimize pow(x,c), where n = 2c for some nonzero integer n
+     and c not an integer, into

        sqrt(x) * powi(x, n/2),                n > 0;
        1.0 / (sqrt(x) * powi(x, abs(n/2))),   n < 0.
@@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i
   real_arithmetic (&c2, MULT_EXPR, &c, &dconst2);
   n = real_to_integer (&c2);
   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
+  c2_is_int = real_identical (&c2, &cint);

   if (flag_unsafe_math_optimizations
       && sqrtfn
-      && real_identical (&c2, &cint))
+      && c2_is_int
+      && !c_is_int
+      && optimize_function_for_speed_p (cfun))
     {
       tree powi_x_ndiv2 = NULL_TREE;

@@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
       && cbrtfn
       && (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
       && real_identical (&c2, &c)
+      && !c2_is_int
       && optimize_function_for_speed_p (cfun)
       && powi_cost (n / 3) <= POWI_MAX_MULTS)
     {
--- gcc/testsuite/gcc.dg/pr56125.c.jj 2013-01-28 11:00:04.359814742 +0100
+++ gcc/testsuite/gcc.dg/pr56125.c 2013-01-28 11:00:55.048532118 +0100
@@ -0,0 +1,21 @@
+/* PR tree-optimization/56125 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern void abort (void);
+extern double fabs (double);
+
+__attribute__((cold)) double
+foo (double x, double n)
+{
+  double u = x / (n * n);
+  return u;
+}
+
+int
+main ()
+{
+  if (fabs (foo (29, 2) - 7.25) > 0.001)
+    abort ();
+  return 0;
+}

  Jakub
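The arithmetic behind the bug can be checked with a small stand-alone C sketch. This is not the GCC code: `sqrt_sketch` and `powi_sketch` are simplified stand-ins (a Newton iteration and a multiplication loop, for x > 0 and non-negative integer powers only), and only the n > 0 case is shown. Note also that the patch fixes the problem by not applying the transformation at all when c is an integer; `pow_guarded` below instead drops the sqrt(x) factor, which is the alternative the description mentions in passing.

```c
#include <assert.h>

/* Stand-in for sqrt(), x > 0 assumed; Newton iteration so the sketch
   does not depend on libm.  */
double
sqrt_sketch (double x)
{
  double r = x > 1.0 ? x : 1.0;

  for (int i = 0; i < 64; i++)
    r = 0.5 * (r + x / r);
  return r;
}

/* Stand-in for powi: x raised to a non-negative integer power.  */
double
powi_sketch (double x, int k)
{
  double r = 1.0;

  assert (k >= 0);
  while (k-- > 0)
    r *= x;
  return r;
}

/* pow (x, c) with n = 2c, n > 0, rewritten without checking c:
   sqrt(x) * powi(x, n/2).  Correct only when n is odd.  */
double
pow_blind (double x, int n)
{
  assert (n > 0);
  return sqrt_sketch (x) * powi_sketch (x, n / 2);
}

/* Same rewrite, but the sqrt(x) factor is skipped when c = n/2 is an
   integer.  */
double
pow_guarded (double x, int n)
{
  assert (n > 0);
  if (n % 2 == 0)
    return powi_sketch (x, n / 2);
  return sqrt_sketch (x) * powi_sketch (x, n / 2);
}
```

For x = 4 and c = 1.5 (n = 3) both functions give 8, as expected. For c = 2 (n = 4) the blind rewrite gives sqrt(4) * 4^2 = 32 rather than 16, which is the kind of wrong result the testcase in the patch catches.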
[committed] Avoid string.h includes in -fno-builtin-memset testcases (PR testsuite/56053)
Hi! Some targets apparently force fortification unconditionally or at least by default, when string.h is then included, memset etc. inlines might call __builtin_memset or __builtin___memset_chk directly and for explicit builtin uses -fno-builtin* doesn't work. Fixed by avoiding those includes and instead adding needed prototypes by hand, tested on x86_64-linux, committed as obvious to trunk. 2013-01-28 Jakub Jelinek PR testsuite/56053 * c-c++-common/asan/heap-overflow-1.c: Don't include stdlib.h and string.h. Provide memset, malloc and free prototypes, adjust line numbers in dg-output. * c-c++-common/asan/stack-overflow-1.c: Don't include string.h. Provide memset prototype and adjust line numbers in dg-output. * c-c++-common/asan/global-overflow-1.c: Likewise. --- gcc/testsuite/c-c++-common/asan/heap-overflow-1.c.jj2012-12-13 00:02:50.0 +0100 +++ gcc/testsuite/c-c++-common/asan/heap-overflow-1.c 2013-01-28 13:47:58.682416114 +0100 @@ -2,8 +2,18 @@ /* { dg-options "-fno-builtin-malloc -fno-builtin-free -fno-builtin-memset" } */ /* { dg-shouldfail "asan" } */ -#include -#include +#ifdef __cplusplus +extern "C" { +#endif + +void *memset (void *, int, __SIZE_TYPE__); +void *malloc (__SIZE_TYPE__); +void free (void *); + +#ifdef __cplusplus +} +#endif + volatile int ten = 10; int main(int argc, char **argv) { char *x = (char*)malloc(10); @@ -14,8 +24,8 @@ int main(int argc, char **argv) { } /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */ -/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:11|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */ +/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */ /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of 10-byte region\[^\n\r]*(\n|\r\n|\r)" } */ /* { dg-output "allocated by thread T0 here:\[^\n\r]*(\n|\r\n|\r)" } */ /* { dg-output "#0 0x\[0-9a-f\]+ (in _*(interceptor_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */ -/* { 
dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:9|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */ --- gcc/testsuite/c-c++-common/asan/stack-overflow-1.c.jj 2012-12-13 00:02:50.0 +0100 +++ gcc/testsuite/c-c++-common/asan/stack-overflow-1.c 2013-01-28 13:48:41.046171347 +0100 @@ -2,9 +2,13 @@ /* { dg-options "-fno-builtin-memset" } */ /* { dg-shouldfail "asan" } */ -volatile int ten = 10; +extern +#ifdef __cplusplus +"C" +#endif +void *memset (void *, int, __SIZE_TYPE__); -#include +volatile int ten = 10; int main() { char x[10]; @@ -14,5 +18,5 @@ int main() { } /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */ -/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*stack-overflow-1.c:12|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */ +/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*stack-overflow-1.c:16|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */ /* { dg-output "Address 0x\[0-9a-f\]+ is\[^\n\r]*frame " } */ --- gcc/testsuite/c-c++-common/asan/global-overflow-1.c.jj 2012-12-13 00:02:50.0 +0100 +++ gcc/testsuite/c-c++-common/asan/global-overflow-1.c 2013-01-28 13:46:13.900017787 +0100 @@ -2,7 +2,12 @@ /* { dg-options "-fno-builtin-memset" } */ /* { dg-shouldfail "asan" } */ -#include +extern +#ifdef __cplusplus +"C" +#endif +void *memset (void *, int, __SIZE_TYPE__); + volatile int ten = 10; int main() { @@ -18,6 +23,6 @@ int main() { } /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */ -/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*global-overflow-1.c:15|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */ +/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */ /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of global variable" } */ /* { dg-output ".*YYY\[^\n\r]* of size 
10\[^\n\r]*(\n|\r\n|\r)" } */ Jakub
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
On Mon, 28 Jan 2013, Jakub Jelinek wrote:

> Hi!
>
> gimple_expand_builtin_pow last two optimizations rely on earlier
> optimizations in the same function to be performed, e.g.
> folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only
> correct for c which isn't an integer (otherwise the sqrt(x) factor would
> need to be skipped), but they actually do not check this.
> E.g. the pow (x, n) where n is integer is optimized only if:
>       && ((n >= -1 && n <= 2)
>           || (flag_unsafe_math_optimizations
>               && optimize_insn_for_speed_p ()
>               && powi_cost (n) <= POWI_MAX_MULTS)))
> and as in the testcase the function is cold, it isn't optimized and
> we fall through till the above mentioned optimization which blindly assumes
> that c isn't an integer.
>
> Fixed by both checking that c isn't an integer (and for the last
> optimization also that 2c isn't an integer), and also not doing the
> -> sqrt(x) * powi(x, n / 2) resp. 1.0 / sqrt(x) * powi(x, abs(n) / 2)
> optimization for -Os or cold functions, at least
> __attribute__((cold)) double
> foo (double x, double n)
> {
>   return __builtin_pow (x, -1.5);
> }
> is smaller when expanded as pow call both on x86_64 and on powerpc (with
> -Os -ffast-math). Even just the c*_is_int tests alone could be enough
> to fix the bug, so if you say want to enable it for -Os even with c 1.5,
> but not for negative values which add another operation, it can be adjusted.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2013-01-28  Jakub Jelinek
>
> PR tree-optimization/56125
> * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
> pow(x,c) into sqrt(x) * powi(x, n/2) or
> 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
> optimizing for size.
> Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
> 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
> integer.
>
> * gcc.dg/pr56125.c: New test.
> > --- gcc/tree-ssa-math-opts.c.jj 2013-01-11 09:02:48.0 +0100 > +++ gcc/tree-ssa-math-opts.c 2013-01-28 10:56:40.105950483 +0100 > @@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i >HOST_WIDE_INT n; >tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, > powi_cbrt_x; >enum machine_mode mode; > - bool hw_sqrt_exists; > + bool hw_sqrt_exists, c_is_int, c2_is_int; > >/* If the exponent isn't a constant, there's nothing of interest > to be done. */ > @@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i >c = TREE_REAL_CST (arg1); >n = real_to_integer (&c); >real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0); > + c_is_int = real_identical (&c, &cint); > > - if (real_identical (&c, &cint) > + if (c_is_int >&& ((n >= -1 && n <= 2) > || (flag_unsafe_math_optimizations > && optimize_insn_for_speed_p () > @@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i >return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0); > } > > - /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into > + /* Optimize pow(x,c), where n = 2c for some nonzero integer n > + and c not an integer, into > > sqrt(x) * powi(x, n/2),n > 0; > 1.0 / (sqrt(x) * powi(x, abs(n/2))), n < 0. > @@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i >real_arithmetic (&c2, MULT_EXPR, &c, &dconst2); >n = real_to_integer (&c2); >real_from_integer (&cint, VOIDmode, n, n < 0 ? 
-1 : 0, 0); > + c2_is_int = real_identical (&c2, &cint); > >if (flag_unsafe_math_optimizations >&& sqrtfn > - && real_identical (&c2, &cint)) > + && c2_is_int > + && !c_is_int > + && optimize_function_for_speed_p (cfun)) > { >tree powi_x_ndiv2 = NULL_TREE; > > @@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i >&& cbrtfn >&& (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode)) >&& real_identical (&c2, &c) > + && !c2_is_int >&& optimize_function_for_speed_p (cfun) >&& powi_cost (n / 3) <= POWI_MAX_MULTS) > { > --- gcc/testsuite/gcc.dg/pr56125.c.jj 2013-01-28 11:00:04.359814742 +0100 > +++ gcc/testsuite/gcc.dg/pr56125.c2013-01-28 11:00:55.048532118 +0100 > @@ -0,0 +1,21 @@ > +/* PR tree-optimization/56125 */ > +/* { dg-do run } */ > +/* { dg-options "-O2 -ffast-math" } */ > + > +extern void abort (void); > +extern double fabs (double); > + > +__attribute__((cold)) double > +foo (double x, double n) > +{ > + double u = x / (n * n); > + return u; > +} > + > +int > +main () > +{ > + if (fabs (foo (29, 2) - 7.25) > 0.001) > +abort (); > + return 0; > +} > > Jakub > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Fel
Re: [PATCH][RFC] Avoid excessive BLOCK associations for locations
On Mon, Jan 28, 2013 at 03:15:53PM +0100, Richard Biener wrote: > Does this make sense? Yes. Wouldn't hurt to run GDB testsuite with that, though I bet most of it is -O0 anyway and thus won't stress it out too much. > 2013-01-28 Richard Biener > > * tree-inline.c (remap_gimple_stmt): Do not assign a BLOCK > to a stmt that didn't have one. > (copy_phis_for_bb): Likewise for PHI arguments. > (copy_debug_stmt): Likewise for debug stmts. Ok. Jakub
[PATCH] Fix PR56034
The following implements what I thought was present (eh ...). For partitions that contain reductions (feed loop closed PHI nodes) we rely on them being in the last partition of the loop - as we do not bother to copy / care for loop closed PHI nodes. The following now implements that fully (I've had a patch for that around), by both seeding initial partition generation from scalar reductions and taking care of merging them again into the very last partition of the loop. Completely disabling partitioning for loops with reductions would have broken some existing testcases that happen to work because for them the partitions are already in proper order. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2013-01-28 Richard Biener PR tree-optimization/56034 * tree-loop-distribution.c (enum partition_kind): Add PKIND_REDUCTION. (partition_builtin_p): Adjust. (generate_code_for_partition): Handle PKIND_REDUCTION. Assert it is the last partition. (rdg_flag_uses): Check SSA_NAME_IS_DEFAULT_DEF before looking up the vertex for the definition. (classify_partition): Classify whether a partition is a PKIND_REDUCTION, thus has uses outside of the loop. (ldist_gen): Inherit PKIND_REDUCTION when merging partitions. Merge all PKIND_REDUCTION partitions into the last partition. (tree_loop_distribution): Seed partitions from reductions as well. * gcc.dg/torture/pr56034.c: New testcase. Index: gcc/tree-loop-distribution.c === *** gcc/tree-loop-distribution.c(revision 195502) --- gcc/tree-loop-distribution.c(working copy) *** along with GCC; see the file COPYING3. *** 51,57 #include "tree-scalar-evolution.h" #include "tree-pass.h" ! enum partition_kind { PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY }; typedef struct partition_s { --- 51,59 #include "tree-scalar-evolution.h" #include "tree-pass.h" ! enum partition_kind { ! PKIND_NORMAL, PKIND_REDUCTION, PKIND_MEMSET, PKIND_MEMCPY ! 
}; typedef struct partition_s { *** partition_free (partition_t partition) *** 90,96 static bool partition_builtin_p (partition_t partition) { ! return partition->kind != PKIND_NORMAL; } /* Returns true if the partition has an writes. */ --- 92,98 static bool partition_builtin_p (partition_t partition) { ! return partition->kind > PKIND_REDUCTION; } /* Returns true if the partition has an writes. */ *** generate_code_for_partition (struct loop *** 481,486 --- 483,491 destroy_loop (loop); break; + case PKIND_REDUCTION: + /* Reductions all have to be in the last partition. */ + gcc_assert (!copy_p); case PKIND_NORMAL: generate_loops_for_partition (loop, partition, copy_p); break; *** rdg_flag_uses (struct graph *rdg, int u, *** 628,634 { tree use = USE_FROM_PTR (use_p); ! if (TREE_CODE (use) == SSA_NAME) { gimple def_stmt = SSA_NAME_DEF_STMT (use); int v = rdg_vertex_for_stmt (rdg, def_stmt); --- 633,640 { tree use = USE_FROM_PTR (use_p); ! if (TREE_CODE (use) == SSA_NAME ! && !SSA_NAME_IS_DEFAULT_DEF (use)) { gimple def_stmt = SSA_NAME_DEF_STMT (use); int v = rdg_vertex_for_stmt (rdg, def_stmt); *** classify_partition (loop_p loop, struct *** 858,882 unsigned i; tree nb_iter; data_reference_p single_load, single_store; partition->kind = PKIND_NORMAL; partition->main_dr = NULL; partition->secondary_dr = NULL; - if (!flag_tree_loop_distribute_patterns) - return; - - /* Perform general partition disqualification for builtins. */ - nb_iter = number_of_exit_cond_executions (loop); - if (!nb_iter || nb_iter == chrec_dont_know) - return; - EXECUTE_IF_SET_IN_BITMAP (partition->stmts, 0, i, bi) { gimple stmt = RDG_STMT (rdg, i); if (gimple_has_volatile_ops (stmt)) ! return; /* If the stmt has uses outside of the loop fail. ??? 
If the stmt is generated in another partition that --- 864,881 unsigned i; tree nb_iter; data_reference_p single_load, single_store; + bool volatiles_p = false; partition->kind = PKIND_NORMAL; partition->main_dr = NULL; partition->secondary_dr = NULL; EXECUTE_IF_SET_IN_BITMAP (partition->stmts, 0, i, bi) { gimple stmt = RDG_STMT (rdg, i); if (gimple_has_volatile_ops (stmt)) ! volatiles_p = true; /* If the stmt has uses outside of the loop fail. ??? If the stmt is generated in another partition that *** classify_partition (loop_p loop, struct ***
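As a rough illustration of the ordering constraint described in this message (hand-written, not the pr56034.c testcase and not compiler output), consider a loop that both zeroes an array and accumulates a sum that is used after the loop. If such a loop is distributed, the partition carrying the reduction has to come last, because its result is what the code after the loop reads. The function names are made up, and the two versions are only equivalent under the stated assumption that `a` and `b` do not partially overlap.

```c
#include <stddef.h>
#include <string.h>

/* Original loop: a memset-like store plus a scalar reduction whose
   result (sum) is used outside the loop.  */
static int
clear_and_sum (int *a, const int *b, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      a[i] = 0;
      sum += b[i];
    }
  return sum;
}

/* Hand-distributed equivalent, assuming A and B do not partially
   overlap: the store partition becomes a memset, and the reduction
   partition stays a loop and is emitted last.  */
static int
clear_and_sum_distributed (int *a, const int *b, size_t n)
{
  memset (a, 0, n * sizeof *a);   /* builtin (memset) partition */

  int sum = 0;                    /* reduction partition, last */
  for (size_t i = 0; i < n; i++)
    sum += b[i];
  return sum;
}
```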
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
LGTM! Thanks for fixing this. Bill On Mon, 2013-01-28 at 15:25 +0100, Jakub Jelinek wrote: > Hi! > > gimple_expand_builtin_pow last two optimizations rely on earlier > optimizations in the same function to be performed, e.g. > folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only > correct for c which isn't an integer (otherwise the sqrt(x) factor would > need to be skipped), but they actually do not check this. > E.g. the pow (x, n) where n is integer is optimized only if: > && ((n >= -1 && n <= 2) > || (flag_unsafe_math_optimizations > && optimize_insn_for_speed_p () > && powi_cost (n) <= POWI_MAX_MULTS))) > and as in the testcase the function is called, it isn't optimized and > we fall through till the above mentioned optimization which blindly assumes > that c isn't an integer. > > Fixed by both checking that c isn't an integer (and for the last > optimization also that 2c isn't an integer), and also not doing the > -> sqrt(x) * powi(x, n / 2) resp. 1.0 / sqrt(x) * powi(x, abs(n) / 2) > optimization for -Os or cold functions, at least > __attribute__((cold)) double > foo (double x, double n) > { > return __builtin_pow (x, -1.5); > } > is smaller when expanded as pow call both on x86_64 and on powerpc (with > -Os -ffast-math). Even just the c*_is_int tests alone could be enough > to fix the bug, so if you say want to enable it for -Os even with c 1.5, > but not for negative values which add another operation, it can be adjusted. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2013-01-28 Jakub Jelinek > > PR tree-optimization/56125 > * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize > pow(x,c) into sqrt(x) * powi(x, n/2) or > 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when > optimizing for size. > Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or > 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an > integer. > > * gcc.dg/pr56125.c: New test. 
> > --- gcc/tree-ssa-math-opts.c.jj 2013-01-11 09:02:48.0 +0100 > +++ gcc/tree-ssa-math-opts.c 2013-01-28 10:56:40.105950483 +0100 > @@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i >HOST_WIDE_INT n; >tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, > powi_cbrt_x; >enum machine_mode mode; > - bool hw_sqrt_exists; > + bool hw_sqrt_exists, c_is_int, c2_is_int; > >/* If the exponent isn't a constant, there's nothing of interest > to be done. */ > @@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i >c = TREE_REAL_CST (arg1); >n = real_to_integer (&c); >real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0); > + c_is_int = real_identical (&c, &cint); > > - if (real_identical (&c, &cint) > + if (c_is_int >&& ((n >= -1 && n <= 2) > || (flag_unsafe_math_optimizations > && optimize_insn_for_speed_p () > @@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i >return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0); > } > > - /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into > + /* Optimize pow(x,c), where n = 2c for some nonzero integer n > + and c not an integer, into > > sqrt(x) * powi(x, n/2),n > 0; > 1.0 / (sqrt(x) * powi(x, abs(n/2))), n < 0. > @@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i >real_arithmetic (&c2, MULT_EXPR, &c, &dconst2); >n = real_to_integer (&c2); >real_from_integer (&cint, VOIDmode, n, n < 0 ? 
-1 : 0, 0); > + c2_is_int = real_identical (&c2, &cint); > >if (flag_unsafe_math_optimizations >&& sqrtfn > - && real_identical (&c2, &cint)) > + && c2_is_int > + && !c_is_int > + && optimize_function_for_speed_p (cfun)) > { >tree powi_x_ndiv2 = NULL_TREE; > > @@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i >&& cbrtfn >&& (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode)) >&& real_identical (&c2, &c) > + && !c2_is_int >&& optimize_function_for_speed_p (cfun) >&& powi_cost (n / 3) <= POWI_MAX_MULTS) > { > --- gcc/testsuite/gcc.dg/pr56125.c.jj 2013-01-28 11:00:04.359814742 +0100 > +++ gcc/testsuite/gcc.dg/pr56125.c2013-01-28 11:00:55.048532118 +0100 > @@ -0,0 +1,21 @@ > +/* PR tree-optimization/56125 */ > +/* { dg-do run } */ > +/* { dg-options "-O2 -ffast-math" } */ > + > +extern void abort (void); > +extern double fabs (double); > + > +__attribute__((cold)) double > +foo (double x, double n) > +{ > + double u = x / (n * n); > + return u; > +} > + > +int > +main () > +{ > + if (fabs (foo (29, 2) - 7.25) > 0.001) > +abort (); > + return 0; > +} > > Jakub >
Re: [PATCH][RFC] Avoid excessive BLOCK associations for locations
Hi, On Mon, 28 Jan 2013, Richard Biener wrote: > What's the point of switching to the outermost scope for unknown-BLOCK > locations? It's the most sensible block for code which isn't otherwise associated with a BLOCK. But the latter Shouldn't Happen, because conceptually all code runs in some (perhaps artificial) lookup context. So, it's actually not the inliner which should fix up stuff here, but rather ... > If we have a non-UNKNOWN_LOCATION, would a NULL BLOCK not be an error > anyway? ... whatever is producing such non-BLOCK code snippets. But see below. > Isn't inheriting the currently active scope much more useful (it > definitely is for UNKNOWN_LOCATIONs)? And yes, the most likely useful block for such code will be the "currently active" block. This is true only before code transformations of course; while optimizing you have the same problems as with locations, i.e. how to "merge" multiple different BLOCKs into one sensible one. Now, as an implementation optimization (to not bloat the location/block sets perhaps) you can define block==NULL <--> block==outermost-scope, and in case you do so, it's indeed the inliner that needs to map NULL blocks to the mapped outermost scope of the inlined function. I would guess that this is what historically was done, and when this optimization is still employed your patch is wrong. IMHO this optimization should be used. Ciao, Michael.
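The convention Michael describes can be sketched in a few lines. This is only an illustration with made-up types (`struct scope`, its `copy` field and `remap_scope` are hypothetical, not GCC's BLOCK representation or the tree-inline.c interfaces): if a NULL block is taken to mean "outermost scope of the function", then when a function body is copied into a caller, a NULL block has to become the copy of the callee's outermost scope rather than stay NULL.

```c
#include <stddef.h>

struct scope
{
  const char *name;
  struct scope *copy;   /* this scope's counterpart in the inlined body */
};

/* Map a statement's scope while inlining.  ORIG == NULL is treated as
   shorthand for the callee's outermost scope, so it is resolved before
   the lookup instead of being passed through unchanged.  */
static struct scope *
remap_scope (struct scope *orig, struct scope *callee_outermost)
{
  if (orig == NULL)
    orig = callee_outermost;
  return orig->copy;
}
```

Under this reading, leaving a NULL block as NULL after inlining would silently re-associate the statement with the caller's outermost scope, which seems to be the concern raised above.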
Re: [Ping] [Patch, AArch64] Set libgloss_dir for aarch64*-*-* targets
Ping~ On 01/10/13 16:20, Yufeng Zhang wrote: Hi, This patch updates the top-level configuration files to explicitly set libgloss_dir to aarch64 for aarch64*-*-* targets. OK to commit? Thanks, Yufeng 2013-01-10 Yufeng Zhang * configure.ac: Set libgloss_dir for the aarch64*-*-* targets. * configure: Regenerated. top-level-config.patch diff --git a/configure.ac b/configure.ac index 02720ee..5bdf1d0 100644 --- a/configure.ac +++ b/configure.ac @@ -759,6 +759,9 @@ case "${target}" in sh*-*-pe|mips*-*-pe|*arm-wince-pe) libgloss_dir=wince ;; + aarch64*-*-* ) +libgloss_dir=aarch64 +;; arm*-*-*) libgloss_dir=arm ;;
Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld
On Mon, Jan 28, 2013 at 4:07 AM, Michael Haubenwallner wrote: > But still curious if you've been able to reproduce the problem, > and why you didn't encounter this problem beforehand. As I mentioned before, because of --boot-ld-flags, with earlier libgcc and libstdc++ installed in that directory. > Yes, but (you've asked) here is this situation I don't want to configure > extra deplib-prefixes > for (remember bullfreeware is listed as provider for gcc-binaries): > > * bullfreeware's libiconv-1.13.1 and gettext-0.17 is installed in > /opt/freeware, > * /usr/lib/libintl.a is symlinked to /opt/freeware/lib (by bullfreeware's > RPM), > * /usr/lib/libiconv.a is the original AIX' one. > > Now, /usr/lib/libintl.a needs /opt/freeware/lib/libiconv.a[libiconv.so.2], > and it does > contain the correct RUNPATH. But subsequent binaries linking against > /usr/lib/libintl.a > don't (necessarily) know about the need to add /opt/freeware/lib as RUNPATH, > so these > binaries break with libiconv.so.2 not being found as member of > /usr/lib/libiconv.a, because > AIX unfortunately does stop its shared-library search at the first archive > filename found. > > This also is the main reason for my filename-based-shared-library-versioning > thing. Over the weekend, I successfully tested a different way to configure and build: all static libraries. If you build and privately install GMP, MPFR, MPC and LIBICONV configured as static libraries (--enable-static --disable-shared) and install in /prereq, then, combined with your patch to enable --static-libstdc++ --static-libgcc, the resulting GCC only depends on AIX libc.a -- no other shared libraries. Bull Freeware can distribute the shared versions of the libraries for other applications, but they do not need to be GCC dependencies. - David
Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
Hi Matt, > Could this patch, or perhaps the much smaller one I attached to bug 35294 be committed to the 4.7 branch? Yes. Done. > Also, could you close its duplicates, bugs 36798 and 36966? Sorry no. I do not actually own these PRs, so I cannot close them. :-( Cheers Nick
Re: [doc,committed] Fix missing ':' in inline asm example
Georg-Johann Lay wrote: > Applied as obvious: > > http://gcc.gnu.org/r195471 > > > * doc/extend.texi (Example of asm with clobbered asm reg): Fix > missing ':' in asm example. > The patch: --- trunk/gcc/doc/extend.texi 2013/01/25 17:55:09 195470 +++ trunk/gcc/doc/extend.texi 2013/01/25 18:11:53 195471 @@ -6062,7 +6062,7 @@ int *y = &x; int result; asm ("magic stuff accessing an 'int' pointed to by '%1'" -"=&d" (r) : "a" (y), "m" (*y)); + : "=&d" (r) : "a" (y), "m" (*y)); return result; @} @end smallexample
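For readers less familiar with the syntax being fixed: in GCC's extended asm the operand sections are separated by colons, in the order template : outputs : inputs : clobbers, so the output constraint has to follow a colon rather than sit directly after the template string. A minimal, architecture-independent sketch for GCC-compatible compilers (the empty template and the helper name `pass_through` are illustrative choices, not taken from extend.texi):

```c
/* The empty template emits no instructions.  After the first colon,
   "+r" asks for V in a general register as a read-write operand, so
   the compiler must assume the asm may have changed it.  */
static int
pass_through (int v)
{
  __asm__ ("" : "+r" (v));
  return v;
}
```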
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
On Mon, 28 Jan 2013, Jakub Jelinek wrote: 2013-01-28 Jakub Jelinek PR tree-optimization/56125 * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize pow(x,c) into sqrt(x) * powi(x, n/2) or 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when optimizing for size. Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an integer. * gcc.dg/pr56125.c: New test. Hello, is there an implicit -lm in the testsuite? The testcase now generates a library call to pow, like gcc-4.6. This is correct, but I am surprised this is considered better than leaving the original x/(n*n) unchanged... Should that be a different PR? -- Marc Glisse
Re: [v3] Fix management of non empty hash functor
On 10 January 2013 21:02, François Dumont wrote: > Hi > > Here is an other version of this patch. Indeed there were no need to > expose many stuff public. Inheriting from _Hash_code_base is fine, it is not > final and it deals with EBO itself. I only kept usage of > _Hashtable_ebo_helper when embedding H2 functor. As it is an extension we > could have impose it not to be final but it doesn't cost a lot to deal with > it. Finally I only needed a single friend declaration to get access to the > H2 part of _Hash_code_base. OK. > I didn't touch the default cache policy for the moment except reducing > constraints on the hash functor. I prefer to submit an other patch to change > when we cache or not depending on the hash functor expected performance. OK. The reduced constraints are good. Does this actually affect performance? In my tests it doesn't, so I assume we still need to change the caching decision to notice any performance improvements? (Do the performance benchmarks actually tell us anything useful? When I run them I get such varying results it doesn't seem to be reliable.) > I also took the time to replace some typedef expressions with using > ones. I really know what is the rule about using one or the other but I > remembered that Benjamin spent quite some time changing typedef in using so > I prefer to stick to this approach in this file, even if there are still > some typedef left. OK, that doesn't make any difference so isn't important which is used. > Tested under linux x86_64 normal and debug modes. > > 2013-01-10 François Dumont > > > * include/bits/hashtable_policy.h (_Local_iterator_base): Use > _Hashtable_ebo_helper to embed necessary functors into the > local_iterator when necessary. Pass information about functors Repeating "necessary" seems unnecessary here :) > involved in hash code by copy. 
> * include/bits/hashtable.h (__cache_default): Do not cache for > builtin integral types unless the hash functor is not noexcept > qualified or is not default constructible. Adapt static assertions > and local iteraror instantiations. ^^ "iteraror" + // When hash codes are not cached local iterator inherits from + // __hash_code_base above to compute node bucket index so it has to be + // default constructible. + static_assert(__if_hash_not_cached< + is_default_constructible<__hash_code_base>>::value, + "Cache the hash code or make functors involved in hash code" + " and bucket index computation default constructibles"); "constructible" not "constructibles" This is OK for trunk, but not 4.7
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
On Mon, Jan 28, 2013 at 04:41:31PM +0100, Marc Glisse wrote: > On Mon, 28 Jan 2013, Jakub Jelinek wrote: > > >2013-01-28 Jakub Jelinek > > > > PR tree-optimization/56125 > > * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize > > pow(x,c) into sqrt(x) * powi(x, n/2) or > > 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when > > optimizing for size. > > Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or > > 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an > > integer. > > > > * gcc.dg/pr56125.c: New test. > > is there an implicit -lm in the testsuite? Yes. > The testcase now generates a library call to pow, like gcc-4.6. This > is correct, but I am surprised this is considered better than > leaving the original x/(n*n) unchanged... Should that be a different > PR? The function in question is marked as cold, therefore it should be optimized for size. The call to pow is certainly shorter than the sqrt, multiplication, division etc. Jakub
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
On Mon, 28 Jan 2013, Jakub Jelinek wrote: On Mon, Jan 28, 2013 at 04:41:31PM +0100, Marc Glisse wrote: On Mon, 28 Jan 2013, Jakub Jelinek wrote: 2013-01-28 Jakub Jelinek PR tree-optimization/56125 * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize pow(x,c) into sqrt(x) * powi(x, n/2) or 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when optimizing for size. Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an integer. * gcc.dg/pr56125.c: New test. The testcase now generates a library call to pow, like gcc-4.6. This is correct, but I am surprised this is considered better than leaving the original x/(n*n) unchanged... Should that be a different PR? The function in question is marked as cold, therefore it should be optimized for size. The call to pow is certainly shorter than the sqrt, multiplication, division etc. There is no sqrt, x/(n*n) is just one mul and one div, whereas with the call I see one mul, 3 movs to prepare for the call, and the call. -- Marc Glisse
Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)
On Mon, Jan 28, 2013 at 05:07:10PM +0100, Marc Glisse wrote: > There is no sqrt, x/(n*n) is just one mul and one div, whereas with > the call I see one mul, 3 movs to prepare for the call, and the > call. Ah, you're talking about the checked in testcase, rather than the one I've mentioned in the description whether the speed guard is desirable there or not. In the checked in testcase, the problem with code size is far earlier than that, already during folding that double u = x / (n * n); is replaced by: double u = x * __builtin_pow (n, -2.0e+0); And this isn't something you can then size optimize in the pow folder on its own, return pow (n, -2.0e); will be supposedly shorter than return 1.0 / (n * n), the folding doesn't see that this is used in multiplication which could be perhaps changed into division instead. Jakub
Re: [PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)
On 01/28/2013 07:14 AM, Jakub Jelinek wrote: Hi! We ICE on the following testcase when using cselib, because cselib_lookup* is never called on the PREFETCH argument, and add_insn_mem_dependence calls cselib_subst_to_values on it, which assumes cselib_lookup* already happened on it earlier. For MEMs sched_analyze_2 calls cselib_lookup_from_insn, but for PREFETCHes it didn't. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-28 Jakub Jelinek PR rtl-optimization/56117 * sched-deps.c (sched_analyze_2) : For use_cselib call cselib_lookup_from_insn on the MEM before calling add_insn_mem_dependence. * gcc.dg/pr56117.c: New test. I'm assuming that we don't need the shallow_copy_rtx call and related code because in the PREFETCH case we generate a new MEM and the underlying address can be safely shared. Right? If that's true, OK. jeff
Re: [PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)
On Mon, Jan 28, 2013 at 09:39:00AM -0700, Jeff Law wrote: > I'm assuming that we don't need the shallow_copy_rtx call and > related code because in the PREFETCH case we generate a new MEM and > the underlying address can be safely shared. Right? AFAIK cselib_lookup* never modifies the rtx it is passed, shallow_copy_rtx in the MEM case is for: t = shallow_copy_rtx (dest); cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1, GET_MODE (t), insn); XEXP (t, 0) = cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t), insn); where we modify XEXP (t, 0) in the last assignment and don't want to change XEXP (dest, 0). Jakub
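The copy-before-modify idiom Jakub describes can be illustrated with a toy expression node (hypothetical `struct expr` and helpers, not GCC's rtx API): the shallow copy shares its operand with the original until the copy's operand is replaced, and the original node is left untouched.

```c
#include <stdlib.h>
#include <string.h>

struct expr
{
  int code;
  struct expr *op0;
};

/* Copy only the node itself; operands stay shared with the original.
   The caller owns the result and must free it.  */
static struct expr *
shallow_copy (const struct expr *e)
{
  struct expr *t = malloc (sizeof *t);
  if (t != NULL)
    memcpy (t, e, sizeof *t);
  return t;
}

/* Return a copy of MEM whose address operand is NEW_ADDR.  MEM itself
   is not modified, which is the whole point of copying first.  */
static struct expr *
with_substituted_address (const struct expr *mem, struct expr *new_addr)
{
  struct expr *t = shallow_copy (mem);
  if (t != NULL)
    t->op0 = new_addr;
  return t;
}
```

In the PREFETCH case discussed above nothing is written back into the original expression, which would explain why no copy is needed there.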
Re: [committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)
On 01/28/2013 07:09 AM, Jakub Jelinek wrote: Hi! As discussed in the PR, this is a safer variant of a fix for 4.8, where input_location during most optimization passes is set to DECL_SOURCE_LOCATION (current_function_decl) and various parts of the gimplifier e.g. during force_gimple_operand* may end up setting gimple_location to that. For 4.9, we should revert this and set input_location to UNKNOWN_LOCATION for the optimizers. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2013-01-28 Jakub Jelinek PR tree-optimization/56094 * gimplify.c (force_gimple_operand_1): Temporarily set input_location to UNKNOWN_LOCATION while gimplifying expr. * gcc.dg/pr56094.c: New test. Based on c#15, we should probably consider this a bit of a band-aid, right? Thus we'll install the band-aid, but keep the PR open pending a better solution for handling of input_location, correct? jeff +/* Verify no statements get the location of the foo () decl. */ +/* { dg-final { scan-tree-dump-not " : 65:1\\\]" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Jakub
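The ChangeLog entry describes a save/override/restore pattern around a global. The patch body is not quoted in this message, so the following is only a guess at the shape of the change, written with stand-in definitions (`location_t`, `input_location`, `UNKNOWN_LOCATION`, `gimplify_sketch` and `force_operand_sketch` are declared locally here and are not the real GCC declarations):

```c
typedef unsigned int location_t;
#define UNKNOWN_LOCATION ((location_t) 0)

/* Stand-in for the global that normally holds the location of the
   current function's declaration during optimization passes.  */
static location_t input_location = 65;
static location_t last_seen_location;

/* Stand-in for work that would otherwise pick up input_location and
   attach it to newly created statements.  */
static void
gimplify_sketch (void)
{
  last_seen_location = input_location;
}

/* Save the global, override it for the duration of the call, and
   restore it afterwards so callers see no change.  */
static void
force_operand_sketch (void)
{
  location_t saved_location = input_location;

  input_location = UNKNOWN_LOCATION;
  gimplify_sketch ();
  input_location = saved_location;
}
```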
Re: [committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)
On Mon, Jan 28, 2013 at 09:50:35AM -0700, Jeff Law wrote: > >2013-01-28 Jakub Jelinek > > > > PR tree-optimization/56094 > > * gimplify.c (force_gimple_operand_1): Temporarily set input_location > > to UNKNOWN_LOCATION while gimplifying expr. > > > > * gcc.dg/pr56094.c: New test. > Based on c#15, we should probably consider this a bit of a band-aid, > right? Yeah. > Thus we'll install the band-aid, but keep the PR open > pending a better solution for handling of input_location, correct? That's what I did. Jakub
[Patch,avr] Remove fixed-point MUL and DIV routines from libgcc build
This removes modules from libgcc that are already supported by avr-specific fixed-point implementation and avoids duplicate functions like __mulsa3. Ok for trunk? Johann libgcc/ * config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add: _mulQQ, _mulHQ, _mulHA, _mulSA, _mulUQQ, _mulUHQ, _mulUHA, _mulUSA, _divQQ, _divHQ, _divHA, _divSA, _divUQQ, _divUHQ, _divUHA, _divUSA. Index: config/avr/t-avr === --- config/avr/t-avr (revision 195301) +++ config/avr/t-avr (working copy) @@ -164,3 +164,17 @@ LIB2FUNCS_EXCLUDE += \ LIB2FUNCS_EXCLUDE += \ $(foreach func,_usadd _ussub _usneg,\ $(foreach mode,$(usat_modes),$(func_X))) + + +smul_modes = QQ HQ HA SA +umul_modes = UQQ UHQ UHA USA +sdiv_modes = QQ HQ HA SA +udiv_modes = UQQ UHQ UHA USA + +LIB2FUNCS_EXCLUDE += \ + $(foreach func,_mul,\ + $(foreach mode,$(smul_modes) $(umul_modes),$(func_X))) + +LIB2FUNCS_EXCLUDE += \ + $(foreach func,_div,\ + $(foreach mode,$(sdiv_modes) $(udiv_modes),$(func_X)))
Re: [Patch] Fix PR54814
On 01/27/2013 03:26 PM, Steven Bosscher wrote: On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote: The patch was originally worked out by Bernd Schmidt and fixed a problem introduced in http://gcc.gnu.org/r190252 Ironically, this revision fixes a reload problem on x86/x86_64 -- which doesn't use reload anymore now... Does this mean the fix is rejected for 4.8? No, just that it probably helps to add a RM to the CC list. FWIW, it seems to me that this patch should go into 4.8, because the bug is probably not limited to AVR. At this stage, I tend to be more conservative. However, it looks like Ulrich & Richi have taken a looksie and think the patch is fine. I'm certainly not going to object. jeff
Re: [Patch] Fix PR54814
On 01/27/2013 03:09 PM, Georg-Johann Lay wrote: If not, it'll probably need release manager approval before it can go in. Please attach your patch to PR54814 and attach PR 54814 to the 4.9 pending patches meta bug. Does this mean the fix is rejected for 4.8? Not necessarily. We're in a regression bugfix only stage; so regressions can obviously be fixed. If a change does not fix a regression, then it really needs the release manager's approval to go forward at this stage. Jeff
Re: [Patch] Fix PR54814
On 01/28/2013 06:55 AM, Ulrich Weigand wrote: Richard Biener wrote: On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher wrote: On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote: The patch was originally worked out by Bernd Schmidt and fixed a problem introduced in http://gcc.gnu.org/r190252 Ironically, this revision fixes a reload problem on x86/x86_64 -- which doesn't use reload anymore now... Does this mean the fix is rejected for 4.8? No, just that it probably helps to add a RM to the CC list. FWIW, it seems to me that this patch should go into 4.8, because the bug is probably not limited to AVR. Indeed, the fix also looks quite obvious though I know nothing about the code at all. Thus, ok from a RM perspective if a reload-affine person approves it. The patch was originally by Bernd, but FWIW it looks good to me as well. Now that I know this is a regression, I've looked at it more closely and it looks good to me too. Georg-Johann, please install this onto the trunk. Thanks, Jeff
Re: RFA: RL78: Allow SP to be used as a base register
> Please may I apply the patch below. It fixes the RL78 backend so that > the stack register can be used as a base address register. Yes, please. Thanks!
Re: [PATCH] Adding target rdos to GCC
Uros, That is intentional. The gthr-rdos.h file is part of libgcc. My intention was to first patch gcc, then update the patches for newlib, and finally libgcc. The gthr-rdos.h file would reference include-files part of newlib, so this is kind of circular. I also cannot define the thread model for RDOS unless I define this file. I see a couple of possible solutions: 1. Keep as is. You cannot build libgcc at the current stage anyway, and the bootstrap must be built without threading 2. Add an empty gthr-rdos.h file until libgcc is done 3. Remove the threading-model for now, and add it with libgcc instead. Regards, Leif Ekblad - Original Message - From: "Uros Bizjak" To: "Leif Ekblad" Cc: "Richard Biener" ; ; "H.J. Lu" ; "Jakub Jelinek" Sent: Monday, January 28, 2013 8:23 AM Subject: Re: [PATCH] Adding target rdos to GCC On Mon, Jan 28, 2013 at 7:50 AM, Leif Ekblad wrote: If the patch is ok, could some maintainer add it to trunk? There is no gthr-rdos.h file in your patch: *** gcc-4.8-20121230/config/gthr.m4 2012-10-15 15:10:30.0 +0200 --- gcc-work/config/gthr.m4 2013-01-07 10:14:04.620667900 +0100 *** *** 21,26 --- 21,27 tpf) thread_header=config/s390/gthr-tpf.h ;; vxworks) thread_header=config/gthr-vxworks.h ;; win32) thread_header=config/i386/gthr-win32.h ;; + rdos) thread_header=config/i386/gthr-rdos.h ;; This file should be part of libgcc, so it needs its own ChangeLog. Uros.
Re: [PATCH] Adding target rdos to GCC
On Mon, Jan 28, 2013 at 8:57 PM, Leif Ekblad wrote: > That is intentional. The gthr-rdos.h file is part of libgcc. My intention > was to first patch gcc, then update the patches for newlib, and finally > libgcc. The gthr-rdos.h file would reference include-files part of newlib, > so this is kind of circular. I also cannot define the thread model for RDOS > unless I define this file. > > I see a couple of possible solutions: > 1. Keep as is. You cannot build libgcc at the current stage anyway, and the > bootstrap must be built without threading > 2. Add an empty gthr-rdos.h file until libgcc is done > 3. Remove the threading-model for now, and add it with libgcc instead. I propose option 3. Is it enough to remove gthr.m4 change from the patch in this case? Uros.
Re: [PATCH] Adding target rdos to GCC
- Original Message - From: "Uros Bizjak" To: "Leif Ekblad" Cc: "Richard Biener" ; ; "H.J. Lu" ; "Jakub Jelinek" Sent: Monday, January 28, 2013 9:03 PM Subject: Re: [PATCH] Adding target rdos to GCC On Mon, Jan 28, 2013 at 8:57 PM, Leif Ekblad wrote: That is intentional. The gthr-rdos.h file is part of libgcc. My intention was to first patch gcc, then update the patches for newlib, and finally libgcc. The gthr-rdos.h file would reference include-files part of newlib, so this is kind of circular. I also cannot define the thread model for RDOS unless I define this file. I see a couple of possible solutions: 1. Keep as is. You cannot build libgcc at the current stage anyway, and the bootstrap must be built without threading 2. Add an empty gthr-rdos.h file until libgcc is done 3. Remove the threading-model for now, and add it with libgcc instead. I propose option 3. Is it enough to remove gthr.m4 change from the patch in this case? Uros. Yes, for all practical purposes. There is a reference to thread-file in config.gcc, when threading is enabled, which doesn't work for bootstrapping the compiler anyway. Regards, Leif Ekblad
[SH] PR 56121 - fix libgcc build for SH2A
Hi, This is the same patch that I attached in the PR. It fixes an ICE when building libgcc for the SH2A target. Tested on rev. 195493 with make -k check RUNTESTFLAGS="--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" ... comparing the test results against rev 193342 shows a few new failures, but they seem unrelated to this case. OK for trunk? Cheers, Oleg gcc/ChangeLog: PR target/56121 * config/sh/sh.md (bclr_m2a, bset_m2a, bst_m2a, bld_m2a, bldsign_m2a, bld_reg, *bld_regqi, band_m2a, bandreg_m2a, bor_m2a, borreg_m2a, bxor_m2a, bxorreg_m2a): Add satisfies_constraint_K03 condition. Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 195493) +++ gcc/config/sh/sh.md (working copy) @@ -13140,6 +13140,8 @@ }) ;; SH2A instructions for bitwise operations. +;; FIXME: Convert multiple instruction insns to insn_and_split. +;; FIXME: Use iterators to fold at least and,xor,or insn variations. ;; Clear a bit in a memory location. (define_insn "bclr_m2a" @@ -13148,7 +13150,7 @@ (not:QI (ashift:QI (const_int 1) (match_operand:QI 1 "const_int_operand" "K03,K03"))) (match_dup 0)))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bclr.b %1,%0 bclr.b %1,@(0,%t0)" @@ -13171,7 +13173,7 @@ (ashift:QI (const_int 1) (match_operand:QI 1 "const_int_operand" "K03,K03")) (match_dup 0)))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bset.b %1,%0 bset.b %1,@(0,%t0)" @@ -13198,7 +13200,7 @@ (ior:QI (ashift:QI (const_int 1) (match_dup 1)) (match_dup 0] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bst.b %1,%0 bst.b %1,@(0,%t0)" @@ -13211,7 +13213,7 @@ (match_operand:QI 0 "bitwise_memory_operand" "Sbw,Sbv") (const_int 1) (match_operand 1 "const_int_operand" "K03,K03")))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && 
satisfies_constraint_K03 (operands[1])" "@ bld.b %1,%0 bld.b %1,@(0,%t0)" @@ -13224,7 +13226,7 @@ (match_operand:QI 0 "bitwise_memory_operand" "Sbw,m") (const_int 1) (match_operand 1 "const_int_operand" "K03,K03")))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bld.b %1,%0 bld.b %1,@(0,%t0)" @@ -13236,7 +13238,7 @@ (zero_extract:SI (match_operand:SI 0 "arith_reg_operand" "r") (const_int 1) (match_operand 1 "const_int_operand" "K03")))] - "TARGET_SH2A" + "TARGET_SH2A && satisfies_constraint_K03 (operands[1])" "bld %1,%0") (define_insn "*bld_regqi" @@ -13244,7 +13246,7 @@ (zero_extract:SI (match_operand:QI 0 "arith_reg_operand" "r") (const_int 1) (match_operand 1 "const_int_operand" "K03")))] - "TARGET_SH2A" + "TARGET_SH2A && satisfies_constraint_K03 (operands[1])" "bld %1,%0") ;; Take logical and of a specified bit of memory with the T bit and @@ -13256,7 +13258,7 @@ (match_operand:QI 0 "bitwise_memory_operand" "Sbw,m") (const_int 1) (match_operand 1 "const_int_operand" "K03,K03"] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ band.b %1,%0 band.b %1,@(0,%t0)" @@ -13269,7 +13271,7 @@ (const_int 1) (match_operand 2 "const_int_operand" "K03,K03")) (match_operand:SI 3 "register_operand" "r,r")))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[2])" { static const char* alt[] = { @@ -13292,7 +13294,7 @@ (match_operand:QI 0 "bitwise_memory_operand" "Sbw,m") (const_int 1) (match_operand 1 "const_int_operand" "K03,K03"] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bor.b %1,%0 bor.b %1,@(0,%t0)" @@ -13305,7 +13307,7 @@ (const_int 1) (match_operand 2 "const_int_operand" "K03,K03")) (match_operand:SI 3 "register_operand" "=r,r")))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 
(operands[2])" { static const char* alt[] = { @@ -13328,7 +13330,7 @@ (match_operand:QI 0 "bitwise_memory_operand" "Sbw,m") (const_int 1) (match_operand 1 "const_int_operand" "K03,K03"] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])" "@ bxor.b %1,%0 bxor.b %1,@(0,%t0)" @@ -13341,7 +13343,7 @@ (const_int 1) (match_operand 2 "const_int_operand" "K03,K03")) (match_operand:SI 3 "register_operand" "=r,r")))] - "TARGET_SH2A && TARGET_BITOPS" + "TARGET_SH2A && TARGET_BITOPS && satisfies_constrain
Re: [PATCH] Adding target rdos to GCC
On Mon, Jan 28, 2013 at 9:14 PM, Leif Ekblad wrote: >>> That is intentional. The gthr-rdos.h file is part of libgcc. My intention >>> was to first patch gcc, then update the patches for newlib, and finally >>> libgcc. The gthr-rdos.h file would reference include-files part of >>> newlib, >>> so this is kind of circular. I also cannot define the thread model for >>> RDOS >>> unless I define this file. >>> >>> I see a couple of possible solutions: >>> 1. Keep as is. You cannot build libgcc at the current stage anyway, and >>> the >>> bootstrap must be built without threading >>> 2. Add an empty gthr-rdos.h file until libgcc is done >>> 3. Remove the threading-model for now, and add it with libgcc instead. >> >> >> I propose option 3. >> >> Is it enough to remove gthr.m4 change from the patch in this case? > > Yes, for all practical purposes. There is a reference to thread-file in > config.gcc, when threading is enabled, which doesn't work for bootstrapping > the compiler anyway. Thanks for pointing it, I have also removed this reference. Attached is the patch that has been committed to SVN. I have added missing licence headers to new files and clean whitespace a bit. 2013-01-28 Leif Ekblad * config.gcc (i[34567]86-*-rdos*, x86_64-*-rdos*): New targets. * config/i386/i386.h (TARGET_RDOS): New macro. (DEFAULT_LARGE_SECTION_THRESHOLD): New macro. * config/i386/i386.c (ix86_option_override_internal): For 64bit TARGET_RDOS, set ix86_cmodel to CM_MEDIUM_PIC and flag_pic to 1. * config/i386/i386.opt (mlarge-data-threshold): Initialize to DEFAULT_LARGE_SECTION_THRESHOLD. * config/i386/i386.md (R14_REG, R15_REG): New constants. * config/i386/rdos.h: New file. * config/i386/rdos64.h: New file. Thanks, Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 195515) +++ config/i386/i386.c (working copy) @@ -3235,10 +3235,12 @@ ix86_option_override_internal (bool main_args_p) DLL, and is essentially just as efficient as direct addressing. 
*/ if (TARGET_64BIT && DEFAULT_ABI == MS_ABI) ix86_cmodel = CM_SMALL_PIC, flag_pic = 1; + else if (TARGET_64BIT && TARGET_RDOS) + ix86_cmodel = CM_MEDIUM_PIC, flag_pic = 1; else if (TARGET_64BIT) ix86_cmodel = flag_pic ? CM_SMALL_PIC : CM_SMALL; else -ix86_cmodel = CM_32; + ix86_cmodel = CM_32; } if (TARGET_MACHO && ix86_asm_dialect == ASM_INTEL) { Index: config/i386/i386.h === --- config/i386/i386.h (revision 195515) +++ config/i386/i386.h (working copy) @@ -518,6 +518,9 @@ extern tree x86_mfence; #define MACHOPIC_INDIRECT 0 #define MACHOPIC_PURE 0 +/* For the RDOS */ +#define TARGET_RDOS 0 + /* For the Windows 64-bit ABI. */ #define TARGET_64BIT_MS_ABI (TARGET_64BIT && ix86_cfun_abi () == MS_ABI) @@ -2081,6 +2084,10 @@ do { \ asm (SECTION_OP "\n\t" \ "call " CRT_MKSTR(__USER_LABEL_PREFIX__) #FUNC "\n" \ TEXT_SECTION_ASM_OP); + +/* Default threshold for putting data in large sections + with x86-64 medium memory model */ +#define DEFAULT_LARGE_SECTION_THRESHOLD 65536 /* Which processor to tune code generation for. */ Index: config/i386/i386.md === --- config/i386/i386.md (revision 195515) +++ config/i386/i386.md (working copy) @@ -300,6 +300,8 @@ (R11_REG40) (R12_REG41) (R13_REG42) + (R14_REG43) + (R15_REG44) (XMM8_REG 45) (XMM9_REG 46) (XMM10_REG 47) Index: config/i386/i386.opt === --- config/i386/i386.opt(revision 195515) +++ config/i386/i386.opt(working copy) @@ -140,7 +140,7 @@ Target RejectNegative Joined UInteger Var(ix86_bra Branches are this expensive (1-5, arbitrary units) mlarge-data-threshold= -Target RejectNegative Joined UInteger Var(ix86_section_threshold) Init(65536) +Target RejectNegative Joined UInteger Var(ix86_section_threshold) Init(DEFAULT_LARGE_SECTION_THRESHOLD) Data greater than given threshold will go into .ldata section in x86-64 medium model mcmodel= Index: config/i386/rdos.h === --- config/i386/rdos.h (revision 0) +++ config/i386/rdos.h (working copy) @@ -0,0 +1,33 @@ +/* Definitions for RDOS on i386. 
+ Copyright (C) 2013 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as publi
Re: [Patch, fortran] PR56008 (and PR47517) [F03] wrong code with lhs-realloc on assignment with derived types having allocatable components
**ping**

On 23 January 2013 11:06, Tobias Burnus wrote:
> Paul Richard Thomas wrote:
>>
>> *** gfc_alloc_allocatable_for_assignment (gf
>> *** 8224,8229
>> --- 8250,8262
>> desc, tmp);
>> tmp = gfc_conv_descriptor_dtype (desc);
>> gfc_add_modify (&alloc_block, tmp, gfc_get_dtype (TREE_TYPE (desc)));
>> + if ((expr1->ts.type == BT_DERIVED)
>> + && expr1->ts.u.derived->attr.alloc_comp)
>> + {
>> + tmp = gfc_nullify_alloc_comp (expr1->ts.u.derived, desc,
>> + expr1->rank);
>> + gfc_add_expr_to_block (&alloc_block, tmp);
>> + }
>> alloc_expr = gfc_finish_block (&alloc_block);
>
>
> When glancing at the patch, I wondered whether it would be better to use
> CALLOC instead of MALLOC and avoid the nullification:
>
> /* Malloc expression. */
> gfc_init_block (&alloc_block);
> tmp = build_call_expr_loc (input_location,
> builtin_decl_explicit (BUILT_IN_MALLOC),
> 1, size2);
>
> On the other hand, the nullification is probably still required for REALLOC.
> If so, the question is whether CALLOC + nullify in the realloc branch - or
> malloc + nullify after the realloc/malloc branches is better. Hence, your
> version is probably fine.
>
> Sorry for not yet reviewing your patch.
>
> Tobias
>
> PS: Regarding "allocatable" and "memory leak": PR55603 has a similar issue.
> For scalars, gfortran never frees allocatable function results; that's
> independent of the LHS (allocatable, pointer, neither). Thus, if you are in
> the mood of fixing those kinds of bugs … (Actually, I am not even sure
> whether that's restricted to allocation, it might also occur with
> expressions like "a = f() + 5". Untested.)

--
The knack of flying is learning how to throw yourself at the ground and miss.
--Hitchhikers Guide to the Galaxy
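
As an illustration of the REALLOC point in the quoted review (a minimal C++ sketch, not gfortran's generated code): `Elem` and `grow` are hypothetical names standing in for a derived type with one allocatable component and for the resize step. The sketch only shows why a zeroing allocator would not be enough on its own: `realloc` leaves the newly added tail uninitialised, so pointer components there still need explicit nullification.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a derived type with one allocatable component.
struct Elem {
  int *comp;
};

// Resize ARR from OLD_N to NEW_N elements and nullify the pointer
// component of every element that realloc did not carry over.
// Returns nullptr (leaving ARR untouched) if the allocation fails.
Elem *grow(Elem *arr, std::size_t old_n, std::size_t new_n) {
  Elem *p = static_cast<Elem *>(std::realloc(arr, new_n * sizeof(Elem)));
  if (p == nullptr)
    return nullptr;
  for (std::size_t i = old_n; i < new_n; ++i)
    p[i].comp = nullptr;  // the tail is indeterminate after realloc
  return p;
}
```

Calling `grow(nullptr, 0, n)` behaves like a fresh allocation followed by nullification of all `n` elements, which is the "malloc + nullify after the realloc/malloc branches" shape mentioned above.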
Re: [v3] Fix management of non empty hash functor
Attached patch applied. 2013-01-28 François Dumont * include/bits/hashtable_policy.h (_Local_iterator_base): Use _Hashtable_ebo_helper to embed functors into the local_iterator when necessary. Pass information about functors involved in hash code by copy. * include/bits/hashtable.h (__cache_default): Do not cache for builtin integral types unless the hash functor is not noexcept qualified or is not default constructible. Adapt static assertions and local iterator instantiations. * include/debug/unordered_set (std::__debug::unordered_set<>::erase): Detect local iterators to invalidate using contained node rather than generating a dummy local_iterator instance. (std::__debug::unordered_multiset<>::erase): Likewise. * include/debug/unordered_map (std::__debug::unordered_map<>::erase): Likewise. (std::__debug::unordered_multimap<>::erase): Likewise. * testsuite/performance/23_containers/insert_erase/41975.cc: Test std::tr1 and std versions of unordered_set regardless of any macro. Add test on default cache behavior. * testsuite/performance/23_containers/insert/54075.cc: Likewise. * testsuite/23_containers/unordered_set/instantiation_neg.cc: Adapt line number. * testsuite/23_containers/unordered_set/ not_default_constructible_hash_neg.cc: New. * testsuite/23_containers/unordered_set/buckets/swap.cc: New. On 01/28/2013 04:42 PM, Jonathan Wakely wrote: On 10 January 2013 21:02, François Dumont wrote: Hi Here is an other version of this patch. Indeed there were no need to expose many stuff public. Inheriting from _Hash_code_base is fine, it is not final and it deals with EBO itself. I only kept usage of _Hashtable_ebo_helper when embedding H2 functor. As it is an extension we could have impose it not to be final but it doesn't cost a lot to deal with it. Finally I only needed a single friend declaration to get access to the H2 part of _Hash_code_base. OK. I didn't touch the default cache policy for the moment except reducing constraints on the hash functor. 
I would prefer to submit another patch that changes when we cache, depending on the expected performance of the hash functor.

OK. The reduced constraints are good. Does this actually affect performance? In my tests it doesn't, so I assume we still need to change the caching decision to notice any performance improvements?

No performance gain is planned with this patch, indeed. It just restores support for non-empty hash functors, which used to work with the previous implementation. No performance test is affected by the modification of the default cache behavior either, so it is not surprising that you noticed nothing.

(Do the performance benchmarks actually tell us anything useful? When I run them I get such varying results it doesn't seem to be reliable.)

The last time I ran the tests, they showed cases where not caching was better than caching. I have even added a benchmark on the unordered containers directly to show the performance of the default behavior. For the moment, for the Foo type used in 54075.cc, the default behavior is not the best one. But I will soon submit a patch for that, with a hash trait telling whether the functor is fast or not, as we already discussed.

François

Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h (revision 195515)
+++ include/bits/hashtable_policy.h (working copy)
@@ -1,6 +1,6 @@
 // Internal policy header for unordered_set and unordered_map -*- C++ -*-
-// Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
+// Copyright (C) 2010-2013 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.
This library is free // software; you can redistribute it and/or modify it under the @@ -202,7 +202,7 @@ template struct _Node_iterator_base { - typedef _Hash_node<_Value, _Cache_hash_code> __node_type; + using __node_type = _Hash_node<_Value, _Cache_hash_code>; __node_type* _M_cur; @@ -282,7 +282,7 @@ struct _Node_const_iterator : public _Node_iterator_base<_Value, __cache> { - private: +private: using __base_type = _Node_iterator_base<_Value, __cache>; using __node_type = typename __base_type::__node_type; @@ -941,6 +941,17 @@ }; /** + * Primary class template _Local_iterator_base. + * + * Base class for local iterators, used to iterate within a bucket + * but not between buckets. + */ + template +struct _Local_iterator_base; + + /** * Primary class template _Hash_code_base. * * Encapsulates two policy issues that aren't quite orthogonal. @@ -974,8 +985,8 @@ private _Hashtable_ebo_helper<1, _Hash> { private: - typedef _Hashtable_ebo_helper<0, _ExtractKey> _EboExtractKey; - typedef _Hashtable_ebo_helper<1, _Hash> _EboHash; + using __ebo_extract_k
Re: [v3] Fix management of non empty hash functor
On 28 January 2013 21:08, François Dumont wrote:
>>
>> (Do the performance benchmarks actually tell us anything useful?
>> When I run them I get such varying results it doesn't seem to be
>> reliable.)
>
> Last time I run the tests it was showing when not caching was better than
> caching.

Yes, I've definitely seen real advantage from not caching (but that was in my own tests, not the performance testsuite.)

> I have even added a bench on the unordered containers directly to
> show what are the performance of default behavior. For the moment, for the
> Foo type used in 54075.cc, the default behavior is not the best one. But I
> will submit a patch for that soon with a hash traits telling if it is fast
> or not, like we already talk about.

Great, thanks.
Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld
On Jan 28, 2013, at 7:07 AM, David Edelsohn wrote: > Over the weekend, I successfully tested a different way to configure > and build: all static libraries. Yeah, I think our build instructions for the dependent libraries should say to build them statically.
Re: [Patch, fortran] PR56008 (and PR47517) [F03] wrong code with lhs-realloc on assignment with derived types having allocatable components
Hi Paul, This patch is sufficiently straightforward that the ChangeLog entry describes it completely. The fix for both bugs lay in the nullification of the allocatable components of the newly (re)allocated array. I think this fix is OK for trunk, for the reasons you mentioned. I also think it is straightforward enough (bordering on the obvious, but only after having read it :-) that it does not carry too much risk of a regression. So yes, OK from my side, unless somebody speaks up really quickly. Thomas
[PATCH 0/2] Avoid duplicated instrumentation in Address Sanitizer
Hello,

As the subject suggests, the little patch-set that follows this message implements a basic optimization for the Address Sanitizer pass: in the same basic block, it avoids instrumenting an access to a memory region, if that same access has been instrumented before.

As we store instrumented accesses to memory regions in a hash table (that uses the new hash-table.h interface), I found it handy to be able to define the hash table entries as a type that has obvious constructors, rather than requiring the user of the hash table entry to write boilerplate code to do the initialization. So it was handy as well to be able to use the new operator to allocate memory for these entries (rather than using malloc + boilerplate initialization code). So I added support for having hash table entries managed by new/delete in hash-table.h. That's what the first patch does.

The second patch is where the real meat of the set is. I deliberately chose to start with the same conservative (and simple) approach used by asan@llvm which is to clear the hash table containing the already-instrumented memory accesses each time we start a new BB or each time we come across a function call.

It seems like we could be smarter than that, to allow this optimization to work in inter-BB cases where there is a dominator relationship between BBs containing duplicated memory accesses. But I thought this could be added later, after 4.8.

Below is the summary of the patches.

  [asan] Allow creating/deleting hash table entries with new/delete
  [asan] Avoid instrumenting duplicated memory access in the same basic block

 gcc/Makefile.in | 3 +-
 gcc/asan.c | 366 ++---
 gcc/hash-table.h | 16 +
 .../asan/no-redundant-instrumentation-1.c | 70
 4 files changed, 409 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c

--
Dodji
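
The conservative policy in this cover letter can be modelled in a few lines. This is a rough C++ sketch under stated assumptions, not the asan.c implementation: `Stmt`, `Block` and `count_instrumented` are hypothetical names, an access is keyed on (start, length, is_store), and every call is treated as one that clears the table.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical model of one statement: either a call, or a memory
// access described by (start, length, is_store).
struct Stmt {
  bool is_call;
  std::string start;
  std::size_t len;
  bool is_store;
};

using Block = std::vector<Stmt>;

// Count how many accesses would be instrumented when duplicates inside
// a block are skipped. The "seen" set is cleared at every block entry
// and after every call, mirroring the conservative policy.
std::size_t count_instrumented(const std::vector<Block> &blocks) {
  std::size_t n = 0;
  std::set<std::tuple<std::string, std::size_t, bool>> seen;
  for (const Block &bb : blocks) {
    seen.clear();  // new basic block: forget everything
    for (const Stmt &s : bb) {
      if (s.is_call) {
        seen.clear();  // the callee might free instrumented memory
        continue;
      }
      // insert().second is true only the first time this access is seen.
      if (seen.insert(std::make_tuple(s.start, s.len, s.is_store)).second)
        ++n;
    }
  }
  return n;
}
```

A dominator-aware variant, as suggested for after 4.8, would keep part of the set alive across block boundaries instead of clearing it unconditionally.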
Re: [Patch, Fortran] PR 54107: [4.8 Regression] Memory hog with abstract interface
Hi Janus, Or maybe wait for the fix for comment #4? Rather not (technically it's a separate issue, I guess). While the patch is rather large, I think it is OK. One request: Could you add a comment to gfc_sym_get_dummy_args explaining what the function does and under which conditions sym->formal is NULL, while sym->ts.interface->formal isn't? Regards Thomas
[PATCH 1/2] [asan] Allow creating/deleting hash table entries with new/delete
Hello,

The hash table type can handle creation and removal of entries with malloc/free. This patchlet adds support for using new/delete. It's useful for hash table entry types that have constructors (and/or destructors), to prevent the user from having to type boilerplate code to initialize them over and over again. This is used by the patch that follows this one.

gcc/
	* hash-table.h (struct typed_delete_remove): New type.
	(typed_delete_remove::remove): Implement this using the delete operator.
---
 gcc/hash-table.h | 16
 1 file changed, 16 insertions(+)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 206423d..884840c 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -235,6 +235,22 @@ typed_free_remove <Type>::remove (Type *p)
   free (p);
 }
 
+/* Helpful type for removing entries with the delete operator.  */
+
+template <typename Type>
+struct typed_delete_remove
+{
+  static inline void remove (Type *p);
+};
+
+/* Remove with delete.  */
+
+template <typename Type>
+inline void
+typed_delete_remove <Type>::remove (Type *p)
+{
+  delete p;
+}
 
 /* Helpful type for a no-op remove.  */
 
--
1.7.11.7

--
Dodji
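
To make the motivation concrete, here is a small self-contained C++ sketch of the two removal policies. It is an illustrative re-creation, not the GCC definitions: `free_remove`, `delete_remove`, `Entry` and `Slot` are hypothetical names. The point it shows is that a `delete`-based policy runs the entry's destructor, whereas a `free`-based one does not.

```cpp
#include <cstdlib>

// Simplified remove policies in the spirit of hash-table.h.
template <typename Type>
struct free_remove {
  static void remove(Type *p) { std::free(p); }  // no destructor runs
};

template <typename Type>
struct delete_remove {
  static void remove(Type *p) { delete p; }  // destructor runs
};

// Hypothetical entry type with a constructor and a destructor.
struct Entry {
  static int live;  // number of Entry objects currently alive
  int key;
  explicit Entry(int k) : key(k) { ++live; }
  ~Entry() { --live; }
};
int Entry::live = 0;

// Tiny owner that disposes of its entry through the chosen policy.
template <typename Type, template <typename> class Remove>
struct Slot {
  Type *entry;
  explicit Slot(Type *e) : entry(e) {}
  ~Slot() { Remove<Type>::remove(entry); }
};
```

With `Slot<Entry, delete_remove>`, an entry created by `new Entry(1)` is constructed and destroyed without any hand-written initialization or cleanup code at the call site.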
[PATCH 2/2] [asan] Avoid instrumenting duplicated memory access in the same basic block
Hello,

Like what Address Sanitizer does in LLVM, this patch avoids instrumenting duplicated memory accesses in the same basic block. The approach taken is very conservative, to keep the pass simple, for a start.

A memory access is considered to be a triplet made of an expression tree representing the beginning of the memory region that is accessed, an expression tree representing the length of that memory region, and a boolean that says whether that access is a load or a store.

The patch builds a hash table of the memory accesses that have been instrumented in the current basic block. Then it walks the gimple statements of the current basic block. For each statement, it tests if the memory regions it references have already been instrumented. If not, the statement is instrumented and each memory reference that is actually instrumented is added to the hash table.

When the patch crosses a function call that is not a built-in function that we ought to instrument, the hash table is cleared, because that function call can possibly, e.g., free some memory that was instrumented.

Likewise, when a new basic block is visited, the hash table is cleared. I guess we could be smarter than just unconditionally clearing the hash table in this latter case, but this is what asan@llvm does, and for now, I thought starting in a conservative manner might have some value.

The hash table is destroyed at the end of the pass.

Bootstrapped and tested against trunk on x86-64-unknown-linux-gnu.

gcc/
	* Makefile.in (asan.o): Add new dependency on hash-table.h
	* asan.c (struct mem_ref, struct mem_ref_hasher): New types.
	(get_mem_ref_hash_table, has_stmt_been_instrumented_p)
	(update_mem_ref_hash_table, get_mem_ref_of_assignment): New functions.
	(get_mem_refs_of_builtin_call): Extract from instrument_builtin_call and tweak a little bit to make it fit with the new signature.
	(instrument_builtin_call): Use the new get_mem_refs_of_builtin_call.
	(maybe_instrument_assignment): Renamed instrument_assignment into this, and change it to advance the iterator when instrumentation actually happened and return true in that case. This makes it homogeneous with maybe_instrument_call, and thus gives callers a chance to be more 'regular'.
	(transform_statements): Clear the memory reference hash table whenever we enter a new BB, when we cross a function call, or when we are done transforming statements. Use maybe_instrument_assignment instead of instrument_assignment. No more need to special case maybe_instrument_assignment and advance the iterator after calling it; it's now handled just like maybe_instrument_call. Update comment.

gcc/testsuite/
	* c-c++-common/asan/no-redundant-instrumentation-1.c: New test.
---
 gcc/Makefile.in | 3 +-
 gcc/asan.c | 368 ++---
 .../asan/no-redundant-instrumentation-1.c | 70
 3 files changed, 395 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6fe6345..8f7d122 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2226,7 +2226,8 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
 	output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
 	tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
-	$(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
+	$(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h \
+	$(HASH_TABLE_H)
 tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(TREE_INLINE_H) \
 	$(GIMPLE_H) $(DIAGNOSTIC_H) langhooks.h \
 	$(TM_H) coretypes.h $(TREE_DUMP_H) $(TREE_PASS_H) $(CGRAPH_H) $(GGC_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index f05e36c..f9a832f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.
If not see #include "output.h" #include "tm_p.h" #include "langhooks.h" +#include "hash-table.h" /* AddressSanitizer finds out-of-bounds and use-after-free bugs with <2x slowdown on average. @@ -212,6 +213,159 @@ alias_set_type asan_shadow_set = -1; alias set is used for all shadow memory accesses. */ static GTY(()) tree shadow_ptr_types[2]; +/* Hashtable support for memory references used by gimple + statements. */ + +/* This type represents a reference to a memory region. */ +struct __attribute__ ((visibility ("hidden"))) mem_ref +{ + /* The expression of the begining of the memory region. */ + tree start; + /* The expression representing the length of the region. */ + tree len; + /* This is true iff the memory reference is a store. */ + bool is_store; + + /* Constructors. */ + mem_ref () : start (NULL_TREE), len (NULL_T
Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld
On Mon, Jan 28, 2013 at 4:17 PM, Mike Stump wrote:
> On Jan 28, 2013, at 7:07 AM, David Edelsohn wrote:
>> Over the weekend, I successfully tested a different way to configure
>> and build: all static libraries.
>
> Yeah, I think our build instructions for the dependent libraries should say
> to build them statically.

A number of GCC developers already do this.

I previously had a problem on AIX with an earlier release of GCC that was built with Graphite because one of the dependent libraries used C++, but GCC was not linked with libstdc++ at the time. The only way to break the C++ dependency of the library separate from GCC was through shared libraries.

If one can link GCC against static libraries, it definitely simplifies things and avoids potential conflicts with multiple versions of GCC installed.

- David
Re: [PATCH] Adding target rdos to GCC
That looks good. Thanks, Uros. Leif - Original Message - From: "Uros Bizjak" To: "Leif Ekblad" Cc: "Richard Biener" ; ; "H.J. Lu" ; "Jakub Jelinek" Sent: Monday, January 28, 2013 9:45 PM Subject: Re: [PATCH] Adding target rdos to GCC On Mon, Jan 28, 2013 at 9:14 PM, Leif Ekblad wrote: That is intentional. The gthr-rdos.h file is part of libgcc. My intention was to first patch gcc, then update the patches for newlib, and finally libgcc. The gthr-rdos.h file would reference include-files part of newlib, so this is kind of circular. I also cannot define the thread model for RDOS unless I define this file. I see a couple of possible solutions: 1. Keep as is. You cannot build libgcc at the current stage anyway, and the bootstrap must be built without threading 2. Add an empty gthr-rdos.h file until libgcc is done 3. Remove the threading-model for now, and add it with libgcc instead. I propose option 3. Is it enough to remove gthr.m4 change from the patch in this case? Yes, for all practical purposes. There is a reference to thread-file in config.gcc, when threading is enabled, which doesn't work for bootstrapping the compiler anyway. Thanks for pointing it, I have also removed this reference. Attached is the patch that has been committed to SVN. I have added missing licence headers to new files and clean whitespace a bit. 2013-01-28 Leif Ekblad * config.gcc (i[34567]86-*-rdos*, x86_64-*-rdos*): New targets. * config/i386/i386.h (TARGET_RDOS): New macro. (DEFAULT_LARGE_SECTION_THRESHOLD): New macro. * config/i386/i386.c (ix86_option_override_internal): For 64bit TARGET_RDOS, set ix86_cmodel to CM_MEDIUM_PIC and flag_pic to 1. * config/i386/i386.opt (mlarge-data-threshold): Initialize to DEFAULT_LARGE_SECTION_THRESHOLD. * config/i386/i386.md (R14_REG, R15_REG): New constants. * config/i386/rdos.h: New file. * config/i386/rdos64.h: New file. Thanks, Uros.
Re: [PATCH 1/2] [asan] Allow creating/deleting hash table entries with new/delete
On 1/28/13, Dodji Seketeli wrote: > Hello, > > The hash table type can handle creation and removal of entries with > malloc/free. This patchlet adds support for using new/delete. It's > useful for hash table entry types that have constructors (and/or > destructors), to prevent the user from having to type boilerplate code > to initialize them over and over again. This is used by the patch that > follows this one. Looks good to me. > > gcc/ > > * hash-table.h (struct typed_delete_remove): New type. > (typed_delete_remove::remove): Implement this using the delete > operator. > --- > gcc/hash-table.h | 16 > 1 file changed, 16 insertions(+) > > diff --git a/gcc/hash-table.h b/gcc/hash-table.h > index 206423d..884840c 100644 > --- a/gcc/hash-table.h > +++ b/gcc/hash-table.h > @@ -235,6 +235,22 @@ typed_free_remove ::remove (Type *p) >free (p); > } > > +/* Helpful type for removing entries with the delete operator. */ > + > +template > +struct typed_delete_remove > +{ > + static inline void remove (Type *p); > +}; > + > +/* Remove with delete. */ > + > +template > +inline void > +typed_delete_remove ::remove (Type *p) > +{ > + delete p; > +} > > /* Helpful type for a no-op remove. */ > > -- > 1.7.11.7 > > > > -- > Dodji > -- Lawrence Crowl
Re: question about section 10.12
> From: Kenneth Zadeck
> Date: Mon, 28 Jan 2013 02:02:41 +0100

> this looks good to me. does your patch also address the vec_concat
> issue that marc raised?

You mean the issue being "same thing there"? I can confirm that the same issue (which I've stumbled upon) is there, i.e. it similarly applies to scalars. But nope, there's no cross-reference, so the effective wording needs to be added there too. I also noticed the parameter/s misleadingly being keyed to "vec" and fill-paragraphed the paragraph. Something like this seems obvious:

(Oops, noticed gcc@ was in CC, changing to gcc-patches@.)

	* doc/rtl.texi (vec_concat, vec_duplicate): Mention that scalars are valid operands.

Index: doc/rtl.texi
===
--- doc/rtl.texi (revision 195514)
+++ doc/rtl.texi (working copy)
@@ -2627,17 +2627,18 @@ The result mode @var{m} is either the su
 with that element submode (if multiple subparts are selected).

 @findex vec_concat
-@item (vec_concat:@var{m} @var{vec1} @var{vec2})
+@item (vec_concat:@var{m} @var{x1} @var{x2})
 Describes a vector concat operation. The result is a concatenation of the
-vectors @var{vec1} and @var{vec2}; its length is the sum of the lengths of
-the two inputs.
+vectors or scalars @var{x1} and @var{x2}; its length is the sum of the
+lengths of the two inputs.

 @findex vec_duplicate
-@item (vec_duplicate:@var{m} @var{vec})
-This operation converts a small vector into a larger one by duplicating the
-input values. The output vector mode must have the same submodes as the
-input vector mode, and the number of output parts must be an integer multiple
-of the number of input parts.
+@item (vec_duplicate:@var{m} @var{x})
+This operation converts a scalar into a vector or a small vector into a
+larger one by duplicating the input values. The output vector mode must have
+the same submodes as the input vector mode or the scalar modes, and the
+number of output parts must be an integer multiple of the number of input
+parts.
 @end table

brgds, H-P
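
The length rules in the proposed wording can be checked with a toy model. This is a C++ sketch for illustration only, not GCC code: `Vec`, `vec_concat` and `vec_duplicate` here are plain functions over `std::vector<int>`, a scalar is modelled as a one-element vector, and the element order chosen for a duplicated vector input (a b a b) is an assumption that the quoted text does not spell out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Vec = std::vector<int>;  // a scalar is modelled as a one-element Vec

// vec_concat: the result length is the sum of the input lengths,
// whether the inputs are vectors or scalars.
Vec vec_concat(const Vec &x1, const Vec &x2) {
  Vec r(x1);
  r.insert(r.end(), x2.begin(), x2.end());
  return r;
}

// vec_duplicate: repeat the input until OUT_PARTS elements are produced.
// OUT_PARTS must be an integer multiple of the number of input parts.
Vec vec_duplicate(const Vec &x, std::size_t out_parts) {
  if (x.empty() || out_parts % x.size() != 0)
    throw std::invalid_argument("output parts not a multiple of input parts");
  Vec r;
  r.reserve(out_parts);
  for (std::size_t i = 0; i < out_parts; ++i)
    r.push_back(x[i % x.size()]);
  return r;
}
```

In this model a scalar operand is simply the one-part case of the same rule, which is the reading the revised wording aims to make explicit.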
Re: [SH] PR 56121 - fix libgcc build for SH2A
Oleg Endo wrote: > This is the same patch that I attached in the PR. > It fixes an ICE when building libgcc for the SH2A target. > Tested on rev. 195493 with > make -k check RUNTESTFLAGS="--target_board=sh-sim > \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" > > ... comparing the test results against rev 193342 shows a few new > failures, but they seem unrelated to this case. > > OK for trunk? OK. Regards, kaz
[4.9 PATCH, alpha]: Switch alpha to LRA
Hello! 2013-01-28 Uros Bizjak * config/alpha/alpha.c (TARGET_LRA_P): New define. Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu. OK for 4.9? [1] http://gcc.gnu.org/ml/gcc-testresults/2013-01/msg02998.html Uros. Index: config/alpha/alpha.c === --- config/alpha/alpha.c(revision 195502) +++ config/alpha/alpha.c(working copy) @@ -9872,6 +9872,9 @@ #undef TARGET_LEGITIMATE_ADDRESS_P #define TARGET_LEGITIMATE_ADDRESS_P alpha_legitimate_address_p +#undef TARGET_LRA_P +#define TARGET_LRA_P hook_bool_void_true + #undef TARGET_CONDITIONAL_REGISTER_USAGE #define TARGET_CONDITIONAL_REGISTER_USAGE alpha_conditional_register_usage
Re: [4.9 PATCH, alpha]: Switch alpha to LRA
On 01/28/2013 03:14 PM, Uros Bizjak wrote:
> 2013-01-28  Uros Bizjak
>
>         * config/alpha/alpha.c (TARGET_LRA_P): New define.
>
> Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu.
>
> OK for 4.9?

Yep.

r~
gccgo patch committed: Fix initialization order bug
This patch to the Go frontend fixes a bug determining the
initialization order.  I've committed a test case for the bug to the
master Go repository:
https://code.google.com/p/go/source/detail?spec=svn921e53d4863c8827756c0e7b228ab210441e4032&r=c3155f9f1bb64c8b3adb2b6f5527895d51f83b74
The old algorithm was overly clever and frankly I don't know what I was
thinking.  The new algorithm is simpler and probably more efficient.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r ee18ff1199b6 go/expressions.h
--- a/go/expressions.h  Fri Jan 25 16:13:13 2013 -0800
+++ b/go/expressions.h  Mon Jan 28 16:23:26 2013 -0800
@@ -983,6 +983,11 @@
       statement_(statement), is_lvalue_(false)
   { }

+  // The temporary that this expression refers to.
+  Temporary_statement*
+  statement() const
+  { return this->statement_; }
+
   // Indicate that this reference appears on the left hand side of an
   // assignment statement.
   void
diff -r ee18ff1199b6 go/gogo-tree.cc
--- a/go/gogo-tree.cc   Fri Jan 25 16:13:13 2013 -0800
+++ b/go/gogo-tree.cc   Mon Jan 28 16:23:26 2013 -0800
@@ -499,7 +499,7 @@
   // A hash table we use to avoid looping.  The index is the name of a
   // named object.  We only look through objects defined in this
   // package.
-  typedef Unordered_set(std::string) Seen_objects;
+  typedef Unordered_set(const void*) Seen_objects;

   Find_var(Named_object* var, Seen_objects* seen_objects)
     : Traverse(traverse_expressions),
@@ -547,7 +547,7 @@
       if (init != NULL)
         {
           std::pair ins =
-            this->seen_objects_->insert(v->name());
+            this->seen_objects_->insert(v);
           if (ins.second)
             {
               // This is the first time we have seen this name.
@@ -568,7 +568,7 @@
       if (f->is_function() && f->package() == NULL)
         {
           std::pair ins =
-            this->seen_objects_->insert(f->name());
+            this->seen_objects_->insert(f);
           if (ins.second)
             {
               // This is the first time we have seen this name.
@@ -578,6 +578,25 @@
         }
     }

+  Temporary_reference_expression* tre = e->temporary_reference_expression();
+  if (tre != NULL)
+    {
+      Temporary_statement* ts = tre->statement();
+      Expression* init = ts->init();
+      if (init != NULL)
+        {
+          std::pair ins =
+            this->seen_objects_->insert(ts);
+          if (ins.second)
+            {
+              // This is the first time we have seen this temporary
+              // statement.
+              if (Expression::traverse(&init, this) == TRAVERSE_EXIT)
+                return TRAVERSE_EXIT;
+            }
+        }
+    }
+
   return TRAVERSE_CONTINUE;
 }

@@ -613,11 +632,11 @@
 {
  public:
   Var_init()
-    : var_(NULL), init_(NULL_TREE), waiting_(0)
+    : var_(NULL), init_(NULL_TREE)
   { }

   Var_init(Named_object* var, tree init)
-    : var_(var), init_(init), waiting_(0)
+    : var_(var), init_(init)
   { }

   // Return the variable.
@@ -630,24 +649,11 @@
   init() const
   { return this->init_; }

-  // Return the number of variables waiting for this one to be
-  // initialized.
-  size_t
-  waiting() const
-  { return this->waiting_; }
-
-  // Increment the number waiting.
-  void
-  increment_waiting()
-  { ++this->waiting_; }
-
  private:
   // The variable being initialized.
   Named_object* var_;
   // The initialization expression to run.
   tree init_;
-  // The number of variables which are waiting for this one.
-  size_t waiting_;
 };

 typedef std::list Var_inits;
@@ -660,6 +666,10 @@
 static void
 sort_var_inits(Gogo* gogo, Var_inits* var_inits)
 {
+  typedef std::pair No_no;
+  typedef std::map Cache;
+  Cache cache;
+
   Var_inits ready;
   while (!var_inits->empty())
     {
@@ -670,23 +680,30 @@
       Named_object* dep = gogo->var_depends_on(var->var_value());

       // Start walking through the list to see which variables VAR
-      // needs to wait for.  We can skip P1->WAITING variables--that
-      // is the number we've already checked.
+      // needs to wait for.
       Var_inits::iterator p2 = p1;
       ++p2;
-      for (size_t i = p1->waiting(); i > 0; --i)
-        ++p2;

       for (; p2 != var_inits->end(); ++p2)
         {
           Named_object* p2var = p2->var();
-          if (expression_requires(init, preinit, dep, p2var))
+          No_no key(var, p2var);
+          std::pair ins =
+            cache.insert(std::make_pair(key, false));
+          if (ins.second)
+            ins.first->second = expression_requires(init, preinit, dep, p2var);
+          if (ins.first->second)
             {
               // Check for cycles.
-              if (expression_requires(p2var->var_value()->init(),
+              key = std::make_pair(p2var, var);
+              ins = cache.insert(std::make_pair(key, false));
+              if (ins.second)
+                ins.first->second =
+                  expression_requires(p2var->var_value()->init(),
                                       p2var->var_value()->preinit(),
                                       gogo->var_depends_on(p2var->var_value()),
-                                      var))
+                                      var);
+              if (ins.first->second)
                 {
                   error_at(var->location(),
                            ("initialization expressions for %qs and "
@@ -700,12 +717,8 @@
               else
                 {
                   /
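The new sort_var_inits hunk remembers the answer to each pairwise "does A require B" question in a map, so that the expensive walk runs at most once per ordered pair.  A standalone sketch of that insert-then-fill idiom follows.  The template arguments of the typedefs were lost in this archive copy, so the types here are my own guess, and Node, expensive_requires, expensive_calls and cached_requires are hypothetical stand-ins rather than names from the Go frontend.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for Named_object; only identity matters here.
struct Node
{
  int id;
};

typedef std::pair<const Node*, const Node*> Node_pair;
typedef std::map<Node_pair, bool> Cache;

// Counts how often the expensive predicate really runs.
int expensive_calls = 0;

// Hypothetical predicate in place of expression_requires: in this toy
// model A requires B whenever A has the larger id.
bool expensive_requires(const Node* a, const Node* b)
{
  ++expensive_calls;
  return a->id > b->id;
}

// Memoized query.  insert() leaves an existing entry untouched and
// reports through .second whether a new entry was created, so the
// predicate only runs the first time a given ordered pair is seen.
bool cached_requires(Cache& cache, const Node* a, const Node* b)
{
  std::pair<Cache::iterator, bool> ins =
    cache.insert(std::make_pair(Node_pair(a, b), false));
  if (ins.second)
    ins.first->second = expensive_requires(a, b);
  return ins.first->second;
}
```

Note that the key is an ordered pair: the cycle check in the patch asks the reverse question (p2var, var) and stores it under its own key, which the sketch reproduces by treating (a, b) and (b, a) as distinct entries.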
Re: [4.9 PATCH, alpha]: Switch alpha to LRA
On 01/28/2013 04:14 PM, Uros Bizjak wrote:
> Hello!
>
> 2013-01-28  Uros Bizjak
>
>         * config/alpha/alpha.c (TARGET_LRA_P): New define.
>
> Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu.
>
> OK for 4.9?
>
> [1] http://gcc.gnu.org/ml/gcc-testresults/2013-01/msg02998.html

Can you attach this to PR 55996, the 4.9 pending patches metabug.

Thanks,
jeff
Re: [Patch,avr] Remove fixed-point MUL and DIV routines from libgcc build
2013/1/28 Georg-Johann Lay :
> This removes modules from libgcc that are already supported by avr-specific
> fixed-point implementation and avoids duplicate functions like __mulsa3.
>
> Ok for trunk?
>
> Johann
>
>
> libgcc/
>         * config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add:
>         _mulQQ, _mulHQ, _mulHA, _mulSA,
>         _mulUQQ, _mulUHQ, _mulUHA, _mulUSA,
>         _divQQ, _divHQ, _divHA, _divSA,
>         _divUQQ, _divUHQ, _divUHA, _divUSA.

Approved.

Denis.
Ping: [Patch] PR56064: Fold VIEW_CONVERT_EXPR with FIXED_CST
Ping #1 for

http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01053.html

Release Manager approval is here:

http://gcc.gnu.org/ml/gcc/2013-01/msg00222.html

This is a tentative patch as discussed in

http://gcc.gnu.org/ml/gcc/2013-01/msg00187.html

fold-const.c gets two new functions, native_encode_fixed and
native_interpret_fixed.  Code common with the integer case is factored
out and moved to the new constructor-like function
double_int::from_buffer.

The code bootstraps fine on x86-linux-gnu and I have test coverage from
avr-unknown-none.

Ok to apply?

There are less intrusive solutions that only handle the int <-> fixed
cases, for example fold-const.c:fold_view_convert_expr() could test for
these cases and use double_int directly without serializing /
deserializing through a memory buffer.

Johann

        PR tree-optimization/56064
        * fixed-value.c (const_fixed_from_double_int): New function.
        * fixed-value.h (const_fixed_from_double_int): New prototype.
        * fold-const.c (native_interpret_fixed): New static function.
        (native_interpret_expr) : Use it.
        (can_native_interpret_type_p) : Return true.
        (native_encode_fixed): New static function.
        (native_encode_expr) : Use it.
        (native_interpret_int): Move double_int worker code to...
        * double-int.c (double_int::from_buffer): ...this new static method.
        * double-int.h (double_int::from_buffer): Prototype it.

testsuite/
        PR tree-optimization/56064
        * gcc.dg/fixed-point/view-convert.c: New test.

Index: fixed-value.c
===
--- fixed-value.c       (revision 195301)
+++ fixed-value.c       (working copy)
@@ -81,6 +81,24 @@ check_real_for_fixed_mode (REAL_VALUE_TY
   return FIXED_OK;
 }

+
+/* Construct a CONST_FIXED from a bit payload and machine mode MODE.
+   The bits in PAYLOAD are used verbatim.  */
+
+FIXED_VALUE_TYPE
+const_fixed_from_double_int (double_int payload, enum machine_mode mode)
+{
+  FIXED_VALUE_TYPE value;
+
+  gcc_assert (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT);
+
+  value.data = payload;
+  value.mode = mode;
+
+  return value;
+}
+
+
 /* Initialize from a decimal or hexadecimal string.  */

 void
Index: fixed-value.h
===
--- fixed-value.h       (revision 195301)
+++ fixed-value.h       (working copy)
@@ -49,6 +49,11 @@ extern FIXED_VALUE_TYPE fconst1[MAX_FCON
   const_fixed_from_fixed_value (r, m)
 extern rtx const_fixed_from_fixed_value (FIXED_VALUE_TYPE, enum machine_mode);

+/* Construct a CONST_FIXED from a bit payload and machine mode MODE.
+   The bits in PAYLOAD are used verbatim.  */
+extern FIXED_VALUE_TYPE const_fixed_from_double_int (double_int,
+                                                     enum machine_mode);
+
 /* Initialize from a decimal or hexadecimal string.  */
 extern void fixed_from_string (FIXED_VALUE_TYPE *, const char *,
                                enum machine_mode);
Index: fold-const.c
===
--- fold-const.c        (revision 195301)
+++ fold-const.c        (working copy)
@@ -7200,6 +7200,36 @@ native_encode_int (const_tree expr, unsi
 }


+/* Subroutine of native_encode_expr.  Encode the FIXED_CST
+   specified by EXPR into the buffer PTR of length LEN bytes.
+   Return the number of bytes placed in the buffer, or zero
+   upon failure.  */
+
+static int
+native_encode_fixed (const_tree expr, unsigned char *ptr, int len)
+{
+  tree type = TREE_TYPE (expr);
+  enum machine_mode mode = TYPE_MODE (type);
+  int total_bytes = GET_MODE_SIZE (mode);
+  FIXED_VALUE_TYPE value;
+  tree i_value, i_type;
+
+  if (total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
+    return 0;
+
+  i_type = lang_hooks.types.type_for_size (GET_MODE_BITSIZE (mode), 1);
+
+  if (NULL_TREE == i_type
+      || TYPE_PRECISION (i_type) != total_bytes)
+    return 0;
+
+  value = TREE_FIXED_CST (expr);
+  i_value = double_int_to_tree (i_type, value.data);
+
+  return native_encode_int (i_value, ptr, len);
+}
+
+
 /* Subroutine of native_encode_expr.  Encode the REAL_CST
    specified by EXPR into the buffer PTR of length LEN bytes.
    Return the number of bytes placed in the buffer, or zero
@@ -7345,6 +7375,9 @@ native_encode_expr (const_tree expr, uns
     case REAL_CST:
       return native_encode_real (expr, ptr, len);

+    case FIXED_CST:
+      return native_encode_fixed (expr, ptr, len);
+
     case COMPLEX_CST:
       return native_encode_complex (expr, ptr, len);

@@ -7368,44 +7401,37 @@ static tree
 native_interpret_int (tree type, const unsigned char *ptr, int len)
 {
   int total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
-  int byte, offset, word, words;
-  unsigned char value;
   double_int result;

-  if (total_bytes > len)
-    return NULL_TREE;
-  if (total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
+  if (total_b