Re: [PATCH] A new reload-rewrite pattern recognizer for GCC vectorizer.
On Thu, Apr 24, 2014 at 05:32:54PM -0700, Cong Hou wrote: > In this patch a new reload-rewrite pattern detector is composed to > handle the following pattern in the loop being vectorized: > >x = *p; >... >y = *p; > >or > >*p = x; >... >y = *p; > > In both cases, *p is reloaded because there may exist other defs to > another memref that may alias with p. However, aliasing is eliminated > with alias checks. Then we can safely replace the last statement in > above cases by y = x. Not safely, at least not for #pragma omp simd/#pragma simd/#pragma ivdep loops if alias analysis hasn't proven there is no aliasing. So, IMNSHO you need to guard this with LOOP_VINFO_NO_DATA_DEPENDENCIES, assuming it has been computed at that point already (otherwise you need to do it elsewhere). Consider: void foo (int *p, int *q) { int i; #pragma omp simd safelen(16) for (i = 0; i < 128; i++) { int x = *p; *q += 8; *p = *p + x; p++; q++; } } It is valid to call the above with completely unrelated p and q, but also e.g. p == q, or q == p + 16 or p == q + 16. Your patch would certainly break it e.g. for p == q. Jakub
Re: [wide-int 5/5] Add dump () method for gdb debugging
Kenneth Zadeck writes: > don't you think that it would be easier to understand the number if you > printed it largest index first, as in the routines in wide-int-print.cc? Yeah, that's what the patch does. E.g. (for 32-bit HWI): [...,0x3,0x8000] is 7 << 31. Thanks, Richard > > kenny > On 04/25/2014 09:58 AM, Richard Sandiford wrote: >> This patch adds a dump () method so that it's easier to read the >> contents of the various wide-int types in gdb. I've deliberately not >> done any extension for "small_prec" cases because I think here >> we want to see the raw values as much as possible. >> >> Tested on x86_64-linux-gnu. OK to install? >> >> Thanks, >> Richard >> >> >> Index: gcc/wide-int.cc >> === >> --- gcc/wide-int.cc 2014-04-25 14:48:03.263213403 +0100 >> +++ gcc/wide-int.cc 2014-04-25 14:48:25.186333842 +0100 >> @@ -2130,3 +2130,9 @@ wi::only_sign_bit_p (const wide_int_ref >> void gt_ggc_mx (widest_int *) { } >> void gt_pch_nx (widest_int *, void (*) (void *, void *), void *) { } >> void gt_pch_nx (widest_int *) { } >> + >> +template void wide_int::dump () const; >> +template void generic_wide_int >::dump () >> const; >> +template void generic_wide_int >::dump () >> const; >> +template void offset_int::dump () const; >> +template void widest_int::dump () const; >> Index: gcc/wide-int.h >> === >> --- gcc/wide-int.h 2014-04-25 14:34:54.823652619 +0100 >> +++ gcc/wide-int.h 2014-04-25 14:48:06.218229309 +0100 >> @@ -709,6 +709,9 @@ #define INCDEC_OPERATOR(OP, DELTA) \ >> #undef ASSIGNMENT_OPERATOR >> #undef INCDEC_OPERATOR >> >> + /* Debugging functions. */ >> + void dump () const; >> + >> static const bool is_sign_extended >> = wi::int_traits >::is_sign_extended; >> }; >> @@ -859,6 +862,23 @@ generic_wide_int ::operator = ( >> return *this; >> } >> >> +/* Dump the contents of the integer to stderr, for debugging. */ >> +template >> +void >> +generic_wide_int ::dump () const >> +{ >> + unsigned int len = this->get_len (); >> + const HOST_WIDE_INT *val = this->get_val (); >> + unsigned int precision = this->get_precision (); >> + fprintf (stderr, "["); >> + if (len * HOST_BITS_PER_WIDE_INT < precision) >> +fprintf (stderr, "...,"); >> + for (unsigned int i = 0; i < len - 1; ++i) >> +fprintf (stderr, HOST_WIDE_INT_PRINT_HEX ",", val[len - 1 - i]); >> + fprintf (stderr, HOST_WIDE_INT_PRINT_HEX "], precision = %d\n", >> + val[0], precision); >> +} >> + >> namespace wi >> { >> template <>
Re: [patch, fortran] Fix PR 59604
Am 12.04.2014 21:33, schrieb Thomas Koenig: please find attached a patch for PR 59604. The patch makes sure that, if -fno-range-check is specified, using int on an overflowing boz constant yields the same result for compile-time simplification and run-time execution. OK for trunk? Looks good to me. At a glance, it looks as if it also fixes the original bug 58003 - or is something still missing? Tobias 2014-03-12 Thomas Koenig PR fortran/59604 * gfortran.h (gfc_convert_mpz_to_signed): Add prototype. * arith.c (gfc_int2int): Convert number to signed if arithmetic overflow is not checked. * simplify.c (convert_mpz_to_unsigned): Only trigger assert for size if range checking is in force. (convert_mpz_to_signed): Make non-static, rename to (gfc_convert_mpz_to_signed). (simplify_dshift): Use gfc_convert_mpz_to_signed. (gfc_simplify_ibclr): Likewise. (gfc_simplify_ibits): Likewise. (gfc_simplify_ibset): Likewise. (simplify_shift): Likewise. (gfc_simplify_ishiftc): Likewise. (gfc_simplify_maskr): Likewise. (gfc_simplify_maskl): Likewise. 2014-03-12 Thomas Koenig PR fortran/59604 * gfortran.dg/no_range_check_3.f90: New test.
[C PATCH] Another locus improvements (PR c/60257)
Another column info improvements, this time for initializer warnings. warning_init and add_pending_init had to gain new location parameter. What is worth mentioning is that the "(near initialization for X)" message seems next to useless to me now with caret diagnostics (?). Regtested/bootstrapped on powerpc64-unknown-linux-gnu and x86_64-unknown-linux-gnu, ok for trunk? 2014-04-25 Marek Polacek PR c/60257 * c-typeck.c (warning_init): Add location_t parameter. Call warning_at instead of warning. (push_init_level): Pass input_location to warning_init. (add_pending_init): Add location_t parameter. Pass loc to warning_init. (set_nonincremental_init): Pass input_location to add_pending_init. (set_nonincremental_init_from_string): Likewise. (output_init_element): Pass loc to warning_init and to add_pending_init. * gcc.dg/pr60257.c: New test. diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index a82801f..5a232ec 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -100,14 +100,15 @@ static void push_string (const char *); static void push_member_name (tree); static int spelling_length (void); static char *print_spelling (char *); -static void warning_init (int, const char *); +static void warning_init (location_t, int, const char *); static tree digest_init (location_t, tree, tree, tree, bool, bool, int); static void output_init_element (location_t, tree, tree, bool, tree, tree, int, bool, struct obstack *); static void output_pending_init_elements (int, struct obstack *); static int set_designator (int, struct obstack *); static void push_range_stack (tree, struct obstack *); -static void add_pending_init (tree, tree, tree, bool, struct obstack *); +static void add_pending_init (location_t, tree, tree, tree, bool, + struct obstack *); static void set_nonincremental_init (struct obstack *); static void set_nonincremental_init_from_string (tree, struct obstack *); static tree find_init_member (tree, struct obstack *); @@ -6445,15 +6446,15 @@ pedwarn_init (location_t location, int opt, const char *gmsgid) component name is taken from the spelling stack. */ static void -warning_init (int opt, const char *gmsgid) +warning_init (location_t loc, int opt, const char *gmsgid) { char *ofwhat; /* The gmsgid may be a format string with %< and %>. */ - warning (opt, gmsgid); + warning_at (loc, opt, gmsgid); ofwhat = print_spelling ((char *) alloca (spelling_length () + 1)); if (*ofwhat) -warning (opt, "(near initialization for %qs)", ofwhat); +warning_at (loc, opt, "(near initialization for %qs)", ofwhat); } /* If TYPE is an array type and EXPR is a parenthesized string @@ -7299,7 +7300,8 @@ push_init_level (int implicit, struct obstack * braced_init_obstack) if (implicit == 1 && warn_missing_braces && !missing_braces_mentioned) { missing_braces_mentioned = 1; - warning_init (OPT_Wmissing_braces, "missing braces around initializer"); + warning_init (input_location, OPT_Wmissing_braces, + "missing braces around initializer"); } if (TREE_CODE (constructor_type) == RECORD_TYPE @@ -7360,7 +7362,7 @@ push_init_level (int implicit, struct obstack * braced_init_obstack) else { if (constructor_type != error_mark_node) - warning_init (0, "braces around scalar initializer"); + warning_init (input_location, 0, "braces around scalar initializer"); constructor_fields = constructor_type; constructor_unfilled_fields = constructor_type; } @@ -7775,8 +,8 @@ set_init_label (tree fieldname, struct obstack * braced_init_obstack) existing initializer. */ static void -add_pending_init (tree purpose, tree value, tree origtype, bool implicit, - struct obstack * braced_init_obstack) +add_pending_init (location_t loc, tree purpose, tree value, tree origtype, + bool implicit, struct obstack *braced_init_obstack) { struct init_node *p, **q, *r; @@ -7797,9 +7799,12 @@ add_pending_init (tree purpose, tree value, tree origtype, bool implicit, if (!implicit) { if (TREE_SIDE_EFFECTS (p->value)) - warning_init (0, "initialized field with side-effects overwritten"); + warning_init (loc, 0, + "initialized field with side-effects " + "overwritten"); else if (warn_override_init) - warning_init (OPT_Woverride_init, "initialized field overwritten"); + warning_init (loc, OPT_Woverride_init, + "initialized field overwritten"); } p->value = value; p->origtype = origtype; @@ -7824,9 +7829,12 @@ add_pending_init (tree purpose, tree value, tr
Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
On 13-03-14 21:49, Richard Henderson wrote: (define_expand "ldexpxf3" - [(set (match_dup 3) - (float:XF (match_operand:SI 2 "register_operand"))) - (parallel [(set (match_operand:XF 0 " register_operand") - (unspec:XF [(match_operand:XF 1 "register_operand") - (match_dup 3)] - UNSPEC_FSCALE_FRACT)) - (set (match_dup 4) - (unspec:XF [(match_dup 1) (match_dup 3)] - UNSPEC_FSCALE_EXP))])] + [(match_operand:XF 0 "register_operand") + (match_operand:XF 1 "register_operand") + (match_operand:SI 2 "register_operand")] "TARGET_USE_FANCY_MATH_387 && flag_unsafe_math_optimizations" { @@ -14808,6 +14633,11 @@ operands[3] = gen_reg_rtx (XFmode); operands[4] = gen_reg_rtx (XFmode); + + emit_insn (gen_floatsixf2 (operands[3], operands[2])); + emit_insn (gen_fscalexf4_i387 (operands[0], operands[4], + operands[1], operands[3])); + DONE; }) Richard, For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a sigsegv. I've traced it back to this code in insn-emit.c: ... rtx gen_ldexpxf3 (rtx operand0, rtx operand1, rtx operand2) { rtx _val = 0; start_sequence (); { rtx operands[3]; operands[0] = operand0; operands[1] = operand1; operands[2] = operand2; { if (optimize_insn_for_size_p ()) FAIL; operands[3] = gen_reg_rtx (XFmode); operands[4] = gen_reg_rtx (XFmode); ... operands is declared with size 3, and operands[3,4] accesses are out of bounds. I've done a minimal build with attached patch, and reran the test-case, which passes now. OK if bootstrap succeeds? Thanks, - Tom 2014-04-26 Tom de Vries * config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds array accesses. --- gcc/config/i386/i386.md | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 25e2e93..9f103cf 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -14427,15 +14427,16 @@ "TARGET_USE_FANCY_MATH_387 && flag_unsafe_math_optimizations" { + rtx tmp1, tmp2; if (optimize_insn_for_size_p ()) FAIL; - operands[3] = gen_reg_rtx (XFmode); - operands[4] = gen_reg_rtx (XFmode); + tmp1 = gen_reg_rtx (XFmode); + tmp2 = gen_reg_rtx (XFmode); - emit_insn (gen_floatsixf2 (operands[3], operands[2])); - emit_insn (gen_fscalexf4_i387 (operands[0], operands[4], - operands[1], operands[3])); + emit_insn (gen_floatsixf2 (tmp1, operands[2])); + emit_insn (gen_fscalexf4_i387 (operands[0], tmp2, + operands[1], tmp1)); DONE; }) -- 1.8.3.2
Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
On Sat, Apr 26, 2014 at 11:27 AM, Tom de Vries wrote: > On 13-03-14 21:49, Richard Henderson wrote: >> >> (define_expand "ldexpxf3" >> - [(set (match_dup 3) >> - (float:XF (match_operand:SI 2 "register_operand"))) >> - (parallel [(set (match_operand:XF 0 " register_operand") >> - (unspec:XF [(match_operand:XF 1 "register_operand") >> - (match_dup 3)] >> - UNSPEC_FSCALE_FRACT)) >> - (set (match_dup 4) >> - (unspec:XF [(match_dup 1) (match_dup 3)] >> - UNSPEC_FSCALE_EXP))])] >> + [(match_operand:XF 0 "register_operand") >> + (match_operand:XF 1 "register_operand") >> + (match_operand:SI 2 "register_operand")] >> "TARGET_USE_FANCY_MATH_387 >> && flag_unsafe_math_optimizations" >> { >> @@ -14808,6 +14633,11 @@ >> >> operands[3] = gen_reg_rtx (XFmode); >> operands[4] = gen_reg_rtx (XFmode); >> + >> + emit_insn (gen_floatsixf2 (operands[3], operands[2])); >> + emit_insn (gen_fscalexf4_i387 (operands[0], operands[4], >> + operands[1], operands[3])); >> + DONE; >> }) > > > Richard, > > For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a > sigsegv. > > I've traced it back to this code in insn-emit.c: > ... > rtx > gen_ldexpxf3 (rtx operand0, > rtx operand1, > rtx operand2) > { > rtx _val = 0; > start_sequence (); > { > rtx operands[3]; > operands[0] = operand0; > operands[1] = operand1; > operands[2] = operand2; > > { > if (optimize_insn_for_size_p ()) > FAIL; > > operands[3] = gen_reg_rtx (XFmode); > operands[4] = gen_reg_rtx (XFmode); > ... > > operands is declared with size 3, and operands[3,4] accesses are out of > bounds. > > I've done a minimal build with attached patch, and reran the test-case, > which passes now. > > OK if bootstrap succeeds? > > 2014-04-26 Tom de Vries > > * config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds > array accesses. OK for mainline and 4.9 branch. Thanks, Uros.
[RFC][ARM] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
Hi, Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for ARM. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-arm for arm-none-linux-gnueabi with no new regression. Is this OK for trunk? Thanks, Kugan gcc/ +2014-04-27 Kugan Vivekanandarajah + + * config/arm/arm.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New define. + (arm_builtins) : Add ARM_BUILTIN_LDFPSCR and ARM_BUILTIN_STFPSCR. + (bdesc_2arg) : Add description for builtins __builtins_arm_stfpscr + and __builtins_arm_ldfpscr. + (arm_init_builtins) : Initialize builtins __builtins_arm_stfpscr and + __builtins_arm_ldfpscr. + (arm_expand_builtin) : Expand builtins __builtins_arm_stfpscr and + __builtins_arm_ldfpscr. + (arm_atomic_assign_expand_fenv): New function. + * config/arm/vfp.md (stfpscr): New pattern. + (ldfpscr) : Likewise. + * config/arm/unspecs.md (unspecv): Add VUNSPEC_LDFPSCR and + VUNSPEC_STFPSCR. + diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0240cc7..4f0ed58 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -59,6 +59,7 @@ #include "params.h" #include "opts.h" #include "dumpfile.h" +#include "gimple-expr.h" /* Forward definitions of types. */ typedef struct minipool_nodeMnode; @@ -93,6 +94,7 @@ static int thumb_far_jump_used_p (void); static bool thumb_force_lr_save (void); static unsigned arm_size_return_regs (void); static bool arm_assemble_integer (rtx, unsigned int, int); +static void arm_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update); static void arm_print_operand (FILE *, rtx, int); static void arm_print_operand_address (FILE *, rtx); static bool arm_print_operand_punct_valid_p (unsigned char code); @@ -584,6 +586,9 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE arm_mangle_type +#undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV +#define TARGET_ATOMIC_ASSIGN_EXPAND_FENV arm_atomic_assign_expand_fenv + #undef TARGET_BUILD_BUILTIN_VA_LIST #define TARGET_BUILD_BUILTIN_VA_LIST arm_build_builtin_va_list #undef TARGET_EXPAND_BUILTIN_VA_START @@ -23212,6 +23217,9 @@ enum arm_builtins ARM_BUILTIN_CRC32CH, ARM_BUILTIN_CRC32CW, + ARM_BUILTIN_LDFPSCR, + ARM_BUILTIN_STFPSCR, + #undef CRYPTO1 #undef CRYPTO2 #undef CRYPTO3 @@ -24010,6 +24018,15 @@ static const struct builtin_description bdesc_2arg[] = IWMMXT_BUILTIN2 (iwmmxt_wmacuz, WMACUZ) IWMMXT_BUILTIN2 (iwmmxt_wmacsz, WMACSZ) + +#define FP_BUILTIN(L, U) \ + {0, CODE_FOR_##L, "__builtin_arm_"#L, ARM_BUILTIN_##U, \ + UNKNOWN, 0}, + + FP_BUILTIN (stfpscr, LDFPSCR) + FP_BUILTIN (ldfpscr, STFPSCR) +#undef FP_BUILTIN + #define CRC32_BUILTIN(L, U) \ {0, CODE_FOR_##L, "__builtin_arm_"#L, ARM_BUILTIN_##U, \ UNKNOWN, 0}, @@ -24524,6 +24541,21 @@ arm_init_builtins (void) if (TARGET_CRC32) arm_init_crc32_builtins (); + + if (TARGET_VFP) +{ + tree ftype_stfpscr + = build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_ldfpscr + = build_function_type_list (unsigned_type_node, NULL); + + arm_builtin_decls[ARM_BUILTIN_LDFPSCR] + = add_builtin_function ("__builtin_arm_ldfscr", ftype_ldfpscr, + ARM_BUILTIN_LDFPSCR, BUILT_IN_MD, NULL, NULL_TREE); + arm_builtin_decls[ARM_BUILTIN_STFPSCR] + = add_builtin_function ("__builtin_arm_stfscr", ftype_stfpscr, + ARM_BUILTIN_STFPSCR, BUILT_IN_MD, NULL, NULL_TREE); +} } /* Return the ARM builtin for CODE. */ @@ -25251,6 +25283,25 @@ arm_expand_builtin (tree exp, switch (fcode) { +case ARM_BUILTIN_LDFPSCR: +case ARM_BUILTIN_STFPSCR: + if (fcode == ARM_BUILTIN_LDFPSCR) + { + icode = CODE_FOR_ldfpscr; + target = gen_reg_rtx (SImode); + pat = GEN_FCN (icode) (target); + } + else + { + target = NULL_RTX; + icode = CODE_FOR_stfpscr; + arg0 = CALL_EXPR_ARG (exp, 0); + op0 = expand_normal (arg0); + pat = GEN_FCN (icode) (op0); + } + emit_insn (pat); + return target; + case ARM_BUILTIN_TEXTRMSB: case ARM_BUILTIN_TEXTRMUB: case ARM_BUILTIN_TEXTRMSH: @@ -31116,4 +31167,70 @@ arm_asan_shadow_offset (void) return (unsigned HOST_WIDE_INT) 1 << 29; } +static void +arm_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) +{ + if (!TARGET_VFP) +return; + + const unsigned FE_INVALID = 1; + const unsigned FE_DIVBYZERO = 2; + const unsigned FE_OVERFLOW = 4; + const unsigned FE_UNDERFLOW = 8; + const unsigned FE_INEXACT = 16; + const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO + | FE_OVERFLOW | FE_UNDERFLOW +
[RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk? Thanks, Kugan gcc/ +2014-04-27 Kugan Vivekanandarajah + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-builtins.c (arm_builtins) : Add + AARCH64_BUILTIN_LDFPSCR and AARCH64_BUILTIN_STFPSCR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_stfpscr and __builtins_aarch64_ldfpscr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_stfpscr + and __builtins_aarch64_ldfpscr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (stfpscr): New pattern. + (ldfpscr) : Likewise. + (unspecv): Add UNSPECV_LDFPSCR and UNSPECV_STFPSCR. + diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..70d3efa 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -371,6 +371,10 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = { enum aarch64_builtins { AARCH64_BUILTIN_MIN, + + AARCH64_BUILTIN_LDFPSCR, + AARCH64_BUILTIN_STFPSCR, + AARCH64_SIMD_BUILTIN_BASE, #include "aarch64-simd-builtins.def" AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE @@ -752,6 +756,18 @@ aarch64_init_simd_builtins (void) void aarch64_init_builtins (void) { + tree ftype_stfpscr += build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_ldfpscr += build_function_type_list (unsigned_type_node, NULL); + + aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR] += add_builtin_function ("__builtin_aarch64_ldfscr", ftype_ldfpscr, + AARCH64_BUILTIN_LDFPSCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR] += add_builtin_function ("__builtin_aarch64_stfscr", ftype_stfpscr, + AARCH64_BUILTIN_STFPSCR, BUILT_IN_MD, NULL, NULL_TREE); + if (TARGET_SIMD) aarch64_init_simd_builtins (); } @@ -964,6 +980,31 @@ aarch64_expand_builtin (tree exp, { tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); int fcode = DECL_FUNCTION_CODE (fndecl); + int icode; + rtx pat, op0; + tree arg0; + + switch (fcode) +{ +case AARCH64_BUILTIN_LDFPSCR: +case AARCH64_BUILTIN_STFPSCR: + if (fcode == AARCH64_BUILTIN_LDFPSCR) + { + icode = CODE_FOR_ldfpscr; + target = gen_reg_rtx (SImode); + pat = GEN_FCN (icode) (target); + } + else + { + target = NULL_RTX; + icode = CODE_FOR_stfpscr; + arg0 = CALL_EXPR_ARG (exp, 0); + op0 = expand_normal (arg0); + pat = GEN_FCN (icode) (op0); + } + emit_insn (pat); + return target; +} if (fcode >= AARCH64_SIMD_BUILTIN_BASE) return aarch64_simd_expand_builtin (fcode, exp, target); @@ -1196,6 +1237,70 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi) return changed; } +void +aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) +{ + const unsigned FE_INVALID = 1; + const unsigned FE_DIVBYZERO = 2; + const unsigned FE_OVERFLOW = 4; + const unsigned FE_UNDERFLOW = 8; + const unsigned FE_INEXACT = 16; + const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO + | FE_OVERFLOW | FE_UNDERFLOW + | FE_INEXACT); + const unsigned HOST_WIDE_INT FE_EXCEPT_SHIFT = 8; + + /* Genareate the equivalence of : + unsigned int fenv_var; + fenv_var = __builtin_aarch64_ldfpscr (); + + unsigned int masked_fenv; + tmp1_var = fenv_var & ~ mask; + + __builtin_aarch64_fpscr (&tmp1_var); */ + + tree fenv_var = create_tmp_var (unsigned_type_node, NULL); + tree ldfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR]; + tree stfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR]; + tree mask = build_int_cst (unsigned_type_node, +~((FE_ALL_EXCEPT << FE_EXCEPT_SHIFT) + | FE_ALL_EXCEPT)); + tree ld_fenv_stmt = build2 (MODIFY_EXPR, unsigned_type_node, + fenv_var, build_call_expr (ldfpscr, 0)); + tree masked_fenv = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var, mask); + tree hold_fnclex = build_call_expr (stfpscr, 1, masked_fenv); + *hold = build2 (COMPOUND_EXPR, void_type_node, + build2 (COMPOUND_EXPR, void_type_node, masked_fenv, + ld_fenv_stmt), hold_fnclex); + + /* Store the value of masked_fenv to clear the exceptions: + __builtin_aarch64_stfpscr (masked_fenv); */ + + *cl
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
±On Sat, 2014-04-26 at 08:53 +0200, Justus Winter wrote: > Quoting Svante Signell (2014-04-24 10:39:10) > > On Fri, 2014-04-18 at 10:03 +0200, Samuel Thibault wrote: > > > Samuel Thibault, le Thu 17 Apr 2014 00:03:45 +0200, a écrit : > > > > Thomas Schwinge, le Wed 09 Apr 2014 09:36:42 +0200, a écrit : > > > > > Well, the first step is to verify that > > > > > TARGET_THREAD_SPLIT_STACK_OFFSET > > > > > and similar configury is correct for the Hurd, > > > > > > > > I have added the corresponding field, so we can just use the same offset > > > > as on Linux. > > > > > > I have uploaded packages on http://people.debian.org/~sthibault/tmp/ so > > > Svante can try setting TARGET_THREAD_SPLIT_STACK_OFFSET to 0x30 with > > > them. > > > > Status report: > > - Without split stack enabled around 70 libgo tests pass and 50 fails, > > most of them with a segfault. > > - Enabling split stack and using the libc Samuel built all 122 libgo > > tests fail with a segfault. > > - In both cases simple go programs work, like hello+sqrt.go below. > > - The segfault seems to be located at the same code piece according to > > gdb (maybe due to exception handling) > > > > cat hello+sqrt.go > > package main > > import ( > > "fmt" > > ) > > func main() { > > fmt.Printf("Hello, world. Sqrt(2) = %v\n", Sqrt(2)) > > } > > How is that even a valid go program? Sqrt is not defined. Here is the sqrt.go: // Package main is a trivial example package. package main // Sqrt returns an approximation to the square root of x. func Sqrt(x float64) float64 { z := 1.0 for i := 0; i < 1000; i++ { z -= (z*z - x) / (2 * z) } return z } gccgo-4.9 -g hello+sqrt.go sqrt.go > > I have not been able to use a local go library function, e.g. package > > newmath, and the go frontend is not yet available for GNU/Hurd. > > What do you mean exactly by "local go library function"? Like having sqrt.go in a separate directory: cat newmath/sqrt.go // Package newmath is a trivial example package. package newmath // Sqrt returns an approximation to the square root of x. func Sqrt(x float64) float64 { z := 1.0 for i := 0; i < 1000; i++ { z -= (z*z - x) / (2 * z) } return z } > > However, it seems that something triggers the segfaults when running > > make -C build/i486-gnu/libgo check (both with and w/o split-stack) > > while setting the keep parameter in ./src/libgo/testsuite/gotest > > and running them manually some of them work?? As a first glance, about > > the same number of tests succeeds with and w/o split stack :) Some of > > the failing tests still seems random, sometimes they pass, sometimes > > they fail. > > For reference, here are my notes about one of these crashes (Svante, > is this still current?): Yes it is, thanks for your help so far. Is the rpctrace bug you mentioned that the wrong ports are reported? > ~~~ snip ~~~ > First, there is a rpctrace bug (or, i'm misinterpreting the output): > > 93<--142(pid1182)->dir_lookup ("etc/hostname" 1 0) = 0 1 "" > 158<--160(pid1182) > > Here, we do a dir_lookup and get port 158. > > 158<--160(pid1182)->io_read_request (-1 255) = 0 "hurd-2013\n" > 158<--160(pid1182)->io_readable_request () = 0 0 > > Here, we use it to do stuff with that file. > > task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0 > > Here, we deallocate the port. Note how the port name (pn?) says 23, > even though it's clearly port 158 that is getting deallocated, b/c we > get port 158 back from the next rpc: > > 93<--142(pid1182)->dir_lookup ("lib/i386-gnu/libnss_files.so.2" 4194305 0) > = 0 1 ""158<--157(pid1182) > > Now, the get to the real issue. From the backtrace > (http://paste.debian.net/95410/) > we know that it segfaults in mmap: > > Program received signal SIGSEGV, Segmentation fault. > 0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132 > 132 intr-msg.c: No such file or directory. > [...] > Thread 4 (Thread 1205.4): > #0 0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132 > err = > err = > user_option = 3 > user_timeout = 48 > m = 0x532370 > msgh_bits = 0 > remote_port = 268509186 > msgid = 21118 > save_data = > __PRETTY_FUNCTION__ = "_hurd_intr_rpc_mach_msg" > #1 0x0005 in ?? () > No symbol table info available. > #2 0x01a7a8dd in __mmap (addr=0x0, len=49880, prot=5, flags=33, fd=8, > offset=0) > at ../sysdeps/mach/hurd/mmap.c:92 > __ulink = {resource = {next = 0x0, prevp = 0x2cfcc}, thread = {next = > 0x0, > prevp = 0x1b81c5c}, cleanup = 0x19a2c70 <_hurd_port_cleanup>, > cleanup_data = 0x99} > __ctty_ulink = {resource = {next = 0x0, prevp = 0x19fc6bc > <_int_malloc+12>}, thread = { > next = 0x17, prevp = 0x5}, cleanup = 0x0, cleanup_data = 0x700f2} > __result = > descriptor = 0x1b5e467 <__io_map+7>
Re: -fuse-caller-save - Enable for MIPS
On 25-04-14 15:22, Richard Sandiford wrote: Tom de Vries writes: diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 45256e9..b61cd44 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -7027,11 +7027,17 @@ mips_expand_call (enum mips_call_type type, rtx result, rtx addr, { rtx orig_addr, pattern, insn; int fp_code; + rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG); fp_code = aux == 0 ? 0 : (int) GET_MODE (aux); insn = mips16_build_call_stub (result, &addr, args_size, fp_code); if (insn) { + if (TARGET_EXPLICIT_RELOCS + && TARGET_CALL_CLOBBERED_GP + && !find_reg_note (insn, REG_NORETURN, 0)) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg); + I think this condition should go in mips_emit_call_insn instead, so that we don't have 4 instances of it. untyped_call could then use mips_expand_call as well. Richard, Done. Until now there was no real downside to using $6 for non-MIPS16 code. Now that there is, it would probably be worth making it: +#define POST_CALL_TMP_REG \ (TARGET_MIPS16 ? GP_ARG_FIRST + 2 : PIC_OFFSET_TABLE_REGNUM) and only adding the clobber for MIPS16. Done. diff --git a/gcc/testsuite/gcc.target/mips/fuse-caller-save.c b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c new file mode 100644 index 000..1fd6c7d --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-fuse-caller-save" } */ +/* { dg-skip-if "" { *-*-* } { "*" } { "-Os" } } */ I might have asked this before, sorry, but why this skip? Please add a brief comment (in the string, if short enough). I've reduced the amount of skips a bit, and added a comment why they are skipped. +/* Testing -fuse-caller-save optimization option. */ + +static int __attribute__((noinline)) NOCOMPRESSION +bar (int x) +{ + return x + 3; +} + +int __attribute__((noinline)) NOCOMPRESSION +foo (int y) +{ + return y + bar (y); +} + +int NOCOMPRESSION +main (void) +{ + return !(foo (5) == 13); +} + +/* Check that there are only 2 stack-saves: r31 in main and foo. */ + +/* Check that there only 2 sw/sd. */ +/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 } } */ + +/* Check that the first caller-save register is unused. */ +/* { dg-final { scan-assembler-not "\\\$16" } } */ It'd be good to avoid NOCOMPRESSION because the only case that really needs the temporary register is MIPS16. Please try putting the test in a header file and reusing it for three tests, one each of MIPS16, microMIPS and uncompressed. Done, I think, I'm not 100% sure I understood what you wanted me to do here. build and reg-tested on MIPS. OK for trunk? Thanks, - Tom 2014-01-12 Radovan Obradovic Tom de Vries * config/mips/mips-protos.h (mips_emit_call_insn): Declare. * config/mips/mips.h (POST_CALL_TMP_REG): Define. * config/mips/mips.c (mips_emit_call_insn): Remove static. Use last_call_insn. Add POST_CALL_TMP_REG clobber (mips_split_call): Use POST_CALL_TMP_REG. (TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS): Redefine to true. * config/mips/mips.md (define_expand "untyped_call"): Use mips_emit_call_insn. * gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo options. * gcc.target/mips/fuse-caller-save.h: New include file. * gcc.target/mips/fuse-caller-save.c: New test. * gcc.target/mips/fuse-caller-save-mips16.c: Same. * gcc.target/mips/fuse-caller-save-micromips.c: Same. --- gcc/config/mips/mips-protos.h | 1 + gcc/config/mips/mips.c | 24 -- gcc/config/mips/mips.h | 7 +++ gcc/config/mips/mips.md| 4 +++- .../gcc.target/mips/fuse-caller-save-micromips.c | 17 +++ .../gcc.target/mips/fuse-caller-save-mip16.c | 17 +++ gcc/testsuite/gcc.target/mips/fuse-caller-save.c | 17 +++ gcc/testsuite/gcc.target/mips/fuse-caller-save.h | 17 +++ gcc/testsuite/gcc.target/mips/mips.exp | 1 + 9 files changed, 98 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save-micromips.c create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save-mip16.c create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save.c create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save.h diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h index 3d59b7b..19ea919 100644 --- a/gcc/config/mips/mips-protos.h +++ b/gcc/config/mips/mips-protos.h @@ -232,6 +232,7 @@ extern bool mips_use_pic_fn_addr_reg_p (const_rtx); extern rtx mips_expand_call (enum mips_call_type, rtx, rtx, rtx, rtx, bool); extern void mips_split_call (rtx, rtx); extern bool mips_get_pic_call_symbol (rtx *, int); +extern rtx mips_emit_call_insn (
[PATCH, testsuite]: Clean dump files
Hello! 2014-04-26 Uros Bizjak * gcc.dg/tree-ssa/alias-30.c (dg-options): Dump only fre1 details. * gcc.dg/vect/pr60505.c: Cleanup vect tree dump. * g++.dg/ipa/devirt-27.C (dg-options): Remove -fdump-ipa-devirt. Committed to mainline and 4.9 branch. Uros. Index: gcc.dg/tree-ssa/alias-30.c === --- gcc.dg/tree-ssa/alias-30.c (revision 209806) +++ gcc.dg/tree-ssa/alias-30.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O -fdump-tree-fre-details" } */ +/* { dg-options "-O -fdump-tree-fre1-details" } */ extern int posix_memalign(void **memptr, __SIZE_TYPE__ alignment, __SIZE_TYPE__ size); Index: gcc.dg/vect/pr60505.c === --- gcc.dg/vect/pr60505.c (revision 209806) +++ gcc.dg/vect/pr60505.c (working copy) @@ -10,3 +10,5 @@ out[i] = (ovec[i] = in[i]); out[num] = ovec[num/2]; } + +/* { dg-final { cleanup-tree-dump "vect" } } */ Index: g++.dg/ipa/devirt-27.C === --- g++.dg/ipa/devirt-27.C (revision 209806) +++ g++.dg/ipa/devirt-27.C (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O3 -fdump-ipa-devirt -fdump-tree-optimized" } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ struct A { int a;
Re: [wide-int 5/5] Add dump () method for gdb debugging
i am sorry, i missed the fact that the loop counts up but you were reversing the order in the indexes. kenny On 04/26/2014 04:26 AM, Richard Sandiford wrote: Kenneth Zadeck writes: don't you think that it would be easier to understand the number if you printed it largest index first, as in the routines in wide-int-print.cc? Yeah, that's what the patch does. E.g. (for 32-bit HWI): [...,0x3,0x8000] is 7 << 31. Thanks, Richard kenny On 04/25/2014 09:58 AM, Richard Sandiford wrote: This patch adds a dump () method so that it's easier to read the contents of the various wide-int types in gdb. I've deliberately not done any extension for "small_prec" cases because I think here we want to see the raw values as much as possible. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/wide-int.cc === --- gcc/wide-int.cc 2014-04-25 14:48:03.263213403 +0100 +++ gcc/wide-int.cc 2014-04-25 14:48:25.186333842 +0100 @@ -2130,3 +2130,9 @@ wi::only_sign_bit_p (const wide_int_ref void gt_ggc_mx (widest_int *) { } void gt_pch_nx (widest_int *, void (*) (void *, void *), void *) { } void gt_pch_nx (widest_int *) { } + +template void wide_int::dump () const; +template void generic_wide_int >::dump () const; +template void generic_wide_int >::dump () const; +template void offset_int::dump () const; +template void widest_int::dump () const; Index: gcc/wide-int.h === --- gcc/wide-int.h 2014-04-25 14:34:54.823652619 +0100 +++ gcc/wide-int.h 2014-04-25 14:48:06.218229309 +0100 @@ -709,6 +709,9 @@ #define INCDEC_OPERATOR(OP, DELTA) \ #undef ASSIGNMENT_OPERATOR #undef INCDEC_OPERATOR + /* Debugging functions. */ + void dump () const; + static const bool is_sign_extended = wi::int_traits >::is_sign_extended; }; @@ -859,6 +862,23 @@ generic_wide_int ::operator = ( return *this; } +/* Dump the contents of the integer to stderr, for debugging. */ +template +void +generic_wide_int ::dump () const +{ + unsigned int len = this->get_len (); + const HOST_WIDE_INT *val = this->get_val (); + unsigned int precision = this->get_precision (); + fprintf (stderr, "["); + if (len * HOST_BITS_PER_WIDE_INT < precision) +fprintf (stderr, "...,"); + for (unsigned int i = 0; i < len - 1; ++i) +fprintf (stderr, HOST_WIDE_INT_PRINT_HEX ",", val[len - 1 - i]); + fprintf (stderr, HOST_WIDE_INT_PRINT_HEX "], precision = %d\n", + val[0], precision); +} + namespace wi { template <>
Re: [wide-int 4/5] Fix canonize handling of "small_prec" case
this is fine. kenny On 04/25/2014 09:44 AM, Richard Sandiford wrote: We should write back the sign-extended value. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/wide-int.cc === --- gcc/wide-int.cc 2014-04-25 09:15:14.297359380 +0100 +++ gcc/wide-int.cc 2014-04-25 09:15:50.755637670 +0100 @@ -81,7 +81,7 @@ canonize (HOST_WIDE_INT *val, unsigned i top = val[len - 1]; if (len * HOST_BITS_PER_WIDE_INT > precision) -top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT); +val[len - 1] = top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT); if (top != 0 && top != (HOST_WIDE_INT)-1) return len;
Re: -fuse-caller-save - Collect register usage information
Eric, Honza, This patch adds analysis in pass_final to track which hard registers are set or clobbered by the function body, and stores that information in a struct cgraph_node, to be used in the fuse-caller-save optmization. This is the updated version of the previously approved patch submitted here (http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html ). The changes are: - using a new hook call_fusage_contains_non_callee_clobbers, - incorporating minor review comments from Richard Sandiford ( http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01436.html ). As part of the fuse-caller-save patch series, bootstrapped and reg-tested on x86_64, and build and reg-tested on MIPS. Eric, non-cgraph part OK for trunk? Honza, cgraph part OK for trunk? Thanks, - Tom 2013-04-29 Radovan Obradovic Tom de Vries * cgraph.h (struct cgraph_node): Add function_used_regs, function_used_regs_initialized and function_used_regs_valid fields. * final.c: Move include of hard-reg-set.h to before rtl.h to declare find_all_hard_reg_sets. (collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node) (get_call_reg_set_usage): New function. (rest_of_handle_final): Use collect_fn_hard_reg_usage. --- gcc/cgraph.h | 7 gcc/final.c | 117 ++- gcc/regs.h | 4 ++ 3 files changed, 127 insertions(+), 1 deletion(-) diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 84fc1d9..d1f 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -408,6 +408,13 @@ public: /* Time profiler: first run of function. */ int tp_first_run; + /* Call unsaved hard registers really used by the corresponding + function (including ones used by functions called by the + function). */ + HARD_REG_SET function_used_regs; + /* Set if function_used_regs is valid. */ + unsigned function_used_regs_valid: 1; + /* Set when decl is an abstract function pointed to by the ABSTRACT_DECL_ORIGIN of a reachable function. */ unsigned used_as_abstract_origin : 1; diff --git a/gcc/final.c b/gcc/final.c index 8c6f6ee..7b79059 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. If not see #include "tree.h" #include "varasm.h" +#include "hard-reg-set.h" #include "rtl.h" #include "tm_p.h" #include "regs.h" @@ -57,7 +58,6 @@ along with GCC; see the file COPYING3. If not see #include "recog.h" #include "conditions.h" #include "flags.h" -#include "hard-reg-set.h" #include "output.h" #include "except.h" #include "function.h" @@ -223,6 +223,7 @@ static int alter_cond (rtx); static int final_addr_vec_align (rtx); #endif static int align_fuzz (rtx, rtx, int, unsigned); +static void collect_fn_hard_reg_usage (void); /* Initialize data in final at the beginning of a compilation. */ @@ -4426,6 +4427,7 @@ rest_of_handle_final (void) assemble_start_function (current_function_decl, fnname); final_start_function (get_insns (), asm_out_file, optimize); final (get_insns (), asm_out_file, optimize); + collect_fn_hard_reg_usage (); final_end_function (); /* The IA-64 ".handlerdata" directive must be issued before the ".endp" @@ -4724,3 +4726,116 @@ make_pass_clean_state (gcc::context *ctxt) { return new pass_clean_state (ctxt); } + +/* Collect hard register usage for the current function. */ + +static void +collect_fn_hard_reg_usage (void) +{ + rtx insn; + int i; + struct cgraph_node *node; + + if (!flag_use_caller_save + || !targetm.call_fusage_contains_non_callee_clobbers) +return; + + node = cgraph_get_node (current_function_decl); + gcc_assert (node != NULL); + + for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn)) +{ + HARD_REG_SET insn_used_regs; + + if (!NONDEBUG_INSN_P (insn)) + continue; + + find_all_hard_reg_sets (insn, &insn_used_regs, false); + + if (CALL_P (insn) + && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set)) + { + CLEAR_HARD_REG_SET (node->function_used_regs); + return; + } + + IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs); +} + + /* Be conservative - mark fixed and global registers as used. */ + IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set); + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) +if (global_regs[i]) + SET_HARD_REG_BIT (node->function_used_regs, i); + +#ifdef STACK_REGS + /* Handle STACK_REGS conservatively, since the df-framework does not + provide accurate information for them. */ + + for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++) +SET_HARD_REG_BIT (node->function_used_regs, i); +#endif + + node->function_used_regs_valid = 1; +} + +/* Get the declaration of the function called by INSN. */ + +static tree +get_call_fndecl (rtx insn) +{ + rtx note, datum; + + if (!flag_use_caller_save) +return NULL_TREE; + + note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX); + if (note == NULL_RTX) +return NULL_
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
> > 2014-03-21 James Greenhalgh > > > > * calls.c (initialize_argument_information): Always treat > > PUSH_ARGS_REVERSED as 1, simplify code accordingly. > > (expand_call): Likewise. > > (emit_library_call_calue_1): Likewise. > > * expr.c (PUSH_ARGS_REVERSED): Do not define. > > (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify > > code accordingly. > > This is fine for the trunk at this point. Are you sure that it's not a correctness issue for some targets though? -- Eric Botcazou
Re: [SPARC] Fix PR target/60941
> > Not clear to me, (2U << i) should be zero if the shift count is masked. > > 2U << 31 is undefined behavior on those targets. Precisely not, or else we are not talking about the same notion of masking. -- Eric Botcazou
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Quoting Svante Signell (2014-04-26 13:59:57) > > For reference, here are my notes about one of these crashes (Svante, > > is this still current?): > > Yes it is, thanks for your help so far. Is the rpctrace bug you > mentioned that the wrong ports are reported? > > > ~~~ snip ~~~ > > [...] > > task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0 > > > > Here, we deallocate the port. Note how the port name (pn?) says 23, > > even though it's clearly port 158 that is getting deallocated [...] Reading the source revealed that 'pn{ 23}' makes it explicit that this is a port name, not a right. And indeed: routine mach_port_deallocate( task: ipc_space_t; name: mach_port_name_t); For port rights, which we know are valid in the tracee, we translate the name so that the name used by the tracee is displayed in the log. A comment in the source correctly states: /* Note that MACH_MSG_TYPE_PORT_NAME does not indicate a port right. It indicates a port name, i.e. just an integer--and we don't know what task that port name is meaningful in. If it's meaningful in a traced task, then it refers to our intercepting port rather than the original port anyway. */ We could try to be clever here, at least for the reference counting RPC. Justus
Re: [SPARC] Fix PR target/60941
Eric Botcazou writes: > > > Not clear to me, (2U << i) should be zero if the shift count is masked. > > > > 2U << 31 is undefined behavior on those targets. > > Precisely not, or else we are not talking about the same notion of masking. I believe Jakub is referring to the following in the C standard: "Bitwise shift operators ... Semantics ... If the value of the right operand ... is greater than or equal to the width of the promoted left operand, the behavior is undefined." So on 16-bit int systems you can't portably shift 2U by more than 15. (I'm not worried about the HW blowing up, but GCC itself has a long history of exploiting no-undefinedness assumptions.) /Mikael
[PATCH] doc/install.texi: Parallel profiledbootstrap is supported on all maintained releases.
Parallel profiledbootstrap is supported on all maintained releases. So just remove a misleading outdated sentence, that states the opposite, from doc/install.texi. OK for trunk? Thanks. 2014-04-26 Markus Trippelsdorf * doc/install.texi (Building with profile feedback): Remove outdated sentence. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 0b2b36575775..7851b0061b30 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -2544,8 +2544,7 @@ Finally a @code{stagefeedback} compiler is built using the information collected Unlike standard bootstrap, several additional restrictions apply. The compiler used to build @code{stage1} needs to support a 64-bit integral type. -It is recommended to only use GCC for this. Also parallel make is currently -not supported since collisions in profile collecting may occur. +It is recommended to only use GCC for this. @html -- Markus
Re: [SPARC] Fix PR target/60941
On Sat, Apr 26, 2014 at 03:30:25PM +0200, Eric Botcazou wrote: > > > Not clear to me, (2U << i) should be zero if the shift count is masked. > > > > 2U << 31 is undefined behavior on those targets. > > Precisely not, or else we are not talking about the same notion of masking. Eh, C99, 6.5.7/3: "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined." So, if you have int16 target, where unsigned int is 16-bit and UINT_MAX 65535, then shift count must be >= 0 and < 16, therefore, 2U << 31 is undefined behavior. Jakub
Re: [Patch 1/3] Use manual regex algorithm switching
On 25/04/14 20:50 -0400, Tim Shen wrote: * include/bits/regex.tcc (__regex_algo_impl<>): Remove _GLIBCXX_REGEX_DFS_QUANTIFIERS_LIMIT and use _GLIBCXX_REGEX_USE_THOMPSON_NFA instead. * include/bits/regex_automaton.h: Remove quantifier counting variable. * include/bits/regex_automaton.tcc (_State_base::_M_dot): Adjust debug NFA dump. This patch is OK, thanks very much.
Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
OK if bootstrap succeeds? With testing of the bootstrap build of the patch, I ran into the following regression compared to a reference bootstrap build without the patch: ... FAIL: g++.dg/tsan/cond_race.C -O2 output pattern test, is ==3087==WARNING: Program is run with unlimited stack size, which wouldn't work with Threa\ dSanitizer. ==3087==Re-execing with stack size limited to 33554432 bytes. == WARNING: ThreadSanitizer: heap-use-after-free (pid=3087) Read of size 8 at 0x7d18efc8 by thread T1: #0 pthread_cond_signal /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.i\ nc:2266 (libtsan.so.0+0x00039b21) #1 thr(void*) /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 (cond_race.exe+0x1033\ ) Previous write of size 8 at 0x7d18efc8 by main thread: #0 operator delete(void*) /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:592 (libtsan.so.0+0\ x000494b0) #1 main /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:34 (cond_race.exe+0x0ea0) Location is heap block of size 96 at 0x7d18efa0 allocated by main thread: #0 operator new(unsigned long) /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:560 (libtsan.s\ o.0+0x000496f2) #1 main /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:25 (cond_race.exe+0x0e12) Thread T1 (tid=3101, running) created by main thread at: #0 pthread_create /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:877 (libtsan.so.0+0x000\ 47c23) #1 main /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:29 (cond_race.exe+0x0e5a) SUMMARY: ThreadSanitizer: heap-use-after-free /home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 t\ hr(void*) == ThreadSanitizer: reported 1 warnings , should match ThreadSanitizer: data race.*pthread_cond_signal.* ... I've found the same failure here: http://gcc.gnu.org/ml/gcc-testresults/2014-01/msg00127.html, so I'm assuming it's a spurious failure. I've committed to trunk and 4.9. Thanks, - Tom
Re: [Patch 2/3] Make regex's infinite loop detection accurate
On 25/04/14 18:03 -0400, Tim Shen wrote: * include/bits/regex_executor.h: Add _M_rep_count to track how many times this repeat node are visited. "is visited" not "are visited" @@ -151,6 +156,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // character. std::unique_ptr>> _M_match_queue; // Used in BFS, indicating that which state is already visited. + std::vector> _M_rep_count; We don't need the std:: qualifiers here, but it doesn't do any harm. std::unique_ptr> _M_visited; _FlagT_M_flags; // To record current solution. diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index 68a5e04..bb10a72 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -161,6 +161,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return false; } + // __rep_count records how many times (__rep_count.second) + // this node are visited under certain input iterator + // (__rep_count.first). This prevent the executor from entering + // infinite loop by refusing to continue when it's already been + // visited more than twice. It's `twice` instead of `once` because + // we need to spare one more time for potential group capture. + template + template +void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>:: +_M_rep_once_more(_StateIdT __i) +{ + const auto& __state = _M_nfa[__i]; + auto& __rep_count = _M_rep_count[__i]; + if (__rep_count.second == 0 || __rep_count.first != _M_current) + { + auto back = __rep_count; This variable should be uglified. + __rep_count.first = _M_current; + __rep_count.second = 1; Maybe a dumb question (I don't understand a lot of the regex code!) but is it correct to set this to 1 in the case where __rep_count.first != _M_current ? Could that result in the count going downwards from 2 to 1? (Maybe that's correct?)
Re: [SPARC] Fix PR target/60941
> I believe Jakub is referring to the following in the C standard: > > "Bitwise shift operators > ... > Semantics > ... If the value of the right operand ... is greater than or equal to the > width of the promoted left operand, the behavior is undefined." > > So on 16-bit int systems you can't portably shift 2U by more than 15. Sure, but we're talking about targets for which the shift count is masked, not about the C standard. -- Eric Botcazou
Re: [SPARC] Fix PR target/60941
> So, if you have int16 target, where unsigned int is 16-bit and UINT_MAX > 65535, then shift count must be >= 0 and < 16, therefore, 2U << 31 is > undefined behavior. Well, if the shift count is masked by the target, if will be masked according to the width of the register and the result will nevertheless be zero. -- Eric Botcazou
Re: [PATCH] doc/install.texi: Parallel profiledbootstrap is supported on all maintained releases.
On Sat, 26 Apr 2014, Markus Trippelsdorf wrote: > Parallel profiledbootstrap is supported on all maintained releases. So > just remove a misleading outdated sentence, that states the opposite, > from doc/install.texi. > > OK for trunk? Okay. If this also applies to GCC 4.9 (which I think is the case), you may want to apply it to the 4.9 branch as well. Gerald
Re: Update doc/gimple.texi to reflect change from union to class hierarchy
On Fri, 25 Apr 2014, David Malcolm wrote: > Successfully generates HTML, info and pdf via appropriate make > invocations; example of resulting HTML can be seen at the bottom of: > > http://dmalcolm.fedorapeople.org/gcc/2014-04-25/Tuple-representation.html > > The diagram is split over pages 178-180 of the generated PDF: > > http://dmalcolm.fedorapeople.org/gcc/2014-04-25/gccint.pdf When I just checked that PDF did not (yet?) have that table. > but I don't think there's a good way to avoid page breaks there. Agreed. > Bootstrap in progress [do pure doc fixes require a bootstrap?] Not generally, no. > OK for trunk, assuming bootstraps? (and eventually for the 4.9 branch, > after a few days?) Yes and yes. (I did not review the technical accuracy of the diagram in detail, but (a) micromanagement is not helpful and (b) the current documentation is plain wrong.) The one thing that confused me a bit at first was the line +- gimple_statement_base where I then thought all the others should move their indentation left by four or so columns. Which is not what you are trying to state in fact. How about omitting the "+-" from that one? Gerald
Re: Reduce -flto -fprofile-generate memory use
On Thu, 17 Apr 2014, Jan Hubicka wrote: > + * opts.c (common_handle_option): Disable -fipa-reference coorectly > + with -fuse-profile. coorectly -> correctly Gerald
[C PATCH] proposal to add new warning -Wsizeof-array-argument
Hi, Shall it a good idea to add new warning -Wsizeof-array-argument that warns when sizeof is applied on parameter declared as an array ? Similar to clang's -Wsizeof-array-argument: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20110613/042812.html This was also reported as PR6940: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6940 I have attached a patch that adds the warning to C front-end. I implemented it by adding a new member BOOL_BITFIELD is_array_parm to tree_parm_decl. Not sure if that's a good approach. Also should it be enabled by -Wall or -Wextra or only by -Wsizeof-array-argument ? Currently I have made it enabled by -Wall along with -Wsizeof-array-argument. Bootstrapped on x86_64-unknown-linux-gnu. [gcc] * tree-core.h (tree_parm_decl): Add new member BOOL_BITFIELD is_array_parm * tree.h (SET_PARM_DECL_IS_ARRAY): New macro. (PARM_DECL_ARRAY_P): New macro. [gcc/c] * c-decl.c (push_parm_decl): Call SET_PARM_DECL_IS_ARRAY. * c-typeck.c (c_expr_sizeof_expr): Add check for sizeof-array-argument warning. [gcc/c-family] * c.opt (-Wsizeof-array-argument): New option. [gcc/testsuite/gcc.dg] * sizeof-array-argument.c: New test-case. Thanks and Regards, Prathamesh Index: gcc/tree-core.h === --- gcc/tree-core.h (revision 209800) +++ gcc/tree-core.h (working copy) @@ -1411,6 +1411,7 @@ struct GTY(()) tree_const_decl { struct GTY(()) tree_parm_decl { struct tree_decl_with_rtl common; rtx incoming_rtl; + BOOL_BITFIELD is_array_parm; }; struct GTY(()) tree_decl_with_vis { Index: gcc/tree.h === --- gcc/tree.h (revision 209800) +++ gcc/tree.h (working copy) @@ -1742,6 +1742,7 @@ extern void protected_set_expr_location #define TYPE_LANG_SPECIFIC(NODE) \ (TYPE_CHECK (NODE)->type_with_lang_specific.lang_specific) + #define TYPE_VALUES(NODE) (ENUMERAL_TYPE_CHECK (NODE)->type_non_common.values) #define TYPE_DOMAIN(NODE) (ARRAY_TYPE_CHECK (NODE)->type_non_common.values) #define TYPE_FIELDS(NODE) \ @@ -2258,6 +2259,12 @@ extern void decl_value_expr_insert (tree #define DECL_INCOMING_RTL(NODE) \ (PARM_DECL_CHECK (NODE)->parm_decl.incoming_rtl) +#define SET_PARM_DECL_IS_ARRAY(NODE, val) \ + (PARM_DECL_CHECK (NODE)->parm_decl.is_array_parm = (val)) + +#define PARM_DECL_ARRAY_P(NODE) \ + (PARM_DECL_CHECK (NODE)->parm_decl.is_array_parm != 0) + /* Nonzero for a given ..._DECL node means that no warnings should be generated just because this node is unused. */ #define DECL_IN_SYSTEM_HEADER(NODE) \ Index: gcc/c/c-decl.c === --- gcc/c/c-decl.c (revision 209800) +++ gcc/c/c-decl.c (working copy) @@ -4626,13 +4626,14 @@ push_parm_decl (const struct c_parm *par { tree attrs = parm->attrs; tree decl; + int is_array; decl = grokdeclarator (parm->declarator, parm->specs, PARM, false, NULL, &attrs, expr, NULL, DEPRECATED_NORMAL); decl_attributes (&decl, attrs, 0); - + is_array = parm->declarator->kind == cdk_array; + SET_PARM_DECL_IS_ARRAY (decl, is_array); decl = pushdecl (decl); - finish_decl (decl, input_location, NULL_TREE, NULL_TREE, NULL_TREE); } Index: gcc/c/c-typeck.c === --- gcc/c/c-typeck.c (revision 209800) +++ gcc/c/c-typeck.c (working copy) @@ -2730,6 +2730,12 @@ c_expr_sizeof_expr (location_t loc, stru else { bool expr_const_operands = true; + if (warn_sizeof_array_argument && TREE_CODE (expr.value) == PARM_DECL && PARM_DECL_ARRAY_P (expr.value)) + { + warning_at (loc, 0, "sizeof on array parameter %<%s%> shall return size of %qT", + IDENTIFIER_POINTER (DECL_NAME (expr.value)), expr.original_type); + inform (DECL_SOURCE_LOCATION (expr.value), "declared here"); + } tree folded_expr = c_fully_fold (expr.value, require_constant_value, &expr_const_operands); ret.value = c_sizeof (loc, TREE_TYPE (folded_expr)); Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt (revision 209800) +++ gcc/c-family/c.opt (working copy) @@ -509,6 +509,9 @@ Warn about missing fields in struct init Wsizeof-pointer-memaccess C ObjC C++ ObjC++ Var(warn_sizeof_pointer_memaccess) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +Wsizeof-array-argument +C Var(warn_sizeof_array_argument) Warning LangEnabledBy(C,Wall) + Wsuggest-attribute=format C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning Warn about functions which might be candidates for format attributes Index: gcc/testsuite/gcc.dg/sizeof-array-argument.c === --- gcc/testsuite/gcc.dg/sizeof-array-argument.c (revision 0) +++ gcc/testsuite/gcc.dg/sizeof-array-argument.c (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-W
Re: [DOC PATCH] Rewrite docs for inline asm
The perfect is the enemy of the good. >From all I have seen and heard, this rewrite is a clear improvement over the status quo. So I am going to review and approve it wearing my doc maintainer hat, deferring to and relying on Andrew and Richard and their deep expertise on the technical side. Please take my comments below into account for an updated patch, and once Andrew and Richard have signed of, this is then good to commit. (Patches of this size really are tough to review. That surely has contributed to the delay getting this in the tree most significantly. In fact, I fell asleep over my notebook thrice reviewing this last evening, though traveling from US West coast to Central Europe without sleep during the flight probably did not exactly help, either. :-) On Fri, 25 Apr 2014, James Greenhalgh wrote: >> +@menu >> +* Basic Asm:: Inline assembler with no operands. >> +* Extended Asm:: Inline assembler with operands. >> +* Constraints::Constraints for asm operands Should this be "Asm operands" based on the other references here or, better @code{asm} in all cases here? >> +* Explicit Reg Vars:: Defining variables residing in specified registers. Should this be "Register" instead of just "Reg"? And "Variables" instead of just "Vars"? >> +* Size of an asm:: How GCC calculates the size of an asm block. @code{asm} ? >> +@subsection Basic Asm - Assembler Instructions with No Operands --- instead of - >> +To create headers compatible with ISO C, write @code{__asm__} instead of >> +@code{asm} (@pxref{Alternate Keywords}). Here it's just ISO C, whereas in case of __inline__ you had ISO C90. Would it make sense to just use ISO C throughout? >> +By definition, a Basic @code{asm} statement is one with no operands. >> +@code{asm} statements that contain one or more colons (used to delineate >> +operands) are considered to be Extended (for example, @code{asm("int $3")} >> +is Basic, and @code{asm("int $3" : )} is Extended). @xref{Extended Asm}. At this point the reader does not yet know the concept of those colons, does she? So that can be seen as a bit confusing. >> +@subsubheading Parameters >> +@emph{AssemblerInstructions} >> +@* >> +This is a literal string that specifies the assembler code. The string can Just "assembler code", without "the", would be my take, but let's see what native speakers so. >> +The GCC compiler does not parse the assembler instructions themselves and "GCC does". GCC stands for GNU Compiler Collection, and GNU Compiler Collection Compiler may be a bit confusing. ;-) >> +You may place multiple assembler instructions together in a single >> @code{asm} >> +string, separated by the characters normally used in assembly code for the >> +system. Might "by the same characters normally...the target system" be more clear? >> +A combination that works in most places is a newline to break the >> +line, plus a tab character to move to the instruction field (written >> +as "\n\t"). Will everyone know what an instruction field is? I'm not sure it's that common of a term. >> +Do not expect a sequence of @code{asm} statements to remain perfectly >> +consecutive after compilation. If certain instructions need to remain >> +consecutive in the output, put them in a single multi-instruction asm >> +statement. Note that GCC's optimizer can move @code{asm} statements >> +relative to other code, including across jumps. optimizers (plural) >> +GCC's optimizer does not know about these jumps, and therefore cannot take >> +account of them when deciding how to optimize. Jumps from @code{asm} to C >> +labels are only supported in Extended @code{asm}. How about just saying "GCC does not know..:"? >> +@subsubheading Remarks >> +Using Extended @code{asm} will typically produce smaller, safer, and more >> +efficient code, and in most cases it is a better solution. Why? >> When writing inline assembly language outside C functions, however, >> you must use Basic @code{asm}. Extended @code{asm} statements have to >> be inside a C function. Might that be "outside of C functions"? Native speakers? The way this is phrased sounds a bit negative. Is this really a drawback? >> +Under certain circumstances, GCC may duplicate (or remove duplicates of) >> your >> +asm code as part of optimization. This can lead to unexpected duplicate >> +symbol errors during compilation if your asm code defines symbols or labels. "as part of optimization" -> "when optimizing" @code{asm} >> +Safely accessing C data and calling functions from Basic @code{asm} is more >> +complex than it may appear. To access C data, it is better to use Extended >> +@code{asm}. Here we are, this looks like (part of) the answer to my question "Why?" above. :-) >> +Since GCC does not parse the AssemblerInstructions, it has no visibility of ^ Something wrong here. >> +any symbols it references. This may result in th
Re: [PATCH 01/89] Const-correctness fixes for some gimple accessors
On Mon, 21 Apr 2014, David Malcolm wrote: > It was pointed out to me off-list that this patch series lacks > documentation changes. I'm working on fixing that, though am not sure I > want to fill everyone inboxes with an updated set of patches yet. > Should I send a combined patch for the documentation changes? (I can > break it up and merge it into the individual changes on commit, or if > these need editing). > > In any case I fixed up the corresponding entries in gcc/doc/gimple.texi, > double-checked the bootstrap/regtest/HTML generation of this one (on top > of r209545), and committed it to trunk as r209548. I'm attaching what I > actually committed. You certainly can consider any doc changes that mirror code changes as pre-approved. Based on what Jeff said, about changes like this one (const-correctness) going in right away, perhaps get these off your table together with their associated documentation aspects? And then do one doc change for all the rest -- or individual ones, I don't have a strong preference. Thanks for working on this! Gerald
Re: [Patch 2/3] Make regex's infinite loop detection accurate
On Sat, Apr 26, 2014 at 1:00 PM, Jonathan Wakely wrote: > Maybe a dumb question (I don't understand a lot of the regex code!) > but is it correct to set this to 1 in the case where > __rep_count.first != _M_current ? Could that result in the count > going downwards from 2 to 1? (Maybe that's correct?) As I said in the comment, __rep_count records how many times (__rep_count.second) this node is visited *under certain input iterator* (__rep_count.first). That is to say, if the current iterator (_M_current) is not changing, but the executor keeps visiting this node, times of visit needs to be recorded. Once the _M_current changed, the count should be reset to 0. We simply set it to 1 for the "set to 0 then increase by 1" operations. The idea behind this is that, for example, regex "(?:)*" will form a self-epsilon-circle in the NFA graph. While it's executed, an infinite loop may be entered because this circle doesn't consumes any input (aka, the current iterator is not changing). So for given node, we need to prevent from entering infinite loop by checking the iterator since last visited. If the iterator didn't change, we shall refuse to follow the edge-on-the-circle. There's one more question: we shouldn't simply refuse to follow the edge that last visited under the same iterator, because visiting some node (begin capture, end capture) on the circle has side effects. To make sure these side effects happen, one more time of reentering is permitted. That is why we set the counter and permitted entering the node for at most 2 times. About the overall regex code, I think I can compose a blog entry to explain it, if necessary? -- Regards, Tim Shen commit b188832f9bd85c71ef70693859e721974163c24b Author: tim Date: Fri Apr 25 01:50:11 2014 -0400 2014-04-25 Tim Shen * include/bits/regex_executor.h: Add _M_rep_count to track how many times this repeat node are visited. * include/bits/regex_executor.tcc (_Executor<>::_M_rep_once_more, _Executor<>::_M_dfs): Use _M_rep_count to prevent entering infinite loop. diff --git a/libstdc++-v3/include/bits/regex_executor.h b/libstdc++-v3/include/bits/regex_executor.h index 708c78e..c110b88 100644 --- a/libstdc++-v3/include/bits/regex_executor.h +++ b/libstdc++-v3/include/bits/regex_executor.h @@ -73,6 +73,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_results(__results), _M_match_queue(__dfs_mode ? nullptr : new vector>()), + _M_rep_count(_M_nfa.size()), _M_visited(__dfs_mode ? nullptr : new vector(_M_nfa.size())), _M_flags((__flags & regex_constants::match_prev_avail) ? (__flags @@ -104,6 +105,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION private: template void + _M_rep_once_more(_StateIdT); + + template + void _M_dfs(_StateIdT __start); template @@ -149,9 +154,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _ResultsVec& _M_results; // Used in BFS, saving states that need to be considered for the next // character. - std::unique_ptr>> _M_match_queue; + unique_ptr>> _M_match_queue; // Used in BFS, indicating that which state is already visited. - std::unique_ptr> _M_visited; + vector>_M_rep_count; + unique_ptr> _M_visited; _FlagT_M_flags; // To record current solution. _StateIdT _M_start_state; diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc index 68a5e04..0daf211 100644 --- a/libstdc++-v3/include/bits/regex_executor.tcc +++ b/libstdc++-v3/include/bits/regex_executor.tcc @@ -161,6 +161,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return false; } + // __rep_count records how many times (__rep_count.second) + // this node is visited under certain input iterator + // (__rep_count.first). This prevent the executor from entering + // infinite loop by refusing to continue when it's already been + // visited more than twice. It's `twice` instead of `once` because + // we need to spare one more time for potential group capture. + template + template +void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>:: +_M_rep_once_more(_StateIdT __i) +{ + const auto& __state = _M_nfa[__i]; + auto& __rep_count = _M_rep_count[__i]; + if (__rep_count.second == 0 || __rep_count.first != _M_current) + { + auto __back = __rep_count; + __rep_count.first = _M_current; + __rep_count.second = 1; + _M_dfs<__match_mode>(__state._M_alt); + __rep_count = __back; + } + else + { + if (__rep_count.second < 2) + { + __rep_count.second++; + _M_dfs<__match_mode>(_