Re: [PATCH v2] libitm: sh: avoid absolute relocation in shared library (PR 86712)
On Fri, 2018-08-03 at 14:54 -0600, Jeff Law wrote: > On 07/28/2018 07:04 AM, slyfox.inbox.ru via gcc-patches wrote: > > > > From: Sergei Trofimovich > > > > Cc: Andreas Schwab > > Cc: Torvald Riegel > > Cc: Alexandre Oliva > > Cc: Oleg Endo > > Cc: Kaz Kojima > > Signed-off-by: Sergei Trofimovich > > --- > > libitm/config/sh/sjlj.S | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/libitm/config/sh/sjlj.S b/libitm/config/sh/sjlj.S > > index 043f36749be..f265ab8f898 100644 > > --- a/libitm/config/sh/sjlj.S > > +++ b/libitm/config/sh/sjlj.S > > @@ -53,7 +53,7 @@ _ITM_beginTransaction: > > #else > > cfi_def_cfa_offset (4*10) > > #endif > > -#if defined HAVE_ATTRIBUTE_VISIBILITY || !defined __PIC__ > > +#if !defined __PIC__ > > mov.l .Lbegin, r1 > > jsr @r1 > > movr15, r5 > > @@ -78,7 +78,7 @@ _ITM_beginTransaction: > > > > .align 2 > > .Lbegin: > > -#if defined HAVE_ATTRIBUTE_VISIBILITY || !defined __PIC__ > > +#if !defined __PIC__ > > .long GTM_begin_transaction > > #else > > .long GTM_begin_transaction@PCREL-(.Lbegin0-.) > > > THanks. I installed this version. > Thanks Jeff. If there are no objections, I'll backport it to the 7 and 8 branches. Cheers, Oleg
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: > We should always set cfun->machine->max_used_stack_alignment if the > maximum stack slot alignment may be greater than 64 bits. > > Tested on i686 and x86-64. OK for master and backport for GCC 8? Can you explain why 64 bits, and what this value represents? Is this value the same for 64bit and 32bit targets? Should crtl->max_used_stack_slot_alignment be compared to STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? Uros. > Thanks. > > H.J. > > gcc/ > > PR target/86386 > * config/i386/i386.c (ix86_finalize_stack_frame_flags): Set > cfun->machine->max_used_stack_alignment if needed. > > gcc/testsuite/ > > PR target/86386 > * gcc.target/i386/pr86386.c: New file. > --- > gcc/config/i386/i386.c | 6 +++--- > gcc/testsuite/gcc.target/i386/pr86386.c | 26 + > 2 files changed, 29 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr86386.c > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index ee409cfe7e4..4a0a050b3a2 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -13281,12 +13281,12 @@ ix86_finalize_stack_frame_flags (void) > recompute_frame_layout_p = true; > } > } > - else if (crtl->max_used_stack_slot_alignment > - > crtl->preferred_stack_boundary) > + else if (crtl->max_used_stack_slot_alignment > 64) > { >/* We don't need to realign stack. But we still need to keep > stack frame properly aligned to satisfy the largest alignment > -of stack slots. */ > +of stack slots if the maximum stack slot alignment may be > +greater than 64 bits. */ >if (ix86_find_max_used_stack_alignment (stack_alignment, true)) > cfun->machine->max_used_stack_alignment > = stack_alignment / BITS_PER_UNIT; > diff --git a/gcc/testsuite/gcc.target/i386/pr86386.c > b/gcc/testsuite/gcc.target/i386/pr86386.c > new file mode 100644 > index 000..a67cf45444e > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr86386.c > @@ -0,0 +1,26 @@ > +/* PR target/86386 */ > +/* { dg-do run { target { avx_runtime && int128 } } } */ > +/* { dg-options "-Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx" } > */ > + > +unsigned c, d, e, f; > + > +unsigned __attribute__((noipa)) > +foo (unsigned char g, unsigned short h, unsigned i, unsigned long long j, > + unsigned char k, unsigned short l, unsigned m, unsigned __int128 n) > +{ > + __builtin_memset (&e, 0, 3); > + n <<= m; > + __builtin_memcpy (&m, 2 + (char *) &n, 1); > + m >>= 0; > + d ^= __builtin_mul_overflow (l, n, &m); > + return m; > +} > + > +int > +main () > +{ > + unsigned __int128 x = foo (0, 0, 0, 0, 0, 4, 1, 3); > + if (x != 24) > +__builtin_abort (); > + return 0; > +} > -- > 2.17.1 >
Re: [PATCH] [Ada] Make middle-end string literals NUL terminated
Hi Olivier, I think I like your idea a lot, it should be highly useful for Fortran and GO as well. Would somthing something like this (untested patch) work for you on top of my patch series. It makes use of the zero-termination properties of STRING_CSTs, that the other patch ensures. I have two C test cases for this: $ cat y3.c const char str[3] = "abc"; $ gcc -fmerge-all-constants -S y3.c $ cat y3.s .file "y3.c" .text .globl str .section.rodata.str1.1,"aMS",@progbits,1 .type str, @object .size str, 3 str: .string "abc" .ident "GCC: (GNU) 9.0.0 20180730 (experimental)" .section.note.GNU-stack,"",@progbits $ cat y4.c const char str[4] = "abc"; $ gcc -fmerge-all-constants -S y4.c $ cat y4.s $ cat y4.s .file "y4.c" .text .globl str .section.rodata.str1.1,"aMS",@progbits,1 .type str, @object .size str, 4 str: .string "abc" .ident "GCC: (GNU) 9.0.0 20180730 (experimental)" .section.note.GNU-stack,"",@progbits The .size directive uses the DECL_SIZE_UNIT, and is still 3. I don't know of that needs to change or not. But what is important that we have .string "abc" and not .ascii "abc". What do you think? Thanks Bernd. Index: gcc/varasm.c === --- gcc/varasm.c (revision 263072) +++ gcc/varasm.c (working copy) @@ -111,7 +111,8 @@ static int compare_constant (const tree, const tre static void output_constant_def_contents (rtx); static void output_addressed_constants (tree); static unsigned HOST_WIDE_INT output_constant (tree, unsigned HOST_WIDE_INT, - unsigned int, bool); + unsigned int, bool, + bool = false); static void globalize_decl (tree); static bool decl_readonly_section_1 (enum section_category); #ifdef BSS_SECTION_ASM_OP @@ -827,7 +828,7 @@ mergeable_string_section (tree decl ATTRIBUTE_UNUS unit = GET_MODE_SIZE (mode); /* Check for embedded NUL characters. */ - for (i = 0; i < len; i += unit) + for (i = 0; i < len - unit; i += unit) { for (j = 0; j < unit; j++) if (str[i + j] != '\0') @@ -2117,7 +2118,7 @@ assemble_noswitch_variable (tree decl, const char static void assemble_variable_contents (tree decl, const char *name, - bool dont_output_data) + bool dont_output_data, bool merge_strings = false) { /* Do any machine/system dependent processing of the object. */ #ifdef ASM_DECLARE_OBJECT_NAME @@ -2140,7 +2141,7 @@ assemble_variable_contents (tree decl, const char output_constant (DECL_INITIAL (decl), tree_to_uhwi (DECL_SIZE_UNIT (decl)), get_variable_align (decl), - false); + false, merge_strings); else /* Leave space for it. */ assemble_zeros (tree_to_uhwi (DECL_SIZE_UNIT (decl))); @@ -2316,7 +2317,9 @@ assemble_variable (tree decl, int top_level ATTRIB switch_to_section (sect); if (align > BITS_PER_UNIT) ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); - assemble_variable_contents (decl, name, dont_output_data); + assemble_variable_contents (decl, name, dont_output_data, + sect->common.flags & SECTION_MERGE + && sect->common.flags & SECTION_STRINGS); if (asan_protected) { unsigned HOST_WIDE_INT int size @@ -4786,6 +4789,19 @@ static unsigned HOST_WIDE_INT output_constructor (tree, unsigned HOST_WIDE_INT, unsigned int, bool, oc_outer_state *); +/* Find out if a string P of length LEN and character size UNIT + has double nul termination. + The string is already known to have a nul termination at + offset LEN. Find out if there is already a NUL termination + at offset LEN - UNIT. */ + +static bool +redundant_nul_term_p (const char *p, unsigned len, unsigned unit) +{ + gcc_assert(len >= unit); + return !memcmp (p + len, p + len - unit, unit); +} + /* Output assembler code for constant EXP, with no label. This includes the pseudo-op such as ".int" or ".byte", and a newline. Assumes output_addressed_constants has been done on EXP already. @@ -4812,7 +4828,7 @@ output_constructor (tree, unsigned HOST_WIDE_INT, static unsigned HOST_WIDE_INT output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align, - bool reverse) + bool reverse, bool merge_strings /* = false */) { enum tree_code code; unsigned HOST_WIDE_INT thissize; @@ -4940,8 +4956,13 @@ output_constant (tree exp, unsigned HOST_WIDE_INT case CONSTRUCTOR: return output_constructor (exp, size, align, reverse, NULL); case STRING_CST: - thissize - = MIN ((unsigned HOST_WIDE_INT)TREE_STRING_LENGTH (exp), size); + thissize = (unsigned HOST_WIDE_INT)TREE_STRING_LENGTH (exp); + if (thissize > size && merge_strings + && !redundant_nul_term_p (TREE_STRING_POINTER (exp), +size, thissize - size)) + ; + else + thissi
Re: [PATCH][2/4] Add rev_post_order_and_mark_dfs_back_seme
On Wed, 1 Aug 2018 at 16:31, Richard Biener wrote: > --- a/gcc/cfganal.c > +++ b/gcc/cfganal.c > @@ -1057,6 +1057,119 @@ pre_and_rev_post_order_compute (int *pre_order, int > *rev_post_order, >return pre_order_num; > } > > +/* Unline pre_and_rev_post_order_compute we fill rev_post_order backwards Unlike > + so iterating in RPO order needs to start with rev_post_order[n - 1] > + going to rev_post_order[0]. If FOR_ITERATION is true then try to > + make CFG cycles fit into small contiguous regions of the RPO order. > + When FOR_ITERATION is true this requires up-to-date loop structures. */ > + > +int > +rev_post_order_and_mark_dfs_back_seme (struct function *fn, edge entry, > + bitmap exit_bbs, bool for_iteration, > + int *rev_post_order) > +{ > + int pre_order_num = 0; > + int rev_post_order_num = 0; > + > + /* Allocate stack for back-tracking up CFG. Worst case we need > + O(n^2) edges but the following should suffice in practice without > + a need to re-allocate. */ > + auto_vec stack (2 * n_basic_blocks_for_fn (fn)); > + > + int *pre = XNEWVEC (int, 2 * last_basic_block_for_fn (fn)); > + int *post = pre + last_basic_block_for_fn (fn); > + > + /* BB flag to track nodes that have been visited. */ > + auto_bb_flag visited (fn); > + /* BB flag to track which nodes whave post[] assigned to avoid > + zeroing post. */ s/whave/have/. > + free (pre); XDELETEVEC (pre); > + > + /* Clear the temporarily allocated flags. */ > + for (int i = 0; i < rev_post_order_num; ++i) > +BASIC_BLOCK_FOR_FN (fn, rev_post_order[i])->flags > + &= ~(post_assigned|visited); > + > + return rev_post_order_num; > +} > + > + > + > /* Compute the depth first search order on the _reverse_ graph and > store in the array DFS_ORDER, marking the nodes visited in VISITED. s/store/store it/ thanks,
patch to bug #86829
Closes bug #86829 Description: Adds substitution rules for both sin(atan(x)) and cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 / sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity can be proved mathematically. Changelog: 2018-08-03 Giuliano Belinassi * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)). Bootstrap and Testing: There were no unexpected failures in a proper testing in GCC 8.1.0 under a x86_64 running Ubuntu 18.04. Test run by giulianob on Fri Aug 3 17:01:33 2018 Native configuration is x86_64-pc-linux-gnu === gcc tests === Schedule of variations: unix Running target unix Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /home/giulianob/Downloads/gcc/src/gcc/testsuite/config/default.exp as tool-and-target-specific interface file. Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/compile/compile.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/execute.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/ieee/ieee.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/asan/asan.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/atomic/atomic.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/autopar/autopar.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/charset/charset.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/compat/compat.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/cpp/cpp.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/cpp/trad/trad.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/debug/debug.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf2.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/dfp/dfp.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/dg.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/fixed-point/fixed-point.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/format/format.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/goacc/goacc.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/gomp/gomp.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/graphite/graphite.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/guality/guality.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/ipa/ipa.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/lto/lto.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/noncompile/noncompile.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/params/params.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/pch/pch.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/plugin/plugin.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/rtl/rtl.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/sancov/sancov.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/simulate-thread/simulate-thread.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/special/mips-abi.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/special/special.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/sso/sso.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tls/tls.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tm/tm.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/dg-torture.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/stackalign/stackalign.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/tls/tls.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tree-ssa/tree-ssa.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tsan/tsan.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/ubsan/ubsan.exp ... Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/vect/costmodel
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: > On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: >> We should always set cfun->machine->max_used_stack_alignment if the >> maximum stack slot alignment may be greater than 64 bits. >> >> Tested on i686 and x86-64. OK for master and backport for GCC 8? > > Can you explain why 64 bits, and what this value represents? Is this > value the same for 64bit and 32bit targets? > > Should crtl->max_used_stack_slot_alignment be compared to > STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? In this case, we don't need to realign the incoming stack since both crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary are 128 bits. We don't compute the largest alignment of stack slots to keep stack frame properly aligned for it. Normally it is OK. But if the largest alignment of stack slots > 64 bits and we don't keep stack frame proper aligned, we will get segfault if aligned vector load/store are used on these unaligned stack slots. My patch computes the largest alignment of stack slots, which are actually used, if the largest alignment of stack slots allocated is > 64 bits, which is the smallest alignment for misaligned load/store. Here is the diff of before and after: diff -up old/x.s new/x.s --- old/x.s 2018-08-02 12:39:22.916400504 -0700 +++ new/x.s 2018-08-02 12:38:35.853729415 -0700 @@ -15,6 +15,7 @@ foo: movq %rsp, %rbp .cfi_def_cfa_register 6 pushq %rbx + subq $8, %rsp <<< Stack frame is properly aligned to 16 bytes. .cfi_offset 3, -24 stosw movl 16(%rbp), %ecx @@ -65,6 +66,7 @@ foo: .L9: xorl %r8d, d(%rip) movl %esi, %eax + popq %rdx popq %rbx popq %rbp .cfi_def_cfa 7, 8 -- H.J.
[Patch, Fortran, F08] PR 45521: GENERIC resolution with ALLOCATABLE/POINTER and PROCEDURE
Hi all, this patch should finally fix up the last wrinkles of PR 45521, which deals with disambiguating specific procedures in a generic interface via the pointer/allocatable attributes of the arguments (legal in F08). For 'ordinary' generic interfaces this already works (cf. 'generic_correspondence'), but not for operator interfaces, which are treated a bit differently (see 'gfc_compare_interfaces'). The patch basically copies over the usage of 'compare_ptr_alloc' from 'generic_correspondence' to the relevant part of 'gfc_compare_interfaces'. Regtests cleanly on x86_64-linux-gnu. Ok for trunk? Cheers, Janus 2018-08-04 Janus Weil PR fortran/45521 * interface.c (gfc_compare_interfaces): Apply additional distinguishability criteria of F08 to operator interfaces. 2018-08-04 Janus Weil PR fortran/45521 * gfortran.dg/interface_assignment_6.f90: New test case. Index: gcc/fortran/interface.c === --- gcc/fortran/interface.c (revision 263178) +++ gcc/fortran/interface.c (working copy) @@ -1776,7 +1776,7 @@ } else { - /* Only check type and rank. */ + /* Operators: Only check type and rank of arguments. */ if (!compare_type (f2->sym, f1->sym)) { if (errmsg != NULL) @@ -1794,6 +1794,15 @@ symbol_rank (f2->sym)); return false; } + if ((gfc_option.allow_std & GFC_STD_F2008) + && (compare_ptr_alloc(f1->sym, f2->sym) + || compare_ptr_alloc(f2->sym, f1->sym))) + { + if (errmsg != NULL) + snprintf (errmsg, err_len, "Mismatching POINTER/ALLOCATABLE " + "attribute in argument '%s' ", f1->sym->name); + return false; + } } } ! { dg-do compile } ! ! PR 45521: [F08] GENERIC resolution with ALLOCATABLE/POINTER and PROCEDURE ! ! Contributed by Janus Weil module inteface_assignment_6 type :: t end type ! this was rejected as ambiguous, but is valid in F08 interface assignment(=) procedure testAlloc procedure testPtr end interface contains subroutine testAlloc(obj, val) type(t), allocatable, intent(out) :: obj integer, intent(in) :: val end subroutine subroutine testPtr(obj, val) type(t), pointer, intent(out) :: obj integer, intent(in) :: val end subroutine end
[PATCH] Handle not explicitly zero terminated strings in merge sections
Hi! This patch is inspired by Olivier's feedback to my previous patch on the zero-termination of Ada STRING_CST. The idea is that strings that do not have embedded nul characters _and_ do not happen to be zero-terminated in the DECL_UNIT_SIZE, are currently not in the merge string sections. To be in the merge string section they need a terminating nul character, that gets written directly in varasm while assembling the string constant. This patch uses the new string properties that my previous patch series implements, and is based on the other patches here: [PATCH] Check the STRING_CSTs in varasm.c https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00331.html [PATCH] Handle overlength strings in the C FE https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00142.html [PATCH] Handle overlength strings in C++ FE https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00045.html Approved: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00105.html [PATCH] Make GO string literals properly NUL terminated https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01931.html [PATCH] [Ada] Make middle-end string literals NUL terminated https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01929.html The existing test case gcc.dg/merge-all-constants-1.c contains two overlength strings that get now stripped down by the C FE, and look in the middle end exactly like normal Ada, Fortran or Go strings. And get now allocated in the merge.str section, unless they have an embedded nul character, which I changed to make the test pass again. Olivier, could you add test cases from the Ada side to this? Bootstrapped and reg-tested on x86_64-pc-linux-gnu. Is it OK for trunk (after all pre-condition patches are approved/committed)? Thanks Bernd.
Re: [PATCH] Handle not explicitly zero terminated strings in merge sections
Again, with patch On 08/04/18 17:43, Bernd Edlinger wrote: > Hi! > > > This patch is inspired by Olivier's feedback to my previous patch on the > zero-termination of Ada STRING_CST. > > The idea is that strings that do not have embedded nul characters _and_ > do not happen to be zero-terminated in the DECL_UNIT_SIZE, are currently > not in the merge string sections. To be in the merge string section > they need a terminating nul character, that gets written directly > in varasm while assembling the string constant. This patch uses > the new string properties that my previous patch series implements, > and is based on the other patches here: > > [PATCH] Check the STRING_CSTs in varasm.c > https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00331.html > > [PATCH] Handle overlength strings in the C FE > https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00142.html > > [PATCH] Handle overlength strings in C++ FE > https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00045.html > Approved: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00105.html > > [PATCH] Make GO string literals properly NUL terminated > https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01931.html > > [PATCH] [Ada] Make middle-end string literals NUL terminated > https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01929.html > > > The existing test case gcc.dg/merge-all-constants-1.c > contains two overlength strings that get now stripped down > by the C FE, and look in the middle end exactly like normal > Ada, Fortran or Go strings. And get now allocated > in the merge.str section, unless they have an embedded > nul character, which I changed to make the test pass again. > > Olivier, could you add test cases from the Ada side to this? > > > Bootstrapped and reg-tested on x86_64-pc-linux-gnu. > Is it OK for trunk (after all pre-condition patches > are approved/committed)? > > > Thanks > Bernd. gcc: 2018-08-04 Bernd Edlinger * varasm.c (output_constant): Add new parameter merge_strings. (assemble_variable_contents): Likewise. (mergeable_string_section): Don't fail if the last char is non-zero. (assemble_variable): Pass merge_strings for string merge sections. (redundant_nul_term_p): New helper function. (output_constant): Make strings properly zero terminated in merge string sections. testsuite: 2018-08-04 Bernd Edlinger * gcc.dg/merge-all-constants-1.c: Adjust test. Index: gcc/varasm.c === --- gcc/varasm.c (revision 263072) +++ gcc/varasm.c (working copy) @@ -111,7 +111,8 @@ static int compare_constant (const tree, const tre static void output_constant_def_contents (rtx); static void output_addressed_constants (tree); static unsigned HOST_WIDE_INT output_constant (tree, unsigned HOST_WIDE_INT, - unsigned int, bool); + unsigned int, bool, + bool = false); static void globalize_decl (tree); static bool decl_readonly_section_1 (enum section_category); #ifdef BSS_SECTION_ASM_OP @@ -827,7 +828,7 @@ mergeable_string_section (tree decl ATTRIBUTE_UNUS unit = GET_MODE_SIZE (mode); /* Check for embedded NUL characters. */ - for (i = 0; i < len; i += unit) + for (i = 0; i < len - unit; i += unit) { for (j = 0; j < unit; j++) if (str[i + j] != '\0') @@ -2117,7 +2118,7 @@ assemble_noswitch_variable (tree decl, const char static void assemble_variable_contents (tree decl, const char *name, - bool dont_output_data) + bool dont_output_data, bool merge_strings = false) { /* Do any machine/system dependent processing of the object. */ #ifdef ASM_DECLARE_OBJECT_NAME @@ -2140,7 +2141,7 @@ assemble_variable_contents (tree decl, const char output_constant (DECL_INITIAL (decl), tree_to_uhwi (DECL_SIZE_UNIT (decl)), get_variable_align (decl), - false); + false, merge_strings); else /* Leave space for it. */ assemble_zeros (tree_to_uhwi (DECL_SIZE_UNIT (decl))); @@ -2316,7 +2317,9 @@ assemble_variable (tree decl, int top_level ATTRIB switch_to_section (sect); if (align > BITS_PER_UNIT) ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT)); - assemble_variable_contents (decl, name, dont_output_data); + assemble_variable_contents (decl, name, dont_output_data, + sect->common.flags & SECTION_MERGE + && sect->common.flags & SECTION_STRINGS); if (asan_protected) { unsigned HOST_WIDE_INT int size @@ -4786,6 +4789,19 @@ static unsigned HOST_WIDE_INT output_constructor (tree, unsigned HOST_WIDE_INT, unsigned int, bool, oc_outer_state *); +/* Find out if a string P of length LEN and character size UNIT + has double nul termination. + The string is already known to have a nul termination at + offset LEN. Find out if there is already a NUL termination + at offset LEN - UNIT. */ + +static bool +redundant_nul_term_p (const char *p, unsigned len, unsigned unit) +{ + gcc_assert(len >= unit); + return !memcm
Re: [testsuite, committed] Use relative line numbers in gcc.dg/guality
On Jul 09 2018, Tom de Vries wrote: > this patches uses relative line numbers in gcc.dg/guality where obvious: > either the relative line number is '.', '.-1' or '.+1', or adjacent to > another obvious case. This introduced a lot of test names that are no longer unique. gcc.dg/guality/csttest.c -O0 line .+1 r == -(0x1efULL << 50) gcc.dg/guality/csttest.c -O0 line .+10 i == (0x1efULL << 50) gcc.dg/guality/csttest.c -O0 line .+11 h == (0x1efULL << 33) gcc.dg/guality/csttest.c -O0 line .+12 g == (0x37ULL << 50) gcc.dg/guality/csttest.c -O0 line .+13 f == (0x37ULL << 31) gcc.dg/guality/csttest.c -O0 line .+14 e == (0x17ULL << 50) gcc.dg/guality/csttest.c -O0 line .+15 d == (0x17ULL << 15) gcc.dg/guality/csttest.c -O0 line .+16 c == (0x1ULL << 35) gcc.dg/guality/csttest.c -O0 line .+17 b == (0x7ULL << 33) gcc.dg/guality/csttest.c -O0 line .+18 a == (0x17ULL << 31) gcc.dg/guality/csttest.c -O0 line .+2 q == -(0x1efULL << 33) gcc.dg/guality/csttest.c -O0 line .+3 p == -(0x37ULL << 50) gcc.dg/guality/csttest.c -O0 line .+4 o == -(0x37ULL << 31) gcc.dg/guality/csttest.c -O0 line .+5 n == -(0x17ULL << 50) gcc.dg/guality/csttest.c -O0 line .+6 m == -(0x17ULL << 15) gcc.dg/guality/csttest.c -O0 line .+7 l == -(0x1ULL << 35) gcc.dg/guality/csttest.c -O0 line .+8 k == -(0x7ULL << 33) gcc.dg/guality/csttest.c -O0 line .+9 j == -(0x17ULL << 31) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+1 r == -(0x1efULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+10 i == (0x1efULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+11 h == (0x1efULL << 33) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+12 g == (0x37ULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+13 f == (0x37ULL << 31) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+14 e == (0x17ULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+15 d == (0x17ULL << 15) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+16 c == (0x1ULL << 35) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+17 b == (0x7ULL << 33) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+18 a == (0x17ULL << 31) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+2 q == -(0x1efULL << 33) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+3 p == -(0x37ULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+4 o == -(0x37ULL << 31) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+5 n == -(0x17ULL << 50) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+6 m == -(0x17ULL << 15) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+7 l == -(0x1ULL << 35) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+8 k == -(0x7ULL << 33) gcc.dg/guality/csttest.c -O1 -DPREVENT_OPTIMIZATION line .+9 j == -(0x17ULL << 31) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+1 r == -(0x1efULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+10 i == (0x1efULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+11 h == (0x1efULL << 33) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+12 g == (0x37ULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+13 f == (0x37ULL << 31) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+14 e == (0x17ULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+15 d == (0x17ULL << 15) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+16 c == (0x1ULL << 35) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+17 b == (0x7ULL << 33) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+18 a == (0x17ULL << 31) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+2 q == -(0x1efULL << 33) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+3 p == -(0x37ULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+4 o == -(0x37ULL << 31) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+5 n == -(0x17ULL << 50) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+6 m == -(0x17ULL << 15) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+7 l == -(0x1ULL << 35) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+8 k == -(0x7ULL << 33) gcc.dg/guality/csttest.c -O2 -DPREVENT_OPTIMIZATION line .+9 j == -(0x17ULL << 31) gcc.dg/guality/csttest.c -O2 -flto -fno-use-linker-plugin -flto-partition=none -DPREVENT_OPTIMIZATION line .+1 r == -(0x1efULL << 50) gcc.dg/guality/csttest.c -O2 -flto -fno-use-linker-plugin -flto-partition=none -DPREVENT_OPTIMIZATION line .+10 i == (0x1efULL << 50) gcc.dg/guality/csttest.c -O2 -flto -fno-use-linker-plugin -flto-partition=none -DPREVENT_OPTIMIZATION line .+11 h == (0x1efULL << 33) gcc.dg/g
Re: [PATCH] Handle not explicitly zero terminated strings in merge sections
Aehm, I forgot the Fortran FE patch which is also a pre-condition (at least for building with all languages): [PATCH] Create internally nul terminated string literals in fortan FE https://gcc.gnu.org/ml/fortran/2018-08/msg0.html Thanks Bernd. On 08/04/18 17:44, Bernd Edlinger wrote: > Again, with patch > > On 08/04/18 17:43, Bernd Edlinger wrote: >> Hi! >> >> >> This patch is inspired by Olivier's feedback to my previous patch on the >> zero-termination of Ada STRING_CST. >> >> The idea is that strings that do not have embedded nul characters _and_ >> do not happen to be zero-terminated in the DECL_UNIT_SIZE, are currently >> not in the merge string sections. To be in the merge string section >> they need a terminating nul character, that gets written directly >> in varasm while assembling the string constant. This patch uses >> the new string properties that my previous patch series implements, >> and is based on the other patches here: >> >> [PATCH] Check the STRING_CSTs in varasm.c >> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00331.html >> >> [PATCH] Handle overlength strings in the C FE >> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00142.html >> >> [PATCH] Handle overlength strings in C++ FE >> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00045.html >> Approved: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00105.html >> >> [PATCH] Make GO string literals properly NUL terminated >> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01931.html >> >> [PATCH] [Ada] Make middle-end string literals NUL terminated >> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01929.html >> >> >> The existing test case gcc.dg/merge-all-constants-1.c >> contains two overlength strings that get now stripped down >> by the C FE, and look in the middle end exactly like normal >> Ada, Fortran or Go strings. And get now allocated >> in the merge.str section, unless they have an embedded >> nul character, which I changed to make the test pass again. >> >> Olivier, could you add test cases from the Ada side to this? >> >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk (after all pre-condition patches >> are approved/committed)? >> >> >> Thanks >> Bernd.
Re: dejagnu version update?
On Tue, 16 May 2017 at 21:08, Mike Stump wrote: > > On May 16, 2017, at 5:16 AM, Jonathan Wakely wrote: > > The change I care about in 1.5.3 > > So, we haven't talked much about the version people want most. If we update, > might as well get something that more people care about. 1.5.3 is in ubuntu > LTS 16.04 and Fedora 24, so it's been around awhile. SUSU is said to be > using 1.6, in the post 1.4.4 systems. People stated they want 1.5.2 and > 1.5.3, so, I'm inclined to say, let's shoot for 1.5.3 when we do update. > > As for the machines in the FSF compile farm, nah, tail wagging the dog. I'd > rather just update the requirement, and the owners or users of those machines > can install a new dejagnu, if they are using one that is too old and they > want to support testing gcc. So.. let me ping that, again, now that another year has passed :) PS: Recap: https://gcc.gnu.org/ml/fortran/2012-03/msg00094.html was later applied as http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=commit;h=5481f29161477520c691d525653323b82fa47ad7 and was part of the dejagnu-1.5.2 release from 2015. Jonathan requires 1.5.3 for libstdc++ testing. The libdirs fix would allow us to remove the 150 occurrences of the load_gcc_lib hack, refer to the patch to the fortran list back then. AFAIR this is still not fixed: +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs! debian-stable (i think 9 ATM), Ubuntu LTS ship versions recent enough to contain both fixes. Commercial distros seem to ship fixed versions, too. thanks,
Remove duplicate test
The test for type:cvip was being run twice. Committed as obvious. Andreas. * gcc.dg/guality/const-volatile.c: Remove duplicate test "type:cvip". diff --git a/gcc/testsuite/gcc.dg/guality/const-volatile.c b/gcc/testsuite/gcc.dg/guality/const-volatile.c index 3bfca0d14d..21f84b5c20 100644 --- a/gcc/testsuite/gcc.dg/guality/const-volatile.c +++ b/gcc/testsuite/gcc.dg/guality/const-volatile.c @@ -80,8 +80,6 @@ main (int argc, char **argv) /* { dg-final { gdb-test "@main" "type:vs" "volatile struct { const long cli; const signed char csc; }" } } */ -/* { dg-final { gdb-test "@main" "type:cvip" "int * const volatile" } } */ - /* { dg-final { gdb-test "@main" "type:bar" "struct bar { short s; const short cs; volatile short vs; const volatile short cvs; volatile long long vll; }" } } */ /* { dg-final { gdb-test "@main" "type:foo" "struct foo { const long cli; const signed char csc; }" } } */ /* { dg-final { gdb-test "@main" "type:cfoo" "const struct foo { const long cli; const signed char csc; }" } } */ -- 2.18.0 -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
[PATCH] assume sprintf formatting of wide characters may fail (PR 86853)
The sprintf handling of wide characters neglects to consider that calling the function may fail due to a conversion error (when the wide character is invalid or not representable in the current locale). The handling also misinterprets the POSIX %S wide string directive as a plain narrow %s and doesn't include %C (the POSIX equivalent of %lc). The attached patch corrects these oversights by extending the data structures to indicate when a directive may fail, and extending the UNDER4K member of the format_result structure to also encode calls with such directives. Tested on x86_64-linux. Besides the trunk, since this bug can affect code correctness I'd like to backport this patch to both release branches (7 and 8). Martin PR tree-optimization/86853 - sprintf optimization for wide strings doesn't account for conversion failure gcc/ChangeLog: PR tree-optimization/86853 * gimple-ssa-sprintf.c (struct format_result): Rename member. (struct fmtresult): Add member and initialize it in ctors. (format_character): Handle %C. Extend range to NUL. Set MAYFAIL. (format_string): Handle %S the same as %ls. Set MAYFAIL. (format_directive): Set POSUNDER4K when MAYFAIL is set. (parse_directive): Handle %C same as %c. (sprintf_dom_walker::compute_format_length): Adjust. (is_call_safe): Adjust. gcc/testsuite/ChangeLog: PR tree-optimization/86853 * gcc.dg/tree-ssa/builtin-sprintf-10.c: New test. * gcc.dg/tree-ssa/builtin-sprintf-11.c: New test. * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Adjust. Index: gcc/gimple-ssa-sprintf.c === --- gcc/gimple-ssa-sprintf.c (revision 263295) +++ gcc/gimple-ssa-sprintf.c (working copy) @@ -211,12 +211,14 @@ struct format_result the return value optimization. */ bool knownrange; - /* True if no individual directive resulted in more than 4095 bytes - of output (the total NUMBER_CHARS_{MIN,MAX} might be greater). - Implementations are not required to handle directives that produce - more than 4K bytes (leading to undefined behavior) and so when one - is found it disables the return value optimization. */ - bool under4k; + /* True if no individual directive could fail or result in more than + 4095 bytes of output (the total NUMBER_CHARS_{MIN,MAX} might be + greater). Implementations are not required to handle directives + that produce more than 4K bytes (leading to undefined behavior) + and so when one is found it disables the return value optimization. + Similarly, directives that can fail (such as wide character + directives) disable the optimization. */ + bool posunder4k; /* True when a floating point directive has been seen in the format string. */ @@ -650,7 +652,7 @@ struct fmtresult fmtresult (unsigned HOST_WIDE_INT min = HOST_WIDE_INT_MAX) : argmin (), argmax (), knownrange (min < HOST_WIDE_INT_MAX), -nullp () +mayfail (), nullp () { range.min = min; range.max = min; @@ -664,7 +666,7 @@ struct fmtresult unsigned HOST_WIDE_INT likely = HOST_WIDE_INT_MAX) : argmin (), argmax (), knownrange (min < HOST_WIDE_INT_MAX && max < HOST_WIDE_INT_MAX), -nullp () +mayfail (), nullp () { range.min = min; range.max = max; @@ -694,6 +696,10 @@ struct fmtresult heuristics that depend on warning levels. */ bool knownrange; + /* True for a directive that may fail (such as wide character + directives). */ + bool mayfail; + /* True when the argument is a null pointer. */ bool nullp; }; @@ -2202,7 +2208,8 @@ format_character (const directive &dir, tree arg, res.knownrange = true; - if (dir.modifier == FMT_LEN_l) + if (dir.specifier == 'C' + || dir.modifier == FMT_LEN_l) { /* A wide character can result in as few as zero bytes. */ res.range.min = 0; @@ -2217,13 +2224,20 @@ format_character (const directive &dir, tree arg, res.range.likely = 0; res.range.unlikely = 0; } - else if (min > 0 && min < 128) + else if (min >= 0 && min < 128) { + /* Be conservative if the target execution character set + is not a 1-to-1 mapping to the source character set or + if the source set is not ASCII. */ + bool one_2_one_ascii + = (target_to_host_charmap[0] == 1 && target_to_host ('a') == 97); + /* A wide character in the ASCII range most likely results in a single byte, and only unlikely in up to MB_LEN_MAX. */ - res.range.max = 1; + res.range.max = one_2_one_ascii ? 1 : target_mb_len_max ();; res.range.likely = 1; res.range.unlikely = target_mb_len_max (); + res.mayfail = !one_2_one_ascii; } else { @@ -2232,6 +2246,8 @@ format_character (const directive &dir, tree arg, res.range.max = target_mb_len_max (); res.range.likely = 2; res.range.unlikely = res.range.max; + /* Converting such a character may fa
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: > On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: >> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: >>> We should always set cfun->machine->max_used_stack_alignment if the >>> maximum stack slot alignment may be greater than 64 bits. >>> >>> Tested on i686 and x86-64. OK for master and backport for GCC 8? >> >> Can you explain why 64 bits, and what this value represents? Is this >> value the same for 64bit and 32bit targets? >> >> Should crtl->max_used_stack_slot_alignment be compared to >> STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? > > In this case, we don't need to realign the incoming stack since both > crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary > are 128 bits. We don't compute the largest alignment of stack slots to > keep stack frame properly aligned for it. Normally it is OK. But if > the largest alignment of stack slots > 64 bits and we don't keep stack > frame proper aligned, we will get segfault if aligned vector load/store > are used on these unaligned stack slots. My patch computes the > largest alignment of stack slots, which are actually used, if the > largest alignment of stack slots allocated is > 64 bits, which is > the smallest alignment for misaligned load/store. Does > 64 bits also hold for 32 bit targets, and -mabi=ms? I think that we need to compare to STACK_BOUNDARY instead: --cut here-- Index: config/i386/i386.c === --- config/i386/i386.c (revision 263307) +++ config/i386/i386.c (working copy) @@ -13281,8 +13281,7 @@ recompute_frame_layout_p = true; } } - else if (crtl->max_used_stack_slot_alignment - > crtl->preferred_stack_boundary) + else if (crtl->max_used_stack_slot_alignment > STACK_BOUNDARY) { /* We don't need to realign stack. But we still need to keep stack frame properly aligned to satisfy the largest alignment --cut here-- (The testcase works OK with -mabi=ms, which somehow suggests that we don't need realignment in this case). Uros.
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote: > On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: >> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: >>> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: We should always set cfun->machine->max_used_stack_alignment if the maximum stack slot alignment may be greater than 64 bits. Tested on i686 and x86-64. OK for master and backport for GCC 8? >>> >>> Can you explain why 64 bits, and what this value represents? Is this >>> value the same for 64bit and 32bit targets? >>> >>> Should crtl->max_used_stack_slot_alignment be compared to >>> STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? >> >> In this case, we don't need to realign the incoming stack since both >> crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary >> are 128 bits. We don't compute the largest alignment of stack slots to >> keep stack frame properly aligned for it. Normally it is OK. But if >> the largest alignment of stack slots > 64 bits and we don't keep stack >> frame proper aligned, we will get segfault if aligned vector load/store >> are used on these unaligned stack slots. My patch computes the >> largest alignment of stack slots, which are actually used, if the >> largest alignment of stack slots allocated is > 64 bits, which is >> the smallest alignment for misaligned load/store. > > Does > 64 bits also hold for 32 bit targets, and -mabi=ms? I think > that we need to compare to STACK_BOUNDARY instead: 64 bits requirement is independent of any psABIs nor 32-bit vs 64-bit. cfun->machine->max_used_stack_alignment is used to decide how stack frame should be aligned. It is always safe to compute it. I used else if (crtl->max_used_stack_slot_alignment > 64) to compute cfun->machine->max_used_stack_alignment only if we have to. > --cut here-- > Index: config/i386/i386.c > === > --- config/i386/i386.c (revision 263307) > +++ config/i386/i386.c (working copy) > @@ -13281,8 +13281,7 @@ > recompute_frame_layout_p = true; > } > } > - else if (crtl->max_used_stack_slot_alignment > - > crtl->preferred_stack_boundary) > + else if (crtl->max_used_stack_slot_alignment > STACK_BOUNDARY) > { This isn't correct.. cygming.h has #define STACK_BOUNDARY (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : BITS_PER_WORD) darwin.h has #undef STACK_BOUNDARY #define STACK_BOUNDARY \ ((profile_flag || (TARGET_64BIT && ix86_abi == MS_ABI)) \ ? 128 : BITS_PER_WORD) i386.h has /* Boundary (in *bits*) on which stack pointer should be aligned. */ #define STACK_BOUNDARY \ (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : BITS_PER_WORD) If STACK_BOUNDARY is 128 and max_used_stack_slot_alignment is 128, we will get segment when 128bit aligned load/store is generated on misaligned stack slot. >/* We don't need to realign stack. But we still need to keep > stack frame properly aligned to satisfy the largest alignment > --cut here-- > > (The testcase works OK with -mabi=ms, which somehow suggests that we > don't need realignment in this case). We may not hit 128bit aligned load/store on misaligned stack slot in this case. It doesn't mean that won't happen. else if (crtl->max_used_stack_slot_alignment > 64) is the correct thing to do here. -- H.J.
[committed, libgomp, nvptx, --without-cuda-driver] Don't use system cuda driver
Hi, Using libgomp configure option --with-cuda-driver= we can indicate what cuda driver to use to build the libgomp nvptx plugin. Without such an option, the system cuda driver is used, if available. If not availabe, a dlopen interface is used instead. However, when we use --without-cuda-driver (or the equivalent --with-cuda-driver=no) the system cuda driver is still used if available. This patch fixes that, making sure that --without-cuda-driver selects the dlopen interface. Build on x86_64 with nvptx accelerator and tested libgomp testsuite, with and without option --without-cuda-driver. Committed. Thanks, - Tom [libgomp, nvptx, --without-cuda-driver] Don't use system cuda driver 2018-08-04 Tom de Vries * plugin/configfrag.ac: For --without-cuda-driver, set CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB to no. Handle CUDA_DRIVER_INCLUDE == no and CUDA_DRIVER_LIB == no. * configure: Regenerate. --- libgomp/configure| 49 --- libgomp/plugin/configfrag.ac | 55 ++-- 2 files changed, 63 insertions(+), 41 deletions(-) diff --git a/libgomp/configure b/libgomp/configure index ced7606b355..b4fc9d3cc72 100755 --- a/libgomp/configure +++ b/libgomp/configure @@ -15302,7 +15302,11 @@ if test "${with_cuda_driver_lib+set}" = set; then : fi case "x$with_cuda_driver" in - x | xno) ;; + x) ;; + xno) +CUDA_DRIVER_INCLUDE=no +CUDA_DRIVER_LIB=no +;; *) CUDA_DRIVER_INCLUDE=$with_cuda_driver/include CUDA_DRIVER_LIB=$with_cuda_driver/lib ;; @@ -15313,10 +15317,12 @@ fi if test "x$with_cuda_driver_lib" != x; then CUDA_DRIVER_LIB=$with_cuda_driver_lib fi -if test "x$CUDA_DRIVER_INCLUDE" != x; then +if test "x$CUDA_DRIVER_INCLUDE" != x \ + && test "x$CUDA_DRIVER_INCLUDE" != xno; then CUDA_DRIVER_CPPFLAGS=-I$CUDA_DRIVER_INCLUDE fi -if test "x$CUDA_DRIVER_LIB" != x; then +if test "x$CUDA_DRIVER_LIB" != x \ + && test "x$CUDA_DRIVER_LIB" != xno; then CUDA_DRIVER_LDFLAGS=-L$CUDA_DRIVER_LIB fi @@ -15400,17 +15406,19 @@ if test x"$enable_offload_targets" != x; then nvptx*) tgt_name=nvptx PLUGIN_NVPTX=$tgt - PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS - PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS - PLUGIN_NVPTX_LIBS='-lcuda' - - PLUGIN_NVPTX_save_CPPFLAGS=$CPPFLAGS - CPPFLAGS="$PLUGIN_NVPTX_CPPFLAGS $CPPFLAGS" - PLUGIN_NVPTX_save_LDFLAGS=$LDFLAGS - LDFLAGS="$PLUGIN_NVPTX_LDFLAGS $LDFLAGS" - PLUGIN_NVPTX_save_LIBS=$LIBS - LIBS="$PLUGIN_NVPTX_LIBS $LIBS" - cat confdefs.h - <<_ACEOF >conftest.$ac_ext + if test "x$CUDA_DRIVER_LIB" != xno \ + && test "x$CUDA_DRIVER_LIB" != xno; then + PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS + PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS + PLUGIN_NVPTX_LIBS='-lcuda' + + PLUGIN_NVPTX_save_CPPFLAGS=$CPPFLAGS + CPPFLAGS="$PLUGIN_NVPTX_CPPFLAGS $CPPFLAGS" + PLUGIN_NVPTX_save_LDFLAGS=$LDFLAGS + LDFLAGS="$PLUGIN_NVPTX_LDFLAGS $LDFLAGS" + PLUGIN_NVPTX_save_LIBS=$LIBS + LIBS="$PLUGIN_NVPTX_LIBS $LIBS" + cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include "cuda.h" int @@ -15426,13 +15434,16 @@ if ac_fn_c_try_link "$LINENO"; then : fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext - CPPFLAGS=$PLUGIN_NVPTX_save_CPPFLAGS - LDFLAGS=$PLUGIN_NVPTX_save_LDFLAGS - LIBS=$PLUGIN_NVPTX_save_LIBS + CPPFLAGS=$PLUGIN_NVPTX_save_CPPFLAGS + LDFLAGS=$PLUGIN_NVPTX_save_LDFLAGS + LIBS=$PLUGIN_NVPTX_save_LIBS + fi case $PLUGIN_NVPTX in nvptx*) - if test "x$CUDA_DRIVER_INCLUDE" = x \ - && test "x$CUDA_DRIVER_LIB" = x; then + if (test "x$CUDA_DRIVER_INCLUDE" = x \ + || test "x$CUDA_DRIVER_INCLUDE" = xno) \ + && (test "x$CUDA_DRIVER_LIB" = x \ + || test "x$CUDA_DRIVER_LIB" = xno); then PLUGIN_NVPTX=1 PLUGIN_NVPTX_CPPFLAGS='-I$(srcdir)/plugin/cuda' PLUGIN_NVPTX_LIBS='-ldl' diff --git a/libgomp/plugin/configfrag.ac b/libgomp/plugin/configfrag.ac index 864817d44d1..a979425a293 100644 --- a/libgomp/plugin/configfrag.ac +++ b/libgomp/plugin/configfrag.ac @@ -59,7 +59,11 @@ AC_ARG_WITH(cuda-driver-lib, [AS_HELP_STRING([--with-cuda-driver-lib=PATH], [specify directory for the installed CUDA driver library])]) case "x$with_cuda_driver" in - x | xno) ;; + x) ;; + xno) +CUDA_DRIVER_INCLUDE=no +CUDA_DRIVER_LIB=no +;; *) CUDA_DRIVER_INCLUDE=$with_cuda_driver/include CUDA_DRIVER_LIB=$with_cuda_driver/lib ;; @@ -70,10 +74,12 @@ fi if test "x$with_cuda_driver_lib" != x; then CUDA_DRIVER_LIB=$with_cuda_driver_lib fi -if test "x$CUDA_DRIVER_INCLUDE" != x; then +if test "x$CUDA_DRIVER_INCLUDE" !=
Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry
On 08/03/2018 05:37 PM, Cesar Philippidis wrote: >> But I still see no rationale why blocks is used here, and I wonder >> whether something like num_gangs = grids * 64 would give similar results. > My original intent was to keep the load proportional to the block size. > So, in the case were a block size is limited by shared-memory or the > register file capacity, the runtime wouldn't excessively over assign > gangs to the multiprocessor units if their state is going to be swapped > out even more than necessary. So, that's your rationale. Please add a comment describing this. Thanks, - Tom
Re: [PATCH] Make strlen range computations more conservative
On 08/03/2018 01:43 AM, Jakub Jelinek wrote: On Thu, Aug 02, 2018 at 09:59:13PM -0600, Martin Sebor wrote: If I call this with foo (2, 1), do you still claim it is not valid C? String functions like strlen operate on character strings stored in character arrays. Calling strlen (&s[1]) is invalid because &s[1] is not the address of a character array. The fact that objects can be represented as arrays of bytes doesn't change that. The standard may be somewhat loose with words on this distinction but the intent certainly isn't for strlen to traverse arbitrary sequences of bytes that cross subobject boundaries. (That is the intent behind the raw memory functions, but the current text doesn't make the distinction clear.) But the standard doesn't say that right now. It does, in the restriction on multi-dimensional array accesses. Given the array 'char a[2][2];' it's only valid to access a[0][0] and a[0][1], and a[1][0], and a[1][1]. It's not valid to access a[2][0] or a[2][1], even though they happen to be located at the same addresses as a[1][0] and a[1][1]. There is no exception for distinct struct members. So in a struct { char a[2], b[2]; }, even though a and b and laid out the same way as char[2][2] would be, it's not valid to treat a as such. There is no distinction between array subscripting and pointer arithmetic, so it doesn't matter what form the access takes. Yes, the standard could be clearer. There probably even are ambiguities and contradictions (the authors of the Object Model proposal believe there are and are trying to clarify/remove them). But the intent is clearly there. It's especially important for adjacent members of different types (say a char[8] followed by a function pointer. We definitely don't want writes to the array to be allowed to change the function pointer.) Plus, at least from the middle-end POV, there is also the case of placement new and stores changing the dynamic type of the object, previously say a struct with two fields, then a placement new with a single char array over it (the placement new will not survive in the middle-end, so it will be just a memcpy or strcpy or some other byte copy over the original object, and due to the CSE/SCCVN etc. of pointer to pointer conversions being in the middle-end useless means you can see a pointer to the struct with two fields rather than pointer to char array. There may be challenges in the middle-end, you would know much better than me. All I'm saying is that it's not valid to access [sub]objects by dereferencing pointers to other subobjects. All the examples in this discussion have been of that form. Consider e.g. typedef __typeof__ (sizeof 0) size_t; void *operator new (size_t, void *p) { return p; } void *operator new[] (size_t, void *p) { return p; } struct S { char a; char b[64]; }; void baz (char *); size_t foo (S *p) { baz (&p->a); char *q = new (p) char [16]; baz (q); return __builtin_strlen (q); } I don't think it is correct to say that strlen must be 0. In this testcase the pointer passed to strlen is still S *, though I think with enough tweaking you could also have something where the argument is &p->a. I think the problem here is changing the type of p->a. I'm not up on the latest C++ changes here but I think it's a known problem with the specification. A similar (known) problem also comes in the case of dynamically allocated objects: char *p = (char*)operator new (2); char *p1 = new (p) char ('a'); char *p2 = new (p) char ('\0'); strlen (p1); Is the strlen(p) call valid when there's no string or array at p: there is a singlelton char object that just happens to be followed by another singleton char object. It's not an array of two elements. Each is [an array of] one char. This is a (specification) problem for sequence containers like vector where strictly speaking, it's not valid to iterate over them because of the array restriction. I have no problem for strlen to return 0 if it sees a toplevel object of size 1, but note that if it is extern, it already might be a problem in some cases: struct T { char a; char a2[]; } b; extern struct T c; void foo (int *p) { p[0] = strlen (b); p[1] = strlen (c); } If c's definition is struct T c = { ' ', "abcde" }; then the object doesn't have length of 1. I'm assuming above you meant strlen(&b) and strlen(&c) (or equivalently, strlen(&b.a) and strlen(&c.a). If so, it's the same problem. The strlen call is invalid unless b.a and c.a are nul. Martin
[committed, testsuite, guality] Use absolute line number in pass/fail line
On Sat, Aug 04, 2018 at 05:45:59PM +0200, Andreas Schwab wrote: > On Jul 09 2018, Tom de Vries wrote: > > > this patches uses relative line numbers in gcc.dg/guality where obvious: > > either the relative line number is '.', '.-1' or '.+1', or adjacent to > > another obvious case. > > This introduced a lot of test names that are no longer unique. > Fix in patch below. Tested on x86_64. Committed. Thanks, - Tom [testsuite, guality] Use absolute line number in pass/fail line 2018-08-04 Tom de Vries * lib/gcc-gdb-test.exp: Use absolute line number in pass/fail line. --- gcc/testsuite/lib/gcc-gdb-test.exp | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp b/gcc/testsuite/lib/gcc-gdb-test.exp index b13d3ec7f85..0066e157b42 100644 --- a/gcc/testsuite/lib/gcc-gdb-test.exp +++ b/gcc/testsuite/lib/gcc-gdb-test.exp @@ -54,18 +54,19 @@ proc gdb-test { useline args } { set var $arg1 } -set gdb_name $::env(GUALITY_GDB_NAME) -set testname "$testcase line [lindex $args 0] [lindex $args 1] == [lindex $args 2]" -set output_file "[file rootname [file tail $prog]].exe" -set cmd_file "[file rootname [file tail $prog]].gdb" - -set fd [open $cmd_file "w"] set line [lindex $args 0] if { [string range $line 0 0] == "@" } { set line [string range $line 1 end] } else { set line [get-absolute-line $useline $line] } + +set gdb_name $::env(GUALITY_GDB_NAME) +set testname "$testcase line $line [lindex $args 1] == [lindex $args 2]" +set output_file "[file rootname [file tail $prog]].exe" +set cmd_file "[file rootname [file tail $prog]].gdb" + +set fd [open $cmd_file "w"] puts $fd "break $line" puts $fd "run" puts $fd "$command $var"
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote: > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote: >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: >>> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: > We should always set cfun->machine->max_used_stack_alignment if the > maximum stack slot alignment may be greater than 64 bits. > > Tested on i686 and x86-64. OK for master and backport for GCC 8? Can you explain why 64 bits, and what this value represents? Is this value the same for 64bit and 32bit targets? Should crtl->max_used_stack_slot_alignment be compared to STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? >>> >>> In this case, we don't need to realign the incoming stack since both >>> crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary >>> are 128 bits. We don't compute the largest alignment of stack slots to >>> keep stack frame properly aligned for it. Normally it is OK. But if >>> the largest alignment of stack slots > 64 bits and we don't keep stack >>> frame proper aligned, we will get segfault if aligned vector load/store >>> are used on these unaligned stack slots. My patch computes the >>> largest alignment of stack slots, which are actually used, if the >>> largest alignment of stack slots allocated is > 64 bits, which is >>> the smallest alignment for misaligned load/store. >> >> Does > 64 bits also hold for 32 bit targets, and -mabi=ms? I think >> that we need to compare to STACK_BOUNDARY instead: > > 64 bits requirement is independent of any psABIs nor 32-bit vs 64-bit. > cfun->machine->max_used_stack_alignment is used to decide how > stack frame should be aligned. It is always safe to compute it. I used > > else if (crtl->max_used_stack_slot_alignment > 64) > > to compute cfun->machine->max_used_stack_alignment only if > we have to. > >> --cut here-- >> Index: config/i386/i386.c >> === >> --- config/i386/i386.c (revision 263307) >> +++ config/i386/i386.c (working copy) >> @@ -13281,8 +13281,7 @@ >> recompute_frame_layout_p = true; >> } >> } >> - else if (crtl->max_used_stack_slot_alignment >> - > crtl->preferred_stack_boundary) >> + else if (crtl->max_used_stack_slot_alignment > STACK_BOUNDARY) >> { > > This isn't correct.. cygming.h has > > #define STACK_BOUNDARY (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : > BITS_PER_WORD) > > darwin.h has > > #undef STACK_BOUNDARY > #define STACK_BOUNDARY \ > ((profile_flag || (TARGET_64BIT && ix86_abi == MS_ABI)) \ > ? 128 : BITS_PER_WORD) > > i386.h has > > /* Boundary (in *bits*) on which stack pointer should be aligned. */ > #define STACK_BOUNDARY \ > (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : BITS_PER_WORD) > > If STACK_BOUNDARY is 128 and max_used_stack_slot_alignment is 128, > we will get segment when 128bit aligned load/store is generated on misaligned > stack slot. > >>/* We don't need to realign stack. But we still need to keep >> stack frame properly aligned to satisfy the largest alignment >> --cut here-- >> >> (The testcase works OK with -mabi=ms, which somehow suggests that we >> don't need realignment in this case). > > We may not hit 128bit aligned load/store on misaligned stack slot in this > case. It doesn't mean that won't happen. > > else if (crtl->max_used_stack_slot_alignment > 64) > > is the correct thing to do here. OK, but please add a comment, so in the future we will still know the purpose of the magic number. Thanks, Uros.
Re: [PATCH] Make strlen range computations more conservative
On 08/03/2018 01:00 AM, Jeff Law wrote: On 07/24/2018 05:18 PM, Bernd Edlinger wrote: On 07/24/18 23:46, Jeff Law wrote: On 07/24/2018 01:59 AM, Bernd Edlinger wrote: Hi! This patch makes strlen range computations more conservative. Firstly if there is a visible type cast from type A to B before passing then value to strlen, don't expect the type layout of B to restrict the possible return value range of strlen. Why do you think this is the right thing to do? ie, is there language in the standards that makes you think the code as it stands today is incorrect from a conformance standpoint? Is there a significant body of code that is affected in an adverse way by the current code? If so, what code? I think if you have an object, of an effective type A say char[100], then you can cast the address of A to B, say typedef char (*B)[2] for instance and then to const char *, say for use in strlen. I may be wrong, but I think that we should at least try not to pick up char[2] from B, but instead use A for strlen ranges, or leave this range open. Currently the range info for strlen is [0..1] in this case, even if we see the type cast in the generic tree. ISTM that you're essentially saying that the cast to const char * destroys any type information we can exploit here. But if that's the case, then I don't think we can even derive a range of [0,99]. What's to say that "A" didn't result from a similar cast of some object that was char[200] that happened out of the scope of what we could see during the strlen range computation? If that is what you're arguing, then I think there's a re-evaluation that needs to happen WRT strlen range computation/ And just to be clear, I do see this as a significant correctness question. Martin, thoughts? The argument is that given: struct S { char a[4], b; }; char a[8] = "1234567"; this is valid and should pass: __attribute__ ((noipa)) void f (struct S *p) { assert (7 == strlen (p->a)); } int main (void) { f ((struct S*)a); } (This is the basic premise behind pr86259.) This argument is wrong and the code is invalid. For the access to p->a to be valid p must point to an object of struct S (it doesn't) and the p->a array must hold a nul-terminated string (it also doesn't). This should not be surprising because the following equivalent code behaves the same way: __attribute__ ((noipa)) void f (struct S *p) { int n = 0; for (int i = 0; p->a[i]; ++i) ++n; if (3 != n) __builtin_abort (); } and also because for write accesses, GCC (helpfully) enforces the restriction with _FORTIFY_SOURCE=2: __attribute__ ((noipa)) void f (struct S *p) { strcpy (p->a, "1234"); // aborts } I care less about the optimization than I do about the basic premise that it's essential to respect subobject boundaries(*). It would make little sense to undo the strlen optimization without also undoing the optimization for the direct array access case. Undoing either would raise the question about the validity of the _FORRTIFY_SOURCE=2 behavior. That would be a huge step backwards in terms of code security. If we did some of these but not others, it would make the behavior inconsistent and surprising, all to accommodate one instance of invalid code. If we had a valid test case where the strlen optimization leads to invalid code, or even if there were a significant number of bug reports showing that it breaks an invalid but common idiom, I would certainly feel compelled to make it right somehow. But there has been just one bug report with clearly invalid code that should be easily corrected. Martin [*] I also care deeply about all the warnings that depend on it to avoid false positives: that's pretty much all those I have implemented in the middle-end: -Wformat-{overflow, truncation}, -Wstringop-{overflow,truncation}, and likely even -Wrestrict.
Re: [PATCH] adjust sprintf range for AIX QNaN output (PR 86571)
On 08/02/2018 01:46 PM, Martin Sebor wrote: The recently added test gcc.dg/torture/builtin-sprintf.c to verify that the sprintf result computed by GCC matches libc's for Infinity and NaN has been failing on AIX which formats NaN as either QNaN or SNaN, contrary to C/POSIX requirements. The attached tweak adjusts the result computed by GCC to include the AIX format. If there are no objections I'll commit the tweak later this week and backport it to GCC 8 the next. I have committed this change as r263312. In the longer term, I think it might be best to introduce an OS target hook to describe sprintf implementation-defined details like the format of infinite values (i.e., inf or infinity, nan or qnan/snan), and perhaps also %p format and anything else that might be relevant. I'll wait a bit before backporting it GCC 8 and 7 in case there are any comments/concerns. Martin
Re: [PATCH] adjust sprintf range for AIX QNaN output (PR 86571)
On Sat, Aug 4, 2018, 18:19 Martin Sebor wrote: > On 08/02/2018 01:46 PM, Martin Sebor wrote: > > The recently added test gcc.dg/torture/builtin-sprintf.c > > to verify that the sprintf result computed by GCC matches > > libc's for Infinity and NaN has been failing on AIX which > > formats NaN as either QNaN or SNaN, contrary to C/POSIX > > requirements. The attached tweak adjusts the result > > computed by GCC to include the AIX format. If there are > > no objections I'll commit the tweak later this week and > > backport it to GCC 8 the next. > > I have committed this change as r263312. > > In the longer term, I think it might be best to introduce an OS > target hook to describe sprintf implementation-defined details > like the format of infinite values (i.e., inf or infinity, nan > or qnan/snan), and perhaps also %p format and anything else that > might be relevant. > Another option would be to calculate the length of NAN string at gcc build time. Set the optimization constant to the value returned by sprintf (os libc function, not compiler inlined value) while building gcc. David >
Re: [PATCH] adjust sprintf range for AIX QNaN output (PR 86571)
On 08/04/2018 04:25 PM, David Edelsohn wrote: On Sat, Aug 4, 2018, 18:19 Martin Sebor mailto:mse...@gmail.com>> wrote: On 08/02/2018 01:46 PM, Martin Sebor wrote: > The recently added test gcc.dg/torture/builtin-sprintf.c > to verify that the sprintf result computed by GCC matches > libc's for Infinity and NaN has been failing on AIX which > formats NaN as either QNaN or SNaN, contrary to C/POSIX > requirements. The attached tweak adjusts the result > computed by GCC to include the AIX format. If there are > no objections I'll commit the tweak later this week and > backport it to GCC 8 the next. I have committed this change as r263312. In the longer term, I think it might be best to introduce an OS target hook to describe sprintf implementation-defined details like the format of infinite values (i.e., inf or infinity, nan or qnan/snan), and perhaps also %p format and anything else that might be relevant. Another option would be to calculate the length of NAN string at gcc build time. Set the optimization constant to the value returned by sprintf (os libc function, not compiler inlined value) while building gcc. That sounds even better, thanks. I've opened bug 86857 to look into it. Martin
Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment
On Sat, Aug 04, 2018 at 11:48:15PM +0200, Uros Bizjak wrote: > On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote: > > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote: > >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: > >>> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: > On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: > > We should always set cfun->machine->max_used_stack_alignment if the > > maximum stack slot alignment may be greater than 64 bits. > > > > Tested on i686 and x86-64. OK for master and backport for GCC 8? > > Can you explain why 64 bits, and what this value represents? Is this > value the same for 64bit and 32bit targets? > > Should crtl->max_used_stack_slot_alignment be compared to > STACK_BOUNDARY or even MIN_STACK_BOUNDARY instead? > >>> > >>> In this case, we don't need to realign the incoming stack since both > >>> crtl->max_used_stack_slot_alignment and crtl->preferred_stack_boundary > >>> are 128 bits. We don't compute the largest alignment of stack slots to > >>> keep stack frame properly aligned for it. Normally it is OK. But if > >>> the largest alignment of stack slots > 64 bits and we don't keep stack > >>> frame proper aligned, we will get segfault if aligned vector load/store > >>> are used on these unaligned stack slots. My patch computes the > >>> largest alignment of stack slots, which are actually used, if the > >>> largest alignment of stack slots allocated is > 64 bits, which is > >>> the smallest alignment for misaligned load/store. > >> > >> Does > 64 bits also hold for 32 bit targets, and -mabi=ms? I think > >> that we need to compare to STACK_BOUNDARY instead: > > > > 64 bits requirement is independent of any psABIs nor 32-bit vs 64-bit. > > cfun->machine->max_used_stack_alignment is used to decide how > > stack frame should be aligned. It is always safe to compute it. I used > > > > else if (crtl->max_used_stack_slot_alignment > 64) > > > > to compute cfun->machine->max_used_stack_alignment only if > > we have to. > > > >> --cut here-- > >> Index: config/i386/i386.c > >> === > >> --- config/i386/i386.c (revision 263307) > >> +++ config/i386/i386.c (working copy) > >> @@ -13281,8 +13281,7 @@ > >> recompute_frame_layout_p = true; > >> } > >> } > >> - else if (crtl->max_used_stack_slot_alignment > >> - > crtl->preferred_stack_boundary) > >> + else if (crtl->max_used_stack_slot_alignment > STACK_BOUNDARY) > >> { > > > > This isn't correct.. cygming.h has > > > > #define STACK_BOUNDARY (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : > > BITS_PER_WORD) > > > > darwin.h has > > > > #undef STACK_BOUNDARY > > #define STACK_BOUNDARY \ > > ((profile_flag || (TARGET_64BIT && ix86_abi == MS_ABI)) \ > > ? 128 : BITS_PER_WORD) > > > > i386.h has > > > > /* Boundary (in *bits*) on which stack pointer should be aligned. */ > > #define STACK_BOUNDARY \ > > (TARGET_64BIT && ix86_abi == MS_ABI ? 128 : BITS_PER_WORD) > > > > If STACK_BOUNDARY is 128 and max_used_stack_slot_alignment is 128, > > we will get segment when 128bit aligned load/store is generated on > > misaligned > > stack slot. > > > >>/* We don't need to realign stack. But we still need to keep > >> stack frame properly aligned to satisfy the largest alignment > >> --cut here-- > >> > >> (The testcase works OK with -mabi=ms, which somehow suggests that we > >> don't need realignment in this case). > > > > We may not hit 128bit aligned load/store on misaligned stack slot in this > > case. It doesn't mean that won't happen. > > > > else if (crtl->max_used_stack_slot_alignment > 64) > > > > is the correct thing to do here. > > OK, but please add a comment, so in the future we will still know the > purpose of the magic number. > Like this? H.J. --- cfun->machine->max_used_stack_alignment is used to decide how stack frame should be aligned. This is independent of any psABIs nor 32-bit vs 64-bit. It is always safe to compute max_used_stack_alignment. We compute it only if 128-bit aligned load/store may be generated on misaligned stack slot which will lead to segfault. gcc/ PR target/86386 * config/i386/i386.c (ix86_finalize_stack_frame_flags): Set cfun->machine->max_used_stack_alignment if needed. gcc/testsuite/ PR target/86386 * gcc.target/i386/pr86386.c: New file. --- gcc/config/i386/i386.c | 14 +++-- gcc/testsuite/gcc.target/i386/pr86386.c | 26 + 2 files changed, 34 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr86386.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ee409cfe7e4..cf8c33bd909 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -13281,12 +13281,14 @@ ix86_finalize_stack_frame_flags (void) recompute_frame_layout_p = true; } } -
Re: [PATCH] Make strlen range computations more conservative
On 08/04/18 23:56, Martin Sebor wrote: > On 08/03/2018 01:00 AM, Jeff Law wrote: >> On 07/24/2018 05:18 PM, Bernd Edlinger wrote: >>> On 07/24/18 23:46, Jeff Law wrote: On 07/24/2018 01:59 AM, Bernd Edlinger wrote: > Hi! > > This patch makes strlen range computations more conservative. > > Firstly if there is a visible type cast from type A to B before passing > then value to strlen, don't expect the type layout of B to restrict the > possible return value range of strlen. Why do you think this is the right thing to do? ie, is there language in the standards that makes you think the code as it stands today is incorrect from a conformance standpoint? Is there a significant body of code that is affected in an adverse way by the current code? If so, what code? >>> >>> I think if you have an object, of an effective type A say char[100], then >>> you can cast the address of A to B, say typedef char (*B)[2] for instance >>> and then to const char *, say for use in strlen. I may be wrong, but I >>> think >>> that we should at least try not to pick up char[2] from B, but instead >>> use A for strlen ranges, or leave this range open. Currently the range >>> info for strlen is [0..1] in this case, even if we see the type cast >>> in the generic tree. >> ISTM that you're essentially saying that the cast to const char * >> destroys any type information we can exploit here. But if that's the >> case, then I don't think we can even derive a range of [0,99]. What's >> to say that "A" didn't result from a similar cast of some object that >> was char[200] that happened out of the scope of what we could see during >> the strlen range computation? >> >> If that is what you're arguing, then I think there's a re-evaluation >> that needs to happen WRT strlen range computation/ >> >> And just to be clear, I do see this as a significant correctness question. >> >> Martin, thoughts? > > The argument is that given: > > struct S { char a[4], b; }; > > char a[8] = "1234567"; > > this is valid and should pass: > > __attribute__ ((noipa)) > void f (struct S *p) > { > assert (7 == strlen (p->a)); > } > > int main (void) > { > f ((struct S*)a); > } > > (This is the basic premise behind pr86259.) > > This argument is wrong and the code is invalid. For the access > to p->a to be valid p must point to an object of struct S (it > doesn't) and the p->a array must hold a nul-terminated string > (it also doesn't). > > This should not be surprising because the following equivalent > code behaves the same way: > > __attribute__ ((noipa)) > void f (struct S *p) > { > int n = 0; > for (int i = 0; p->a[i]; ++i) > ++n; > if (3 != n) > __builtin_abort (); > } > > and also because for write accesses, GCC (helpfully) enforces > the restriction with _FORTIFY_SOURCE=2: > > __attribute__ ((noipa)) > void f (struct S *p) > { > strcpy (p->a, "1234"); // aborts > } > > I care less about the optimization than I do about the basic > premise that it's essential to respect subobject boundaries(*). > It would make little sense to undo the strlen optimization > without also undoing the optimization for the direct array > access case. Undoing either would raise the question about > the validity of the _FORRTIFY_SOURCE=2 behavior. That would > be a huge step backwards in terms of code security. If we > did some of these but not others, it would make the behavior > inconsistent and surprising, all to accommodate one instance > of invalid code. > > If we had a valid test case where the strlen optimization > leads to invalid code, or even if there were a significant > number of bug reports showing that it breaks an invalid > but common idiom, I would certainly feel compelled to > make it right somehow. But there has been just one bug > report with clearly invalid code that should be easily > corrected. > I see this from a software engineering POV. If we have code like this: void test (const char *x) { assert (strlen (x) < 10); } One would usually expect the program to abort (or at least abort with a near 100% likelihood) if x is not a valid string with length less than 10. But if lto and other optimizations show that this code is invoked with an invalid, non-zero terminated string the assertion is suddenly gone. Martin, why do you insist that GCC has to do this, and that it must be a good idea to do so, just based on the language definition? Why do we need assertions at all, when all programs have to be completely correct before we may compile them? > Martin > > [*] I also care deeply about all the warnings that depend > on it to avoid false positives: that's pretty much all those > I have implemented in the middle-end: -Wformat-{overflow, > truncation}, -Wstringop-{overflow,truncation}, and likely > even -Wrestrict. I do as well, but a false positive will not cause
Re: [PATCH] Make strlen range computations more conservative
On 08/04/18 22:52, Martin Sebor wrote: > On 08/03/2018 01:43 AM, Jakub Jelinek wrote: >> On Thu, Aug 02, 2018 at 09:59:13PM -0600, Martin Sebor wrote: If I call this with foo (2, 1), do you still claim it is not valid C? >>> >>> String functions like strlen operate on character strings stored >>> in character arrays. Calling strlen (&s[1]) is invalid because >>> &s[1] is not the address of a character array. The fact that >>> objects can be represented as arrays of bytes doesn't change >>> that. The standard may be somewhat loose with words on this >>> distinction but the intent certainly isn't for strlen to traverse >>> arbitrary sequences of bytes that cross subobject boundaries. >>> (That is the intent behind the raw memory functions, but >>> the current text doesn't make the distinction clear.) >> >> But the standard doesn't say that right now. > > It does, in the restriction on multi-dimensional array accesses. > Given the array 'char a[2][2];' it's only valid to access a[0][0] > and a[0][1], and a[1][0], and a[1][1]. It's not valid to access > a[2][0] or a[2][1], even though they happen to be located at > the same addresses as a[1][0] and a[1][1]. > > There is no exception for distinct struct members. So in > a struct { char a[2], b[2]; }, even though a and b and laid > out the same way as char[2][2] would be, it's not valid to > treat a as such. There is no distinction between array > subscripting and pointer arithmetic, so it doesn't matter > what form the access takes. > > Yes, the standard could be clearer. There probably even are > ambiguities and contradictions (the authors of the Object Model > proposal believe there are and are trying to clarify/remove > them). But the intent is clearly there. It's especially > important for adjacent members of different types (say a char[8] > followed by a function pointer. We definitely don't want writes > to the array to be allowed to change the function pointer.) > >> Plus, at least from the middle-end POV, there is also the case of >> placement new and stores changing the dynamic type of the object, >> previously say a struct with two fields, then a placement new with a single >> char array over it (the placement new will not survive in the middle-end, so >> it will be just a memcpy or strcpy or some other byte copy over the original >> object, and due to the CSE/SCCVN etc. of pointer to pointer conversions >> being in the middle-end useless means you can see a pointer to the struct >> with two fields rather than pointer to char array. > > There may be challenges in the middle-end, you would know much > better than me. All I'm saying is that it's not valid to access > [sub]objects by dereferencing pointers to other subobjects. All > the examples in this discussion have been of that form. > These examples do not aim to be valid C, they just point out limitations of the middle-end design, and a good deal of the problems are due to trying to do things that are not safe within the boundaries given by the middle-end design. Bernd. >> >> Consider e.g. >> typedef __typeof__ (sizeof 0) size_t; >> void *operator new (size_t, void *p) { return p; } >> void *operator new[] (size_t, void *p) { return p; } >> struct S { char a; char b[64]; }; >> void baz (char *); >> >> size_t >> foo (S *p) >> { >> baz (&p->a); >> char *q = new (p) char [16]; >> baz (q); >> return __builtin_strlen (q); >> } >> >> I don't think it is correct to say that strlen must be 0. In this testcase >> the pointer passed to strlen is still S *, though I think with enough >> tweaking you could also have something where the argument is &p->a. > > I think the problem here is changing the type of p->a. I'm > not up on the latest C++ changes here but I think it's a known > problem with the specification. A similar (known) problem also > comes in the case of dynamically allocated objects: > > char *p = (char*)operator new (2); > char *p1 = new (p) char ('a'); > char *p2 = new (p) char ('\0'); > strlen (p1); > > Is the strlen(p) call valid when there's no string or array > at p: there is a singlelton char object that just happens > to be followed by another singleton char object. It's not > an array of two elements. Each is [an array of] one char. > > This is a (specification) problem for sequence containers like > vector where strictly speaking, it's not valid to iterate over > them because of the array restriction. > >> >> I have no problem for strlen to return 0 if it sees a toplevel object of >> size 1, but note that if it is extern, it already might be a problem in some >> cases: >> struct T { char a; char a2[]; } b; >> extern struct T c; >> void foo (int *p) { p[0] = strlen (b); p[1] = strlen (c); } >> If c's definition is struct T c = { ' ', "abcde" }; >> then the object doesn't have length of 1. > > I'm assuming above you meant strlen(&b) and strlen(&c) (or > equivalently, strlen(&b.a) and strlen(&c.a). If so, it's > the same problem. T