[patch, fortran] Fix PR 72714, ICE on invalid
Hello world, the attached patch fixes a 7/8/9 regression by rejecting an invalid expression in coarray allocation that led to an ICE. It also adds a few more checks. One point that is checked for is that, unlike normal arrays, coarrays cannot be empty. Regression-tested. OK for trunk and affected branches? Regards Thomas 2019-03-02 Thomas Koenig PR fortran/72714 * resolve.c (resolve_allocate_expr): Add some tests for coarrays. 2019-03-02 Thomas Koenig PR fortran/72714 * gfortran.dg/coarray_allocate_11.f90: New test. Index: resolve.c === --- resolve.c (Revision 269260) +++ resolve.c (Arbeitskopie) @@ -7766,13 +7766,54 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code if (codimension) for (i = ar->dimen; i < ar->dimen + ar->codimen; i++) - if (ar->dimen_type[i] == DIMEN_THIS_IMAGE) - { - gfc_error ("Coarray specification required in ALLOCATE statement " - "at %L", &e->where); - goto failure; - } + { + switch (ar->dimen_type[i]) + { + case DIMEN_THIS_IMAGE: + gfc_error ("Coarray specification required in ALLOCATE statement " + "at %L", &e->where); + goto failure; + case DIMEN_RANGE: + if (ar->start[i] == 0 || ar->end[i] == 0) + { + /* If ar->stride[i] is NULL, we issued a previous error. 
*/ + if (ar->stride[i] == NULL) + gfc_error ("Bad array specification in ALLOCATE statement " + "at %L", &e->where); + goto failure; + } + else if (gfc_dep_compare_expr (ar->start[i], ar->end[i]) == 1) + { + gfc_error ("Upper cobound is less than lower cobound at %L", + &ar->start[i]->where); + goto failure; + } + break; + + case DIMEN_ELEMENT: + if (ar->start[i]->expr_type == EXPR_CONSTANT) + { + gcc_assert (ar->start[i]->ts.type == BT_INTEGER); + if (mpz_cmp_si (ar->start[i]->value.integer, 1) < 0) + { + gfc_error ("Upper cobound is less than lower cobound " + " of 1 at %L", &ar->start[i]->where); + goto failure; + } + } + break; + + case DIMEN_STAR: + break; + + default: + gfc_error ("Bad array specification in ALLOCATE statement at %L", + &e->where); + goto failure; + + } + } for (i = 0; i < ar->dimen; i++) { if (ar->type == AR_ELEMENT || ar->type == AR_FULL) ! { dg-do compile } ! { dg-additional-options -fcoarray=single } ! PR fortran/72714 ! Test for not ICEing and different error conditions when allocating ! coarrays. program p integer, allocatable :: z[:,:] integer :: i allocate (z[1:,*]) ! { dg-error "Bad array specification in ALLOCATE statement" } allocate (z[:2,*]) ! { dg-error "Bad array specification in ALLOCATE statement" } allocate (z[2:1,*]) ! { dg-error "Upper cobound is less than lower cobound" } allocate (z[:0,*]) ! { dg-error "Bad array specification in ALLOCATE statement" } allocate (z[0,*]) ! { dg-error "Upper cobound is less than lower cobound" } allocate (z[1,*]) ! This is OK allocate (z[1:1,*]) ! This is OK allocate (z[i:i,*]) ! This is OK allocate (z[i:i-1,*]) ! { dg-error "Upper cobound is less than lower cobound" } end
Re: [patch, fortran] Fix pointers not escaping via C_PTR
I wrote: First, this talks about a C pointer having a target. Second, you can re-establish the association to a different pointer to the Fortran program. There is another point to consider: This is interoperability with C we are dealing with, so we also have to follow C semantics. And, love it or hate it, C pointers escape. So, OK for trunk? Regards Thomas
[PATCH] [MinGW] Set __USE_MINGW_ACCESS for C++ as well
We set __USE_MINGW_ACCESS for Windows hosts to use MinGW's wrapper for the access function. The wrapper ensures that access behaves in the expected way (e.g. for special files, such as nul). However, we now compile most sources with the C++ compiler and the __USE_MINGW_ACCESS in CFLAGS is not used there. This causes GCC builds against newer msvcrt versions with incompatible access implementations to fail. This patch adds the flag to the CXXFLAGS for all bootstrap stages. Bootstrapped on x86_64-mingw64-seh. config/ChangeLog: 2019-03-02 Johannes Pfau * mh-mingw: Also set __USE_MINGW_ACCESS flag for C++ code. --- config/mh-mingw | 5 + 1 file changed, 5 insertions(+) diff --git a/config/mh-mingw b/config/mh-mingw index bc1d27477d0..a795096f038 100644 --- a/config/mh-mingw +++ b/config/mh-mingw @@ -2,6 +2,11 @@ # Vista (see PR33281 for details). BOOT_CFLAGS += -D__USE_MINGW_ACCESS -Wno-pedantic-ms-format CFLAGS += -D__USE_MINGW_ACCESS +STAGE1_CXXFLAGS += -D__USE_MINGW_ACCESS +STAGE2_CXXFLAGS += -D__USE_MINGW_ACCESS +STAGE3_CXXFLAGS += -D__USE_MINGW_ACCESS +STAGE4_CXXFLAGS += -D__USE_MINGW_ACCESS + # Increase stack limit to a figure based on the Linux default, with 4MB added # as GCC turns out to need that much more to pass all the limits-* tests. LDFLAGS += -Wl,--stack,12582912 -- 2.19.2
Re: [patch, fortran] Fix PR 72714, ICE on invalid
Hi Thomas, This is good for trunk. Thanks Paul On Sun, 3 Mar 2019 at 09:46, Thomas Koenig wrote: > > Hello world, > > the attached patch fixes a 7/8/9 regression by rejecting an invalid > expression in coarray allocation that led to an ICE. It also adds a few > more checks. > > One point that is checked for is that, unlike normal arrays, coarrays > cannot be empty. > > Regression-tested. OK for trunk and affected branches? > > Regards > > Thomas > > 2019-03-02 Thomas Koenig > > > > PR fortran/72714 > > * resolve.c (resolve_allocate_expr): Add some tests for > coarrays. > > > 2019-03-02 Thomas Koenig > > > > PR fortran/72714 > > * gfortran.dg/coarray_allocate_11.f90: New test. -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
[PATCH] Optimize vector init constructor
For vector init constructor: --- typedef float __v4sf __attribute__ ((__vector_size__ (16))); __v4sf foo (__v4sf x, float f) { __v4sf y = { f, x[1], x[2], x[3] }; return y; } --- we can optimize vector init constructor with vector copy or permute followed by a single scalar insert: __v4sf D.1912; __v4sf D.1913; __v4sf D.1914; __v4sf y; x.0_1 = x; D.1912 = x.0_1; _2 = D.1912; D.1913 = _2; BIT_FIELD_REF = f; y = D.1913; D.1914 = y; return D.1914; instead of __v4sf D.1962; __v4sf y; _1 = BIT_FIELD_REF ; _2 = BIT_FIELD_REF ; _3 = BIT_FIELD_REF ; y = {f, _1, _2, _3}; D.1962 = y; return D.1962; gcc/ PR tree-optimization/88828 * gimplify.c (gimplify_init_constructor): Optimize vector init constructor with vector copy or permute followed by a single scalar insert. gcc/testsuite/ PR tree-optimization/88828 * gcc.target/i386/pr88828-1.c: New test. * gcc.target/i386/pr88828-2.c: Likewise. * gcc.target/i386/pr88828-3a.c: Likewise. * gcc.target/i386/pr88828-3b.c: Likewise. * gcc.target/i386/pr88828-4a.c: Likewise. * gcc.target/i386/pr88828-4b.c: Likewise. * gcc.target/i386/pr88828-5a.c: Likewise. * gcc.target/i386/pr88828-5b.c: Likewise. * gcc.target/i386/pr88828-6a.c: Likewise. * gcc.target/i386/pr88828-6b.c: Likewise. 
--- gcc/gimplify.c | 176 +++-- gcc/testsuite/gcc.target/i386/pr88828-1.c | 16 ++ gcc/testsuite/gcc.target/i386/pr88828-2.c | 17 ++ gcc/testsuite/gcc.target/i386/pr88828-3a.c | 16 ++ gcc/testsuite/gcc.target/i386/pr88828-3b.c | 18 +++ gcc/testsuite/gcc.target/i386/pr88828-4a.c | 17 ++ gcc/testsuite/gcc.target/i386/pr88828-4b.c | 20 +++ gcc/testsuite/gcc.target/i386/pr88828-5a.c | 16 ++ gcc/testsuite/gcc.target/i386/pr88828-5b.c | 18 +++ gcc/testsuite/gcc.target/i386/pr88828-6a.c | 17 ++ gcc/testsuite/gcc.target/i386/pr88828-6b.c | 19 +++ 11 files changed, 336 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3a.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3b.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4a.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4b.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5a.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5b.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6a.c create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6b.c diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 983635ba21f..893a4311f9e 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -5082,22 +5082,170 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, TREE_CONSTANT (ctor) = 0; } - /* Vector types use CONSTRUCTOR all the way through gimple - compilation as a general initializer. */ - FOR_EACH_VEC_SAFE_ELT (elts, ix, ce) + tree rhs_vector = NULL; + /* The vector element to replace scalar elements, which + will be overridden by scalar insert. */ + tree vector_element = NULL; + /* The single scalar element. 
*/ + tree scalar_element = NULL; + unsigned int scalar_idx = 0; + enum { unknown, copy, permute, init } operation = unknown; + bool insert = false; + + /* Check if we can generate vector copy or permute followed by + a single scalar insert. */ + if (TYPE_VECTOR_SUBPARTS (type).is_constant ()) { - enum gimplify_status tret; - tret = gimplify_expr (&ce->value, pre_p, post_p, is_gimple_val, - fb_rvalue); - if (tret == GS_ERROR) - ret = GS_ERROR; - else if (TREE_STATIC (ctor) -&& !initializer_constant_valid_p (ce->value, - TREE_TYPE (ce->value))) - TREE_STATIC (ctor) = 0; + /* If all RHS vector elements come from the same vector, + we can use permute. If all RHS vector elements come + from the same vector in the same order, we can use + copy. */ + unsigned int nunits + = TYPE_VECTOR_SUBPARTS (type).to_constant (); + unsigned int nscalars = 0; + unsigned int nvectors = 0; + operation = unknown; + FOR_EACH_VEC_SAFE_ELT (elts, ix, ce) + if (TREE_CODE (ce->value) == ARRAY_REF + || TREE_CODE (ce->value) == ARRAY_RANGE_REF) + { + if (!vector_element) + vector_element = ce->value; + /* Get the vector index. */ + tree idx = TREE_OPERAND (ce->value, 1); + if (TRE
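The transformation described in this message can be illustrated at the source level. This is a minimal sketch, assuming a compiler with the GNU vector extensions (GCC or Clang); the names `v4sf`, `ctor_form` and `insert_form` are made up for illustration and do not appear in the patch. Both functions build the same vector; the second writes it as a whole-vector copy followed by a single scalar insert, which is the shape the patch aims for in GIMPLE.

```c
/* Sketch only: relies on the GNU vector extensions (GCC/Clang).  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Element-by-element constructor, as in the example at the top of
   the message.  */
v4sf
ctor_form (v4sf x, float f)
{
  v4sf y = { f, x[1], x[2], x[3] };
  return y;
}

/* Equivalent form: copy the whole vector, then overwrite lane 0.  */
v4sf
insert_form (v4sf x, float f)
{
  v4sf y = x;
  y[0] = f;
  return y;
}
```

Comparing the assembly of the two functions at -O2 is one way to check whether a given compiler already treats them alike.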
Re: [PATCH] Optimize vector init constructor
On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu wrote: > > For vector init constructor: > > --- > typedef float __v4sf __attribute__ ((__vector_size__ (16))); > > __v4sf > foo (__v4sf x, float f) > { > __v4sf y = { f, x[1], x[2], x[3] }; > return y; > } > --- > > we can optimize vector init constructor with vector copy or permute > followed by a single scalar insert: > > __v4sf D.1912; > __v4sf D.1913; > __v4sf D.1914; > __v4sf y; > > x.0_1 = x; > D.1912 = x.0_1; > _2 = D.1912; > D.1913 = _2; > BIT_FIELD_REF = f; > y = D.1913; > D.1914 = y; > return D.1914; > > instead of > > __v4sf D.1962; > __v4sf y; > > _1 = BIT_FIELD_REF ; > _2 = BIT_FIELD_REF ; > _3 = BIT_FIELD_REF ; > y = {f, _1, _2, _3}; > D.1962 = y; > return D.1962; > > gcc/ > > PR tree-optimization/88828 > * gimplify.c (gimplify_init_constructor): Optimize vector init > constructor with vector copy or permute followed by a single > scalar insert. Doing this here does not catch things like: typedef float __v4sf __attribute__ ((__vector_size__ (16))); __v4sf vector_init (float f0,float f1, float f2,float f3) { __v4sf y = { f0, f1, f2, f3 }; return y; } __v4sf foo (__v4sf x, float f) { return vector_init (f, x[1], x[2], x[3]) ; } > > gcc/testsuite/ > > PR tree-optimization/88828 > * gcc.target/i386/pr88828-1.c: New test. > * gcc.target/i386/pr88828-2.c: Likewise. > * gcc.target/i386/pr88828-3a.c: Likewise. > * gcc.target/i386/pr88828-3b.c: Likewise. > * gcc.target/i386/pr88828-4a.c: Likewise. > * gcc.target/i386/pr88828-4b.c: Likewise. > * gcc.target/i386/pr88828-5a.c: Likewise. > * gcc.target/i386/pr88828-5b.c: Likewise. > * gcc.target/i386/pr88828-6a.c: Likewise. > * gcc.target/i386/pr88828-6b.c: Likewise. 
> --- > gcc/gimplify.c | 176 +++-- > gcc/testsuite/gcc.target/i386/pr88828-1.c | 16 ++ > gcc/testsuite/gcc.target/i386/pr88828-2.c | 17 ++ > gcc/testsuite/gcc.target/i386/pr88828-3a.c | 16 ++ > gcc/testsuite/gcc.target/i386/pr88828-3b.c | 18 +++ > gcc/testsuite/gcc.target/i386/pr88828-4a.c | 17 ++ > gcc/testsuite/gcc.target/i386/pr88828-4b.c | 20 +++ > gcc/testsuite/gcc.target/i386/pr88828-5a.c | 16 ++ > gcc/testsuite/gcc.target/i386/pr88828-5b.c | 18 +++ > gcc/testsuite/gcc.target/i386/pr88828-6a.c | 17 ++ > gcc/testsuite/gcc.target/i386/pr88828-6b.c | 19 +++ > 11 files changed, 336 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3a.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3b.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4a.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4b.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5a.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5b.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6a.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6b.c > > diff --git a/gcc/gimplify.c b/gcc/gimplify.c > index 983635ba21f..893a4311f9e 100644 > --- a/gcc/gimplify.c > +++ b/gcc/gimplify.c > @@ -5082,22 +5082,170 @@ gimplify_init_constructor (tree *expr_p, gimple_seq > *pre_p, gimple_seq *post_p, > TREE_CONSTANT (ctor) = 0; > } > > - /* Vector types use CONSTRUCTOR all the way through gimple > - compilation as a general initializer. */ > - FOR_EACH_VEC_SAFE_ELT (elts, ix, ce) > + tree rhs_vector = NULL; > + /* The vector element to replace scalar elements, which > + will be overridden by scalar insert. */ > + tree vector_element = NULL; > + /* The single scalar element. 
*/ > + tree scalar_element = NULL; > + unsigned int scalar_idx = 0; > + enum { unknown, copy, permute, init } operation = unknown; > + bool insert = false; > + > + /* Check if we can generate vector copy or permute followed by > + a single scalar insert. */ > + if (TYPE_VECTOR_SUBPARTS (type).is_constant ()) > { > - enum gimplify_status tret; > - tret = gimplify_expr (&ce->value, pre_p, post_p, is_gimple_val, > - fb_rvalue); > - if (tret == GS_ERROR) > - ret = GS_ERROR; > - else if (TREE_STATIC (ctor) > -&& !initializer_constant_valid_p (ce->value, > - TREE_TYPE (ce->value))) > - TREE_STATIC (ctor) = 0; > + /* If all RHS vector elements come from the same vector, > + we can use permute. If all RHS vector elements come > + from the same vector in the same order, we can use > + copy. */ > +
Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions
On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu wrote: > > 32-bit indices in VSIB address are sign-extended to 64 bits. In x32, > when 32-bit indices are used as addresses, like in > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6 > > 32-bit indices, 0xf7fa3010, is sign-extended to 0xfffffffff7fa3010 which > is invalid address. Add addr32 prefix to UNSPEC_VSIBADDR instructions > for x32 if there is no base register nor symbol. > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with > > -Ofast -funroll-loops -march=haswell 1. Testcases 2 to 9 fail on fedora-29 with: In file included from /usr/include/features.h:452, from /usr/include/bits/libc-header-start.h:33, from /usr/include/stdlib.h:25, from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27, from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34, from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29, from /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7: /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such file or directory 2. Does the patch work with -maddress-mode={short,long}? 3. The implementation is wrong. You should use operand substitution with VSIB address as operand, not substitution without operand. 4. The PR is not a regression. Uros. > > gcc/ > > PR target/89523 > * config/i386/i386.c (ix86_print_operand): Also handle '_' to > add addr32 prefix if required. > (ix86_print_operand_punct_valid_p): Allow '_'. > * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend > "%_". > (*avx512pf_gatherpfdf_mask): Likewise. > (*avx512pf_scatterpfsf_mask): Likewise. > (*avx512pf_scatterpfdf_mask): Likewise. > (*avx2_gathersi): Likewise. > (*avx2_gathersi_2): Likewise. > (*avx2_gatherdi): Likewise. > (*avx2_gatherdi_2): Likewise. > (*avx2_gatherdi_3): Likewise. > (*avx2_gatherdi_4): Likewise. > (*avx512f_gathersi): Likewise. > (*avx512f_gathersi_2): Likewise. > (*avx512f_gatherdi): Likewise. > (*avx512f_gatherdi_2): Likewise. > (*avx512f_scattersi): Likewise. 
> (*avx512f_scatterdi): Likewise. > > gcc/testsuite/ > > PR target/89523 > * gcc.target/i386/pr89523-1.c: New test. > * gcc.target/i386/pr89523-2.c: Likewise. > * gcc.target/i386/pr89523-3.c: Likewise. > * gcc.target/i386/pr89523-4.c: Likewise. > * gcc.target/i386/pr89523-5.c: Likewise. > * gcc.target/i386/pr89523-6.c: Likewise. > * gcc.target/i386/pr89523-7.c: Likewise. > * gcc.target/i386/pr89523-8.c: Likewise. > * gcc.target/i386/pr89523-9.c: Likewise. > > xxx > --- > gcc/config/i386/i386.c| 39 ++- > gcc/config/i386/sse.md| 46 +++ > gcc/testsuite/gcc.target/i386/pr89523-1.c | 24 > gcc/testsuite/gcc.target/i386/pr89523-2.c | 17 + > gcc/testsuite/gcc.target/i386/pr89523-3.c | 17 + > gcc/testsuite/gcc.target/i386/pr89523-4.c | 16 > gcc/testsuite/gcc.target/i386/pr89523-5.c | 18 + > gcc/testsuite/gcc.target/i386/pr89523-6.c | 17 + > gcc/testsuite/gcc.target/i386/pr89523-7.c | 19 ++ > gcc/testsuite/gcc.target/i386/pr89523-8.c | 19 ++ > gcc/testsuite/gcc.target/i386/pr89523-9.c | 16 > 11 files changed, 224 insertions(+), 24 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-3.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-4.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-5.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-6.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-7.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-8.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-9.c > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index b8357a7db5d..336696136de 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -17805,6 +17805,7 @@ print_reg (rtx x, int code, FILE *file) > ~ -- print "i" if TARGET_AVX2, "f" otherwise. > ^ -- print addr32 prefix if TARGET_64BIT and Pmode != word_mode > ! 
-- print NOTRACK prefix for jxx/call/ret instructions if required. > + _ -- print addr32 prefix if required. > */ > > void > @@ -18356,6 +18357,42 @@ ix86_print_operand (FILE *file, rtx x, int code) > fputs ("addr32 ", file); > return; > > + case '_': > + if (TARGET_X32) > + { > + subrtx_var_iterator::array_type array; > + FOR_EACH_SUBRTX_VAR (iter, array, > + PATTERN (current_output_insn),
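The address problem quoted at the top of this message can be reproduced with plain integer arithmetic. A minimal sketch, assuming a base of zero and a scale of 1 as in the `vgatherdps` example; the helper names are hypothetical, and the hardware behaviour is only modelled here, not executed.

```c
#include <stdint.h>

/* Model of how a 32-bit VSIB index is widened when the address is
   computed with 64-bit address size: the index is treated as signed
   and sign-extended.  Hypothetical helper, not GCC code.  */
uint64_t
sign_extend_index (uint32_t index)
{
  if (index & 0x80000000u)
    return (uint64_t) index | 0xffffffff00000000ull;
  return index;
}

/* Effective address for base 0 and scale 1.  With an addr32 prefix
   the address is computed with 32-bit address size, so the upper
   32 bits of the result are zero.  */
uint64_t
effective_address (uint32_t index, int addr32)
{
  uint64_t ea = sign_extend_index (index);
  return addr32 ? (ea & 0xffffffffull) : ea;
}
```

For the index 0xf7fa3010 from the report, `effective_address (0xf7fa3010u, 0)` yields 0xfffffffff7fa3010, which lies outside the 4 GiB x32 address space, while `effective_address (0xf7fa3010u, 1)` yields 0xf7fa3010.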
[RFC] libgcc: Integrating emutls and D garbage collector
Hello all, I wanted to ask for some feedback on an issue which has affected GDC for a long time now: Letting the D garbage collector scan emulated TLS memory. Basically, for every variable instance in TLS memory we need to be able to get the address and size of that instance, so the GC can scan this memory for pointers. Now with libgcc emutls there are essentially two problems: 1) Memory is allocated individually for each variable, so it's not possible to simply get a memory range as in native TLS. I can't think of a way to get the required information with the current API, so I've attached a proof-of-concept patch which implements a __emutls_iterate_memory function. It is however rather intrusive, so maybe there's a better solution. Currently it keeps the size for all variables in every thread's local memory, but an optimization would be to keep the size only once in a separate array. A completely different, less intrusive solution would be to allow the frontend to specify different emutls functions (e.g. __d_emutls_get_address instead of __emutls_get_address). The main drawback here is that linking C/D emutls variables would not be possible due to the incompatible ABI. However, this is also true for the other D compilers' (DMD) emutls implementation, so this may be fine. A D emutls implementation would also be much simpler and probably faster (we can just allocate using the GC). (Of course it's way too late for such changes in GCC9, so I'm just asking for some general feedback.) 2) TLS memory is only freed once a thread exits, but not on unloading of shared libraries. This problem is aggravated for D, as this old TLS memory may contain references to the GC heap and prevent that memory from being collected. However, I think this issue is not that important, especially as I can't think of a portable solution (one destructor per variable could work, but seems quite wasteful). I'm looking forward to your answers, maybe there's some better solution I'm not aware of. 
Best regards, Johannes --- libgcc/emutls.c | 118 +-- libgcc/libgcc-std.ver.in | 5 ++ 2 files changed, 118 insertions(+), 5 deletions(-) diff --git a/libgcc/emutls.c b/libgcc/emutls.c index c725142e465..66b74d29f52 100644 --- a/libgcc/emutls.c +++ b/libgcc/emutls.c @@ -52,6 +52,8 @@ struct __emutls_array void *__emutls_get_address (struct __emutls_object *); void __emutls_register_common (struct __emutls_object *, word, word, void *); +typedef void (*iterate_callback)(void* mem, pointer size, void *user); +void __emutls_iterate_memory (iterate_callback cb, void *user); #ifdef __GTHREADS #ifdef __GTHREAD_MUTEX_INIT @@ -62,10 +64,111 @@ static __gthread_mutex_t emutls_mutex; static __gthread_key_t emutls_key; static pointer emutls_size; +static struct __emutls_array *emutls_arrays; + +static void +emutls_array_register (struct __emutls_array *arr) +{ + __gthread_mutex_lock (&emutls_mutex); + + if (emutls_arrays == NULL) +{ + emutls_arrays = calloc (32 + 1, sizeof (void *)); + emutls_arrays->size = 32; +} + + // Try to write to an empty slot + pointer slot_index = 0; + for (; slot_index < emutls_arrays->size; slot_index++) +{ + if (emutls_arrays->data[slot_index] == NULL) + { + emutls_arrays->data[slot_index] = (void *) arr; + break; + } +} + // No empty slot? 
+ if (slot_index == emutls_arrays->size) +{ + emutls_arrays = realloc (emutls_arrays, (slot_index + 2) * sizeof (void *)); + if (emutls_arrays == NULL) + abort (); + emutls_arrays->size = slot_index + 1; + emutls_arrays->data[slot_index] = (void *) arr; +} + + __gthread_mutex_unlock (&emutls_mutex); +} + +static void +emutls_array_update (struct __emutls_array *old, struct __emutls_array *updated) +{ + if (updated == old) +return; + + __gthread_mutex_lock (&emutls_mutex); + + for (pointer slot_index = 0; slot_index < emutls_arrays->size; slot_index++) +{ + if (emutls_arrays->data[slot_index] == (void *) old) +emutls_arrays->data[slot_index] = (void *) updated; +} + + __gthread_mutex_unlock (&emutls_mutex); +} + +static void +emutls_array_unregister (struct __emutls_array *arr) +{ + __gthread_mutex_lock (&emutls_mutex); + + for (pointer slot_index = 0; slot_index < emutls_arrays->size; slot_index++) +{ + if (emutls_arrays->data[slot_index] == (void *) arr) +emutls_arrays->data[slot_index] = NULL; +} + + __gthread_mutex_unlock (&emutls_mutex); +} + +static void +emutls_array_iterate (struct __emutls_array *arr, iterate_callback cb, void *user) +{ + if (arr == NULL) +return; + + for (pointer i = 0; i < arr->size; i++) +{ + void *ptr = arr->data[i]; + if (ptr) + { + pointer size = ((pointer*) ptr)[-2]; + cb (ptr, size, user); + } +} +} + +void +__emutls_iterate_memory (iterate_callback cb, void *user) +{ + __gthread_mutex_l
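To make the intended use of the callback interface concrete, here is a small sketch of how a collector might consume it. Everything in it is hypothetical: `iterate_blocks` merely stands in for the proposed `__emutls_iterate_memory`, `struct tls_block` replaces libgcc's internal bookkeeping, and `size_t` is used where the patch uses libgcc's `pointer` type.

```c
#include <stddef.h>

/* Same shape as the callback type in the patch, with size_t in place
   of libgcc's internal `pointer' type (an assumption).  */
typedef void (*iterate_callback) (void *mem, size_t size, void *user);

/* Hypothetical stand-in for one emulated TLS variable instance.  */
struct tls_block
{
  void *mem;
  size_t size;
};

/* Stand-in for the proposed __emutls_iterate_memory: report every
   allocated block to the callback, skipping empty slots.  */
void
iterate_blocks (const struct tls_block *blocks, size_t count,
                iterate_callback cb, void *user)
{
  for (size_t i = 0; i < count; i++)
    if (blocks[i].mem != NULL)
      cb (blocks[i].mem, blocks[i].size, user);
}

/* What a collector could accumulate while iterating.  */
struct scan_stats
{
  size_t ranges;
  size_t bytes;
};

/* Callback: a real collector would scan [mem, mem + size) for
   pointers; this sketch only counts ranges and bytes.  */
void
record_range (void *mem, size_t size, void *user)
{
  struct scan_stats *stats = user;
  (void) mem;
  stats->ranges++;
  stats->bytes += size;
}
```

Passing the collector state through the `user` argument keeps the iteration function free of any knowledge about the garbage collector.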
Re: [PR fortran/77583, patch ]- ICE in pp_quoted_string, at pretty-print.c:966
I didn't see any disagreement, so committed to trunk (rev.269353) and "backported" to 7- and 8-branches. Thanks, Harald On 03/02/19 00:15, Steve Kargl wrote: > On Sat, Mar 02, 2019 at 12:12:10AM +0100, Harald Anlauf wrote: >> The attached patch (originally by Steve Kargl) fixes a NULL pointer >> dereference that may occur when checking for a conflict. >> >> Regtested successfully. >> >> OK for trunk? Backport to active branches? >> >> >> 2019-03-02 Harald Anlauf >> Steve Kargl > > Steven G. Kargl > > ;-) > > I, of course, approve of the patch, but you might give > others a chance to disagree. >
[libstdc++] Don't throw in std::assoc_legendre for m > l
The return value specified in "8.1.2 associated Legendre polynomials" of ISO/IEC JTC 1/SC 22/WG 21 N3060 (which is identical to the expression in the doxygen comment of the patched function) is well-defined for m>l: it is always zero because $ P_l(x) $ is a polynomial of degree l, so its m-th derivative vanishes. The standard does not enforce an exception in this case because none of the requirements in 8.1 (5) on page 11 of ISO/IEC JTC 1/SC 22/WG 21 N3060 are met. Note: the implementation of std::assoc_legendre in Visual Studio 2017 (tested with Visual Studio 15.9.7) silently returns zero. Index: libstdc++-v3/include/tr1/legendre_function.tcc === --- libstdc++-v3/include/tr1/legendre_function.tcc (revision 269352) +++ libstdc++-v3/include/tr1/legendre_function.tcc (working copy) @@ -67,13 +67,13 @@ /** * @brief Return the Legendre polynomial by recursion on degree * @f$ l @f$. - * + * * The Legendre function of @f$ l @f$ and @f$ x @f$, * @f$ P_l(x) @f$, is defined by: * @f[ * P_l(x) = \frac{1}{2^l l!}\frac{d^l}{dx^l}(x^2 - 1)^{l} * @f] - * + * * @param l The degree of the Legendre polynomial. @f$l >= 0@f$. * @param x The argument of the Legendre polynomial. @f$|x| <= 1@f$. */ @@ -120,17 +120,17 @@ /** * @brief Return the associated Legendre function by recursion * on @f$ l @f$. - * + * * The associated Legendre function is derived from the Legendre function * @f$ P_l(x) @f$ by the Rodrigues formula: * @f[ * P_l^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) * @f] - * + * * @param l The degree of the associated Legendre function. * @f$ l >= 0 @f$. * @param m The order of the associated Legendre function. - * @f$ m <= l @f$. + * @f$ m >= 0 @f$. * @param x The argument of the associated Legendre function. * @f$ |x| <= 1 @f$. * @param phase The phase of the associated Legendre function. 
@@ -146,8 +146,7 @@ std::__throw_domain_error(__N("Argument out of range" " in __assoc_legendre_p.")); else if (__m > __l) -std::__throw_domain_error(__N("Degree out of range" - " in __assoc_legendre_p.")); +return _Tp(0); else if (__isnan(__x)) return std::numeric_limits<_Tp>::quiet_NaN(); else if (__m == 0) @@ -192,7 +191,7 @@ /** * @brief Return the spherical associated Legendre function. - * + * * The spherical associated Legendre function of @f$ l @f$, @f$ m @f$, * and @f$ \theta @f$ is defined as @f$ Y_l^m(\theta,0) @f$ where * @f[ @@ -202,7 +201,7 @@ * @f] * is the spherical harmonic function and @f$ P_l^m(x) @f$ is the * associated Legendre function. - * + * * This function differs from the associated Legendre function by * argument (@f$x = \cos(\theta)@f$) and by a normalization factor * but this factor is rather large for large @f$ l @f$ and @f$ m @f$ @@ -210,7 +209,7 @@ * and @f$ m @f$. * @note Unlike the case for __assoc_legendre_p the Condon-Shortley * phase factor @f$ (-1)^m @f$ is present here. - * + * * @param l The degree of the spherical associated Legendre function. * @f$ l >= 0 @f$. * @param m The order of the spherical associated Legendre function.
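The argument in this message rests on the derivative factor in the Rodrigues formula: since P_l is a polynomial of degree l, its m-th derivative is identically zero for m > l, and so is P_l^m. The sketch below checks that numerically. It is not the libstdc++ implementation; `legendre_derivative` and `MAX_DEGREE` are hypothetical names, and the coefficients of P_l are built with Bonnet's recurrence (n+1) P_{n+1}(x) = (2n+1) x P_n(x) - n P_{n-1}(x).

```c
#include <math.h>   /* only for the NAN macro */

#define MAX_DEGREE 32   /* arbitrary limit chosen for this sketch */

/* Return the m-th derivative of the Legendre polynomial P_l at x,
   computed from explicit polynomial coefficients.  Returns NAN if
   l exceeds MAX_DEGREE.  Hypothetical helper for illustration.  */
double
legendre_derivative (unsigned l, unsigned m, double x)
{
  double prev[MAX_DEGREE + 1] = { 0 };   /* coefficients of P_{n-1} */
  double cur[MAX_DEGREE + 1] = { 0 };    /* coefficients of P_n */
  double next[MAX_DEGREE + 1] = { 0 };   /* coefficients of P_{n+1} */
  unsigned n, k, d, degree;
  double result;

  if (l > MAX_DEGREE)
    return NAN;

  cur[0] = 1.0;   /* P_0(x) = 1 */
  for (n = 0; n < l; n++)
    {
      /* Bonnet's recurrence, applied coefficient by coefficient.  */
      for (k = 0; k <= n + 1; k++)
        next[k] = ((2.0 * n + 1.0) * (k > 0 ? cur[k - 1] : 0.0)
                   - n * prev[k]) / (n + 1.0);
      for (k = 0; k <= n + 1; k++)
        {
          prev[k] = cur[k];
          cur[k] = next[k];
        }
    }

  /* Differentiate term by term; after l + 1 derivatives nothing is
     left, so further iterations are unnecessary.  */
  degree = l;
  for (d = 0; d < m && d <= l; d++)
    {
      for (k = 0; k < degree; k++)
        cur[k] = (k + 1.0) * cur[k + 1];
      cur[degree] = 0.0;
      if (degree > 0)
        degree--;
    }

  /* Horner evaluation of the remaining polynomial.  */
  result = 0.0;
  for (k = degree + 1; k-- > 0;)
    result = result * x + cur[k];
  return result;
}
```

With this helper, P_l^m(x) is `legendre_derivative (l, m, x)` multiplied by (1 - x^2)^{m/2}, so a zero derivative for m > l gives a zero result without any special case.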
Re: [PATCH] Optimize vector init constructor
On Sun, Mar 03, 2019 at 06:40:09AM -0800, Andrew Pinski wrote: > ) > ,On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu wrote: > > > > For vector init constructor: > > > > --- > > typedef float __v4sf __attribute__ ((__vector_size__ (16))); > > > > __v4sf > > foo (__v4sf x, float f) > > { > > __v4sf y = { f, x[1], x[2], x[3] }; > > return y; > > } > > --- > > > > we can optimize vector init constructor with vector copy or permute > > followed by a single scalar insert: > > > > __v4sf D.1912; > > __v4sf D.1913; > > __v4sf D.1914; > > __v4sf y; > > > > x.0_1 = x; > > D.1912 = x.0_1; > > _2 = D.1912; > > D.1913 = _2; > > BIT_FIELD_REF = f; > > y = D.1913; > > D.1914 = y; > > return D.1914; > > > > instead of > > > > __v4sf D.1962; > > __v4sf y; > > > > _1 = BIT_FIELD_REF ; > > _2 = BIT_FIELD_REF ; > > _3 = BIT_FIELD_REF ; > > y = {f, _1, _2, _3}; > > D.1962 = y; > > return D.1962; > > > > gcc/ > > > > PR tree-optimization/88828 > > * gimplify.c (gimplify_init_constructor): Optimize vector init > > constructor with vector copy or permute followed by a single > > scalar insert. > > > Doing this here does not catch things like: > typedef float __v4sf __attribute__ ((__vector_size__ (16))); > > > __v4sf > vector_init (float f0,float f1, float f2,float f3) > { > __v4sf y = { f, x[1], x[2], x[3] }; >return y; > } > > __v4sf > foo (__v4sf x, float f) > { > return vector_init (f, x[1], x[2], x[3]) ; > } > Here is a patch for simplify_vector_constructor to optimize vector init constructor with vector copy or permute followed by a single scalar insert. 
But this doesn't work correctly: [hjl@gnu-cfl-2 pr88828]$ cat bar.i typedef float __v4sf __attribute__ ((__vector_size__ (16))); static __v4sf vector_init (float f0,float f1, float f2,float f3) { __v4sf y = { f0, f1, f2, f3 }; return y; } __v4sf foo (__v4sf x, float f) { return vector_init (f, x[1], x[2], x[3]) ; } [hjl@gnu-cfl-2 pr88828]$ make bar.s /export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/ -O2 -S bar.i [hjl@gnu-cfl-2 pr88828]$ cat bar.s .file "bar.i" .text .p2align 4 .globl foo .type foo, @function foo: .LFB1: .cfi_startproc ret .cfi_endproc .LFE1: .size foo, .-foo .ident "GCC: (GNU) 9.0.1 20190303 (experimental)" .section.note.GNU-stack,"",@progbits [hjl@gnu-cfl-2 pr88828]$ Scalar insert is missing. --- gcc/tree-ssa-forwprop.c | 77 - 1 file changed, 69 insertions(+), 8 deletions(-) diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c index eeb6281c652..b10cfccf7b8 100644 --- a/gcc/tree-ssa-forwprop.c +++ b/gcc/tree-ssa-forwprop.c @@ -2008,7 +2008,7 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) unsigned elem_size, i; unsigned HOST_WIDE_INT nelts; enum tree_code code, conv_code; - constructor_elt *elt; + constructor_elt *ce; bool maybe_ident; gcc_checking_assert (gimple_assign_rhs_code (stmt) == CONSTRUCTOR); @@ -2027,18 +2027,41 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) orig[1] = NULL; conv_code = ERROR_MARK; maybe_ident = true; - FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, elt) + + tree rhs_vector = NULL; + /* The single scalar element. 
*/ + tree scalar_element = NULL; + unsigned int scalar_idx = 0; + bool insert = false; + unsigned int nscalars = 0; + unsigned int nvectors = 0; + FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, ce) { tree ref, op1; if (i >= nelts) return false; - if (TREE_CODE (elt->value) != SSA_NAME) + if (TREE_CODE (ce->value) != SSA_NAME) return false; - def_stmt = get_prop_source_stmt (elt->value, false, NULL); + def_stmt = get_prop_source_stmt (ce->value, false, NULL); if (!def_stmt) - return false; + { + if ( gimple_nop_p (SSA_NAME_DEF_STMT (ce->value))) + { + /* Only allow one single scalar insert. */ + if (nscalars != 0) + return false; + + nscalars = 1; + insert = true; + scalar_idx = i; + scalar_element = ce->value; + continue; + } + else + return false; + } code = gimple_assign_rhs_code (def_stmt); if (code == FLOAT_EXPR || code == FIX_TRUNC_EXPR) @@ -2046,7 +2069,7 @
Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions
On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak wrote: > > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu wrote: > > > > 32-bit indices in VSIB address are sign-extended to 64 bits. In x32, > > when 32-bit indices are used as addresses, like in > > > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6 > > > > 32-bit indices, 0xf7fa3010, is sign-extended to 0xfffffffff7fa3010 which > > is invalid address. Add addr32 prefix to UNSPEC_VSIBADDR instructions > > for x32 if there is no base register nor symbol. > > > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with > > > > -Ofast -funroll-loops -march=haswell > > 1. Testcases 2 to 9 fail on fedora-29 with: > > In file included from /usr/include/features.h:452, > from /usr/include/bits/libc-header-start.h:33, > from /usr/include/stdlib.h:25, > from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27, > from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34, > from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29, > from > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7: > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such > file or directory I will update tests to remove "#include immintrin.h" > 2. Does the patch work with -maddress-mode={short,long}? Yes. > 3. The implementation is wrong. You should use operand substitution > with VSIB address as operand, not substitution without operand. How can I add an addr32 prefix with operand substitution? This is very similar to "%^". My updated patch will use "%^". > 4. The PR is not a regression. Correct. H.J. > Uros. > > > > > gcc/ > > > > PR target/89523 > > * config/i386/i386.c (ix86_print_operand): Also handle '_' to > > add addr32 prefix if required. > > (ix86_print_operand_punct_valid_p): Allow '_'. > > * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend > > "%_". > > (*avx512pf_gatherpfdf_mask): Likewise. > > (*avx512pf_scatterpfsf_mask): Likewise. > > (*avx512pf_scatterpfdf_mask): Likewise. > > (*avx2_gathersi): Likewise. 
> > (*avx2_gathersi_2): Likewise. > > (*avx2_gatherdi): Likewise. > > (*avx2_gatherdi_2): Likewise. > > (*avx2_gatherdi_3): Likewise. > > (*avx2_gatherdi_4): Likewise. > > (*avx512f_gathersi): Likewise. > > (*avx512f_gathersi_2): Likewise. > > (*avx512f_gatherdi): Likewise. > > (*avx512f_gatherdi_2): Likewise. > > (*avx512f_scattersi): Likewise. > > (*avx512f_scatterdi): Likewise. > > > > gcc/testsuite/ > > > > PR target/89523 > > * gcc.target/i386/pr89523-1.c: New test. > > * gcc.target/i386/pr89523-2.c: Likewise. > > * gcc.target/i386/pr89523-3.c: Likewise. > > * gcc.target/i386/pr89523-4.c: Likewise. > > * gcc.target/i386/pr89523-5.c: Likewise. > > * gcc.target/i386/pr89523-6.c: Likewise. > > * gcc.target/i386/pr89523-7.c: Likewise. > > * gcc.target/i386/pr89523-8.c: Likewise. > > * gcc.target/i386/pr89523-9.c: Likewise. > > > > xxx > > --- > > gcc/config/i386/i386.c| 39 ++- > > gcc/config/i386/sse.md| 46 +++ > > gcc/testsuite/gcc.target/i386/pr89523-1.c | 24 > > gcc/testsuite/gcc.target/i386/pr89523-2.c | 17 + > > gcc/testsuite/gcc.target/i386/pr89523-3.c | 17 + > > gcc/testsuite/gcc.target/i386/pr89523-4.c | 16 > > gcc/testsuite/gcc.target/i386/pr89523-5.c | 18 + > > gcc/testsuite/gcc.target/i386/pr89523-6.c | 17 + > > gcc/testsuite/gcc.target/i386/pr89523-7.c | 19 ++ > > gcc/testsuite/gcc.target/i386/pr89523-8.c | 19 ++ > > gcc/testsuite/gcc.target/i386/pr89523-9.c | 16 > > 11 files changed, 224 insertions(+), 24 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-2.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-3.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-4.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-5.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-6.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-7.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-8.c > > create mode 100644 
gcc/testsuite/gcc.target/i386/pr89523-9.c > > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > index b8357a7db5d..336696136de 100644 > > --- a/gcc/config/i386/i386.c > > +++ b/gcc/config/i386/i386.c > > @@ -17805,6 +17805,7 @@ print_reg (rtx x, int code, FILE *file) > > ~ -- print "i" if TARGET_AVX2, "f" otherwise. > > ^ -- print addr32 prefix if TARGET_64BIT and Pmode != word_mode > > ! -- print NOTRACK prefix for jxx/call/ret instru
Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions
On Sun, Mar 3, 2019 at 10:18 PM H.J. Lu wrote:
>
> On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak wrote:
> >
> > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu wrote:
> > >
> > > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > > when 32-bit indices are used as addresses, like in
> > >
> > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> > >
> > > a 32-bit index such as 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010,
> > > which is an invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR
> > > instructions for x32 if there is no base register nor symbol.
> > >
> > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> > >
> > > -Ofast -funroll-loops -march=haswell
> >
> > 1. Testcases 2 to 9 fail on fedora-29 with:
> >
> > In file included from /usr/include/features.h:452,
> > from /usr/include/bits/libc-header-start.h:33,
> > from /usr/include/stdlib.h:25,
> > from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
> > from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
> > from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
> > from
> > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > file or directory
>
> I will update tests to remove "#include <immintrin.h>"
>
> > 2. Does the patch work with -maddress-mode={short,long}?
>
> Yes.
>
> > 3. The implementation is wrong. You should use operand substitution
> > with VSIB address as operand, not substitution without operand.
>
> How can I add an addr32 prefix with operand substitution?  This is
> very similar to "%^".  My updated patch will use "%^".

Yes, using %^ is what I think would be the optimal solution.

Other than that, in your proposed patch, the operand-less %_ scans the
entire current_output_insn to dig out the UNSPEC_VSIBADDR. You can just
use operand substitution, and do e.g. "%X2vgatherpf0..." where 'X'
processes operand 2 (vsib_address_operand) and conditionally outputs
addr32.

BTW: In a new version of the patch, please specify what has changed from
the previous version. Otherwise, reviewing a new version is more or less
guesswork about what changed.

Uros.
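The sign-extension problem discussed in this thread can be checked with a few lines of arithmetic. The sketch below is only a model, not GCC or CPU code: `sign_extend_index` and `addr32_index` are hypothetical helper names, and the second one assumes that the addr32 prefix makes the effective address wrap to 32 bits and be zero-extended.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: models a 32-bit VSIB index being widened to 64 bits
// when the address is computed with 64-bit address size (no addr32 prefix).
constexpr std::uint64_t sign_extend_index(std::uint32_t index)
{
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(index)));
}

// Hypothetical helper: models the same index with the addr32 prefix, under
// the assumption that the effective address is truncated to 32 bits and
// zero-extended.
constexpr std::uint64_t addr32_index(std::uint32_t index)
{
  return static_cast<std::uint64_t>(index);
}

// The index value from the report, with and without the prefix.
static_assert(sign_extend_index(0xf7fa3010u) == 0xfffffffff7fa3010ull,
              "high bit set: upper 32 bits become all ones");
static_assert(addr32_index(0xf7fa3010u) == 0x00000000f7fa3010ull,
              "addr32: upper 32 bits stay zero");
```

Any index with bit 31 set shows the same effect, which is why an x32 pointer in the upper half of the 4 GiB address space turns into an invalid 64-bit address when it is used as a gather index without a base register.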
[PATCH] PR libstdc++/89562 use binary mode for file I/O
	PR libstdc++/89562
	* src/filesystem/ops-common.h (do_copy_file): Open files in binary
	mode for mingw.

Tested x86_64-linux, and lightly tested on mingw-w64 to verify the fix
works.

commit b15e67df3477fac3fea5a3df234be91391719fcd
Author: Jonathan Wakely
Date:   Sun Mar 3 22:01:29 2019 +

    PR libstdc++/89562 use binary mode for file I/O

            PR libstdc++/89562
            * src/filesystem/ops-common.h (do_copy_file): Open files in binary
            mode for mingw.

diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h
index 55e482ff8f2..6dc9b137dbf 100644
--- a/libstdc++-v3/src/filesystem/ops-common.h
+++ b/libstdc++-v3/src/filesystem/ops-common.h
@@ -402,7 +402,12 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
       int fd;
     };
 
-    CloseFD in = { posix::open(from, O_RDONLY) };
+    int iflag = O_RDONLY;
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+    iflag |= O_BINARY;
+#endif
+
+    CloseFD in = { posix::open(from, iflag) };
     if (in.fd == -1)
       {
         ec.assign(errno, std::generic_category());
@@ -413,6 +418,9 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
       oflag |= O_TRUNC;
     else
       oflag |= O_EXCL;
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+    oflag |= O_BINARY;
+#endif
     CloseFD out = { posix::open(to, oflag, S_IWUSR) };
     if (out.fd == -1)
       {
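As a rough illustration of the flag handling in this patch, the sketch below computes the open flags for the source and destination of a copy. `input_flags` and `output_flags` are hypothetical helpers, not libstdc++ functions; the base output flags `O_WRONLY | O_CREAT` are an assumption (the diff does not show them), and `_WIN32` stands in for the internal `_GLIBCXX_FILESYSTEM_IS_WINDOWS` macro.

```cpp
#include <cassert>
#include <fcntl.h>

// Hypothetical helper: flags for opening the source file of a copy.
inline int input_flags()
{
  int flags = O_RDONLY;
#if defined(_WIN32) && defined(O_BINARY)
  // Without this, the Windows C runtime opens the file in text mode.
  flags |= O_BINARY;
#endif
  return flags;
}

// Hypothetical helper: flags for opening the destination file.
// Assumption: the base flags are O_WRONLY | O_CREAT.
inline int output_flags(bool overwrite_existing)
{
  int flags = O_WRONLY | O_CREAT;
  if (overwrite_existing)
    flags |= O_TRUNC;   // replace the contents of an existing file
  else
    flags |= O_EXCL;    // fail if the file already exists
#if defined(_WIN32) && defined(O_BINARY)
  flags |= O_BINARY;
#endif
  return flags;
}
```

Binary mode matters for a byte-for-byte copy because text mode on Windows translates line endings, so the destination could differ from the source. On POSIX systems the extra flag is simply not added.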
Re: [v3 PATCH, RFC] Rewrite variant. Also PR libstdc++/85517
On Wed, 6 Feb 2019 at 15:12, Ville Voutilainen wrote: > And, to emphasize, the most important reason for this was to be able > to write straightforward > code for the special member functions, with the hope that it wouldn't > have a negative codegen > effect. Our Microsoft friends described the general technique as "has > crazy-good codegen", > but I have no idea what their starting point was; our starting point > probably wasn't bad > to begin with. However, the codegen should be somewhat improved; this patch removes a bag of run-time ifs from the implementation. An amended patch attached. This gets rid of all __erased* stuff, including hash, swap, constructors, relops. I consider variant to no longer be in the state of sin after this. Since this is touching just a C++17 facility with no impact elsewhere, we could consider landing it in GCC 9 as a late change. Failing that, it certainly seems safe enough to put into GCC 9.2. 2019-03-04 Ville Voutilainen Rewrite variant. Also PR libstdc++/85517 * include/std/variant (__do_visit): New. (__variant_cast): Likewise. (__variant_cookie): Likewise. (__erased_*): Remove. (_Variant_storage::_S_vtable): Likewise. (_Variant_storage::__M_reset_impl): Adjust to use __do_visit. (_Variant_storage::__M_reset): Adjust. (_Copy_ctor_base(const _Copy_ctor_base&)): Adjust to use __do_visit. (_Move_ctor_base(_Move_ctor_base&&)): Likewise. (_Move_ctor_base::__M_destructive_copy): New. (_Copy_assign_base::operator=): Adjust to use __do_visit. (_Copy_assign_alias): Adjust to check both copy assignment and copy construction for triviality. (_Move_assign_base::operator=): Adjust to use __do_visit. (_Multi_array): Add support for visitors that accept and return a __variant_cookie. (__gen_vtable_impl::_S_apply_all_alts): Likewise. (__gen_vtable_impl::_S_apply_single_alt): Likewise. (__gen_vtable_impl::__element_by_index_or_cookie): New. Generate a __variant_cookie temporary for a variant that is valueless and.. 
(__gen_vtable_impl::__visit_invoke): ..adjust here. (__gen_vtable::_Array_type): Conditionally make space for the __variant_cookie visitor case. (relops): Adjust to use __do_visit. (variant): Add __variant_cast as a friend. (variant::emplace): Use _M_reset() instead of self-destruction. (visit): Reimplement in terms of __do_visit. * testsuite/20_util/variant/compile.cc: Adjust. * testsuite/20_util/variant/run.cc: Likewise. diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant index 89deb14..8b1f407 100644 --- a/libstdc++-v3/include/std/variant +++ b/libstdc++-v3/include/std/variant @@ -138,6 +138,19 @@ namespace __variant constexpr variant_alternative_t<_Np, variant<_Types...>> const&& get(const variant<_Types...>&&); + template +constexpr decltype(auto) +__do_visit(_Visitor&& __visitor, _Variants&&... __variants); + + template +decltype(auto) __variant_cast(_Tp&& __rhs) +{ + if constexpr (is_const_v>) +return static_cast&>(__rhs); + else +return static_cast&>(__rhs); +} + namespace __detail { namespace __variant @@ -155,6 +168,9 @@ namespace __variant std::integral_constant ? 0 : __index_of_v<_Tp, _Rest...> + 1> {}; + // used for raw visitation + struct __variant_cookie {}; + // _Uninitialized is guaranteed to be a literal type, even if T is not. // We have to do this, because [basic.types]p10.5.3 (n4606) is not implemented // yet. When it's implemented, _Uninitialized can be changed to the alias @@ -236,63 +252,6 @@ namespace __variant std::forward<_Variant>(__v)._M_u); } - // Various functions as "vtable" entries, where those vtables are used by - // polymorphic operations. 
- template -void -__erased_ctor(void* __lhs, void* __rhs) -{ - using _Type = remove_reference_t<_Lhs>; - ::new (__lhs) _Type(__variant::__ref_cast<_Rhs>(__rhs)); -} - - template -void -__erased_dtor(_Variant&& __v) -{ std::_Destroy(std::__addressof(__variant::__get<_Np>(__v))); } - - template -void -__erased_assign(void* __lhs, void* __rhs) -{ - __variant::__ref_cast<_Lhs>(__lhs) = __variant::__ref_cast<_Rhs>(__rhs); -} - - template -void -__erased_swap(void* __lhs, void* __rhs) -{ - using std::swap; - swap(__variant::__ref_cast<_Lhs>(__lhs), - __variant::__ref_cast<_Rhs>(__rhs)); -} - -#define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP, __NAME) \ - template \ -constexpr bool \ -__erased_##__NAME(const _Variant& __lhs, const _Variant& __rhs) \ -{ \ - return __variant::__get<_Np>(std::forward<_Variant>(__lhs)) \ - __OP __variant::__get<_Np>(std::forward<_Variant>(__rhs)); \ -} - - _VARIANT_RELATION_FUNCTION_TEMPLATE(<, less) - _VARIANT_RELATION_FUNCTION_TEMPLATE(<=, less_equal) - _VARIANT_RELATION_FUNCTION_TEMPLATE(==, equal) - _V
Re: [v3 PATCH, RFC] Rewrite variant. Also PR libstdc++/85517
On Mon, 4 Mar 2019 at 01:26, Ville Voutilainen wrote: > I consider variant to no longer be in the state of sin after this. Sigh, except for the places where it self-destructs or placement-news over things that it shouldn't. That's hopefully the next bit that I'll rectify, Real Soon Now.
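The general idea of replacing type-erased helper tables with visitation can be sketched with the public API. This is an illustration only: it uses `std::visit`, not the internal `__do_visit` from the patch, `variant_equal` is a hypothetical name, and the sketch assumes that all alternative types are distinct so that `std::get<T>` is well-formed.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical helper: equality implemented by visiting rhs and reading the
// matching alternative from lhs, instead of looking up a type-erased
// comparison function in a table.
// Assumption: every type in Ts occurs only once.
template <typename... Ts>
bool variant_equal(const std::variant<Ts...>& lhs,
                   const std::variant<Ts...>& rhs)
{
  if (lhs.index() != rhs.index())
    return false;
  if (lhs.valueless_by_exception())
    return true;   // both are valueless, since the indices match
  return std::visit(
      [&lhs](const auto& rhs_value) {
        using T = std::decay_t<decltype(rhs_value)>;
        // The indices match, so lhs holds a T here and get cannot throw.
        return std::get<T>(lhs) == rhs_value;
      },
      rhs);
}
```

A call such as `variant_equal(std::variant<int, std::string>{1}, std::variant<int, std::string>{1})` dispatches once on the active alternative and then compares two `int` values directly.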
*Ping* Re: [PATCH] PR c/43673 - Incorrect warning in dfp printf.
Ping: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01949.html

Thanks
Xionghu

On 2019/2/26 9:13 AM, luo...@linux.ibm.com wrote:
> From: Xiong Hu Luo
>
> The dfp printf/scanf entries for Ha/HA, Da/DA and DDa/DDA are not set
> properly, which causes an incorrect warning:
> "use of 'D' length modifier with 'a' type character".
>
> Regression-tested on powerpc64le-linux, OK for trunk and gcc-8?
>
> gcc/c-family/ChangeLog:
>
> 2019-02-25 Xiong Hu Luo
>
> PR c/43673
> * c-format.c (print_char_table, scan_char_table): Replace BADLEN with
> TEX_D32, TEX_D64 or TEX_D128.
>
> gcc/testsuite/ChangeLog:
>
> 2019-02-25 Xiong Hu Luo
>
> PR c/43673
> * gcc.dg/format/dfp-printf-1.c: New test.
> * gcc.dg/format/dfp-scanf-1.c: Likewise.
> ---
> gcc/c-family/c-format.c| 4 ++--
> gcc/testsuite/gcc.dg/format/dfp-printf-1.c | 28 ++--
> gcc/testsuite/gcc.dg/format/dfp-scanf-1.c | 22 --
> 3 files changed, 48 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
> index 9b48ee3..af33ef9 100644
> --- a/gcc/c-family/c-format.c
> +++ b/gcc/c-family/c-format.c
> @@ -674,7 +674,7 @@ static const format_char_info print_char_table[] =
>    { "n", 1, STD_C89, { T89_I, T99_SC, T89_S, T89_L, T9L_LL,
> BADLEN, T99_SST, T99_PD, T99_IM, BADLEN, BADLEN, BADLEN }, "",
> "W", NULL },
>    /* C99 conversion specifiers.  */
>    { "F", 0, STD_C99, { T99_D, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, TEX_D32, TEX_D64, TEX_D128 }, "-wp0
> +#'I", "", NULL },
> -  { "aA", 0, STD_C99, { T99_D, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "-wp0 +#",
> "", NULL },
> +  { "aA", 0, STD_C99, { T99_D, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, TEX_D32, TEX_D64, TEX_D128 }, "-wp0 +#",
> "", NULL },
>    /* X/Open conversion specifiers.  */
>    { "C", 0, STD_EXT, { TEX_WI, BADLEN, BADLEN, BADLEN, BADLEN,
> BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "-w",
> "", NULL },
>    { "S", 1, STD_EXT, { TEX_W, BADLEN, BADLEN, BADLEN, BADLEN,
> BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "-wp",
> "R", NULL },
> @@ -847,7 +847,7 @@ static const format_char_info scan_char_table[] =
>    { "n", 1, STD_C89, { T89_I, T99_SC, T89_S, T89_L, T9L_LL,
> BADLEN, T99_SST, T99_PD, T99_IM, BADLEN, BADLEN, BADLEN }, "", "W",
> NULL },
>    /* C99 conversion specifiers.  */
>    { "F", 1, STD_C99, { T99_F, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, TEX_D32, TEX_D64, TEX_D128 }, "*w'",
> "W", NULL },
> -  { "aA", 1, STD_C99, { T99_F, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "*w'", "W",
> NULL },
> +  { "aA", 1, STD_C99, { T99_F, BADLEN, BADLEN, T99_D, BADLEN,
> T99_LD, BADLEN, BADLEN, BADLEN, TEX_D32, TEX_D64, TEX_D128 }, "*w'",
> "W", NULL },
>    /* X/Open conversion specifiers.  */
>    { "C", 1, STD_EXT, { TEX_W, BADLEN, BADLEN, BADLEN, BADLEN,
> BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "*mw", "W",
> NULL },
>    { "S", 1, STD_EXT, { TEX_W, BADLEN, BADLEN, BADLEN, BADLEN,
> BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "*amw", "W",
> NULL },
> diff --git a/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> b/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> index e92f161..a290895 100644
> --- a/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> +++ b/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> @@ -17,6 +17,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i,
> unsigned int j,
>
>    /* Check lack of warnings for valid usage.  */
>
> +  printf ("%Ha\n", x);
> +  printf ("%HA\n", x);
>    printf ("%Hf\n", x);
>    printf ("%HF\n", x);
>    printf ("%He\n", x);
> @@ -24,6 +26,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i,
> unsigned int j,
>    printf ("%Hg\n", x);
>    printf ("%HG\n", x);
>
> +  printf ("%Da\n", y);
> +  printf ("%DA\n", y);
>    printf ("%Df\n", y);
>    printf ("%DF\n", y);
>    printf ("%De\n", y);
> @@ -31,6 +35,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i,
> unsigned int j,
>    printf ("%Dg\n", y);
>    printf ("%DG\n", y);
>
> +  printf ("%DDa\n", z);
> +  printf ("%DDA\n", z);
>    printf ("%DDf\n", z);
>    printf ("%DDF\n", z);
>    printf ("%DDe\n", z);
> @@ -43,12 +49,16 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i,
> unsigned int j,
>
>    /* Check warnings for type mismatches.  */
>
> +  printf ("%Ha\n", y); /* { dg-warning "expects argument" "bad use of
> %H" } */
> +  printf ("%HA\n", y); /* { dg-warning "expects argument" "bad use of
> %H" } */
>    printf ("%Hf\n", y); /*
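The effect of replacing BADLEN entries in a table row can be modelled in a few lines. The sketch below is a heavily simplified, hypothetical model: `ConversionRow`, `LengthModifier`, `ArgKind` and `warns_about_length` are illustrative names and do not reflect the real `format_char_info` layout in c-format.c, which has more length-modifier columns and extra fields.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical model: one column per length modifier (none, H, D, DD).
enum LengthModifier { LEN_NONE, LEN_H, LEN_D, LEN_DD, LEN_COUNT };
enum ArgKind { BAD_LENGTH, ARG_DOUBLE, ARG_DECIMAL32, ARG_DECIMAL64,
               ARG_DECIMAL128 };

struct ConversionRow
{
  const char *conversions;    // conversion characters sharing this row
  ArgKind kinds[LEN_COUNT];   // expected argument kind per length modifier
};

// "aA" row before the fix: every decimal length modifier is rejected.
const ConversionRow row_before =
  { "aA", { ARG_DOUBLE, BAD_LENGTH, BAD_LENGTH, BAD_LENGTH } };

// "aA" row after the fix: the decimal modifiers map to decimal types.
const ConversionRow row_after =
  { "aA", { ARG_DOUBLE, ARG_DECIMAL32, ARG_DECIMAL64, ARG_DECIMAL128 } };

// Returns true when this model would warn about the length modifier.
inline bool warns_about_length(const ConversionRow &row, char conversion,
                               LengthModifier len)
{
  if (std::strchr(row.conversions, conversion) == nullptr)
    return false;             // this row does not describe the conversion
  return row.kinds[len] == BAD_LENGTH;
}
```

In this model, `warns_about_length(row_before, 'a', LEN_D)` is true, matching the "use of 'D' length modifier with 'a' type character" warning, while the same query against `row_after` is false.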
Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p
--
From: Segher Boessenkool
Sent: Friday, March 1, 2019 22:18
To: JunMa
Cc: gcc-patches
Subject: Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p

Hi!

On Fri, Mar 01, 2019 at 04:39:38PM +0800, JunMa wrote:
>Since MAX_INLINE_INSNS_AUTO should be below or equal to
>MAX_INLINE_INSNS_SINGLE (see params.def), there is no need
>to do second inlining limit check on growth when function not
>declared inline, this patch removes it.
>Bootstrapped and tested on x86_64-unknown-linux-gnu, is it ok for trunk?

Your mail subject says this is for GCC 10, but you are asking for GCC 9
now; which is it?

Sorry. Since we are in GCC 9 stage 4 now, and this is not a regression
fix, it's for GCC 10.

> 2019-03-01 Jun Ma
>
> *ipa-inline.c(want_inline_small_function_p): Remove
> redundant growth check when function not declared
> inline

Some spaces were lost in the first line.  Trailing space.  Sentences
should end with a full stop (or similar).

Don't send patches (or pretty much anything else) as
application/octet-stream attachments.

Segher

Sorry again for this. Here is the full change.

JunMa

2019-03-01 Jun Ma

	* ipa-inline.c (want_inline_small_function_p): Remove
	redundant growth check when function not declared
	inline.

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 360c3de..ff9bc9e 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -837,15 +837,11 @@ want_inline_small_function_p (struct cgraph_edge *e, bool report)
                           ? MAX (MAX_INLINE_INSNS_AUTO,
                                  MAX_INLINE_INSNS_SINGLE)
                           : MAX_INLINE_INSNS_AUTO)
-               && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
+               && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup)
+               && growth_likely_positive (callee, growth))
         {
-          /* growth_likely_positive is expensive, always test it last.  */
-          if (growth >= MAX_INLINE_INSNS_SINGLE
-              || growth_likely_positive (callee, growth))
-            {
-              e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
-              want_inline = false;
-            }
+          e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
+          want_inline = false;
         }
       /* If call is cold, do not inline when function body would grow.  */
       else if (!e->maybe_hot_p ()
Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p
Hi

Please ignore the previous mail.

On 2019/3/1 at 10:17 PM, Segher Boessenkool wrote:

Hi!

On Fri, Mar 01, 2019 at 04:39:38PM +0800, JunMa wrote:

Since MAX_INLINE_INSNS_AUTO should be below or equal to
MAX_INLINE_INSNS_SINGLE (see params.def), there is no need
to do second inlining limit check on growth when function not
declared inline, this patch removes it.
Bootstrapped and tested on x86_64-unknown-linux-gnu, is it ok for trunk?

Your mail subject says this is for GCC 10, but you are asking for GCC 9
now; which is it?

Since we are in GCC 9 stage 4 now, and this is not a regression fix,
it's for GCC 10.

2019-03-01 Jun Ma

*ipa-inline.c(want_inline_small_function_p): Remove
redundant growth check when function not declared
inline

Some spaces were lost in the first line.  Trailing space.  Sentences
should end with a full stop (or similar).

Don't send patches (or pretty much anything else) as
application/octet-stream attachments.

Segher

Sorry again for this. Here is the full change.

JunMa

2019-03-01 Jun Ma

	* ipa-inline.c (want_inline_small_function_p): Remove
	redundant growth check when function not declared
	inline.

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 360c3de..ff9bc9e 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -837,15 +837,11 @@ want_inline_small_function_p (struct cgraph_edge *e, bool report)
                           ? MAX (MAX_INLINE_INSNS_AUTO,
                                  MAX_INLINE_INSNS_SINGLE)
                           : MAX_INLINE_INSNS_AUTO)
-               && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
+               && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup)
+               && growth_likely_positive (callee, growth))
         {
-          /* growth_likely_positive is expensive, always test it last.  */
-          if (growth >= MAX_INLINE_INSNS_SINGLE
-              || growth_likely_positive (callee, growth))
-            {
-              e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
-              want_inline = false;
-            }
+          e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
+          want_inline = false;
         }
       /* If call is cold, do not inline when function body would grow.  */
       else if (!e->maybe_hot_p ()