date:20190303

[patch, fortran] Fix PR 72714, ICE on invalid

2019-03-03 Thread Thomas Koenig


Hello world,

the attached patch fixes a 7/8/9 regression by rejecting an invalid
expression in coarray allocation that led to an ICE.  It also adds a few
more checks.

One point that is checked for is that, unlike normal arrays, coarrays
cannot be empty.
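
The cobound rule can be sketched outside the compiler as follows. This is
only an illustration: struct cobound_range, enum cobound_status and
check_cobound_range are hypothetical names, and the real check in
resolve_allocate_expr works on gfc_expr nodes rather than plain integers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one codimension of an ALLOCATE coarray
   spec.  has_lower/has_upper model whether the bound was written.  */
struct cobound_range
{
  bool has_lower, has_upper;
  long lower, upper;
};

enum cobound_status { COBOUND_OK, COBOUND_BAD_SPEC, COBOUND_EMPTY };

/* A range needs both bounds, and unlike a normal array dimension it
   must not be empty (upper < lower).  */
static enum cobound_status
check_cobound_range (const struct cobound_range *r)
{
  if (!r->has_lower || !r->has_upper)
    return COBOUND_BAD_SPEC;
  if (r->upper < r->lower)
    return COBOUND_EMPTY;
  return COBOUND_OK;
}
```

In this model z[2:1,*] maps to COBOUND_EMPTY, z[1:,*] to COBOUND_BAD_SPEC
and z[1:1,*] to COBOUND_OK.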

Regression-tested. OK for trunk and affected branches?

Regards

Thomas

2019-03-02  Thomas Koenig   




PR fortran/72714 

* resolve.c (resolve_allocate_expr): Add some tests for 
coarrays.



2019-03-02  Thomas Koenig   




PR fortran/72714 


* gfortran.dg/coarray_allocate_11.f90: New test.
Index: resolve.c
===
--- resolve.c	(Revision 269260)
+++ resolve.c	(Arbeitskopie)
@@ -7766,13 +7766,54 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code
 
   if (codimension)
 for (i = ar->dimen; i < ar->dimen + ar->codimen; i++)
-  if (ar->dimen_type[i] == DIMEN_THIS_IMAGE)
-	{
-	  gfc_error ("Coarray specification required in ALLOCATE statement "
-		 "at %L", &e->where);
-	  goto failure;
-	}
+  {
+	switch (ar->dimen_type[i])
+	  {
+	  case DIMEN_THIS_IMAGE:
+	gfc_error ("Coarray specification required in ALLOCATE statement "
+		   "at %L", &e->where);
+	goto failure;
 
+	  case  DIMEN_RANGE:
+	if (ar->start[i] == 0 || ar->end[i] == 0)
+	  {
+		/* If ar->stride[i] is NULL, we issued a previous error.  */
+		if (ar->stride[i] == NULL)
+		  gfc_error ("Bad array specification in ALLOCATE statement "
+			 "at %L", &e->where);
+		goto failure;
+	  }
+	else if (gfc_dep_compare_expr (ar->start[i], ar->end[i]) == 1)
+	  {
+		gfc_error ("Upper cobound is less than lower cobound at %L",
+			   &ar->start[i]->where);
+		goto failure;
+	  }
+	break;
+
+	  case DIMEN_ELEMENT:
+	if (ar->start[i]->expr_type == EXPR_CONSTANT)
+	  {
+		gcc_assert (ar->start[i]->ts.type == BT_INTEGER);
+		if (mpz_cmp_si (ar->start[i]->value.integer, 1) < 0)
+		  {
+		gfc_error ("Upper cobound is less than lower cobound "
+			   " of 1 at %L", &ar->start[i]->where);
+		goto failure;
+		  }
+	  }
+	break;
+
+	  case DIMEN_STAR:
+	break;
+
+	  default:
+	gfc_error ("Bad array specification in ALLOCATE statement at %L",
+		   &e->where);
+	goto failure;
+
+	  }
+  }
   for (i = 0; i < ar->dimen; i++)
 {
   if (ar->type == AR_ELEMENT || ar->type == AR_FULL)
! { dg-do compile }
! { dg-additional-options -fcoarray=single }
! PR fortran/72714
! Test for not ICEing and different error conditions when allocating
! coarrays.
program p
   integer, allocatable :: z[:,:]
   integer :: i
   allocate (z[1:,*]) ! { dg-error "Bad array specification in ALLOCATE statement" }
   allocate (z[:2,*]) ! { dg-error "Bad array specification in ALLOCATE statement" }
   allocate (z[2:1,*]) ! { dg-error "Upper cobound is less than lower cobound" }
   allocate (z[:0,*]) ! { dg-error "Bad array specification in ALLOCATE statement" }
   allocate (z[0,*]) ! { dg-error "Upper cobound is less than lower cobound" }
   allocate (z[1,*]) ! This is OK
   allocate (z[1:1,*]) ! This is OK
   allocate (z[i:i,*]) ! This is OK
   allocate (z[i:i-1,*]) ! { dg-error "Upper cobound is less than lower cobound" }
end

Re: [patch, fortran] Fix pointers not escaping via C_PTR

2019-03-03 Thread Thomas Koenig


I wrote:


First, this talks about a C pointer having a target.  Second, you can
re-establish the association to a different pointer to the Fortran
program.


There is another point to consider: This is interoperability with C
we are dealing with, so we also have to follow C semantics.
And, love it or hate it, C pointers escape.
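
As a minimal sketch of what "escaping" means on the C side (all names
here are made up for illustration and come from no patch): once a callee
stores a pointer somewhere that outlives the call, later code can modify
the target behind the caller's back.

```c
#include <stddef.h>

/* Hypothetical names throughout; this only illustrates escaping.  */
static int *saved;   /* outlives any single call */

void
remember (int *p)
{
  saved = p;         /* p escapes: the callee keeps a copy */
}

int
bump_saved (void)
{
  /* Modifies whatever object was passed to remember () earlier.  */
  return saved != NULL ? ++*saved : -1;
}
```

A caller that passes &x to remember () can no longer assume x is
unchanged across a later call to bump_saved ().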

So, OK for trunk?

Regards

Thomas

[PATCH] [MinGW] Set __USE_MINGW_ACCESS for C++ as well

2019-03-03 Thread Johannes Pfau

We set __USE_MINGW_ACCESS for Windows hosts to use MinGW's wrapper
for the access function. The wrapper ensures that access behaves
in the expected way (e.g. for special files, such as nul).
However, we now compile most sources with the C++ compiler, and
the __USE_MINGW_ACCESS in CFLAGS is not used there. This causes
GCC builds against newer msvcrt versions with incompatible
access implementations to fail. This patch adds the flag to the
CXXFLAGS for all bootstrap stages. Bootstrapped on
x86_64-mingw64-seh.
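
For context, the kind of call affected looks roughly like the sketch
below. is_readable is a hypothetical helper, not GCC code; the fallback
definition of R_OK is an assumption for toolchains that do not provide
it. On a MinGW host, the source would be compiled with
-D__USE_MINGW_ACCESS so that the wrapper described above is used.

```c
#ifdef _WIN32
#include <io.h>      /* access () on Windows toolchains */
#else
#include <unistd.h>  /* access (), R_OK */
#endif

#ifndef R_OK
#define R_OK 4       /* assumed fallback: read-permission bit */
#endif

/* Hypothetical helper: nonzero if PATH can be opened for reading.  */
static int
is_readable (const char *path)
{
  return access (path, R_OK) == 0;
}
```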

config/ChangeLog:

2019-03-02  Johannes Pfau  

* mh-mingw: Also set __USE_MINGW_ACCESS flag for C++ code.

---
 config/mh-mingw | 5 +
 1 file changed, 5 insertions(+)

diff --git a/config/mh-mingw b/config/mh-mingw
index bc1d27477d0..a795096f038 100644
--- a/config/mh-mingw
+++ b/config/mh-mingw
@@ -2,6 +2,11 @@
 # Vista (see PR33281 for details).
 BOOT_CFLAGS += -D__USE_MINGW_ACCESS -Wno-pedantic-ms-format
 CFLAGS += -D__USE_MINGW_ACCESS
+STAGE1_CXXFLAGS += -D__USE_MINGW_ACCESS
+STAGE2_CXXFLAGS += -D__USE_MINGW_ACCESS
+STAGE3_CXXFLAGS += -D__USE_MINGW_ACCESS
+STAGE4_CXXFLAGS += -D__USE_MINGW_ACCESS
+
 # Increase stack limit to a figure based on the Linux default, with 4MB added
 # as GCC turns out to need that much more to pass all the limits-* tests.
 LDFLAGS += -Wl,--stack,12582912
-- 
2.19.2

Re: [patch, fortran] Fix PR 72714, ICE on invalid

2019-03-03 Thread Paul Richard Thomas

Hi Thomas,

This is good for trunk.

Thanks

Paul

On Sun, 3 Mar 2019 at 09:46, Thomas Koenig  wrote:
>
> Hello world,
>
> the attached patch fixes a 7/8/9 regression by rejecting an invalid
> expression in coarray allocation that led to an ICE.  It also adds a few
> more checks.
>
> One point that is checked for is that, unlike normal arrays, coarrays
> cannot be empty.
>
> Regression-tested. OK for trunk and affected branches?
>
> Regards
>
> Thomas
>
> 2019-03-02  Thomas Koenig  
>
>
>
>  PR fortran/72714
>
>  * resolve.c (resolve_allocate_expr): Add some tests for
> coarrays.
>
>
> 2019-03-02  Thomas Koenig  
>
>
>
>  PR fortran/72714
>
>  * gfortran.dg/coarray_allocate_11.f90: New test.



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

[PATCH] Optimize vector init constructor

2019-03-03 Thread H.J. Lu

For vector init constructor:

---
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

__v4sf
foo (__v4sf x, float f)
{
  __v4sf y = { f, x[1], x[2], x[3] };
  return y;
}
---

we can optimize vector init constructor with vector copy or permute
followed by a single scalar insert:

  __v4sf D.1912;
  __v4sf D.1913;
  __v4sf D.1914;
  __v4sf y;

  x.0_1 = x;
  D.1912 = x.0_1;
  _2 = D.1912;
  D.1913 = _2;
  BIT_FIELD_REF <D.1913, 32, 0> = f;
  y = D.1913;
  D.1914 = y;
  return D.1914;

instead of

  __v4sf D.1962;
  __v4sf y;

  _1 = BIT_FIELD_REF <x, 32, 32>;
  _2 = BIT_FIELD_REF <x, 32, 64>;
  _3 = BIT_FIELD_REF <x, 32, 96>;
  y = {f, _1, _2, _3};
  D.1962 = y;
  return D.1962;
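
At the source level, the two forms correspond roughly to the pair of
functions below. This is a sketch using GCC's generic vector extension;
the function names are made up for illustration and are not part of the
patch or its tests.

```c
typedef float v4sf __attribute__ ((vector_size (16)));

/* Form from the message: a constructor that reuses three lanes of x.  */
v4sf
build_with_constructor (v4sf x, float f)
{
  v4sf y = { f, x[1], x[2], x[3] };
  return y;
}

/* Equivalent form: copy the whole vector, then insert one scalar.  */
v4sf
build_with_copy_and_insert (v4sf x, float f)
{
  v4sf y = x;
  y[0] = f;
  return y;
}
```

Both functions return the same value for every input; the optimization
is about getting the second shape out of the first.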

gcc/

PR tree-optimization/88828
* gimplify.c (gimplify_init_constructor): Optimize vector init
constructor with vector copy or permute followed by a single
scalar insert.

gcc/testsuite/

PR tree-optimization/88828
* gcc.target/i386/pr88828-1.c: New test.
* gcc.target/i386/pr88828-2.c: Likewise.
* gcc.target/i386/pr88828-3a.c: Likewise.
* gcc.target/i386/pr88828-3b.c: Likewise.
* gcc.target/i386/pr88828-4a.c: Likewise.
* gcc.target/i386/pr88828-4b.c: Likewise.
* gcc.target/i386/pr88828-5a.c: Likewise.
* gcc.target/i386/pr88828-5b.c: Likewise.
* gcc.target/i386/pr88828-6a.c: Likewise.
* gcc.target/i386/pr88828-6b.c: Likewise.
---
 gcc/gimplify.c | 176 +++--
 gcc/testsuite/gcc.target/i386/pr88828-1.c  |  16 ++
 gcc/testsuite/gcc.target/i386/pr88828-2.c  |  17 ++
 gcc/testsuite/gcc.target/i386/pr88828-3a.c |  16 ++
 gcc/testsuite/gcc.target/i386/pr88828-3b.c |  18 +++
 gcc/testsuite/gcc.target/i386/pr88828-4a.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr88828-4b.c |  20 +++
 gcc/testsuite/gcc.target/i386/pr88828-5a.c |  16 ++
 gcc/testsuite/gcc.target/i386/pr88828-5b.c |  18 +++
 gcc/testsuite/gcc.target/i386/pr88828-6a.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr88828-6b.c |  19 +++
 11 files changed, 336 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6b.c

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 983635ba21f..893a4311f9e 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -5082,22 +5082,170 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
TREE_CONSTANT (ctor) = 0;
  }
 
-   /* Vector types use CONSTRUCTOR all the way through gimple
-  compilation as a general initializer.  */
-   FOR_EACH_VEC_SAFE_ELT (elts, ix, ce)
+   tree rhs_vector = NULL;
+   /* The vector element to replace scalar elements, which
+  will be overridden by scalar insert.  */
+   tree vector_element = NULL;
+   /* The single scalar element.  */
+   tree scalar_element = NULL;
+   unsigned int scalar_idx = 0;
+   enum { unknown, copy, permute, init } operation = unknown;
+   bool insert = false;
+
+   /* Check if we can generate vector copy or permute followed by
+  a single scalar insert.  */
+   if (TYPE_VECTOR_SUBPARTS (type).is_constant ())
  {
-   enum gimplify_status tret;
-   tret = gimplify_expr (&ce->value, pre_p, post_p, is_gimple_val,
- fb_rvalue);
-   if (tret == GS_ERROR)
- ret = GS_ERROR;
-   else if (TREE_STATIC (ctor)
-&& !initializer_constant_valid_p (ce->value,
-  TREE_TYPE (ce->value)))
- TREE_STATIC (ctor) = 0;
+   /* If all RHS vector elements come from the same vector,
+  we can use permute.  If all RHS vector elements come
+  from the same vector in the same order, we can use
+  copy.  */
+   unsigned int nunits
+ = TYPE_VECTOR_SUBPARTS (type).to_constant ();
+   unsigned int nscalars = 0;
+   unsigned int nvectors = 0;
+   operation = unknown;
+   FOR_EACH_VEC_SAFE_ELT (elts, ix, ce)
+ if (TREE_CODE (ce->value) == ARRAY_REF
+ || TREE_CODE (ce->value) == ARRAY_RANGE_REF)
+   {
+ if (!vector_element)
+   vector_element = ce->value;
+ /* Get the vector index.  */
+ tree idx = TREE_OPERAND (ce->value, 1);
+ if (TRE

Re: [PATCH] Optimize vector init constructor

2019-03-03 Thread Andrew Pinski

On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu  wrote:
>
> For vector init constructor:
>
> ---
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
>
> __v4sf
> foo (__v4sf x, float f)
> {
>   __v4sf y = { f, x[1], x[2], x[3] };
>   return y;
> }
> ---
>
> we can optimize vector init constructor with vector copy or permute
> followed by a single scalar insert:
>
>   __v4sf D.1912;
>   __v4sf D.1913;
>   __v4sf D.1914;
>   __v4sf y;
>
>   x.0_1 = x;
>   D.1912 = x.0_1;
>   _2 = D.1912;
>   D.1913 = _2;
>   BIT_FIELD_REF <D.1913, 32, 0> = f;
>   y = D.1913;
>   D.1914 = y;
>   return D.1914;
>
> instead of
>
>   __v4sf D.1962;
>   __v4sf y;
>
>   _1 = BIT_FIELD_REF <x, 32, 32>;
>   _2 = BIT_FIELD_REF <x, 32, 64>;
>   _3 = BIT_FIELD_REF <x, 32, 96>;
>   y = {f, _1, _2, _3};
>   D.1962 = y;
>   return D.1962;
>
> gcc/
>
> PR tree-optimization/88828
> * gimplify.c (gimplify_init_constructor): Optimize vector init
> constructor with vector copy or permute followed by a single
> scalar insert.


Doing this here does not catch things like:
typedef float __v4sf __attribute__ ((__vector_size__ (16)));


__v4sf
vector_init (float f0,float f1, float f2,float f3)
{
  __v4sf y = { f0, f1, f2, f3 };
  return y;
}

__v4sf
foo (__v4sf x, float f)
{
  return vector_init (f, x[1], x[2], x[3]) ;
}

>
> gcc/testsuite/
>
> PR tree-optimization/88828
> * gcc.target/i386/pr88828-1.c: New test.
> * gcc.target/i386/pr88828-2.c: Likewise.
> * gcc.target/i386/pr88828-3a.c: Likewise.
> * gcc.target/i386/pr88828-3b.c: Likewise.
> * gcc.target/i386/pr88828-4a.c: Likewise.
> * gcc.target/i386/pr88828-4b.c: Likewise.
> * gcc.target/i386/pr88828-5a.c: Likewise.
> * gcc.target/i386/pr88828-5b.c: Likewise.
> * gcc.target/i386/pr88828-6a.c: Likewise.
> * gcc.target/i386/pr88828-6b.c: Likewise.
> ---
>  gcc/gimplify.c | 176 +++--
>  gcc/testsuite/gcc.target/i386/pr88828-1.c  |  16 ++
>  gcc/testsuite/gcc.target/i386/pr88828-2.c  |  17 ++
>  gcc/testsuite/gcc.target/i386/pr88828-3a.c |  16 ++
>  gcc/testsuite/gcc.target/i386/pr88828-3b.c |  18 +++
>  gcc/testsuite/gcc.target/i386/pr88828-4a.c |  17 ++
>  gcc/testsuite/gcc.target/i386/pr88828-4b.c |  20 +++
>  gcc/testsuite/gcc.target/i386/pr88828-5a.c |  16 ++
>  gcc/testsuite/gcc.target/i386/pr88828-5b.c |  18 +++
>  gcc/testsuite/gcc.target/i386/pr88828-6a.c |  17 ++
>  gcc/testsuite/gcc.target/i386/pr88828-6b.c |  19 +++
>  11 files changed, 336 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-3b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-4b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-5b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88828-6b.c
>
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 983635ba21f..893a4311f9e 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -5082,22 +5082,170 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq *post_p,
> TREE_CONSTANT (ctor) = 0;
>   }
>
> -   /* Vector types use CONSTRUCTOR all the way through gimple
> -  compilation as a general initializer.  */
> -   FOR_EACH_VEC_SAFE_ELT (elts, ix, ce)
> +   tree rhs_vector = NULL;
> +   /* The vector element to replace scalar elements, which
> +  will be overridden by scalar insert.  */
> +   tree vector_element = NULL;
> +   /* The single scalar element.  */
> +   tree scalar_element = NULL;
> +   unsigned int scalar_idx = 0;
> +   enum { unknown, copy, permute, init } operation = unknown;
> +   bool insert = false;
> +
> +   /* Check if we can generate vector copy or permute followed by
> +  a single scalar insert.  */
> +   if (TYPE_VECTOR_SUBPARTS (type).is_constant ())
>   {
> -   enum gimplify_status tret;
> -   tret = gimplify_expr (&ce->value, pre_p, post_p, is_gimple_val,
> - fb_rvalue);
> -   if (tret == GS_ERROR)
> - ret = GS_ERROR;
> -   else if (TREE_STATIC (ctor)
> -&& !initializer_constant_valid_p (ce->value,
> -  TREE_TYPE (ce->value)))
> - TREE_STATIC (ctor) = 0;
> +   /* If all RHS vector elements come from the same vector,
> +  we can use permute.  If all RHS vector elements come
> +  from the same vector in the same order, we can use
> +  copy.  */
> +

Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-03 Thread Uros Bizjak

On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu  wrote:
>
> 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> when 32-bit indices are used as addresses, like in
>
> vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
>
> a 32-bit index such as 0xf7fa3010 is sign-extended to
> 0xfffffffff7fa3010, which is an invalid address.  Add addr32 prefix to
> UNSPEC_VSIBADDR instructions for x32 if there is no base register or symbol.
>
> This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
>
> -Ofast -funroll-loops -march=haswell
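
The arithmetic behind the quoted description can be checked with a small
sketch. The helper names are hypothetical; the conversion to int32_t
relies on GCC's documented wrap-around behaviour for out-of-range
values, which is implementation-defined in ISO C.

```c
#include <stdint.h>

/* What the hardware does with a 32-bit VSIB index in 64-bit
   addressing: replicate bit 31 into the upper half.  */
static uint64_t
sign_extend_index (uint32_t index)
{
  return (uint64_t) (int64_t) (int32_t) index;
}

/* What an x32 program expects for a 32-bit address: upper half zero.  */
static uint64_t
zero_extend_index (uint32_t index)
{
  return (uint64_t) index;
}
```

For 0xf7fa3010 the first helper yields 0xfffffffff7fa3010 and the second
0x00000000f7fa3010; the two differ whenever bit 31 of the index is set.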

1. Testcases 2 to 9 fail on fedora-29 with:

In file included from /usr/include/features.h:452,
 from /usr/include/bits/libc-header-start.h:33,
 from /usr/include/stdlib.h:25,
 from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
 from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
 from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
/usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
file or directory

2. Does the patch work with -maddress-mode={short,long}?

3. The implementation is wrong. You should use operand substitution
with VSIB address as operand, not substitution without operand.

4. The PR is not a regression.

Uros.

>
> gcc/
>
> PR target/89523
> * config/i386/i386.c (ix86_print_operand): Also handle '_' to
> add addr32 prefix if required.
> (ix86_print_operand_punct_valid_p): Allow '_'.
> * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend
> "%_".
> (*avx512pf_gatherpfdf_mask): Likewise.
> (*avx512pf_scatterpfsf_mask): Likewise.
> (*avx512pf_scatterpfdf_mask): Likewise.
> (*avx2_gathersi): Likewise.
> (*avx2_gathersi_2): Likewise.
> (*avx2_gatherdi): Likewise.
> (*avx2_gatherdi_2): Likewise.
> (*avx2_gatherdi_3): Likewise.
> (*avx2_gatherdi_4): Likewise.
> (*avx512f_gathersi): Likewise.
> (*avx512f_gathersi_2): Likewise.
> (*avx512f_gatherdi): Likewise.
> (*avx512f_gatherdi_2): Likewise.
> (*avx512f_scattersi): Likewise.
> (*avx512f_scatterdi): Likewise.
>
> gcc/testsuite/
>
> PR target/89523
> * gcc.target/i386/pr89523-1.c: New test.
> * gcc.target/i386/pr89523-2.c: Likewise.
> * gcc.target/i386/pr89523-3.c: Likewise.
> * gcc.target/i386/pr89523-4.c: Likewise.
> * gcc.target/i386/pr89523-5.c: Likewise.
> * gcc.target/i386/pr89523-6.c: Likewise.
> * gcc.target/i386/pr89523-7.c: Likewise.
> * gcc.target/i386/pr89523-8.c: Likewise.
> * gcc.target/i386/pr89523-9.c: Likewise.
>
> xxx
> ---
>  gcc/config/i386/i386.c| 39 ++-
>  gcc/config/i386/sse.md| 46 +++
>  gcc/testsuite/gcc.target/i386/pr89523-1.c | 24 
>  gcc/testsuite/gcc.target/i386/pr89523-2.c | 17 +
>  gcc/testsuite/gcc.target/i386/pr89523-3.c | 17 +
>  gcc/testsuite/gcc.target/i386/pr89523-4.c | 16 
>  gcc/testsuite/gcc.target/i386/pr89523-5.c | 18 +
>  gcc/testsuite/gcc.target/i386/pr89523-6.c | 17 +
>  gcc/testsuite/gcc.target/i386/pr89523-7.c | 19 ++
>  gcc/testsuite/gcc.target/i386/pr89523-8.c | 19 ++
>  gcc/testsuite/gcc.target/i386/pr89523-9.c | 16 
>  11 files changed, 224 insertions(+), 24 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-8.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-9.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index b8357a7db5d..336696136de 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -17805,6 +17805,7 @@ print_reg (rtx x, int code, FILE *file)
> ~ -- print "i" if TARGET_AVX2, "f" otherwise.
> ^ -- print addr32 prefix if TARGET_64BIT and Pmode != word_mode
> ! -- print NOTRACK prefix for jxx/call/ret instructions if required.
> +   _ -- print addr32 prefix if required.
>   */
>
>  void
> @@ -18356,6 +18357,42 @@ ix86_print_operand (FILE *file, rtx x, int code)
> fputs ("addr32 ", file);
>   return;
>
> +   case '_':
> + if (TARGET_X32)
> +   {
> + subrtx_var_iterator::array_type array;
> + FOR_EACH_SUBRTX_VAR (iter, array,
> +  PATTERN (current_output_insn),

[RFC] libgcc: Integrating emutls and D garbage collector

2019-03-03 Thread Johannes Pfau

Hello all,

I wanted to ask for some feedback on an issue which has affected GDC
for a long time now: Letting the D garbage collector scan
emulated TLS memory. Basically, for every variable instance in TLS
memory we need to be able to get the address and size of that instance,
so the GC can scan this memory for pointers.

Now with libgcc emutls there are essentially two problems:

1) Memory is allocated individually for each variable, so it's not
possible to simply get a memory range as in native TLS.
I can't think of a way to get the required information with the
current API, so I've attached a proof-of-concept patch which
implements a __emutls_iterate_memory function. It is however rather
intrusive, so maybe there's a better solution. Currently it keeps
the size for all variables in every thread's local memory, but an
optimization would be to keep the size only once in a separate array.

A completely different, less intrusive solution would be to allow the
frontend to specify different emutls functions (e.g.
__d_emutls_get_address instead of __emutls_get_address). The main
drawback here is that linking C/D emutls variables would not be
possible due to the incompatible ABI. However, this is also true
for the other D compilers' (DMD) emutls implementation, so this may
be fine. A D emutls implementation would also be much simpler and
probably faster (we can just allocate using the GC).

(Of course it's way too late for such changes in GCC9, so I'm just
asking for some general feedback.)


2) TLS memory is only freed once a thread exits, but not on unloading of
shared libraries. This problem is aggravated for D, as this old TLS
memory may contain references to the GC heap and prevent that memory
from being collected. However, I think this issue is not that important,
especially as I can't think of a portable solution (one destructor
per variable could work, but seems quite wasteful). 

I'm looking forward to your answers, maybe there's some better solution
I'm not aware of.
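
To make the idea in 1) concrete, here is a simplified sketch of the
callback-based iteration, independent of the libgcc internals. struct
tls_block, iterate_blocks and count_bytes are hypothetical names, and
size_t stands in for the "pointer" integer type used in emutls.c.

```c
#include <stddef.h>

/* Hypothetical model of one thread's emulated-TLS blocks.  */
struct tls_block
{
  void *mem;     /* start of one variable instance, or NULL if unused */
  size_t size;   /* size of that instance in bytes */
};

typedef void (*iterate_callback) (void *mem, size_t size, void *user);

/* Report every allocated block to the callback.  */
static void
iterate_blocks (const struct tls_block *blocks, size_t count,
                iterate_callback cb, void *user)
{
  for (size_t i = 0; i < count; i++)
    if (blocks[i].mem != NULL)
      cb (blocks[i].mem, blocks[i].size, user);
}

/* Example callback: a GC would scan [mem, mem + size) here; this one
   only totals the bytes it was shown.  */
static void
count_bytes (void *mem, size_t size, void *user)
{
  (void) mem;
  *(size_t *) user += size;
}
```

The GC side would pass its own scanning routine as the callback and its
state through the user pointer.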

Best regards,
Johannes

---
 libgcc/emutls.c  | 118 +--
 libgcc/libgcc-std.ver.in |   5 ++
 2 files changed, 118 insertions(+), 5 deletions(-)

diff --git a/libgcc/emutls.c b/libgcc/emutls.c
index c725142e465..66b74d29f52 100644
--- a/libgcc/emutls.c
+++ b/libgcc/emutls.c
@@ -52,6 +52,8 @@ struct __emutls_array
 
 void *__emutls_get_address (struct __emutls_object *);
 void __emutls_register_common (struct __emutls_object *, word, word, void *);
+typedef void (*iterate_callback)(void* mem, pointer size, void *user);
+void __emutls_iterate_memory (iterate_callback cb, void *user);
 
 #ifdef __GTHREADS
 #ifdef __GTHREAD_MUTEX_INIT
@@ -62,10 +64,111 @@ static __gthread_mutex_t emutls_mutex;
 static __gthread_key_t emutls_key;
 static pointer emutls_size;
 
+static struct __emutls_array *emutls_arrays;
+
+static void
+emutls_array_register (struct __emutls_array *arr)
+{
+  __gthread_mutex_lock (&emutls_mutex);
+
+  if (emutls_arrays == NULL)
+{
+  emutls_arrays = calloc (32 + 1, sizeof (void *));
+  emutls_arrays->size = 32;
+}
+
+  // Try to write to an empty slot
+  pointer slot_index = 0;
+  for (; slot_index < emutls_arrays->size; slot_index++)
+{
+  if (emutls_arrays->data[slot_index] == NULL)
+   {
+ emutls_arrays->data[slot_index] = (void *) arr;
+ break;
+   }
+}
+  // No empty slot?
+  if (slot_index == emutls_arrays->size)
+{
+  emutls_arrays = realloc (emutls_arrays, (slot_index + 2) * sizeof (void 
*));
+  if (emutls_arrays == NULL)
+   abort ();
+  emutls_arrays->size = slot_index + 1;
+  emutls_arrays->data[slot_index] = (void *) arr;
+}
+
+  __gthread_mutex_unlock (&emutls_mutex);
+}
+
+static void
+emutls_array_update (struct __emutls_array *old, struct __emutls_array 
*updated)
+{
+  if (updated == old)
+return;
+
+  __gthread_mutex_lock (&emutls_mutex);
+
+  for (pointer slot_index = 0; slot_index < emutls_arrays->size; slot_index++)
+{
+  if (emutls_arrays->data[slot_index] == (void *) old)
+emutls_arrays->data[slot_index] = (void *) updated;
+}
+
+  __gthread_mutex_unlock (&emutls_mutex);
+}
+
+static void
+emutls_array_unregister (struct __emutls_array *arr)
+{
+  __gthread_mutex_lock (&emutls_mutex);
+
+  for (pointer slot_index = 0; slot_index < emutls_arrays->size; slot_index++)
+{
+  if (emutls_arrays->data[slot_index] == (void *) arr)
+emutls_arrays->data[slot_index] = NULL;
+}
+
+  __gthread_mutex_unlock (&emutls_mutex);
+}
+
+static void
+emutls_array_iterate (struct __emutls_array *arr, iterate_callback cb, void 
*user)
+{
+  if (arr == NULL)
+return;
+
+  for (pointer i = 0; i < arr->size; i++)
+{
+  void *ptr = arr->data[i];
+  if (ptr)
+   {
+ pointer size = ((pointer*) ptr)[-2];
+ cb (ptr, size, user);
+   }
+}
+}
+
+void
+__emutls_iterate_memory (iterate_callback cb, void *user)
+{
+  __gthread_mutex_l

Re: [PR fortran/77583, patch ]- ICE in pp_quoted_string, at pretty-print.c:966

2019-03-03 Thread Harald Anlauf

I didn't see any disagreement, so committed to trunk (rev.269353)
and "backported" to 7- and 8-branches.

Thanks,
Harald

On 03/02/19 00:15, Steve Kargl wrote:
> On Sat, Mar 02, 2019 at 12:12:10AM +0100, Harald Anlauf wrote:
>> The attached patch (originally by Steve Kargl) fixes a NULL pointer
>> dereference that may occur when checking for a conflict.
>>
>> Regtested successfully.
>>
>> OK for trunk?  Backport to active branches?
>>
>>
>> 2019-03-02  Harald Anlauf  
>>  Steve Kargl  
> 
> Steven G. Kargl  
> 
> ;-)
> 
> I, of course, approve of the patch, but you might give
> others a chance to disagree.
>

[libstdc++] Don't throw in std::assoc_legendre for m > l

2019-03-03 Thread André Brand

The return value specified in "8.1.2 associated Legendre polynomials"
of ISO/IEC JTC 1/SC 22/WG 21 N3060 (which is identical to the
expression in the doxygen comment of the patched function) is well-
defined for m>l: it is always zero because $ P_l(x) $ is a polynomial
of degree l.

The standard does not enforce an exception in this case because none of
the requirements in 8.1 (5) on page 11 of ISO/IEC JTC 1/SC 22/WG 21
N3060 are met.

Note: the implementation of std::assoc_legendre in Visual Studio 2017
(tested with Visual Studio 15.9.7) silently returns zero.
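
The "always zero" argument can be checked numerically with the sketch
below. It is not the libstdc++ implementation: legendre_derivative is a
hypothetical helper that only evaluates the derivative factor
d^m/dx^m P_l(x) of the Rodrigues formula, by building the coefficients
of P_l with Bonnet's recursion and differentiating them m times.

```c
#define MAX_DEGREE 32

/* Evaluate d^m/dx^m P_l(x) for l <= MAX_DEGREE.  Hypothetical helper
   for illustration only.  */
static double
legendre_derivative (unsigned l, unsigned m, double x)
{
  double prev[MAX_DEGREE + 1] = { 0 };  /* coefficients of P_{n-1} */
  double cur[MAX_DEGREE + 1] = { 0 };   /* coefficients of P_n */
  double next[MAX_DEGREE + 1];
  double result = 0.0;

  if (l > MAX_DEGREE)
    return 0.0;  /* out of range for this sketch */

  cur[0] = 1.0;  /* P_0(x) = 1 */
  for (unsigned n = 0; n < l; n++)
    {
      /* Bonnet: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}.  */
      for (unsigned k = 0; k <= MAX_DEGREE; k++)
        {
          double shifted = k > 0 ? cur[k - 1] : 0.0;  /* x * P_n */
          next[k] = ((2.0 * n + 1.0) * shifted - n * prev[k]) / (n + 1.0);
        }
      for (unsigned k = 0; k <= MAX_DEGREE; k++)
        {
          prev[k] = cur[k];
          cur[k] = next[k];
        }
    }

  /* Differentiate m times; after more than MAX_DEGREE steps every
     coefficient is already zero, so the loop can stop early.  */
  for (unsigned d = 0; d < m && d <= MAX_DEGREE; d++)
    {
      for (unsigned k = 0; k < MAX_DEGREE; k++)
        cur[k] = (k + 1.0) * cur[k + 1];
      cur[MAX_DEGREE] = 0.0;
    }

  /* Horner evaluation at x.  */
  for (unsigned k = MAX_DEGREE + 1; k-- > 0;)
    result = result * x + cur[k];
  return result;
}
```

With this helper, legendre_derivative (2, 3, x) is 0 for any x, because
P_2 has degree 2 and its third derivative vanishes; multiplying by
(1 - x^2)^{m/2} keeps the result at zero.
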
Index: libstdc++-v3/include/tr1/legendre_function.tcc
===
--- libstdc++-v3/include/tr1/legendre_function.tcc	(revision 269352)
+++ libstdc++-v3/include/tr1/legendre_function.tcc	(working copy)
@@ -67,13 +67,13 @@
 /**
  *   @brief  Return the Legendre polynomial by recursion on degree
  *   @f$ l @f$.
- * 
+ *
  *   The Legendre function of @f$ l @f$ and @f$ x @f$,
  *   @f$ P_l(x) @f$, is defined by:
  *   @f[
  * P_l(x) = \frac{1}{2^l l!}\frac{d^l}{dx^l}(x^2 - 1)^{l}
  *   @f]
- * 
+ *
  *   @param  l  The degree of the Legendre polynomial.  @f$l >= 0@f$.
  *   @param  x  The argument of the Legendre polynomial.  @f$|x| <= 1@f$.
  */
@@ -120,17 +120,17 @@
 /**
  *   @brief  Return the associated Legendre function by recursion
  *   on @f$ l @f$.
- * 
+ *
  *   The associated Legendre function is derived from the Legendre function
  *   @f$ P_l(x) @f$ by the Rodrigues formula:
  *   @f[
  * P_l^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x)
  *   @f]
- * 
+ *
  *   @param  l  The degree of the associated Legendre function.
  *  @f$ l >= 0 @f$.
  *   @param  m  The order of the associated Legendre function.
- *  @f$ m <= l @f$.
+ *  @f$ m >= 0 @f$.
  *   @param  x  The argument of the associated Legendre function.
  *  @f$ |x| <= 1 @f$.
  *   @param  phase  The phase of the associated Legendre function.
@@ -146,8 +146,7 @@
 std::__throw_domain_error(__N("Argument out of range"
   " in __assoc_legendre_p."));
   else if (__m > __l)
-std::__throw_domain_error(__N("Degree out of range"
-  " in __assoc_legendre_p."));
+return _Tp(0);
   else if (__isnan(__x))
 return std::numeric_limits<_Tp>::quiet_NaN();
   else if (__m == 0)
@@ -192,7 +191,7 @@
 
 /**
  *   @brief  Return the spherical associated Legendre function.
- * 
+ *
  *   The spherical associated Legendre function of @f$ l @f$, @f$ m @f$,
  *   and @f$ \theta @f$ is defined as @f$ Y_l^m(\theta,0) @f$ where
  *   @f[
@@ -202,7 +201,7 @@
  *   @f]
  *   is the spherical harmonic function and @f$ P_l^m(x) @f$ is the
  *   associated Legendre function.
- * 
+ *
  *   This function differs from the associated Legendre function by
  *   argument (@f$x = \cos(\theta)@f$) and by a normalization factor
  *   but this factor is rather large for large @f$ l @f$ and @f$ m @f$
@@ -210,7 +209,7 @@
  *   and @f$ m @f$.
  *   @note Unlike the case for __assoc_legendre_p the Condon-Shortley
  *   phase factor @f$ (-1)^m @f$ is present here.
- * 
+ *
  *   @param  l  The degree of the spherical associated Legendre function.
  *  @f$ l >= 0 @f$.
  *   @param  m  The order of the spherical associated Legendre function.

Re: [PATCH] Optimize vector init constructor

2019-03-03 Thread H.J. Lu

On Sun, Mar 03, 2019 at 06:40:09AM -0800, Andrew Pinski wrote:
> On Sun, Mar 3, 2019 at 6:32 AM H.J. Lu  wrote:
> >
> > For vector init constructor:
> >
> > ---
> > typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> >
> > __v4sf
> > foo (__v4sf x, float f)
> > {
> >   __v4sf y = { f, x[1], x[2], x[3] };
> >   return y;
> > }
> > ---
> >
> > we can optimize vector init constructor with vector copy or permute
> > followed by a single scalar insert:
> >
> >   __v4sf D.1912;
> >   __v4sf D.1913;
> >   __v4sf D.1914;
> >   __v4sf y;
> >
> >   x.0_1 = x;
> >   D.1912 = x.0_1;
> >   _2 = D.1912;
> >   D.1913 = _2;
> >   BIT_FIELD_REF <D.1913, 32, 0> = f;
> >   y = D.1913;
> >   D.1914 = y;
> >   return D.1914;
> >
> > instead of
> >
> >   __v4sf D.1962;
> >   __v4sf y;
> >
> >   _1 = BIT_FIELD_REF <x, 32, 32>;
> >   _2 = BIT_FIELD_REF <x, 32, 64>;
> >   _3 = BIT_FIELD_REF <x, 32, 96>;
> >   y = {f, _1, _2, _3};
> >   D.1962 = y;
> >   return D.1962;
> >
> > gcc/
> >
> > PR tree-optimization/88828
> > * gimplify.c (gimplify_init_constructor): Optimize vector init
> > constructor with vector copy or permute followed by a single
> > scalar insert.
> 
> 
> Doing this here does not catch things like:
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> 
> 
> __v4sf
> vector_init (float f0,float f1, float f2,float f3)
> {
>   __v4sf y = { f0, f1, f2, f3 };
>   return y;
> }
> 
> __v4sf
> foo (__v4sf x, float f)
> {
>   return vector_init (f, x[1], x[2], x[3]) ;
> }
> 

Here is a patch for simplify_vector_constructor to optimize vector init
constructor with vector copy or permute followed by a single scalar
insert.  But this doesn't work correctly:

[hjl@gnu-cfl-2 pr88828]$ cat bar.i
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

static __v4sf
vector_init (float f0,float f1, float f2,float f3)
{
  __v4sf y = { f0, f1, f2, f3 };
  return y;
}

__v4sf
foo (__v4sf x, float f)
{
  return vector_init (f, x[1], x[2], x[3]) ;
}
[hjl@gnu-cfl-2 pr88828]$ make bar.s
/export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/xgcc 
-B/export/build/gnu/tools-build/gcc-wip-debug/build-x86_64-linux/gcc/ -O2 -S 
bar.i
[hjl@gnu-cfl-2 pr88828]$ cat bar.s
.file   "bar.i"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB1:
.cfi_startproc
ret
.cfi_endproc
.LFE1:
.size   foo, .-foo
.ident  "GCC: (GNU) 9.0.1 20190303 (experimental)"
.section.note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr88828]$

Scalar insert is missing.
---
 gcc/tree-ssa-forwprop.c | 77 -
 1 file changed, 69 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index eeb6281c652..b10cfccf7b8 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2008,7 +2008,7 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
   unsigned elem_size, i;
   unsigned HOST_WIDE_INT nelts;
   enum tree_code code, conv_code;
-  constructor_elt *elt;
+  constructor_elt *ce;
   bool maybe_ident;
 
   gcc_checking_assert (gimple_assign_rhs_code (stmt) == CONSTRUCTOR);
@@ -2027,18 +2027,41 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
   orig[1] = NULL;
   conv_code = ERROR_MARK;
   maybe_ident = true;
-  FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, elt)
+
+  tree rhs_vector = NULL;
+  /* The single scalar element.  */
+  tree scalar_element = NULL;
+  unsigned int scalar_idx = 0;
+  bool insert = false;
+  unsigned int nscalars = 0;
+  unsigned int nvectors = 0;
+  FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, ce)
     {
       tree ref, op1;
 
       if (i >= nelts)
         return false;
 
-      if (TREE_CODE (elt->value) != SSA_NAME)
+      if (TREE_CODE (ce->value) != SSA_NAME)
         return false;
-      def_stmt = get_prop_source_stmt (elt->value, false, NULL);
+      def_stmt = get_prop_source_stmt (ce->value, false, NULL);
       if (!def_stmt)
-        return false;
+        {
+          if ( gimple_nop_p (SSA_NAME_DEF_STMT (ce->value)))
+            {
+              /* Only allow one single scalar insert.  */
+              if (nscalars != 0)
+                return false;
+
+              nscalars = 1;
+              insert = true;
+              scalar_idx = i;
+              scalar_element = ce->value;
+              continue;
+            }
+          else
+            return false;
+        }
       code = gimple_assign_rhs_code (def_stmt);
       if (code == FLOAT_EXPR
           || code == FIX_TRUNC_EXPR)
@@ -2046,7 +2069,7 @

Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-03 Thread H.J. Lu

On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak wrote:
>
> On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu wrote:
> >
> > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > when 32-bit indices are used as addresses, like in
> >
> > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> >
> > a 32-bit index such as 0xf7fa3010 is sign-extended to
> > 0xfffffffff7fa3010, which is an invalid address.  Add addr32 prefix to
> > UNSPEC_VSIBADDR instructions for x32 if there is no base register nor
> > symbol.
> >
> > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> >
> > -Ofast -funroll-loops -march=haswell
>
> 1. Testcases 2 to 9 fail on fedora-29 with:
>
> In file included from /usr/include/features.h:452,
>  from /usr/include/bits/libc-header-start.h:33,
>  from /usr/include/stdlib.h:25,
>  from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
>  from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
>  from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
>  from
> /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> file or directory

I will update tests to remove "#include <immintrin.h>"

> 2. Does the patch work with -maddress-mode={short,long}?

Yes.

> 3. The implementation is wrong. You should use operand substitution
> with VSIB address as operand, not substitution without operand.

How can I add an addr32 prefix with operand substitution?  This is
very similar to "%^".  My updated patch will use "%^".

> 4. The PR is not a regression.

Correct.

H.J.
> Uros.
>
> >
> > gcc/
> >
> > PR target/89523
> > * config/i386/i386.c (ix86_print_operand): Also handle '_' to
> > add addr32 prefix if required.
> > (ix86_print_operand_punct_valid_p): Allow '_'.
> > * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend
> > "%_".
> > (*avx512pf_gatherpfdf_mask): Likewise.
> > (*avx512pf_scatterpfsf_mask): Likewise.
> > (*avx512pf_scatterpfdf_mask): Likewise.
> > (*avx2_gathersi): Likewise.
> > (*avx2_gathersi_2): Likewise.
> > (*avx2_gatherdi): Likewise.
> > (*avx2_gatherdi_2): Likewise.
> > (*avx2_gatherdi_3): Likewise.
> > (*avx2_gatherdi_4): Likewise.
> > (*avx512f_gathersi): Likewise.
> > (*avx512f_gathersi_2): Likewise.
> > (*avx512f_gatherdi): Likewise.
> > (*avx512f_gatherdi_2): Likewise.
> > (*avx512f_scattersi): Likewise.
> > (*avx512f_scatterdi): Likewise.
> >
> > gcc/testsuite/
> >
> > PR target/89523
> > * gcc.target/i386/pr89523-1.c: New test.
> > * gcc.target/i386/pr89523-2.c: Likewise.
> > * gcc.target/i386/pr89523-3.c: Likewise.
> > * gcc.target/i386/pr89523-4.c: Likewise.
> > * gcc.target/i386/pr89523-5.c: Likewise.
> > * gcc.target/i386/pr89523-6.c: Likewise.
> > * gcc.target/i386/pr89523-7.c: Likewise.
> > * gcc.target/i386/pr89523-8.c: Likewise.
> > * gcc.target/i386/pr89523-9.c: Likewise.
> >
> > xxx
> > ---
> >  gcc/config/i386/i386.c| 39 ++-
> >  gcc/config/i386/sse.md| 46 +++
> >  gcc/testsuite/gcc.target/i386/pr89523-1.c | 24 
> >  gcc/testsuite/gcc.target/i386/pr89523-2.c | 17 +
> >  gcc/testsuite/gcc.target/i386/pr89523-3.c | 17 +
> >  gcc/testsuite/gcc.target/i386/pr89523-4.c | 16 
> >  gcc/testsuite/gcc.target/i386/pr89523-5.c | 18 +
> >  gcc/testsuite/gcc.target/i386/pr89523-6.c | 17 +
> >  gcc/testsuite/gcc.target/i386/pr89523-7.c | 19 ++
> >  gcc/testsuite/gcc.target/i386/pr89523-8.c | 19 ++
> >  gcc/testsuite/gcc.target/i386/pr89523-9.c | 16 
> >  11 files changed, 224 insertions(+), 24 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-7.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-8.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr89523-9.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index b8357a7db5d..336696136de 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -17805,6 +17805,7 @@ print_reg (rtx x, int code, FILE *file)
> > ~ -- print "i" if TARGET_AVX2, "f" otherwise.
> > ^ -- print addr32 prefix if TARGET_64BIT and Pmode != word_mode
> > ! -- print NOTRACK prefix for jxx/call/ret instru

Re: [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-03 Thread Uros Bizjak

On Sun, Mar 3, 2019 at 10:18 PM H.J. Lu wrote:
>
> On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak wrote:
> >
> > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu wrote:
> > >
> > > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > > when 32-bit indices are used as addresses, like in
> > >
> > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> > >
> > > a 32-bit index such as 0xf7fa3010 is sign-extended to
> > > 0xfffffffff7fa3010, which is an invalid address.  Add addr32 prefix to
> > > UNSPEC_VSIBADDR instructions for x32 if there is no base register nor
> > > symbol.
> > >
> > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> > >
> > > -Ofast -funroll-loops -march=haswell
> >
> > 1. Testcases 2 to 9 fail on fedora-29 with:
> >
> > In file included from /usr/include/features.h:452,
> >  from /usr/include/bits/libc-header-start.h:33,
> >  from /usr/include/stdlib.h:25,
> >  from /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
> >  from /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
> >  from /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
> >  from
> > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > file or directory
>
> I will update tests to remove "#include <immintrin.h>"
>
> > 2. Does the patch work with -maddress-mode={short,long}?
>
> Yes.
>
> > 3. The implementation is wrong. You should use operand substitution
> > with VSIB address as operand, not substitution without operand.
>
> How can I add an addr32 prefix with operand substitution?  This is
> very similar to "%^".  My updated patch will use "%^".

Yes, using %^ is what I think would be the optimal solution. Other
than that, in your proposed patch, operand-less %_ scans the entire
current_output_insn to dig to the UNSPEC_VSIBADDR. You can just use
operand substitution, and do e.g. "%X2vgatherpf0..." where 'X'
processes operand 2 (vsib_address_operand) and conditionally outputs
addr32.

BTW: In a new version of the patch, please specify what has changed
from the previous version. Otherwise, reviewing a new version is more
or less guesswork about what changed.

Uros.
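
The address arithmetic from the quoted patch description can be checked with a few lines of ordinary integer code. This is a minimal sketch rather than anything from the patch: both function names are hypothetical, and the addr32 case is modelled simply as truncation of the effective address to its low 32 bits, which is an assumption about the address-size override made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit VSIB index to 64 bits, which is what happens to
// each index element when the instruction uses a 64-bit address size.
static std::uint64_t
sign_extend_index (std::uint32_t index)
{
  const std::uint64_t high
    = (index & 0x80000000u) ? 0xffffffff00000000ull : 0;
  return high | index;
}

// Simplified model of the same access with a 32-bit address size (addr32
// prefix): the effective address keeps only its low 32 bits.
static std::uint64_t
effective_address_addr32 (std::uint32_t index, unsigned scale)
{
  assert (scale == 1 || scale == 2 || scale == 4 || scale == 8);
  return static_cast<std::uint32_t> (sign_extend_index (index) * scale);
}
```

With the index from the description, sign_extend_index (0xf7fa3010) yields 0xfffffffff7fa3010, while the addr32 model stays at 0xf7fa3010, inside the 32-bit x32 address space.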

[PATCH] PR libstdc++/89562 use binary mode for file I/O

2019-03-03 Thread Jonathan Wakely


PR libstdc++/89562
* src/filesystem/ops-common.h (do_copy_file): Open files in binary
mode for mingw.

Tested x86_64-linux, and lightly tested on mingw-w64 to verify the fix
works.


commit b15e67df3477fac3fea5a3df234be91391719fcd
Author: Jonathan Wakely 
Date:   Sun Mar 3 22:01:29 2019 +

PR libstdc++/89562 use binary mode for file I/O

PR libstdc++/89562
* src/filesystem/ops-common.h (do_copy_file): Open files in binary
mode for mingw.

diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h
index 55e482ff8f2..6dc9b137dbf 100644
--- a/libstdc++-v3/src/filesystem/ops-common.h
+++ b/libstdc++-v3/src/filesystem/ops-common.h
@@ -402,7 +402,12 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
   int fd;
 };
 
-CloseFD in = { posix::open(from, O_RDONLY) };
+int iflag = O_RDONLY;
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+iflag |= O_BINARY;
+#endif
+
+CloseFD in = { posix::open(from, iflag) };
 if (in.fd == -1)
   {
ec.assign(errno, std::generic_category());
@@ -413,6 +418,9 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
   oflag |= O_TRUNC;
 else
   oflag |= O_EXCL;
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+oflag |= O_BINARY;
+#endif
 CloseFD out = { posix::open(to, oflag, S_IWUSR) };
 if (out.fd == -1)
   {
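
The flag handling in the hunks above follows a common portability pattern, which can be sketched outside libstdc++ as follows. The helper with_binary_mode is hypothetical, and defining O_BINARY to 0 where it is missing is an assumption of this sketch; the patch itself keys on _GLIBCXX_FILESYSTEM_IS_WINDOWS instead.

```cpp
#include <cassert>
#include <fcntl.h>

// O_BINARY is a Windows/mingw extension.  For this sketch, map it to 0
// elsewhere so the same expression also compiles on POSIX systems.
#ifndef O_BINARY
# define O_BINARY 0
#endif

// Hypothetical helper: add binary mode to flags destined for open ().
static int
with_binary_mode (int flags)
{
#ifdef O_TEXT
  // Requesting text and binary translation together would be contradictory.
  assert ((flags & O_TEXT) == 0);
#endif
  return flags | O_BINARY;
}
```

On targets without text-mode translation the helper returns its argument unchanged, so callers do not need their own #ifdef.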

Re: [v3 PATCH, RFC] Rewrite variant. Also PR libstdc++/85517

2019-03-03 Thread Ville Voutilainen

On Wed, 6 Feb 2019 at 15:12, Ville Voutilainen wrote:

> And, to emphasize, the most important reason for this was to be able
> to write straightforward
> code for the special member functions, with the hope that it wouldn't
> have a negative codegen
> effect. Our Microsoft friends described the general technique as "has
> crazy-good codegen",
> but I have no idea what their starting point was; our starting point
> probably wasn't bad
> to begin with.

However, the codegen should be somewhat improved; this patch removes a
bag of run-time ifs from the implementation.

An amended patch attached. This gets rid of all __erased* stuff,
including hash, swap, constructors, relops.
I consider variant to no longer be in the state of sin after this.
Since this is touching just a C++17 facility with no
impact elsewhere, we could consider landing it in GCC 9 as a late
change. Failing that, it certainly seems safe enough
to put into GCC 9.2.
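
As a rough illustration of writing such operations through visitation rather than through tables of type-erased functions, here is a sketch that uses only the public std::visit. It is not the patch's internal __do_visit, and holds_equal is a hypothetical name; unlike the real relational operators, it treats valueless variants as a precondition violation.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical helper: equality of two variants implemented by visitation.
template <typename... Ts>
bool
holds_equal (const std::variant<Ts...>& lhs, const std::variant<Ts...>& rhs)
{
  // std::visit throws on a valueless variant; treat that as a precondition.
  assert (!lhs.valueless_by_exception () && !rhs.valueless_by_exception ());
  if (lhs.index () != rhs.index ())
    return false;
  return std::visit (
    [] (const auto& a, const auto& b) -> bool
    {
      using A = std::decay_t<decltype (a)>;
      using B = std::decay_t<decltype (b)>;
      if constexpr (std::is_same_v<A, B>)
        return a == b;
      else
        return false; // Not reached at run time: the indices already match.
    },
    lhs, rhs);
}
```

The if constexpr keeps mismatched pairs such as int == std::string from ever being instantiated, so the alternatives only need to be comparable with themselves.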

2019-03-04  Ville Voutilainen  

Rewrite variant.
Also PR libstdc++/85517
* include/std/variant (__do_visit): New.
(__variant_cast): Likewise.
(__variant_cookie): Likewise.
(__erased_*): Remove.
(_Variant_storage::_S_vtable): Likewise.
(_Variant_storage::__M_reset_impl): Adjust to use __do_visit.
(_Variant_storage::__M_reset): Adjust.
(_Copy_ctor_base(const _Copy_ctor_base&)): Adjust to use __do_visit.
(_Move_ctor_base(_Move_ctor_base&&)): Likewise.
(_Move_ctor_base::__M_destructive_copy): New.
(_Copy_assign_base::operator=): Adjust to use __do_visit.
(_Copy_assign_alias): Adjust to check both copy assignment
and copy construction for triviality.
(_Move_assign_base::operator=): Adjust to use __do_visit.
(_Multi_array): Add support for visitors that accept and return
a __variant_cookie.
(__gen_vtable_impl::_S_apply_all_alts): Likewise.
(__gen_vtable_impl::_S_apply_single_alt): Likewise.
(__gen_vtable_impl::__element_by_index_or_cookie): New. Generate
a __variant_cookie temporary for a variant that is valueless and..
(__gen_vtable_impl::__visit_invoke): ..adjust here.
(__gen_vtable::_Array_type): Conditionally make space for
the __variant_cookie visitor case.
(relops): Adjust to use __do_visit.
(variant): Add __variant_cast as a friend.
(variant::emplace): Use _M_reset() instead of self-destruction.
(visit): Reimplement in terms of __do_visit.
* testsuite/20_util/variant/compile.cc: Adjust.
* testsuite/20_util/variant/run.cc: Likewise.
diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 89deb14..8b1f407 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -138,6 +138,19 @@ namespace __variant
 constexpr variant_alternative_t<_Np, variant<_Types...>> const&&
 get(const variant<_Types...>&&);
 
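+  template<typename _Visitor, typename... _Variants>
+    constexpr decltype(auto)
+    __do_visit(_Visitor&& __visitor, _Variants&&... __variants);
+
+  template <typename... _Types, typename _Tp>
+    decltype(auto) __variant_cast(_Tp&& __rhs)
+    {
+      if constexpr (is_const_v<remove_reference_t<_Tp>>)
+        return static_cast<const variant<_Types...>&>(__rhs);
+      else
+        return static_cast<variant<_Types...>&>(__rhs);
+    }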
+
 namespace __detail
 {
 namespace __variant
@@ -155,6 +168,9 @@ namespace __variant
   std::integral_constant
 	? 0 : __index_of_v<_Tp, _Rest...> + 1> {};
 
+  // used for raw visitation
+  struct __variant_cookie {};
+
   // _Uninitialized is guaranteed to be a literal type, even if T is not.
   // We have to do this, because [basic.types]p10.5.3 (n4606) is not implemented
   // yet. When it's implemented, _Uninitialized can be changed to the alias
@@ -236,63 +252,6 @@ namespace __variant
 			  std::forward<_Variant>(__v)._M_u);
 }
 
-  // Various functions as "vtable" entries, where those vtables are used by
-  // polymorphic operations.
-  template
-void
-__erased_ctor(void* __lhs, void* __rhs)
-{
-  using _Type = remove_reference_t<_Lhs>;
-  ::new (__lhs) _Type(__variant::__ref_cast<_Rhs>(__rhs));
-}
-
-  template
-void
-__erased_dtor(_Variant&& __v)
-{ std::_Destroy(std::__addressof(__variant::__get<_Np>(__v))); }
-
-  template
-void
-__erased_assign(void* __lhs, void* __rhs)
-{
-  __variant::__ref_cast<_Lhs>(__lhs) = __variant::__ref_cast<_Rhs>(__rhs);
-}
-
-  template
-void
-__erased_swap(void* __lhs, void* __rhs)
-{
-  using std::swap;
-  swap(__variant::__ref_cast<_Lhs>(__lhs),
-	   __variant::__ref_cast<_Rhs>(__rhs));
-}
-
-#define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP, __NAME) \
-  template \
-constexpr bool \
-__erased_##__NAME(const _Variant& __lhs, const _Variant& __rhs) \
-{ \
-  return __variant::__get<_Np>(std::forward<_Variant>(__lhs)) \
-	  __OP __variant::__get<_Np>(std::forward<_Variant>(__rhs)); \
-}
-
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<, less)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<=, less_equal)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(==, equal)
-  _V

Re: [v3 PATCH, RFC] Rewrite variant. Also PR libstdc++/85517

2019-03-03 Thread Ville Voutilainen

On Mon, 4 Mar 2019 at 01:26, Ville Voutilainen wrote:
> I consider variant to no longer be in the state of sin after this.

Sigh, except for the places where it self-destructs or placement-news
over things that it shouldn't. That's hopefully
the next bit that I'll rectify, Real Soon Now.

Ping Re: [PATCH] PR c/43673 - Incorrect warning in dfp printf.

2019-03-03 Thread Xiong Hu Luo

Ping:
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01949.html

Thanks
Xionghu

On 2019/2/26 9:13 AM, luo...@linux.ibm.com wrote:
> From: Xiong Hu Luo 
> 
> The dfp printf/scanf entries for Ha/HA, Da/DA and DDa/DDA are not set
> properly, which causes an incorrect warning:
> "use of 'D' length modifier with 'a' type character".
> 
> Regression-tested on powerpc64le-linux, OK for trunk and gcc-8?
> 
> gcc/c-family/ChangeLog:
> 
> 2019-02-25  Xiong Hu Luo  
> 
>   PR c/43673
>   * c-format.c (print_char_table, scan_char_table): Replace BADLEN with
>   TEX_D32, TEX_D64 or TEX_D128.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-02-25  Xiong Hu Luo  
> 
>   PR c/43673
>   * gcc.dg/format/dfp-printf-1.c: Add tests for the a/A conversions.
>   * gcc.dg/format/dfp-scanf-1.c: Likewise.
> ---
>  gcc/c-family/c-format.c|  4 ++--
>  gcc/testsuite/gcc.dg/format/dfp-printf-1.c | 28 ++--
>  gcc/testsuite/gcc.dg/format/dfp-scanf-1.c  | 22 --
>  3 files changed, 48 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
> index 9b48ee3..af33ef9 100644
> --- a/gcc/c-family/c-format.c
> +++ b/gcc/c-family/c-format.c
> @@ -674,7 +674,7 @@ static const format_char_info print_char_table[] =
>    { "n",   1, STD_C89, { T89_I,   T99_SC,  T89_S,   T89_L,   T9L_LL,  BADLEN,  T99_SST, T99_PD,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",  "W",  NULL },
>    /* C99 conversion specifiers.  */
>    { "F",   0, STD_C99, { T99_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64, TEX_D128 }, "-wp0 +#'I", "",   NULL },
> -  { "aA",  0, STD_C99, { T99_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +#",   "",   NULL },
> +  { "aA",  0, STD_C99, { T99_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64,  TEX_D128 }, "-wp0 +#",   "",   NULL },
>    /* X/Open conversion specifiers.  */
>    { "C",   0, STD_EXT, { TEX_WI,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w",        "",   NULL },
>    { "S",   1, STD_EXT, { TEX_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",       "R",  NULL },
> @@ -847,7 +847,7 @@ static const format_char_info scan_char_table[] =
>    { "n", 1, STD_C89, { T89_I,   T99_SC,  T89_S,   T89_L,   T9L_LL,  BADLEN,  T99_SST, T99_PD,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "", "W",   NULL },
>    /* C99 conversion specifiers.  */
>    { "F",   1, STD_C99, { T99_F,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64, TEX_D128 }, "*w'",  "W",   NULL },
> -  { "aA",   1, STD_C99, { T99_F,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "*w'",  "W",   NULL },
> +  { "aA",   1, STD_C99, { T99_F,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32,  TEX_D64,  TEX_D128 }, "*w'",  "W",   NULL },
>    /* X/Open conversion specifiers.  */
>    { "C", 1, STD_EXT, { TEX_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "*mw",   "W",   NULL },
>    { "S", 1, STD_EXT, { TEX_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "*amw",  "W",   NULL },
> diff --git a/gcc/testsuite/gcc.dg/format/dfp-printf-1.c b/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> index e92f161..a290895 100644
> --- a/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> +++ b/gcc/testsuite/gcc.dg/format/dfp-printf-1.c
> @@ -17,6 +17,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i, unsigned int j,
>  
>/* Check lack of warnings for valid usage.  */
>  
> +  printf ("%Ha\n", x);
> +  printf ("%HA\n", x);
>printf ("%Hf\n", x);
>printf ("%HF\n", x);
>printf ("%He\n", x);
> @@ -24,6 +26,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i, unsigned int j,
>printf ("%Hg\n", x);
>printf ("%HG\n", x);
>  
> +  printf ("%Da\n", y);
> +  printf ("%DA\n", y);
>printf ("%Df\n", y);
>printf ("%DF\n", y);
>printf ("%De\n", y);
> @@ -31,6 +35,8 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i, unsigned int j,
>printf ("%Dg\n", y);
>printf ("%DG\n", y);
>  
> +  printf ("%DDa\n", z);
> +  printf ("%DDA\n", z);
>printf ("%DDf\n", z);
>printf ("%DDF\n", z);
>printf ("%DDe\n", z);
> @@ -43,12 +49,16 @@ foo (_Decimal32 x, _Decimal64 y, _Decimal128 z, int i, unsigned int j,
>  
>/* Check warnings for type mismatches.  */
>  
> +  printf ("%Ha\n", y);   /* { dg-warning "expects argument" "bad use of %H" } */
> +  printf ("%HA\n", y);   /* { dg-warning "expects argument" "bad use of %H" } */
>printf ("%Hf\n", y);   /*

Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p

2019-03-03 Thread JunMa

--
From: Segher Boessenkool
Sent: Friday, March 1, 2019 22:18
To: JunMa
Cc: gcc-patches
Subject: Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p


Hi!

On Fri, Mar 01, 2019 at 04:39:38PM +0800, JunMa wrote:
>Since MAX_INLINE_INSNS_AUTO should be below or equal to
>MAX_INLINE_INSNS_SINGLE (see params.def), there is no need
>to do a second inlining limit check on growth when the function is not
>declared inline; this patch removes it.
>Bootstrapped and tested on x86_64-unknown-linux-gnu, is it ok for trunk?

Your mail subject says this is for GCC 10, but you are asking for GCC 9
now; which is it?


Sorry. Since we are in GCC 9 stage 4 now and this is not a regression fix,
it's for GCC 10.

> 2019-03-01  Jun Ma  
> 
> *ipa-inline.c(want_inline_small_function_p): Remove
> redundant growth check when function not declared 
> inline

Some spaces were lost in the first line.  Trailing space.  Sentences
should end with a full stop (or similar).

Don't send patches (or pretty much anything else) as
application/octet-stream attachments.


Segher


Sorry again for this. Here is the full change.

JunMa


2019-03-01  Jun Ma  

* ipa-inline.c(want_inline_small_function_p): Remove
redundant growth check when function not declared 
inline.

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 360c3de..ff9bc9e 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -837,15 +837,11 @@ want_inline_small_function_p (struct cgraph_edge *e, bool report)
 ? MAX (MAX_INLINE_INSNS_AUTO,
MAX_INLINE_INSNS_SINGLE)
 : MAX_INLINE_INSNS_AUTO)
-  && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
+  && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup)
+  && growth_likely_positive (callee, growth))
{
- /* growth_likely_positive is expensive, always test it last.  */
-  if (growth >= MAX_INLINE_INSNS_SINGLE
- || growth_likely_positive (callee, growth))
-   {
- e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
- want_inline = false;
-   }
+ e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
+ want_inline = false;
}
   /* If call is cold, do not inline when function body would grow. */
   else if (!e->maybe_hot_p ()

Re: [PATCH GCC10] ipa-inline.c: Trivial fix on function not declared inline check in want_inline_small_function_p

2019-03-03 Thread JunMa


Hi

  Please ignore the previous mail.

On 2019/3/1 10:17 PM, Segher Boessenkool wrote:

Hi!

On Fri, Mar 01, 2019 at 04:39:38PM +0800, JunMa wrote:

Since MAX_INLINE_INSNS_AUTO should be below or equal to
MAX_INLINE_INSNS_SINGLE (see params.def), there is no need
to do a second inlining limit check on growth when the function is not
declared inline; this patch removes it.
Bootstrapped and tested on x86_64-unknown-linux-gnu, is it ok for trunk?

Your mail subject says this is for GCC 10, but you are asking for GCC 9
now; which is it?


Since we are in GCC 9 stage 4 now and this is not a regression fix,
it's for GCC 10.


2019-03-01  Jun Ma  

 *ipa-inline.c(want_inline_small_function_p): Remove
 redundant growth check when function not declared
 inline

Some spaces were lost in the first line.  Trailing space.  Sentences
should end with a full stop (or similar).

Don't send patches (or pretty much anything else) as
application/octet-stream attachments.


Segher


Sorry again for this. Here is the full change.

JunMa

2019-03-01  Jun Ma

* ipa-inline.c(want_inline_small_function_p): Remove
redundant growth check when function not declared
inline.


diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 360c3de..ff9bc9e 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -837,15 +837,11 @@ want_inline_small_function_p (struct cgraph_edge *e, bool report)
 ? MAX (MAX_INLINE_INSNS_AUTO,
MAX_INLINE_INSNS_SINGLE)
 : MAX_INLINE_INSNS_AUTO)
-  && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
+  && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup)
+  && growth_likely_positive (callee, growth))
{
- /* growth_likely_positive is expensive, always test it last.  */
-  if (growth >= MAX_INLINE_INSNS_SINGLE
- || growth_likely_positive (callee, growth))
-   {
- e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
- want_inline = false;
-   }
+ e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
+ want_inline = false;
}
   /* If call is cold, do not inline when function body would grow. */
   else if (!e->maybe_hot_p ()

