Re: [PATCH] i386: Improve [QH]Imode rotates with masked shift count [PR99405]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 5, 2021 at 9:40 PM Jakub Jelinek  wrote:
>
> Hi!
>
> The following testcase shows that while we nicely optimize away the
> useless and? of shift count before rotation for [SD]Imode rotates,
> we don't do that for [QH]Imode.
>
> The following patch optimizes that by using the right iterator on those
> 4 patterns.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.  Ok for trunk?
> Or just GCC12?
>
> 2021-03-05  Jakub Jelinek  
>
> PR target/99405
> * config/i386/i386.md (*3_mask, *3_mask_1):
> For any_rotate define_insn_split and following splitters, use
> SWI iterator instead of SWI48.
>
> * gcc.target/i386/pr99405.c: New test.

I'm not sure I remember why I left out QI and HImode from rotates (and
shifts) when these patterns were introduced, but the testcase shows
that they are effective. Since this is not a regression, OK for gcc-12.

Thanks,
Uros.
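
For readers following along, here is a minimal sketch of why the mask on the
count is redundant for narrow rotates. It restates the idiom from the
pr99405.c testcase as C++ constexpr functions; the helper names (rotl8,
rotl16, rotate_count_wraps) are made up for this illustration and are not
part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate-left idiom as used in the pr99405.c testcase: the count is masked
// to the operand width, so both shifts stay in range.
constexpr std::uint8_t rotl8(std::uint8_t x, unsigned y) {
  return static_cast<std::uint8_t>((x << (y & 7)) | (x >> (-y & 7)));
}

constexpr std::uint16_t rotl16(std::uint16_t x, unsigned y) {
  return static_cast<std::uint16_t>((x << (y & 15)) | (x >> (-y & 15)));
}

// Compile-time spot checks.
static_assert(rotl8(0x81, 1) == 0x03, "bit 7 wraps around to bit 0");
static_assert(rotl16(0x8001, 4) == 0x0018, "top nibble wraps to the bottom");

// Runtime check: counts that differ by a multiple of the width rotate
// identically, so masking the count cannot change the result of a rotate.
inline bool rotate_count_wraps() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 64; ++y)
      if (rotl8(static_cast<std::uint8_t>(x), y)
          != rotl8(static_cast<std::uint8_t>(x), y % 8))
        return false;
  return true;
}
```

Because a rotate by y and a rotate by y modulo the width agree, a pattern
that matches the masked count directly can drop the separate and
instruction, which is what the testcase scans for.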

> --- gcc/config/i386/i386.md.jj  2021-03-03 10:02:27.871589603 +0100
> +++ gcc/config/i386/i386.md 2021-03-05 14:00:35.768378973 +0100
> @@ -11951,9 +11951,9 @@ (define_expand "3"
>
>  ;; Avoid useless masking of count operand.
>  (define_insn_and_split "*3_mask"
> -  [(set (match_operand:SWI48 0 "nonimmediate_operand")
> -   (any_rotate:SWI48
> - (match_operand:SWI48 1 "nonimmediate_operand")
> +  [(set (match_operand:SWI 0 "nonimmediate_operand")
> +   (any_rotate:SWI
> + (match_operand:SWI 1 "nonimmediate_operand")
>   (subreg:QI
> (and:SI
>   (match_operand:SI 2 "register_operand" "c")
> @@ -11967,15 +11967,15 @@ (define_insn_and_split "*3_m
>"&& 1"
>[(parallel
>   [(set (match_dup 0)
> -  (any_rotate:SWI48 (match_dup 1)
> -(match_dup 2)))
> +  (any_rotate:SWI (match_dup 1)
> +  (match_dup 2)))
>(clobber (reg:CC FLAGS_REG))])]
>"operands[2] = gen_lowpart (QImode, operands[2]);")
>
>  (define_split
> -  [(set (match_operand:SWI48 0 "register_operand")
> -   (any_rotate:SWI48
> - (match_operand:SWI48 1 "const_int_operand")
> +  [(set (match_operand:SWI 0 "register_operand")
> +   (any_rotate:SWI
> + (match_operand:SWI 1 "const_int_operand")
>   (subreg:QI
> (and:SI
>   (match_operand:SI 2 "register_operand")
> @@ -11984,14 +11984,14 @@ (define_split
> == GET_MODE_BITSIZE (mode) - 1"
>   [(set (match_dup 4) (match_dup 1))
>(set (match_dup 0)
> -   (any_rotate:SWI48 (match_dup 4)
> -(subreg:QI (match_dup 2) 0)))]
> +   (any_rotate:SWI (match_dup 4)
> +  (subreg:QI (match_dup 2) 0)))]
>   "operands[4] = gen_reg_rtx (mode);")
>
>  (define_insn_and_split "*3_mask_1"
> -  [(set (match_operand:SWI48 0 "nonimmediate_operand")
> -   (any_rotate:SWI48
> - (match_operand:SWI48 1 "nonimmediate_operand")
> +  [(set (match_operand:SWI 0 "nonimmediate_operand")
> +   (any_rotate:SWI
> + (match_operand:SWI 1 "nonimmediate_operand")
>   (and:QI
> (match_operand:QI 2 "register_operand" "c")
> (match_operand:QI 3 "const_int_operand"
> @@ -12004,14 +12004,14 @@ (define_insn_and_split "*3_m
>"&& 1"
>[(parallel
>   [(set (match_dup 0)
> -  (any_rotate:SWI48 (match_dup 1)
> -(match_dup 2)))
> +  (any_rotate:SWI (match_dup 1)
> +  (match_dup 2)))
>(clobber (reg:CC FLAGS_REG))])])
>
>  (define_split
> -  [(set (match_operand:SWI48 0 "register_operand")
> -   (any_rotate:SWI48
> - (match_operand:SWI48 1 "const_int_operand")
> +  [(set (match_operand:SWI 0 "register_operand")
> +   (any_rotate:SWI
> + (match_operand:SWI 1 "const_int_operand")
>   (and:QI
> (match_operand:QI 2 "register_operand")
> (match_operand:QI 3 "const_int_operand"]
> @@ -12019,7 +12019,7 @@ (define_split
>== GET_MODE_BITSIZE (mode) - 1"
>   [(set (match_dup 4) (match_dup 1))
>(set (match_dup 0)
> -   (any_rotate:SWI48 (match_dup 4) (match_dup 2)))]
> +   (any_rotate:SWI (match_dup 4) (match_dup 2)))]
>   "operands[4] = gen_reg_rtx (mode);")
>
>  ;; Implement rotation using two double-precision
> --- gcc/testsuite/gcc.target/i386/pr99405.c.jj  2021-03-05 13:40:20.334860937 +0100
> +++ gcc/testsuite/gcc.target/i386/pr99405.c 2021-03-05 13:39:51.131185009 +0100
> @@ -0,0 +1,23 @@
> +/* PR target/99405 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mtune=generic -fomit-frame-pointer" } */
> +/* { dg-final { scan-assembler-not "\tand\[bl]\t\\\$" } } */
> +
> +unsigned char f1 (unsigned char x, unsigned y) { return (x << (y & 7)) | (x >> (-y & 7)); }
> +unsigned short f2 (unsigned short x, unsigned y) { return (x << (y & 15)) | (x >> (-y & 15)); }
> +unsigned int f3 (unsigned int x, unsigned y) { return (x << (y & 31)) | (x >> (-y & 31)); }
> +unsign

Re: [PATCH] libstdc++: Improve std::rot[lr] [PR99396]

2021-03-06 Thread Jonathan Wakely via Gcc-patches
On Fri, 5 Mar 2021, 22:32 Jakub Jelinek via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Hi!
>
> As can be seen on:
> #include <bit>
>
> unsigned char f1 (unsigned char x, int y) { return std::rotl (x, y); }
> unsigned char f2 (unsigned char x, int y) { return std::rotr (x, y); }
> unsigned short f3 (unsigned short x, int y) { return std::rotl (x, y); }
> unsigned short f4 (unsigned short x, int y) { return std::rotr (x, y); }
> unsigned int f5 (unsigned int x, int y) { return std::rotl (x, y); }
> unsigned int f6 (unsigned int x, int y) { return std::rotr (x, y); }
> unsigned long int f7 (unsigned long int x, int y) { return std::rotl (x,
> y); }
> unsigned long int f8 (unsigned long int x, int y) { return std::rotr (x,
> y); }
> unsigned long long int f9 (unsigned long long int x, int y) { return
> std::rotl (x, y); }
> unsigned long long int f10 (unsigned long long int x, int y) { return
> std::rotr (x, y); }
> //unsigned __int128 f11 (unsigned __int128 x, int y) { return std::rotl
> (x, y); }
> //unsigned __int128 f12 (unsigned __int128 x, int y) { return std::rotr
> (x, y); }
>
> constexpr auto a = std::rotl (1234U, 0);
> constexpr auto b = std::rotl (1234U, 5);
> constexpr auto c = std::rotl (1234U, -5);
> constexpr auto d = std::rotl (1234U, -__INT_MAX__ - 1);
> the current <bit> definitions of std::__rot[lr] aren't pattern recognized
> as rotates, they are too long/complex for that, starting with signed
> modulo,
> special case for 0 and different cases for positive and negative.
>
> For types with power of two bits the following patch adds definitions that
> the compiler can pattern recognize and turn e.g. on x86_64 into
> ro[lr][bwlq]
> instructions.  For weirdo types like unsigned __int20 etc. it keeps the
> current definitions.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>

OK, thanks.
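
The libstdc++ change itself is not quoted in this reply, so the following is
only a sketch of the kind of definition the description refers to, written
for a fixed 32-bit width. The names rotl32 and rotr32 are hypothetical. The
signed count is converted to unsigned before negating and masking, which
keeps every step well defined, including for the most negative int.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Branch-free rotate for a power-of-two width (32 here). No signed modulo,
// no special case for 0, no separate paths for positive and negative counts.
constexpr std::uint32_t rotl32(std::uint32_t x, int s) {
  return (x << (static_cast<unsigned>(s) & 31u))
         | (x >> ((0u - static_cast<unsigned>(s)) & 31u));
}

constexpr std::uint32_t rotr32(std::uint32_t x, int s) {
  return (x >> (static_cast<unsigned>(s) & 31u))
         | (x << ((0u - static_cast<unsigned>(s)) & 31u));
}

// The same four cases as the constexpr lines in the quoted example.
static_assert(rotl32(1234u, 0) == 1234u, "zero count is the identity");
static_assert(rotl32(1234u, 5) == 39488u, "1234 * 32, nothing wraps");
static_assert(rotl32(1234u, -5) == rotr32(1234u, 5), "negative count");
static_assert(rotl32(1234u, INT_MIN) == 1234u, "INT_MIN is a multiple of 32");
```

A negative left-rotate count behaves as a right rotate by the same amount,
and the masked form is the shape a compiler can match as a single rotate.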





Re: [PATCH] i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 5, 2021 at 9:51 PM Jakub Jelinek  wrote:
>
> Hi!
>
> As I wrote in the mail with the previous PR99321 fix, we have various
> bugs where we emit instructions that need avx512bw and avx512vl
> ISAs when compiling with -mavx512vl -mno-avx512bw.
>
> Without the following patch,
> /* PR target/99321 */
> /* Would need some effective target for GNU as that supports 
> -march=+noavx512bw etc. */
> /* { dg-do assemble } */
> /* { dg-options "-O2 -mavx512vl -mno-avx512bw -Wa,-march=+noavx512bw" } */
>
> #include 
>
> typedef unsigned char V1 __attribute__((vector_size (16)));
> typedef unsigned char V2 __attribute__((vector_size (32)));
> typedef unsigned short V3 __attribute__((vector_size (16)));
> typedef unsigned short V4 __attribute__((vector_size (32)));
>
> void f1 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
> void f2 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
> void f3 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
> void f4 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a += b; __asm ("" : : "v" (a)); }
> void f5 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
> void f6 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
> void f7 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
> void f8 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a -= b; __asm ("" : : "v" (a)); }
> void f9 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a *= b; __asm ("" : : "v" (a)); }
> void f10 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a *= b; __asm ("" : : "v" (a)); }
> void f11 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V1) _mm_min_epu8 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f12 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V2) _mm256_min_epu8 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f13 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V3) _mm_min_epu16 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f14 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V4) _mm256_min_epu16 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f15 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V1) _mm_min_epi8 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f16 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V2) _mm256_min_epi8 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f17 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V3) _mm_min_epi16 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f18 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V4) _mm256_min_epi16 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f19 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V1) _mm_max_epu8 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f20 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V2) _mm256_max_epu8 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f21 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V3) _mm_max_epu16 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f22 (void) { register V4 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V4) _mm256_max_epu16 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f23 (void) { register V1 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V1) _mm_max_epi8 ((__m128i) a, (__m128i) b); 
> __asm ("" : : "v" (a)); }
> void f24 (void) { register V2 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V2) _mm256_max_epi8 ((__m256i) a, (__m256i) 
> b); __asm ("" : : "v" (a)); }
> void f25 (void) { register V3 a __asm ("%xmm16"), b __asm ("%xmm17"); __asm 
> ("" : "=v" (a), "=v" (b)); a = (V3) _mm_max_epi16 ((__m128i) a, (__m128i) b); 
> __asm

Re: [PATCH] i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-06 Thread Jakub Jelinek via Gcc-patches
On Sat, Mar 06, 2021 at 11:19:15AM +0100, Uros Bizjak wrote:
> > We already have Yw constraint which is equivalent to v for
> > -mavx512bw -mavx512vl and to nothing otherwise, so for
> > the instructions that need both we need to use xYw and
> > v for modes that don't need that.
> 
> Perhaps we should introduce another Y... constraint to return correct
> SSE regset based on TARGET_... flags, instead of using compound xYw? I
> think that introducing new constraint is the established approach we
> should follow. The new mode_attr looks OK to me.

One possibility would be to change the meaning of Yw, because it
is an internal undocumented constraint and all uses in GCC currently use it
as xYw:
constraints.md:(define_register_constraint "Yw"
mmx.md:  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
mmx.md:  (match_operand:V4HI 1 "register_mmxmem_operand" "ym,xYw")
mmx.md:  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
mmx.md: (match_operand:SI 1 "register_operand" "0,xYw"]
Would that be ok?

If not, I'll add
(define_register_constraint "Yl"
 "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : 
NO_REGS"
 "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW 
with TARGET_AVX512VL target, otherwise any SSE register.")

Jakub
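
To make the proposed selection logic easier to read, here is a small sketch
that models the "Yl" expression above outside of GCC. RegClass, TargetFlags
and byte_word_sse_regs are hypothetical stand-ins for the real register
classes and TARGET_* macros; only the shape of the conditional is taken from
the constraint text in this mail.

```cpp
#include <cassert>

// Hypothetical stand-ins for NO_REGS, SSE_REGS and ALL_SSE_REGS.
enum class RegClass { NoRegs, SseRegs, AllSseRegs };

// Hypothetical stand-in for the TARGET_SSE / TARGET_AVX512BW /
// TARGET_AVX512VL flags.
struct TargetFlags {
  bool sse;
  bool avx512bw;
  bool avx512vl;
};

// Same conditional as the proposed constraint: the full EVEX register set
// needs both AVX512BW and AVX512VL, otherwise fall back to the plain SSE
// registers, or to nothing when SSE is disabled.
constexpr RegClass byte_word_sse_regs(TargetFlags t) {
  return (t.avx512bw && t.avx512vl) ? RegClass::AllSseRegs
         : t.sse                    ? RegClass::SseRegs
                                    : RegClass::NoRegs;
}

// -mavx512vl -mno-avx512bw, the configuration from PR99321.
static_assert(byte_word_sse_regs({true, false, true}) == RegClass::SseRegs,
              "no upper registers without AVX512BW");
static_assert(byte_word_sse_regs({true, true, true}) == RegClass::AllSseRegs,
              "both ISAs enable the full set");
```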



Re: [pushed] c++: Fix class NTTP constness handling [PR98810]

2021-03-06 Thread Eric Botcazou
> GCC 9 doesn't have the "c++20" target yet.
> 
>   * g++.dg/cpp2a/nontype-class-defarg1.C: Use target c++2a.

Thanks!

-- 
Eric Botcazou




Re: [PATCH] i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Sat, Mar 6, 2021 at 11:34 AM Jakub Jelinek  wrote:
>
> On Sat, Mar 06, 2021 at 11:19:15AM +0100, Uros Bizjak wrote:
> > > We already have Yw constraint which is equivalent to v for
> > > -mavx512bw -mavx512vl and to nothing otherwise, so for
> > > the instructions that need both we need to use xYw and
> > > v for modes that don't need that.
> >
> > Perhaps we should introduce another Y... constraint to return correct
> > SSE regset based on TARGET_... flags, instead of using compound xYw? I
> > think that introducing new constraint is the established approach we
> > should follow. The new mode_attr looks OK to me.
>
> One possibility would be to change the meaning of Yw, because it
> is an internal undocumented constraint and all uses in GCC currently use it
> as xYw:
> constraints.md:(define_register_constraint "Yw"
> mmx.md:  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
> mmx.md:  (match_operand:V4HI 1 "register_mmxmem_operand" "ym,xYw")
> mmx.md:  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
> mmx.md: (match_operand:SI 1 "register_operand" "0,xYw"]
> Would that be ok?

Yes, this is an excellent idea.

Uros.

> If not, I'll add
> (define_register_constraint "Yl"
>  "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : 
> NO_REGS"
>  "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for 
> AVX512BW with TARGET_AVX512VL target, otherwise any SSE register.")
>
> Jakub
>


Re: [PATCH] libgcov: Fix build on Darwin [PR99406]

2021-03-06 Thread Jeff Law via Gcc-patches



On 3/5/21 1:41 PM, Jakub Jelinek via Gcc-patches wrote:
> On Fri, Mar 05, 2021 at 04:19:47PM +, Iain Sandoe wrote:
>> Jakub Jelinek via Gcc-patches  wrote:
>>
>>> As reported, bootstrap currently fails on older Darwin because
>>> MAP_ANONYMOUS
>>> is not defined.
>>>
>>> The following is what gcc/system.h does, so I think it should work for
>>> libgcov.
>>> Build tested on x86_64-linux, ok for trunk?
>> bootstrap succeeded r11-7524 + this patch on Darwin11.
> And bootstrap/regtest succeeded on x86_64-linux and i686-linux too.
>
>>> 2021-03-05  Jakub Jelinek  
>>>
>>> PR gcov-profile/99406
>>> * libgcov.h (MAP_FAILED, MAP_ANONYMOUS): If HAVE_SYS_MMAN_H is
>>> defined, define these macros if not defined already.
OK
jeff
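
The libgcov.h hunk is not quoted in this thread, so the following is only a
sketch of the usual fallback pattern the ChangeLog describes, assuming a
POSIX host that provides <sys/mman.h> and at least MAP_ANON. The helper
anon_map_roundtrip is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>  // assumption: POSIX host

// Provide the macros only where the system headers do not.
#ifndef MAP_FAILED
# define MAP_FAILED ((void *) -1)
#endif
#if !defined (MAP_ANONYMOUS) && defined (MAP_ANON)
# define MAP_ANONYMOUS MAP_ANON
#endif

// Map an anonymous region, touch its first byte, and unmap it again.
// Returns true only if every step succeeded.
inline bool anon_map_roundtrip(std::size_t len) {
  void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return false;
  unsigned char *bytes = static_cast<unsigned char *>(p);
  bytes[0] = 42;
  bool ok = bytes[0] == 42;
  return munmap(p, len) == 0 && ok;
}
```

On hosts whose headers only spell the flag MAP_ANON, the second fallback
lets code that uses MAP_ANONYMOUS compile unchanged.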



[committed][OG10] DWARF: late code range fixup

2021-03-06 Thread Andrew Stubbs


This patch fixes up the DWARF code ranges for offload debugging, again.

This time it defers the changes until most other DWARF generation has 
occurred, because the previous method was causing ICEs on some testcases.
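
As a rough illustration of the deferral described above, the sketch below
records candidate DIEs on a list while they are generated and only patches
the parents in one late pass. Die, Pending and the function names other than
fixup_notional_parents are simplified, hypothetical stand-ins and not the
dwarf2out.c data structures.

```cpp
#include <cassert>
#include <string>

// Hypothetical, much simplified stand-in for a DWARF DIE.
struct Die {
  Die *parent = nullptr;
  bool is_subprogram = true;
  std::string low_pc, high_pc;  // labels; empty means "no code range"
};

// Deferred work list, playing the role of notional_parents_list.
struct Pending {
  Die *die;
  Pending *next;
};
Pending *pending_list = nullptr;

// Called while DIEs are generated: only remember the kernel, change nothing.
inline void record_offload_kernel(Die *kernel) {
  pending_list = new Pending{kernel, pending_list};
}

// Called once near the end: give range-less parents the child's labels.
inline void fixup_notional_parents() {
  while (Pending *node = pending_list) {
    pending_list = node->next;
    Die *parent = node->die->parent;
    if (parent && parent->is_subprogram && parent->low_pc.empty()) {
      parent->low_pc = node->die->low_pc;
      parent->high_pc = node->die->high_pc;
    }
    delete node;
  }
}

// One parent without code (offload side) and one that really exists.
inline bool demo_fixup() {
  Die notional;
  Die real;
  real.low_pc = "LFB0";
  real.high_pc = "LFE0";
  Die k1;
  k1.parent = &notional;
  k1.low_pc = "LFB1";
  k1.high_pc = "LFE1";
  Die k2;
  k2.parent = &real;
  k2.low_pc = "LFB2";
  k2.high_pc = "LFE2";
  record_offload_kernel(&k1);
  record_offload_kernel(&k2);
  fixup_notional_parents();
  return notional.low_pc == "LFB1" && notional.high_pc == "LFE1"
         && real.low_pc == "LFB0" && pending_list == nullptr;
}
```

Deciding at the end means a parent that does have its own range by then is
left untouched, which is the host-side case the eager approach got wrong.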


This patch will be proposed for mainline in stage 1.

Andrew

DWARF: late code range fixup

Ensure that the parent DWARF subprograms of offload kernel functions have a
code range, and are therefore not discarded by GDB.  This is only necessary
when the parent function does not actually exist in the final binary, which is
commonly the case within the offload device's binary.

This patch replaces 808bdf1bb29 and fdcb23540a2.  It should be squashed with
those before being posted upstream.

gcc/

	* gcc/dwarf2out.c (notional_parents_list): New file variable.
	(gen_subprogram_die): Record offload kernel functions in
	notional_parents_list.
	(fixup_notional_parents): New function.
	(dwarf2out_finish): Call fixup_notional_parents.
	(dwarf2out_c_finalize): Reset notional_parents_list.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 1d7cb6273f0..d6796caba3e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3427,6 +3427,12 @@ static GTY(()) limbo_die_node *limbo_die_list;
DW_AT_{,MIPS_}linkage_name once their DECL_ASSEMBLER_NAMEs are set.  */
 static GTY(()) limbo_die_node *deferred_asm_name;
 
+/* A list of DIEs for which we may have to add a notional code range to the
+   parent DIE.  This happens for parents of nested offload kernels, and is
+   necessary because the parents don't exist on the offload target, yet GDB
+   expects parents of real functions to also appear to exist.  */
+static GTY(()) limbo_die_node *notional_parents_list;
+
 struct dwarf_file_hasher : ggc_ptr_hash
 {
   typedef const char *compare_type;
@@ -23085,34 +23091,25 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 	  dw_fde_ref fde = fun->fde;
 	  if (fde->dw_fde_begin)
 	{
-	  dw_attr_node *low = get_AT (subr_die, DW_AT_low_pc);
-	  dw_attr_node *high = get_AT (subr_die, DW_AT_high_pc);
-	  if (low && high)
-		{
-		  /* Replace the existing value, it will have come from
-		 the "omp target entrypoint" case below.  */
-		  free (low->dw_attr_val.v.val_lbl_id);
-		  low->dw_attr_val.v.val_lbl_id = xstrdup (fde->dw_fde_begin);
-		  free (high->dw_attr_val.v.val_lbl_id);
-		  high->dw_attr_val.v.val_lbl_id = xstrdup (fde->dw_fde_end);
-		}
-	  else
-		/* We have already generated the labels.  */
-		add_AT_low_high_pc (subr_die, fde->dw_fde_begin,
-fde->dw_fde_end, false);
+	  /* We have already generated the labels.  */
+	  add_AT_low_high_pc (subr_die, fde->dw_fde_begin,
+  fde->dw_fde_end, false);
 
 	 /* Offload kernel functions are nested within a parent function
 	that doesn't actually exist within the offload object.  GDB
 		will ignore the function and everything nested within unless
-		we give it a notional code range (the values aren't
-		important, as long as they are valid).  */
+		we give the parent a code range.  We can't do it here because
+		that breaks the case where the parent actually does exist (as
+		it does on the host-side), so we defer the fixup for later.  */
 	 if (lookup_attribute ("omp target entrypoint",
-   DECL_ATTRIBUTES (decl))
-		 && subr_die->die_parent
-		 && subr_die->die_parent->die_tag == DW_TAG_subprogram
-		 && !get_AT_low_pc (subr_die->die_parent))
-	   add_AT_low_high_pc (subr_die->die_parent, fde->dw_fde_begin,
-   fde->dw_fde_end, false);
+   DECL_ATTRIBUTES (decl)))
+	   {
+		 limbo_die_node *node = ggc_cleared_alloc<limbo_die_node> ();
+		 node->die = subr_die;
+		 node->created_for = decl;
+		 node->next = notional_parents_list;
+		 notional_parents_list = node;
+	   }
 	}
 	  else
 	{
@@ -31348,6 +31345,37 @@ flush_limbo_die_list (void)
 }
 }
 
+/* Add a code range to the notional parent function (which does not actually
+   exist) so that GDB does not ignore all the child functions.  The actual
+   values do not matter, but need to be valid labels, so we simply copy those
+   from the child function.
+
+   Typically this occurs when we have an offload kernel, where the parent
+   function only exists in the host-side portion of the code.  */
+
+static void
+fixup_notional_parents (void)
+{
+  limbo_die_node *node;
+
+  while ((node = notional_parents_list))
+{
+  dw_die_ref die = node->die;
+  dw_die_ref parent = die->die_parent;
+  notional_parents_list = node->next;
+
+  if (parent
+	  && parent->die_tag == DW_TAG_subprogram
+	  && !get_AT_low_pc (parent))
+	{
+	  dw_attr_node *low = get_AT (die, DW_AT_low_pc);
+	  dw_attr_node *high = get_AT (die, DW_AT_high_pc);
+
+	  add_AT_low_high_pc (parent, AT_lbl (low), AT_lbl (high), false);
+	}
+}
+}
+
 /* Reset DIEs so we can output them again.  */
 
 static void
@@ -31378,6 +31406,9 @@ dwarf2out_finish (const char *filename)
   /* Flush out any latecomers to the limbo party.  */
   flush_limbo_die_list 

[committed][OG10] amdgcn: Fix early-debug relocations

2021-03-06 Thread Andrew Stubbs

This patch is now backported to devel/omp/gcc-10.

Andrew

On 26/11/2020 14:41, Andrew Stubbs wrote:
> This patch fixes an error in GCN mkoffload that corrupted relocations in
> the early-debug info.
>
> The code now updates the relocation code without zeroing the symbol index.
>
> Andrew




[PATCH] [AARCH64] Modify __ARM_ARCH as per latest ACLE

2021-03-06 Thread Naveen Hurugalawadi via Gcc-patches
--_002_DM6PR18MB27780C281E55729A473FF172A7969DM6PR18MB2778namp_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi,

Please find attached the patch that modifies "__ARM_ARCH" as per latest ACLE.
Built and tested the patch on aarch64-marvell-linux-gnu.
Please review the patch and let me know if it's okay.

aarch64 : Modify  __ARM_ARCH as per latest ACLE

2021-04-03  Naveen H S  

PR tree-optimization/99312

* config/aarch64/aarch64-arches.def (armv8.1-a): Modify ARCH_REV as
 per latest ACLE.
(armv8.2-a): Likewise.
(armv8.3-a): Likewise.
(armv8.4-a): Likewise.
(armv8.5-a): Likewise.
(armv8.6-a): Likewise.

Thanks,
Naveen
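
For reference, the comment in the attached patch (read from the base64
attachment below) describes the new ARCH_REV value as X * 100 + Y for
ArmvX.Y, with Armv8-a staying at 8. A minimal sketch of that encoding
follows; the helper name arch_rev is hypothetical and does not appear in the
patch.

```cpp
#include <cassert>

// Encoding described in the patch comment: ArmvX.Y maps to X * 100 + Y,
// except that a base architecture (Y == 0) keeps the plain major number.
constexpr int arch_rev(int major, int minor) {
  return minor == 0 ? major : major * 100 + minor;
}

static_assert(arch_rev(8, 0) == 8, "armv8-a is still 8");
static_assert(arch_rev(8, 1) == 801, "armv8.1-a");
static_assert(arch_rev(8, 6) == 806, "armv8.6-a");
```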


--_002_DM6PR18MB27780C281E55729A473FF172A7969DM6PR18MB2778namp_
Content-Type: application/octet-stream; name="pr99312-1.patch"
Content-Description: pr99312-1.patch
Content-Disposition: attachment; filename="pr99312-1.patch"; size=2037;
creation-date="Fri, 05 Mar 2021 03:22:55 GMT";
modification-date="Thu, 04 Mar 2021 05:56:50 GMT"
Content-Transfer-Encoding: base64

ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvYWFyY2g2NC9hYXJjaDY0LWFyY2hlcy5kZWYgYi9nY2Mv
Y29uZmlnL2FhcmNoNjQvYWFyY2g2NC1hcmNoZXMuZGVmCmluZGV4IGI3NDk3Mjc3YmI4Li5hY2Fm
NTU2NjkyNyAxMDA2NDQKLS0tIGEvZ2NjL2NvbmZpZy9hYXJjaDY0L2FhcmNoNjQtYXJjaGVzLmRl
ZgorKysgYi9nY2MvY29uZmlnL2FhcmNoNjQvYWFyY2g2NC1hcmNoZXMuZGVmCkBAIC0yNSwxOCAr
MjUsMTkgQEAKICAgIGNvbnN0YW50LiAgVGhlIENPUkUgaXMgdGhlIGlkZW50aWZpZXIgZm9yIGEg
Y29yZSByZXByZXNlbnRhdGl2ZSBvZgogICAgdGhpcyBhcmNoaXRlY3R1cmUuICBBUkNIX0lERU5U
IGlzIHRoZSBhcmNoaXRlY3R1cmUgaWRlbnRpZmllci4gIEl0IG11c3QgYmUKICAgIHVuaXF1ZSBh
bmQgYmUgc3ludGFjdGljYWxseSB2YWxpZCB0byBhcHBlYXIgYXMgcGFydCBvZiBhbiBlbnVtIGlk
ZW50aWZpZXIuCi0gICBBUkNIX1JFViBpcyBhbiBpbnRlZ2VyIHNwZWNpZnlpbmcgdGhlIGFyY2hp
dGVjdHVyZSBtYWpvciByZXZpc2lvbi4KKyAgIEFSQ0hfUkVWIGlzIGFuIGludGVnZXIgKFggKiAx
MDAgKyBZICBFLmcuIGZvciBBcm12OC4xIGl0J3MgODAxLiBFeGNlcHQgZm9yCisgICBBcm12OC1h
IHdoaWNoIGlzIHN0aWxsIDgpIHNwZWNpZnlpbmcgdGhlIGFyY2hpdGVjdHVyZSBtYWpvciByZXZp
c2lvbi4KICAgIEZMQUdTIGFyZSB0aGUgZmxhZ3MgaW1wbGllZCBieSB0aGUgYXJjaGl0ZWN0dXJl
LgogICAgRHVlIHRvIHRoZSBhc3N1bXB0aW9ucyBhYm91dCB0aGUgcG9zaXRpb25zIG9mIHRoZXNl
IGZpZWxkcyBpbiBjb25maWcuZ2NjLAogICAgdGhlIE5BTUUgc2hvdWxkIGJlIGtlcHQgYXMgdGhl
IGZpcnN0IGFyZ3VtZW50IGFuZCBGTEFHUyBhcyB0aGUgbGFzdC4gICovCiAKIEFBUkNINjRfQVJD
SCgiYXJtdjgtYSIsCSAgICAgIGdlbmVyaWMsCSAgICAgOEEsCTgsICBBQVJDSDY0X0ZMX0ZPUl9B
UkNIOCkKLUFBUkNINjRfQVJDSCgiYXJtdjguMS1hIiwgICAgIGdlbmVyaWMsCSAgICAgOF8xQSwJ
OCwgIEFBUkNINjRfRkxfRk9SX0FSQ0g4XzEpCi1BQVJDSDY0X0FSQ0goImFybXY4LjItYSIsICAg
ICBnZW5lcmljLAkgICAgIDhfMkEsCTgsICBBQVJDSDY0X0ZMX0ZPUl9BUkNIOF8yKQotQUFSQ0g2
NF9BUkNIKCJhcm12OC4zLWEiLCAgICAgZ2VuZXJpYywJICAgICA4XzNBLAk4LCAgQUFSQ0g2NF9G
TF9GT1JfQVJDSDhfMykKLUFBUkNINjRfQVJDSCgiYXJtdjguNC1hIiwgICAgIGdlbmVyaWMsCSAg
ICAgOF80QSwJOCwgIEFBUkNINjRfRkxfRk9SX0FSQ0g4XzQpCi1BQVJDSDY0X0FSQ0goImFybXY4
LjUtYSIsICAgICBnZW5lcmljLAkgICAgIDhfNUEsCTgsICBBQVJDSDY0X0ZMX0ZPUl9BUkNIOF81
KQotQUFSQ0g2NF9BUkNIKCJhcm12OC42LWEiLCAgICAgZ2VuZXJpYywJICAgICA4XzZBLAk4LCAg
QUFSQ0g2NF9GTF9GT1JfQVJDSDhfNikKK0FBUkNINjRfQVJDSCgiYXJtdjguMS1hIiwgICAgIGdl
bmVyaWMsCSAgICAgOF8xQSwJODAxLCAgQUFSQ0g2NF9GTF9GT1JfQVJDSDhfMSkKK0FBUkNINjRf
QVJDSCgiYXJtdjguMi1hIiwgICAgIGdlbmVyaWMsCSAgICAgOF8yQSwJODAyLCAgQUFSQ0g2NF9G
TF9GT1JfQVJDSDhfMikKK0FBUkNINjRfQVJDSCgiYXJtdjguMy1hIiwgICAgIGdlbmVyaWMsCSAg
ICAgOF8zQSwJODAzLCAgQUFSQ0g2NF9GTF9GT1JfQVJDSDhfMykKK0FBUkNINjRfQVJDSCgiYXJt
djguNC1hIiwgICAgIGdlbmVyaWMsCSAgICAgOF80QSwJODA0LCAgQUFSQ0g2NF9GTF9GT1JfQVJD
SDhfNCkKK0FBUkNINjRfQVJDSCgiYXJtdjguNS1hIiwgICAgIGdlbmVyaWMsCSAgICAgOF81QSwJ
ODA1LCAgQUFSQ0g2NF9GTF9GT1JfQVJDSDhfNSkKK0FBUkNINjRfQVJDSCgiYXJtdjguNi1hIiwg
ICAgIGdlbmVyaWMsCSAgICAgOF82QSwJODA2LCAgQUFSQ0g2NF9GTF9GT1JfQVJDSDhfNikKIEFB
UkNINjRfQVJDSCgiYXJtdjgtciIsICAgICAgIGdlbmVyaWMsCSAgICAgOFIgICwJOCwgIEFBUkNI
NjRfRkxfRk9SX0FSQ0g4X1IpCiAKICN1bmRlZiBBQVJDSDY0X0FSQ0gK

--_002_DM6PR18MB27780C281E55729A473FF172A7969DM6PR18MB2778namp_--



[committed] d: Don't set default flag_complex_method.

2021-03-06 Thread Iain Buclaw via Gcc-patches
Hi,

This patch removes the default initializing of flag_complex_method in
the D front-end.  D doesn't need C99-like requirements for complex
multiply and divide; the default set by common.opt is sufficient.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* d-lang.cc (d_init_options_struct): Don't set default
flag_complex_method.
---
 gcc/d/d-lang.cc | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index 1a51c5e4b7c..0720cba1340 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -342,9 +342,6 @@ d_init_options_struct (gcc_options *opts)
   /* GCC options.  */
   opts->x_flag_exceptions = 1;
 
-  /* Avoid range issues for complex multiply and divide.  */
-  opts->x_flag_complex_method = 2;
-
   /* Unlike C, there is no global `errno' variable.  */
   opts->x_flag_errno_math = 0;
   opts->frontend_set_flag_errno_math = true;
-- 
2.27.0



Re: [PATCH] c++: Fix constexpr evaluation of pre-increment when !lval [PR99287]

2021-03-06 Thread Jason Merrill via Gcc-patches

On 3/5/21 5:18 PM, Patrick Palka wrote:

On Fri, 5 Mar 2021, Jason Merrill wrote:


On 3/5/21 1:05 PM, Patrick Palka wrote:

Here, during cxx_eval_increment_expression (with lval=false) of
++__first where __first is &"mystr"[0], we correctly update __first
to &"mystr"[1] but we end up returning &"mystr"[0] + 1 instead of
&"mystr"[1].  This unreduced return value inhibits other pointer
arithmetic folding during later constexpr evaluation, which ultimately
causes the constexpr evaluation to fail.

It turns out the simplification of &"mystr"[0] + 1 to &"mystr"[1]
is performed by cxx_fold_pointer_plus_expression, not by fold_build2.
So we perform this simplification during constexpr evaluation of
the temporary MODIFY_EXPR (assigning to __first the simplified value),
but then we return 'mod' which has only been folded via fold_build2 and
hasn't gone through cxx_fold_pointer_plus_expression.

This patch fixes this by updating 'mod' to the (rvalue) result of the
MODIFY_EXPR evaluation, so that we capture any additional folding of
'mod'.  We now need to be wary of the evaluation failing and returning
e.g. the MODIFY_EXPR or NULL_TREE; it seems checking *non_constant_p
should cover our bases here and is generally prudent.

(Finally, since returning 'mod' instead of 'op' when !lval seems to be
more than just an optimization, i.e. callers seem to expect this
behavior, this patch additionally clarifies the nearby comment to that
effect.)

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk or perhaps GCC 12?

gcc/cp/ChangeLog:

PR c++/99287
* constexpr.c (cxx_eval_increment_expression): Pass lval=false
when evaluating the MODIFY_EXPR, and update 'mod' with the
result of this evaluation.  Check *non_constant_p afterwards.
Clarify nearby comment.

gcc/testsuite/ChangeLog:

PR c++/99287
* g++.dg/cpp2a/constexpr-99287.C: New test.

Co-authored-by: Jakub Jelinek 
---
   gcc/cp/constexpr.c   | 16 ++---
   gcc/testsuite/g++.dg/cpp2a/constexpr-99287.C | 61 
   2 files changed, 67 insertions(+), 10 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-99287.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index cd0a68e9fd6..49df79837ca 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5582,20 +5582,16 @@ cxx_eval_increment_expression (const constexpr_ctx
*ctx, tree t,
 /* Storing the modified value.  */
 tree store = build2_loc (cp_expr_loc_or_loc (t, input_location),
   MODIFY_EXPR, type, op, mod);
-  cxx_eval_constant_expression (ctx, store,
-   true, non_constant_p, overflow_p);
+  mod = cxx_eval_constant_expression (ctx, store, false,


How about passing lval down here and returning mod either way?


Sounds good, like this?  Testing in progress


OK.


-- >8 --

Subject: [PATCH] c++: Fix constexpr evaluation of pre-increment when !lval
  [PR99287]

Here, during cxx_eval_increment_expression (with lval=false) of
++__first where __first is &"mystr"[0], we correctly update __first
to &"mystr"[1] but we end up returning &"mystr"[0] + 1 instead of
&"mystr"[1].  This unreduced return value inhibits other pointer
arithmetic folding during later constexpr evaluation, which ultimately
causes the constexpr evaluation to fail.

It turns out the simplification of &"mystr"[0] + 1 to &"mystr"[1]
is performed by cxx_fold_pointer_plus_expression, not by fold_build2.
So we perform this simplification during constexpr evaluation of
the temporary MODIFY_EXPR (during which we assign to __first the
simplified value), but then we return 'mod' which has only been folded
via fold_build2 and hasn't gone through cxx_fold_pointer_plus_expression.

This patch fixes this by also updating 'mod' with the result of the
MODIFY_EXPR evaluation appropriately, so that we capture any additional
folding of the expression when !lval.  We now need to be wary of this
evaluation failing and returning e.g. the MODIFY_EXPR or NULL_TREE; it
seems checking *non_constant_p should cover our bases here and is
generally prudent.
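
The new testcase is not quoted here, so the following is only an
illustrative sketch (assuming C++14 or later) of the kind of code the
description is about: a constexpr function that uses the value of a
pre-increment on a pointer into a string literal and then does further
pointer arithmetic with it. The function names are made up for this example.

```cpp
#include <cassert>

// Uses the rvalue result of ++first directly.
constexpr char next_char(const char *first) {
  return *++first;
}

// Keeps the result of each pre-increment and subtracts pointers afterwards,
// which relies on that result being a pointer the evaluator can fold.
constexpr int length(const char *first) {
  int n = 0;
  const char *p = first;
  while (*p != '\0') {
    const char *q = ++p;  // value of the pre-increment
    n = static_cast<int>(q - first);
  }
  return n;
}

static_assert(next_char("mystr") == 'y', "second character");
static_assert(length("mystr") == 5, "five characters before the terminator");
```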

gcc/cp/ChangeLog:

PR c++/99287
* constexpr.c (cxx_eval_increment_expression): Pass lval when
evaluating the MODIFY_EXPR, and update 'mod' with the result of
this evaluation.  Check *non_constant_p afterwards.  For prefix
ops, just return 'mod'.

gcc/testsuite/ChangeLog:

PR c++/99287
* g++.dg/cpp2a/constexpr-99287.C: New test.

Co-authored-by: Jakub Jelinek 
---
  gcc/cp/constexpr.c   | 17 +++---
  gcc/testsuite/g++.dg/cpp2a/constexpr-99287.C | 61 
  2 files changed, 68 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-99287.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7d96d577d84..d7150b25b19 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5582,20 +5582,17 @@

New German PO file for 'gcc' (version 11.1-b20210207)

2021-03-06 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-11.1-b20210207.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [commited] [PR99378] LRA: Skip decomposing address for asm insn operand with unknown constraint

2021-03-06 Thread Gerald Pfeifer
On Fri, 5 Mar 2021, Vladimir Makarov via Gcc-patches wrote:
>   The following patch fixes
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99378
> 
>   The patch was successfully bootstrapped and tested on x86-64.

Is it possible this breaks bootstrap on i586?

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99438

  .../libgcc/soft-fp/divtf3.c: In function '__divtf3':
  .../libgcc/soft-fp/divtf3.c:51:1: error: unrecognizable insn:
   51 | }
  | ^
(insn 1185 3357 3676 80 (parallel [
(set (reg:SI 5 di [621])
(asm_operands:SI ("sub{l} {%11,%3|%3,%11}
sbb{l} {%9,%2|%2,%9}
sbb{l} {%7,%1|%1,%7}
sbb{l} {%5,%0|%0,%5}") ("=r") 0 [
(reg:SI 5 di [621])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -80 [0xffb0])) [5 
A_f[2]+0 S4 A64])
(reg:SI 1 dx [622])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)
(const_int -84 [0xffac])) [5 
A_f[1]+0 S4 A32])
(reg:SI 0 ax [623])
(mem/c:SI (plus:SI (reg/f:SI 6 bp)

Gerald