[PATCH] fortran: Unshare associate var charlen [PR104228]

2022-01-29 Thread Mikael Morin

Hello,

the attached patch is a fix for PR104228.
Even if simple, I wouldn’t call it obvious, as it’s involving character 
length and associate, so I don’t mind some extra review eyes.


Tested on x86_64-pc-linux-gnu. Ok for master/11/10/9?From 0819226560387b2953622ee3d5d051a35606d504 Mon Sep 17 00:00:00 2001
From: Mikael Morin 
Date: Fri, 28 Jan 2022 22:00:57 +0100
Subject: [PATCH] fortran: Unshare associate var charlen [PR104228]

PR104228 showed that character lengths were shared between associate
variable and associate targets.  This is problematic when the associate
target is itself a variable and gets a variable to hold the length, as
the length variable is added (and all the variables following it in the chain)
to both the associate variable scope and the target variable scope.
This caused an ICE when compiling with -O0 -fsanitize=address.

This change forces the creation of a separate character length for the
associate variable.  It also forces the initialization of the character
length variable to avoid regressing associate_32 and associate_47 tests.

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_assoc_var): Also create a new character
	length for non-dummy associate targets.
	* trans-stmt.cc (trans_associate_var): Initialize character length
	even if no temporary is used for the associate variable.

gcc/testsuite/ChangeLog:

	* gfortran.dg/asan/associate_1.f90: New test.
	* gfortran.dg/asan/associate_2.f90: New test.
---
 gcc/fortran/resolve.cc|  1 -
 gcc/fortran/trans-stmt.cc |  2 +-
 .../gfortran.dg/asan/associate_1.f90  | 19 +++
 .../gfortran.dg/asan/associate_2.f90  | 19 +++
 4 files changed, 39 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/asan/associate_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/asan/associate_2.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 835a4783718..266e41e25b1 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -9227,7 +9227,6 @@ resolve_assoc_var (gfc_symbol* sym, bool resolve_target)
 	sym->ts.u.cl = target->ts.u.cl;
 
   if (sym->ts.deferred && target->expr_type == EXPR_VARIABLE
-	  && target->symtree->n.sym->attr.dummy
 	  && sym->ts.u.cl == target->ts.u.cl)
 	{
 	  sym->ts.u.cl = gfc_new_charlen (sym->ns, NULL);
diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index 04f8147d23b..30b6bd5dd2a 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -1918,7 +1918,7 @@ trans_associate_var (gfc_symbol *sym, gfc_wrapped_block *block)
   gfc_conv_expr_descriptor (&se, e);
 
   if (sym->ts.type == BT_CHARACTER
-	  && !se.direct_byref && sym->ts.deferred
+	  && sym->ts.deferred
 	  && !sym->attr.select_type_temporary
 	  && VAR_P (sym->ts.u.cl->backend_decl)
 	  && se.string_length != sym->ts.u.cl->backend_decl)
diff --git a/gcc/testsuite/gfortran.dg/asan/associate_1.f90 b/gcc/testsuite/gfortran.dg/asan/associate_1.f90
new file mode 100644
index 000..b5ea75498b7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/asan/associate_1.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! { dg-additional-options "-O0" }
+!
+! PR fortran/104228
+! The code generated code for the program below wrongly pushed the Y character
+! length variable to both P and S scope, which was leading to an ICE when
+! address sanitizer was in effect
+
+program p
+   character(:), save, allocatable :: x(:)
+   call s
+contains
+   subroutine s
+  associate (y => x)
+ y = [x]
+  end associate
+   end
+end
+
diff --git a/gcc/testsuite/gfortran.dg/asan/associate_2.f90 b/gcc/testsuite/gfortran.dg/asan/associate_2.f90
new file mode 100644
index 000..9bfb2bfbafb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/asan/associate_2.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! { dg-additional-options "-O0" }
+!
+! PR fortran/104228
+! The code generated code for the program below wrongly pushed the Y character
+! length variable to both P and S scope, which was leading to an ICE when
+! address sanitizer was in effect
+
+program p
+   character(:), allocatable :: x(:)
+   call s
+contains
+   subroutine s
+  associate (y => x)
+ y = [x]
+  end associate
+   end
+end
+
-- 
2.34.1



[PATCH] [PATCH, v3, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-29 Thread Dan Li via Gcc-patches
Shadow Call Stack can be used to protect the return address of a
function at runtime, and clang already supports this feature[1].

To enable SCS in user mode, in addition to compiler, other support
is also required (as discussed in [2]). This patch only adds basic
support for SCS from the compiler side, and provides convenience
for users to enable SCS.

For linux kernel, only the support of the compiler is required.

[1] https://clang.llvm.org/docs/ShadowCallStack.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768

Signed-off-by: Dan Li 

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_layout_frame):
Change callee_adjust when scs is enabled.
(aarch64_restore_callee_saves): Avoid pop x30 twice
when scs is enabled.
(aarch64_expand_prologue): Push x30 onto SCS before it's
pushed onto stack.
(aarch64_expand_epilogue): Pop x30 frome SCS, while
preventing it from being popped from the regular stack again.
(aarch64_override_options_internal): Add SCS compile option check.
(TARGET_HAVE_SHADOW_CALL_STACK): New hook.
* config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled.
* config/aarch64/aarch64.md (scs_push):  New template.
(scs_pop): Likewise.
* doc/invoke.texi: Document -fsanitize=shadow-call-stack.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add hook have_shadow_call_stack.
* flag-types.h (enum sanitize_code):
Add SANITIZE_SHADOW_CALL_STACK.
* opts.c: Add shadow-call-stack.
* target.def: New hook.
* toplev.c (process_options): Add SCS compile option check.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shadow_call_stack_1.c: New test.
* gcc.target/aarch64/shadow_call_stack_2.c: New test.
* gcc.target/aarch64/shadow_call_stack_3.c: New test.
* gcc.target/aarch64/shadow_call_stack_4.c: New test.
* gcc.target/aarch64/shadow_call_stack_5.c: New test.
* gcc.target/aarch64/shadow_call_stack_6.c: New test.
* gcc.target/aarch64/shadow_call_stack_7.c: New test.
* gcc.target/aarch64/shadow_call_stack_8.c: New test.
---
V3:
- Change scs_push/pop to standard move patterns.
- Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled.

 gcc/config/aarch64/aarch64.c  | 66 +--
 gcc/config/aarch64/aarch64.h  |  4 ++
 gcc/config/aarch64/aarch64.md | 10 +++
 gcc/doc/invoke.texi   | 30 +
 gcc/doc/tm.texi   |  5 ++
 gcc/doc/tm.texi.in|  2 +
 gcc/flag-types.h  |  2 +
 gcc/opts.c|  1 +
 gcc/target.def|  8 +++
 .../gcc.target/aarch64/shadow_call_stack_1.c  |  6 ++
 .../gcc.target/aarch64/shadow_call_stack_2.c  |  6 ++
 .../gcc.target/aarch64/shadow_call_stack_3.c  | 45 +
 .../gcc.target/aarch64/shadow_call_stack_4.c  | 20 ++
 .../gcc.target/aarch64/shadow_call_stack_5.c  | 18 +
 .../gcc.target/aarch64/shadow_call_stack_6.c  | 18 +
 .../gcc.target/aarch64/shadow_call_stack_7.c  | 18 +
 .../gcc.target/aarch64/shadow_call_stack_8.c  | 24 +++
 gcc/toplev.c  | 10 +++
 18 files changed, 289 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 699c105a42a..461c205010e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -79,6 +79,7 @@
 #include "tree-ssa-loop-niter.h"
 #include "fractional-cost.h"
 #include "rtlanal.h"
+#include "asan.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -7478,10 +7479,31 @@ aarch64_layout_frame (void)
   frame.sve_callee_adjust = 0;
   frame.callee_offset = 0;
 
+  /* Shadow call stack only deal with functions where the LR is pushed
+ onto the stack and without specifying the "no_sanitize" attribute
+ with the argument "shadow-call-stack".  */
+  frame.is_scs_enabled
+= (!crtl->calls_eh_return
+   && (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK)
+  && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0)));
+
+  /* When shadow call stack is enabled, the scs_pop in the epilogue will
+ restore x30, and we don't need to p

Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-29 Thread Dan Li via Gcc-patches

Hi, Richard,
I have sent out my v3[1], and (probably) fixed the previous issues,
please let me know if i got something wrong :)

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589471.html

Thanks,
Dan.

On 1/25/22 02:19, Richard Sandiford wrote:

Dan Li  writes:

+
  if (flag_stack_usage_info)
current_function_static_stack_size = constant_lower_bound (frame_size);

@@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall)

  RTX_FRAME_RELATED_P (insn) = 1;
}

+  /* Pop return address from shadow call stack.  */

+  if (aarch64_shadow_call_stack_enabled ())
+   emit_insn (gen_scs_pop ());
+


This looks correct, but following on from the above, I guess we continue
to restore x30 from the frame in the traditional way first, and then
overwrite it with the SCS value.  I think the output code would be
slightly neater if we suppressed the first restore of x30.


Yes, the current epilogue is like:
  ...
  ldp x29, x30, [sp], #16
+   ldr x30, [x18, #-8]!

I think may be we can keep the two x30 pops here, for two reasons:
1) The implementation of scs in clang is to pop x30 twice, it might be
better to be consistent with clang here[1].


This doesn't seem a very compelling reason on its own, unless it's
explicitly meant to be observable behaviour.  (But then, observed how?)


2) If we keep the first restore of x30, it may provide some flexibility
for other scenarios. For example, we can directly patch the scs_push/pop
insns in the binary to turn it into a binary without scs;


Is that a supported (and ideally documented) use case?  If it is,
then it becomes important for correctness that the compiler emits
exactly the opcodes in the original patch.  If it isn't, then the
compiler can emit other instructions that have the same effect.

I'd like a definitive ruling on this from the kernel folks before
the patch goes in.

If binary patching is supposed to be possible then scs_push and
scs_pop *do* need to be separate define_insns.  But they also need
to have some magic unspec that differentiates them from normal
pushes and pops, e.g.:

   (set ...mem...
(unspec:DI [...reg...] UNSPEC_SCS_PUSH))

so that there is no chance that the pattern would be treated as
a normal move and optimised accordingly.


+;; Save X30 in the X18-based POST_INC stack (consistent with clang).
+(define_insn "scs_push"
+  [(set (mem:DI (reg:DI R18_REGNUM)) (reg:DI R30_REGNUM))
+   (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int 8)))]
+  ""
+  "str\\tx30, [x18], #8"
+  [(set_attr "type" "store_8")]
+)
+
+;; Load X30 form the X18-based PRE_DEC stack (consistent with clang).
+(define_insn "scs_pop"
+  [(set (reg:DI R18_REGNUM) (minus:DI (reg:DI R18_REGNUM) (const_int 8)))
+   (set (reg:DI R30_REGNUM) (mem:DI (reg:DI R18_REGNUM)))]
+  ""
+  "ldr\\tx30, [x18, #-8]!"
+  [(set_attr "type" "load_8")]
+)
+


The two SETs here work in parallel, with the define_insn as a whole
following a "read input operands, act, write output operands" sequence.
The (mem: …) address therefore sees the old value of r18 rather than the
new one.  Also, RTL rules say that subtractions need to be written as
additions of the negative.


Oh, sorry, I got it wrong here.


I think the pattern would therefore be something like:

[(set (reg:DI R30_REGNUM) (mem:DI (plus:DI (reg:DI R18_REGNUM)
 (const_int -8]
 (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int -8)))]

However, since these are normal moves, I think we should be able to use
the standard move patterns.  E.g. the push could be:

(define_expand "scs_push"
[(set (mem:DI (pre_dec:DI (reg:DI R18_REGNUM)))
  (reg:DI R30_REGNUM))])

and similarly for the pop.


Thanks, this template looks better.

Since the push/pop of SCS and normal stack in clang are different
(for example, scs push is post_inc, while normal stack is pre_dec),
in order to maintain consistency, I think it can be changed to the
following:

(define_expand "scs_push"
[(set (mem:DI (post_inc:DI (reg:DI R18_REGNUM)))
  (reg:DI R30_REGNUM))])

(define_expand "scs_pop"
[(set (reg:DI R30_REGNUM)
 (mem:DI (pre_dec:DI (reg:DI R18_REGNUM])


Yeah, I copied the wrong name above, sorry.

Thanks,
Richard


[PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]

2022-01-29 Thread Jakub Jelinek via Gcc-patches
Hi!

On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:
> Thanks.  Given the original submission and most of the review work was done
> prior to stage3 closing, I went ahead and installed this on the trunk.

Unfortunately this breaks quite a lot of things.
The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
or with TYPE_PRECISION == 1 integral type (or vector boolean).
Violating this causes verification failures in tree-cfg.cc in some cases,
in other cases wrong-code issues because before it is verified we e.g.
transform
1U / x
into
x == 1U
and later into
x (because we assume that == type must be one of the above cases and
when it is the same type as the type of the first operand, for boolean-ish
cases it should be equivalent).

Fixed by changing that
(eq @1 { build_one_cst (type); })
into
(convert (eq:boolean_type_node @1 { build_one_cst (type); }))
Note, I'm not 100% sure if :boolean_type_node is required in that case,
I see some spots in match.pd that look exactly like this, while there is
e.g. (convert (le ...)) that supposedly does the right thing too.
The signed integer 1/X case doesn't need changes changes, for
(cond (le ...) ...)
le gets correctly boolean_type_node and cond should use type.
I've also reformatted it, some lines were too long, match.pd uses
indentation by 1 column instead of 2 etc.

Bootstrapped/regtested on powerpc64le-linux and tested on the testcases
on x86_64-linux, ok for trunk?

2022-01-29  Jakub Jelinek  
Andrew Pinski  

PR tree-optimization/104279
PR tree-optimization/104280
PR tree-optimization/104281
* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
boolean_type_node and convert to type.  Formatting fixes.

* gcc.dg/torture/pr104279.c: New test.
* gcc.dg/torture/pr104280.c: New test.
* gcc.dg/torture/pr104281.c: New test.

--- gcc/match.pd.jj 2022-01-29 11:11:39.316628007 +0100
+++ gcc/match.pd2022-01-29 12:19:46.662096678 +0100
@@ -435,18 +435,22 @@ (define_operator_list SYNC_FETCH_AND_AND
&& TYPE_UNSIGNED (type))
(trunc_divmod @0 @1
 
- /* 1 / X -> X == 1 for unsigned integer X.
-1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
-But not for 1 / 0 so that we can get proper warnings and errors,
-and not for 1-bit integers as they are edge cases better handled 
elsewhere. */
+/* 1 / X -> X == 1 for unsigned integer X.
+   1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
+   But not for 1 / 0 so that we can get proper warnings and errors,
+   and not for 1-bit integers as they are edge cases better handled
+   elsewhere.  */
 (simplify
-  (trunc_div integer_onep@0 @1)
-  (if (INTEGRAL_TYPE_P (type) && !integer_zerop (@1) && TYPE_PRECISION (type) 
> 1)
-(if (TYPE_UNSIGNED (type))
-  (eq @1 { build_one_cst (type); })
-  (with { tree utype = unsigned_type_for (type); }
-(cond (le (plus (convert:utype @1) { build_one_cst (utype); }) { 
build_int_cst (utype, 2); })
-  @1 { build_zero_cst (type); })
+ (trunc_div integer_onep@0 @1)
+ (if (INTEGRAL_TYPE_P (type)
+  && !integer_zerop (@1)
+  && TYPE_PRECISION (type) > 1)
+  (if (TYPE_UNSIGNED (type))
+   (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
+   (with { tree utype = unsigned_type_for (type); }
+(cond (le (plus (convert:utype @1) { build_one_cst (utype); })
+ { build_int_cst (utype, 2); })
+ @1 { build_zero_cst (type); })
 
 /* Combine two successive divisions.  Note that combining ceil_div
and floor_div is trickier and combining round_div even more so.  */
--- gcc/testsuite/gcc.dg/torture/pr104279.c.jj  2022-01-29 12:25:36.388174312 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr104279.c 2022-01-29 12:25:53.206937588 
+0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/104279 */
+/* { dg-do compile } */
+
+unsigned a, b;
+
+int
+main ()
+{
+  b = ~(0 || ~0);
+  a = ~b / ~a;
+  return 0;
+}
--- gcc/testsuite/gcc.dg/torture/pr104280.c.jj  2022-01-29 12:24:36.190021595 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr104280.c 2022-01-29 12:25:28.681282783 
+0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/104280 */
+/* { dg-do run } */
+
+int
+foo (unsigned b, int c)
+{
+  return b / c;
+}
+
+int
+main ()
+{
+  if (foo (1, 2) != 0)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/torture/pr104281.c.jj  2022-01-29 12:27:29.840577473 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr104281.c 2022-01-29 12:27:24.526652267 
+0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/104281 */
+/* { dg-do run } */
+
+unsigned a = 1;
+int b, c = 2;
+long d;
+
+int
+main ()
+{
+  while (1)
+{
+  int m = a;
+L:
+  a = ~(-(m || b & d));
+  b = ((1 ^ a) / c);
+  if (b)
+   goto L;
+  break;
+}
+  return 0;
+}


Jakub



[PATCH] testsuite: Fix up tree-ssa/divide-7.c testcase [PR95424]

2022-01-29 Thread Jakub Jelinek via Gcc-patches
Hi!

This test fails everywhere, because ? doesn't match literal ?.
It should use \\? instead.  I've also changed those .s in there.
Tested on x86_64-linux (-m32/-m64) and powerpc64le-linux, ok for trunk?

2022-01-29  Jakub Jelinek  

PR tree-optimization/95424
* gcc.dg/tree-ssa/divide-7.c: Fix up regexps in scan-tree-dump{,-not}.

--- gcc/testsuite/gcc.dg/tree-ssa/divide-7.c.jj 2022-01-29 11:11:39.338627697 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/divide-7.c2022-01-29 17:11:31.516070531 
+0100
@@ -5,5 +5,5 @@ int f(int x) {
   return 1 / x;
 }
 
-/* { dg-final { scan-tree-dump-not "1 / x_..D.;" "optimized" } } */
-/* { dg-final { scan-tree-dump ".. <= 2 ? x_..D. : 0;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1 / x_\[0-9]\+\\\(D\\\);" "optimized" } } 
*/
+/* { dg-final { scan-tree-dump " <= 2 \\? x_\[0-9]\+\\\(D\\\) : 0;" 
"optimized" } } */

Jakub



[PATCH] testsuite: Fix up tree-ssa/pr103514.c testcase [PR103514]

2022-01-29 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 28, 2022 at 03:14:16PM -0700, Jeff Law via Gcc-patches wrote:
> > This patch will add the missed pattern described in bug 103514 [1] to the 
> > match.pd. [1] includes proof of correctness for the patch too.
> > 
> > PR tree-optimization/103514
> > * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization.
> > * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization.
> > * gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization.
> > 
> > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514
> Note the bug was filed an fixed during stage3, review just didn't happen in
> a reasonable timeframe.
> 
> I'm going to ACK this for the trunk and go ahead and commit it for you.

The testcase FAILs on short-circuit targets like powerpc64le-linux.
While the first 2 functions are identical, the last two look like:
   :
  if (a_5(D) != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  if (b_6(D) != 0)
goto ; [INV]
  else
goto ; [INV]

   :

   :
  # iftmp.1_4 = PHI <1(3), 0(4)>
  _1 = a_5(D) == b_6(D);
  _2 = (int) _1;
  _3 = _2 ^ iftmp.1_4;
  _9 = _2 != iftmp.1_4;
  return _9;
instead of the expected:
   :
  _3 = a_8(D) & b_9(D);
  _4 = (int) _3;
  _5 = a_8(D) == b_9(D);
  _6 = (int) _5;
  _1 = a_8(D) | b_9(D);
  _2 = ~_1;
  _7 = (int) _2;
  _10 = ~_1;
  return _10;
so no wonder it doesn't match.  E.g. x86_64-linux will also use jumps
if it isn't just a && b but a && b && c && d (will do
a & b and c & d tests and jump based on those.

As it is too late to implement this optimization even for the short
circuiting targets this late (not even sure which pass would be best),
this patch just forces non-short-circuiting for the test.

Tested on x86_64-linux -m32/-m64 and powerpc64le-linux, ok for trunk?

2022-01-29  Jakub Jelinek  

PR tree-optimization/103514
* gcc.dg/tree-ssa/pr103514.c: Add
--param logical-op-non-short-circuit=1 to dg-options.

--- gcc/testsuite/gcc.dg/tree-ssa/pr103514.c.jj 2022-01-29 11:11:39.338627697 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr103514.c2022-01-29 17:37:18.255237211 
+0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-optimized" } */
+/* { dg-options "-O --param logical-op-non-short-circuit=1 
-fdump-tree-optimized" } */
 #include 
 
 bool
@@ -30,4 +30,4 @@ h (bool a, bool b)
 /* Make sure we have removed "==" and "^" and "&". */
 /* { dg-final { scan-tree-dump-not "&" "optimized"} } */
 /* { dg-final { scan-tree-dump-not "\\^"  "optimized"} } */
-/* { dg-final { scan-tree-dump-not "==" "optimized"} } */
\ No newline at end of file
+/* { dg-final { scan-tree-dump-not "==" "optimized"} } */


Jakub



Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]

2022-01-29 Thread Jeff Law via Gcc-patches




On 1/29/2022 9:23 AM, Jakub Jelinek wrote:

Hi!

On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:

Thanks.  Given the original submission and most of the review work was done
prior to stage3 closing, I went ahead and installed this on the trunk.

Unfortunately this breaks quite a lot of things.
The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
or with TYPE_PRECISION == 1 integral type (or vector boolean).
Violating this causes verification failures in tree-cfg.cc in some cases,
in other cases wrong-code issues because before it is verified we e.g.
transform
1U / x
into
x == 1U
and later into
x (because we assume that == type must be one of the above cases and
when it is the same type as the type of the first operand, for boolean-ish
cases it should be equivalent).

Fixed by changing that
(eq @1 { build_one_cst (type); })
into
(convert (eq:boolean_type_node @1 { build_one_cst (type); }))
Note, I'm not 100% sure if :boolean_type_node is required in that case,
I see some spots in match.pd that look exactly like this, while there is
e.g. (convert (le ...)) that supposedly does the right thing too.
The signed integer 1/X case doesn't need changes changes, for
(cond (le ...) ...)
le gets correctly boolean_type_node and cond should use type.
I've also reformatted it, some lines were too long, match.pd uses
indentation by 1 column instead of 2 etc.

Bootstrapped/regtested on powerpc64le-linux and tested on the testcases
on x86_64-linux, ok for trunk?

2022-01-29  Jakub Jelinek  
Andrew Pinski  

PR tree-optimization/104279
PR tree-optimization/104280
PR tree-optimization/104281
* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
boolean_type_node and convert to type.  Formatting fixes.

* gcc.dg/torture/pr104279.c: New test.
* gcc.dg/torture/pr104280.c: New test.
* gcc.dg/torture/pr104281.c: New test.
Ugh.  Sorry for the breakage.  THat makes me wonder how the original 
patch was tested at all.


I think both forms will DTRT (eq:boolean_type_node ...) vs (convert (le 
...)) .


OK for the trunk as well as the testcase fix.

Jeff




[PATCH] PR/101135 - Load of null pointer when passing absent assumed-shape array argument for an optional dummy argument

2022-01-29 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

compiling with -fsanitize=undefined shows that we did mishandle the
case where a missing optional argument is passed to another procedure.

Besides the example given in the PR, the existing testcase
fortran.dg/missing_optional_dummy_6a.f90 fails with:

gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90:21:29: runtime error: 
load of null pointer of type 'integer(kind=4)'
gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90:22:30: runtime error: 
load of null pointer of type 'integer(kind=4)'
gcc/testsuite/gfortran.dg/missing_optional_dummy_6a.f90:27:29: runtime error: 
load of null pointer of type 'integer(kind=4)'

The least invasive change - already pointed out by the reporter - is
to check the presence of the argument before dereferencing the data
pointer after the offset calculation.  This requires adjusting the
checking pattern for gfortran.dg/missing_optional_dummy_6a.f90.

Regtesting reminded me that procedures with bind(c) attribute are doing
their own stuff, which is why they need to be excluded here, otherwise
testcase bind-c-contiguous-4.f90 would regress on the expected output.

I've created a testcase that uses this PR's input as well as the lesson
learned from studying the bind(c) testcase and placed this in the asan
subdirectory.

There is a potential alternative solution which I did not pursue, as I
think it is more invasive, but also that I didn't succeed to implement:
A non-present dummy array argument should not need to get its descriptor
set up.  Pursuing this is probably not the right thing to do during the
current stage of development and could be implemented later.  If somebody
believes this is important, feel free to open a PR for this.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 69ca8f83149107f48b86360eb878d9d746b99234 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sat, 29 Jan 2022 22:18:30 +0100
Subject: [PATCH] Fortran: fix handling of absent array argument passed to
 optional dummy

gcc/fortran/ChangeLog:

	PR fortran/101135
	* trans-array.cc (gfc_get_dataptr_offset): Check for optional
	arguments being present before dereferencing data pointer.

gcc/testsuite/ChangeLog:

	PR fortran/101135
	* gfortran.dg/missing_optional_dummy_6a.f90: Adjust diagnostic patterns.
	* gfortran.dg/asan/missing_optional_dummy_7.f90: New test.
---
 gcc/fortran/trans-array.cc| 11 
 .../asan/missing_optional_dummy_7.f90 | 64 +++
 .../gfortran.dg/missing_optional_dummy_6a.f90 |  4 +-
 3 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/asan/missing_optional_dummy_7.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index cfb6eac11c7..9eaa99c5550 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -7207,6 +7207,17 @@ gfc_get_dataptr_offset (stmtblock_t *block, tree parm, tree desc, tree offset,

   /* Set the target data pointer.  */
   offset = gfc_build_addr_expr (gfc_array_dataptr_type (desc), tmp);
+
+  /* Check for optional dummy argument being present.  BIND(C) procedure
+ arguments are excepted here since they are handled differently.  */
+  if (expr->expr_type == EXPR_VARIABLE
+  && expr->symtree->n.sym->attr.dummy
+  && expr->symtree->n.sym->attr.optional
+  && !expr->symtree->n.sym->ns->proc_name->attr.is_bind_c)
+offset = build3_loc (input_location, COND_EXPR, TREE_TYPE (offset),
+			 gfc_conv_expr_present (expr->symtree->n.sym), offset,
+			 fold_convert (TREE_TYPE (offset), gfc_index_zero_node));
+
   gfc_conv_descriptor_data_set (block, parm, offset);
 }

diff --git a/gcc/testsuite/gfortran.dg/asan/missing_optional_dummy_7.f90 b/gcc/testsuite/gfortran.dg/asan/missing_optional_dummy_7.f90
new file mode 100644
index 000..bdd7006170d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/asan/missing_optional_dummy_7.f90
@@ -0,0 +1,64 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original -fsanitize=undefined" }
+!
+! PR fortran/101135 - Load of null pointer when passing absent
+! assumed-shape array argument for an optional dummy argument
+!
+! Based on testcase by Marcel Jacobse
+
+program main
+  implicit none
+  character(len=3) :: a(6) = ['abc', 'def', 'ghi', 'jlm', 'nop', 'qrs']
+  call as ()
+  call as (a(::2))
+  call as_c ()
+  call as_c (a(2::2))
+  call test_wrapper
+  call test_wrapper_c
+  call test2_wrapper
+contains
+  subroutine as (xx)
+character(len=*), optional, intent(in) :: xx(*)
+if (.not. present (xx)) return
+print *, xx(1:3)
+  end subroutine as
+  subroutine as_c (zz) bind(c)
+character(len=*), optional, intent(in) :: zz(*)
+if (.not. present (zz)) return
+print *, zz(1:3)
+  end subroutine as_c
+
+  subroutine test_wrapper (x)
+real, dimension(1), intent(out), optional :: x
+call test (x) !
+  end subroutine test_wrapper
+  subroutine test (y)
+real, dimension(:), intent(out), optional :: y
+if (p

[PATCH] libstdc++ testsuite: Increase lwg3464.cc timeout factors to 20

2022-01-29 Thread Hans-Peter Nilsson via Gcc-patches
These tests have always been failing for my cris-elf
autotester running a simulator; they take about 20 minutes
each, compared to the timeout of 720 seconds, doubled
because they timed out in another simulator setup.

They are the *only* libstdc++ tests that timeout for my
setup so I thought this'd be best fixed in the testsuite
rather than a local timeout setting (in e.g. the baseboard
file).  And, better make it an increase that counts.  Or,
maybe they're actually needlessly excessive? FWIW, running
the whole libstdc++ testsuite for this target takes 2h40min.

Ok?

* testsuite/27_io/basic_istream/get/char/lwg3464.cc: Increase
timeout factor to 20.
* testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc: Likewise.
---
 libstdc++-v3/testsuite/27_io/basic_istream/get/char/lwg3464.cc  | 2 +-
 .../testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/get/char/lwg3464.cc 
b/libstdc++-v3/testsuite/27_io/basic_istream/get/char/lwg3464.cc
index f426ff7dd867..69612375cba0 100644
--- a/libstdc++-v3/testsuite/27_io/basic_istream/get/char/lwg3464.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/get/char/lwg3464.cc
@@ -16,7 +16,7 @@
 // .
 
 // { dg-do run { target { ! lp64 } } }
-// { dg-timeout-factor 2 }
+// { dg-timeout-factor 20 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc 
b/libstdc++-v3/testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc
index 4be6beb2310f..5f07c6d9880d 100644
--- a/libstdc++-v3/testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc
@@ -16,7 +16,7 @@
 // .
 
 // { dg-do run { target { ! lp64 } } }
-// { dg-timeout-factor 2 }
+// { dg-timeout-factor 20 }
 
 #include 
 #include 
-- 
2.30.2



Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]

2022-01-29 Thread Zhao Wei Liew via Gcc-patches
Sincere apologies for the issues. I wasn't aware of the need for a cast but
after reading the PRs, I understand that now. On the other hand, the
incorrect test case was simply a major oversight on my part.

I'll be sure to be more careful next time.

Thanks for the fixes!


Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]

2022-01-29 Thread Andrew Pinski via Gcc-patches
On Sat, Jan 29, 2022 at 6:30 PM Zhao Wei Liew via Gcc-patches
 wrote:
>
> Sincere apologies for the issues. I wasn't aware of the need for a cast but
> after reading the PRs, I understand that now. On the other hand, the
> incorrect test case was simply a major oversight on my part.
>
> I'll be sure to be more careful next time.

Well I think gimple-match should have caught the cast issue really. I
filed https://gcc.gnu.org/PR104286 for that. I hope to submit a patch
for stage 1 to catch that so it will be harder to happen in the first
place.

Thanks,
Andrew

>
> Thanks for the fixes!