Re: [PATCH] handle "invisible" reference in -Wdangling-pointer (PR104436)

2022-02-09 Thread Richard Biener via Gcc-patches
On Tue, Feb 8, 2022 at 11:38 PM Jason Merrill via Gcc-patches
 wrote:
>
> On 2/8/22 16:59, Martin Sebor wrote:
> > Transforming by-value arguments to by-reference as GCC does for some
> > class types can trigger -Wdangling-pointer when the argument is used
> > to store the address of a local variable.  Since the stored value is
> > not accessible in the caller the warning is a false positive.
> >
> > The attached patch handles this case by excluding PARM_DECLs with
> > the DECL_BY_REFERENCE bit set from consideration.
> >
> > While testing the patch I noticed some instances of the warning are
> > unintentionally duplicated as the pass runs more than once.  To avoid
> > that, I also introduce warning suppression into the handler for this
> > instance of the warning.  (There might still be others.)
>
> The second test should verify that we do warn about returning 't' from a
> function; we don't want to ignore the DECL_BY_REFERENCE RESULT_DECL.
>
> > +   tree var = SSA_NAME_VAR (lhs_ref.ref);
> > +   if (DECL_BY_REFERENCE (var))

I think you need to test var && TREE_CODE (var) == PARM_DECL here since
for DECL_BY_REFERENCE RESULT_DECL we _do_ escape to the caller.  Also
SSA_NAME_VAR var might be NULL.

> > + /* Avoid by-value arguments transformed into by-reference.  */
> > + continue;
>
> I wonder if we can express this property of invisiref parms somewhere
> more general?  I imagine optimizations would find it useful as well.
> Could pointer_query somehow treat the reference as pointing to a
> function-local object?

I think points-to analysis got this correct when the reference was marked
restrict but now it also fails at this, making DSE fail to eliminate the
store in

struct A { A(); ~A(); int *p; };

void foo (struct A a, int *p)
{
  a.p = p;
}

> I previously tried to express this by marking the reference as
> 'restrict', but that was wrong
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97474).
>
> Jason
>


Re: [PATCH] PR tree-optimization/104420: Fix checks for constant folding X*0.0

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, Feb 9, 2022 at 12:20 AM Roger Sayle  wrote:
>
>
> This patch resolves PR tree-optimization/104420, which is a P1 regression
> where, as observed by Jakub Jelinek, the conditions for constant folding
> x*0.0 are incorrect (following my patch for PR tree-optimization/96392).
> The multiplication x*0.0 may yield a negative zero result, -0.0, if X is
> negative (not just if x may be negative zero).  Hence (without -ffast-math)
> (int)x*0.0 can't be optimized to 0.0, but (unsigned)x*0.0 can be constant
> folded.  This adds a bunch of test cases to confirm the desired behaviour,
> and removes an incorrect test from gcc.dg/pr96392.c which checked for the
> wrong behaviour.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and
> make -k check with no new failures.  Ok for mainline?

OK.

Thanks,
Richard.

> 2022-02-08  Roger Sayle  
>
> gcc/ChangeLog
> PR tree-optimization/104420
> * match.pd (mult @0 real_zerop): Tweak conditions for constant
> folding X*0.0 (or X*-0.0) to HONOR_SIGNED_ZEROS when appropriate.
>
> gcc/testsuite/ChangeLog
> PR tree-optimization/104420
> * gcc.dg/pr104420-1.c: New test case.
> * gcc.dg/pr104420-2.c: New test case.
> * gcc.dg/pr104420-3.c: New test case.
> * gcc.dg/pr104420-4.c: New test case.
> * gcc.dg/pr96392.c: Remove incorrect test.
>
> Thanks in advance (and sorry for the breakage/thinko).
> Roger
> --
>
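
The signed-zero behaviour described in the patch can be checked with a
small standalone program.  This is only a sketch, assuming IEEE 754
arithmetic and no -ffast-math; the function names are made up for the
illustration and are not taken from the new test cases.

```c
#include <math.h>

/* Multiply a signed integer, converted to double, by +0.0.
   A negative X yields -0.0, so this cannot be folded to 0.0.  */
double
int_times_zero (int x)
{
  return (double) x * 0.0;
}

/* Multiply an unsigned integer, converted to double, by +0.0.
   The operand is never negative, so the result is always +0.0.  */
double
unsigned_times_zero (unsigned x)
{
  return (double) x * 0.0;
}

/* Return nonzero if V is a zero with its sign bit set.  */
int
is_negative_zero (double v)
{
  return v == 0.0 && signbit (v) != 0;
}
```

Calling is_negative_zero (int_times_zero (-3)) returns nonzero, while
the unsigned variant never produces a negative zero, which is why only
the latter is safe to fold when signed zeros are honored.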


Re: [PATCH] AutoFDO: Don't try to promote indirect calls that result in recursive direct calls

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, Feb 9, 2022 at 8:16 AM Eugene Rozenfeld via Gcc-patches
 wrote:
>
> AutoFDO tries to promote and inline all indirect calls that were promoted
> and inlined in the original binary and that are still hot. In the included
> test case, the promotion results in a direct call that is a recursive call.
> inline_call and optimize_inline_calls can't handle recursive calls at this
> stage.  Currently, inline_call fails with a segmentation fault.
>
> This change leaves the indirect call alone if promotion will result in a
> recursive call.
>
> Tested on x86_64-pc-linux-gnu.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> * auto-profile.cc (afdo_indirect_call): Don't attempt to promote
> indirect calls that will result in direct recursive calls.
>
> gcc/testsuite/g++.dg/tree-prof/ChangeLog:
> * indir-call-recursive-inlining.C : New test.
> ---
>  gcc/auto-profile.cc   | 40 --
>  .../tree-prof/indir-call-recursive-inlining.C | 54 +++
>  2 files changed, 78 insertions(+), 16 deletions(-)
>  create mode 100644 
> gcc/testsuite/g++.dg/tree-prof/indir-call-recursive-inlining.C
>
> diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
> index c7cee639c85..2b34b80b82d 100644
> --- a/gcc/auto-profile.cc
> +++ b/gcc/auto-profile.cc
> @@ -975,7 +975,7 @@ read_profile (void)
>   * after annotation, we just need to mark, and let follow-up logic to
> decide if it needs to promote and inline.  */
>
> -static void
> +static bool
>  afdo_indirect_call (gimple_stmt_iterator *gsi, const icall_target_map &map,
>  bool transform)
>  {
> @@ -983,12 +983,12 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
> icall_target_map &map,
>tree callee;
>
>if (map.size () == 0)
> -return;
> +return false;
>gcall *stmt = dyn_cast <gcall *> (gs);
>if (!stmt
>|| gimple_call_internal_p (stmt)
>|| gimple_call_fndecl (stmt) != NULL_TREE)
> -return;
> +return false;
>
>gcov_type total = 0;
>icall_target_map::const_iterator max_iter = map.end ();
> @@ -1003,7 +1003,7 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
> icall_target_map &map,
>struct cgraph_node *direct_call = cgraph_node::get_for_asmname (
>get_identifier (afdo_string_table->get_name (max_iter->first)));
>if (direct_call == NULL || !direct_call->profile_id)
> -return;
> +return false;
>
>callee = gimple_call_fn (stmt);
>
> @@ -1013,20 +1013,27 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
> icall_target_map &map,
>hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters);
>gimple_add_histogram_value (cfun, stmt, hist);
>
> -  // Total counter
> +  /* Total counter */
>hist->hvalue.counters[0] = total;
> -  // Number of value/counter pairs
> +  /* Number of value/counter pairs */
>hist->hvalue.counters[1] = 1;
> -  // Value
> +  /* Value */
>hist->hvalue.counters[2] = direct_call->profile_id;
> -  // Counter
> +  /* Counter */
>hist->hvalue.counters[3] = max_iter->second;
>
>if (!transform)
> -return;
> +return false;
> +
> +  cgraph_node* current_function_node = cgraph_node::get 
> (current_function_decl);
> +
> +  /* If the direct call is a recursive call, don't promote it since
> + we are not set up to inline recursive calls at this stage. */
> +  if (direct_call == current_function_node)
> +return false;
>
>struct cgraph_edge *indirect_edge
> -  = cgraph_node::get (current_function_decl)->get_edge (stmt);
> +  = current_function_node->get_edge (stmt);
>
>if (dump_file)
>  {
> @@ -1040,13 +1047,13 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
> icall_target_map &map,
>  {
>if (dump_file)
>  fprintf (dump_file, " not transforming\n");
> -  return;
> +  return false;
>  }
>if (DECL_STRUCT_FUNCTION (direct_call->decl) == NULL)
>  {
>if (dump_file)
>  fprintf (dump_file, " no declaration\n");
> -  return;
> +  return false;
>  }
>
>if (dump_file)
> @@ -1063,16 +1070,17 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
> icall_target_map &map,
>cgraph_edge::redirect_call_stmt_to_callee (new_edge);
>gimple_remove_histogram_value (cfun, stmt, hist);
>inline_call (new_edge, true, NULL, NULL, false);
> +  return true;
>  }
>
>  /* From AutoFDO profiles, find values inside STMT for that we want to measure
> histograms and adds them to list VALUES.  */
>
> -static void
> +static bool
>  afdo_vpt (gimple_stmt_iterator *gsi, const icall_target_map &map,
>bool transform)
>  {
> -  afdo_indirect_call (gsi, map, transform);
> +  return afdo_indirect_call (gsi, map, transform);
>  }
>
>  typedef std::set<basic_block> bb_set;
> @@ -1498,8 +1506,8 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmts)
>{
>  /* Promote the indirect call and update the promoted_stmts.  */
>  promoted_stmts

Re: [PATCH V2] tree-optimization/104288 - Register non-null side effects properly.

2022-02-09 Thread Richard Biener via Gcc-patches
On Tue, Feb 8, 2022 at 9:58 PM Andrew MacLeod  wrote:
>
> On 2/8/22 03:32, Richard Biener wrote:
> > On Tue, Feb 8, 2022 at 2:33 AM Andrew MacLeod via Gcc-patches
> >  wrote:
> >> On 2/7/22 09:29, Andrew MacLeod wrote:
> >>> I have a proposal for PR 104288.
> >>>
> >>> ALL patches (in sequence) bootstrap on their own and each cause no
> >>> regressions.
> >> I've been continuing to work with this towards the GCC13 solution for
> >> statement side effects.  And there is another possibility we could
> >> pursue which is less intrusive.
> >>
> >> I adapted the portions of patch 2/4 which process nonnull, but changed
> >> them from being in the nonnull class to being in the cache.
> >>
> >> The main difference is it continues to use a single bit, just changing
> >> all uses of it to *always* mean it's non-null on exit to the block.
> >>
> >> Range-on-entry is changed to only check dominator blocks, and
> >> range_of_expr doesn't check the bit at all, in both ranger and the cache.
> >>
> >> When we are doing the block walk and process side effects, the on-entry
> >> *cache* range for the block is adjusted to be non-null instead.   The
> >> statements which precede this stmt have already been processed, and all
> >> remaining statements in this block will now pick up this new non-null
> >> value from the on-entry cache.  This should be perfectly safe.
> >>
> >> The final tweak is when range_of_expr is looking at the def block, it
> >> normally does not have an on-entry cache value.  (the def is in the
> >> block, so there is no entry cache value).  It now checks to see if we
> >> have set one, which can only happen when we have been doing the
> >> statement walk and set the on-entry cache with a non-null value.  This
> >> allows us to pick up the non-null range in the def block too... (once we
> >> have passed a statement which set nonnull of course).
> >>
> >> This patch provides much less change to trunk, and is probably a better
> >> approach as it's closer to what is going to happen in GCC13.
> >>
> >> I'm still working with it, so the patch isn't fully refined yet... but it
> >> bootstraps and passes all tests with no regressions.  And passes the new
> >> tests too.   I figured I would mention this before you look too hard at
> >> the other patchset.  The GCC11 patch doesn't change.
> >>
> >> Let me know if you prefer this; I think I do :-)  less churn, and
> >> closer to end state.
> > Yes, I very much prefer this - some comments to the other patches
> > still apply to this one.  Like using get_nonnull_args and probably
> > adding a bail-out to computing ranges from stmts that can throw
> > internally or have abnormal control flow (to not get into range-on-exit
> > vs. normal vs. exceptional or abnormal edges).
> >
> > Richard.
>
> with some minor performance tweaks, such as moving adjust_range() to the
> header so it can be inlined.
>
> range-on-edge now applies the non-null from the src block if
> appropriate, not in range-on-exit.   That should resolve the internal
> throwing statements I think.  And I have switched over to
> get_nonnull_args().
>
> Bootstraps on build-x86_64-pc-linux-gnu and passes all regressions.
>
> OK for trunk, or did I miss something?

OK.

I do think there's some confusion about -fnon-call-exceptions.  The
comment

+  // Non-call exceptions mean we could throw in the middle of the
+  // block, so just punt on those for now.

also applies to regular exceptions, non-call vs. call EH just
adds the possibility of non-call stmts to throw.  I do not think that
value-range propagation needs to care about stmts that throw
in the middle of a block (which automatically means no EH edges
and thus !stmt_can_throw_internal).  When there are EH edges
then restrictions apply to both calls and non-calls that throw.

So whatever made those cfun->can_throw_non_call_exceptions
necessary should have shown a more general issue with EH.

That's something to look at, but not in the scope of this fix.

Richard.

> Andrew
>
> PS. odd.. I haven't seen the git diff be wrong before, but it shows the
> ranger_cache::range_on_edge changes as being in
> gimple_cache::range_of_expr...  They most definitely are not,
> and it applies/de-applies fine.. so it's just an oddity I guess.
>
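
As a reminder of what a non-null side effect means at the source level,
here is a small hand-written illustration.  It is not one of the tests
from the patch, and the function names are invented for this sketch.
Whether a particular GCC version actually removes the dead test depends
on the release and on flags such as -fdelete-null-pointer-checks.

```c
#include <stddef.h>

/* The store dereferences P, so P is known to be non-null for every
   statement that follows it in the block.  */
int
store_then_check (int *p)
{
  *p = 42;          /* side effect: p is non-null from here on */
  if (p == NULL)    /* candidate for removal after the store */
    return -1;
  return *p;
}

/* Here the test comes first, so nothing is known about P yet and the
   comparison has to be kept.  */
int
check_then_store (int *p)
{
  if (p == NULL)
    return -1;
  *p = 42;
  return *p;
}
```

The point of the thread is that the non-null fact only applies to
statements after the dereference, not on entry to the block.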


[PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The 3 ubsan -fsanitize={null,{,returns-}nonnull-attribute} sanitizers were
setting implicitly -fno-delete-null-pointer-checks, so that
optimizations don't optimize away its checks whether some pointers are NULL.
Unfortunately by doing that there is no way to find out if
flag_delete_null_pointer_checks is 0 or 1 because the target usually allows
variables or functions at the start of address space, or user asked for
that option or whether it was because of sanitization or a combination of
those.

Unfortunately Variable in *.opt files can't be PerFunction or Optimization,
so there is no easy way to have some global_options.x_* be combined from
other Optimization flags.

So, the following patch instead introduces an inline function that should be
used in most places, and for the folding_initializer case ignores
sanitization and honors just target's or user
-fno-delete-null-pointer-checks.
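
The body of the new inline function is not visible in this excerpt, so
the following is only a guess at its shape, written as a standalone toy.
The toy_* names and bit values are hypothetical and are not GCC's
flag_sanitize or SANITIZE_* definitions.  The idea is that most callers
treat null pointer checks as deletable only when the flag is set and
none of the three null-related sanitizers is active, while the
folding_initializer case looks at the flag alone.

```c
#include <stdbool.h>

/* Hypothetical bit values; not GCC's SANITIZE_* enumerators.  */
enum
{
  TOY_SANITIZE_NULL = 1 << 0,
  TOY_SANITIZE_NONNULL_ATTRIBUTE = 1 << 1,
  TOY_SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1 << 2
};

/* Toy model of the two option variables involved.  */
struct toy_options
{
  bool flag_delete_null_pointer_checks;  /* target or user setting */
  unsigned flag_sanitize;                /* active sanitizer bits */
};

/* General case: sanitization vetoes deleting null pointer checks.  */
bool
toy_delete_null_pointer_checks (const struct toy_options *opts)
{
  const unsigned null_sanitizers = TOY_SANITIZE_NULL
                                   | TOY_SANITIZE_NONNULL_ATTRIBUTE
                                   | TOY_SANITIZE_RETURNS_NONNULL_ATTRIBUTE;
  return opts->flag_delete_null_pointer_checks
         && (opts->flag_sanitize & null_sanitizers) == 0;
}

/* folding_initializer case: only the target's or user's flag counts.  */
bool
toy_delete_null_pointer_checks_for_initializer (const struct toy_options *opts)
{
  return opts->flag_delete_null_pointer_checks;
}
```

Keeping the flag itself untouched is what lets the initializer-folding
path tell a target or user request apart from sanitization.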

Another possibility would be to invert the meaning of the variable,
change flag_delete_null_pointer_checks into
flag_dont_delete_null_pointer_checks and use separate values for the var
originating from command line or target's decision (e.g. 1) and from
sanitization (e.g. 2), but it would be a small nightmare to encode that
into *.opt.

A separate issue not solved in this patch is whether addresses of automatic
variables can be assumed to be non-NULL even in the
-fno-delete-null-pointer-checks case (whether no target actually places its
stack at the very start of the address space (especially growing up)).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-08  Jakub Jelinek  

PR c++/67762
PR c++/71962
PR c++/104426
gcc/
* asan.h (delete_null_pointer_checks): New function.
* opts.cc (finish_options): Don't clear
opts->x_flag_delete_null_pointer_checks for -fsanitize=null,
-fsanitize=nonnull-attribute and/or
-fsanitize=returns-nonnull-attribute.
* tree.cc: Include asan.h.
(nonnull_arg_p): Use delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* range-op.cc: Include attribs.h and asan.h.
(pointer_plus_operator::wi_fold): Use delete_null_pointer_checks ()
instead of flag_delete_null_pointer_checks.
* gimple-range-fold.cc: Include attribs.h and asan.h.
(fold_using_range::range_of_address): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* tree-ssa-structalias.cc: Include asan.h.
(get_constraint_for_1, find_func_aliases_for_builtin_call): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* symtab.cc: Include asan.h.
(symtab_node::nonzero_address): If !folding_initializer, use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* ipa-fnsummary.cc: Include asan.h.
(points_to_local_or_readonly_memory_p): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* tree-ssa-alias.cc: Include attribs.h and asan.h.
(modref_may_conflict): Use delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* ubsan.cc (instrument_nonnull_arg, instrument_nonnull_return):
Instead of temporarily setting flag_delete_null_pointer_checks
temporarily clear SANITIZE_NULL, SANITIZE_NONNULL_ATTRIBUTE and
SANITIZE_RETURNS_NONNULL_ATTRIBUTE bits from flag_sanitize.
* tree-vrp.cc: Include attribs.h and asan.h.
(extract_range_from_pointer_plus_expr): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* vr-values.cc: Include asan.h.
(vr_values::vrp_stmt_computes_nonzero): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* tree-ssa-loop-niter.cc: Include attribs.h and asan.h.
(infer_loop_bounds_from_pointer_arith): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* ipa-pure-const.cc: Include attribs.h and asan.h.
(malloc_candidate_p): Use delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* gimple.cc (gimple_call_nonnull_result_p,
infer_nonnull_range_by_dereference, infer_nonnull_range_by_attribute):
Use delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* fold-const.cc (tree_expr_nonzero_warnv_p): Use
delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
* rtlanal.cc: Include stringpool.h, attribs.h and asan.h.
(nonzero_address_p): Use delete_null_pointer_checks () instead of
flag_delete_null_pointer_checks.
gcc/c-family/
* c-ubsan.cc (ubsan_maybe_instrument_reference_or_call): Instead of
temporarily setting flag_delete_null_pointer_che

[PATCH] c: Fix up __builtin_assoc_barrier handling in the C FE [PR104427]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because when creating PAREN_EXPR for
__builtin_assoc_barrier the FE doesn't do the usual tweaks for
EXCESS_PRECISION_EXPR or C_MAYBE_CONST_EXPR.  I believe that the
declared effect of the builtin is just association barrier, so
e.g. excess precision should still be handled as if it weren't
there.

The following patch uses build_unary_op to handle those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-08  Jakub Jelinek  

PR c/104427
* c-parser.cc (c_parser_postfix_expression)
<case RID_BUILTIN_ASSOC_BARRIER>: Use parser_build_unary_op
instead of build1_loc to build PAREN_EXPR.
* c-typeck.cc (build_unary_op): Handle PAREN_EXPR.
* c-fold.cc (c_fully_fold_internal): Likewise.

* gcc.dg/pr104427.c: New test.

--- gcc/c/c-parser.cc.jj  2022-01-18 11:58:58.927991414 +0100
+++ gcc/c/c-parser.cc   2022-02-08 12:56:00.028208794 +0100
@@ -10128,8 +10128,7 @@ c_parser_postfix_expression (c_parser *p
mark_exp_read (e1.value);
location_t end_loc = c_parser_peek_token (parser)->get_finish ();
parens.skip_until_found_close (parser);
-   expr.value = build1_loc (loc, PAREN_EXPR, TREE_TYPE (e1.value),
-e1.value);
+   expr = parser_build_unary_op (loc, PAREN_EXPR, e1);
set_c_expr_source_range (&expr, start_loc, end_loc);
  }
  break;
--- gcc/c/c-typeck.cc.jj  2022-01-18 11:58:58.929991386 +0100
+++ gcc/c/c-typeck.cc   2022-02-08 12:53:16.026485823 +0100
@@ -4921,6 +4921,10 @@ build_unary_op (location_t location, enu
   ret = val;
   goto return_build_unary_op;
 
+case PAREN_EXPR:
+  ret = build1 (code, TREE_TYPE (arg), arg);
+  goto return_build_unary_op;
+
 default:
   gcc_unreachable ();
 }
--- gcc/c/c-fold.cc.jj  2022-01-18 11:58:58.925991443 +0100
+++ gcc/c/c-fold.cc 2022-02-08 12:37:46.315411262 +0100
@@ -465,6 +465,7 @@ c_fully_fold_internal (tree expr, bool i
 case BIT_NOT_EXPR:
 case TRUTH_NOT_EXPR:
 case CONJ_EXPR:
+case PAREN_EXPR:
 unary:
   /* Unary operations.  */
   orig_op0 = op0 = TREE_OPERAND (expr, 0);
--- gcc/testsuite/gcc.dg/pr104427.c.jj  2022-02-08 13:08:14.054022838 +0100
+++ gcc/testsuite/gcc.dg/pr104427.c 2022-02-08 13:08:41.541641923 +0100
@@ -0,0 +1,13 @@
+/* PR c/104427 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+/* { dg-add-options float16 } */
+/* { dg-require-effective-target float16 } */
+
+_Float16 x, y;
+
+int
+foo ()
+{
+  return __builtin_assoc_barrier (x + y) - y;
+}

Jakub



Re: [PATCH, V2] Use system default for long double if not specified on PowerPC.

2022-02-09 Thread Segher Boessenkool
Hi Andreas,

On Tue, Feb 08, 2022 at 06:36:57PM +0100, Andreas Schwab wrote:
> On Feb 08 2022, Peter Bergner wrote:
> > Can you please clarify one thing for me.  Do you think it's possible
> > that we can come up with some type of configure patch that automatically
> > sets the long double default given something on the system we can test
> > for or do you think that's impossible and we'll just have to live with
> > explicitly using a configure option to set the default?
> 
> It should be handled the same as the double->long double switch.

So how was that done?  It's not something I can find online ("long
double conversion" does not find it, heh).  Was it just a flag day where
the default was changed?

IMO it is a bad idea to change configuration without the user asking for
it (and even doing that silently).  It is hard enough and painful enough
to do this conversion in the first place, we do not need twice as many
user configurations (that are not even shown by gcc -v, etc.)

Testing is hard.  Testing twice as many configurations is twice as hard.
The only outcome that can be reasonably expected is that at least half
of the new configurations will not be tested at all.  And that is very
expensive, because there will be a lot of wasted work and frustration
for whoever gets to handle bug reports, and ditto for whoever gets to
deal with a bug that was not handled (aka, a worse user experience).

The apparent goal here is to give users compilers that default to IEEE
QP long double earlier.  That is a fine goal, but it should be achieved
by actually changing the default earlier, not by leaving behind a large
fraction of users!


Segher


Re: [PATCH] PR target/102059 Fix inline of target specific functions

2022-02-09 Thread Kewen.Lin via Gcc-patches
Hi Michael,

on 2022/2/9 11:27 AM, Michael Meissner via Gcc-patches wrote:
> Reset -mpower8-fusion for power9 inlining power8 functions, PR 102059.
> 
> This patch is an attempt to make a much simpler patch to fix PR target/102059
> than the previous patch.
> 
> It just fixes the issue that if a function is specifically declared as a
> power8 function, you can't inline it in functions that are specified with
> power9 or power10 options.
> 
> The issue is -mpower8-fusion is cleared when you use -mcpu=power9 or
> -mcpu=power10.  When I wrote the code for controlling which function can
> inline other functions, -mpower8-fusion was set for -mcpu=power9
> (-mcpu=power10 was not an option at that time).  This patch fixes this
> particular problem.
> 
> Perhaps -mpower8-fusion should go away in the GCC 13 time frame.  This patch
> just goes in and resets the fusion bit for testing inlines.
> 
> I have built a bootstrapped little endian compiler on power9 and the tests had
> no regressions.
> 
> I have built a bootstrapped big endian compiler on power8 and I tested both
> 32-bit and 64-bit builds, and there were no regressions.
> 
> Can I install this into the trunk and back port it into GCC 11 after a burn-in
> period?
> 

Thanks for the patch!  I guess we also need this for GCC 10 as:

$ cat htm.c

__attribute__((always_inline)) int foo(int *b) {
  *b += 10;
  return *b;
}

#pragma GCC target "cpu=power10,htm"
int bar(int* a){
  *a = foo(a);
  return 0;
}

/opt/at14.0/bin/gcc -flto -S htm.c -mcpu=power8
htm.c:1:36: warning: ‘always_inline’ function might not be inlinable [-Wattributes]
1 | __attribute__((always_inline)) int foo(int *b) {
  |^~~
htm.c: In function ‘bar’:
htm.c:1:36: error: inlining failed in call to ‘always_inline’ ‘foo’: target specific option mismatch
htm.c:8:8: note: called from here
8 |   *a = foo(a);
  |^~

Besides, as I noted in the PR, with this fix we can safely remove the option
"-mno-power8-fusion" in gcc/testsuite/gcc.dg/lto/pr102059-1_0.c, which has
the coverage for lto (though I didn't test ;-)).

BR,
Kewen

> 2022-02-08   Michael Meissner  
> 
> gcc/
> 
>   PR target/102059
>   * config/rs6000/rs6000.cc (rs6000_can_inline_p): Don't test for
>   power8-fusion if the caller was power9 or power10.
> 
> gcc/testsuite/
>   PR target/102059
>   * gcc.target/powerpc/pr102059-4.c: New test.
> ---
>  gcc/config/rs6000/rs6000.cc   |  8 
>  gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 20 +++
>  2 files changed, 28 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index eaba9a2d698..e2d94421ae9 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -25278,6 +25278,14 @@ rs6000_can_inline_p (tree caller, tree callee)
> callee_isa &= ~OPTION_MASK_HTM;
> explicit_isa &= ~OPTION_MASK_HTM;
>   }
> +
> +   /* Power9 and power10 do not set power8-fusion.  If the callee was
> +  explicitly compiled for power8, and the caller was power9 or
> +  power10, ignore the power8-fusion bits if it was set by
> +  default.  */
> +   if ((caller_isa & OPTION_MASK_P8_FUSION) == 0
> +   && (explicit_isa & OPTION_MASK_P8_FUSION) == 0)
> + callee_isa &= ~OPTION_MASK_P8_FUSION;
>   }
>  
>/* The callee's options must be a subset of the caller's options, i.e.
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c 
> b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> new file mode 100644
> index 000..627c6f820c7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -Wno-attributes" } */
> +
> +/* Verify that power10 can explicity include functions compiled for power8.
> +   The issue is -mcpu=power8 enables -mpower8-fusion, but -mcpu=power9 or
> +   -mcpu=power10 do not set power8-fusion by default.  */
> +
> +static inline int __attribute__ ((always_inline,target("cpu=power8")))
> +foo (int *b)
> +{
> +  *b += 10;
> +  return *b;
> +}
> +
> +int
> +bar (int *a)
> +{
> +  *a = foo (a);
> +  return 0;
> +}
>


Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> The 3 ubsan -fsanitize={null,{,returns-}nonnull-attribute} sanitizers were
> setting implicitly -fno-delete-null-pointer-checks, so that
> optimizations don't optimize away its checks whether some pointers are NULL.
> Unfortunately by doing that there is no way to find out if
> flag_delete_null_pointer_checks is 0 or 1 because the target usually allows
> variables or functions at the start of address space, or user asked for
> that option or whether it was because of sanitization or a combination of
> those.
> 
> Unfortunately Variable in *.opt files can't be PerFunction or Optimization,
> so there is no easy way to have some global_options.x_* be combined from
> other Optimization flags.
> 
> So, the following patch instead introduces an inline function that should be
> used in most places, and for the folding_initializer case ignores
> sanitization and honors just target's or user
> -fno-delete-null-pointer-checks.
> 
> Another possibility would be to invert the meaning of the variable,
> change flag_delete_null_pointer_checks into
> flag_dont_delete_null_pointer_checks and use separate values for the var
> originating from command line or target's decision (e.g. 1) and from

I see there are still targets doing something like

static void
msp430_option_override (void)
{
  /* The MSP430 architecture can safely dereference a NULL pointer.  In fact,
     there are memory mapped registers there.  */
  flag_delete_null_pointer_checks = 0;

it would be very nice to instead have those override the 
zero_address_valid target hook (only i386.cc does that at the moment).
It probably doesn't remove much of the complication.

Then at the same time you add the delete_null_pointer_checks ()
abstraction we should add a zero_address_valid (type/as) abstraction
in case we want -fno-delete-null-pointer-checks to be a user-accessible
way to override the target here.

> sanitization (e.g. 2), but it would be a small nightmare to encode that
> into *.opt.
> 
> A separate issue not solved in this patch is whether addresses of automatic
> variables can be assumed to be non-NULL even in the
> -fno-delete-null-pointer-checks case (whether no target actually places its
> stack at the very start of the address space (especially growing up)).

True - but not sure if worth optimizing ... (the zero_address_valid
could get a decl overload as well).

Did you replace all flag_delete_null_pointer_checks uses?  Can you
rename the flag just to be sure?

Thanks,
Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-02-08  Jakub Jelinek  
> 
>   PR c++/67762
>   PR c++/71962
>   PR c++/104426
> gcc/
>   * asan.h (delete_null_pointer_checks): New function.
>   * opts.cc (finish_options): Don't clear
>   opts->x_flag_delete_null_pointer_checks for -fsanitize=null,
>   -fsanitize=nonnull-attribute and/or
>   -fsanitize=returns-nonnull-attribute.
>   * tree.cc: Include asan.h.
>   (nonnull_arg_p): Use delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * range-op.cc: Include attribs.h and asan.h.
>   (pointer_plus_operator::wi_fold): Use delete_null_pointer_checks ()
>   instead of flag_delete_null_pointer_checks.
>   * gimple-range-fold.cc: Include attribs.h and asan.h.
>   (fold_using_range::range_of_address): Use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * tree-ssa-structalias.cc: Include asan.h.
>   (get_constraint_for_1, find_func_aliases_for_builtin_call): Use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * symtab.cc: Include asan.h.
>   (symtab_node::nonzero_address): If !folding_initializer, use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * ipa-fnsummary.cc: Include asan.h.
>   (points_to_local_or_readonly_memory_p): Use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * tree-ssa-alias.cc: Include attribs.h and asan.h.
>   (modref_may_conflict): Use delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * ubsan.cc (instrument_nonnull_arg, instrument_nonnull_return):
>   Instead of temporarily setting flag_delete_null_pointer_checks
>   temporarily clear SANITIZE_NULL, SANITIZE_NONNULL_ATTRIBUTE and
>   SANITIZE_RETURNS_NONNULL_ATTRIBUTE bits from flag_sanitize.
>   * tree-vrp.cc: Include attribs.h and asan.h.
>   (extract_range_from_pointer_plus_expr): Use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * vr-values.cc: Include asan.h.
>   (vr_values::vrp_stmt_computes_nonzero): Use
>   delete_null_pointer_checks () instead of
>   flag_delete_null_pointer_checks.
>   * tree-ssa-loop-niter.cc: Include attribs.h and asan.h

[PATCH] [i386] ICE: QImode(not SImode) operand should be passed to gen_vec_initv16qiqi in ashlv16qi3.

2022-02-09 Thread liuhongt via Gcc-patches
ix86_expand_vector_init expects vals to be a parallel containing
values of individual fields which should be either element mode of the
vector mode, or a vector mode with the same element mode and smaller
number of elements.

But in the expander ashlv16qi3, the second operand is SImode which
can't be directly passed to gen_vec_initv16qiqi.

Bootstrapped on CLX
Regtested on x86_64-pc-linux-gnu{-m32\ -mxop\ -mavx2,\ -mxop\ -mavx2}.
I don't have a machine with XOP for a native bootstrap, but I think the fix
should be OK.

Ok for trunk?

gcc/ChangeLog:

PR target/104451
* config/i386/sse.md (<insn>v16qi3): lowpart_subreg
operands[2] from SImode to QImode.

gcc/testsuite/ChangeLog:

PR target/104451
* gcc.target/i386/pr104451.c: New test.
---
 gcc/config/i386/sse.md   |  3 ++-
 gcc/testsuite/gcc.target/i386/pr104451.c | 25 
 2 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104451.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d8cb7b65594..36b35f68349 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -24153,8 +24153,9 @@ (define_expand "3"
negate = true;
}
   par = gen_rtx_PARALLEL (V16QImode, rtvec_alloc (16));
+  tmp = lowpart_subreg (QImode, operands[2], SImode);
   for (i = 0; i < 16; i++)
-XVECEXP (par, 0, i) = operands[2];
+   XVECEXP (par, 0, i) = tmp;
 
   tmp = gen_reg_rtx (V16QImode);
   emit_insn (gen_vec_initv16qiqi (tmp, par));
diff --git a/gcc/testsuite/gcc.target/i386/pr104451.c b/gcc/testsuite/gcc.target/i386/pr104451.c
new file mode 100644
index 000..22f3ad092b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104451.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -O" } */
+
+typedef char __attribute__((__vector_size__ (16))) V;
+typedef unsigned char __attribute__((__vector_size__ (16))) UV;
+V v;
+UV uv;
+
+V
+foo (long c)
+{
+  return v << c;
+}
+
+V
+foo1 (long c)
+{
+  return v >> c;
+}
+
+UV
+foo2 (unsigned long uc)
+{
+  return uv >> uc;
+}
-- 
2.18.1



[PATCH] target/104453 - guard call folding with NULL LHS

2022-02-09 Thread Richard Biener via Gcc-patches
This guards shift builtin folding to do nothing when there is
no LHS, similar to what other foldings do.
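
To make the guard concrete, here is a small standalone C sketch of the same
pattern.  It does not use GCC's gimple API; struct call_stmt and
fold_shift_call are hypothetical stand-ins, and the only point is that the
folder returns early when the call result is unused.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a call statement: LHS is NULL when the
   result of the call is unused.  */
struct call_stmt
{
  unsigned *lhs;
  unsigned arg0;
  unsigned arg1;
};

/* Try to fold a logical right shift into its result.  Returns true if
   the call was folded, false if it was left alone.  */
static bool
fold_shift_call (struct call_stmt *stmt)
{
  /* Nothing to replace when the result is unused, so bail out instead
     of dereferencing a null LHS.  */
  if (stmt->lhs == NULL)
    return false;

  /* Leave out-of-range shift counts alone in this sketch.  */
  if (stmt->arg1 >= 32)
    return false;

  *stmt->lhs = stmt->arg0 >> stmt->arg1;
  return true;
}
```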

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed as obvious.

2022-02-09  Richard Biener  

PR target/104453
* config/i386/i386.cc (ix86_gimple_fold_builtin): Guard shift
folding for NULL LHS.

* gcc.target/i386/pr104453.c: New testcase.
---
 gcc/config/i386/i386.cc  |  2 ++
 gcc/testsuite/gcc.target/i386/pr104453.c | 11 +++
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104453.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index dd5584fb8ed..448c079c7ac 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -18642,6 +18642,8 @@ ix86_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 
 do_shift:
   gcc_assert (n_args >= 2);
+  if (!gimple_call_lhs (stmt))
+   break;
   arg0 = gimple_call_arg (stmt, 0);
   arg1 = gimple_call_arg (stmt, 1);
   elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
diff --git a/gcc/testsuite/gcc.target/i386/pr104453.c b/gcc/testsuite/gcc.target/i386/pr104453.c
new file mode 100644
index 000..325cedf0e2c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104453.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f" } */
+
+typedef short __attribute__((__vector_size__ (32))) V;
+V g;
+
+void
+foo (void)
+{
+  __builtin_ia32_psrawi256 (g, 0);
+}
-- 
2.34.1


[PATCH] middle-end/104450 - ISEL and non-call EH

2022-02-09 Thread Richard Biener via Gcc-patches
The following avoids merging a vector compare that has EH with a
VEC_COND_EXPR.  We should be able to do fallback expansion, and if
we really want the optimization we need quite some shuffling
to arrange for the proper EH redirection in all cases, which is IMHO
not worth it.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk
so far.

2022-02-09  Richard Biener  

PR middle-end/104450
* gimple-isel.cc: Pass cfun around.
(gimple_expand_vec_cond_expr): Do not combine a throwing
comparison with the select.

* g++.dg/torture/pr104450.C: New testcase.
---
 gcc/gimple-isel.cc  | 24 ++--
 gcc/testsuite/g++.dg/torture/pr104450.C | 16 
 2 files changed, 30 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr104450.C

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 3d4f02c7fec..1d93766b704 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -50,7 +50,7 @@ along with GCC; see the file COPYING3.  If not see
  u = _8;  */
 
 static gimple *
-gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
+gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
 {
   enum tree_code code;
   gcall *new_stmt = NULL;
@@ -76,7 +76,7 @@ gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
   tree pos = TREE_OPERAND (lhs, 1);
   tree view_op0 = TREE_OPERAND (op0, 0);
   machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
-  if (auto_var_in_fn_p (view_op0, cfun->decl)
+  if (auto_var_in_fn_p (view_op0, fun->decl)
  && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
{
  location_t loc = gimple_location (stmt);
@@ -110,7 +110,7 @@ gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
function based on type of selected expansion.  */
 
 static gimple *
-gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
+gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
 hash_map 
*vec_cond_ssa_name_uses)
 {
   tree lhs, op0a = NULL_TREE, op0b = NULL_TREE;
@@ -178,7 +178,11 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
}
 
   gassign *def_stmt = dyn_cast (SSA_NAME_DEF_STMT (op0));
-  if (def_stmt)
+  if (def_stmt
+ /* When the compare has EH we do not want to forward it when
+it has multiple uses and in general because of the complication
+with EH redirection.  */
+ && !stmt_can_throw_internal (fun, def_stmt))
{
  tcode = gimple_assign_rhs_code (def_stmt);
  op0a = gimple_assign_rhs1 (def_stmt);
@@ -279,18 +283,18 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
VEC_COND_EXPR assignments.  */
 
 static unsigned int
-gimple_expand_vec_exprs (void)
+gimple_expand_vec_exprs (struct function *fun)
 {
   gimple_stmt_iterator gsi;
   basic_block bb;
   hash_map vec_cond_ssa_name_uses;
   auto_bitmap dce_ssa_names;
 
-  FOR_EACH_BB_FN (bb, cfun)
+  FOR_EACH_BB_FN (bb, fun)
 {
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
- gimple *g = gimple_expand_vec_cond_expr (&gsi,
+ gimple *g = gimple_expand_vec_cond_expr (fun, &gsi,
   &vec_cond_ssa_name_uses);
  if (g != NULL)
{
@@ -299,7 +303,7 @@ gimple_expand_vec_exprs (void)
  gsi_replace (&gsi, g, false);
}
 
- gimple_expand_vec_set_expr (&gsi);
+ gimple_expand_vec_set_expr (fun, &gsi);
  if (gsi_end_p (gsi))
break;
}
@@ -342,9 +346,9 @@ public:
   return true;
 }
 
-  virtual unsigned int execute (function *)
+  virtual unsigned int execute (function *fun)
 {
-  return gimple_expand_vec_exprs ();
+  return gimple_expand_vec_exprs (fun);
 }
 
 }; // class pass_gimple_isel
diff --git a/gcc/testsuite/g++.dg/torture/pr104450.C b/gcc/testsuite/g++.dg/torture/pr104450.C
new file mode 100644
index 000..402a4849e54
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr104450.C
@@ -0,0 +1,16 @@
+// { dg-do compile }
+// { dg-additional-options "-fnon-call-exceptions" }
+// { dg-additional-options "-mavx512f" { target x86_64-*-* i?86-*-* } }
+
+#define vectsize 64
+typedef int __attribute__((__vector_size__ (vectsize))) V;
+typedef float __attribute__((__vector_size__ (vectsize))) F;
+F f;
+V v;
+struct g{~g();};
+void
+foo (void)
+{
+  g t;
+  v += (V) (0 <= f);
+}
-- 
2.34.1


Re: [PATCH] [i386] ICE: QImode(not SImode) operand should be passed to gen_vec_initv16qiqi in ashlv16qi3.

2022-02-09 Thread Uros Bizjak via Gcc-patches
On Wed, Feb 9, 2022 at 10:03 AM liuhongt  wrote:
>
> ix86_expand_vector_init expects vals to be a parallel containing
> values of individual fields which should be either element mode of the
> vector mode, or a vector mode with the same element mode and smaller
> number of elements.
>
> But in the expander ashlv16qi3, the second operand is SImode which
> can't be directly passed to gen_vec_initv16qiqi.
>
> Bootstrapped on CLX
> Regtested on x86_64-pc-linux-gnu{-m32\ -mxop\ -mavx2,\ -mxop\ -mavx2}.
> Don't have machine with xop for native bootstrap, but i think the fix
> should be ok.
>
> Ok for trunk?

Please add -mxop to dg-options of the testcase.

OK with the above change.

Thanks,
Uros.

>
> gcc/ChangeLog:
>
> PR target/104451
> * config/i386/sse.md (3): lowpart_subreg
> operands[2] from SImode to QImode.
>
> gcc/testsuite/ChangeLog:
>
> PR target/104451
> * gcc.target/i386/pr104451.c: New test.
> ---
>  gcc/config/i386/sse.md   |  3 ++-
>  gcc/testsuite/gcc.target/i386/pr104451.c | 25 
>  2 files changed, 27 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104451.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index d8cb7b65594..36b35f68349 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -24153,8 +24153,9 @@ (define_expand "3"
> negate = true;
> }
>par = gen_rtx_PARALLEL (V16QImode, rtvec_alloc (16));
> +  tmp = lowpart_subreg (QImode, operands[2], SImode);
>for (i = 0; i < 16; i++)
> -XVECEXP (par, 0, i) = operands[2];
> +   XVECEXP (par, 0, i) = tmp;
>
>tmp = gen_reg_rtx (V16QImode);
>emit_insn (gen_vec_initv16qiqi (tmp, par));
> diff --git a/gcc/testsuite/gcc.target/i386/pr104451.c b/gcc/testsuite/gcc.target/i386/pr104451.c
> new file mode 100644
> index 000..22f3ad092b3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104451.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx2 -O" } */
> +
> +typedef char __attribute__((__vector_size__ (16))) V;
> +typedef unsigned char __attribute__((__vector_size__ (16))) UV;
> +V v;
> +UV uv;
> +
> +V
> +foo (long c)
> +{
> +  return v << c;
> +}
> +
> +V
> +foo1 (long c)
> +{
> +  return v >> c;
> +}
> +
> +UV
> +foo2 (unsigned long uc)
> +{
> +  return uv >> uc;
> +}
> --
> 2.18.1
>


Re: [PATCH, V2] Use system default for long double if not specified on PowerPC.

2022-02-09 Thread Andreas Schwab
On Feb 09 2022, Segher Boessenkool wrote:

> Hi Andreas,
>
> On Tue, Feb 08, 2022 at 06:36:57PM +0100, Andreas Schwab wrote:
>> On Feb 08 2022, Peter Bergner wrote:
>> > Can you please clarify one thing for me.  Do you think it's possible
>> > that we can come up with some type of configure patch that automatically
>> > sets the long double default given something on the system we can test
>> > for or do you think that's impossible and we'll just have to live with
>> > explicitly using a configure option to set the default?
>> 
>> It should be handled the same as the double->long double switch.
>
> So how was that done?

I think commit ed965309dad added that.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 09, 2022 at 10:03:02AM +0100, Richard Biener wrote:
> I see there are still targets doing sth like
> 
> static void
> msp430_option_override (void)
> {
>   /* The MSP430 architecture can safely dereference a NULL pointer.  In fact,
>      there are memory mapped registers there.  */
>   flag_delete_null_pointer_checks = 0;

Sure, that is the typical embedded target case.
The user option covers users who use a typically non-embedded target in
an embedded way; even on x86_64-linux one can mmap something at address 0
if one tweaks some kernel config parameters.

I guess I could change
default_addr_space_zero_address_valid from return false; to
  if (!flag_delete_null_pointer_checks)
    return true;
  if (folding_initializer)
    return false;
  if (current_function_decl
      && sanitize_flags_p (SANITIZE_NULL | SANITIZE_NONNULL_ATTRIBUTE
                           | SANITIZE_RETURNS_NONNULL_ATTRIBUTE,
                           current_function_decl))
    return true;
  return false;
and change the i386 one to also call the default version.
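
For readers who want to see the order of the checks in isolation, here is a
standalone C sketch of that proposal.  It is only a model: struct
null_check_state and its fields are hypothetical replacements for GCC's
global flags, and zero_address_valid is not the real hook.

```c
#include <stdbool.h>

/* Hypothetical inputs standing in for GCC's global state.  */
struct null_check_state
{
  bool delete_null_pointer_checks;  /* -fdelete-null-pointer-checks */
  bool folding_initializer;         /* folding a static initializer */
  bool in_function;                 /* there is a current function */
  bool null_sanitizers_enabled;     /* null/nonnull sanitizers active */
};

/* Model of the proposed default: address zero is treated as valid when
   null-check deletion is off, never while folding an initializer, and
   otherwise only when a null-related sanitizer is active in the
   current function.  */
static bool
zero_address_valid (const struct null_check_state *s)
{
  if (!s->delete_null_pointer_checks)
    return true;
  if (s->folding_initializer)
    return false;
  if (s->in_function && s->null_sanitizers_enabled)
    return true;
  return false;
}
```

Note that the first test wins: with null-check deletion disabled the answer
is true even while folding an initializer.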

Replacing all the flag_delete_null_pointer_checks uses
with some wrappers around the target hook will be less fun, not sure if the
pointer type will be always visible there...

> Did you replace all flag_delete_null_pointer_checks uses?

I did.

>  Can you rename the flag just to be sure?

Sure.

Jakub



[PATCH] tree-optimization/104445 - check for vector extraction support

2022-02-09 Thread Richard Biener via Gcc-patches
This adds a missing check to epilogue reduction re-use, namely
that we can do hi/lo extracts from the vector when demoting it
to the epilogue vector size.

I've chosen to add a can_vec_extract helper to optabs-query.h.
In the future we might want to simplify the vectorizer's life by
handling vector-from-vector extraction via BIT_FIELD_REFs during
RTL expansion via mode punning when the vec_extract is not
directly supported.

I'm not 100% sure we can always do the punning of the
vec_extract result to a vector mode of the same size, but then
I'm also not sure how to check for that (the vectorizer doesn't
check in the other places where it does that at the moment, but I
suppose we eventually just go through memory there)?
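
The shape of such a query can be sketched in standalone C.  This is a toy
model, not GCC code: modes are described only by total size and element
size (integer and float modes are not distinguished), the list of supported
patterns is passed in explicitly, and every name below is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified mode description.  */
struct mode
{
  unsigned size;      /* total size in bytes */
  unsigned elt_size;  /* element size in bytes; equals SIZE for scalars */
};

/* One supported extract pattern: source vector mode -> result mode.  */
struct extract_pattern
{
  struct mode from;
  struct mode to;
};

static bool
is_vector (struct mode m)
{
  return m.elt_size < m.size;
}

static bool
same_mode (struct mode a, struct mode b)
{
  return a.size == b.size && a.elt_size == b.elt_size;
}

static bool
have_pattern (const struct extract_pattern *pats, size_t n,
              struct mode from, struct mode to)
{
  for (size_t i = 0; i < n; i++)
    if (same_mode (pats[i].from, from) && same_mode (pats[i].to, to))
      return true;
  return false;
}

/* Can a part of mode EXTR be extracted from vector mode MODE?  */
static bool
can_extract (const struct extract_pattern *pats, size_t n,
             struct mode mode, struct mode extr)
{
  /* The source must be a vector whose size is a multiple of the part.  */
  if (!is_vector (mode) || extr.size == 0 || mode.size % extr.size != 0)
    return false;

  /* Directly supported extraction.  */
  if (have_pattern (pats, n, mode, extr))
    return true;

  /* Fallback: view the source as a vector of integers as wide as the
     part and extract one such integer.  Punning is assumed to work.  */
  struct mode imode = { extr.size, extr.size };
  struct mode vmode = { mode.size, extr.size };
  return have_pattern (pats, n, vmode, imode);
}
```

With only a "two 8-byte integers -> one 8-byte integer" pattern available,
this model accepts extracting an 8-byte half from a 16-byte vector of bytes
but rejects a 4-byte part, since no matching integer-element pattern exists.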

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Does this look OK?

Thanks,
Richard.

2022-02-09  Richard Biener  

PR tree-optimization/104445
PR tree-optimization/102832
* optabs-query.h (can_vec_extract): New.
* optabs-query.cc (can_vec_extract): Likewise.
* tree-vect-loop.cc (vect_find_reusable_accumulator): Check
we can extract a hi/lo part from the larger vector.

* gcc.dg/vect/pr104445.c: New testcase.
---
 gcc/optabs-query.cc  | 33 
 gcc/optabs-query.h   |  1 +
 gcc/testsuite/gcc.dg/vect/pr102832.c | 12 ++
 gcc/testsuite/gcc.dg/vect/pr104445.c | 16 ++
 gcc/tree-vect-loop.cc|  8 +--
 5 files changed, 68 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr102832.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr104445.c

diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index 2ce8d74db16..fa88b4bede0 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -763,3 +763,36 @@ supports_vec_scatter_store_p (machine_mode mode)
   return this_fn_optabs->supports_vec_scatter_store[mode] > 0;
 }
 
+/* Whether we can extract part of the vector mode MODE as
+   (scalar or vector) mode EXTR_MODE.  */
+
+bool
+can_vec_extract (machine_mode mode, machine_mode extr_mode)
+{
+  if (!VECTOR_MODE_P (mode))
+return false;
+
+  unsigned m;
+  if (!constant_multiple_p (GET_MODE_SIZE (mode),
+   GET_MODE_SIZE (extr_mode), &m))
+return false;
+
+  if (convert_optab_handler (vec_extract_optab, mode, extr_mode)
+  != CODE_FOR_nothing)
+return true;
+
+  if (!VECTOR_MODE_P (extr_mode))
+return false;
+
+  /* Besides a direct vec_extract we can also use an element extract from
+ an integer vector mode with elements of the size of the extr_mode.  */
+  scalar_int_mode imode;
+  machine_mode vmode;
+  if (!int_mode_for_size (GET_MODE_SIZE (extr_mode), 0).exists (&imode)
+  || !related_vector_mode (mode, imode, m).exists (&vmode)
+  || (convert_optab_handler (vec_extract_optab, vmode, imode)
+ == CODE_FOR_nothing))
+return false;
+  /* We assume we can pun mode to vmode and imode to extr_mode.  */
+  return true;
+}
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 8b768c1797d..b9c9fd6f64d 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -195,6 +195,7 @@ bool can_atomic_load_p (machine_mode);
 bool lshift_cheap_p (bool);
 bool supports_vec_gather_load_p (machine_mode = E_VOIDmode);
 bool supports_vec_scatter_store_p (machine_mode = E_VOIDmode);
+bool can_vec_extract (machine_mode, machine_mode);
 
 /* Version of find_widening_optab_handler_and_mode that operates on
specific mode types.  */
diff --git a/gcc/testsuite/gcc.dg/vect/pr102832.c b/gcc/testsuite/gcc.dg/vect/pr102832.c
new file mode 100644
index 000..7cb4db5e4c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr102832.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-march=armv8.2-a+sve -msve-vector-bits=128" { target aarch64-*-* } } */
+
+int a, b;
+char c;
+signed char d(int e, int f) { return e - f; }
+void g() {
+  a = 0;
+  for (; a >= -17; a = d(a, 1))
+c ^= b;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/pr104445.c b/gcc/testsuite/gcc.dg/vect/pr104445.c
new file mode 100644
index 000..8ec3b3b0f1e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr104445.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mavx -mno-mmx" { target x86_64-*-* i?86-*-* } } */
+
+signed char a;
+signed char f (int i, int j)
+{
+  signed char c;
+  while (i != 0)
+  {
+a ^= j;
+++c;
+++i;
+  }
+  return c;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 4860bfd3344..9916ae46460 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -4997,7 +4997,8 @@ vect_find_reusable_accumulator (loop_vec_info loop_vinfo,
   if (!constant_multiple_p (TYPE_VECTOR_SUBPARTS (old_vectype),
TYPE_VECTOR_SUBPARTS (vectype), &m))
 return false;
-  /* Check the intermediate vector types are available.  */
+ 

Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Jakub Jelinek wrote:

> On Wed, Feb 09, 2022 at 10:03:02AM +0100, Richard Biener wrote:
> > I see there are still targets doing sth like
> > 
> > static void
> > msp430_option_override (void)
> > {
> >   /* The MSP430 architecture can safely dereference a NULL pointer.  In fact,
> >      there are memory mapped registers there.  */
> >   flag_delete_null_pointer_checks = 0;
> 
> Sure, that is the typical embedded target case.

Yes.

> And the user option is in case users are using a typically non-embedded
> target case in an embedded way, even on x86_64-linux one can mmap something
> at address 0 if one tweaks some kernel config parameters.
> 
> I guess I could change
> default_addr_space_zero_address_valid from return false; to
>   if (!flag_delete_null_pointer_checks)
>     return true;
>   if (folding_initializer)
>     return false;
>   if (current_function_decl
>       && sanitize_flags_p (SANITIZE_NULL | SANITIZE_NONNULL_ATTRIBUTE
>                            | SANITIZE_RETURNS_NONNULL_ATTRIBUTE,
>                            current_function_decl))
>     return true;
>   return false;
> and change the i386 one to also call the default version.

That does look like a bogus abstraction though - I'd rather have
the target be specific w/o option checks and replace
targetm.addr_space.zero_address_valid uses with a wrapper (like you do
for flag_delete_null_pointer_checks), if we think that the specific
query should be adjusted by sanitize flags (why?) or
folding_initializer (why?).

> Replacing all the flag_delete_null_pointer_checks uses
> with some wrappers around the target hook will be less fun, not sure if the
> pointer type will be always visible there...

I'd only replace target hook invocations.  Checking whether
NULL pointers are validly pointing to objects should be different
from checking whether optimization should remove checks for NULL.
That we overload -fno-delete-null-pointer-checks is a mistake
difficult to undo.  But at least making targets not set
that flag as we now have a proper target hook would be a good
start.
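
A minimal sketch of that separation, in standalone C: one query answers the
target question (can an object live at address zero?), and a second one, a
wrapper, answers the optimization question (may NULL checks be removed?).
All names here are hypothetical and nothing below is GCC's actual interface.

```c
#include <stdbool.h>

/* Hypothetical target description: a fact about the hardware only.  */
struct target_info
{
  bool zero_address_valid;  /* e.g. true for an MSP430-like target */
};

/* Hypothetical per-function optimization settings.  */
struct opt_settings
{
  bool delete_null_pointer_checks;  /* user option */
  bool null_sanitizers_enabled;     /* sanitizer wants the checks kept */
};

/* Query 1: may an object live at address zero?  Target fact only,
   no option checks.  */
static bool
object_at_null_possible (const struct target_info *t)
{
  return t->zero_address_valid;
}

/* Query 2: may the optimizer remove NULL checks?  A policy decision
   layered on top of query 1.  */
static bool
may_delete_null_checks (const struct target_info *t,
                        const struct opt_settings *o)
{
  return !object_at_null_possible (t)
         && o->delete_null_pointer_checks
         && !o->null_sanitizers_enabled;
}
```

In this arrangement a target never has to clear the user-visible flag; it
only reports the hardware fact, and the wrapper combines it with options.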

> > Did you replace all flag_delete_null_pointer_checks uses?
> 
> I did.

OK, just wanted to check.

> >  Can you rename the flag just to be sure?
> 
> Sure.

flag_internal_dnpc or so ;)

>   Jakub


Re: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 00:57, Thomas Rodgers via Libstdc++
 wrote:
>
> This issue was observed as a deadlock in
> 29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
> "laundered" (e.g. type T* does not suffice as a waitable address for the
> platform's native waiting primitive), the address waited on is that of the
> _M_ver member of __waiter_pool_base, so several threads may wait on the
> same address for unrelated atomics. As noted in the PR, the
> implementation correctly exits the wait for the thread whose data
> changed, but not for any other threads waiting on the same address.
>
> As noted in the PR, the __waiter::_M_do_wait_v member was correctly exiting,
> but the other waiters were not reloading the value of _M_ver before
> re-entering the wait.
>
> Moving the spin call inside the loop accomplishes this, and is
> consistent with the predicate-accepting version of __waiter::_M_do_wait.

There is a change to the memory order in _S_do_spin_v which is not
described in the commit msg or the changelog. Is that unintentional?

(Aside: why do we even have _S_do_spin_v, it's called in exactly one
place, so could just be inlined into _M_do_spin_v, couldn't it?)



Re: [PATCH] tree-optimization/104445 - check for vector extraction support

2022-02-09 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> This adds a missing check to epilogue reduction re-use, namely
> that we can do hi/lo extracts from the vector when demoting it
> to the epilogue vector size.
>
> I've chosen to add a can_vec_extract helper to optabs-query.h,
> in the future we might want to simplify the vectorizers life by
> handling vector-from-vector extraction via BIT_FIELD_REFs during
> RTL expansion via the mode punning when the vec_extract is not
> directly supported.
>
> I'm not 100% sure we can always do the punning of the
> vec_extract result to a vector mode of the same size, but then
> I'm also not sure how to check for that (the vectorizer doesn't
> in other places it does that at the moment, but I suppose we
> eventually just go through memory there)?
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Does this look OK?

LGTM.  I guess some of the existing optab checks could be simplified
using the new helper, but that's a separate clean-up.

Thanks,
Richard

>
> Thanks,
> Richard.
>
> 2022-02-09  Richard Biener  
>
>   PR tree-optimization/104445
>   PR tree-optimization/102832
>   * optabs-query.h (can_vec_extract): New.
>   * optabs-query.cc (can_vec_extract): Likewise.
>   * tree-vect-loop.cc (vect_find_reusable_accumulator): Check
>   we can extract a hi/lo part from the larger vector.
>
>   * gcc.dg/vect/pr104445.c: New testcase.
> ---
>  gcc/optabs-query.cc  | 33 
>  gcc/optabs-query.h   |  1 +
>  gcc/testsuite/gcc.dg/vect/pr102832.c | 12 ++
>  gcc/testsuite/gcc.dg/vect/pr104445.c | 16 ++
>  gcc/tree-vect-loop.cc|  8 +--
>  5 files changed, 68 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr102832.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr104445.c
>
> diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
> index 2ce8d74db16..fa88b4bede0 100644
> --- a/gcc/optabs-query.cc
> +++ b/gcc/optabs-query.cc
> @@ -763,3 +763,36 @@ supports_vec_scatter_store_p (machine_mode mode)
>return this_fn_optabs->supports_vec_scatter_store[mode] > 0;
>  }
>  
> +/* Whether we can extract part of the vector mode MODE as
> +   (scalar or vector) mode EXTR_MODE.  */
> +
> +bool
> +can_vec_extract (machine_mode mode, machine_mode extr_mode)
> +{
> +  if (!VECTOR_MODE_P (mode))
> +return false;
> +
> +  unsigned m;
> +  if (!constant_multiple_p (GET_MODE_SIZE (mode),
> + GET_MODE_SIZE (extr_mode), &m))
> +return false;
> +
> +  if (convert_optab_handler (vec_extract_optab, mode, extr_mode)
> +  != CODE_FOR_nothing)
> +return true;
> +
> +  if (!VECTOR_MODE_P (extr_mode))
> +return false;
> +
> +  /* Besides a direct vec_extract we can also use an element extract from
> + an integer vector mode with elements of the size of the extr_mode.  */
> +  scalar_int_mode imode;
> +  machine_mode vmode;
> +  if (!int_mode_for_size (GET_MODE_SIZE (extr_mode), 0).exists (&imode)
> +  || !related_vector_mode (mode, imode, m).exists (&vmode)
> +  || (convert_optab_handler (vec_extract_optab, vmode, imode)
> +   == CODE_FOR_nothing))
> +return false;
> +  /* We assume we can pun mode to vmode and imode to extr_mode.  */
> +  return true;
> +}
> diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
> index 8b768c1797d..b9c9fd6f64d 100644
> --- a/gcc/optabs-query.h
> +++ b/gcc/optabs-query.h
> @@ -195,6 +195,7 @@ bool can_atomic_load_p (machine_mode);
>  bool lshift_cheap_p (bool);
>  bool supports_vec_gather_load_p (machine_mode = E_VOIDmode);
>  bool supports_vec_scatter_store_p (machine_mode = E_VOIDmode);
> +bool can_vec_extract (machine_mode, machine_mode);
>  
>  /* Version of find_widening_optab_handler_and_mode that operates on
> specific mode types.  */
> diff --git a/gcc/testsuite/gcc.dg/vect/pr102832.c b/gcc/testsuite/gcc.dg/vect/pr102832.c
> new file mode 100644
> index 000..7cb4db5e4c7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr102832.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +/* { dg-additional-options "-march=armv8.2-a+sve -msve-vector-bits=128" { target aarch64-*-* } } */
> +
> +int a, b;
> +char c;
> +signed char d(int e, int f) { return e - f; }
> +void g() {
> +  a = 0;
> +  for (; a >= -17; a = d(a, 1))
> +c ^= b;
> +}
> diff --git a/gcc/testsuite/gcc.dg/vect/pr104445.c b/gcc/testsuite/gcc.dg/vect/pr104445.c
> new file mode 100644
> index 000..8ec3b3b0f1e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr104445.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +/* { dg-additional-options "-mavx -mno-mmx" { target x86_64-*-* i?86-*-* } } */
> +
> +signed char a;
> +signed char f (int i, int j)
> +{
> +  signed char c;
> +  while (i != 0)
> +  {
> +a ^= j;
> +++c;
> +++i;
> +  }
> +  return c;
> +}
> diff --git a/gcc/tree-v

Re: [PATCH] tree-optimization/104445 - check for vector extraction support

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > This adds a missing check to epilogue reduction re-use, namely
> > that we can do hi/lo extracts from the vector when demoting it
> > to the epilogue vector size.
> >
> > I've chosen to add a can_vec_extract helper to optabs-query.h,
> > in the future we might want to simplify the vectorizers life by
> > handling vector-from-vector extraction via BIT_FIELD_REFs during
> > RTL expansion via the mode punning when the vec_extract is not
> > directly supported.
> >
> > I'm not 100% sure we can always do the punning of the
> > vec_extract result to a vector mode of the same size, but then
> > I'm also not sure how to check for that (the vectorizer doesn't
> > in other places it does that at the moment, but I suppose we
> > eventually just go through memory there)?
> >
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >
> > Does this look OK?
> 
> LGTM.  I guess some of the existing optab checks could be simplified
> using the new helper, but that's a separate clean-up.

Indeed.

I did notice some errors in the patch when digging into whether
there's a bug in the i386 patterns as well though.  I've now
verified we do proper can_vec_extract queries (got the order
wrong for multiple intermediate steps and asked for
vec_extractv8qiv16qi).  Also I mixed up bitsize vs. size for
the int_mode_for_size query.  The issue with x86 is that we
have no v8qi with -m32 -mno-mmx but instead we use DImode,
but we do have v4qi.  I've checked we can handle v16qi -> DImode
extracts via vec_extract_v2didi but we have to punt for
DImode -> v4qi with this simple approach.

I'll think about those cases when we clean this up, possibly next
stage1.

Re-bootstrapping and testing the variant below, will push when
that succeeds.

Richard.

From e1374a647d4524734cad373a79fe9b863365c374 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Wed, 9 Feb 2022 10:55:18 +0100
Subject: [PATCH] tree-optimization/104445 - check for vector extraction
 support
To: gcc-patches@gcc.gnu.org

This adds a missing check to epilogue reduction re-use, namely
that we can do hi/lo extracts from the vector when demoting it
to the epilogue vector size.

I've chosen to add a can_vec_extract helper to optabs-query.h.
In the future we might want to simplify the vectorizer's life by
handling vector-from-vector extraction via BIT_FIELD_REFs during
RTL expansion via mode punning when the vec_extract is not
directly supported.

I'm not 100% sure we can always do the punning of the
vec_extract result to a vector mode of the same size, but then
I'm also not sure how to check for that (the vectorizer doesn't
check in the other places where it does that at the moment, but I
suppose we eventually just go through memory there)?

2022-02-09  Richard Biener  

PR tree-optimization/104445
PR tree-optimization/102832
* optabs-query.h (can_vec_extract): New.
* optabs-query.cc (can_vec_extract): Likewise.
* tree-vect-loop.cc (vect_find_reusable_accumulator): Check
we can extract a hi/lo part from the larger vector, rework
check iteration from larger to smaller sizes.

* gcc.dg/vect/pr104445.c: New testcase.
---
 gcc/optabs-query.cc  | 28 
 gcc/optabs-query.h   |  1 +
 gcc/testsuite/gcc.dg/vect/pr102832.c | 12 
 gcc/testsuite/gcc.dg/vect/pr104445.c | 16 
 gcc/tree-vect-loop.cc| 16 ++--
 5 files changed, 67 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr102832.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr104445.c

diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index 2ce8d74db16..713c098ba4e 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -763,3 +763,31 @@ supports_vec_scatter_store_p (machine_mode mode)
   return this_fn_optabs->supports_vec_scatter_store[mode] > 0;
 }
 
+/* Whether we can extract part of the vector mode MODE as
+   (scalar or vector) mode EXTR_MODE.  */
+
+bool
+can_vec_extract (machine_mode mode, machine_mode extr_mode)
+{
+  unsigned m;
+  if (!VECTOR_MODE_P (mode)
+  || !constant_multiple_p (GET_MODE_SIZE (mode),
+  GET_MODE_SIZE (extr_mode), &m))
+return false;
+
+  if (convert_optab_handler (vec_extract_optab, mode, extr_mode)
+  != CODE_FOR_nothing)
+return true;
+
+  /* Besides a direct vec_extract we can also use an element extract from
+ an integer vector mode with elements of the size of the extr_mode.  */
+  scalar_int_mode imode;
+  machine_mode vmode;
+  if (!int_mode_for_size (GET_MODE_BITSIZE (extr_mode), 0).exists (&imode)
+  || !related_vector_mode (mode, imode, m).exists (&vmode)
+  || (convert_optab_handler (vec_extract_optab, vmode, imode)
+ == CODE_FOR_nothing))
+return false;
+  /* We assume we can pun mode to vmode and imode to extr_mode.  */
+  return true;
+}
diff --

Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-09 Thread Thomas Schwinge
Hi!

OK to push (now, or in next development stage 1?) the attached
"Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
or should that be done differently -- or, per the current state (why?)
not at all?

This does work for my current debugging task, but I've not yet run
'make check' in case anything needs to be adjusted there.


Grüße
 Thomas


From e655409cf9154ac72194dd55f7f80cb5ed3137fc Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 9 Feb 2022 12:51:39 +0100
Subject: [PATCH] Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief',
 'print_node'

Running GCC with '-fdump-tree-all-uid' (so that 'TDF_UID' is set in
'dump_flags') and '-wrapper gdb,--args', then for a 'call debug_tree(decl)',
that does (pretty-)print all kinds of things -- but not the 'DECL_UID':

[...]
(gdb) print dump_flags & TDF_UID
$1 = 256
(gdb) call debug_tree(decl)
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x77e8 precision:32 min  max 
pointer_to_this >
used SI source-gcc/gcc/testsuite/gfortran.dg/goacc-gomp/pr102330-3.f90:10:3 size  unit-size 
align:32 warn_if_not_align:0 context >
(gdb) print decl.decl_minimal.uid
$3 = 4249

In my opinion, that's a bit unfortunate, as the 'DECL_UID' is very important
for debugging certain classes of issues.

With this patch, there is no change if 'TDF_UID' isn't set, but if it is, we
now use the same syntax as 'gcc/tree-pretty-print.cc:dump_decl_name', for
example:

 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x77e8 precision:32 min  max 
pointer_to_this >
used SI source-gcc/gcc/testsuite/gfortran.dg/goacc-gomp/pr102330-3.f90:10:3 size  unit-size 
align:32 warn_if_not_align:0 context >

Notice 'var_decl': 'i' vs. 'iD.4249', and 'function_decl': 'p' vs. 'pD.4227'.
Or 'iD.', 'pD.' if 'TDF_NOUID' is set ('-fdump-tree-all-uid-nouid', for
example).

	gcc/
	* print-tree.cc (print_node_brief, print_node): Consider
	'TDF_UID', 'TDF_NOUID'.
---
 gcc/print-tree.cc | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
index 0876da873a9..f2da5187293 100644
--- a/gcc/print-tree.cc
+++ b/gcc/print-tree.cc
@@ -139,7 +139,17 @@ print_node_brief (FILE *file, const char *prefix, const_tree node, int indent)
   if (tclass == tcc_declaration)
 {
   if (DECL_NAME (node))
-	fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
+	{
+	  fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
+	  if (dump_flags & TDF_UID)
+	{
+	  char c = TREE_CODE (node) == CONST_DECL ? 'C' : 'D';
+	  if (dump_flags & TDF_NOUID)
+		fprintf (file, "%c.", c);
+	  else
+		fprintf (file, "%c.%d", c, DECL_UID (node));
+	}
+	}
   else if (TREE_CODE (node) == LABEL_DECL
 	   && LABEL_DECL_UID (node) != -1)
 	{
@@ -284,7 +294,17 @@ print_node (FILE *file, const char *prefix, tree node, int indent,
   if (tclass == tcc_declaration)
 {
   if (DECL_NAME (node))
-	fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
+	{
+	  fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
+	  if (dump_flags & TDF_UID)
+	{
+	  char c = TREE_CODE (node) == CONST_DECL ? 'C' : 'D';
+	  if (dump_flags & TDF_NOUID)
+		fprintf (file, "%c.", c);
+	  else
+		fprintf (file, "%c.%d", c, DECL_UID (node));
+	}
+	}
   else if (code == LABEL_DECL
 	   && LABEL_DECL_UID (node) != -1)
 	{
-- 
2.25.1
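
For readers following along without the GCC sources, the suffix logic the
patch adds to both functions can be sketched as a standalone C helper.  This
is only an illustration: SKETCH_TDF_UID, SKETCH_TDF_NOUID and
format_decl_name are hypothetical names, and the flag values are made up.
Only the 'C'/'D' choice and the "%c." vs. "%c.%d" formats are taken from the
patch above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's dump flags; the values are invented
   for this sketch and do not match the real TDF_* constants.  */
enum { SKETCH_TDF_UID = 1 << 0, SKETCH_TDF_NOUID = 1 << 1 };

/* Write NAME into BUF, followed by the UID suffix when SKETCH_TDF_UID is
   set.  IS_CONST_DECL selects 'C' instead of 'D', as the patch does for
   CONST_DECL.  Returns what snprintf returns.  */
static int
format_decl_name (char *buf, size_t size, const char *name,
                  int is_const_decl, int uid, int flags)
{
  char c = is_const_decl ? 'C' : 'D';

  /* Without the UID flag the output is unchanged: just the name.  */
  if (!(flags & SKETCH_TDF_UID))
    return snprintf (buf, size, "%s", name);

  /* With NOUID also set, print the marker but hide the number.  */
  if (flags & SKETCH_TDF_NOUID)
    return snprintf (buf, size, "%s%c.", name, c);

  return snprintf (buf, size, "%s%c.%d", name, c, uid);
}
```

Called with the name "i", UID 4249 and only SKETCH_TDF_UID set, the helper
produces the 'iD.4249' form mentioned in the commit message.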



Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth

2022-02-09 Thread Michael Matz via Gcc-patches
Hey,

On Tue, 8 Feb 2022, Joseph Myers wrote:

> On Tue, 8 Feb 2022, Richard Biener via Gcc-patches wrote:
> 
> > which I think is OK?  That is, when the abstract machine
> > arrives at 'int i;' then the previous content in 'i' goes
> > away?  Or would
> 
> Yes, that's correct.  "If an initialization is specified for the object, 
> it is performed each time the declaration or compound literal is reached 
> in the execution of the block; otherwise, the value becomes indeterminate 
> each time the declaration is reached.".

Okay, that makes things easier then.  We can put the birth 
clobbers at their point of declaration, just the storage associated with a 
decl needs to survive for the whole block.  We still need to make sure 
that side entries skipping declarations correctly "allocate" such storage 
(by introducing proper conflicts with other objects), but at least values 
don't need to survive decls.


Ciao,
Michael.
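
A small, well-defined C illustration of the quoted rule that an initializer
is performed each time the declaration is reached.  The function name
reinit_each_time is made up for this sketch; it only demonstrates the
language rule and says nothing about where GCC actually places clobbers.

```c
/* Sum the value of 'i' over three passes through its declaration.
   The backward goto stays inside the block, so the storage of 'i'
   lives on, but the initializer runs again on every pass.  */
int
reinit_each_time (void)
{
  int passes = 0;
  int sum = 0;

again:
  ;                     /* a label must be followed by a statement */
  int i = 10;           /* performed each time control reaches it */
  i += passes;
  sum += i;
  if (++passes < 3)
    goto again;

  return sum;           /* 10 + 11 + 12 */
}
```

Had 'i' been declared without an initializer, its value would instead become
indeterminate on each pass, which is the case the birth clobber models.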


Re: [PATCH V2] tree-optimization/104288 - Register non-null side effects properly.

2022-02-09 Thread Andrew MacLeod via Gcc-patches

On 2/9/22 03:45, Richard Biener wrote:

On Tue, Feb 8, 2022 at 9:58 PM Andrew MacLeod  wrote:

On 2/8/22 03:32, Richard Biener wrote:

On Tue, Feb 8, 2022 at 2:33 AM Andrew MacLeod via Gcc-patches
 wrote:

On 2/7/22 09:29, Andrew MacLeod wrote:

I have a proposal for PR 104288.

ALL patches (in sequence) bootstrap on their own and each cause no
regressions.

I've been continuing to work with this towards the GCC13 solution for
statement side effects, and there is another possibility we could
pursue which is less intrusive.

I adapted the portions of patch 2/4 which process nonnull, but changed
it from being in the nonnull class to being in the cache.

The main difference is that it continues to use a single bit, just changing
all uses of it to *always* mean it's non-null on exit to the block.

Range-on-entry is changed to only check dominator blocks, and
range_of_expr doesn't check the bit at all, in both ranger and the cache.

When we are doing the block walk and process side effects, the on-entry
*cache* range for the block is adjusted to be non-null instead.  The
statements which precede this stmt have already been processed, and all
remaining statements in this block will now pick up this new non-null
value from the on-entry cache.  This should be perfectly safe.

The final tweak is when range_of_expr is looking at the def block: it
normally does not have an on-entry cache value (the def is in the
block, so there is no entry cache value).  It now checks to see if we
have set one, which can only happen when we have been doing the
statement walk and set the on-entry cache with a non-null value.  This
allows us to pick up the non-null range in the def block too (once we
have passed a statement which set nonnull, of course).

This patch provides much less change to trunk, and is probably a better
approach as it's closer to what is going to happen in GCC13.

I'm still working with it, so the patch isn't fully refined yet, but it
bootstraps and passes all tests with no regressions, and passes the new
tests too.  I figured I would mention this before you look too hard at
the other patchset.  The GCC11 patch doesn't change.

Let me know if you prefer this.  I think I do :-)  Less churn, and
closer to the end state.

Yes, I very much prefer this - some comments to the other patches
still apply to this one.  Like using get_nonnull_args and probably
adding a bail-out to computing ranges from stmts that can throw
internally or have abnormal control flow (to not get into range-on-exit
vs. normal vs. exceptional or abnormal edges).

Richard.

with some minor performance tweaks, such as moving adjust_range() to the
header so it can be inlined.

Range-on-edge now applies the non-null from the src block if
appropriate, not in range-on-exit.  That should resolve the internal
throwing statements, I think, and I have switched over to
get_nonnull_args().

Bootstraps on build-x86_64-pc-linux-gnu and passes all regressions.

OK for trunk, or did I miss something?

OK.

I do think there's some confusion about -fnon-call-exceptions.  The
comment

+  // Non-call exceptions mean we could throw in the middle of the
+  // block, so just punt on those for now.

also applies to regular exceptions, non-call vs. call EH just
adds the possibility of non-call stmts to throw.  I do not think that
value-range propagation needs to care about stmts that throw
in the middle of a block (which automatically means no EH edges
and thus !stmt_can_throw_internal).  When there are EH edges
then restrictions apply to both calls and non-calls that throw.

So whatever made those cfun->can_throw_non_call_exceptions
necessary should have shown a more general issue with EH.


It's probably a paranoid carry-over from a few years ago when getting off 
the ground, combined with the non-null bit originally applying to the 
entire block.


With these changes, and adjusting the outgoing values only on 
non-abnormal/EH edges, I suspect they are no longer needed.  I will look 
at pulling them all out for the next release.






That's something to look at, but not in the scope of this fix.

Richard.


Andrew

PS. Odd: I haven't seen the git diff be wrong before, but it shows the
ranger_cache::range_on_edge changes as being in
gimple_cache::range_of_expr.  They most definitely are not,
and it applies/de-applies fine, so it's just an oddity I guess.
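
As a rough illustration of the non-null side effect discussed in this thread
(not taken from the patch or its testcases; both function names are
hypothetical), consider a dereference followed by a null test in the same
block.  Once the statement walk has passed the load, the remaining statements
may treat 'p' as non-null.  Whether a given GCC version and option set
actually folds the test is not something this sketch can promise.

```c
#include <stddef.h>

/* The load from 'p' comes first, so every later statement in the block
   can assume 'p' is non-null; the test below is then a candidate for
   folding to false.  */
int
load_then_check (int *p)
{
  int x = *p;
  if (p == NULL)
    return -1;
  return x + 1;
}

/* Here the test precedes the load, so nothing is known about 'p' on
   entry and the test has to stay.  */
int
check_then_load (int *p)
{
  if (p == NULL)
    return -1;
  return *p + 1;
}
```

With -fno-delete-null-pointer-checks the dereference is not supposed to imply
anything about 'p', which ties in with the ubsan thread further down.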





Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 09, 2022 at 11:19:25AM +0100, Richard Biener wrote:
> That does look like bogus abstraction though - I'd rather have
> the target be specific w/o option checks and replace 
> targetm.zero_address_valid uses with a wrapper (like you do for
> flag_delete_null_pointer_checks), if we think that the specific
> query should be adjusted by sanitize flags (why?) or
> folding_initializer (why?).

Based on discussions on IRC, here is a WIP patch.

Unfortunately, there are 3 unresolved issues:
1) ipa-icf.cc uses
  && opt_for_fn (decl, flag_delete_null_pointer_checks))
   there is a pointer type, but I guess we'd need to adjust the
   target hook to take a defaulted fndecl argument and use that
   for the options
2) rtlanal.cc has:
case SYMBOL_REF:
  return flag_delete_null_pointer_checks && !SYMBOL_REF_WEAK (x);
   Is there any way how to find out address space of a SYMBOL_REF?
   Or shall it hardcode ADDR_SPACE_GENERIC?
3) tree-ssa-structalias.cc has:
  if ((TREE_CODE (t) == INTEGER_CST
   && integer_zerop (t))
  /* The only valid CONSTRUCTORs in gimple with pointer typed
 elements are zero-initializer.  But in IPA mode we also
 process global initializers, so verify at least.  */
  || (TREE_CODE (t) == CONSTRUCTOR
  && CONSTRUCTOR_NELTS (t) == 0))
{
  if (flag_delete_null_pointer_checks)
temp.var = nothing_id;
  else
temp.var = nonlocal_id;
  temp.type = ADDRESSOF;
  temp.offset = 0;
  results->safe_push (temp);
  return;
}
   Not really sure where to get the address space from in that case

And perhaps I didn't do it right in some other spots too.

--- gcc/targhooks.cc.jj 2022-01-18 11:58:59.919977242 +0100
+++ gcc/targhooks.cc2022-02-09 13:21:08.958835833 +0100
@@ -1598,7 +1598,7 @@ default_addr_space_subset_p (addr_space_
 bool
 default_addr_space_zero_address_valid (addr_space_t as ATTRIBUTE_UNUSED)
 {
-  return false;
+  return !flag_delete_null_pointer_checks_;
 }
 
 /* The default hook for debugging the address space is to return the
--- gcc/tree.cc.jj  2022-02-08 20:08:04.001539492 +0100
+++ gcc/tree.cc 2022-02-09 14:44:01.602693848 +0100
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-fold.h"
 #include "escaped_string.h"
 #include "gimple-range.h"
+#include "asan.h"
 
 /* Tree code classes.  */
 
@@ -13937,12 +13938,13 @@ nonnull_arg_p (const_tree arg)
   /* THIS argument of method is always non-NULL.  */
   if (TREE_CODE (TREE_TYPE (cfun->decl)) == METHOD_TYPE
   && arg == DECL_ARGUMENTS (cfun->decl)
-  && flag_delete_null_pointer_checks)
+  && POINTER_TYPE_P (TREE_TYPE (arg))
+  && !zero_address_valid (TREE_TYPE (arg)))
 return true;
 
   /* Values passed by reference are always non-NULL.  */
   if (TREE_CODE (TREE_TYPE (arg)) == REFERENCE_TYPE
-  && flag_delete_null_pointer_checks)
+  && !zero_address_valid (TREE_TYPE (arg)))
 return true;
 
   fntype = TREE_TYPE (cfun->decl);
@@ -14549,6 +14551,24 @@ get_attr_nonstring_decl (tree expr, tree
   return NULL_TREE;
 }
 
+/* Return true if NULL is a valid address in AS.  */
+
+bool
+zero_address_valid (addr_space_t as)
+{
+  if (targetm.addr_space.zero_address_valid (as))
+return true;
+  /* -fsanitize={null,{,returns-}nonnull-attribute sanitizers need
+ NULL pointer checks to be preserved, so pretend NULL addresses
+ are valid for it as well.
+ But don't do that in constant expressions or initializers.  */
+  if (folding_initializer)
+return false;
+  return sanitize_flags_p (SANITIZE_NULL | SANITIZE_NONNULL_ATTRIBUTE
+  | SANITIZE_RETURNS_NONNULL_ATTRIBUTE,
+  current_function_decl);
+}
+
 #if CHECKING_P
 
 namespace selftest {
--- gcc/config/i386/i386.cc.jj  2022-02-09 12:55:50.716774241 +0100
+++ gcc/config/i386/i386.cc 2022-02-09 13:23:01.041272540 +0100
@@ -23804,7 +23804,9 @@ ix86_gen_scratch_sse_rtx (machine_mode m
 static bool
 ix86_addr_space_zero_address_valid (addr_space_t as)
 {
-  return as != ADDR_SPACE_GENERIC;
+  if (as != ADDR_SPACE_GENERIC)
+return true;
+  return default_addr_space_zero_address_valid (as);
 }
 
 static void
--- gcc/config/nios2/elf.h.jj   2022-01-11 23:11:21.915296829 +0100
+++ gcc/config/nios2/elf.h  2022-02-09 13:04:42.643433282 +0100
@@ -57,5 +57,5 @@
vector).  Users can override this on the command line to get the
additional optimizations it enables.  */
 #define SUBTARGET_OVERRIDE_OPTIONS \
-  if (flag_delete_null_pointer_checks < 0) \
-flag_delete_null_pointer_checks = 0
+  if (flag_delete_null_pointer_checks_ < 0)\
+flag_delete_null_pointer_checks_ = 0
--- gcc/config/msp430/msp430.cc.jj  2022-02-04 14:36:54.410613609 +0100
+++ gcc/config/msp430/msp430.cc 2022-02-09 13:04:09.372888416 +0100
@@ -161,7 +161,7 @@ msp

Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth

2022-02-09 Thread Richard Biener via Gcc-patches
On Tue, 8 Feb 2022, Michael Matz wrote:

> Hello,
> 
> On Tue, 8 Feb 2022, Richard Biener wrote:
> 
> > > int state = 2, *p, camefrom1 = 0;
> > > for (;;) switch (state) {
> > >   case 1: 
> > >   case 2: ;
> > > int i;
> > > if (state != 1) { p = &i; i = 0; }
> > > if (state == 1) { (*p)++; return *p; }
> > > state = 1;
> > > continue;
> > > }
> > > 
> > > Note how i is initialized during state 2, and needs to be kept 
> > > initialized 
> > > during state 1, so there must not be a CLOBBER (birth or other) at the 
> > > point of the declaration of 'i'.  AFAICS in my simple tests a DECL_EXPR 
> > > for 'i' is placed with the statement associated with 'case 2' label, 
> > > putting a CLOBBER there would be the wrong thing.  If the decl had an 
> > > initializer it might be harmless, as it would be overwritten at that 
> > > place, but even so, in this case there is no initializer.  Hmm.
> > 
> > You get after gimplification:
> > 
> >   state = 2;
> >   camefrom1 = 0;
> >   :
> >   switch (state) , case 1: , case 2: >
> >   {
> > int i;
> > 
> > try
> >   {
> > i = {CLOBBER(birth)};  /// ignore, should go away
> > :
> > :
> > i = {CLOBBER(birth)};
> 
> I think this clobber here would be the problem, because ...
> 
> > which I think is OK?  That is, when the abstract machine
> > arrives at 'int i;' then the previous content in 'i' goes
> > away?  Or would
> > 
> > int foo()
> > {
> >goto ick;
> > use:
> >int i, *p;
> >return *p;
> > ick:
> >i = 1;
> >p = &i;
> >goto use;
> > 
> > }
> > 
> > require us to return 1?
> 
> ... I think this is exactly the logical consequence of lifetime of 'i' 
> being the whole block.  We need to return 1. (Joseph: correct me on that! 
> :) )  That's what I was trying to get at with my switch example as well.
> 
> > With the current patch 'i' is clobbered before the return.
> > 
> > > Another complication arises from simple forward jumps:
> > > 
> > >   goto forw;
> > >   {
> > > int i;
> > > printf("not reachable\n");
> > >   forw:
> > > i = 1;
> > >   }
> > > 
> > > Despite the jump skipping the unreachable head of the BLOCK, 'i' needs to 
> > > be considered birthed at the label.  (In a way the placement of births 
> > > exactly mirrors the placements of deaths (of course), every path from 
> > > outside a BLOCK to inside needs to birth-clobber all variables (in C), 
> > > like every path leaving needs to kill them.  It's just that we have a 
> > > convenient construct for the latter (try-finally), but not for the former)
> > 
> > The situation with an unreachable birth is handled conservatively
> > since we consider a variable without a (visible at RTL expansion time)
> > birth to conflict with all other variables.
> 
> That breaks down when a birth is there (because it was otherwise 
> reachable) but not on the taken path:
> 
>   if (nevertrue)
> goto start;
>   goto forw;
>   start:
>   {
> int i;
> printf("not really reachable, but unknowingly so\n");
>   forw:
> i = 1;
>   }

I think to cause breakage you need a use of 'i' on the side-entry
path that is not reachable from the path with the birth.  I guess something like

   if (nevertrue)
 goto start;
   goto forw;
   start:
   {
 int i = 0;
 printf("not really reachable, but unknowingly so\n");
 goto common;
   forw:
 i = 1;
   common:
 foo (&i);
   }

If we had a variable that's live only on the side-entry path,
then it could share the stack slot with 'i' this way, breaking
things (we currently don't move CLOBBERs, so it isn't easy to construct
such a case).  The present dataflow would, for the above, indeed
compute 'i' not live in the forw: block.
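
A compilable C rendering of the example above, assuming a trivial stand-in
for foo () that just records the value it sees.  The names record, seen and
side_entry_demo are hypothetical and exist only for this sketch.

```c
static int seen;

/* Stand-in for foo (): remember what '*p' held at the call.  */
static void
record (int *p)
{
  seen = *p;
}

/* NEVERTRUE selects the path through the declaration (with its
   initializer) or the side entry at 'forw', which skips it.  */
int
side_entry_demo (int nevertrue)
{
  if (nevertrue)
    goto start;
  goto forw;
start:
  {
    int i = 0;
    goto common;
  forw:
    i = 1;              /* first touch of 'i' on the side-entry path */
  common:
    record (&i);
  }
  return seen;
}
```

On the forw path the initializer is skipped, yet the storage for 'i' must
already exist when it is assigned, so a birth placed only at the declaration
would not cover this path.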

> > I don't see a good way to have a birth magically appear at 'forw' 
> > without trying to argue that the following stmt is the first mentioning 
> > the variable.
> 
> That's what my 'Hmm' alluded to :)  The only correct and precise way I see 
> is to implement something similar to try-finally topside-down.  An 
> easier but less precise way is to place the births at the (start of) 
> innermost block containing the decl _and all jumps into the block_.  Even 
> less precise, but perhaps even easier is to place the births for decls in 
> blocks with side-entries into the function scope (and for blocks without 
> side entries at their start).
> 
> Except for switches side-entries are really really seldom, so we might 
> usefully get away with that latter solution.  And for switches it might be 
> okay to put the births at the block containing the switch (if it itself 
> doesn't have side entries, and the switch block doesn't have side 
> entries except the case labels).
> 
> If the birth is moved to outward blocks it might be best if also the 
> dealloc/death clobbers are moved to it, otherwise there might be paths 
> containing a birth but no death.
> 
> The less precise you get with those births the more non-sharing you'll 
> get, but that's the pr

[pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jason Merrill via Gcc-patches
The C++ committee just updated the values of these macros to reflect some
late C++20 papers that we implement but others don't yet; see PR103891.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Update values
of __cpp_constexpr and __cpp_concepts for C++20.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/feat-cxx2b.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Adjust.
---
 gcc/c-family/c-cppbuiltin.cc| 4 ++--
 gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C | 4 ++--
 gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C | 8 
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
index 528211cf50e..4672ae8486a 100644
--- a/gcc/c-family/c-cppbuiltin.cc
+++ b/gcc/c-family/c-cppbuiltin.cc
@@ -1059,7 +1059,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_generic_lambdas=201707L");
  cpp_define (pfile, "__cpp_designated_initializers=201707L");
  if (cxx_dialect <= cxx20)
-   cpp_define (pfile, "__cpp_constexpr=201907L");
+   cpp_define (pfile, "__cpp_constexpr=202002L");
  cpp_define (pfile, "__cpp_constexpr_in_decltype=201711L");
  cpp_define (pfile, "__cpp_conditional_explicit=201806L");
  cpp_define (pfile, "__cpp_consteval=201811L");
@@ -1084,7 +1084,7 @@ c_cpp_builtins (cpp_reader *pfile)
   if (flag_concepts)
 {
  if (cxx_dialect >= cxx20)
-cpp_define (pfile, "__cpp_concepts=201907L");
+   cpp_define (pfile, "__cpp_concepts=202002L");
   else
 cpp_define (pfile, "__cpp_concepts=201507L");
 }
diff --git a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
index 923e6bcf65e..c1f91e78e66 100644
--- a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
+++ b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
@@ -528,8 +528,8 @@
 
 #ifndef __cpp_concepts
 #  error "__cpp_concepts"
-#elif __cpp_concepts != 201907
-#  error "__cpp_concepts != 201907"
+#elif __cpp_concepts != 202002
+#  error "__cpp_concepts != 202002"
 #endif
 
 #ifndef __cpp_using_enum
diff --git a/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C b/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
index 3239df824fc..c65ea6bf48a 100644
--- a/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
+++ b/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
@@ -134,8 +134,8 @@
 
 #ifndef __cpp_constexpr
 #  error "__cpp_constexpr"
-#elif __cpp_constexpr != 201907
-#  error "__cpp_constexpr != 201907"
+#elif __cpp_constexpr != 202002
+#  error "__cpp_constexpr != 202002"
 #endif
 
 #ifndef __cpp_decltype_auto
@@ -528,8 +528,8 @@
 
 #ifndef __cpp_concepts
 #  error "__cpp_concepts"
-#elif __cpp_concepts != 201907
-#  error "__cpp_concepts != 201907"
+#elif __cpp_concepts != 202002
+#  error "__cpp_concepts != 202002"
 #endif
 
 #ifndef __cpp_using_enum

base-commit: 59b31f0e2d187ebdb3d399661e22b28e4ebd8099
-- 
2.27.0



Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Jakub Jelinek wrote:

> On Wed, Feb 09, 2022 at 11:19:25AM +0100, Richard Biener wrote:
> > That does look like bogus abstraction though - I'd rather have
> > the target be specific w/o option checks and replace 
> > targetm.zero_address_valid uses with a wrapper (like you do for
> > flag_delete_null_pointer_checks), if we think that the specific
> > query should be adjusted by sanitize flags (why?) or
> > folding_initializer (why?).
> 
> Based on discussions on IRC, here is a WIP patch.
> 
> Unfortunately, there are 3 unresolved issues:
> 1) ipa-icf.cc uses
>   && opt_for_fn (decl, flag_delete_null_pointer_checks))
>there is a pointer type, but I guess we'd need to adjust the
>target hook to take a defaulted fndecl argument and use that
>for the options

Yeah, I'd use a struct function arg tho, not a decl.

> 2) rtlanal.cc has:
> case SYMBOL_REF:
>   return flag_delete_null_pointer_checks && !SYMBOL_REF_WEAK (x);
>Is there any way how to find out address space of a SYMBOL_REF?

TYPE_ADDR_SPACE (TREE_TYPE (SYMBOL_REF_DECL ())) I guess.

>Or shall it hardcode ADDR_SPACE_GENERIC?
> 3) tree-ssa-structalias.cc has:
>   if ((TREE_CODE (t) == INTEGER_CST
>&& integer_zerop (t))
>   /* The only valid CONSTRUCTORs in gimple with pointer typed
>  elements are zero-initializer.  But in IPA mode we also
>  process global initializers, so verify at least.  */
>   || (TREE_CODE (t) == CONSTRUCTOR
>   && CONSTRUCTOR_NELTS (t) == 0))
> {
>   if (flag_delete_null_pointer_checks)
> temp.var = nothing_id;
>   else
> temp.var = nonlocal_id;
>   temp.type = ADDRESSOF;
>   temp.offset = 0;
>   results->safe_push (temp);
>   return;
> }
>not really sure where to get the address space from in that case
> 
> And perhaps I didn't do it right in some other spots too.

This case is really difficult since we track pointers through integers
(mind the missing POINTER_TYPE_P check above).  Of course we have
no idea what address-space the integer was converted from or will
be converted to so what the above wants to check is whether
there is _any_ address-space that could have a zero pointer pointing
to a valid object ...

> --- gcc/targhooks.cc.jj   2022-01-18 11:58:59.919977242 +0100
> +++ gcc/targhooks.cc  2022-02-09 13:21:08.958835833 +0100
> @@ -1598,7 +1598,7 @@ default_addr_space_subset_p (addr_space_
>  bool
>  default_addr_space_zero_address_valid (addr_space_t as ATTRIBUTE_UNUSED)
>  {
> -  return false;
> +  return !flag_delete_null_pointer_checks_;

As said, I'd not do that, but check it in zero_address_valid only.
Otherwise all targets overriding the hook have to remember to check
this flag.  I suppose we'd then do

  if (option_set (flag_delete_null_pointer_check))
use flag_delete_null_pointer_check;
  else
use targetm.zero_address_valid;

possibly only for the default address-space.
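
To make the proposed ordering concrete, here is a standalone C model of how
the wrapper might combine the user's flag, the target hook and the sanitizer
override.  This is one assumed reading of the discussion, not GCC code:
struct null_policy, its fields and zero_address_valid_sketch are all
hypothetical, and each field merely stands in for the corresponding GCC
query.

```c
#include <stdbool.h>

/* Hypothetical inputs, one per query made in the thread.  */
struct null_policy
{
  bool flag_explicitly_set;     /* -f[no-]delete-null-pointer-checks given */
  bool flag_delete_null_checks; /* value of that flag when given */
  bool target_zero_valid;       /* answer of the target hook */
  bool sanitize_null;           /* a null/nonnull sanitizer is active */
  bool folding_initializer;     /* folding a constant initializer */
};

/* Return true if address zero should be treated as valid.  */
static bool
zero_address_valid_sketch (const struct null_policy *p)
{
  /* An explicit user flag wins; otherwise ask the target.  */
  bool valid = p->flag_explicitly_set
               ? !p->flag_delete_null_checks
               : p->target_zero_valid;
  if (valid)
    return true;

  /* Constant expressions and initializers keep the strict answer.  */
  if (p->folding_initializer)
    return false;

  /* Sanitizers need the null checks preserved.  */
  return p->sanitize_null;
}
```

Keeping the flag test in the wrapper, as suggested, means a target overriding
the hook only has to describe its address spaces and cannot forget the
option.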

>  }
>  
>  /* The default hook for debugging the address space is to return the
> --- gcc/tree.cc.jj2022-02-08 20:08:04.001539492 +0100
> +++ gcc/tree.cc   2022-02-09 14:44:01.602693848 +0100
> @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.
>  #include "gimple-fold.h"
>  #include "escaped_string.h"
>  #include "gimple-range.h"
> +#include "asan.h"
>  
>  /* Tree code classes.  */
>  
> @@ -13937,12 +13938,13 @@ nonnull_arg_p (const_tree arg)
>/* THIS argument of method is always non-NULL.  */
>if (TREE_CODE (TREE_TYPE (cfun->decl)) == METHOD_TYPE
>&& arg == DECL_ARGUMENTS (cfun->decl)
> -  && flag_delete_null_pointer_checks)
> +  && POINTER_TYPE_P (TREE_TYPE (arg))
> +  && !zero_address_valid (TREE_TYPE (arg)))
>  return true;
>  
>/* Values passed by reference are always non-NULL.  */
>if (TREE_CODE (TREE_TYPE (arg)) == REFERENCE_TYPE
> -  && flag_delete_null_pointer_checks)
> +  && !zero_address_valid (TREE_TYPE (arg)))
>  return true;
>  
>fntype = TREE_TYPE (cfun->decl);
> @@ -14549,6 +14551,24 @@ get_attr_nonstring_decl (tree expr, tree
>return NULL_TREE;
>  }
>  
> +/* Return true if NULL is a valid address in AS.  */
> +
> +bool
> +zero_address_valid (addr_space_t as)
> +{
> +  if (targetm.addr_space.zero_address_valid (as))
> +return true;
> +  /* -fsanitize={null,{,returns-}nonnull-attribute sanitizers need
> + NULL pointer checks to be preserved, so pretend NULL addresses
> + are valid for it as well.
> + But don't do that in constant expressions or initializers.  */
> +  if (folding_initializer)
> +return false;
> +  return sanitize_flags_p (SANITIZE_NULL | SANITIZE_NONNULL_ATTRIBUTE
> +| SANITIZE_RETURNS_NONNULL_ATTRIBUTE,
> +current_function_decl);
> +}
> +
>  #if CHECKING_P
>  
>  namespace selftest {
> --- gcc/config/

[PATCH] middle-end/104464 - ISEL and non-call EH #2

2022-02-09 Thread Richard Biener via Gcc-patches
The following adjusts the earlier change to still allow an
uncritical replacement.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2022-02-09  Richard Biener  

PR middle-end/104464
* gimple-isel.cc (gimple_expand_vec_cond_expr): Postpone
throwing check to after unproblematic replacement.

* gcc.dg/pr104464.c: New testcase.
---
 gcc/gimple-isel.cc  | 28 ++--
 gcc/testsuite/gcc.dg/pr104464.c | 11 +++
 2 files changed, 25 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104464.c

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 1d93766b704..3635585bf45 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -178,11 +178,7 @@ gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
}
 
   gassign *def_stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op0));
-  if (def_stmt
- /* When the compare has EH we do not want to forward it when
-it has multiple uses and in general because of the complication
-with EH redirection.  */
- && !stmt_can_throw_internal (fun, def_stmt))
+  if (def_stmt)
{
  tcode = gimple_assign_rhs_code (def_stmt);
  op0a = gimple_assign_rhs1 (def_stmt);
@@ -195,7 +191,6 @@ gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
 tcode);
 
  /* Try to fold x CMP y ? -1 : 0 to x CMP y.  */
-
  if (can_compute_op0
  && integer_minus_onep (op1)
  && integer_zerop (op2)
@@ -207,14 +202,19 @@ gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
  return new_stmt;
}
 
- if (can_compute_op0
- && used_vec_cond_exprs >= 2
- && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
- != CODE_FOR_nothing))
-   {
- /* Keep the SSA name and use vcond_mask.  */
- tcode = TREE_CODE (op0);
-   }
+ /* When the compare has EH we do not want to forward it when
+it has multiple uses and in general because of the complication
+with EH redirection.  */
+ if (stmt_can_throw_internal (fun, def_stmt))
+   tcode = TREE_CODE (op0);
+
+ /* If we can compute op0 and have multiple uses, keep the SSA
+name and use vcond_mask.  */
+ else if (can_compute_op0
+  && used_vec_cond_exprs >= 2
+  && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
+  != CODE_FOR_nothing))
+   tcode = TREE_CODE (op0);
}
   else
tcode = TREE_CODE (op0);
diff --git a/gcc/testsuite/gcc.dg/pr104464.c b/gcc/testsuite/gcc.dg/pr104464.c
new file mode 100644
index 000..ed6a22c39d5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104464.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fnon-call-exceptions -fno-tree-dce -fno-tree-forwprop -fsignaling-nans" } */
+
+typedef double __attribute__((__vector_size__(16))) F;
+F f;
+
+void
+foo(void)
+{
+  f += (F)(f != (F){}[0]);
+}
-- 
2.34.1


Re: [PATCH] handle "invisible" reference in -Wdangling-pointer (PR104436)

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 03:30, Richard Biener wrote:

On Tue, Feb 8, 2022 at 11:38 PM Jason Merrill via Gcc-patches
 wrote:


On 2/8/22 16:59, Martin Sebor wrote:

Transforming by-value arguments to by-reference as GCC does for some
class types can trigger -Wdangling-pointer when the argument is used
to store the address of a local variable.  Since the stored value is
not accessible in the caller the warning is a false positive.

The attached patch handles this case by excluding PARM_DECLs with
the DECL_BY_REFERENCE bit set from consideration.

While testing the patch I noticed some instances of the warning are
unintentionally duplicated as the pass runs more than once.  To avoid
that, I also introduce warning suppression into the handler for this
instance of the warning.  (There might still be others.)


The second test should verify that we do warn about returning 't' from a
function; we don't want to ignore the DECL_BY_REFERENCE RESULT_DECL.


+   tree var = SSA_NAME_VAR (lhs_ref.ref);
+   if (DECL_BY_REFERENCE (var))


I think you need to test var && TREE_CODE (var) == PARM_DECL here since
for DECL_BY_REFERENCE RESULT_DECL we _do_ escape to the caller.  Also
SSA_NAME_VAR var might be NULL.


+ /* Avoid by-value arguments transformed into by-reference.  */
+ continue;


I wonder if we can express this property of invisiref parms somewhere
more general?  I imagine optimizations would find it useful as well.
Could pointer_query somehow treat the reference as pointing to a
function-local object?


I think points-to analysis got this correct when the reference was marked
restrict but now it also fails at this, making DSE fail to eliminate the
store in

struct A { A(); ~A(); int *p; };

void foo (struct A a, int *p)
{
   a.p = p;
}


Well, that's conservatively correct; since we don't know the definition 
of ~A, we don't know whether it copies p somewhere, e.g.


int *global_p;
A::~A() { global_p = p; }

in which case eliminating the store would be an invalid optimization, 
just as it would be if 'a' were a local variable.



I previously tried to express this by marking the reference as
'restrict', but that was wrong
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97474).




Re: [pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 14:40, Jason Merrill wrote:
>
> The C++ committee just updated the values of these macros to reflect some
> late C++20 papers that we implement but others don't yet; see PR103891.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

Nice! I'll test the corresponding libstdc++ patch, thanks.



Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 09, 2022 at 03:41:23PM +0100, Richard Biener wrote:
> On Wed, 9 Feb 2022, Jakub Jelinek wrote:
> 
> > On Wed, Feb 09, 2022 at 11:19:25AM +0100, Richard Biener wrote:
> > > That does look like bogus abstraction though - I'd rather have
> > > the target be specific w/o option checks and replace 
> > > targetm.zero_address_valid uses with a wrapper (like you do for
> > > flag_delete_null_pointer_checks), if we think that the specific
> > > query should be adjusted by sanitize flags (why?) or
> > > folding_initializer (why?).
> > 
> > Based on discussions on IRC, here is a WIP patch.
> > 
> > Unfortunately, there are 3 unresolved issues:
> > 1) ipa-icf.cc uses
> >   && opt_for_fn (decl, flag_delete_null_pointer_checks))
> >there is a pointer type, but I guess we'd need to adjust the
> >target hook to take a defaulted fndecl argument and use that
> >for the options
> 
> Yeah, I'd use a struct function arg tho, not a decl.

But both opt_for_fn and sanitize_flags_p take a fndecl tree, not cfun.

> > 2) rtlanal.cc has:
> > case SYMBOL_REF:
> >   return flag_delete_null_pointer_checks && !SYMBOL_REF_WEAK (x);
> >Is there any way how to find out address space of a SYMBOL_REF?
> 
> TYPE_ADDR_SPACE (TREE_TYPE (SYMBOL_REF_DECL ())) I guess.

And default to ADDR_SPACE_GENERIC if there is no SYMBOL_REF_DECL?
That can work.

> >Or shall it hardcode ADDR_SPACE_GENERIC?
> > 3) tree-ssa-structalias.cc has:
> >   if ((TREE_CODE (t) == INTEGER_CST
> >&& integer_zerop (t))
> >   /* The only valid CONSTRUCTORs in gimple with pointer typed
> >  elements are zero-initializer.  But in IPA mode we also
> >  process global initializers, so verify at least.  */
> >   || (TREE_CODE (t) == CONSTRUCTOR
> >   && CONSTRUCTOR_NELTS (t) == 0))
> > {
> >   if (flag_delete_null_pointer_checks)
> > temp.var = nothing_id;
> >   else
> > temp.var = nonlocal_id;
> >   temp.type = ADDRESSOF;
> >   temp.offset = 0;
> >   results->safe_push (temp);
> >   return;
> > }
> >    not really sure where to get the address space from in that case
> > 
> > And perhaps I didn't do it right in some other spots too.
> 
> This case is really difficult since we track pointers through integers
> (mind the missing POINTER_TYPE_P check above).  Of course we have
> no idea what address-space the integer was converted from or will
> be converted to so what the above wants to check is whether
> there is _any_ address-space that could have a zero pointer pointing
> to a valid object ...

Ugh.  So that would be ADDR_SPACE_ANY ((unsigned char) -1) and use that
in the hook?
But we'd penalize x86 through it because for the __seg_?s address spaces
we allow 0 address...

> > --- gcc/targhooks.cc.jj 2022-01-18 11:58:59.919977242 +0100
> > +++ gcc/targhooks.cc 2022-02-09 13:21:08.958835833 +0100
> > @@ -1598,7 +1598,7 @@ default_addr_space_subset_p (addr_space_
> >  bool
> >  default_addr_space_zero_address_valid (addr_space_t as ATTRIBUTE_UNUSED)
> >  {
> > -  return false;
> > +  return !flag_delete_null_pointer_checks_;
> 
> As said, I'd not do that, but check it in zero_address_valid only.
> Otherwise all targets overriding the hook have to remember to check
> this flag.  I suppose we'd then do
> 
>   if (option_set (flag_delete_null_pointer_check))
> use flag_delete_null_pointer_check;
>   else
> use targetm.zero_address_valid;
> 
> possibly only for the default address-space.

The advantage of checking the option in the hook is that it can precisely
decide what exactly it wants for each address space.  It can e.g. decide
to ignore the flag and say that in some address space 0 is always valid or 0
is never valid, or honor it under some conditions etc.
Doing it outside of the hook means we do the decision globally, and either
we hardcode targetm.addr_space.zero_address_valid || 
!flag_delete_null_pointer_check_, or
targetm.addr_space.zero_address_valid && !flag_delete_null_pointer_check_
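
The trade-off can be modelled with a few small functions.  This is a
sketch only: all names are hypothetical, and the fall-through of
hook_checks_flag (honoring the flag for the generic address space) is an
assumption based on the default hook in the WIP patch.

```cpp
// Hypothetical model; none of these names are GCC's.
enum class AddrSpace { Generic, SegFs, SegGs };

// Hook that consults the flag itself: non-generic spaces always allow
// address 0, the generic one honors the flag (assumed fall-through).
bool hook_checks_flag(AddrSpace as, bool delete_null_checks) {
  if (as != AddrSpace::Generic)
    return true;
  return !delete_null_checks;
}

// Flag-agnostic hook, as x86 behaves before the patch.
bool hook_plain(AddrSpace as) { return as != AddrSpace::Generic; }

// The two global combinations available when the flag is checked outside.
bool outside_or(AddrSpace as, bool delete_null_checks) {
  return hook_plain(as) || !delete_null_checks;
}

bool outside_and(AddrSpace as, bool delete_null_checks) {
  return hook_plain(as) && !delete_null_checks;
}
```

In this model the || form happens to coincide with the x86-style hook,
while the && form lets the flag override non-generic spaces but can never
report address 0 as valid in the generic space.  A target that wants a
different mix per address space can only express it inside the hook.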

> > --- gcc/config/i386/i386.cc.jj  2022-02-09 12:55:50.716774241 +0100
> > +++ gcc/config/i386/i386.cc 2022-02-09 13:23:01.041272540 +0100
> > @@ -23804,7 +23804,9 @@ ix86_gen_scratch_sse_rtx (machine_mode m
> >  static bool
> >  ix86_addr_space_zero_address_valid (addr_space_t as)
> >  {
> > -  return as != ADDR_SPACE_GENERIC;
> > +  if (as != ADDR_SPACE_GENERIC)
> > +return true;
> 
> > so this makes it not possible to use -fdelete-null-pointer-checks to
> override the non-default address space behavior (on x86)

Yes.  To some extent that is already the current behavior as
targetm.addr_space.zero_address_valid is used in some spots explicitly.
But at least it is a target's decision and without introducing further
options like -fdelete-null-pointer-check=address_space
we need one or the other choice.

> > --- gcc/config/msp430/msp430.cc.jj  2022-02-04 14:36:54.410613609 +0100
> > +++ gcc/config/msp430/msp430.cc 2022-02-09 13:04:09

Re: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

2022-02-09 Thread Thomas Rodgers via Gcc-patches
Excessively enthusiastic refactoring. I expect to rewrite most of this as
part of the work I'm starting now for GCC13 stage1.

On Wed, Feb 9, 2022 at 2:43 AM Jonathan Wakely  wrote:

> On Wed, 9 Feb 2022 at 00:57, Thomas Rodgers via Libstdc++
>  wrote:
> >
> > This issue was observed as a deadlock in
> > 29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
> > "laundered" (e.g. type T* does not suffice as a waitable address for the
> > platform's native waiting primitive), the address waited is that of the
> > _M_ver member of __waiter_pool_base, so several threads may wait on the
> > same address for unrelated atomics. As noted in the PR, the
> > implementation correctly exits the wait for the thread whose data
> > changed, but not for any other threads waiting on the same address.
> >
> > As noted in the PR the __waiter::_M_do_wait_v member was correctly
> > exiting but the other waiters were not reloading the value of _M_ver
> > before re-entering the wait.
> >
> > Moving the spin call inside the loop accomplishes this, and is
> > consistent with the predicate accepting version of __waiter::_M_do_wait.
>
> There is a change to the memory order in _S_do_spin_v which is not
> described in the commit msg or the changelog. Is that unintentional?
>
> (Aside: why do we even have _S_do_spin_v, it's called in exactly one
> place, so could just be inlined into _M_do_spin_v, couldn't it?)
>
>


Re: [pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 09, 2022 at 09:40:49AM -0500, Jason Merrill via Gcc-patches wrote:
> The C++ committee just updated the values of these macros to reflect some
> late C++20 papers that we implement but others don't yet; see PR103891.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

So, shouldn't we update project/cxx-status.html for that change?

Like following?

diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index 014fed8b..4bbff256 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -312,7 +312,7 @@
Concepts 
   https://wg21.link/p0734r0";>P0734R0
10 
-   __cpp_concepts >= 201907 
+   __cpp_concepts >= 202002 
 
 
   https://wg21.link/p0857r0";>P0857R0
@@ -590,7 +590,7 @@
 
   https://wg21.link/p1330r0";>P1330R0
9 

-   
+   __cpp_constexpr >= 202002 
 
 
   


Jakub



Re: [PATCH] rs6000: Correct function prototypes for vec_replace_unaligned

2022-02-09 Thread Segher Boessenkool
Hi!

On Tue, Feb 08, 2022 at 03:29:48PM -0600, Bill Schmidt wrote:
> Due to a pasto error in the documentation, vec_replace_unaligned was
> implemented with the same function prototypes as vec_replace_elt.  It was
> intended that vec_replace_unaligned always specify output vectors as having
> type vector unsigned char, to emphasize that elements are potentially
> misaligned by this built-in function.

In the documentation that isn't publicly available in the first place,
heh.

I don't see how the result type should always be vector unsigned char
here, and not for every other builtin the same.  But whatever the
documentation says, we should do of course.

> This patch corrects the misimplementation.

> gcc/
>   * config/rs6000/rs6000-builtins.def (VREPLACE_UN_UV2DI): Change
>   function prototype.
>   (VREPLACE_UN_UV4SI): Likewise.
>   (VREPLACE_UN_V2DF): Likewise.
>   (VREPLACE_UN_V2DI): Likewise.
>   (VREPLACE_UN_V4SF): Likewise.
>   (VREPLACE_UN_V4SI): Likewise.
>   * config/rs6000/rs6000-overload.def (VEC_REPLACE_UN): Change all
>   function prototypes.
>   * config/rs6000/vsx.md (vreplace_un_): Remove define_expand.
>   (vreplace_un_): New define_insn.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/vec-replace-word-runnable.c: Handle expected
>   prototypes for each call to vec_replace_unaligned.

> +(define_insn "vreplace_un_"
> + [(set (match_operand:V16QI 0 "register_operand" "=v")
> +  (unspec:V16QI [(match_operand:REPLACE_ELT 1 "register_operand" "0")
> + (match_operand: 2 "register_operand" "r")
> +  (match_operand:QI 3 "const_0_to_12_operand" "n")]
> + UNSPEC_REPLACE_UN))]
> + "TARGET_POWER10"
> + "vins %0,%2,%3"
> + [(set_attr "type" "vecsimple")])

const_0_to_12_operand is wrong for vinsd.  Not new after your patch, but
still broken.  You could use const_0_to_15_operand together with an insn
condition for example.  It seems all uses of 0_to_12 have similar
problems?  Removing it completely also saves us from having to fix its
documentation (in predicates.md) :-)

The patch is okay for trunk.  Thanks!


Segher


Re: [pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 15:24, Jakub Jelinek  wrote:
>
> On Wed, Feb 09, 2022 at 09:40:49AM -0500, Jason Merrill via Gcc-patches wrote:
> > The C++ committee just updated the values of these macros to reflect some
> > late C++20 papers that we implement but others don't yet; see PR103891.
> >
> > Tested x86_64-pc-linux-gnu, applying to trunk.
>
> So, shouldn't we update project/cxx-status.html for that change?
>
> Like following?
>
> diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
> index 014fed8b..4bbff256 100644
> --- a/htdocs/projects/cxx-status.html
> +++ b/htdocs/projects/cxx-status.html
> @@ -312,7 +312,7 @@
> Concepts 
>https://wg21.link/p0734r0";>P0734R0
>  href="../gcc-10/changes.html#cxx">10 
> -   __cpp_concepts >= 201907 
> +   __cpp_concepts >= 202002 

I don't like this change. The value to check for P0734R0 support is
still 201907. If you want to also check for P0848R3 support, you can
use 202002. So I think it would be better to move the P0848R3 row out
of the rowspan group, and then put 202002 as the macro for that paper.
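
In user code the two thresholds would be tested along these lines.  This
is a sketch; the function name concepts_support and the returned strings
are made up for illustration.

```cpp
// Reports which level of Concepts support the compiler advertises.
// 201907 covers P0734R0; 202002 additionally signals P0848R3, per the
// discussion above.
const char* concepts_support() {
#if defined(__cpp_concepts) && __cpp_concepts >= 202002L
  return "concepts with P0848R3";
#elif defined(__cpp_concepts) && __cpp_concepts >= 201907L
  return "concepts without P0848R3";
#else
  return "no C++20 concepts";
#endif
}
```

Code that only needs P0734R0 keeps testing against 201907 and continues
to work with compilers that have not yet bumped the macro.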


>  
>  
>https://wg21.link/p0857r0";>P0857R0
> @@ -590,7 +590,7 @@
>  
>https://wg21.link/p1330r0";>P1330R0
>  href="../gcc-9/changes.html#cxx">9 
> -   
> +   __cpp_constexpr >= 202002 
>  
>  
>

This one looks fine.



[pushed] c++: modules and explicit(bool) [PR103752]

2022-02-09 Thread Jason Merrill via Gcc-patches
We weren't streaming a C++20 dependent explicit-specifier.

gcc/cp/ChangeLog:

* module.cc (trees_out::core_vals): Stream explicit specifier.
(trees_in::core_vals): Likewise.
* pt.cc (store_explicit_specifier): No longer static.
(tsubst_function_decl): Clear DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P.
* cp-tree.h (lookup_explicit_specifier): Declare.

gcc/testsuite/ChangeLog:

* g++.dg/modules/explicit-bool-1_b.C: New test.
* g++.dg/modules/explicit-bool-1_a.H: New test.
---
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/module.cc  | 10 +
 gcc/cp/pt.cc  | 10 +++--
 .../g++.dg/modules/explicit-bool-1_b.C|  6 +
 .../g++.dg/modules/explicit-bool-1_a.H| 22 +++
 5 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/explicit-bool-1_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/explicit-bool-1_a.H

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f09055e4852..f253b32c3f2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7420,6 +7420,7 @@ extern bool copy_guide_p  (const_tree);
 extern bool template_guide_p   (const_tree);
 extern bool builtin_guide_p(const_tree);
 extern void store_explicit_specifier   (tree, tree);
+extern tree lookup_explicit_specifier  (tree);
 extern void walk_specializations   (bool,
 void (*)(bool, spec_entry *,
  void *),
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 81ceef92df3..3cf0af10bc0 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -6034,6 +6034,9 @@ trees_out::core_vals (tree t)
   WT (t->function_decl.function_specific_target);
   WT (t->function_decl.function_specific_optimization);
   WT (t->function_decl.vindex);
+
+  if (DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (t))
+   WT (lookup_explicit_specifier (t));
   break;
 
 case USING_DECL:
@@ -6531,6 +6534,13 @@ trees_in::core_vals (tree t)
RT (t->function_decl.function_specific_target);
RT (t->function_decl.function_specific_optimization);
RT (t->function_decl.vindex);
+
+   if (DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (t))
+ {
+   tree spec;
+   RT (spec);
+   store_explicit_specifier (t, spec);
+ }
   }
   break;
 
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 862f337886c..5c995da62c6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13870,7 +13870,7 @@ store_explicit_specifier (tree v, tree t)
 
 /* Lookup an element in EXPLICIT_SPECIFIER_MAP.  */
 
-static tree
+tree
 lookup_explicit_specifier (tree v)
 {
   return *explicit_specifier_map->get (v);
@@ -14103,7 +14103,13 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
/*function_p=*/false,
/*i_c_e_p=*/true);
   spec = build_explicit_specifier (spec, complain);
-  DECL_NONCONVERTING_P (r) = (spec == boolean_true_node);
+  if (instantiation_dependent_expression_p (spec))
+   store_explicit_specifier (r, spec);
+  else
+   {
+ DECL_NONCONVERTING_P (r) = (spec == boolean_true_node);
+ DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (r) = false;
+   }
 }
 
   /* OpenMP UDRs have the only argument a reference to the declared
diff --git a/gcc/testsuite/g++.dg/modules/explicit-bool-1_b.C 
b/gcc/testsuite/g++.dg/modules/explicit-bool-1_b.C
new file mode 100644
index 000..27bfdee1a65
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/explicit-bool-1_b.C
@@ -0,0 +1,6 @@
+// { dg-additional-options -fmodules-ts }
+// { dg-require-effective-target c++20 }
+
+export module x;
+import "explicit-bool-1_a.H";
+pair environment;
diff --git a/gcc/testsuite/g++.dg/modules/explicit-bool-1_a.H 
b/gcc/testsuite/g++.dg/modules/explicit-bool-1_a.H
new file mode 100644
index 000..fa14de1137d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/explicit-bool-1_a.H
@@ -0,0 +1,22 @@
+// { dg-additional-options -fmodule-header }
+// { dg-require-effective-target c++20 }
+
+template
+struct pair
+{
+constexpr
+  explicit(__is_same(_T1, _T2))
+  pair()
+  { }
+
+_T1 first;
+_T2 second;
+};
+
+struct string
+{
+  string() { }
+  string(const char* s) : s(s) { }
+
+  const char* s = "";
+};

base-commit: d80f2248c5962318c77624a0eab05b81c59add1b
-- 
2.27.0



Re: [pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 09, 2022 at 03:38:32PM +, Jonathan Wakely wrote:
> On Wed, 9 Feb 2022 at 15:24, Jakub Jelinek  wrote:
> >
> > On Wed, Feb 09, 2022 at 09:40:49AM -0500, Jason Merrill via Gcc-patches 
> > wrote:
> > > The C++ committee just updated the values of these macros to reflect some
> > > late C++20 papers that we implement but others don't yet; see PR103891.
> > >
> > > Tested x86_64-pc-linux-gnu, applying to trunk.
> >
> > So, shouldn't we update project/cxx-status.html for that change?
> >
> > Like following?
> >
> > diff --git a/htdocs/projects/cxx-status.html 
> > b/htdocs/projects/cxx-status.html
> > index 014fed8b..4bbff256 100644
> > --- a/htdocs/projects/cxx-status.html
> > +++ b/htdocs/projects/cxx-status.html
> > @@ -312,7 +312,7 @@
> > Concepts 
> >https://wg21.link/p0734r0";>P0734R0
> >  > href="../gcc-10/changes.html#cxx">10 
> > -   __cpp_concepts >= 201907 
> > +   __cpp_concepts >= 202002 
> 
> I don't like this change. The value to check for P0734R0 support is
> still 201907. If you want to also check for P0848R3 support, you can
> use 202002. So I think it would be better to move the P0848R3 row out
> of the rowspan group, and then put 202002 as the macro for that paper.

So perhaps like following then?
diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index 014fed8b..5141629b 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -312,7 +312,7 @@
Concepts 
   https://wg21.link/p0734r0";>P0734R0
10 
-   __cpp_concepts >= 201907 
+   __cpp_concepts >= 201907 
 
 
   https://wg21.link/p0857r0";>P0857R0
@@ -325,9 +325,11 @@
 
 
   https://wg21.link/p0848r3";>P0848R3
+   __cpp_concepts >= 202002 
 
 
   https://wg21.link/p1616r1";>P1616R1
+   __cpp_concepts >= 201907 
 
 
   https://wg21.link/p1452r2";>P1452R2
@@ -590,7 +592,7 @@
 
   https://wg21.link/p1330r0";>P1330R0
9 

-   
+   __cpp_constexpr >= 202002 
 
 
   


Jakub



[PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Patrick Palka via Gcc-patches
In filter_memfn_lookup, we weren't correctly recognizing and matching up
member functions introduced via a non-dependent using-decl.  This caused
us to crash in the below testcases in which we correctly pruned the
overload set for the non-dependent call ahead of time, but then at
instantiation time filter_memfn_lookup failed to match the selected
function (introduced in each case by a non-dependent using-decl) to the
corresponding function from the new lookup set.  Such member functions
need special handling in filter_memfn_lookup because they look exactly
the same in the old and new lookup sets, whereas ordinary member
functions that're defined in the (dependent) current class become more
specialized in the new lookup set.

This patch reworks the matching logic in filter_memfn_lookup so that it
handles non-dependent using-decls correctly, and is hopefully simpler to
follow.
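
For readers unfamiliar with the situation, the code shape being described
looks roughly like the following.  This is an illustrative sketch only,
not one of the testcases from the patch, and it is not claimed to
reproduce the crash.

```cpp
struct Base {
  int f(int) { return 1; }
  int f(double) { return 2; }
};

template<class T>
struct Derived : Base {
  using Base::f;                 // non-dependent using-decl: Base does not depend on T
  int f(char*) { return 3; }     // member of the (dependent) current class
  int call() { return f(0.5); }  // arguments do not depend on T
};

// Explicit instantiation forces the member bodies to be instantiated.
template struct Derived<int>;
```

The functions introduced by the using-decl are the same Base members
before and after instantiation, while Derived<T>::f(char*) becomes
Derived<int>::f(char*); the call still has to select Base::f(double).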

Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
trunk?

PR c++/104432

gcc/cp/ChangeLog:

* call.cc (build_new_method_call): When a non-dependent call
resolves to a specialization of a member template, always build
the pruned overload set using the member template, not the
specialization.
* pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
and correct how members from the new lookup set are matched to
those from the old one.
(tsubst_baselink): Pass binfo_type as newtype to
filter_memfn_lookup.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent19.C: New test.
* g++.dg/template/non-dependent19a.C: New test.
* g++.dg/template/non-dependent20.C: New test.
---
 gcc/cp/call.cc|  9 ++--
 gcc/cp/pt.cc  | 49 +--
 .../g++.dg/template/non-dependent19.C | 14 ++
 .../g++.dg/template/non-dependent19a.C| 16 ++
 .../g++.dg/template/non-dependent20.C | 16 ++
 5 files changed, 73 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index b2e89c5d783..d6eed5ed835 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree fns, 
vec **args,
   if (really_overloaded_fn (fns))
{
  if (DECL_TEMPLATE_INFO (fn)
- && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
- && dependent_type_p (DECL_CONTEXT (fn)))
+ && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
{
- /* FIXME: We're not prepared to fully instantiate "inside-out"
-partial instantiations such as A::f().  So instead
-use the selected template, not the specialization.  */
+ /* Use the selected template, not the specialization, so that
+this looks like an actual lookup result for sake of
+filter_memfn_lookup.  */
 
  if (OVL_SINGLE_P (fns))
/* If the original overload set consists of a single function
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 862f337886c..3a5d06bf297 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
 }
 
 /* OLDFNS is a lookup set of member functions from some class template, and
-   NEWFNS is a lookup set of member functions from a specialization of that
-   class template.  Return the subset of NEWFNS which are specializations of
-   a function from OLDFNS.  */
+   NEWFNS is a lookup set of member functions from NEWTYPE, a specialization
+   of that class template.  Return the subset of NEWFNS which are
+   specializations of a function from OLDFNS.  */
 
 static tree
-filter_memfn_lookup (tree oldfns, tree newfns)
+filter_memfn_lookup (tree oldfns, tree newfns, tree newtype)
 {
   /* Record all member functions from the old lookup set OLDFNS into
  VISIBLE_SET.  */
@@ -16326,38 +16326,34 @@ filter_memfn_lookup (tree oldfns, tree newfns)
   if (TREE_CODE (fn) == USING_DECL)
{
  /* FIXME: Punt on (dependent) USING_DECL for now; mapping
-a dependent USING_DECL to its instantiation seems
-tricky.  */
+a dependent USING_DECL to the member functions it introduces
+seems tricky.  */
  gcc_checking_assert (DECL_DEPENDENT_P (fn));
  return newfns;
}
-  else if (TREE_CODE (fn) == TEMPLATE_DECL)
-   /* A member function template.  */
-   visible_set.add (fn);
-  else if (TREE_CODE (fn) == FUNCTION_DECL)
-   {
- if (DECL_TEMPLATE_INFO (fn))
-   /* A non-template member function.  */
-   visible_set.add (DECL_TI_TEMPLATE (fn));
- else
-   

Re: [pushed] c++: P2493 feature test macro updates

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 15:45, Jakub Jelinek  wrote:
>
> On Wed, Feb 09, 2022 at 03:38:32PM +, Jonathan Wakely wrote:
> > On Wed, 9 Feb 2022 at 15:24, Jakub Jelinek  wrote:
> > >
> > > On Wed, Feb 09, 2022 at 09:40:49AM -0500, Jason Merrill via Gcc-patches 
> > > wrote:
> > > > The C++ committee just updated the values of these macros to reflect 
> > > > some
> > > > late C++20 papers that we implement but others don't yet; see PR103891.
> > > >
> > > > Tested x86_64-pc-linux-gnu, applying to trunk.
> > >
> > > So, shouldn't we update project/cxx-status.html for that change?
> > >
> > > Like following?
> > >
> > > diff --git a/htdocs/projects/cxx-status.html 
> > > b/htdocs/projects/cxx-status.html
> > > index 014fed8b..4bbff256 100644
> > > --- a/htdocs/projects/cxx-status.html
> > > +++ b/htdocs/projects/cxx-status.html
> > > @@ -312,7 +312,7 @@
> > > Concepts 
> > >https://wg21.link/p0734r0";>P0734R0
> > >  > > href="../gcc-10/changes.html#cxx">10 
> > > -   __cpp_concepts >= 201907 
> > > +   __cpp_concepts >= 202002 
> >
> > I don't like this change. The value to check for P0734R0 support is
> > still 201907. If you want to also check for P0848R3 support, you can
> > use 202002. So I think it would be better to move the P0848R3 row out
> > of the rowspan group, and then put 202002 as the macro for that paper.
>
> So perhaps like following then?

That looks good to me.


> diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
> index 014fed8b..5141629b 100644
> --- a/htdocs/projects/cxx-status.html
> +++ b/htdocs/projects/cxx-status.html
> @@ -312,7 +312,7 @@
> Concepts 
>https://wg21.link/p0734r0";>P0734R0
>  href="../gcc-10/changes.html#cxx">10 
> -   __cpp_concepts >= 201907 
> +   __cpp_concepts >= 201907 
>  
>  
>https://wg21.link/p0857r0";>P0857R0
> @@ -325,9 +325,11 @@
>  
>  
>https://wg21.link/p0848r3";>P0848R3
> +   __cpp_concepts >= 202002 
>  
>  
>https://wg21.link/p1616r1";>P1616R1
> +   __cpp_concepts >= 201907 
>  
>  
>https://wg21.link/p1452r2";>P1452R2
> @@ -590,7 +592,7 @@
>  
>https://wg21.link/p1330r0";>P1330R0
>  href="../gcc-9/changes.html#cxx">9 
> -   
> +   __cpp_constexpr >= 202002 
>  
>  
>
>
>
> Jakub
>



Re: [PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 10:45, Patrick Palka wrote:

In filter_memfn_lookup, we weren't correctly recognizing and matching up
member functions introduced via a non-dependent using-decl.  This caused
us to crash in the below testcases in which we correctly pruned the
overload set for the non-dependent call ahead of time, but then at
instantiation time filter_memfn_lookup failed to match the selected
function (introduced in each case by a non-dependent using-decl) to the
corresponding function from the new lookup set.  Such member functions
need special handling in filter_memfn_lookup because they look exactly
the same in the old and new lookup sets, whereas ordinary member
functions that're defined in the (dependent) current class become more
specialized in the new lookup set.

This patch reworks the matching logic in filter_memfn_lookup so that it
handles non-dependent using-decls correctly, and is hopefully simpler to
follow.

Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
trunk?

PR c++/104432

gcc/cp/ChangeLog:

* call.cc (build_new_method_call): When a non-dependent call
resolves to a specialization of a member template, always build
the pruned overload set using the member template, not the
specialization.
* pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
and correct how members from the new lookup set are matched to
those from the old one.
(tsubst_baselink): Pass binfo_type as newtype to
filter_memfn_lookup.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent19.C: New test.
* g++.dg/template/non-dependent19a.C: New test.
* g++.dg/template/non-dependent20.C: New test.
---
  gcc/cp/call.cc|  9 ++--
  gcc/cp/pt.cc  | 49 +--
  .../g++.dg/template/non-dependent19.C | 14 ++
  .../g++.dg/template/non-dependent19a.C| 16 ++
  .../g++.dg/template/non-dependent20.C | 16 ++
  5 files changed, 73 insertions(+), 31 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
  create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
  create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index b2e89c5d783..d6eed5ed835 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree fns, 
vec **args,
if (really_overloaded_fn (fns))
{
  if (DECL_TEMPLATE_INFO (fn)
- && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
- && dependent_type_p (DECL_CONTEXT (fn)))
+ && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
{
- /* FIXME: We're not prepared to fully instantiate "inside-out"
-partial instantiations such as A::f().  So instead
-use the selected template, not the specialization.  */
+ /* Use the selected template, not the specialization, so that
+this looks like an actual lookup result for sake of
+filter_memfn_lookup.  */
  
  	  if (OVL_SINGLE_P (fns))

/* If the original overload set consists of a single function
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 862f337886c..3a5d06bf297 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
  }
  
  /* OLDFNS is a lookup set of member functions from some class template, and

-   NEWFNS is a lookup set of member functions from a specialization of that
-   class template.  Return the subset of NEWFNS which are specializations of
-   a function from OLDFNS.  */
+   NEWFNS is a lookup set of member functions from NEWTYPE, a specialization
+   of that class template.  Return the subset of NEWFNS which are
+   specializations of a function from OLDFNS.  */
  
  static tree

-filter_memfn_lookup (tree oldfns, tree newfns)
+filter_memfn_lookup (tree oldfns, tree newfns, tree newtype)
  {
/* Record all member functions from the old lookup set OLDFNS into
   VISIBLE_SET.  */
@@ -16326,38 +16326,34 @@ filter_memfn_lookup (tree oldfns, tree newfns)
if (TREE_CODE (fn) == USING_DECL)
{
  /* FIXME: Punt on (dependent) USING_DECL for now; mapping
-a dependent USING_DECL to its instantiation seems
-tricky.  */
+a dependent USING_DECL to the member functions it introduces
+seems tricky.  */


FWIW I still think this shouldn't be very tricky.

The patch is OK.


  gcc_checking_assert (DECL_DEPENDENT_P (fn));
  return newfns;
}
-  else if (TREE_CODE (fn) == TEMPLATE_DECL)
-   /* A member function template.  */
-   visible_set.add (fn);
-  else if (TREE_CODE (fn) == FUNCTION_DECL)
-   {
- if (DECL_TEMPLATE_INFO (fn

Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)

2022-02-09 Thread Qing Zhao via Gcc-patches


> On Feb 8, 2022, at 4:20 PM, Jason Merrill  wrote:
> 
> On 2/8/22 15:11, Qing Zhao wrote:
>> Hi,
>> This is the patch to fix PR101515 (ICE in pp_cxx_unqualified_id, at  
>> cp/cxx-pretty-print.c:128)
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101515
>> It's possible that the TYPE_NAME of a record_type is NULL, therefore when
>> printing the TYPE_NAME, we should check and handle this special case.
>> Please see the comment of pr101515 for more details.
>> The fix is very simple: just check for and specially handle cases when TYPE_NAME 
>> is NULL.
>> Bootstrapped and regression tested on both x86 and aarch64, no issues.
>> Okay for commit?
>> Thanks.
>> Qing
>> =
>> From f37ee8d21b80cb77d8108cb97a487c84c530545b Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Tue, 8 Feb 2022 16:10:37 +
>> Subject: [PATCH] Fix PR 101515 ICE in pp_cxx_unqualified_id, at
>>  cp/cxx-pretty-print.c:128.
>> It's possible that the TYPE_NAME of a record_type is NULL, therefore when
>> printing the TYPE_NAME, we should check and handle this special case.
>> gcc/cp/ChangeLog:
>>  * cxx-pretty-print.cc (pp_cxx_unqualified_id): Check and handle
>>  the case when TYPE_NAME is NULL.
>> gcc/testsuite/ChangeLog:
>>  * g++.dg/pr101515.C: New test.
>> ---
>>  gcc/cp/cxx-pretty-print.cc  |  5 -
>>  gcc/testsuite/g++.dg/pr101515.C | 25 +
>>  2 files changed, 29 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/g++.dg/pr101515.C
>> diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
>> index 4f9a090e520d..744ed0add5ba 100644
>> --- a/gcc/cp/cxx-pretty-print.cc
>> +++ b/gcc/cp/cxx-pretty-print.cc
>> @@ -171,7 +171,10 @@ pp_cxx_unqualified_id (cxx_pretty_printer *pp, tree t)
>>  case ENUMERAL_TYPE:
>>  case TYPENAME_TYPE:
>>  case UNBOUND_CLASS_TEMPLATE:
>> -  pp_cxx_unqualified_id (pp, TYPE_NAME (t));
>> +  if (TYPE_NAME (t))
>> +pp_cxx_unqualified_id (pp, TYPE_NAME (t));
>> +  else
>> +pp_string (pp, "");
> 
> Hmm, but it's not an unnamed class, it's a pointer to member function type, 
> and it would be better to avoid dumping compiler internal representations 
> like the __pfn field name.
Yes, it’s not an unnamed class, but the ICE happened when trying to print the 
compiler generated member function type “__ptrmemfunc_type”, whose TYPE_NAME is 
NULLed during building this type in c++ FE and the c++ FE does not handle the 
case when TYPE_NAME is NULL correctly. 

So, there are two levels of issues:

1. The first-level issue is that the current C++ FE does not handle a NULL 
TYPE_NAME correctly. This is indeed a bug in the current code and should be 
fixed as in the current patch. 

2. The second-level issue is what you suggested above: should we print the 
“compiler generated internal type” to the user? I agree with you that it 
might not be a good idea to print such compiler-internal names to the user.  
Do we already do this in general (i.e., check whether the current TYPE is a 
source-level TYPE or a compiler-internal TYPE, and print the name only for a 
source-level TYPE)? And is there a bit in the TYPE that distinguishes a 
user-level type from a compiler-generated internal type?
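
The two levels can be separated in a small toy model. This is only a sketch
under stated assumptions: struct toy_type, its "internal" flag and
toy_type_name are hypothetical names invented for illustration, not GCC's
tree API, and the placeholder strings are made up as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node; this is not GCC's tree.  */
struct toy_type
{
  const char *name;	/* NULL models a cleared TYPE_NAME.  */
  bool internal;	/* Hypothetical "compiler-generated" marker.  */
};

/* Return a string that is always safe to print for T.  */
static const char *
toy_type_name (const struct toy_type *t)
{
  assert (t != NULL);
  /* Second level: do not expose compiler-internal names to the user.  */
  if (t->internal)
    return "<internal type>";
  /* First level: never dereference a missing name.  */
  if (t->name == NULL)
    return "<unnamed>";
  return t->name;
}
```

With this shape the NULL check alone is enough to avoid a crash, while the
extra flag decides what the user actually sees.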

> I think the real problem comes sooner, when c_fold_indirect_ref_for_warn 
> turns a MEM_REF with RECORD_TYPE into a COMPONENT_REF with POINTER_TYPE.
What is the main problem with this transformation? (I will study it in more 
detail.)

thanks.

Qing


> 
>>if (tree ti = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (t))
>>  if (PRIMARY_TEMPLATE_P (TI_TEMPLATE (ti)))
>>{
>> diff --git a/gcc/testsuite/g++.dg/pr101515.C 
>> b/gcc/testsuite/g++.dg/pr101515.C
>> new file mode 100644
>> index ..898c7e003c22
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/pr101515.C
>> @@ -0,0 +1,25 @@
>> +/* PR101515 -  ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128
>> +   { dg-do compile }
>> +   { dg-options "-Wuninitialized -O1" } */
>> +
>> +struct S
>> +{
>> +  int j;
>> +};
>> +struct T : public S
>> +{
>> +  virtual void h () {}
>> +};
>> +struct ptrmemfunc
>> +{
>> +  void (*ptr) ();
>> +};
>> +typedef void (S::*sp)();
>> +int main ()
>> +{
>> +  T t;
>> +  sp x;
>> +  ptrmemfunc *xp = (ptrmemfunc *) &x;
>> +  if (xp->ptr != ((void (*)())(sizeof(void *  /* { dg-warning "is used 
>> uninitialized" } */
>> +return 1;
>> +}
> 



Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth

2022-02-09 Thread Michael Matz via Gcc-patches
Hello,

On Wed, 9 Feb 2022, Richard Biener wrote:

> > That breaks down when a birth is there (because it was otherwise 
> > reachable) but not on the taken path:
> > 
> >   if (nevertrue)
> > goto start;
> >   goto forw;
> >   start:
> >   {
> > int i;
> > printf("not really reachable, but unknowingly so\n");
> >   forw:
> > i = 1;
> >   }
> 
> I think to cause breakage you need a use of 'i' on the side-entry
> path that is not reachable from the path with the birth.  I guess sth like
> 
>if (nevertrue)
>  goto start;
>goto forw;
>start:
>{
>  int i = 0;
>  printf("not really reachable, but unknowingly so\n");
>  goto common;
>forw:
>  i = 1;
>common:
>  foo (&i);
>}
> 
> if we'd have a variable that's live only on the side-entry path
> then it could share the stack slot with 'i' this way, breaking
> things (now we don't move CLOBBERs so it isn't easy to construct
> such case).  The present dataflow would, for the above, indeed
> compute 'i' not live in the forw: block.

Yes, now that we have established (in the subthread with Joseph) that the 
value becomes indeterminate at decls we only need to care for not sharing 
storage invalidly, so yeah, some changes in the conflict computation still 
are needed.

> > Except for switches side-entries are really really seldom, so we might 
> > usefully get away with that latter solution.  And for switches it might be 
> > okay to put the births at the block containing the switch (if it itself 
> > doesn't have side entries, and the switch block doesn't have side 
> > entries except the case labels).
> > 
> > If the birth is moved to outward blocks it might be best if also the 
> > dealloc/death clobbers are moved to it, otherwise there might be paths 
> > containing a birth but no death.
> > 
> > The less precise you get with those births the more non-sharing you'll 
> > get, but that's the price.
> 
> Yes, sure.  I don't see a good way to place births during gimplification
> then.

Well, for each BIND you can compute if there are side entries at all, 
then, when lowering a BIND you put the births into the containing 
innermost BIND that doesn't have side-entries, instead of into the current 
BIND.

> The end clobbers rely on our EH lowering machinery.  For the entries we 
> might be able to compute GIMPLE BIND transitions during BIND lowering if 
> we associate labels with BINDs.  There should be a single fallthru into 
> the BIND at this point.  We could put a flag on the goto destination 
> labels whether they are reached from an outer BIND.
> 
>  goto inner;
>  {
>{
> int i;
>  {
>int j;
> inner:
>goto middle;
>  }
> middle:
>}
>  }
> 
> Since an entry CLOBBER is also a clobber we have to insert birth
> clobbers for all decls of all the binds in between the goto source
> and the destination.  So for the above case on the edge to
> inner: have births for i and j and at the edge to middle we'd
> have none.

Correct, that's basically the most precise scheme, it's what I called 
try-finally topside-down ("always-before"? :) ).  (You have to care for 
computed goto! i.e. BINDs containing address-taken labels, which make 
things even uglier)  I think the easier way to deal with the above is to 
notice that the inner BIND has a side entry and conceptually move the 
decls outwards to BINDs that don't have such (or bind-crossing gotos), 
i.e. do as if it were:

  int i;  // moved
  int j;  // moved
  goto inner;
  {
{
  {
  inner:
goto middle;
  }
  middle:
}
  }
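
The "move the decls outwards" rule can be sketched as a walk up the scope
chain. This is a toy model, not gimplifier code: struct toy_bind and
toy_birth_scope are hypothetical names, and the sketch assumes the
side-entry flag has already been computed for every scope and that the
outermost (function-level) scope never has one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a lexical scope (a BIND); not a GCC type.  */
struct toy_bind
{
  struct toy_bind *outer;	/* Enclosing scope, NULL for the outermost.  */
  bool has_side_entry;		/* Some goto jumps into this scope.  */
};

/* Return the scope that should receive the births for the decls of B:
   B itself if nothing jumps into it, else the innermost enclosing
   scope without side entries.  */
static struct toy_bind *
toy_birth_scope (struct toy_bind *b)
{
  assert (b != NULL);
  while (b->has_side_entry && b->outer != NULL)
    b = b->outer;
  return b;
}
```

In the example above, "goto inner" enters all three nested scopes from
outside, so every one of them is flagged and both i and j end up at
function level.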

> Requires some kind of back-mapping from label to goto sources
> that are possibly problematic.  One issue is that GIMPLE
> lowering re-builds the BLOCK tree (for whatever reason...),
> so I'm not sure if we need to do it before that (for correctness).
> 
> Does that make sense?

I honestly can't say for 100% :-)  It sounds like it makes sense, yes.


Ciao,
Michael.


Re: [PATCH] [PATCH, v4, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-02-09 Thread Richard Sandiford via Gcc-patches
Dan Li  writes:
> Shadow Call Stack can be used to protect the return address of a
> function at runtime, and clang already supports this feature[1].
>
> To enable SCS in user mode, in addition to compiler, other support
> is also required (as discussed in [2]). This patch only adds basic
> support for SCS from the compiler side, and provides convenience
> for users to enable SCS.
>
> For linux kernel, only the support of the compiler is required.
>
> [1] https://clang.llvm.org/docs/ShadowCallStack.html
> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768
>
> Signed-off-by: Dan Li 
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.c (SLOT_REQUIRED):
>   Rename wb_candidate[12] to wb_push_candidate[12].
>   (aarch64_layout_frame): Likewise, and
>   change callee_adjust when scs is enabled.
>   (aarch64_save_callee_saves):
>   Rename wb_candidate[12] to wb_push_candidate[12].
>   (aarch64_restore_callee_saves): Likewise.
>   (aarch64_get_separate_components): Likewise.
>   (aarch64_expand_prologue): Push x30 onto SCS before it's
>   pushed onto stack.
>   (aarch64_expand_epilogue): Pop x30 from SCS, while
>   preventing it from being popped from the regular stack again.
>   (aarch64_override_options_internal): Add SCS compile option check.
>   (TARGET_HAVE_SHADOW_CALL_STACK): New hook.
>   * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled,
>   wb_pop_candidate[12], and rename wb_candidate[12] to
>   wb_push_candidate[12].
>   * config/aarch64/aarch64.md (scs_push): New template.
>   (scs_pop): Likewise.
>   * doc/invoke.texi: Document -fsanitize=shadow-call-stack.
>   * doc/tm.texi: Regenerate.
>   * doc/tm.texi.in: Add hook have_shadow_call_stack.
>   * flag-types.h (enum sanitize_code):
>   Add SANITIZE_SHADOW_CALL_STACK.
>   * opts.c: Add shadow-call-stack.
>   * target.def: New hook.
>   * toplev.c (process_options): Add SCS compile option check.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/shadow_call_stack_1.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_2.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_3.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_4.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_5.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_6.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_7.c: New test.
>   * gcc.target/aarch64/shadow_call_stack_8.c: New test.
> ---
> V4:
> - Added wb_[push|pop]_candidates[12] to ensure push/pop can
> emit different registers.
>
> V3:
> - Change scs_push/pop to standard move patterns.
> - Optimize scs_pop to avoid pop x30 twice when shadow stack is enabled.

LGTM.  Just a few minor comments below.

>
>  gcc/config/aarch64/aarch64.c  | 121 +-
>  gcc/config/aarch64/aarch64.h  |  21 ++-
>  gcc/config/aarch64/aarch64.md |  10 ++
>  gcc/doc/invoke.texi   |  30 +
>  gcc/doc/tm.texi   |   5 +
>  gcc/doc/tm.texi.in|   2 +
>  gcc/flag-types.h  |   2 +
>  gcc/opts.c|   1 +
>  gcc/target.def|   8 ++
>  .../gcc.target/aarch64/shadow_call_stack_1.c  |   6 +
>  .../gcc.target/aarch64/shadow_call_stack_2.c  |   6 +
>  .../gcc.target/aarch64/shadow_call_stack_3.c  |  45 +++
>  .../gcc.target/aarch64/shadow_call_stack_4.c  |  20 +++
>  .../gcc.target/aarch64/shadow_call_stack_5.c  |  18 +++
>  .../gcc.target/aarch64/shadow_call_stack_6.c  |  18 +++
>  .../gcc.target/aarch64/shadow_call_stack_7.c  |  18 +++
>  .../gcc.target/aarch64/shadow_call_stack_8.c  |  24 
>  gcc/toplev.c  |  10 ++
>  18 files changed, 332 insertions(+), 33 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_3.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_5.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_6.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_7.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/shadow_call_stack_8.c
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 699c105a42a..f4d962917c4 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -79,6 +79,7 @@
>  #include "tree-ssa-loop-niter.h"
>  #include "fractional-cost.h"
>  #include "rtlanal.h"
> +#include "asan.h"
>  
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -7291,8 +7292,8 @@ aarch64_layout_frame (v

Re: [Patch]middle-end: updating the reg use in exit block for -fzero-call-used-regs [PR100775]

2022-02-09 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
> Hi, Richard,
>
> Could you please review this patch? This is a fix to the previous 
> -fzero-call-used-regs implementation. 
>
> PR 100775 ( ICE: in df_exit_block_bitmap_verify, at df-scan.c:4164 with 
> -mthumb -fzero-call-used-regs=used)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100775
>
> Although the ICE only happens on arm, this is a bug in the middle end. 
> So, I think this bug has higher priority, 
> needs to be included in gcc12, and also needs to be backported to gcc11. 
>
> In the pass_zero_call_used_regs, when updating dataflow info after adding
> the register zeroing sequence in the epilogue of the function, we should
> call "df_update_exit_block_uses" to update the register use information in
> the exit block to include all the registers that have been zeroed.
>
> The change has been bootstrapped and reg-tested on both x86 and aarch64 (with 
> -enable-checking=yes,rtl,df). 
> Since I cannot find an arm machine, it has not been bootstrapped or reg-tested on arm yet.
>
> For the arm failure, I just tested it with a cross build and it has no 
> issue with the fix.
>
> (One question here:
> Previously, I thought “df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun))” and a 
> later “df_analyze()” should rescan 
> the changed exit block of the function and update all the df info 
> automatically. That is apparently not the case: the register
> use info at the exit block is not automatically updated, so we have to add an 
> explicit call to “df_update_exit_block_uses”.
> I checked pass_thread_prologue_and_epilogue, and it looks like it also 
> explicitly calls “df_update_entry_exit_and_calls” 
> to update the register use info.
> Should “df_set_bb_dirty” + “df_analyze” automatically update the reg use 
> info of the dirty block?)

I think the current df behaviour makes sense.  Updating the set of
live-out registers is a specialised operation and I think it's better
to make it explicit.

>
> Let me know whether there is any issue with the fix?
>
> Thanks
>
> Qing
>
> ===
>
> From e1cca5659c85e7c536f5016a2c75c615e65dba75 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Fri, 28 Jan 2022 16:29:51 +
> Subject: [PATCH] middle-end: updating the reg use in exit block for
> -fzero-call-used-regs [PR100775]
>
> In the pass_zero_call_used_regs, when updating dataflow info after adding
> the register zeroing sequence in the epilogue of the function, we should
> call "df_update_exit_block_uses" to update the register use information in
> the exit block to include all the registers that have been zeroed.
>
> 2022-01-27  Qing Zhao  
>
> gcc/ChangeLog:
>
>   * function.cc (gen_call_used_regs_seq): Call
>   df_update_exit_block_uses when updating df.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/arm/pr100775.c: New test.
> ---
> gcc/function.cc | 1 +
> gcc/testsuite/gcc.target/arm/pr100775.c | 8 
> 2 files changed, 9 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/arm/pr100775.c
>
> diff --git a/gcc/function.cc b/gcc/function.cc
> index e1d2565f8d92..c8a77c9a6246 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -5942,6 +5942,7 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int 
> zero_regs_type)
>   /* Update the data flow information.  */
>   crtl->must_be_zero_on_return |= zeroed_hardregs;
>   df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
> +  df_update_exit_block_uses ();

I think this should replace the df_set_bb_dirty call:
df_update_exit_block_uses will mark the block as dirty where
necessary.
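
As a rough illustration of why the extra df_set_bb_dirty call becomes
redundant, here is a toy model of an update routine that marks its block
dirty only when the recorded use set really changes. It is a sketch of the
idea only: struct toy_exit_block and toy_update_exit_block_uses are
hypothetical names, and a 32-bit mask stands in for the real register
bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the exit block's dataflow record.  */
struct toy_exit_block
{
  uint32_t uses;	/* Recorded live-out hard registers, one bit each.  */
  bool dirty;		/* Block needs rescanning.  */
};

/* Make the recorded uses equal to WANTED.  The block is marked dirty
   only if the set changed, so callers need no separate "set dirty"
   step.  */
static void
toy_update_exit_block_uses (struct toy_exit_block *bb, uint32_t wanted)
{
  assert (bb != NULL);
  if (bb->uses != wanted)
    {
      bb->uses = wanted;
      bb->dirty = true;
    }
}
```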

Nit, but the indentation of the new call looks off.

> }
> }
>
> diff --git a/gcc/testsuite/gcc.target/arm/pr100775.c 
> b/gcc/testsuite/gcc.target/arm/pr100775.c
> new file mode 100644
> index ..dd2255a95492
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr100775.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */

Please add:

/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */

here.

OK with those changes, thanks.  Please wait a week or so before
backporting.

Richard

> +/* { dg-options "-mthumb -fzero-call-used-regs=used" } */
> +
> +int
> +foo (int x)
> +{
> +  return x;
> +}


Re: [PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Patrick Palka via Gcc-patches
On Wed, 9 Feb 2022, Jason Merrill wrote:

> On 2/9/22 10:45, Patrick Palka wrote:
> > In filter_memfn_lookup, we weren't correctly recognizing and matching up
> > member functions introduced via a non-dependent using-decl.  This caused
> > us to crash in the below testcases in which we correctly pruned the
> > overload set for the non-dependent call ahead of time, but then at
> > instantiation time filter_memfn_lookup failed to match the selected
> > function (introduced in each case by a non-dependent using-decl) to the
> > corresponding function from the new lookup set.  Such member functions
> > need special handling in filter_memfn_lookup because they look exactly
> > the same in the old and new lookup sets, whereas ordinary member
> > functions that're defined in the (dependent) current class become more
> > specialized in the new lookup set.
> > 
> > This patch reworks the matching logic in filter_memfn_lookup so that it
> > handles non-dependent using-decls correctly, and is hopefully simpler to
> > follow.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
> > trunk?
> > 
> > PR c++/104432
> > 
> > gcc/cp/ChangeLog:
> > 
> > * call.cc (build_new_method_call): When a non-dependent call
> > resolves to a specialization of a member template, always build
> > the pruned overload set using the member template, not the
> > specialization.
> > * pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
> > and correct how members from the new lookup set are matched to
> > those from the old one.
> > (tsubst_baselink): Pass binfo_type as newtype to
> > filter_memfn_lookup.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/template/non-dependent19.C: New test.
> > * g++.dg/template/non-dependent19a.C: New test.
> > * g++.dg/template/non-dependent20.C: New test.
> > ---
> >   gcc/cp/call.cc|  9 ++--
> >   gcc/cp/pt.cc  | 49 +--
> >   .../g++.dg/template/non-dependent19.C | 14 ++
> >   .../g++.dg/template/non-dependent19a.C| 16 ++
> >   .../g++.dg/template/non-dependent20.C | 16 ++
> >   5 files changed, 73 insertions(+), 31 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C
> > 
> > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > index b2e89c5d783..d6eed5ed835 100644
> > --- a/gcc/cp/call.cc
> > +++ b/gcc/cp/call.cc
> > @@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree fns,
> > vec **args,
> > if (really_overloaded_fn (fns))
> > {
> >   if (DECL_TEMPLATE_INFO (fn)
> > - && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
> > - && dependent_type_p (DECL_CONTEXT (fn)))
> > + && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
> > {
> > - /* FIXME: We're not prepared to fully instantiate "inside-out"
> > -partial instantiations such as A::f().  So instead
> > -use the selected template, not the specialization.  */
> > + /* Use the selected template, not the specialization, so that
> > +this looks like an actual lookup result for sake of
> > +filter_memfn_lookup.  */
> >   if (OVL_SINGLE_P (fns))
> > /* If the original overload set consists of a single function
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 862f337886c..3a5d06bf297 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t
> > complain, tree in_decl)
> >   }
> > /* OLDFNS is a lookup set of member functions from some class template,
> > and
> > -   NEWFNS is a lookup set of member functions from a specialization of that
> > -   class template.  Return the subset of NEWFNS which are specializations
> > of
> > -   a function from OLDFNS.  */
> > +   NEWFNS is a lookup set of member functions from NEWTYPE, a
> > specialization
> > +   of that class template.  Return the subset of NEWFNS which are
> > +   specializations of a function from OLDFNS.  */
> > static tree
> > -filter_memfn_lookup (tree oldfns, tree newfns)
> > +filter_memfn_lookup (tree oldfns, tree newfns, tree newtype)
> >   {
> > /* Record all member functions from the old lookup set OLDFNS into
> >VISIBLE_SET.  */
> > @@ -16326,38 +16326,34 @@ filter_memfn_lookup (tree oldfns, tree newfns)
> > if (TREE_CODE (fn) == USING_DECL)
> > {
> >   /* FIXME: Punt on (dependent) USING_DECL for now; mapping
> > -a dependent USING_DECL to its instantiation seems
> > -tricky.  */
> > +a dependent USING_DECL to the member functions it introduces
> > +seems tricky.  */
> 
> FWIW I still think this shouldn't be very tricky.

I trie

Re: [Patch]middle-end: updating the reg use in exit block for -fzero-call-used-regs [PR100775]

2022-02-09 Thread Qing Zhao via Gcc-patches


> On Feb 9, 2022, at 10:20 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, Richard,
>> 
>> Could you please review this patch? This is a fix to the previous 
>> -fzero-call-used-regs implementation. 
>> 
>> PR 100775 ( ICE: in df_exit_block_bitmap_verify, at df-scan.c:4164 with 
>> -mthumb -fzero-call-used-regs=used)
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100775
>> 
>> Although the ICE only happens on arm, this is a bug in the middle end. 
>> So, I think this bug has higher priority, 
>> needs to be included in gcc12, and also needs to be backported to gcc11. 
>> 
>> In the pass_zero_call_used_regs, when updating dataflow info after adding
>> the register zeroing sequence in the epilogue of the function, we should
>> call "df_update_exit_block_uses" to update the register use information in
>> the exit block to include all the registers that have been zeroed.
>> 
>> The change has been bootstrapped and reg-tested on both x86 and aarch64 
>> (with -enable-checking=yes,rtl,df). 
>> Since I cannot find an arm machine, it has not been bootstrapped or reg-tested on arm yet.
>> 
>> For the arm failure, I just tested it with a cross build and it has no 
>> issue with the fix.
>> 
>> (One question here:
>> Previously, I thought “df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun))” and a 
>> later “df_analyze()” should rescan 
>> the changed exit block of the function and update all the df info 
>> automatically. That is apparently not the case: the register
>> use info at the exit block is not automatically updated, so we have to add an 
>> explicit call to “df_update_exit_block_uses”.
>> I checked pass_thread_prologue_and_epilogue, and it looks like it also 
>> explicitly calls “df_update_entry_exit_and_calls” 
>> to update the register use info.
>> Should “df_set_bb_dirty” + “df_analyze” automatically update the reg use 
>> info of the dirty block?)
> 
> I think the current df behaviour makes sense.  Updating the set of
> live-out registers is a specialised operation and I think it's better
> to make it explicit.

Okay. I see.
> 
>> 
>> Let me know whether there is any issue with the fix?
>> 
>> Thanks
>> 
>> Qing
>> 
>> ===
>> 
>> From e1cca5659c85e7c536f5016a2c75c615e65dba75 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 28 Jan 2022 16:29:51 +
>> Subject: [PATCH] middle-end: updating the reg use in exit block for
>> -fzero-call-used-regs [PR100775]
>> 
>> In the pass_zero_call_used_regs, when updating dataflow info after adding
>> the register zeroing sequence in the epilogue of the function, we should
>> call "df_update_exit_block_uses" to update the register use information in
>> the exit block to include all the registers that have been zeroed.
>> 
>> 2022-01-27  Qing Zhao  
>> 
>> gcc/ChangeLog:
>> 
>>  * function.cc (gen_call_used_regs_seq): Call
>>  df_update_exit_block_uses when updating df.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/arm/pr100775.c: New test.
>> ---
>> gcc/function.cc | 1 +
>> gcc/testsuite/gcc.target/arm/pr100775.c | 8 
>> 2 files changed, 9 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/arm/pr100775.c
>> 
>> diff --git a/gcc/function.cc b/gcc/function.cc
>> index e1d2565f8d92..c8a77c9a6246 100644
>> --- a/gcc/function.cc
>> +++ b/gcc/function.cc
>> @@ -5942,6 +5942,7 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int 
>> zero_regs_type)
>>  /* Update the data flow information.  */
>>  crtl->must_be_zero_on_return |= zeroed_hardregs;
>>  df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
>> +  df_update_exit_block_uses ();
> 
> I think this should replace the df_set_bb_dirty call:
> df_update_exit_block_uses will mark the block as dirty where
> necessary.

Okay, will do that.
> 
> Nit, but the indentation of the new call looks off.
Will fix it.
> 
>>}
>> }
>> 
>> diff --git a/gcc/testsuite/gcc.target/arm/pr100775.c 
>> b/gcc/testsuite/gcc.target/arm/pr100775.c
>> new file mode 100644
>> index ..dd2255a95492
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/pr100775.c
>> @@ -0,0 +1,8 @@
>> +/* { dg-do compile } */
> 
> Please add:
> 
> /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
Will add it. 
> 
> here.
> 
> OK with those changes, thanks.

Thank you!
>  Please wait a week or so before
> backporting.

Okay.

Qing
> 
> Richard
> 
>> +/* { dg-options "-mthumb -fzero-call-used-regs=used" } */
>> +
>> +int
>> +foo (int x)
>> +{
>> +  return x;
>> +}



[pushed 0/8] aarch64: Fix regression in vec_init code quality

2022-02-09 Thread Richard Sandiford via Gcc-patches
The main purpose of this patch series is to fix a performance
regression from GCC 8.  Before the patch:

int64x2_t s64q_1(int64_t a0, int64_t a1) {
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
return (int64x2_t) { a1, a0 };
  else
return (int64x2_t) { a0, a1 };
}

generated:

fmov d0, x0
ins v0.d[1], x1
ins v0.d[1], x1
ret

whereas GCC 8 generated the more respectable:

dup v0.2d, x0
ins v0.d[1], x1
ret

But there are some related knock-on changes that IMO are needed to keep
things in a consistent and maintainable state.

There is still more cleanup and optimisation that could be done in this
area, but that's definitely GCC 13 material.
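
For anyone who wants to look at the same constructor on a target without
arm_neon.h, the test function can be restated with GCC's generic vector
extension. This is a minimal sketch: toy_v2di and toy_s64q_1 are names made
up here and are not part of the series, and no particular code generation
is claimed for it.

```c
#include <stdint.h>

/* Two 64-bit lanes; a generic stand-in for int64x2_t from arm_neon.h.  */
typedef int64_t toy_v2di __attribute__ ((vector_size (16)));

/* Build a vector whose low 64 bits hold A0 regardless of endianness,
   mirroring s64q_1 above.  */
toy_v2di
toy_s64q_1 (int64_t a0, int64_t a1)
{
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return (toy_v2di) { a1, a0 };
  else
    return (toy_v2di) { a0, a1 };
}
```

Compiling this with optimisation and -S on the target of interest shows how
the two-element initialiser is expanded there.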

Tested on aarch64-linux-gnu and aarch64_be-elf, pushed.

Sorry for the size of the series, but it really did seem like the
best fix in the circumstances.

Richard


[pushed 1/8] aarch64: Tighten general_operand predicates

2022-02-09 Thread Richard Sandiford via Gcc-patches
This patch fixes some cases in which *general_operand was used over
*nonimmediate_operand by patterns that don't accept immediates.
This avoids some complication with later patches.

gcc/
* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set): Use
aarch64_simd_nonimmediate_operand instead of
aarch64_simd_general_operand.
(@aarch64_combinez): Use nonimmediate_operand instead of
general_operand.
(@aarch64_combinez_be): Likewise.
---
 gcc/config/aarch64/aarch64-simd.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 6646e069ad2..9529bdb4997 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1039,7 +1039,7 @@ (define_insn "aarch64_simd_vec_set"
   [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w")
(vec_merge:VALL_F16
(vec_duplicate:VALL_F16
-   (match_operand: 1 "aarch64_simd_general_operand" 
"w,?r,Utv"))
+   (match_operand: 1 "aarch64_simd_nonimmediate_operand" 
"w,?r,Utv"))
(match_operand:VALL_F16 3 "register_operand" "0,0,0")
(match_operand:SI 2 "immediate_operand" "i,i,i")))]
   "TARGET_SIMD"
@@ -4380,7 +4380,7 @@ (define_insn "store_pair_lanes"
 (define_insn "@aarch64_combinez"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
(vec_concat:
- (match_operand:VDC 1 "general_operand" "w,?r,m")
+ (match_operand:VDC 1 "nonimmediate_operand" "w,?r,m")
  (match_operand:VDC 2 "aarch64_simd_or_scalar_imm_zero")))]
   "TARGET_SIMD && !BYTES_BIG_ENDIAN"
   "@
@@ -4395,7 +4395,7 @@ (define_insn "@aarch64_combinez_be"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 (vec_concat:
  (match_operand:VDC 2 "aarch64_simd_or_scalar_imm_zero")
- (match_operand:VDC 1 "general_operand" "w,?r,m")))]
+ (match_operand:VDC 1 "nonimmediate_operand" "w,?r,m")))]
   "TARGET_SIMD && BYTES_BIG_ENDIAN"
   "@
mov\\t%0.8b, %1.8b
-- 
2.25.1



[pushed 2/8] aarch64: Generalise vec_set predicate

2022-02-09 Thread Richard Sandiford via Gcc-patches
The aarch64_simd_vec_set define_insn takes memory operands,
so this patch makes the vec_set optab expander do the same.

gcc/
* config/aarch64/aarch64-simd.md (vec_set): Allow the
element to be an aarch64_simd_nonimmediate_operand.
---
 gcc/config/aarch64/aarch64-simd.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 9529bdb4997..872a3d78269 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1378,7 +1378,7 @@ (define_insn "vec_shr_"
 
 (define_expand "vec_set"
   [(match_operand:VALL_F16 0 "register_operand")
-   (match_operand: 1 "register_operand")
+   (match_operand: 1 "aarch64_simd_nonimmediate_operand")
(match_operand:SI 2 "immediate_operand")]
   "TARGET_SIMD"
   {
-- 
2.25.1



[pushed 3/8] aarch64: Generalise adjacency check for load_pair_lanes

2022-02-09 Thread Richard Sandiford via Gcc-patches
This patch generalises the load_pair_lanes guard so that
it uses aarch64_check_consecutive_mems to check for consecutive
mems.  It also allows the pattern to be used for STRICT_ALIGNMENT
targets if the alignment is high enough.

The main aim is to avoid an inline test, for the sake of a later patch
that needs to repeat it.  Reusing aarch64_check_consecutive_mems seemed
simpler than writing an entirely new function.
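
The nullable "reversed" parameter can be pictured with a toy model. This is
a sketch under loud assumptions: struct toy_mem and both toy_* functions
are hypothetical, memory references are reduced to a base register number
plus constant offset and size, and the alignment condition for
STRICT_ALIGNMENT targets is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a memory reference; this is not GCC's rtx.  */
struct toy_mem
{
  int base;		/* Base register number.  */
  uint64_t offset;	/* Constant byte offset from the base.  */
  uint64_t size;	/* Access size in bytes.  */
};

/* If REVERSED is null, return true only if M2 immediately follows M1.
   Otherwise return true if the two are adjacent in either order and
   set *REVERSED to true if M1 comes after M2.  */
static bool
toy_consecutive_mems (const struct toy_mem *m1, const struct toy_mem *m2,
		      bool *reversed)
{
  assert (m1 != NULL && m2 != NULL);
  if (reversed)
    *reversed = false;
  if (m1->base != m2->base)
    return false;
  if (m1->offset + m1->size == m2->offset)
    return true;
  if (reversed && m2->offset + m2->size == m1->offset)
    {
      *reversed = true;
      return true;
    }
  return false;
}

/* Order matters when merging two loads into one, so pass a null
   REVERSED.  */
static bool
toy_mergeable_load_pair_p (const struct toy_mem *m1, const struct toy_mem *m2)
{
  return toy_consecutive_mems (m1, m2, NULL);
}
```

A caller that can swap its operands passes a real pointer and inspects the
flag; a caller that cannot swap passes null and gets the stricter answer.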

gcc/
* config/aarch64/aarch64-protos.h (aarch64_mergeable_load_pair_p):
Declare.
* config/aarch64/aarch64-simd.md (load_pair_lanes): Use
aarch64_mergeable_load_pair_p instead of inline check.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Likewise.
(aarch64_check_consecutive_mems): Allow the reversed parameter
to be null.
(aarch64_mergeable_load_pair_p): New function.
---
 gcc/config/aarch64/aarch64-protos.h   |  1 +
 gcc/config/aarch64/aarch64-simd.md|  7 +--
 gcc/config/aarch64/aarch64.cc | 54 ---
 gcc/testsuite/gcc.target/aarch64/vec-init-6.c | 12 +
 gcc/testsuite/gcc.target/aarch64/vec-init-7.c | 12 +
 5 files changed, 62 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-7.c

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 26368538a55..b75ed35635b 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1000,6 +1000,7 @@ void aarch64_atomic_assign_expand_fenv (tree *, tree *, 
tree *);
 int aarch64_ccmp_mode_to_code (machine_mode mode);
 
 bool extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset);
+bool aarch64_mergeable_load_pair_p (machine_mode, rtx, rtx);
 bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode);
 bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode);
 void aarch64_swap_ldrstr_operands (rtx *, bool);
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 872a3d78269..c5bc2ea658b 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4353,11 +4353,8 @@ (define_insn "load_pair_lanes"
(vec_concat:
   (match_operand:VDC 1 "memory_operand" "Utq")
   (match_operand:VDC 2 "memory_operand" "m")))]
-  "TARGET_SIMD && !STRICT_ALIGNMENT
-   && rtx_equal_p (XEXP (operands[2], 0),
-  plus_constant (Pmode,
- XEXP (operands[1], 0),
- GET_MODE_SIZE (mode)))"
+  "TARGET_SIMD
+   && aarch64_mergeable_load_pair_p (mode, operands[1], operands[2])"
   "ldr\\t%q0, %1"
   [(set_attr "type" "neon_load1_1reg_q")]
 )
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 296145e6008..c47543aebf3 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -21063,11 +21063,7 @@ aarch64_expand_vector_init (rtx target, rtx vals)
 for store_pair_lanes.  */
  if (memory_operand (x0, inner_mode)
  && memory_operand (x1, inner_mode)
- && !STRICT_ALIGNMENT
- && rtx_equal_p (XEXP (x1, 0),
- plus_constant (Pmode,
-XEXP (x0, 0),
-GET_MODE_SIZE (inner_mode
+ && aarch64_mergeable_load_pair_p (mode, x0, x1))
{
  rtx t;
  if (inner_mode == DFmode)
@@ -24687,14 +24683,20 @@ aarch64_sched_adjust_priority (rtx_insn *insn, int 
priority)
   return priority;
 }
 
-/* Check if *MEM1 and *MEM2 are consecutive memory references and,
+/* If REVERSED is null, return true if memory reference *MEM2 comes
+   immediately after memory reference *MEM1.  Do not change the references
+   in this case.
+
+   Otherwise, check if *MEM1 and *MEM2 are consecutive memory references and,
if they are, try to make them use constant offsets from the same base
register.  Return true on success.  When returning true, set *REVERSED
to true if *MEM1 comes after *MEM2, false if *MEM1 comes before *MEM2.  */
 static bool
 aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed)
 {
-  *reversed = false;
+  if (reversed)
+*reversed = false;
+
   if (GET_RTX_CLASS (GET_CODE (XEXP (*mem1, 0))) == RTX_AUTOINC
   || GET_RTX_CLASS (GET_CODE (XEXP (*mem2, 0))) == RTX_AUTOINC)
 return false;
@@ -24723,7 +24725,7 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, 
bool *reversed)
   if (known_eq (UINTVAL (offset1) + size1, UINTVAL (offset2)))
return true;
 
-  if (known_eq (UINTVAL (offset2) + size2, UINTVAL (offset1)))
+  if (known_eq (UINTVAL (offset2) + size2, UINTVAL (offset1)) && reversed)
{
  *reversed = true;
  return 
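
The contract described in the updated comment can be modelled in
isolation.  The following is only a minimal sketch, not the GCC
implementation: mem_ref, consecutive_p and the plain integer offsets are
hypothetical stand-ins for the rtx operands and offset arithmetic that
aarch64_check_consecutive_mems works on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a memory reference: a constant byte offset
// from a common base register, plus an access size in bytes.
struct mem_ref
{
  std::uint64_t offset;
  std::uint64_t size;
};

// Model of the nullable REVERSED parameter.  If REVERSED is null, only
// "MEM2 comes immediately after MEM1" is accepted.  Otherwise either
// order is accepted and *REVERSED records which one was found.
bool
consecutive_p (const mem_ref &mem1, const mem_ref &mem2, bool *reversed)
{
  if (reversed)
    *reversed = false;

  if (mem1.offset + mem1.size == mem2.offset)
    return true;

  if (reversed && mem2.offset + mem2.size == mem1.offset)
    {
      *reversed = true;
      return true;
    }

  return false;
}

// Usage example: the in-order pair is accepted in both modes, the swapped
// pair only when the caller asks for REVERSED.
void
usage_example ()
{
  mem_ref lo = {0, 8}, hi = {8, 8};
  bool reversed = false;
  assert (consecutive_p (lo, hi, nullptr));
  assert (!consecutive_p (hi, lo, nullptr));
  assert (consecutive_p (hi, lo, &reversed) && reversed);
}
```

A caller that merely asks "does the second reference follow the first?"
passes a null pointer, so a swapped pair is rejected rather than silently
accepted.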

[pushed 4/8] aarch64: Remove redundant vec_concat patterns

2022-02-09 Thread Richard Sandiford via Gcc-patches
move_lo_quad_internal_ and move_lo_quad_internal_be_
partially duplicate the later aarch64_combinez{,_be} patterns.
The duplication itself is a regression.

The only substantive differences between the two are:

* combinez uses vector MOV (ORR) instead of element MOV (DUP).
  The former seems more likely to be handled via renaming.

* combinez disparages the GPR->FPR alternative whereas move_lo_quad
  gave it equal cost.  The new test gives a token example of when
  the combinez behaviour helps.

gcc/
* config/aarch64/aarch64-simd.md (move_lo_quad_internal_)
(move_lo_quad_internal_be_): Delete.
(move_lo_quad_): Use aarch64_combine instead of the above.

gcc/testsuite/
* gcc.target/aarch64/vec-init-8.c: New test.
---
 gcc/config/aarch64/aarch64-simd.md| 37 +--
 gcc/testsuite/gcc.target/aarch64/vec-init-8.c | 15 
 2 files changed, 17 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-8.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index c5bc2ea658b..d6cd4c70fe7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1584,46 +1584,13 @@ (define_insn "aarch64_p"
 ;; On little-endian this is { operand, zeroes }
 ;; On big-endian this is { zeroes, operand }
 
-(define_insn "move_lo_quad_internal_"
-  [(set (match_operand:VQMOV 0 "register_operand" "=w,w,w")
-   (vec_concat:VQMOV
- (match_operand: 1 "register_operand" "w,r,r")
- (match_operand: 2 "aarch64_simd_or_scalar_imm_zero")))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "@
-   dup\\t%d0, %1.d[0]
-   fmov\\t%d0, %1
-   dup\\t%d0, %1"
-  [(set_attr "type" "neon_dup,f_mcr,neon_dup")
-   (set_attr "length" "4")
-   (set_attr "arch" "simd,fp,simd")]
-)
-
-(define_insn "move_lo_quad_internal_be_"
-  [(set (match_operand:VQMOV 0 "register_operand" "=w,w,w")
-   (vec_concat:VQMOV
- (match_operand: 2 "aarch64_simd_or_scalar_imm_zero")
- (match_operand: 1 "register_operand" "w,r,r")))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
-  "@
-   dup\\t%d0, %1.d[0]
-   fmov\\t%d0, %1
-   dup\\t%d0, %1"
-  [(set_attr "type" "neon_dup,f_mcr,neon_dup")
-   (set_attr "length" "4")
-   (set_attr "arch" "simd,fp,simd")]
-)
-
 (define_expand "move_lo_quad_"
   [(match_operand:VQMOV 0 "register_operand")
(match_operand: 1 "register_operand")]
   "TARGET_SIMD"
 {
-  rtx zs = CONST0_RTX (mode);
-  if (BYTES_BIG_ENDIAN)
-emit_insn (gen_move_lo_quad_internal_be_ (operands[0], operands[1], zs));
-  else
-emit_insn (gen_move_lo_quad_internal_ (operands[0], operands[1], zs));
+  emit_insn (gen_aarch64_combine (operands[0], operands[1],
+CONST0_RTX (mode)));
   DONE;
 }
 )
diff --git a/gcc/testsuite/gcc.target/aarch64/vec-init-8.c b/gcc/testsuite/gcc.target/aarch64/vec-init-8.c
new file mode 100644
index 000..18f8afe10f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vec-init-8.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+#include <arm_neon.h>
+
+int64x2_t f1(int64_t *ptr) {
+  int64_t x = *ptr;
+  asm volatile ("" ::: "memory");
+  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+return (int64x2_t) { 0, x };
+  else
+return (int64x2_t) { x, 0 };
+}
+
+/* { dg-final { scan-assembler {\tldr\td0, \[x0\]\n} } } */
-- 
2.25.1



[pushed 5/8] aarch64: Add more vec_combine patterns

2022-02-09 Thread Richard Sandiford via Gcc-patches
vec_combine is really one instruction on aarch64, provided that
the lowpart element is in the same register as the destination
vector.  This patch adds patterns for that.

The patch fixes a regression from GCC 8.  Before the patch:

int64x2_t s64q_1(int64_t a0, int64_t a1) {
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
return (int64x2_t) { a1, a0 };
  else
return (int64x2_t) { a0, a1 };
}

generated:

fmov d0, x0
ins v0.d[1], x1
ins v0.d[1], x1
ret

whereas GCC 8 generated the more respectable:

dup v0.2d, x0
ins v0.d[1], x1
ret

gcc/
* config/aarch64/predicates.md (aarch64_reg_or_mem_pair_operand):
New predicate.
* config/aarch64/aarch64-simd.md (*aarch64_combine_internal)
(*aarch64_combine_internal_be): New patterns.

gcc/testsuite/
* gcc.target/aarch64/vec-init-9.c: New test.
* gcc.target/aarch64/vec-init-10.c: Likewise.
* gcc.target/aarch64/vec-init-11.c: Likewise.
---
 gcc/config/aarch64/aarch64-simd.md|  62 
 gcc/config/aarch64/predicates.md  |   4 +
 .../gcc.target/aarch64/vec-init-10.c  |  15 +
 .../gcc.target/aarch64/vec-init-11.c  |  12 +
 gcc/testsuite/gcc.target/aarch64/vec-init-9.c | 267 ++
 5 files changed, 360 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-10.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-11.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-9.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d6cd4c70fe7..ead80396e70 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4326,6 +4326,25 @@ (define_insn "load_pair_lanes"
   [(set_attr "type" "neon_load1_1reg_q")]
 )
 
+;; This STP pattern is a partial duplicate of the general vec_concat patterns
+;; below.  The reason for having both of them is that the alternatives of
+;; the later patterns do not have consistent register preferences: the STP
+;; alternatives have no preference between GPRs and FPRs (and if anything,
+;; the GPR form is more natural for scalar integers) whereas the other
+;; alternatives *require* an FPR for operand 1 and prefer one for operand 2.
+;;
+;; Using "*" to hide the STP alternatives from the RA penalizes cases in
+;; which the destination was always memory.  On the other hand, expressing
+;; the true preferences makes GPRs seem more palatable than they really are
+;; for register destinations.
+;;
+;; Despite that, we do still want the general form to have STP alternatives,
+;; in order to handle cases where a register destination is spilled.
+;;
+;; The best compromise therefore seemed to be to have a dedicated STP
+;; pattern to catch cases in which the destination was always memory.
+;; This dedicated pattern must come first.
+
 (define_insn "store_pair_lanes"
   [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand" "=Umn, Umn")
(vec_concat:
@@ -4338,6 +4357,49 @@ (define_insn "store_pair_lanes"
   [(set_attr "type" "neon_stp, store_16")]
 )
 
+;; Form a vector whose least significant half comes from operand 1 and whose
+;; most significant half comes from operand 2.  The register alternatives
+;; tie the least significant half to the same register as the destination,
+;; so that only the other half needs to be handled explicitly.  For the
+;; reasons given above, the STP alternatives use ? for constraints that
+;; the register alternatives either don't accept or themselves disparage.
+
+(define_insn "*aarch64_combine_internal"
+  [(set (match_operand: 0 "aarch64_reg_or_mem_pair_operand" "=w, w, w, Umn, Umn")
+   (vec_concat:
+ (match_operand:VDC 1 "register_operand" "0, 0, 0, ?w, ?r")
+ (match_operand:VDC 2 "aarch64_simd_nonimmediate_operand" "w, ?r, Utv, w, ?r")))]
+  "TARGET_SIMD
+   && !BYTES_BIG_ENDIAN
+   && (register_operand (operands[0], mode)
+   || register_operand (operands[2], mode))"
+  "@
+   ins\t%0.d[1], %2.d[0]
+   ins\t%0.d[1], %2
+   ld1\t{%0.d}[1], %2
+   stp\t%d1, %d2, %y0
+   stp\t%x1, %x2, %y0"
+  [(set_attr "type" "neon_ins_q, neon_from_gp_q, neon_load1_one_lane_q, neon_stp, store_16")]
+)
+
+(define_insn "*aarch64_combine_internal_be"
+  [(set (match_operand: 0 "aarch64_reg_or_mem_pair_operand" "=w, w, w, Umn, Umn")
+   (vec_concat:
+ (match_operand:VDC 2 "aarch64_simd_nonimmediate_operand" "w, ?r, Utv, ?w, ?r")
+ (match_operand:VDC 1 "register_operand" "0, 0, 0, ?w, ?r")))]
+  "TARGET_SIMD
+   && BYTES_BIG_ENDIAN
+   && (register_operand (operands[0], mode)
+   || register_operand (operands[2], mode))"
+  "@
+   ins\t%0.d[1], %2.d[0]
+   ins\t%0.d[1], %2
+   ld1\t{%0.d}[1], %2
+   stp\t%d2, %d1, %y0
+   stp\t%x2, %x1, %y0"
+  [(set_attr "type" "neon_ins_q, neon_from_gp_q, neon_load1_one_lane_q, neon_stp, store_16")]
+)
+
 ;; In this insn, operand 1 should be low, and op

[pushed 6/8] aarch64: Add a general vec_concat expander

2022-02-09 Thread Richard Sandiford via Gcc-patches
After previous patches, we have a (mostly new) group of vec_concat
patterns as well as vestiges of the old move_lo/hi_quad patterns.
(A previous patch removed the move_lo_quad insns, but we still
have the move_hi_quad insns and both sets of expanders.)

This patch is the first of two to remove the old move_lo/hi_quad
stuff.  It isn't technically a regression fix, but it seemed
better to make the changes now rather than leave things in
a half-finished and inconsistent state.

This patch defines an aarch64_vec_concat expander that coerces the
element operands into a valid form, including the ones added by the
previous patch.  This in turn lets us get rid of one move_lo/hi_quad
pair.

As a side-effect, it also means that vcombines of 2 vectors make
better use of the available forms, like vec_inits of 2 scalars
already do.

gcc/
* config/aarch64/aarch64-protos.h (aarch64_split_simd_combine):
Delete.
* config/aarch64/aarch64-simd.md (@aarch64_combinez): Rename
to...
(*aarch64_combinez): ...this.
(@aarch64_combinez_be): Rename to...
(*aarch64_combinez_be): ...this.
(@aarch64_vec_concat): New expander.
(aarch64_combine): Use it.
(@aarch64_simd_combine): Delete.
* config/aarch64/aarch64.cc (aarch64_split_simd_combine): Delete.
(aarch64_expand_vector_init): Use aarch64_vec_concat.

gcc/testsuite/
* gcc.target/aarch64/vec-init-12.c: New test.
---
 gcc/config/aarch64/aarch64-protos.h   |  2 -
 gcc/config/aarch64/aarch64-simd.md| 76 ---
 gcc/config/aarch64/aarch64.cc | 55 ++
 .../gcc.target/aarch64/vec-init-12.c  | 65 
 4 files changed, 122 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-12.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index b75ed35635b..392efa0b74d 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -925,8 +925,6 @@ bool aarch64_split_128bit_move_p (rtx, rtx);
 
 bool aarch64_mov128_immediate (rtx);
 
-void aarch64_split_simd_combine (rtx, rtx, rtx);
-
 void aarch64_split_simd_move (rtx, rtx);
 
 /* Check for a legitimate floating point constant for FMOV.  */
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ead80396e70..7acde0dd099 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4403,7 +4403,7 @@ (define_insn "*aarch64_combine_internal_be"
 ;; In this insn, operand 1 should be low, and operand 2 the high part of the
 ;; dest vector.
 
-(define_insn "@aarch64_combinez"
+(define_insn "*aarch64_combinez"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
(vec_concat:
  (match_operand:VDC 1 "nonimmediate_operand" "w,?r,m")
@@ -4417,7 +4417,7 @@ (define_insn "@aarch64_combinez"
(set_attr "arch" "simd,fp,simd")]
 )
 
-(define_insn "@aarch64_combinez_be"
+(define_insn "*aarch64_combinez_be"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 (vec_concat:
  (match_operand:VDC 2 "aarch64_simd_or_scalar_imm_zero")
@@ -4431,38 +4431,62 @@ (define_insn "@aarch64_combinez_be"
(set_attr "arch" "simd,fp,simd")]
 )
 
-(define_expand "aarch64_combine"
-  [(match_operand: 0 "register_operand")
-   (match_operand:VDC 1 "register_operand")
-   (match_operand:VDC 2 "aarch64_simd_reg_or_zero")]
+;; Form a vector whose first half (in array order) comes from operand 1
+;; and whose second half (in array order) comes from operand 2.
+;; This operand order follows the RTL vec_concat operation.
+(define_expand "@aarch64_vec_concat"
+  [(set (match_operand: 0 "register_operand")
+   (vec_concat:
+ (match_operand:VDC 1 "general_operand")
+ (match_operand:VDC 2 "general_operand")))]
   "TARGET_SIMD"
 {
-  if (operands[2] == CONST0_RTX (mode))
+  int lo = BYTES_BIG_ENDIAN ? 2 : 1;
+  int hi = BYTES_BIG_ENDIAN ? 1 : 2;
+
+  if (MEM_P (operands[1])
+  && MEM_P (operands[2])
+  && aarch64_mergeable_load_pair_p (mode, operands[1], operands[2]))
+/* Use load_pair_lanes.  */
+;
+  else if (operands[hi] == CONST0_RTX (mode))
 {
-  if (BYTES_BIG_ENDIAN)
-   emit_insn (gen_aarch64_combinez_be (operands[0], operands[1],
- operands[2]));
-  else
-   emit_insn (gen_aarch64_combinez (operands[0], operands[1],
-  operands[2]));
+  /* Use *aarch64_combinez.  */
+  if (!nonimmediate_operand (operands[lo], mode))
+   operands[lo] = force_reg (mode, operands[lo]);
 }
   else
-aarch64_split_simd_combine (operands[0], operands[1], operands[2]);
-  DONE;
-}
-)
+{
+  /* Use *aarch64_combine_general.  */
+  operands[lo] = force_reg (mode, operands[lo]);
+  if (!aarch64_simd_nonimmediate_operand (operands[hi], mode))
+  

[pushed 7/8] aarch64: Remove move_lo/hi_quad expanders

2022-02-09 Thread Richard Sandiford via Gcc-patches
This patch is the second of two to remove the old
move_lo/hi_quad expanders and move_hi_quad insns.

gcc/
* config/aarch64/aarch64-simd.md (@aarch64_split_simd_mov):
Use aarch64_combine instead of move_lo/hi_quad.  Tabify.
(move_lo_quad_, aarch64_simd_move_hi_quad_): Delete.
(aarch64_simd_move_hi_quad_be_, move_hi_quad_): Delete.
(vec_pack_trunc_): Take general_operand elements and use
aarch64_combine rather than move_lo/hi_quad to combine them.
(vec_pack_trunc_df): Likewise.
---
 gcc/config/aarch64/aarch64-simd.md | 111 +
 1 file changed, 18 insertions(+), 93 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7acde0dd099..ef6e772503d 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -272,7 +272,7 @@ (define_split
 
 (define_expand "@aarch64_split_simd_mov"
   [(set (match_operand:VQMOV 0)
-(match_operand:VQMOV 1))]
+   (match_operand:VQMOV 1))]
   "TARGET_SIMD"
   {
 rtx dst = operands[0];
@@ -280,23 +280,22 @@ (define_expand "@aarch64_split_simd_mov"
 
 if (GP_REGNUM_P (REGNO (src)))
   {
-rtx src_low_part = gen_lowpart (mode, src);
-rtx src_high_part = gen_highpart (mode, src);
+   rtx src_low_part = gen_lowpart (mode, src);
+   rtx src_high_part = gen_highpart (mode, src);
+   rtx dst_low_part = gen_lowpart (mode, dst);
 
-emit_insn
-  (gen_move_lo_quad_ (dst, src_low_part));
-emit_insn
-  (gen_move_hi_quad_ (dst, src_high_part));
+   emit_move_insn (dst_low_part, src_low_part);
+   emit_insn (gen_aarch64_combine (dst, dst_low_part,
+  src_high_part));
   }
-
 else
   {
-rtx dst_low_part = gen_lowpart (mode, dst);
-rtx dst_high_part = gen_highpart (mode, dst);
+   rtx dst_low_part = gen_lowpart (mode, dst);
+   rtx dst_high_part = gen_highpart (mode, dst);
rtx lo = aarch64_simd_vect_par_cnst_half (mode, , false);
rtx hi = aarch64_simd_vect_par_cnst_half (mode, , true);
-emit_insn (gen_aarch64_get_half (dst_low_part, src, lo));
-emit_insn (gen_aarch64_get_half (dst_high_part, src, hi));
+   emit_insn (gen_aarch64_get_half (dst_low_part, src, lo));
+   emit_insn (gen_aarch64_get_half (dst_high_part, src, hi));
   }
 DONE;
   }
@@ -1580,69 +1579,6 @@ (define_insn "aarch64_p"
 ;; What that means, is that the RTL descriptions of the below patterns
 ;; need to change depending on endianness.
 
-;; Move to the low architectural bits of the register.
-;; On little-endian this is { operand, zeroes }
-;; On big-endian this is { zeroes, operand }
-
-(define_expand "move_lo_quad_"
-  [(match_operand:VQMOV 0 "register_operand")
-   (match_operand: 1 "register_operand")]
-  "TARGET_SIMD"
-{
-  emit_insn (gen_aarch64_combine (operands[0], operands[1],
-CONST0_RTX (mode)));
-  DONE;
-}
-)
-
-;; Move operand1 to the high architectural bits of the register, keeping
-;; the low architectural bits of operand2.
-;; For little-endian this is { operand2, operand1 }
-;; For big-endian this is { operand1, operand2 }
-
-(define_insn "aarch64_simd_move_hi_quad_"
-  [(set (match_operand:VQMOV 0 "register_operand" "+w,w")
-(vec_concat:VQMOV
-  (vec_select:
-(match_dup 0)
-(match_operand:VQMOV 2 "vect_par_cnst_lo_half" ""))
- (match_operand: 1 "register_operand" "w,r")))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "@
-   ins\\t%0.d[1], %1.d[0]
-   ins\\t%0.d[1], %1"
-  [(set_attr "type" "neon_ins")]
-)
-
-(define_insn "aarch64_simd_move_hi_quad_be_"
-  [(set (match_operand:VQMOV 0 "register_operand" "+w,w")
-(vec_concat:VQMOV
- (match_operand: 1 "register_operand" "w,r")
-  (vec_select:
-(match_dup 0)
-(match_operand:VQMOV 2 "vect_par_cnst_lo_half" ""]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
-  "@
-   ins\\t%0.d[1], %1.d[0]
-   ins\\t%0.d[1], %1"
-  [(set_attr "type" "neon_ins")]
-)
-
-(define_expand "move_hi_quad_"
- [(match_operand:VQMOV 0 "register_operand")
-  (match_operand: 1 "register_operand")]
- "TARGET_SIMD"
-{
-  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
-  if (BYTES_BIG_ENDIAN)
-emit_insn (gen_aarch64_simd_move_hi_quad_be_ (operands[0],
-   operands[1], p));
-  else
-emit_insn (gen_aarch64_simd_move_hi_quad_ (operands[0],
-   operands[1], p));
-  DONE;
-})
-
 ;; Narrowing operations.
 
 (define_insn "aarch64_xtn_insn_le"
@@ -1743,16 +1679,12 @@ (define_insn "*aarch64_narrow_trunc"
 
 (define_expand "vec_pack_trunc_"
  [(match_operand: 0 "register_operand")
-  (match_operand:VDN 1 "register_operand")
-  (match_operand:VDN 2 "register_operand")]
+  (match_operand:VDN 1 "general_operand")
+  (match_operand:VDN 2 "general_operan

[pushed 8/8] aarch64: Extend vec_concat patterns to 8-byte vectors

2022-02-09 Thread Richard Sandiford via Gcc-patches
This patch extends the previous support for 16-byte vec_concat
so that it supports pairs of 4-byte elements.  This too isn't
strictly a regression fix, since the 8-byte forms weren't affected
by the same problems as the 16-byte forms, but it leaves things in
a more consistent state.

gcc/
* config/aarch64/iterators.md (VDCSIF): New mode iterator.
(VDBL): Handle SF.
(single_wx, single_type, single_dtype, dblq): New mode attributes.
* config/aarch64/aarch64-simd.md (load_pair_lanes): Extend
from VDC to VDCSIF.
(store_pair_lanes): Likewise.
(*aarch64_combine_internal): Likewise.
(*aarch64_combine_internal_be): Likewise.
(*aarch64_combinez): Likewise.
(*aarch64_combinez_be): Likewise.
* config/aarch64/aarch64.cc (aarch64_classify_address): Handle
8-byte modes for ADDR_QUERY_LDP_STP_N.
(aarch64_print_operand): Likewise for %y.

gcc/testsuite/
* gcc.target/aarch64/vec-init-13.c: New test.
* gcc.target/aarch64/vec-init-14.c: Likewise.
* gcc.target/aarch64/vec-init-15.c: Likewise.
* gcc.target/aarch64/vec-init-16.c: Likewise.
* gcc.target/aarch64/vec-init-17.c: Likewise.
---
 gcc/config/aarch64/aarch64-simd.md|  72 +-
 gcc/config/aarch64/aarch64.cc |  16 ++-
 gcc/config/aarch64/iterators.md   |  38 +-
 .../gcc.target/aarch64/vec-init-13.c  | 123 ++
 .../gcc.target/aarch64/vec-init-14.c  | 123 ++
 .../gcc.target/aarch64/vec-init-15.c  |  15 +++
 .../gcc.target/aarch64/vec-init-16.c  |  12 ++
 .../gcc.target/aarch64/vec-init-17.c  |  73 +++
 8 files changed, 430 insertions(+), 42 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-13.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-14.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-15.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vec-init-17.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index ef6e772503d..18733428f3f 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4243,12 +4243,12 @@ (define_insn_and_split "aarch64_get_lane"
 (define_insn "load_pair_lanes"
   [(set (match_operand: 0 "register_operand" "=w")
(vec_concat:
-  (match_operand:VDC 1 "memory_operand" "Utq")
-  (match_operand:VDC 2 "memory_operand" "m")))]
+  (match_operand:VDCSIF 1 "memory_operand" "Utq")
+  (match_operand:VDCSIF 2 "memory_operand" "m")))]
   "TARGET_SIMD
&& aarch64_mergeable_load_pair_p (mode, operands[1], operands[2])"
-  "ldr\\t%q0, %1"
-  [(set_attr "type" "neon_load1_1reg_q")]
+  "ldr\\t%0, %1"
+  [(set_attr "type" "neon_load1_1reg")]
 )
 
 ;; This STP pattern is a partial duplicate of the general vec_concat patterns
@@ -4273,12 +4273,12 @@ (define_insn "load_pair_lanes"
 (define_insn "store_pair_lanes"
   [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand" "=Umn, Umn")
(vec_concat:
-  (match_operand:VDC 1 "register_operand" "w, r")
-  (match_operand:VDC 2 "register_operand" "w, r")))]
+  (match_operand:VDCSIF 1 "register_operand" "w, r")
+  (match_operand:VDCSIF 2 "register_operand" "w, r")))]
   "TARGET_SIMD"
   "@
-   stp\\t%d1, %d2, %y0
-   stp\\t%x1, %x2, %y0"
+   stp\t%1, %2, %y0
+   stp\t%1, %2, %y0"
   [(set_attr "type" "neon_stp, store_16")]
 )
 
@@ -4292,37 +4292,37 @@ (define_insn "store_pair_lanes"
 (define_insn "*aarch64_combine_internal"
   [(set (match_operand: 0 "aarch64_reg_or_mem_pair_operand" "=w, w, w, Umn, Umn")
(vec_concat:
- (match_operand:VDC 1 "register_operand" "0, 0, 0, ?w, ?r")
- (match_operand:VDC 2 "aarch64_simd_nonimmediate_operand" "w, ?r, Utv, w, ?r")))]
+ (match_operand:VDCSIF 1 "register_operand" "0, 0, 0, ?w, ?r")
+ (match_operand:VDCSIF 2 "aarch64_simd_nonimmediate_operand" "w, ?r, Utv, w, ?r")))]
   "TARGET_SIMD
&& !BYTES_BIG_ENDIAN
&& (register_operand (operands[0], mode)
|| register_operand (operands[2], mode))"
   "@
-   ins\t%0.d[1], %2.d[0]
-   ins\t%0.d[1], %2
-   ld1\t{%0.d}[1], %2
-   stp\t%d1, %d2, %y0
-   stp\t%x1, %x2, %y0"
-  [(set_attr "type" "neon_ins_q, neon_from_gp_q, neon_load1_one_lane_q, neon_stp, store_16")]
+   ins\t%0.[1], %2.[0]
+   ins\t%0.[1], %2
+   ld1\t{%0.}[1], %2
+   stp\t%1, %2, %y0
+   stp\t%1, %2, %y0"
+  [(set_attr "type" "neon_ins, neon_from_gp, neon_load1_one_lane, neon_stp, store_16")]
 )
 
 (define_insn "*aarch64_combine_internal_be"
   [(set (match_operand: 0 "aarch64_reg_or_mem_pair_operand" "=w, w, w, Umn, Umn")
(vec_concat:
- (match_operand:VDC 2 "aarch64_simd_nonimmediate_operand" "w, ?r, Utv, ?w, ?r")
- (match_operand:VDC 1 "register_operand" "0, 

Re: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

2022-02-09 Thread Thomas Rodgers via Gcc-patches
Updated patch. I reverted the memory order change (and will submit that as
another patch) and fixed some spelling and grammar errors.

On Wed, Feb 9, 2022 at 2:43 AM Jonathan Wakely  wrote:

> On Wed, 9 Feb 2022 at 00:57, Thomas Rodgers via Libstdc++
>  wrote:
> >
> > This issue was observed as a deadloack in
> > 29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
> > "laundered" (e.g. type T* does not suffice as a waitable address for the
> > platform's native waiting primitive), the address waited is that of the
> > _M_ver member of __waiter_pool_base, so several threads may wait on the
> > same address for unrelated atomic's. As noted in the PR, the
> > implementation correctly exits the wait for the thread who's data
> > changed, but not for any other threads waiting on the same address.
> >
> > As noted in the PR the __waiter::_M_do_wait_v member was correctly
> exiting
> > but the other waiters were not reloaded the value of _M_ver before
> > re-entering the wait.
> >
> > Moving the spin call inside the loop accomplishes this, and is
> > consistent with the predicate accepting version of __waiter::_M_do_wait.
>
> There is a change to the memory order in _S_do_spin_v which is not
> described in the commit msg or the changelog. Is that unintentional?
>
> (Aside: why do we even have _S_do_spin_v, it's called in exactly one
> place, so could just be inlined into _M_do_spin_v, couldn't it?)
>
>
From b39283d5100305e7a95d59324059de9952d3a858 Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Tue, 8 Feb 2022 16:33:36 -0800
Subject: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

This issue was observed as a deadlock in
29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
"laundered" (e.g. type T* does not suffice as a waitable address for the
platform's native waiting primitive), the address waited is that of the
_M_ver member of __waiter_pool_base, so several threads may wait on the
same address for unrelated atomic objects. As noted in the PR, the
implementation correctly exits the wait for the thread whose data
changed, but not for any other threads waiting on the same address.

As noted in the PR the __waiter::_M_do_wait_v member was correctly exiting
but the other waiters were not reloading the value of _M_ver before
re-entering the wait.

Moving the spin call inside the loop accomplishes this, and is
consistent with the predicate accepting version of __waiter::_M_do_wait.

libstdc++-v3/ChangeLog:

	PR libstdc++/104442
	* include/bits/atomic_wait.h (__waiter::_M_do_wait_v): Move spin
	 loop inside do loop so that threads failing the wait reload
	 _M_ver.
---
 libstdc++-v3/include/bits/atomic_wait.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index d7de0d7eb9e..6ce7f9343cf 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -388,12 +388,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  void
 	  _M_do_wait_v(_Tp __old, _ValFn __vfn)
 	  {
-	__platform_wait_t __val;
-	if (__base_type::_M_do_spin_v(__old, __vfn, __val))
-	  return;
-
 	do
 	  {
+		__platform_wait_t __val;
+		if (__base_type::_M_do_spin_v(__old, __vfn, __val))
+		  return;
 		__base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
 	  }
 	while (__detail::__atomic_compare(__old, __vfn()));
-- 
2.34.1
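
The reload-inside-the-loop idea can be illustrated outside libstdc++.
What follows is a minimal sketch under stated assumptions, not the
library's code: waiter_pool, wait_for_change and run_demo are
hypothetical names, and a mutex plus condition variable stand in for the
platform wait primitive.  Two unrelated atomics share one version
counter, as in the "laundered" case described above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a shared waiter pool entry: one version
// counter that every waiter mapped to this entry blocks on.
struct waiter_pool
{
  std::mutex mtx;
  std::condition_variable cv;
  unsigned ver = 0;

  unsigned
  load_ver ()
  {
    std::lock_guard<std::mutex> lock (mtx);
    return ver;
  }

  // Block until the version differs from SEEN.
  void
  wait (unsigned seen)
  {
    std::unique_lock<std::mutex> lock (mtx);
    cv.wait (lock, [&] { return ver != seen; });
  }

  // Bump the version and wake every waiter, related or not.
  void
  notify_all ()
  {
    {
      std::lock_guard<std::mutex> lock (mtx);
      ++ver;
    }
    cv.notify_all ();
  }
};

// Wait until OBJ no longer holds OLD.  The version is reloaded on every
// iteration, so a wake-up meant for an unrelated object cannot leave this
// thread waiting against a stale version.
void
wait_for_change (waiter_pool &pool, const std::atomic<int> &obj, int old)
{
  do
    {
      unsigned seen = pool.load_ver ();
      if (obj.load (std::memory_order_acquire) != old)
        return;
      pool.wait (seen);
    }
  while (obj.load (std::memory_order_acquire) == old);
}

// The waiter on A must survive a notification meant for B and still wake
// up once A changes.
bool
run_demo ()
{
  waiter_pool pool;
  std::atomic<int> a{0}, b{0};
  std::atomic<bool> woke{false};

  std::thread waiter ([&] {
    wait_for_change (pool, a, 0);
    woke.store (true);
  });

  b.store (1);
  pool.notify_all ();   // unrelated wake-up
  a.store (1, std::memory_order_release);
  pool.notify_all ();   // the wake-up the waiter needs

  waiter.join ();
  assert (a.load () == 1);
  return woke.load ();
}
```

Because the object is always stored before the version is bumped, a waiter
that reads the version after the final bump also observes the new value
and returns without blocking.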



Re: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 17:10, Thomas Rodgers wrote:
>
> Updated patch. I reverted the memory order change (and will submit that as 
> another patch) and fixed some spelling and grammar errors.

OK for trunk and gcc-11, thanks.



[PATCH] libstdc++: Strengthen memory order for atomic::wait/notify

2022-02-09 Thread Thomas Rodgers via Gcc-patches
This patch changes the memory order used in the spin wait code to match
that of libc++.
From 92caa08b272520ec4a272b302b37d8fb47afb2ab Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Wed, 9 Feb 2022 09:26:00 -0800
Subject: [PATCH] libstdc++: Strengthen memory order for atomic::wait/notify
 (spinning)

This patch changes the memory order used in the spin wait code to match
that of libc++.

libstdc++-v3/ChangeLog:
	* include/bits/atomic_wait.h (__waiter_base::_S_do_spin,
	__waiter_base::_S_do_spin_v): Change memory order from relaxed
	to acquire.
---
 libstdc++-v3/include/bits/atomic_wait.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index 6ce7f9343cf..125b1cad886 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -332,7 +332,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  }
 	else
 	  {
-		__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+		__atomic_load(__addr, &__val, __ATOMIC_ACQUIRE);
 	  }
 	return __atomic_spin(__pred, __spin);
 	  }
@@ -353,7 +353,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		 __platform_wait_t& __val,
 		 _Spin __spin = _Spin{ })
 	  {
-	__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	__atomic_load(__addr, &__val, __ATOMIC_ACQUIRE);
 	return __atomic_spin(__pred, __spin);
 	  }
 
-- 
2.34.1
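
For readers less familiar with the distinction, here is a minimal sketch
of what an acquire load guarantees that a relaxed load does not.  It is
not taken from libstdc++ or libc++: read_published_value is a
hypothetical name, and pairing the load with a release store on the
writer side is an assumption made for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publish PAYLOAD through FLAG.  The writer's release store pairs with
// the reader's acquire load, so once the reader sees the flag set, the
// earlier plain write to PAYLOAD is visible as well.
int
read_published_value ()
{
  int payload = 0;
  std::atomic<int> flag{0};

  std::thread writer ([&] {
    payload = 42;                               // plain write
    flag.store (1, std::memory_order_release);  // publish
  });

  // Spin with an acquire load.  With a relaxed load the read of PAYLOAD
  // below would race with the writer's plain store.
  while (flag.load (std::memory_order_acquire) == 0)
    std::this_thread::yield ();

  int seen = payload;  // ordered after the writer's store to PAYLOAD
  writer.join ();
  assert (seen == 42);
  return seen;
}
```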



Patch committed: Correct -fgo-dump-spec alignment field name

2022-02-09 Thread Ian Lance Taylor via Gcc-patches
My earlier patch to correct the -fgo-dump-spec field name
(https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587464.html)
was incomplete.  It replaced part of the name with "_", but not all of
it.  This patch completes the job.  Bootstrapped and ran Go and
-fgo-dump-spec testsuite on x86_64-pc-linux-gnu.  Committed to
mainline.

Ian

* godump.cc (go_force_record_alignment): Really name the alignment
field "_" (complete 2021-12-29 change).

* gcc.misc-tests/godump-1.c: Adjust for alignment field rename.
d3f3ec5a555dbf0e3329515b38f848b4760589b2
diff --git a/gcc/godump.cc b/gcc/godump.cc
index 2092446b0cc..669168806f3 100644
--- a/gcc/godump.cc
+++ b/gcc/godump.cc
@@ -643,14 +643,13 @@ go_append_padding (struct obstack *ob, unsigned int from_offset,
 }
 
 /* Appends an array of type TYPE_STRING with zero elements and the name
-   "Godump_INDEX_align" to OB.  If TYPE_STRING is a null pointer, ERROR_STRING
-   is appended instead of the type.  Returns INDEX + 1.  */
+   "_" to OB.  If TYPE_STRING is a null pointer, ERROR_STRING is appended
+   instead of the type.  Returns INDEX + 1.  */
 
 static unsigned int
 go_force_record_alignment (struct obstack *ob, const char *type_string,
   unsigned int index, const char *error_string)
 {
-  index = go_append_artificial_name (ob, index);
   obstack_grow (ob, "_ ", 2);
   if (type_string == NULL)
 obstack_grow (ob, error_string, strlen (error_string));
diff --git a/gcc/testsuite/gcc.misc-tests/godump-1.c b/gcc/testsuite/gcc.misc-tests/godump-1.c
index b05be78d321..95dabdc0e4c 100644
--- a/gcc/testsuite/gcc.misc-tests/godump-1.c
+++ b/gcc/testsuite/gcc.misc-tests/godump-1.c
@@ -501,10 +501,10 @@ struct { struct { uint8_t ca[3]; } s; uint32_t i; } sn;
 /* { dg-final { scan-file godump-1.out "(?n)^var _sn struct \{ s struct \{ ca \\\[2\\+1\\\]uint8; \}; i uint32; \}$" } } */
 
 typedef struct { struct { uint8_t a; uint16_t s; }; uint8_t b; } tsn_anon;
-/* { dg-final { scan-file godump-1.out "(?n)^type _tsn_anon struct \{ a uint8; s uint16; b uint8; Godump_0_pad \\\[.\\\]byte; Godump_1_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _tsn_anon struct \{ a uint8; s uint16; b uint8; Godump_0_pad \\\[.\\\]byte; _ \\\[0\\\]int16; \}$" } } */
 
 struct { struct { uint8_t a; uint16_t s; }; uint8_t b; } sn_anon;
-/* { dg-final { scan-file godump-1.out "(?n)^var _sn_anon struct \{ a uint8; s uint16; b uint8; Godump_0_pad \\\[.\\\]byte; Godump_1_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _sn_anon struct \{ a uint8; s uint16; b uint8; Godump_0_pad \\\[.\\\]byte; _ \\\[0\\\]int16; \}$" } } */
 
 
 /*** structs with bitfields ***/
@@ -575,16 +575,16 @@ struct { uint8_t bf : 8; uint8_t c; } sbf_pad8_3;
 /* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad8_3 struct \{ bf uint8; c uint8; \}$" } } */
 
 typedef struct { uint16_t bf : 1; uint8_t c; } tsbf_pad16_1;
-/* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad16_1 struct \{ Godump_0_pad \\\[1\\\]byte; c uint8; Godump_1_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad16_1 struct \{ Godump_0_pad \\\[1\\\]byte; c uint8; _ \\\[0\\\]int16; \}$" } } */
 
 struct { uint16_t bf : 1; uint8_t c; } sbf_pad16_1;
-/* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad16_1 struct \{ Godump_0_pad \\\[1\\\]byte; c uint8; Godump_1_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad16_1 struct \{ Godump_0_pad \\\[1\\\]byte; c uint8; _ \\\[0\\\]int16; \}$" } } */
 
 typedef struct { uint16_t bf : 15; uint8_t c; } tsbf_pad16_2;
-/* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad16_2 struct \{ Godump_0_pad \\\[2\\\]byte; c uint8; Godump_1_pad \\\[.\\\]byte; Godump_2_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad16_2 struct \{ Godump_0_pad \\\[2\\\]byte; c uint8; Godump_1_pad \\\[.\\\]byte; _ \\\[0\\\]int16; \}$" } } */
 
 struct { uint16_t bf : 15; uint8_t c; } sbf_pad16_2;
-/* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad16_2 struct \{ Godump_0_pad \\\[2\\\]byte; c uint8; Godump_1_pad \\\[.\\\]byte; Godump_2_ \\\[0\\\]int16; \}$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad16_2 struct \{ Godump_0_pad \\\[2\\\]byte; c uint8; Godump_1_pad \\\[.\\\]byte; _ \\\[0\\\]int16; \}$" } } */
 
 typedef struct { uint16_t bf : 16; uint8_t c; } tsbf_pad16_3;
 /* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad16_3 struct \{ bf 
uint16; c uint8; Godump_0_pad \\\[.\\\]byte; \}$" } } */
@@ -593,16 +593,16 @@ struct { uint16_t bf : 16; uint8_t c; } sbf_pad16_3;
 /* { dg-final { scan-file godump-1.out "(?n)^var _sbf_pad16_3 struct \{ bf 
uint16; c uint8; Godump_0_pad \\\[.\\\]byte; \}$" } } */
 
 typedef struct { uint32_t bf : 1; uint8_t c; } tsbf_pad32_1;
-/* { dg-final { scan-file godump-1.out "(?n)^type _tsbf_pad32_1 struct \{ 
Godump_0_pad \\\[1\\\]byte; c uint8; Go

Re: [PATCH] Reset relations when crossing backedges.

2022-02-09 Thread Martin Jambor
Hello,

On Fri, Jan 21 2022, Aldy Hernandez via Gcc-patches wrote:
> As discussed in PR103721, the problem here is that we are crossing a
> backedge and causing us to use relations from a previous iteration of a
> loop.
>
> This handles the testcases in both PR103721 and PR104067 which are variants
> of the same thing.
>
> Tested on x86-64 Linux with the usual regstrap as well as verifying the
> thread count before and after the patch.  The number of threads is
> reduced by a minuscule amount.
>
> I assume we need release manager approval at this point?  OK for trunk?
>
> gcc/ChangeLog:
>
>   PR 103721/tree-optimization
>   * gimple-range-path.cc
>   (path_range_query::relations_may_be_invalidated): New.
>   (path_range_query::compute_ranges_in_block): Reset relations if
>   they may be invalidated.
>   (path_range_query::maybe_register_phi_relation): Exit if relations
>   may be invalidated on incoming edge.
>   (path_range_query::compute_phi_relations): Pass incoming PHI edge
>   to maybe_register_phi_relation.
>   * gimple-range-path.h (relations_may_be_invalidated): New.
>   (maybe_register_phi_relation): Pass edge instead of tree.
>   * tree-ssa-threadbackward.cc (back_threader::back_threader):
>   * value-relation.cc (path_oracle::path_oracle): Call
>   mark_dfs_back_edges.
>   (path_oracle::register_relation): Add SSA names to m_registered
>   bitmap.
>   (path_oracle::reset_path): Clear m_registered bitmap.
>   * value-relation.h (path_oracle::set_root_oracle): New.

This has caused a regression of around 5% in 429.mcf when built with
-O2 -flto (generic march) on x86_64 (I tried and confirmed it on AMD Zen3,
Zen2 and Intel Cascadelake); the two former cases can be seen here:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=469.60.0
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=413.60.0&plot.1=292.60.0

This does not seem to be a regression against gcc11 and I am not sure
whether it is worth a bug-report, but perhaps it is worth looking at as
it may indicate where we can improve further?

Martin
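
The backedge problem described in the quoted patch can be illustrated with a small C sketch. This is not the testcase from PR103721 or PR104067; the function name and values are made up for illustration. The assignment in the loop body establishes the relation current == search, the next statement may break it, and the loop test is then reached again over the backedge. A pass that carried the stale relation across the backedge could wrongly conclude that the loop always exits after one iteration.

```c
/* Illustrative sketch only: hypothetical function, not taken from the PRs.  */
int
final_volume (int world, int ipos)
{
  int search = world;
  int current = 0;

  while (current != search && search)
    {
      current = search;               /* establishes current == search */
      search = (ipos != 0) ? 0 : 1;   /* may invalidate that relation */
    }

  /* The loop test above is re-evaluated after crossing the backedge, so
     it must not reuse the relation recorded in the previous iteration.  */
  return current;
}
```

Calling final_volume (5, 0) takes two iterations: after the first, current is 5 and search is 1, so the relation from the previous iteration no longer holds when the test is evaluated again.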


Re: [PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 11:36, Patrick Palka wrote:

On Wed, 9 Feb 2022, Jason Merrill wrote:


On 2/9/22 10:45, Patrick Palka wrote:

In filter_memfn_lookup, we weren't correctly recognizing and matching up
member functions introduced via a non-dependent using-decl.  This caused
us to crash in the below testcases in which we correctly pruned the
overload set for the non-dependent call ahead of time, but then at
instantiation time filter_memfn_lookup failed to match the selected
function (introduced in each case by a non-dependent using-decl) to the
corresponding function from the new lookup set.  Such member functions
need special handling in filter_memfn_lookup because they look exactly
the same in the old and new lookup sets, whereas ordinary member
functions that're defined in the (dependent) current class become more
specialized in the new lookup set.

This patch reworks the matching logic in filter_memfn_lookup so that it
handles non-dependent using-decls correctly, and is hopefully simpler to
follow.

Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
trunk?

PR c++/104432

gcc/cp/ChangeLog:

* call.cc (build_new_method_call): When a non-dependent call
resolves to a specialization of a member template, always build
the pruned overload set using the member template, not the
specialization.
* pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
and correct how members from the new lookup set are matched to
those from the old one.
(tsubst_baselink): Pass binfo_type as newtype to
filter_memfn_lookup.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent19.C: New test.
* g++.dg/template/non-dependent19a.C: New test.
* g++.dg/template/non-dependent20.C: New test.
---
   gcc/cp/call.cc|  9 ++--
   gcc/cp/pt.cc  | 49 +--
   .../g++.dg/template/non-dependent19.C | 14 ++
   .../g++.dg/template/non-dependent19a.C| 16 ++
   .../g++.dg/template/non-dependent20.C | 16 ++
   5 files changed, 73 insertions(+), 31 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index b2e89c5d783..d6eed5ed835 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree fns,
vec **args,
 if (really_overloaded_fn (fns))
{
  if (DECL_TEMPLATE_INFO (fn)
- && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
- && dependent_type_p (DECL_CONTEXT (fn)))
+ && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
{
- /* FIXME: We're not prepared to fully instantiate "inside-out"
-partial instantiations such as A::f().  So instead
-use the selected template, not the specialization.  */
+ /* Use the selected template, not the specialization, so that
+this looks like an actual lookup result for sake of
+filter_memfn_lookup.  */
  if (OVL_SINGLE_P (fns))
/* If the original overload set consists of a single function
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 862f337886c..3a5d06bf297 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t
complain, tree in_decl)
   }
 /* OLDFNS is a lookup set of member functions from some class template,
and
-   NEWFNS is a lookup set of member functions from a specialization of that
-   class template.  Return the subset of NEWFNS which are specializations
of
-   a function from OLDFNS.  */
+   NEWFNS is a lookup set of member functions from NEWTYPE, a
specialization
+   of that class template.  Return the subset of NEWFNS which are
+   specializations of a function from OLDFNS.  */
 static tree
-filter_memfn_lookup (tree oldfns, tree newfns)
+filter_memfn_lookup (tree oldfns, tree newfns, tree newtype)
   {
 /* Record all member functions from the old lookup set OLDFNS into
VISIBLE_SET.  */
@@ -16326,38 +16326,34 @@ filter_memfn_lookup (tree oldfns, tree newfns)
 if (TREE_CODE (fn) == USING_DECL)
{
  /* FIXME: Punt on (dependent) USING_DECL for now; mapping
-a dependent USING_DECL to its instantiation seems
-tricky.  */
+a dependent USING_DECL to the member functions it introduces
+seems tricky.  */


FWIW I still think this shouldn't be very tricky.


I tried implementing this by substituting into the USING_DECL_SCOPE and
then during the filtering step keeping the member functions from the new
lookup set whose DECL_CONTEXT is the same as the substituted scope, i.e.
keeping the m

Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 10:51, Qing Zhao wrote:




On Feb 8, 2022, at 4:20 PM, Jason Merrill  wrote:

On 2/8/22 15:11, Qing Zhao wrote:

Hi,
This is the patch to fix PR101515 (ICE in pp_cxx_unqualified_id, at  
cp/cxx-pretty-print.c:128)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101515
It's possible that the TYPE_NAME of a record_type is NULL, therefore when
printing the TYPE_NAME, we should check and handle this special case.
Please see the comment of pr101515 for more details.
The fix is very simple: check for and specially handle the case when TYPE_NAME 
is NULL.
Bootstrapped and regression tested on both x86 and aarch64, no issues.
Okay for commit?
Thanks.
Qing
=
 From f37ee8d21b80cb77d8108cb97a487c84c530545b Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Tue, 8 Feb 2022 16:10:37 +
Subject: [PATCH] Fix PR 101515 ICE in pp_cxx_unqualified_id, at
  cp/cxx-pretty-print.c:128.
It's possible that the TYPE_NAME of a record_type is NULL, therefore when
printing the TYPE_NAME, we should check and handle this special case.
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (pp_cxx_unqualified_id): Check and handle
the case when TYPE_NAME is NULL.
gcc/testsuite/ChangeLog:
* g++.dg/pr101515.C: New test.
---
  gcc/cp/cxx-pretty-print.cc  |  5 -
  gcc/testsuite/g++.dg/pr101515.C | 25 +
  2 files changed, 29 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/pr101515.C
diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
index 4f9a090e520d..744ed0add5ba 100644
--- a/gcc/cp/cxx-pretty-print.cc
+++ b/gcc/cp/cxx-pretty-print.cc
@@ -171,7 +171,10 @@ pp_cxx_unqualified_id (cxx_pretty_printer *pp, tree t)
  case ENUMERAL_TYPE:
  case TYPENAME_TYPE:
  case UNBOUND_CLASS_TEMPLATE:
-  pp_cxx_unqualified_id (pp, TYPE_NAME (t));
+  if (TYPE_NAME (t))
+   pp_cxx_unqualified_id (pp, TYPE_NAME (t));
+  else
+   pp_string (pp, "");


Hmm, but it's not an unnamed class, it's a pointer to member function type, and 
it would be better to avoid dumping compiler internal representations like the 
__pfn field name.

Yes, it’s not an unnamed class, but the ICE happened when trying to print the 
compiler-generated member function type “__ptrmemfunc_type”, whose TYPE_NAME is 
set to NULL while building this type in the C++ FE, and the C++ FE does not 
handle the case when TYPE_NAME is NULL correctly.

So, there are two levels of issues:

1. The first-level issue is that the current C++ FE does not handle the case 
of TYPE_NAME being NULL correctly. This is indeed a bug in the current code and 
should be fixed as in the current patch.


Sure, we might as well make this code more robust.  But we can do better 
than  if we check TYPE_PTRMEMFUNC_P.



2. The second-level issue is what you suggested above: should we print 
the “compiler generated internal type” to the user? I agree with you that 
it might not be a good idea to print such compiler-internal names to the user.  
Do we do this right now in general (i.e., check whether the current TYPE is a 
source-level TYPE or a compiler-internal TYPE, and then only print the name 
of the TYPE for a source-level TYPE)? And is there a bit in the TYPE to 
distinguish whether a TYPE is a user-level type or a compiler-generated 
internal type?



I think the real problem comes sooner, when c_fold_indirect_ref_for_warn turns 
a MEM_REF with RECORD_TYPE into a COMPONENT_REF with POINTER_TYPE.



What is the major issue with this transformation? (I will study this in more 
detail.)


We told c_fold_indirect_ref that we want a RECORD_TYPE (the PMF as a 
whole) and it gave us back a POINTER_TYPE instead (the __pfn member). 
Folding shouldn't change the type of an expression like that.


Jason



Re: [PATCH] c: Fix up __builtin_assoc_barrier handling in the C FE [PR104427]

2022-02-09 Thread Joseph Myers
On Wed, 9 Feb 2022, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> The following testcase ICEs, because when creating PAREN_EXPR for
> __builtin_assoc_barrier the FE doesn't do the usual tweaks for
> EXCESS_PRECISION_EXPR or C_MAYBE_CONST_EXPR.  I believe that the
> declared effect of the builtin is just association barrier, so
> e.g. excess precision should be still handled like if it wasn't
> there.
> 
> The following patch uses build_unary_op to handle those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] PR target/102059 Fix inline of target specific functions

2022-02-09 Thread Michael Meissner via Gcc-patches
On Wed, Feb 09, 2022 at 04:56:13PM +0800, Kewen.Lin wrote:
> Hi Michael,
> 
> on 2022/2/9 上午11:27, Michael Meissner via Gcc-patches wrote:
> > Reset -mpower8-fusion for power9 inlining power8 functions, PR 102059.
> > 
> > This patch is an attempt to make a much simpler patch to fix PR 
> > target/102059
> > than the previous patch.
> > 
> > It just fixes the issue that if a function is specifically declared as a 
> > power8
> > function, you can't inline in functions that are specified with power9 or
> > power10 options.
> > 
> > The issue is -mpower8-fusion is cleared when you use -mcpu=power9 or
> > -mcpu=power10.  When I wrote the code for controlling which function can 
> > inline
> > other functions, -mpower8-fusion was set for -mcpu=power9 (-mcpu=power10 was
> > not an option at that time).  This patch fixes this particular problem.
> > 
> > Perhaps -mpower8-fusion should go away in the GCC 13 time frame.  This patch
> > just goes in and resets the fusion bit for testing inlines.
> > 
> > I have built a bootstrapped little endian compiler on power9 and the tests 
> > had
> > no regressions.
> > 
> > I have built a bootstrapped big endian compiler on power8 and I tested both
> > 32-bit and 64-bit builds, and there were no regressions.
> > 
> > Can I install this into the trunk and back port it into GCC 11 after a 
> > burn-in
> > period?
> > 
> 
> Thanks for the patch!  I guess we also need this for GCC 10 as:

A GCC 10 backport is doable once the patch is installed in the trunk.  Since a
lot of my work is on IEEE 128-bit or Power10, I don't always think of GCC 10
backports, but in this case, the patch is rather simple.  Thanks for the reminder.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[committed][target/97040] Avoid using predefined insn name for instruction with different semantics

2022-02-09 Thread Jeff Law via Gcc-patches


This isn't technically a regression, but it only impacts the v850 target 
and fixes a long standing code correctness issue.


As outlined in slightly more detail in the PR, the v850 is using the 
pattern name "fnmasf4" and "fnmssf4" to generate fnmaf.s and fnmsf.s 
instructions respectively.


Unfortunately fnmasf4 is expected to produce (-a * b) + c and fnmssf4 
(-a * b) - c.  Those v850 instructions actually negate the entire result.


The fix is trivial.  Use a different pattern name so that the combiner 
can still generate those instructions, but prevent those instructions 
from being used to implement GCC's notion of what fnmas and fnmss should be.


This fixes pr97040 as well as a handful of testsuite failures for the 
v3e5 multilib.


Committed to the trunk.

Jeff

commit eefec38c992e3622a69de9667e91f0cafbff03cc
Author: Jeff Law 
Date:   Wed Feb 9 14:10:53 2022 -0500

Avoid using predefined insn name for instruction with different semantics

This isn't technically a regression, but it only impacts the v850 target and
fixes a long standing code correctness issue.

As outlined in slightly more detail in the PR, the v850 is using the pattern
name "fnmasf4" and "fnmssf4" to generate fnmaf.s and fnmsf.s instructions
 respectively.

Unfortunately fnmasf4 is expected to produce (-a * b) + c and
fnmssf4 (-a * b) - c.  Those v850 instructions actually negate the entire
result.

The fix is trivial.  Use a different pattern name so that the combiner can
still generate those instructions, but prevent those instructions from being
used to implement GCC's notion of what fnmas and fnmss should be.

This fixes pr97040 as well as a handful of testsuite failures for the v3e5
multilib.

gcc/
PR target/97040
* config/v850/v850.md (*v850_fnmasf4): Renamed from fnmasf4.
(*v850_fnmssf4): Renamed from fnmssf4

diff --git a/gcc/config/v850/v850.md b/gcc/config/v850/v850.md
index ed51157691b..6ca31e3f43f 100644
--- a/gcc/config/v850/v850.md
+++ b/gcc/config/v850/v850.md
@@ -2601,7 +2601,12 @@
(set_attr "type" "fpu")])
 
 ;;; negative-multiply-add
-(define_insn "fnmasf4"
+;; Note the name on this and the following insn were previously fnmasf4
+;; and fnmssf4.  Those names are known to the gimple->rtl expanders and
+;; must implement specific semantics (negating one of the inputs to the
+;; multiplication).  The v850 instructions actually negate the entire
+;; result.  Thus the names have been changed and hidden.
+(define_insn "*v850_fnmasf4"
   [(set (match_operand:SF 0 "register_operand" "=r")
(neg:SF (fma:SF (match_operand:SF 1 "register_operand" "r")
(match_operand:SF 2 "register_operand" "r")
@@ -2612,7 +2617,7 @@
(set_attr "type" "fpu")])
 
 ;; negative-multiply-subtract
-(define_insn "fnmssf4"
+(define_insn "*v850_fnmssf4"
   [(set (match_operand:SF 0 "register_operand" "=r")
(neg:SF (fma:SF (match_operand:SF 1 "register_operand" "r")
(match_operand:SF 2 "register_operand" "r")
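
To make the mismatch described in this message concrete, here is a small C sketch. It is not part of the patch: the function names are invented for illustration, and plain multiply/add stands in for the fused operation, so rounding behaviour is ignored. Only the placement of the negation matters.

```c
/* Hypothetical helper names; unfused arithmetic stands in for FMA.  */

/* What the standard pattern names are expected to compute, per the
   description above.  */
float expected_fnma (float a, float b, float c) { return (-a * b) + c; }
float expected_fnms (float a, float b, float c) { return (-a * b) - c; }

/* What the message says the v850 instructions compute: the entire
   multiply-add (or multiply-subtract) result is negated.  */
float v850_fnmaf_s (float a, float b, float c) { return -((a * b) + c); }
float v850_fnmsf_s (float a, float b, float c) { return -((a * b) - c); }
```

With a = 2, b = 3, c = 1, expected_fnma yields -5 while v850_fnmaf_s yields -7, which is why the instruction cannot be exposed under the standard name.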


[PATCH] i386: Force inputs to a register to avoid lowpart_subreg failure [PR104458]

2022-02-09 Thread Uros Bizjak via Gcc-patches
Input operands can be in the form of:

(subreg:DI (reg:V2SF 96) 0)

which chokes lowpart_subreg. Force inputs to a register, which is
preferable even when the input operand is from memory.

2022-02-09  Uroš Bizjak  

gcc/ChangeLog:

PR target/104458
* config/i386/i386-expand.cc (ix86_split_idivmod):
Force operands[2] and operands[3] into a register.

gcc/testsuite/ChangeLog:

PR target/104458
* gcc.target/i386/pr104458.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index eb1930ba375..ce9607e36de 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -1407,6 +1407,9 @@ ix86_split_idivmod (machine_mode mode, rtx operands[],
   rtx scratch, tmp0, tmp1, tmp2;
   rtx (*gen_divmod4_1) (rtx, rtx, rtx, rtx);
 
+  operands[2] = force_reg (mode, operands[2]);
+  operands[3] = force_reg (mode, operands[3]);
+
   switch (mode)
 {
 case E_SImode:
diff --git a/gcc/testsuite/gcc.target/i386/pr104458.c 
b/gcc/testsuite/gcc.target/i386/pr104458.c
new file mode 100644
index 000..d1d28c13118
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104458.c
@@ -0,0 +1,13 @@
+/* PR target/104458 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O1 -m8bit-idiv" } */
+
+typedef float __attribute__((__vector_size__ (8))) F;
+
+int i;
+
+void
+foo (F f)
+{
+  i += i % (long) f;
+}


[PATCH] i386: -mno-xsave should disable all relevant ISA flags [PR104462]

2022-02-09 Thread Uros Bizjak via Gcc-patches
2022-02-09  Uroš Bizjak  

gcc/ChangeLog:

PR target/104462
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_XSAVE_UNSET):
Also include OPTION_MASK_ISA2_AVX2_UNSET.

gcc/testsuite/ChangeLog:

PR target/104462
* gcc.target/i386/pr104462.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/common/config/i386/i386-common.cc 
b/gcc/common/config/i386/i386-common.cc
index 607e9f20e85..449df6351c9 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -206,7 +206,8 @@ along with GCC; see the file COPYING3.  If not see
   (OPTION_MASK_ISA_XSAVE | OPTION_MASK_ISA_XSAVEOPT_UNSET \
| OPTION_MASK_ISA_XSAVES_UNSET | OPTION_MASK_ISA_XSAVEC_UNSET \
| OPTION_MASK_ISA_AVX_UNSET)
-#define OPTION_MASK_ISA2_XSAVE_UNSET OPTION_MASK_ISA2_AMX_TILE_UNSET
+#define OPTION_MASK_ISA2_XSAVE_UNSET \
+  (OPTION_MASK_ISA2_AVX2_UNSET | OPTION_MASK_ISA2_AMX_TILE_UNSET)
 #define OPTION_MASK_ISA_XSAVEOPT_UNSET OPTION_MASK_ISA_XSAVEOPT
 #define OPTION_MASK_ISA_AVX2_UNSET \
   (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX512F_UNSET)
diff --git a/gcc/testsuite/gcc.target/i386/pr104462.c 
b/gcc/testsuite/gcc.target/i386/pr104462.c
new file mode 100644
index 000..7a5ee64f431
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104462.c
@@ -0,0 +1,13 @@
+/* PR target/104462 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mno-xsave" } */
+
+typedef _Float16 __attribute__((__vector_size__ (8))) F;
+
+F f;
+
+void
+foo (void)
+{
+  f *= -f;
+}
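
The idea behind the *_UNSET macros touched by this patch can be sketched in C as follows. This is a simplified model, not GCC's actual definitions: the bit names, the two-word layout, and the dependencies between the made-up features are all assumptions chosen for illustration. Each UNSET mask includes the UNSET masks of the features that depend on it, and when the flags are split across two words, the second word's mask has to follow the same dependency chain or a dependent feature is left enabled.

```c
#include <stdint.h>

/* Hypothetical two-word feature set, loosely modelled on the idea of
   separate ISA and ISA2 flag words.  */
struct isa_flags { uint32_t isa; uint32_t isa2; };

/* Word 1 feature bits (made-up names).  */
#define W1_XSAVE (1u << 0)
#define W1_AVX   (1u << 1)
#define W1_AVX2  (1u << 2)

/* Word 2 feature bits (made-up names).  */
#define W2_FP16  (1u << 0)   /* assumed to depend on AVX2 */
#define W2_TILE  (1u << 1)   /* assumed to depend on XSAVE */

/* Each UNSET mask pulls in the UNSET masks of its dependents.  */
#define W1_AVX2_UNSET  (W1_AVX2)
#define W1_AVX_UNSET   (W1_AVX | W1_AVX2_UNSET)
#define W1_XSAVE_UNSET (W1_XSAVE | W1_AVX_UNSET)

/* The second word must follow the same chain; leaving out
   W2_AVX2_UNSET here would keep W2_FP16 enabled after XSAVE is off.  */
#define W2_AVX2_UNSET  (W2_FP16)
#define W2_XSAVE_UNSET (W2_AVX2_UNSET | W2_TILE)

/* Disable XSAVE and everything that transitively depends on it.  */
struct isa_flags
disable_xsave (struct isa_flags f)
{
  f.isa &= ~W1_XSAVE_UNSET;
  f.isa2 &= ~W2_XSAVE_UNSET;
  return f;
}
```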


Re: [Patch] Fortran/OpenMP: Avoid ICE for invalid char array in omp atomic [PR104329]

2022-02-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 04, 2022 at 12:39:53PM +0100, Tobias Burnus wrote:
> Already during parsing, the allocatable character array assignment
>x = (x)
> 
> is converted to two gfc_codes with EXEC_ASSIGN, namely:
> 
>   ASSIGN z1:_F.DA0(FULL) (parens z1:x(FULL))
>   ASSIGN z1:x(FULL) z1:_F.DA0(FULL)
> 
> But the current code expects only one gfc_code - as parse.c does some
> checks, that's unexpected for resolution and currently is checked with
> a gcc_assert.
> 
> Solution: I now defer the gcc_assert until after diagnosing the arguments.
> 
> OK for mainline (only affected version)?
> 
> Tobias
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

> Fortran/OpenMP: Avoid ICE for invalid char array in omp atomic [PR104329]
> 
>   PR fortran/104329
> gcc/fortran/ChangeLog:
> 
>   * openmp.cc (resolve_omp_atomic): Defer extra-code assert after
>   other diagnostics.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/gomp/atomic-28.f90: New test.
> 
>  gcc/fortran/openmp.cc| 11 ---
>  gcc/testsuite/gfortran.dg/gomp/atomic-28.f90 | 28 
> 
>  2 files changed, 36 insertions(+), 3 deletions(-)

Ok, thanks.

Jakub



[COMMITTED][PATCH] x86: Compile PR target/104441 tests with -march=x86-64

2022-02-09 Thread H.J. Lu via Gcc-patches
Compile PR target/104441 tests with -march=x86-64 to fix test failures
when GCC is configured with --with-arch=native --with-cpu=native.

PR target/104441
* gcc.target/i386/pr104441-1a.c: Compile with -march=x86-64.
* gcc.target/i386/pr104441-1b.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr104441-1a.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr104441-1b.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr104441-1a.c 
b/gcc/testsuite/gcc.target/i386/pr104441-1a.c
index f4d263205f8..83734f710bd 100644
--- a/gcc/testsuite/gcc.target/i386/pr104441-1a.c
+++ b/gcc/testsuite/gcc.target/i386/pr104441-1a.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -mtune=skylake -Wno-attributes" } */
+/* { dg-options "-O3 -march=x86-64 -mtune=skylake -Wno-attributes" } */
 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.target/i386/pr104441-1b.c 
b/gcc/testsuite/gcc.target/i386/pr104441-1b.c
index 0b8a796d93c..325af044bb8 100644
--- a/gcc/testsuite/gcc.target/i386/pr104441-1b.c
+++ b/gcc/testsuite/gcc.target/i386/pr104441-1b.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -mvzeroupper -Wno-attributes" } */
+/* { dg-options "-O3 -march=x86-64 -mvzeroupper -Wno-attributes" } */
 
 #include "pr104441-1a.c"
 
-- 
2.34.1



Re: [PATCH] RISC-V: Enable TARGET_SUPPORTS_WIDE_INT

2022-02-09 Thread Vineet Gupta




On 2/7/22 13:24, Vineet Gupta wrote:



On 2/7/22 10:58, Palmer Dabbelt wrote:

On Mon, 07 Feb 2022 09:41:10 PST (-0800), Vineet Gupta wrote:



On 2/7/22 01:28, Philipp Tomsich wrote:

Vineet,

On Mon, 7 Feb 2022 at 07:06, Vineet Gupta  
wrote:
This is at par with other major arches such as aarch64, i386, s390 
...


No testsuite regressions: same numbers w/ w/o

Putting that check seems a good idea, but I haven't seen any cases
related to this get through anyway.
Have you seen any instances where the backend got this wrong? If
so, please share, so we can run a fuller regression and see any
performance impact.


No, there were no failures which this fixes. Seems like other arches 
did

this back in 2015.
When aarch64 did similar change, commit 2ca5b4303bd5, a directed 
generic

test pr68129_1.c was added which doesn't fail before/after.


The only offending MD pattern we had was for constant 0, which 
IIUC should be a const_int now (and has been for some time) so 
shouldn't even have been matching anything.  I was worried about the 
fcvt-based moves on rv32, but my trivial test indicates those still 
work fine


   double foo(void)
   {
     return 0;
   }

   foo:
     fcvt.d.w    fa0,x0
     ret

so I'm assuming they're coming in through const_int as well. Probably 
worth a full rv32 testsuite run, but as far as I can tell we were 
essentially TARGET_SUPPORTS_WIDE_INT clean already so this should be 
pretty safe.


Ok I'll go off and run the rv32 suite just to be safe.



Unfortunately the patch isn't trivially applying on trunk: it's 
targeting the wrong files and is showing some whitespace issues 
(though those may have been a result of me attempting to clean stuff 
up).  I'm assuming that means the tests weren't run on trunk, though.


I tested both gcc 11 and trunk. Both were clean. My bad that I posted 
the patch off of internal gcc 11 tree.




I put a cleaned up version over here 
 
in case that helps anyone.  I haven't run the regressions, but otherwise


Reviewed-by: Palmer Dabbelt 

LMK if you want me to run the test suite. 


It'd be great if you could verify. I'll go off and set up an rv32 test 
setup as well.


Tested this for rv32 as well, no regressions w/ w/o as well - do note 
that actual failures between rv32 and rv64 are slightly different, but 
not due to this patch.




IIUC we're still a bit away from the GCC 12 branch, and given this 
doesn't fix any manifestable bugs it should be held for 13.


Sure thing.

Thx,
-Vineet




[PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-09 Thread Roger Sayle

This patch adds middle-end support for target ABIs that pass/return
floating point values in integer registers with precision wider than
the original FP mode.  An example is the nvptx backend, where 16-bit
HFmode registers are passed/returned as (promoted to) SImode registers.
Unfortunately, this currently falls foul of the various (recent?) sanity
checks that (very sensibly) prevent creating paradoxical SUBREGs of
floating point registers.  The approach below is to explicitly perform the
conversion/promotion in two steps, via an integer mode of same precision
as the floating point value.  So on nvptx, 16-bit HFmode is initially
converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
and likewise when going the other way, parameters truncated to HImode
then converted to HFmode (using SUBREG).  These changes are localized
to expand_value_return and expanding DECL_RTL to support strange ABIs,
rather than inside convert_modes or gen_lowpart, as mismatched
precision integer/FP conversions should be explicit in the RTL,
and these semantics not generally visible/implicit in user code.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-02-09  Roger Sayle  

gcc/ChangeLog
   * cfgexpand.cc (expand_value_return): Allow backends to promote
   a scalar floating point return value to a wider integer mode.
   * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
   backends to promote scalar FP PARM_DECLs to wider integer modes.


Thanks in advance,
Roger
--

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d51af2e..c377f16 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3715,7 +3715,22 @@ expand_value_return (rtx val)
 mode = promote_function_mode (type, old_mode, &unsignedp, funtype, 1);
 
   if (mode != old_mode)
-   val = convert_modes (mode, old_mode, val, unsignedp);
+   {
+ /* Some ABIs require scalar floating point modes to be returned
+in a wider scalar integer mode.  We need to explicitly
+reinterpret to an integer mode of the correct precision
+before extending to the desired result.  */
+ if (SCALAR_INT_MODE_P (mode)
+ && SCALAR_FLOAT_MODE_P (old_mode)
+ && known_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (old_mode)))
+   {
+ scalar_int_mode imode = int_mode_for_mode (old_mode).require ();
+ val = force_reg (imode, gen_lowpart (imode, val));
+ val = convert_modes (mode, imode, val, 1);
+   }
+ else
+   val = convert_modes (mode, old_mode, val, unsignedp);
+   }
 
   if (GET_CODE (return_reg) == PARALLEL)
emit_group_load (return_reg, val, type, int_size_in_bytes (type));
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 35e4029..e4efdcd 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10674,6 +10674,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
tmode,
pmode = promote_ssa_mode (ssa_name, &unsignedp);
  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
+ /* Some ABIs require scalar floating point modes to be passed
+in a wider scalar integer mode.  We need to explicitly
+truncate to an integer mode of the correct precision before
+using a SUBREG to reinterpret as a floating point value.  */
+ if (SCALAR_FLOAT_MODE_P (mode)
+ && SCALAR_INT_MODE_P (pmode)
+ && known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (pmode)))
+   {
+ scalar_int_mode imode = int_mode_for_mode (mode).require ();
+ temp = force_reg (imode, gen_lowpart (imode, decl_rtl));
+ return gen_lowpart_SUBREG (mode, temp);
+   }
+
  temp = gen_lowpart_SUBREG (mode, decl_rtl);
  SUBREG_PROMOTED_VAR_P (temp) = 1;
  SUBREG_PROMOTED_SET (temp, unsignedp);
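
The two-step conversion described in this message can be mimicked at the source level. The sketch below is only an analogy and not what the patch emits: it uses a 32-bit float carried in a 64-bit integer instead of HFmode in SImode, the function names are invented, and memcpy stands in for the same-precision SUBREG reinterpretation. The point is that the bits are first reinterpreted as an integer of the same width and only then zero-extended, with the reverse order on the way back.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: float/uint64_t play the roles of HFmode/SImode.  */

/* Outgoing value: reinterpret at the same width, then zero-extend.  */
uint64_t
float_to_wide (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);   /* step 1: same-precision reinterpret */
  return (uint64_t) bits;            /* step 2: zero-extend to wider mode */
}

/* Incoming value: truncate to the narrow integer, then reinterpret.  */
float
wide_to_float (uint64_t wide)
{
  uint32_t bits = (uint32_t) wide;   /* step 1: truncate */
  float f;
  memcpy (&f, &bits, sizeof f);      /* step 2: same-precision reinterpret */
  return f;
}
```

The round trip preserves the value bit for bit, and the upper half of the wide integer is always zero, which matches the zero-extension the message describes.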


Re: [PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Patrick Palka via Gcc-patches
On Wed, 9 Feb 2022, Jason Merrill wrote:

> On 2/9/22 11:36, Patrick Palka wrote:
> > On Wed, 9 Feb 2022, Jason Merrill wrote:
> > 
> > > On 2/9/22 10:45, Patrick Palka wrote:
> > > > In filter_memfn_lookup, we weren't correctly recognizing and matching up
> > > > member functions introduced via a non-dependent using-decl.  This caused
> > > > us to crash in the below testcases in which we correctly pruned the
> > > > overload set for the non-dependent call ahead of time, but then at
> > > > instantiation time filter_memfn_lookup failed to match the selected
> > > > function (introduced in each case by a non-dependent using-decl) to the
> > > > corresponding function from the new lookup set.  Such member functions
> > > > need special handling in filter_memfn_lookup because they look exactly
> > > > the same in the old and new lookup sets, whereas ordinary member
> > > > functions that're defined in the (dependent) current class become more
> > > > specialized in the new lookup set.
> > > > 
> > > > This patch reworks the matching logic in filter_memfn_lookup so that it
> > > > handles non-dependent using-decls correctly, and is hopefully simpler to
> > > > follow.
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
> > > > trunk?
> > > > 
> > > > PR c++/104432
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * call.cc (build_new_method_call): When a non-dependent call
> > > > resolves to a specialization of a member template, always build
> > > > the pruned overload set using the member template, not the
> > > > specialization.
> > > > * pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
> > > > and correct how members from the new lookup set are matched to
> > > > those from the old one.
> > > > (tsubst_baselink): Pass binfo_type as newtype to
> > > > filter_memfn_lookup.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/template/non-dependent19.C: New test.
> > > > * g++.dg/template/non-dependent19a.C: New test.
> > > > * g++.dg/template/non-dependent20.C: New test.
> > > > ---
> > > >gcc/cp/call.cc|  9 ++--
> > > >gcc/cp/pt.cc  | 49
> > > > +--
> > > >.../g++.dg/template/non-dependent19.C | 14 ++
> > > >.../g++.dg/template/non-dependent19a.C| 16 ++
> > > >.../g++.dg/template/non-dependent20.C | 16 ++
> > > >5 files changed, 73 insertions(+), 31 deletions(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
> > > >create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
> > > >create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C
> > > > 
> > > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > > > index b2e89c5d783..d6eed5ed835 100644
> > > > --- a/gcc/cp/call.cc
> > > > +++ b/gcc/cp/call.cc
> > > > @@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree
> > > > fns,
> > > > vec **args,
> > > >  if (really_overloaded_fn (fns))
> > > > {
> > > >   if (DECL_TEMPLATE_INFO (fn)
> > > > - && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
> > > > - && dependent_type_p (DECL_CONTEXT (fn)))
> > > > + && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
> > > > {
> > > > - /* FIXME: We're not prepared to fully instantiate 
> > > > "inside-out"
> > > > -partial instantiations such as A::f().  So 
> > > > instead
> > > > -use the selected template, not the specialization.  */
> > > > + /* Use the selected template, not the specialization, so 
> > > > that
> > > > +this looks like an actual lookup result for sake of
> > > > +filter_memfn_lookup.  */
> > > >   if (OVL_SINGLE_P (fns))
> > > > /* If the original overload set consists of a single
> > > > function
> > > > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > > > index 862f337886c..3a5d06bf297 100644
> > > > --- a/gcc/cp/pt.cc
> > > > +++ b/gcc/cp/pt.cc
> > > > @@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t
> > > > complain, tree in_decl)
> > > >}
> > > >  /* OLDFNS is a lookup set of member functions from some class
> > > > template,
> > > > and
> > > > -   NEWFNS is a lookup set of member functions from a specialization of
> > > > that
> > > > -   class template.  Return the subset of NEWFNS which are
> > > > specializations
> > > > of
> > > > -   a function from OLDFNS.  */
> > > > +   NEWFNS is a lookup set of member functions from NEWTYPE, a
> > > > specialization
> > > > +   of that class template.  Return the subset of NEWFNS which are
> > > > +   specializations of a function from OLDFNS.  */
> > > >  static tree
> > > > -filter_memfn_lookup (tree oldfns, tree new

[r12-7125 Regression] FAIL: gcc.target/i386/pr104441-1a.c scan-assembler [ \t]+vextracti128[ \t]+[^\n]+\n[ \t]+vpaddd[ \t]+[^\n]+\n[ \t]+vmovd[ \t]+[^\n]+\n[ \t]+vzeroupper on Linux/x86_64

2022-02-09 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

5390a2f191682dae3c6d1e1deac20e05be413514 is the first bad commit
commit 5390a2f191682dae3c6d1e1deac20e05be413514
Author: H.J. Lu 
Date:   Sun Jan 30 10:08:14 2022 -0800

x86: Check each component of source operand for AVX_U128_DIRTY

caused

FAIL: gcc.target/i386/pr104441-1a.c scan-assembler [ \t]+vextracti128[ 
\t]+[^\n]+\n[ \t]+vpaddd[ \t]+[^\n]+\n[ \t]+vmovd[ \t]+[^\n]+\n[ \t]+vzeroupper

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7125/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr104441-1a.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr104441-1a.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] libstdc++: Fix deadlock in atomic wait [PR104442]

2022-02-09 Thread Thomas Rodgers via Gcc-patches
Tested x86_64-pc-linux-gnu, committed to master, backported to gcc-11.

On Wed, Feb 9, 2022 at 9:14 AM Jonathan Wakely  wrote:

> On Wed, 9 Feb 2022 at 17:10, Thomas Rodgers wrote:
> >
> > Updated patch. I reverted the memory order change (and will submit that
> > as another patch) and fixed some spelling and grammar errors.
>
> OK for trunk and gcc-11, thanks.
>
>


Re: [PATCH] libstdc++: Strengthen memory order for atomic::wait/notify

2022-02-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Feb 2022 at 17:35, Thomas Rodgers via Libstdc++
 wrote:
>
> This patch changes the memory order used in the spin wait code to match
> that of libc++.

OK for trunk (and gcc-11 if needed).



Re: [PATCH] PR fortran/66193 - ICE for initialisation of some non-zero-sized arrays

2022-02-09 Thread Mikael Morin

Hello

Le 06/02/2022 à 22:14, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

some instances of valid constant array constructors did lead to ICEs.
It turned out that on the one hand we need to attempt simplification of
elements of the constructor, especially when we encounter a parenthesized
expression.  On the other hand the occurrence of type specs and empty
constructors needs to be handled more gracefully.

Parts of the PR have been fixed previously, so the remaining part was
rather simple.

The testcase is based on Gerhard's latest example attached to the PR.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


OK.


Given the simplicity of the patch and that it is an ICE on valid code,
would this qualify for later application to 11-branch?


I suppose it does.

Thanks.


[PATCH] rs6000: Rename vec_clrl and vec_clrr to agreed-upon names

2022-02-09 Thread Bill Schmidt via Gcc-patches
Hi!

After vec_clrl and vec_clrr were implemented and during review of the
documentation, it was agreed to change their names to vec_clr_first and
vec_clr_last to more clearly describe their bi-endian semantics.  ("Left"
and "right" are the wrong terms to be using.)  It looks like I neglected
to make that change, so fixing it now.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk, and for backport to gcc 11 after some burn-in?

Thanks!
Bill


2022-02-09  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_CLR_FIRST): Rename from
VEC_CLRL.
(VEC_CLR_LAST): Rename from VEC_CLRR.

gcc/testsuite/
* gcc.target/powerpc/vec-clrl-0.c: Adjust to new names.
* gcc.target/powerpc/vec-clrl-1.c: Likewise.
* gcc.target/powerpc/vec-clrl-2.c: Likewise.
* gcc.target/powerpc/vec-clrl-3.c: Likewise.
* gcc.target/powerpc/vec-clrr-0.c: Likewise.
* gcc.target/powerpc/vec-clrr-1.c: Likewise.
* gcc.target/powerpc/vec-clrr-2.c: Likewise.
* gcc.target/powerpc/vec-clrr-3.c: Likewise.
---
 gcc/config/rs6000/rs6000-overload.def | 12 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-0.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-1.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-2.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/vec-clrr-3.c |  4 ++--
 9 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 44e2945aaa0..0b68cc3c3b2 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -557,16 +557,16 @@
   vuc __builtin_vec_vcipherlast_be (vuc, vuc);
 VCIPHERLAST_BE
 
-[VEC_CLRL, vec_clrl, __builtin_vec_clrl]
-  vsc __builtin_vec_clrl (vsc, unsigned int);
+[VEC_CLR_FIRST, vec_clr_first, __builtin_vec_clr_first]
+  vsc __builtin_vec_clr_first (vsc, unsigned int);
 VCLRLB  VCLRLB_S
-  vuc __builtin_vec_clrl (vuc, unsigned int);
+  vuc __builtin_vec_clr_first (vuc, unsigned int);
 VCLRLB  VCLRLB_U
 
-[VEC_CLRR, vec_clrr, __builtin_vec_clrr]
-  vsc __builtin_vec_clrr (vsc, unsigned int);
+[VEC_CLR_LAST, vec_clr_last, __builtin_vec_clr_last]
+  vsc __builtin_vec_clr_last (vsc, unsigned int);
 VCLRRB  VCLRRB_S
-  vuc __builtin_vec_clrr (vuc, unsigned int);
+  vuc __builtin_vec_clr_last (vuc, unsigned int);
 VCLRRB  VCLRRB_U
 
 ; We skip generating a #define because of the C-versus-C++ complexity
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c 
b/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
index d0b183ebfaf..df055c6535e 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-0.c
@@ -5,11 +5,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector unsigned char
 clrl (vector unsigned char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 /* { dg-final { scan-assembler {\mvclrlb\M} { target be } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c 
b/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
index 43ab32c0278..692f83e033b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-1.c
@@ -7,11 +7,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector unsigned char
 clrl (vector unsigned char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 int main (int argc, char *argv [])
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c 
b/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
index b9676b8b04c..ffecf432736 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-2.c
@@ -5,11 +5,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector signed char
 clrl (vector signed char arg, int n)
 {
-  return vec_clrl (arg, n);
+  return vec_clr_first (arg, n);
 }
 
 /* { dg-final { scan-assembler {\mvclrlb\M} { target be } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c 
b/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
index 0ae5abcee50..456f655e7aa 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clrl-3.c
@@ -7,11 +7,11 @@
 
 extern void abort (void);
 
-/* Vector string clear left-most bytes of unsigned char.  */
+/* Vector string clear first bytes of unsigned char.  */
 vector 

Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)

2022-02-09 Thread Qing Zhao via Gcc-patches


> On Feb 9, 2022, at 12:23 PM, Jason Merrill  wrote:
> 
> On 2/9/22 10:51, Qing Zhao wrote:
>>> On Feb 8, 2022, at 4:20 PM, Jason Merrill  wrote:
>>> 
>>> On 2/8/22 15:11, Qing Zhao wrote:
>>>> Hi,
>>>> This is the patch to fix PR101515 (ICE in pp_cxx_unqualified_id, at
>>>> cp/cxx-pretty-print.c:128)
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101515
>>>> It's possible that the TYPE_NAME of a record_type is NULL, therefore when
>>>> printing the TYPE_NAME, we should check and handle this special case.
>>>> Please see the comment of pr101515 for more details.
>>>> The fix is very simple, just check and special handle cases when TYPE_NAME
>>>> is NULL.
>>>> Bootstrapped and regression tested on both x86 and aarch64, no issues.
>>>> Okay for commit?
>>>> Thanks.
>>>> Qing
>>>> =
>>>> From f37ee8d21b80cb77d8108cb97a487c84c530545b Mon Sep 17 00:00:00 2001
>>>> From: Qing Zhao 
>>>> Date: Tue, 8 Feb 2022 16:10:37 +
>>>> Subject: [PATCH] Fix PR 101515 ICE in pp_cxx_unqualified_id, at
>>>>  cp/cxx-pretty-print.c:128.
>>>> It's possible that the TYPE_NAME of a record_type is NULL, therefore when
>>>> printing the TYPE_NAME, we should check and handle this special case.
>>>> gcc/cp/ChangeLog:
>>>> * cxx-pretty-print.cc (pp_cxx_unqualified_id): Check and handle
>>>> the case when TYPE_NAME is NULL.
>>>> gcc/testsuite/ChangeLog:
>>>> * g++.dg/pr101515.C: New test.
>>>> ---
>>>>  gcc/cp/cxx-pretty-print.cc  |  5 -
>>>>  gcc/testsuite/g++.dg/pr101515.C | 25 +
>>>>  2 files changed, 29 insertions(+), 1 deletion(-)
>>>>  create mode 100644 gcc/testsuite/g++.dg/pr101515.C
>>>> diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
>>>> index 4f9a090e520d..744ed0add5ba 100644
>>>> --- a/gcc/cp/cxx-pretty-print.cc
>>>> +++ b/gcc/cp/cxx-pretty-print.cc
>>>> @@ -171,7 +171,10 @@ pp_cxx_unqualified_id (cxx_pretty_printer *pp, tree t)
>>>>  case ENUMERAL_TYPE:
>>>>  case TYPENAME_TYPE:
>>>>  case UNBOUND_CLASS_TEMPLATE:
>>>> -  pp_cxx_unqualified_id (pp, TYPE_NAME (t));
>>>> +  if (TYPE_NAME (t))
>>>> +  pp_cxx_unqualified_id (pp, TYPE_NAME (t));
>>>> +  else
>>>> +  pp_string (pp, "");
>>> 
>>> Hmm, but it's not an unnamed class, it's a pointer to member function type, 
>>> and it would be better to avoid dumping compiler internal representations 
>>> like the __pfn field name.
>> Yes, it’s not an unnamed class, but the ICE happened when trying to print the 
>> compiler generated member function type “__ptrmemfunc_type”, whose TYPE_NAME 
>> is NULLed while building this type in the C++ FE, and the C++ FE does not handle 
>> the case when TYPE_NAME is NULL correctly.
>> So, there are two levels of issues:
>> 1. The first level issue is that the current C++ FE does not handle the case 
>> TYPE_NAME being NULL correctly, this is indeed a bug in the current code and 
>> should be fixed as in the current patch.
> 
> Sure, we might as well make this code more robust.  But we can do better than 
>  if we check TYPE_PTRMEMFUNC_P.
Okay, so what should we print to the user if it's “TYPE_PTRMEMFUNC_P”? Print 
nothing or some special string? 
> 
>> 2. The second level issue is what you suggested above: shall we print 
>> the “compiler generated internal type” to the user? I agree with you 
>> that it might not be a good idea to print such compiler internal names to 
>> the user.  Do we do this right now in general (i.e., check whether the 
>> current TYPE is a source level TYPE or a compiler internal TYPE, and then 
>> only print out the name of TYPE for the source level TYPE)? And is there a 
>> bit in the TYPE to distinguish whether a TYPE is a user-level type or a 
>> compiler generated internal type?
> 
>>> I think the real problem comes sooner, when c_fold_indirect_ref_for_warn 
>>> turns a MEM_REF with RECORD_TYPE into a COMPONENT_REF with POINTER_TYPE.
> 
>> What’s the major issue for this transformation? (I will study this in more 
>> details).
> 
> We told c_fold_indirect_ref that we want a RECORD_TYPE (the PMF as a whole) 
> and it gave us back a POINTER_TYPE instead (the __pmf member). Folding 
> shouldn't change the type of an expression like that.

Yes, this is not a correct transformation; I will study it in more detail and 
try to fix it.

Qing
> 
> Jason
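
The two levels discussed above (guarding against a NULL TYPE_NAME, and
checking for a pointer-to-member-function type first) can be illustrated
with a small standalone model.  Nothing here is GCC code: struct type_model,
its fields, and the placeholder strings are hypothetical stand-ins for a
tree node, TYPE_NAME and TYPE_PTRMEMFUNC_P, and what should actually be
printed for the PMF case is still an open question in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node; not GCC's tree representation.  */
struct type_model
{
  const char *name;   /* models TYPE_NAME; may be NULL */
  int is_ptrmemfunc;  /* models TYPE_PTRMEMFUNC_P */
};

/* Return the text a printer would emit for T without ever dereferencing
   a NULL name.  The placeholder strings are made up for this sketch.  */
const char *
printable_type_name (const struct type_model *t)
{
  assert (t != NULL);

  /* Check the more specific case first, so that a compiler-internal
     record is not reported as a generic unnamed type.  */
  if (t->is_ptrmemfunc)
    return "<pointer to member function>";

  if (t->name == NULL)
    return "<unnamed type>";

  return t->name;
}
```

The ordering of the two checks is the point of the sketch: a NULL-name
guard alone avoids the crash, while the earlier, more specific check decides
what the user sees.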



[PATCH] PR fortran/77693 - ICE in rtl_for_decl_init, at dwarf2out.c:17378

2022-02-09 Thread Harald Anlauf via Gcc-patches
Dear all,

as we did not properly check the initialization of pointers in
DATA statements for valid initial data targets, we could either
ICE or generate wrong code.  Testcase based on Gerhard's, BTW.

The attached patch adds a check in gfc_assign_data_value by
calling gfc_check_pointer_assign, as the latter did not get
called otherwise.

Along the way I introduced a new macro IS_POINTER() that
should help to make the code more readable whenever we need
to check the attributes of a symbol to see whether it is a
pointer, CLASS or not.  At least it may save some typing in
the future.
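
As a rough illustration of what such a macro decides, here is a reduced
standalone model in C.  The structs below are hypothetical simplifications,
not gfortran's gfc_symbol: is_class stands for the "BT_CLASS and class_ok"
test, and class_data stands for CLASS_DATA (sym).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for gfortran's symbol data.  */
struct attr_model
{
  int pointer;        /* ordinary POINTER attribute */
  int class_pointer;  /* POINTER attribute of a CLASS entity */
};

struct symbol_model
{
  int is_class;                         /* models BT_CLASS && class_ok */
  struct attr_model attr;               /* attributes of the symbol itself */
  const struct attr_model *class_data;  /* models CLASS_DATA; may be NULL */
};

/* For a usable CLASS symbol the pointer-ness lives in the class data;
   for everything else it is the symbol's own pointer attribute.  */
int
is_pointer (const struct symbol_model *sym)
{
  assert (sym != NULL);

  if (sym->is_class && sym->class_data != NULL)
    return sym->class_data->class_pointer;

  return sym->attr.pointer;
}
```

Writing it as a function in the sketch also sidesteps the usual macro
concern about argument parenthesization; the patch itself uses a macro to
match the neighbouring IS_CLASS_ARRAY-style definitions.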

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From c94d8f63482e810453dd188faa8396dfac397929 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 9 Feb 2022 21:54:29 +0100
Subject: [PATCH] Fortran: improve check of pointer initialization in DATA
 statements

gcc/fortran/ChangeLog:

	PR fortran/77693
	* data.cc (gfc_assign_data_value): If a variable in a data
	statement has the POINTER attribute, check for allowed initial
	data target that is compatible with pointer assignment.
	* gfortran.h (IS_POINTER): New macro.

gcc/testsuite/ChangeLog:

	PR fortran/77693
	* gfortran.dg/data_pointer_2.f90: New test.
---
 gcc/fortran/data.cc  |  4 
 gcc/fortran/gfortran.h   |  3 +++
 gcc/testsuite/gfortran.dg/data_pointer_2.f90 | 21 
 3 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/data_pointer_2.f90

diff --git a/gcc/fortran/data.cc b/gcc/fortran/data.cc
index f7c91437439..7a5866f3c28 100644
--- a/gcc/fortran/data.cc
+++ b/gcc/fortran/data.cc
@@ -618,6 +618,10 @@ gfc_assign_data_value (gfc_expr *lvalue, gfc_expr *rvalue, mpz_t index,
 	gfc_convert_type (expr, &lvalue->ts, 0);
 }

+  if (IS_POINTER (symbol)
+  && !gfc_check_pointer_assign (lvalue, rvalue, false, true))
+return false;
+
   if (last_con == NULL)
 symbol->value = expr;
   else
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 993879feda4..32618c155dc 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3896,6 +3896,9 @@ bool gfc_is_finalizable (gfc_symbol *, gfc_expr **);
 	 && CLASS_DATA (sym) \
 	 && CLASS_DATA (sym)->attr.dimension \
 	 && !CLASS_DATA (sym)->attr.class_pointer)
+#define IS_POINTER(sym) \
+	(sym->ts.type == BT_CLASS && sym->attr.class_ok && CLASS_DATA (sym) \
+	 ? CLASS_DATA (sym)->attr.class_pointer : sym->attr.pointer)

 /* frontend-passes.cc */

diff --git a/gcc/testsuite/gfortran.dg/data_pointer_2.f90 b/gcc/testsuite/gfortran.dg/data_pointer_2.f90
new file mode 100644
index 000..e1677d1c3fb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/data_pointer_2.f90
@@ -0,0 +1,21 @@
+! { dg-do compile }
+! { dg-options "-O -g" }
+! PR fortran/77693 - ICE in rtl_for_decl_init
+! Contributed by G.Steinmetz
+
+program p
+  implicit none
+  complex, target  :: y= (1.,2.)
+  complex, target  :: z(2) = (3.,4.)
+  complex, pointer :: a => y
+  complex, pointer :: b => z(1)
+  complex, pointer :: c, d, e
+  data c /NULL()/   ! Valid
+  data d /y/! Valid
+  data e /(1.,2.)/  ! { dg-error "Pointer assignment target" }
+  if (associated (a)) print *, a% re
+  if (associated (b)) print *, b% im
+  if (associated (c)) print *, c% re
+  if (associated (d)) print *, d% im
+  if (associated (e)) print *, e% re
+end
--
2.34.1



Go patch committed: link against -lrt on GNU/Linux

2022-02-09 Thread Ian Lance Taylor via Gcc-patches
In the Go 1.18 release libgo needs to link against -lrt on GNU/Linux
only, to call the timer_create, timer_settime, and timer_delete
functions.  In preparation this patch changes the gccgo driver to link
against -lrt when linking libgo statically, and the gotools Makefile
to link the runtime test against -lrt.  Unfortunately this is
dependent on the target configuration, so we can't easily use a
configury test.  Instead the gccgo driver simply checks the target
configuration to see whether it needs to add -lrt.
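
The decision the driver makes can be sketched roughly as follows.  This is
a simplified standalone model, not the gospec.cc code: the function name
need_rt_library and its parameters are made up for illustration, the target
triplet is passed in explicitly rather than read from
DEFAULT_TARGET_MACHINE, and the meaning assumed for LIBRARY (0: libgo not
linked, 1: linked, 2 or more: linked statically) is an assumption about the
driver's convention.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the driver's decision; not the gospec.cc code.
   LIBRARY: 0 means libgo is not linked, 1 means it is linked, 2 or more
   means it is linked statically (assumed convention).  STATIC_LINK is
   nonzero when the whole link is static.  */
int
need_rt_library (const char *target_machine, int library, int static_link)
{
  assert (target_machine != NULL);

  /* Nothing to add if libgo itself is not being linked.  */
  if (library <= 0)
    return 0;

  /* Only a static link of libgo needs an explicit -lrt.  */
  if (library <= 1 && !static_link)
    return 0;

  /* Crude target check: any triplet containing "linux".  */
  return strstr (target_machine, "linux") != NULL;
}
```

A substring test on the target triplet is coarse, but it avoids a configure
check whose answer would depend on the build rather than on the target.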

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

* gospec.cc (RTLIB, RT_LIBRARY): Define.
(lang_specific_driver): Add -lrt if linking statically on
GNU/Linux.

* configure.ac (RT_LIBS): Define.
* Makefile.am (check-runtime): Set GOLIBS to $(RT_LIBS).
* configure, Makefile.in: Regenerate.
7c4484858b0dd9e015806df0daac7c5bc8340d4c
diff --git a/gcc/go/gospec.cc b/gcc/go/gospec.cc
index df92b62d8e6..ba7ba4ea09d 100644
--- a/gcc/go/gospec.cc
+++ b/gcc/go/gospec.cc
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
 #define MATHLIB(1<<2)
 /* This bit is set if they did `-lpthread'.  */
 #define THREADLIB  (1<<3)
+/* This bit is set if they did `-lrt'.  */
+#define RTLIB  (1<<4)
 /* This bit is set if they did `-lc'.  */
-#define WITHLIBC   (1<<4)
+#define WITHLIBC   (1<<5)
 /* Skip this option.  */
-#define SKIPOPT(1<<5)
+#define SKIPOPT(1<<6)
 
 #ifndef MATH_LIBRARY
 #define MATH_LIBRARY "m"
@@ -44,6 +46,8 @@ along with GCC; see the file COPYING3.  If not see
 #define THREAD_LIBRARY "pthread"
 #define THREAD_LIBRARY_PROFILE THREAD_LIBRARY
 
+#define RT_LIBRARY "rt"
+
 #define LIBGO "go"
 #define LIBGO_PROFILE LIBGO
 #define LIBGOBEGIN "gobegin"
@@ -74,6 +78,9 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   /* "-lpthread" if it appears on the command line.  */
   const struct cl_decoded_option *saw_thread = 0;
 
+  /* "-lrt" if it appears on the command line.  */
+  const struct cl_decoded_option *saw_rt = 0;
+
   /* "-lc" if it appears on the command line.  */
   const struct cl_decoded_option *saw_libc = 0;
 
@@ -84,6 +91,9 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   /* Whether we need the thread library.  */
   int need_thread = 0;
 
+  /* Whether we need the rt library.  */
+  int need_rt = 0;
+
   /* By default, we throw on the math library if we have one.  */
   int need_math = (MATH_LIBRARY[0] != '\0');
 
@@ -156,6 +166,8 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
}
  else if (strcmp (arg, THREAD_LIBRARY) == 0)
args[i] |= THREADLIB;
+ else if (strcmp (arg, RT_LIBRARY) == 0)
+   args[i] |= RTLIB;
  else if (strcmp (arg, "c") == 0)
args[i] |= WITHLIBC;
  else
@@ -260,7 +272,7 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
 #endif
 
   /* Make sure to have room for the trailing NULL argument.  */
-  num_args = argc + need_math + shared_libgcc + (library > 0) * 5 + 10;
+  num_args = argc + need_math + shared_libgcc + (library > 0) * 6 + 10;
   new_decoded_options = XNEWVEC (struct cl_decoded_option, num_args);
 
   i = 0;
@@ -314,6 +326,12 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
  saw_thread = &decoded_options[i];
}
 
+  if (!saw_rt && (args[i] & RTLIB) && library > 0)
+   {
+ --j;
+ saw_rt = &decoded_options[i];
+   }
+
   if (!saw_libc && (args[i] & WITHLIBC) && library > 0)
{
  --j;
@@ -395,9 +413,23 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
 #endif
 
   /* When linking libgo statically we also need to link with the
-pthread library.  */
+pthread and (on GNU/Linux) the rt library.  */
   if (library > 1 || static_link)
-   need_thread = 1;
+   {
+ need_thread = 1;
+ if (strstr (DEFAULT_TARGET_MACHINE, "linux") != NULL)
+   need_rt = 1;
+   }
+}
+
+  if (saw_rt)
+new_decoded_options[j++] = *saw_rt;
+  else if (library > 0 && need_rt)
+{
+  generate_option (OPT_l, RT_LIBRARY, 1, CL_DRIVER,
+  &new_decoded_options[j]);
+  added_libraries++;
+  j++;
 }
 
   if (saw_thread)
diff --git a/gotools/Makefile.am b/gotools/Makefile.am
index 199899b9ef0..9e81024ea78 100644
--- a/gotools/Makefile.am
+++ b/gotools/Makefile.am
@@ -246,12 +246,14 @@ check-runtime: go$(EXEEXT) $(noinst_PROGRAMS) check-head 
check-gccgo check-gcc
GOARCH=`$(abs_builddir)/go$(EXEEXT) env GOARCH`; \
GOOS=`$(abs_builddir)/go$(EXEEXT) env GOOS`; \
files=`$(SHELL) $(libgosrcdir)/../match.sh --goarch=$${GOARCH} 
--goos=$${GOOS} --srcdir=$(libgosrcdir)/runtime 
--extrafiles="$(libgodir)/runtime_linknames.go $(libgodir)/runtime_sysinfo.go 
$(libgodir)/sigtab.go $(libgodir)/g

Re: [PATCH] tree-optimization/104373 - early uninit diagnostic on unreachable code

2022-02-09 Thread Martin Sebor via Gcc-patches

On 2/9/22 00:12, Richard Biener wrote:

On Tue, 8 Feb 2022, Jeff Law wrote:




On 2/8/2022 12:03 AM, Richard Biener via Gcc-patches wrote:

The following improves early uninit diagnostics by computing edge
reachability using our value-numbering framework in its cheapest
mode and ignoring unreachable blocks when looking
for uninitialized uses.  To not ICE with -fdump-tree-all the
early uninit pass needs a dumpfile since VN tries to dump statistics.
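
The idea of pruning by edge reachability can be modelled on a toy
control-flow graph.  This is only a sketch under stated assumptions: struct
edge_model and compute_reachable are hypothetical names, block 0 is taken to
be the entry block, and the executable flags are supplied by hand here,
whereas the patch derives them from value numbering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical edge record for a toy control-flow graph; the real pass
   works on GCC's basic blocks and the EDGE_EXECUTABLE flag.  */
struct edge_model
{
  int src;
  int dest;
  bool executable;
};

/* Mark every block that can be reached from block 0 by following only
   executable edges.  REACHABLE must have room for NBLOCKS entries, and
   all edge endpoints must be valid block indices.  */
void
compute_reachable (const struct edge_model *edges, int nedges,
                   int nblocks, bool *reachable)
{
  assert (nblocks > 0);

  for (int i = 0; i < nblocks; i++)
    reachable[i] = false;
  reachable[0] = true;

  /* Iterate to a fixed point; good enough for a toy graph.  */
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int i = 0; i < nedges; i++)
        if (edges[i].executable
            && reachable[edges[i].src]
            && !reachable[edges[i].dest])
          {
            reachable[edges[i].dest] = true;
            changed = true;
          }
    }
}
```

A diagnostic walk would then simply skip any block whose reachable entry is
false, which is the pruning step described above.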

For gimple-match.c at -O0 -g this causes a 2% increase in compile-time.

In theory all early diagnostic passes could benefit from a VN run but
that would require more refactoring that's not appropriate at this stage.
This patch addresses a GCC 12 diagnostic regression and also happens to
fix one XFAIL in gcc.dg/uninit-pr20644-O0.c

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

2022-02-04  Richard Biener  

  PR tree-optimization/104373
  * tree-ssa-sccvn.h (do_rpo_vn): New export exposing the
  walk kind.
  * tree-ssa-sccvn.cc (do_rpo_vn): Export, get the default
  walk kind as argument.
  (run_rpo_vn): Adjust.
  (pass_fre::execute): Likewise.
  * tree-ssa-uninit.cc (warn_uninitialized_vars): Skip
  blocks not reachable.
  (execute_late_warn_uninitialized): Mark all edges as
  executable.
  (execute_early_warn_uninitialized): Use VN to compute
  executable edges.
  (pass_data_early_warn_uninitialized): Enable a dump file,
  change dump name to warn_uninit.

  * g++.dg/warn/Wuninitialized-32.C: New testcase.
  * gcc.dg/uninit-pr20644-O0.c: Remove XFAIL.

I'm conflicted on this ;-)

I generally lean on the side of eliminating false positives in these
diagnostics.  So identifying unreachable blocks and using that to prune the
set of warnings we generate, even at -O0 is good from that point of view.

But doing something like this has many of the problems that relying on
optimizations does, even if we don't optimize away the unreachable code.
Right now the warning should be fairly stable at -O0 -- the set of diagnostics
you get isn't going to change a lot release to release which is important to
some users.   Second, at -O0 whether or not you get a warning isn't a function
of how good our unreachable code analysis might be.

This was quite a contentious topic many years ago.  So much that I dropped
some work on Wuninit on the floor as I just couldn't take the arguing.  So be
aware that you might be opening a can of worms.

So the question comes down to a design decision.   What's more important to
the end users?  Fewer false positives or better stability in the warning?  I
think the former, but there's certainly been a vocal group that prefers the
latter.


I see - I didn't think of this aspect at all but that means I have no
idea on whether it is important or not ...

In our own setup we're running into "instabilities" with optimization
when building packages that enable -Werror, so I can see shops doing
dev builds at -O0 with warnings and -Werror but drop -Werror for
optimized builds.


On the implementation side I have zero concerns.    Looking further out, ISTM
we could mark the blocks as unreachable (rather than deducing it from edge
flags).  That would make it pretty easy to mark those blocks relatively early
and allow us to suppress any middle end diagnostic occurring in an unreachable
block.


So what I had in mind is that for the set of early diagnostic passes

   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
   NEXT_PASS (pass_fixup_cfg);
   NEXT_PASS (pass_build_ssa);
   NEXT_PASS (pass_warn_printf);
   NEXT_PASS (pass_warn_nonnull_compare);
   NEXT_PASS (pass_early_warn_uninitialized);
   NEXT_PASS (pass_warn_access, /*early=*/true);

we'd run VN and keep its lattice around (and not just the
EDGE_EXECUTABLE flags).  That would for example allow
pass_warn_nonnull_compare to see that in

void foo (void *p __attribute__((nonnull)))
{
   void *q = p;
   if (q)
 bar (q);
}

we are comparing a never NULL pointer.  Currently the q = p copy
makes it not realize this.  Likewise some constants can be
propagated this way.

Of course using the VN lattice means quite some changes in those
passes.  Even without the VN lattice having unreachable edges
marked could improve diagnostics for, say PHI nodes, if only
a single executable edge remains.

Martin, do you have any thoughts here?  Any opinion on the patch
that for now just marks (not) executable edges to prune diagnostics at
-O0?


Many middle end warnings now run at -O0.  Thanks to Ranger (and
the pointer_query solution), they can identify many of the same
problem statements as with optimization.  But for the same reason
they're also more prone to false positives for unreachable code
because DCE doesn't run at -O0.  So in my mind, identifying at
least some of it then, is a step in the right direction.

So for the avoidance of doubt, I'm in favor of both the patch and
extending the approach to other warnings.

Thanks
Martin



Thanks,
Richard

Go patch committed: treat notinheap types as not being pointers

2022-02-09 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend treats notinheap types as not being
pointers.  By definition, a type that is marked notinheap doesn't contain
any pointers that the garbage collector cares about, and neither does
a pointer to such a type.  Change the type descriptors to consistently
treat such types as not being pointers, by setting ptrdata to 0 and
gcdata to nil.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
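
The rule can be modelled in a few lines of C.  This is a sketch only: struct
go_type_model and has_pointer are hypothetical and far simpler than the
frontend's Type classes, and the model assumes acyclic types, so it omits
the recursion guard the real Named_type code needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum type_kind { KIND_INT, KIND_POINTER, KIND_NAMED };

/* Hypothetical reduced type node; not the frontend's Type hierarchy.  */
struct go_type_model
{
  enum type_kind kind;
  bool in_heap;                      /* false for a notinheap type */
  const struct go_type_model *elem;  /* pointee or underlying type */
};

/* Does the garbage collector need to look inside a value of type T?  */
bool
has_pointer (const struct go_type_model *t)
{
  assert (t != NULL);

  switch (t->kind)
    {
    case KIND_INT:
      return false;
    case KIND_POINTER:
      /* A pointer to a notinheap type is treated as not a pointer.  */
      return t->elem->in_heap;
    case KIND_NAMED:
      /* A notinheap type has no pointers the collector cares about.  */
      if (!t->in_heap)
        return false;
      return has_pointer (t->elem);
    }
  return false;
}
```

When has_pointer is false, the descriptor would carry ptrdata 0 and a nil
gcdata, as described above.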

Ian
83399f6af2c5ff74bb2e09168ba77478d2cfce14
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 9cd22ef011e..3ea7aed3506 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-3b1e46937d11b043d0986a3dfefaee27454c3da0
+7dffb933d33ff288675c8094d05c31b35cbf7e4d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index b1e210ee6ac..30d5c9fcb0b 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -8869,10 +8869,13 @@ Named_object::get_backend(Gogo* gogo, 
std::vector& const_decls,
   {
 named_type->
 type_descriptor_pointer(gogo, Linemap::predeclared_location());
-   named_type->gc_symbol_pointer(gogo);
 Type* pn = Type::make_pointer_type(named_type);
 pn->type_descriptor_pointer(gogo, Linemap::predeclared_location());
-   pn->gc_symbol_pointer(gogo);
+   if (named_type->in_heap())
+ {
+   named_type->gc_symbol_pointer(gogo);
+   pn->gc_symbol_pointer(gogo);
+ }
   }
   }
   break;
diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc
index 1c67ea099eb..ee3467666d8 100644
--- a/gcc/go/gofrontend/types.cc
+++ b/gcc/go/gofrontend/types.cc
@@ -2513,13 +2513,18 @@ Type::type_descriptor_constructor(Gogo* gogo, int 
runtime_type_kind,
   Expression_list* vals = new Expression_list();
   vals->reserve(12);
 
-  if (!this->has_pointer())
+  bool has_pointer;
+  if (name != NULL)
+has_pointer = name->has_pointer();
+  else
+has_pointer = this->has_pointer();
+  if (!has_pointer)
 runtime_type_kind |= RUNTIME_TYPE_KIND_NO_POINTERS;
   if (this->is_direct_iface_type())
 runtime_type_kind |= RUNTIME_TYPE_KIND_DIRECT_IFACE;
   int64_t ptrsize;
   int64_t ptrdata;
-  if (this->needs_gcprog(gogo, &ptrsize, &ptrdata))
+  if (has_pointer && this->needs_gcprog(gogo, &ptrsize, &ptrdata))
 runtime_type_kind |= RUNTIME_TYPE_KIND_GC_PROG;
 
   Struct_field_list::const_iterator p = fields->begin();
@@ -2530,7 +2535,10 @@ Type::type_descriptor_constructor(Gogo* gogo, int 
runtime_type_kind,
   ++p;
   go_assert(p->is_field_name("ptrdata"));
   type_info = Expression::TYPE_INFO_DESCRIPTOR_PTRDATA;
-  vals->push_back(Expression::make_type_info(this, type_info));
+  if (has_pointer)
+vals->push_back(Expression::make_type_info(this, type_info));
+  else
+vals->push_back(Expression::make_integer_ul(0, p->type(), bloc));
 
   ++p;
   go_assert(p->is_field_name("hash"));
@@ -2576,7 +2584,12 @@ Type::type_descriptor_constructor(Gogo* gogo, int 
runtime_type_kind,
 
   ++p;
   go_assert(p->is_field_name("gcdata"));
-  vals->push_back(Expression::make_gc_symbol(this));
+  if (has_pointer)
+vals->push_back(Expression::make_gc_symbol(this));
+  else
+vals->push_back(Expression::make_cast(p->type(),
+ Expression::make_nil(bloc),
+ bloc));
 
   ++p;
   go_assert(p->is_field_name("string"));
@@ -10894,6 +10907,10 @@ Named_type::do_verify()
 bool
 Named_type::do_has_pointer() const
 {
+  // A type that is not in the heap has no pointers that we care about.
+  if (!this->in_heap_)
+return false;
+
   if (this->seen_)
 return false;
   this->seen_ = true;
diff --git a/gcc/go/gofrontend/types.h b/gcc/go/gofrontend/types.h
index a33453afc84..c55345a9d64 100644
--- a/gcc/go/gofrontend/types.h
+++ b/gcc/go/gofrontend/types.h
@@ -2300,9 +2300,12 @@ class Pointer_type : public Type
   do_verify()
   { return this->to_type_->verify(); }
 
+  // If this is a pointer to a type that can't be in the heap, then
+  // the garbage collector does not have to look at this, so pretend
+  // that this is not a pointer at all.
   bool
   do_has_pointer() const
-  { return true; }
+  { return this->to_type_->in_heap(); }
 
   bool
   do_compare_is_identity(Gogo*)


Go patch committed: Use nil pointer for zero length string constant

2022-02-09 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend uses a nil pointer for a zero length
string constant.  We used to pointlessly set the pointer of a zero
length string constant to point to a zero byte constant.  Bootstrapped
and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
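
As a rough illustration of the representation change (this is a standalone
model, not gofrontend code), a string value can be thought of as a
pointer/length pair.  The names GoString and make_string_constant below are
hypothetical and exist only for this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of a string header: a data pointer plus a length.
struct GoString {
  const char* data;
  std::ptrdiff_t len;
};

// Build the header for a string constant.  A zero length constant gets a
// null data pointer instead of the address of a dummy zero byte; any other
// constant points at its bytes as before.
GoString make_string_constant(const std::string& val) {
  if (val.empty())
    return GoString{nullptr, 0};
  return GoString{val.data(), static_cast<std::ptrdiff_t>(val.size())};
}
```

Since the length is zero, nothing should ever dereference the pointer, so
the null value is not expected to be observable by correct code.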

Ian
2e2b861e8941c4e9b36b88e9c562642b1aba6eaf
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 3ea7aed3506..8cbd0c19a8d 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-7dffb933d33ff288675c8094d05c31b35cbf7e4d
+263e8d2a2ab57c6f2b3035f370d40476bda87c9f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index d7b64767a00..3f597654858 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -2123,9 +2123,15 @@ String_expression::do_get_backend(Translate_context* 
context)
 
   Location loc = this->location();
   std::vector init(2);
-  Bexpression* str_cst =
-  gogo->backend()->string_constant_expression(this->val_);
-  init[0] = gogo->backend()->address_expression(str_cst, loc);
+
+  if (this->val_.size() == 0)
+init[0] = gogo->backend()->nil_pointer_expression();
+  else
+{
+  Bexpression* str_cst =
+   gogo->backend()->string_constant_expression(this->val_);
+  init[0] = gogo->backend()->address_expression(str_cst, loc);
+}
 
   Btype* int_btype = Type::lookup_integer_type("int")->get_backend(gogo);
   mpz_t lenval;


Go patch committed: Don't warn for print()

2022-02-09 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend stops warning for calls of print().  We
used to warn for calls to print(), because it doesn't do anything.
However, a Go 1.18 test uses that call, and it is valid Go.  Change
the compiler to just accept it and compile it; this will produce calls
to printlock and printunlock, and nothing else.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
e50a79552d567cd49703103d478ab93d805f60c1
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 8cbd0c19a8d..52f4b423f02 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-263e8d2a2ab57c6f2b3035f370d40476bda87c9f
+b0dcd2d1e5e73952408b9f2d4d86ae12d102b20c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 3f597654858..1b3b3bf135e 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -10332,16 +10332,7 @@ Builtin_call_expression::do_check_types(Gogo*)
 case BUILTIN_PRINTLN:
   {
const Expression_list* args = this->args();
-   if (args == NULL)
- {
-   if (this->code_ == BUILTIN_PRINT)
- go_warning_at(this->location(), 0,
-"no arguments for built-in function %<%s%>",
-(this->code_ == BUILTIN_PRINT
- ? "print"
- : "println"));
- }
-   else
+   if (args != NULL)
  {
for (Expression_list::const_iterator p = args->begin();
 p != args->end();


[committed] analyzer: more uninit test coverage

2022-02-09 Thread David Malcolm via Gcc-patches
In addition to other test coverage, this adds the examples from
  https://cwe.mitre.org/data/definitions/457.html
(aka "CWE-457: Use of Uninitialized Variable")

For reference, the output from -fanalyzer looks like this
(after stripping away the DejaGnu directives):

uninit-CWE-457-examples.c: In function 'example_2_bad_code':
uninit-CWE-457-examples.c:56:3: warning: use of uninitialized value 'bN' 
[CWE-457] [-Wanalyzer-use-of-uninitialized-value]
   56 |   repaint(aN, bN); /* { dg-warning "use of uninitialized value 'bN'" } 
*/
  |   ^~~
  'example_2_bad_code': events 1-4
|
|   34 |   int aN, bN;
|  |   ^~
|  |   |
|  |   (1) region created on stack here
|   35 |   switch (ctl) {
|  |   ~~
|  |   |
|  |   (2) following 'default:' branch...
|..
|   51 |   default:
|  |   ~~~
|  |   |
|  |   (3) ...to here
|..
|   56 |   repaint(aN, bN);
|  |   ~~~
|  |   |
|  |   (4) use of uninitialized value 'bN' here
|
uninit-CWE-457-examples.c: In function 'example_3_bad_code':
uninit-CWE-457-examples.c:95:3: warning: use of uninitialized value 
'test_string' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
   95 |   printf("%s", test_string);
  |   ^
  'example_3_bad_code': events 1-4
|
|   90 |   char *test_string;
|  | ^~~
|  | |
|  | (1) region created on stack here
|   91 |   if (i != err_val)
|  |  ~
|  |  |
|  |  (2) following 'false' branch (when 'i == err_val')...
|..
|   95 |   printf("%s", test_string);
|  |   ~
|  |   |
|  |   (3) ...to here
|  |   (4) use of uninitialized value 'test_string' here
|
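
For readers without the testcase at hand, the shape of code behind the first
trace can be sketched as follows.  This is an assumption-laden reduction, not
the CWE-457 example itself: it is written in its repaired form so that it is
safe to run, and repaint() and example_fixed() here are hypothetical
stand-ins.

```cpp
// Hypothetical stand-in for the repaint() call seen in the trace.
inline int repaint(int a, int b) { return a * 10 + b; }

int example_fixed(int ctl) {
  int aN, bN;              // (1) region created on stack here
  switch (ctl) {
    case 0:
      aN = 1;
      bN = 2;
      break;
    default:               // (2)/(3) the branch the analyzer follows
      aN = 3;
      bN = 0;              // dropping this store leaves 'bN' unset at (4)
      break;
  }
  return repaint(aN, bN);  // (4) the use the warning is attached to
}
```

The warning fires when some path through the switch reaches the final use
without storing to one of the variables; assigning on every path (or at the
declaration) removes that path.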

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7157-g91b27d984ce174.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/uninit-1.c: Add test coverage for shifts,
comparisons, +, -, *, /, and __builtin_strlen.
* gcc.dg/analyzer/uninit-CWE-457-examples.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/uninit-1.c  |  85 +
 .../gcc.dg/analyzer/uninit-CWE-457-examples.c | 119 ++
 2 files changed, 204 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-CWE-457-examples.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/uninit-1.c 
b/gcc/testsuite/gcc.dg/analyzer/uninit-1.c
index cb7b252ef45..9a6576e1b0a 100644
--- a/gcc/testsuite/gcc.dg/analyzer/uninit-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/uninit-1.c
@@ -1,4 +1,5 @@
 #include "analyzer-decls.h"
+typedef __SIZE_TYPE__ size_t;
 
 int test_1 (void)
 {
@@ -42,3 +43,87 @@ int test_6 (int i)
   int arr[10]; /* { dg-message "region created on stack here" } */
   return arr[i]; /* { dg-warning "use of uninitialized value 'arr\\\[i\\\]'" } 
*/
 }
+
+int test_rshift_rhs (int i)
+{
+  int j; /* { dg-message "region created on stack here" } */
+  return i >> j; /* { dg-warning "use of uninitialized value 'j'" } */
+}
+
+int test_lshift_rhs (int i)
+{
+  int j; /* { dg-message "region created on stack here" } */
+  return i << j; /* { dg-warning "use of uninitialized value 'j'" } */
+}
+
+int test_rshift_lhs (int i)
+{
+  int j; /* { dg-message "region created on stack here" } */
+  return j >> i; /* { dg-warning "use of uninitialized value 'j'" } */
+}
+
+int test_lshift_lhs (int i)
+{
+  int j; /* { dg-message "region created on stack here" } */
+  return j << i; /* { dg-warning "use of uninitialized value 'j'" } */
+}
+
+int test_cmp (int i)
+{
+  int j; /* { dg-message "region created on stack here" } */
+  return i < j; /* { dg-warning "use of uninitialized value 'j'" } */
+}
+
+float test_plus_rhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return x + y; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_plus_lhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return y + x; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_minus_rhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return x - y; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_minus_lhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return y - x; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_times_rhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return x * y; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_times_lhs (float x)
+{
+  float y; /* { dg-message "region created on stack here" } */
+  return y * x; /* { dg-warning "use of uninitialized value 'y'" } */
+}
+
+float test_divide_rhs (float x)
+{
+  float y; /* { dg-mess

[r12-7133 Regression] FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b (internal compiler error: tree check: expected none of template_decl, have template_decl in add_specializations, at cp/module.cc

2022-02-09 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

1ce5395977f37e8d0c03394f7b932a584ce85cc7 is the first bad commit
commit 1ce5395977f37e8d0c03394f7b932a584ce85cc7
Author: Jason Merrill 
Date:   Wed Feb 9 00:31:12 2022 -0500

c++: modules and explicit(bool) [PR103752]

caused

FAIL: g++.dg/modules/xtreme-header-5_a.H module-cmi  
(gcm.cache/$srcdir/g++.dg/modules/xtreme-header-5_a.H.gcm)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++17 (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2a (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2b (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H module-cmi  
(gcm.cache/$srcdir/g++.dg/modules/xtreme-header_a.H.gcm)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++17 (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2a (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12979)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7133/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-5_a.H 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-5_a.H 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header_a.H 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header_a.H 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


[PATCH] [COMMITTED] Fix PR aarch64/104474: ICE with vector float initializers and non-consts.

2022-02-09 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The problem here is that the aarch64 back-end was placing const0_rtx
into the constant vector RTL even if the mode was a floating point mode.
The fix is instead to use CONST0_RTX and pass the mode to select the
correct zero (either const_int or const_double).
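
The distinction can be modelled outside of GCC roughly as below.  This is
only a sketch under simplifying assumptions: Mode, Code, Rtx and the three
helper functions are hypothetical and stand in for machine modes, RTL codes,
const0_rtx, CONST0_RTX (mode) and valid_for_const_vector_p respectively.

```cpp
// Hypothetical two-mode world: one integer mode and one float mode.
enum class Mode { SI, SF };
enum class Code { CONST_INT, CONST_DOUBLE };

struct Rtx {
  Code code;
  double value;
};

// Analogue of the mode-independent zero: always an integer constant.
inline Rtx const0() { return Rtx{Code::CONST_INT, 0.0}; }

// Analogue of the per-mode zero: the kind of constant follows the mode.
inline Rtx const0_for_mode(Mode m) {
  return Rtx{m == Mode::SF ? Code::CONST_DOUBLE : Code::CONST_INT, 0.0};
}

// An element is acceptable in a constant vector only if its kind of
// constant matches the element mode.
inline bool valid_for_mode(Mode m, const Rtx& x) {
  return x.code == (m == Mode::SF ? Code::CONST_DOUBLE : Code::CONST_INT);
}
```

In this model the integer zero is rejected for the float mode, which is the
mismatch the patch avoids by asking for the zero of the element mode.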

Committed as obvious after a bootstrap/test on aarch64-linux-gnu with
no regressions.

PR target/104474

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_sve_expand_vector_init_handle_trailing_constants):
Use CONST0_RTX instead of const0_rtx for the non-constant elements.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pr104474-1.c: New test.
* gcc.target/aarch64/sve/pr104474-2.c: New test.
* gcc.target/aarch64/sve/pr104474-3.c: New test.
---
 gcc/config/aarch64/aarch64.cc | 2 +-
 gcc/testsuite/gcc.target/aarch64/sve/pr104474-1.c | 9 +
 gcc/testsuite/gcc.target/aarch64/sve/pr104474-2.c | 9 +
 gcc/testsuite/gcc.target/aarch64/sve/pr104474-3.c | 9 +
 4 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr104474-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr104474-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr104474-3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 8dc6d55e0f2..828ee472be2 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -21164,7 +21164,7 @@ aarch64_sve_expand_vector_init_handle_trailing_constants
{
  rtx x = builder.elt (i + nelts_reqd - n_trailing_constants);
  if (!valid_for_const_vector_p (elem_mode, x))
-   x = const0_rtx;
+   x = CONST0_RTX (elem_mode);
  v.quick_push (x);
}
   rtx const_vec = v.build ();
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr104474-1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-1.c
new file mode 100644
index 000..9e5bfe64467
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-1.c
@@ -0,0 +1,9 @@
+/* { dg-options "-mcpu=neoverse-512tvb -frounding-math -msve-vector-bits=512" 
} */
+
+typedef float __attribute__((__vector_size__ (64))) F;
+
+F
+foo (void)
+{
+  return (F){68435453, 0, 0, 0, 0, 0, 0, 5, 0, 431144844};
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr104474-2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-2.c
new file mode 100644
index 000..02a4b6a8fdc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-2.c
@@ -0,0 +1,9 @@
+/* { dg-options "-mcpu=neoverse-512tvb -msve-vector-bits=512" } */
+
+typedef float __attribute__((__vector_size__ (64))) F;
+
+F
+foo (float t)
+{
+  return (F){t, 0, 0, 0, 0, 0, 0, 5, 0, t};
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr104474-3.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-3.c
new file mode 100644
index 000..7bed0142968
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr104474-3.c
@@ -0,0 +1,9 @@
+/* { dg-options "-mcpu=neoverse-v1 -frounding-math -msve-vector-bits=256" } */
+
+typedef _Float16 __attribute__((__vector_size__ (32))) F;
+
+F
+foo (void)
+{
+  return (F){0, 6270, 0, 0, 0, 0, 0, 0, 3229, 0, 40};
+}
-- 
2.27.0



Re: [PATCH] libgccjit: Add support for register variables [PR104072]

2022-02-09 Thread Antoni Boucher via Gcc-patches
Here's the updated patch.

Le mardi 25 janvier 2022 à 12:13 -0500, Antoni Boucher via Jit a
écrit :
> See answers below.
> 
> Le lundi 24 janvier 2022 à 18:20 -0500, David Malcolm a écrit :
> > On Sat, 2022-01-22 at 19:29 -0500, Antoni Boucher wrote:
> > > Hi.
> > > 
> > > Le mardi 18 janvier 2022 à 18:49 -0500, David Malcolm a écrit :
> > > > On Mon, 2022-01-17 at 19:46 -0500, Antoni Boucher via Gcc-
> > > > patches
> > > > wrote:
> > > > > I missed the comment about the new define, so here's the
> > > > > updated
> > > > > patch.
> > > > 
> > > > Thanks for the patch.
> > > > > 
> > > > > Le lundi 17 janvier 2022 à 19:24 -0500, Antoni Boucher via
> > > > > Jit
> > > > > a
> > > > > écrit :
> > > > > > Hi.
> > > > > > > This patch adds support for register variables in
> > > > > > libgccjit.
> > > > > > 
> > > > > > It passes the JIT tests, but since I added a function in
> > > > > > reginfo.c,
> > > > > > I
> > > > > > wonder if I should run the whole testsuite.
> > > > 
> > > > We're in stage 4 for gcc 12, so we should be especially careful
> > > > about
> > > > changes right now, and we're not meant to be adding new GCC 12
> > > > features.
> > > > 
> > > > How close is gcc 12's libgccjit to being usable with your rustc
> > > > backend?  If we're almost there, I'm willing to make our case
> > > > for
> > > > late-
> > > > breaking libgccjit changes to the release managers, but if you
> > > > think
> > > > rustc users are going to need to build a patched libgccjit,
> > > > then
> > > > let's
> > > > queue this up for gcc 13.
> > > 
> > > As I mentioned in my other email, if the 4 patches currently
> > > being
> > > reviewed (and listed here:
> > > https://github.com/antoyo/libgccjit-patches) were included in gcc
> > > 12,
> > > I'd be able to build rustc_codegen_gcc with an unpatched gcc.
> > 
> > Thanks.  Once the relevant patches look good to me, I'll approach
> > the
> > release managers with the proposal.
> > 
> > > 
> > > It is to be noted however, that I'll need more patches for future
> > > work.
> > > Off the top of my head, I'll at least need a patch for the inline
> > > attribute, try/catch and target-specific builtins.
> > > The last 2 features will probably take some time to implement, so
> > > I'll
> > > let you judge if you think it's worth merging the 4 patches
> > > currently
> > > being reviewed for gcc 12.
> > 
> > Thanks, though I don't know enough about your project's features to
> > make the judgement call.  Does rustc_codegen_gcc have releases yet,
> > or
> > are you just pointing people at the trunk of your repo?  I guess
> > the
> > question is - are you hoping to be able to point people at distro
> > installs of gcc 12's libgccjit and have some version of
> > rustc_codegen_gcc "just work" with that, or are they going to have
> > to
> > rebuild their own libgccjit to meaningfully use it?
> 
> rustc_codegen_gcc does not have releases. It is merged from time to
> time into rustc, so we can as well say that I point them to trunk.
> 
> I can feature gate the future features/patches I will need so that
> people can still use the project with an unpatched gcc.
> 
> There's some interest in having it work with an unpatched gcc because
> that would allow people to start working on the infra to get the
> project distributed via rustup so that it could be distributed as a
> preview component.
> 
> > 
> > [...snip various corrections...]
> > 
> > > 
> > > > > diff --git a/gcc/testsuite/jit.dg/test-register-variable.c
> > > > > b/gcc/testsuite/jit.dg/test-register-variable.c
> > > > > new file mode 100644
> > > > > index 000..3cea3f1668f
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/jit.dg/test-register-variable.c
> > > > > +
> > 
> > [...snip...]
> > 
> > > > > +/* { dg-final { jit-verify-output-file-was-created "" } } */
> > > > > +/* { dg-final { jit-verify-assembler-output
> > > > > "movl  \\\$1,
> > > > > %r12d" } } */
> > > > > +/* { dg-final { jit-verify-assembler-output
> > > > > "movl  \\\$2,
> > > > > %r13d" } } */
> > > > 
> > > > How target-specific is this test?
> > > 
> > > It will only work on x86-64. Should I feature-gate the test
> > > somehow?
> > 
> > 
> > Yes; I think you can do this by adding this to the top of the test:
> > 
> >    /* { dg-do compile { target x86_64-*-* } } */
> > 
> > like test-asm.c does.
> 
> Thanks. I'll update the patch to do this.
> 
> > 
> > > > 
> > > > We should have test coverage for at least these two errors:
> > > > 
> > > > - gcc_jit_lvalue_set_register_name(global_variable,
> > > > "this_is_not_a_register");
> > > > - attempting to set the name for a var that doesn't fit in the
> > > > given
> > > > register (e.g. trying to use a register for an array that's way
> > > > too
> > > > big)
> > > 
> > > Done.
> > 
> > Thanks.
> > 
> > Is the updated patch available for review? It looks like you didn't
> > attach it.
> > 
> > Dave
> > 
> 

From 542b7af61292a04d61480388d425a1057ecd66d4 Mon Sep 17 00:00:00 2001
From: Antoni 

[PATCH] c: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-09 Thread Zhao Wei Liew via Gcc-patches
Hi!

I wrote a patch for PR 25689, but I feel like it may not be the ideal
fix. Furthermore, there are some standing issues with the patch for
which I would like tips on how to fix them.
Specifically, there are 2 issues:
1. GCC warns about `if (a.operator=(0))`. That said, this may not be a
major issue as I don't think such code is widely written.
2. GCC does not warn for `if (a = b)` where the default copy/move
assignment operator is used.

I've included a code snippet in PR25689 that shows the 2 issues I
mentioned. I appreciate any feedback, thanks!

Everything below is the actual patch

When compiling the following code with g++ -Wparentheses, GCC does not
warn on the if statement:

struct A {
A& operator=(int);
operator bool();
};

void f(A a) {
if (a = 0); // no warning
}

This is because a = 0 is a call to operator=, which GCC does not check
for.

This patch fixes that by checking for calls to operator= when deciding
to warn.
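
To make the runtime effect concrete, here is a small self-contained sketch
(not part of the patch or its testcase).  The member `value` and the helper
assign_in_condition are assumptions added for illustration.

```cpp
struct A {
  int value = -1;
  // User-provided assignment: `a = 0` is a call to this function.
  A& operator=(int v) { value = v; return *this; }
  operator bool() const { return value != 0; }
};

// Returns whether the branch is taken when `a = v` is used as a condition.
// The condition first calls A::operator= and then converts the result to
// bool, so it assigns rather than compares.
bool assign_in_condition(A& a, int v) {
  if ((a = v))   // extra parentheses mark the assignment as intentional
    return true;
  return false;
}
```

With built-in types, writing the condition without the extra parentheses is
what -Wparentheses flags as a likely typo for `==`.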

PR c/25689

gcc/cp/ChangeLog:

* semantics.cc (maybe_convert_cond): Handle the operator=() case
  as well.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-31.C: New test.
---
 gcc/cp/semantics.cc | 14 +-
 gcc/testsuite/g++.dg/warn/Wparentheses-31.C | 11 +++
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-31.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 466d6b56871f4..6a25d039585f2 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -836,7 +836,19 @@ maybe_convert_cond (tree cond)
   /* Do the conversion.  */
   cond = convert_from_reference (cond);

-  if (TREE_CODE (cond) == MODIFY_EXPR
+  /* Also check if this is a call to operator=().
+ Example: if (my_struct = 5) {...}
+  */
+  tree fndecl = NULL_TREE;
+  if (TREE_OPERAND_LENGTH(cond) >= 1) {
+fndecl = cp_get_callee_fndecl(TREE_OPERAND(cond, 0));
+  }
+
+  if ((TREE_CODE (cond) == MODIFY_EXPR
+|| (fndecl != NULL_TREE
+&& DECL_OVERLOADED_OPERATOR_P(fndecl)
+&& DECL_OVERLOADED_OPERATOR_IS(fndecl, NOP_EXPR)
+&& DECL_ASSIGNMENT_OPERATOR_P(fndecl)))
   && warn_parentheses
   && !warning_suppressed_p (cond, OPT_Wparentheses)
   && warning_at (cp_expr_loc_or_input_loc (cond),
diff --git a/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
new file mode 100644
index 0..abd7476ccb461
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
@@ -0,0 +1,11 @@
+/* PR c/25689 */
+/* { dg-options "-Wparentheses" }  */
+
+struct A {
+   A& operator=(int);
+   operator bool();
+};
+
+void f(A a) {
+   if (a = 0); /* { dg-warning "suggest parentheses" } */
+}
--
2.17.1


Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 16:01, Qing Zhao wrote:




On Feb 9, 2022, at 12:23 PM, Jason Merrill  wrote:

On 2/9/22 10:51, Qing Zhao wrote:

On Feb 8, 2022, at 4:20 PM, Jason Merrill  wrote:

On 2/8/22 15:11, Qing Zhao wrote:

Hi,
This is the patch to fix PR101515 (ICE in pp_cxx_unqualified_id, at  
cp/cxx-pretty-print.c:128)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101515
It's possible that the TYPE_NAME of a record_type is NULL, therefore when
printing the TYPE_NAME, we should check and handle this special case.
Please see the comment of pr101515 for more details.
The fix is very simple: check for and specially handle the case when TYPE_NAME is
NULL.
Bootstrapped and regression tested on both x86 and aarch64, no issues.
Okay for commit?
Thanks.
Qing
=
 From f37ee8d21b80cb77d8108cb97a487c84c530545b Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Tue, 8 Feb 2022 16:10:37 +
Subject: [PATCH] Fix PR 101515 ICE in pp_cxx_unqualified_id, at
  cp/cxx-pretty-print.c:128.
It's possible that the TYPE_NAME of a record_type is NULL, therefore when
printing the TYPE_NAME, we should check and handle this special case.
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (pp_cxx_unqualified_id): Check and handle
the case when TYPE_NAME is NULL.
gcc/testsuite/ChangeLog:
* g++.dg/pr101515.C: New test.
---
  gcc/cp/cxx-pretty-print.cc  |  5 -
  gcc/testsuite/g++.dg/pr101515.C | 25 +
  2 files changed, 29 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/pr101515.C
diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
index 4f9a090e520d..744ed0add5ba 100644
--- a/gcc/cp/cxx-pretty-print.cc
+++ b/gcc/cp/cxx-pretty-print.cc
@@ -171,7 +171,10 @@ pp_cxx_unqualified_id (cxx_pretty_printer *pp, tree t)
  case ENUMERAL_TYPE:
  case TYPENAME_TYPE:
  case UNBOUND_CLASS_TEMPLATE:
-  pp_cxx_unqualified_id (pp, TYPE_NAME (t));
+  if (TYPE_NAME (t))
+   pp_cxx_unqualified_id (pp, TYPE_NAME (t));
+  else
+   pp_string (pp, "");


Hmm, but it's not an unnamed class, it's a pointer to member function type, and 
it would be better to avoid dumping compiler internal representations like the 
__pfn field name.

Yes, it’s not an unnamed class, but the ICE happened when trying to print the 
compiler generated member function type “__ptrmemfunc_type”, whose TYPE_NAME is 
set to NULL while building this type in the C++ FE, and the C++ FE does not 
correctly handle the case when TYPE_NAME is NULL.
So, there are two levels of issues:
1. The first level issue is that the current C++ FE does not correctly handle 
the case of TYPE_NAME being NULL. This is indeed a bug in the current code and 
should be fixed as in the current patch.


Sure, we might as well make this code more robust.  But we can do better than 
 if we check TYPE_PTRMEMFUNC_P.

Okay, so what should we print to the user if it's “TYPE_PTRMEMFUNC_P”? Print 
nothing or some special string?


Maybe call pp->type_id in that case.


2. The second level issue is what you suggested above: should we print 
the “compiler generated internal type” to the user? I agree with you that 
it might not be a good idea to print such compiler internal names to the user.  
Do we do this right now in general (i.e., check whether the current TYPE is a 
source level TYPE or a compiler internal TYPE, and only print the name 
for a source level TYPE)? And is there a bit in the TYPE to 
distinguish whether a TYPE is a user-level type or a compiler generated internal 
type?



I think the real problem comes sooner, when c_fold_indirect_ref_for_warn turns 
a MEM_REF with RECORD_TYPE into a COMPONENT_REF with POINTER_TYPE.



What’s the major issue with this transformation? (I will study this in more 
detail.)


We told c_fold_indirect_ref that we want a RECORD_TYPE (the PMF as a whole) and 
it gave us back a POINTER_TYPE instead (the __pmf member). Folding shouldn't 
change the type of an expression like that.


Yes, this is not a correct transformation; I will study it in more detail and 
try to fix it.

Qing


Jason






Re: [PATCH] c++: memfn lookup consistency and using-decls [PR104432]

2022-02-09 Thread Jason Merrill via Gcc-patches

On 2/9/22 15:15, Patrick Palka wrote:

On Wed, 9 Feb 2022, Jason Merrill wrote:


On 2/9/22 11:36, Patrick Palka wrote:

On Wed, 9 Feb 2022, Jason Merrill wrote:


On 2/9/22 10:45, Patrick Palka wrote:

In filter_memfn_lookup, we weren't correctly recognizing and matching up
member functions introduced via a non-dependent using-decl.  This caused
us to crash in the below testcases in which we correctly pruned the
overload set for the non-dependent call ahead of time, but then at
instantiation time filter_memfn_lookup failed to match the selected
function (introduced in each case by a non-dependent using-decl) to the
corresponding function from the new lookup set.  Such member functions
need special handling in filter_memfn_lookup because they look exactly
the same in the old and new lookup sets, whereas ordinary member
functions that're defined in the (dependent) current class become more
specialized in the new lookup set.

This patch reworks the matching logic in filter_memfn_lookup so that it
handles non-dependent using-decls correctly, and is hopefully simpler to
follow.
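
For orientation, the kind of source involved looks roughly like the sketch
below.  It is an illustration written for this note, not one of the new
testcases, and the names Base, Derived, call and call_template are made up.

```cpp
struct Base {
  int f(int) { return 1; }
  template <class U = void> int f() { return 2; }  // member template overload
};

template <class T>
struct Derived : Base {
  using Base::f;           // non-dependent using-decl: Base does not depend on T
  int h(T) { return 3; }   // ordinary member of the dependent current class

  // Both calls are non-dependent, so the overload is chosen when the
  // template is defined and has to be found again at instantiation time.
  int call() { return f(0); }
  int call_template() { return f(); }
};
```

The functions named by the using-decl are members of the non-dependent Base,
so they are the same declarations before and after instantiation, whereas a
member such as h is only a specialization in the instantiated class.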

Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
trunk?

PR c++/104432

gcc/cp/ChangeLog:

* call.cc (build_new_method_call): When a non-dependent call
resolves to a specialization of a member template, always build
the pruned overload set using the member template, not the
specialization.
* pt.cc (filter_memfn_lookup): New parameter newtype.  Simplify
and correct how members from the new lookup set are matched to
those from the old one.
(tsubst_baselink): Pass binfo_type as newtype to
filter_memfn_lookup.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent19.C: New test.
* g++.dg/template/non-dependent19a.C: New test.
* g++.dg/template/non-dependent20.C: New test.
---
gcc/cp/call.cc|  9 ++--
gcc/cp/pt.cc  | 49
+--
.../g++.dg/template/non-dependent19.C | 14 ++
.../g++.dg/template/non-dependent19a.C| 16 ++
.../g++.dg/template/non-dependent20.C | 16 ++
5 files changed, 73 insertions(+), 31 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19.C
create mode 100644 gcc/testsuite/g++.dg/template/non-dependent19a.C
create mode 100644 gcc/testsuite/g++.dg/template/non-dependent20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index b2e89c5d783..d6eed5ed835 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -11189,12 +11189,11 @@ build_new_method_call (tree instance, tree
fns,
vec **args,
  if (really_overloaded_fn (fns))
{
  if (DECL_TEMPLATE_INFO (fn)
- && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn))
- && dependent_type_p (DECL_CONTEXT (fn)))
+ && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn)))
{
- /* FIXME: We're not prepared to fully instantiate "inside-out"
-partial instantiations such as A::f().  So instead
-use the selected template, not the specialization.  */
+ /* Use the selected template, not the specialization, so that
+this looks like an actual lookup result for sake of
+filter_memfn_lookup.  */
  if (OVL_SINGLE_P (fns))
/* If the original overload set consists of a single
function
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 862f337886c..3a5d06bf297 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -16311,12 +16311,12 @@ tsubst (tree t, tree args, tsubst_flags_t
complain, tree in_decl)
}
  /* OLDFNS is a lookup set of member functions from some class
template,
and
-   NEWFNS is a lookup set of member functions from a specialization of
that
-   class template.  Return the subset of NEWFNS which are
specializations
of
-   a function from OLDFNS.  */
+   NEWFNS is a lookup set of member functions from NEWTYPE, a
specialization
+   of that class template.  Return the subset of NEWFNS which are
+   specializations of a function from OLDFNS.  */
  static tree
-filter_memfn_lookup (tree oldfns, tree newfns)
+filter_memfn_lookup (tree oldfns, tree newfns, tree newtype)
{
  /* Record all member functions from the old lookup set OLDFNS into
 VISIBLE_SET.  */
@@ -16326,38 +16326,34 @@ filter_memfn_lookup (tree oldfns, tree newfns)
  if (TREE_CODE (fn) == USING_DECL)
{
  /* FIXME: Punt on (dependent) USING_DECL for now; mapping
-a dependent USING_DECL to its instantiation seems
-tricky.  */
+a dependent USING_DECL to the member functions it introduces
+seems tricky.  */


FWIW I still think this shouldn't be very tricky.


I tried implementing this by substituting into the USING_DECL_SCOPE and
then during the filtering step keeping the member functions 

Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-02-09 Thread Dan Li via Gcc-patches




On 2/9/22 08:08, Richard Sandiford wrote:

Dan Li  writes:

+
+  /* When shadow call stack is enabled, the scs_pop in the epilogue will
+ restore x30, and we don't need to pop x30 again in the traditional
+ way.  Pop candidates record the registers that need to be popped
+ eventually.  */
+  if (frame.is_scs_enabled)
+{
+  if (frame.wb_push_candidate2 == R30_REGNUM)
+   frame.wb_pop_candidate2 = INVALID_REGNUM;
+  else if (frame.wb_push_candidate1 == R30_REGNUM)
+   frame.wb_pop_candidate1 = INVALID_REGNUM;


Although it makes no difference to the behaviour, I think it would be
clearer to use pop rather than push in the checks here.



Got it.

@@ -7885,8 +7914,8 @@ aarch64_save_callee_saves (poly_int64 start_offset,
bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno);
  
if (skip_wb

- && (regno == cfun->machine->frame.wb_candidate1
- || regno == cfun->machine->frame.wb_candidate2))
+ && (regno == cfun->machine->frame.wb_push_candidate1
+ || regno == cfun->machine->frame.wb_push_candidate2))
continue;
  
if (cfun->machine->reg_is_wrapped_separately[regno])

@@ -7996,8 +8025,8 @@ aarch64_restore_callee_saves (poly_int64 start_offset, unsigned start,
rtx reg, mem;
  
if (skip_wb

- && (regno == cfun->machine->frame.wb_candidate1
- || regno == cfun->machine->frame.wb_candidate2))
+ && (regno == cfun->machine->frame.wb_push_candidate1
+ || regno == cfun->machine->frame.wb_push_candidate2))


Shouldn't this be using pop rather than push?



There might be a little difference:

- Using push candidates means that a register to be ignored in pop
candidates will not be emitted again during the "restore" (pop_candidates
should always be a subset of push_candidates, since popping a register
without a push might not make sense).

- Using pop candidates means that a register to be ignored in pop
candidates will be re-emitted during the "restore". For example,
if we specify to ignore the x20 register in pop:

--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7502,6 +7502,8 @@ aarch64_layout_frame (void)
frame.wb_pop_candidate1 = INVALID_REGNUM;
 }
 
+  if (frame.wb_pop_candidate2 == R20_REGNUM)

+   frame.wb_pop_candidate2 = INVALID_REGNUM;
   /* If candidate2 is INVALID_REGNUM, we need to adjust max_push_offset to
  256 to ensure that the offset meets the requirements of emit_move_insn.
  Similarly, if candidate1 is INVALID_REGNUM, we need to set

With the test case:

int main(void)
{
__asm__ ("":::"x19", "x20");
return 0;
}

When we use "pop_candidate[12]", one more insn is emitted:

00400604 <main>:
   400604:   a9bf53f3    stp x19, x20, [sp, #-16]!
   400608:   52800000    mov w0, #0x0
+  40060c:   f94007f4    ldr x20, [sp, #8]
   400610:   f84107f3    ldr x19, [sp], #16
   400614:   d65f03c0    ret

But in the case of ignoring a specific register (like scs ignores x30),
there is no difference between the two (because we always need
to explicitly specify which registers to ignore in the parameter of
aarch64_restore_callee_saves).

If pop looks better here, I'd like to change it to pop in the
next version :).
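
To make the difference concrete, here is a minimal sketch of the skip logic
in the restore loop. It is only a model: struct frame_model, NO_REG and
count_separate_restores are hypothetical names, not the aarch64 back end's
real data structures. It counts the loads emitted outside the writeback pop
when the loop skips the push candidates versus the pop candidates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the aarch64 register numbers.  */
enum { X19 = 19, X20 = 20, NO_REG = -1 };

/* Simplified model of the writeback candidates in the frame layout.  */
struct frame_model
{
  int wb_push_candidate1, wb_push_candidate2;
  int wb_pop_candidate1, wb_pop_candidate2;
};

/* Count the loads a restore loop would emit separately from the
   writeback pop, skipping either the pop or the push candidates.  */
int
count_separate_restores (const struct frame_model *f, const int *saved,
                         int n_saved, bool skip_pop_candidates)
{
  int c1 = skip_pop_candidates ? f->wb_pop_candidate1
                               : f->wb_push_candidate1;
  int c2 = skip_pop_candidates ? f->wb_pop_candidate2
                               : f->wb_push_candidate2;
  int count = 0;

  assert (n_saved >= 0);
  for (int i = 0; i < n_saved; i++)
    {
      if (saved[i] == c1 || saved[i] == c2)
        continue;  /* Left to the writeback instruction.  */
      count++;
    }
  return count;
}
```

With x19/x20 as push candidates and x20 removed from the pop candidates,
skipping by push candidates yields no separate load, while skipping by pop
candidates yields one (the extra "ldr x20" in the listing above).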


+  /* When shadow call stack is enabled, the scs_pop in the epilogue will
+ restore x30, we don't need to restore x30 again in the traditional
+ way.  */
+  if (cfun->machine->frame.is_scs_enabled)
+aarch64_restore_callee_saves (callee_offset - sve_callee_adjust,
+ R0_REGNUM, R29_REGNUM,
+ callee_adjust != 0, &cfi_ops);
+  else
+aarch64_restore_callee_saves (callee_offset - sve_callee_adjust,
+ R0_REGNUM, R30_REGNUM,
+ callee_adjust != 0, &cfi_ops);
+


Very minor, but I think it would be better to have:

   unsigned int last_gpr = (cfun->machine->frame.is_scs_enabled
   ? R29_REGNUM : R30_REGNUM);

so that we don't need to repeat the other arguments.  There's then
less risk of the two versions getting out of sync.



Got it.
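
A rough sketch of the suggested shape follows. The register numbers mirror
the names in the patch, but restore_range_size and restore_gprs are
hypothetical stand-ins for aarch64_restore_callee_saves and its caller, used
only to show the upper bound being selected once before a single call.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the GCC register numbers used in the patch.  */
enum { R0_REGNUM = 0, R29_REGNUM = 29, R30_REGNUM = 30 };

/* Hypothetical stand-in for aarch64_restore_callee_saves: returns how
   many registers in [first, last] would be considered for restoring.  */
unsigned int
restore_range_size (unsigned int first, unsigned int last)
{
  assert (first <= last);
  return last - first + 1;
}

/* Pick the last GPR once, so the remaining arguments are written a
   single time and the two cases cannot drift apart.  */
unsigned int
restore_gprs (bool is_scs_enabled)
{
  unsigned int last_gpr = is_scs_enabled ? R29_REGNUM : R30_REGNUM;
  return restore_range_size (R0_REGNUM, last_gpr);
}
```

With shadow call stack enabled, x30 falls outside the range and is left to
scs_pop; otherwise it is restored in the traditional way.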

  
if (need_barrier_p)

  emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
@@ -9066,6 +9109,17 @@ aarch64_expand_epilogue (bool for_sibcall)
RTX_FRAME_RELATED_P (insn) = 1;
  }
  
+  /* Pop return address from shadow call stack.  */

+  if (cfun->machine->frame.is_scs_enabled)
+{
+  machine_mode mode = aarch64_reg_save_mode (R30_REGNUM);
+  rtx reg = gen_rtx_REG (mode, R30_REGNUM);
+
+  insn = emit_insn (gen_scs_pop ());
+  add_reg_note (insn, REG_CFA_RESTORE, reg);
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+
/* We prefer to emit the combined return/authenticate instruction RETAA,
   however there are three cases in which we must instead emit an explicit
   auth

[PATCH] [vect] Add vect_recog_cond_expr_convert_pattern.

2022-02-09 Thread liuhongt via Gcc-patches
>But in principle @2 or @3 could safely differ in sign, you'd then need to
>ensure to insert sign conversions to @2/@3 to the signedness of @4/@5.
Changed.
>you are not testing for this anywhere?
It was tested in vect_recog_cond_expr_convert_pattern; I've moved the check
to match.pd.

>Btw, matching up the comments with the types is somewhat difficult,
>maybe using TYPE_AB, TYPE_CD, TYPE_E instead of 1,2,3 will
>make that easier ;)
Changed.
>I think the precision check should be part of the match.pd pattern.  You
>do not check that the comparison operands are integral - I think float
>comparisons would be OK in principle but the precision check will not
>work there.
Restricted to integral types.

Here's the updated patch.
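
For readers following the thread, a scalar sketch of the equivalence the
pattern relies on may help. It assumes the rewrite selects in the wider type
and converts once afterwards, with a sign conversion inserted when the two
arms differ in sign; the function names below are illustrative and are not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Original shape: both arms are converted before the select.  */
uint8_t
select_of_converts (int32_t c, int32_t d, int32_t a, uint32_t b)
{
  uint8_t op_true = (uint8_t) a;
  uint8_t op_false = (uint8_t) b;
  return c < d ? op_true : op_false;
}

/* Rewritten shape: select in the wide type, then convert once.  The
   arms differ in sign, so 'a' is first converted to the type of 'b'.  */
uint8_t
convert_of_select (int32_t c, int32_t d, int32_t a, uint32_t b)
{
  uint32_t wide = c < d ? (uint32_t) a : b;
  return (uint8_t) wide;
}
```

Both functions return the same value for all inputs because the narrowing
conversion to an unsigned type is modular, so it does not matter whether it
is applied before or after the select.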

gcc/ChangeLog:

PR target/103771
* match.pd (cond_expr_convert_p): New match.
* tree-vect-patterns.cc (gimple_cond_expr_convert_p): Declare.
(vect_recog_cond_expr_convert_pattern): New.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103771-2.c: New test.
* gcc.target/i386/pr103771-3.c: New test.
---
 gcc/match.pd   | 14 
 gcc/testsuite/gcc.target/i386/pr103771-2.c |  8 ++
 gcc/testsuite/gcc.target/i386/pr103771-3.c | 21 +
 gcc/tree-vect-patterns.cc  | 96 ++
 4 files changed, 139 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103771-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103771-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7bbb80172fc..7386ee518a1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7683,3 +7683,17 @@ and,
to the number of trailing zeroes.  */
 (match (ctz_table_index @1 @2 @3)
   (rshift (mult (bit_and:c (negate @1) @1) INTEGER_CST@2) INTEGER_CST@3))
+
+(match (cond_expr_convert_p @0 @2 @3 @6)
+ (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
+  (if (INTEGRAL_TYPE_P (type)
+   && INTEGRAL_TYPE_P (TREE_TYPE (@2))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@3))
+   && TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (@0))
+   && TYPE_PRECISION (TREE_TYPE (@0))
+ == TYPE_PRECISION (TREE_TYPE (@2))
+   && TYPE_PRECISION (TREE_TYPE (@0))
+ == TYPE_PRECISION (TREE_TYPE (@3))
+   && single_use (@4)
+   && single_use (@5
diff --git a/gcc/testsuite/gcc.target/i386/pr103771-2.c b/gcc/testsuite/gcc.target/i386/pr103771-2.c
new file mode 100644
index 000..962a3a74ecf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103771-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=cascadelake -O3" } */
+/* { dg-final { scan-assembler-not "kunpck" } } */
+/* { dg-final { scan-assembler-not "kand" } } */
+/* { dg-final { scan-assembler-not "kor" } } */
+/* { dg-final { scan-assembler-not "kshift" } } */
+
+#include "pr103771.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr103771-3.c b/gcc/testsuite/gcc.target/i386/pr103771-3.c
new file mode 100644
index 000..ef379b23b12
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103771-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=cascadelake -O3" } */
+/* { dg-final { scan-assembler-not "kunpck" } } */
+/* { dg-final { scan-assembler-not "kand" } } */
+/* { dg-final { scan-assembler-not "kor" } } */
+/* { dg-final { scan-assembler-not "kshift" } } */
+
+typedef unsigned char uint8_t;
+
+static uint8_t x264_clip_uint8 (int x, unsigned int y)
+{
+  return x & (~255) ? (-x) >> 31 : y;
+}
+
+void
+mc_weight (uint8_t* __restrict dst, uint8_t* __restrict src,
+  int i_width,int i_scale, unsigned int* __restrict y)
+{
+  for(int x = 0; x < i_width; x++)
+dst[x] = x264_clip_uint8 (src[x] * i_scale, y[x]);
+}
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 2baf974627e..aa54bc8bf8b 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -924,6 +924,101 @@ vect_reassociating_reduction_p (vec_info *vinfo,
   return true;
 }
 
+/* match.pd function to match
+   (cond (cmp@3 a b) (convert@1 c) (convert@2 d))
+   with conditions:
+   1) @1, @2, c, d, a, b are all integral type.
+   2) There's single_use for both @1 and @2.
+   3) a, c and d have same precision.
+   4) c and @1 have different precision.
+
+   record a and c and d and @3.  */
+
+extern bool gimple_cond_expr_convert_p (tree, tree*, tree (*)(tree));
+
+/* Function vect_recog_cond_expr_convert
+
+   Try to find the following pattern:
+
+   TYPE_AB A,B;
+   TYPE_CD C,D;
+   TYPE_E E;
+   TYPE_E op_true = (TYPE_E) A;
+   TYPE_E op_false = (TYPE_E) B;
+
+   E = C cmp D ? op_true : op_false;
+
+   where
+   TYPE_PRECISION (TYPE_E) != TYPE_PRECISION (TYPE_CD);
+   TYPE_PRECISION (TYPE_AB) == TYPE_PRECISION (TYPE_CD);
+   single_use of op_true and op_false.
+   TYPE_AB could differ in sign.
+
+   Input:
+
+   * STMT_VINFO: The stmt from which the pattern search begins.
+   here it starts with E = c cmp D ? op_true : op_false;
+
+   Output:
+
+ 

[committed] wwwdocs: gcc-4.7: Update link to Go 1 standard

2022-02-09 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/gcc-4.7/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-4.7/changes.html b/htdocs/gcc-4.7/changes.html
index 846946d6..c61106e5 100644
--- a/htdocs/gcc-4.7/changes.html
+++ b/htdocs/gcc-4.7/changes.html
@@ -691,7 +691,7 @@ well.
 Go
   
 GCC 4.7 implements
-  the <a href="https://golang.org/doc/go1">Go 1
+  the <a href="https://go.dev/doc/go1">Go 1
   language standard.  The library support in 4.7.0 is not
   quite complete, due to release timing.  Release 4.7.1 includes
   complete support for Go 1.  The Go library is from the Go 1.0.1
-- 
2.35.1