[PATCH] LoongArch: Use UNSPEC for fmin/fmax RTL pattern [PR105414]

2022-09-24 Thread Xi Ruoyao via Gcc-patches
I made a mistake when defining the fmin/fmax RTL patterns in r13-2085: I
used smin and smax in the definitions.  This causes the optimizer to
perform constant folding as if fmin/fmax were "really" smin/smax
operations, even with -fsignaling-nans, so pr105414.c fails.

We don't have fmin/fmax RTL codes for now (PR107013) so we can only use
an UNSPEC for fmin and fmax patterns.
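
To illustrate why the two kinds of operation cannot be treated as
interchangeable, here is a minimal C++ sketch.  The helper names are
hypothetical, and naive_max is only a stand-in for a comparison-based
maximum, not GCC's actual folding code.  It only uses quiet NaNs, so it
does not exercise the signaling-NaN case from the PR; it just shows that
a comparison-based max and the C library fmax disagree once a NaN is
involved.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: a maximum written as a plain comparison.
// Any ordered comparison with NaN is false, so the second operand wins.
inline double naive_max(double a, double b) {
  return a > b ? a : b;
}

// The C library maximum: a quiet NaN operand is treated as missing
// data and the other operand is returned.
inline double ieee_max(double a, double b) {
  return std::fmax(a, b);
}

// Quiet NaN used for the demonstration (assumes no -ffast-math).
inline double quiet_nan() {
  return std::numeric_limits<double>::quiet_NaN();
}
```

With these helpers, ieee_max (1.0, quiet_nan ()) is 1.0 while
naive_max (1.0, quiet_nan ()) is a NaN, so folding one as if it were
the other changes the result.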

gcc/ChangeLog:

PR tree-optimization/105414
* config/loongarch/loongarch.md (UNSPEC_FMAX): New unspec.
(UNSPEC_FMIN): Likewise.
(fmax3): Use UNSPEC_FMAX instead of smax.
(fmin3): Use UNSPEC_FMIN instead of smin.
---
 gcc/config/loongarch/loongarch.md | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 3787fd8230f..214b14bddd3 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -35,6 +35,8 @@ (define_c_enum "unspec" [
   ;; Floating point unspecs.
   UNSPEC_FRINT
   UNSPEC_FCLASS
+  UNSPEC_FMAX
+  UNSPEC_FMIN
 
   ;; Override return address for exception handling.
   UNSPEC_EH_RETURN
@@ -1032,8 +1034,9 @@ (define_insn "smin3"
 
 (define_insn "fmax3"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
-   (smax:ANYF (match_operand:ANYF 1 "register_operand" "f")
-  (match_operand:ANYF 2 "register_operand" "f")))]
+   (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" "f"))
+ (use (match_operand:ANYF 2 "register_operand" "f"))]
+UNSPEC_FMAX))]
   ""
   "fmax.\t%0,%1,%2"
   [(set_attr "type" "fmove")
@@ -1041,8 +1044,9 @@ (define_insn "fmax3"
 
 (define_insn "fmin3"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
-   (smin:ANYF (match_operand:ANYF 1 "register_operand" "f")
-  (match_operand:ANYF 2 "register_operand" "f")))]
+   (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" "f"))
+ (use (match_operand:ANYF 2 "register_operand" "f"))]
+UNSPEC_FMIN))]
   ""
   "fmin.\t%0,%1,%2"
   [(set_attr "type" "fmove")
-- 
2.37.0



[committed] libstdc++: Simplify detection idiom using concepts

2022-09-24 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

Add a simpler definition of std::__detected_or using concepts.  This
also replaces the __detector::value_t member which should have been using
a reserved name.

Use __detected_or in pointer_traits.
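
As a rough sketch of the idea (not the libstdc++ code itself), a
detected-or trait can be written with a requires-clause when concepts
are available, with a void_t-style fallback otherwise.  All names below
(detected_or, detector, make_void, diff_t, WithDiff, Plain) are
hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <type_traits>

#if defined(__cpp_concepts)
// Negative case: Op<Args...> is not a valid type, use the default.
template<typename Default, template<typename...> class Op, typename... Args>
struct detected_or {
  using type = Default;
  using is_detected = std::false_type;
};

// Positive case: selected only when Op<Args...> names a valid type.
template<typename Default, template<typename...> class Op, typename... Args>
  requires requires { typename Op<Args...>; }
struct detected_or<Default, Op, Args...> {
  using type = Op<Args...>;
  using is_detected = std::true_type;
};
#else
// Fallback without concepts: the classic void_t-based detector.
template<typename...> struct make_void { using type = void; };

template<typename Default, typename Void,
         template<typename...> class Op, typename... Args>
struct detector {
  using type = Default;
  using is_detected = std::false_type;
};

template<typename Default, template<typename...> class Op, typename... Args>
struct detector<Default, typename make_void<Op<Args...>>::type, Op, Args...> {
  using type = Op<Args...>;
  using is_detected = std::true_type;
};

template<typename Default, template<typename...> class Op, typename... Args>
using detected_or = detector<Default, void, Op, Args...>;
#endif

// Example use in the spirit of pointer_traits::difference_type.
template<typename T> using diff_t = typename T::difference_type;

struct WithDiff { using difference_type = short; };
struct Plain {};

template<typename P>
using difference_type_of =
  typename detected_or<std::ptrdiff_t, diff_t, P>::type;

static_assert(std::is_same<difference_type_of<WithDiff>, short>::value,
              "member typedef is detected");
static_assert(std::is_same<difference_type_of<Plain>, std::ptrdiff_t>::value,
              "default is used when the member is absent");
```

The concepts branch needs only one primary template and one constrained
partial specialization, with no extra void parameter to thread through.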

libstdc++-v3/ChangeLog:

* include/bits/alloc_traits.h (allocator_traits::is_always_equal):
Only instantiate is_empty if needed.
* include/bits/ptr_traits.h (__ptr_traits_impl::difference_type)
(__ptr_traits_impl::rebind): Use __detected_or.
* include/experimental/type_traits (is_same_v): Add a partial
specialization instead of instantiating the std::is_same class
template.
(detected_t): Redefine in terms of detected_or_t.
(is_detected, is_detected_v): Redefine in terms of detected_t.
* include/std/type_traits [__cpp_concepts] (__detected_or): Add
new definition using concepts.
(__detector::value_t): Rename to __is_detected.
* testsuite/17_intro/names.cc: Check value_t isn't used.
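
The is_same_v entry above relies on partial specialization of a variable
template, which avoids instantiating a class template for every query.
A minimal sketch, using the hypothetical name same_v rather than the
library's own:

```cpp
// Primary variable template: two arbitrary types are not the same.
template<typename T, typename U>
constexpr bool same_v = false;

// Partial specialization: chosen when both arguments are one type.
template<typename T>
constexpr bool same_v<T, T> = true;

static_assert(same_v<int, int>, "identical types");
static_assert(!same_v<int, long>, "distinct types");
```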
---
 libstdc++-v3/include/bits/alloc_traits.h  |  4 +--
 libstdc++-v3/include/bits/ptr_traits.h| 27 ---
 libstdc++-v3/include/experimental/type_traits | 24 -
 libstdc++-v3/include/std/type_traits  | 27 ---
 libstdc++-v3/testsuite/17_intro/names.cc  |  1 +
 5 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/libstdc++-v3/include/bits/alloc_traits.h b/libstdc++-v3/include/bits/alloc_traits.h
index 507e8f1b6b2..8479bfd612f 100644
--- a/libstdc++-v3/include/bits/alloc_traits.h
+++ b/libstdc++-v3/include/bits/alloc_traits.h
@@ -74,7 +74,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template
   using __pocs = typename _Tp::propagate_on_container_swap;
 template
-  using __equal = typename _Tp::is_always_equal;
+  using __equal = __type_identity;
   };
 
   template
@@ -209,7 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* otherwise @c is_empty::type
   */
   using is_always_equal
-   = __detected_or_t::type, __equal, _Alloc>;
+   = typename __detected_or_t, __equal, _Alloc>::type;
 
   template
using rebind_alloc = __alloc_rebind<_Alloc, _Tp>;
diff --git a/libstdc++-v3/include/bits/ptr_traits.h b/libstdc++-v3/include/bits/ptr_traits.h
index 8360c3b6557..ae8810706ab 100644
--- a/libstdc++-v3/include/bits/ptr_traits.h
+++ b/libstdc++-v3/include/bits/ptr_traits.h
@@ -144,29 +144,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __ptr_traits_impl : __ptr_traits_ptr_to<_Ptr, _Elt>
 {
 private:
-  template
-   struct __difference { using type = ptrdiff_t; };
-
   template
-#if __cpp_concepts
-   requires requires { typename _Tp::difference_type; }
-   struct __difference<_Tp>
-#else
-   struct __difference<_Tp, __void_t>
-#endif
-   { using type = typename _Tp::difference_type; };
-
-  template
-   struct __rebind : __replace_first_arg<_Tp, _Up> { };
+   using __diff_t = typename _Tp::difference_type;
 
   template
-#if __cpp_concepts
-   requires requires { typename _Tp::template rebind<_Up>; }
-   struct __rebind<_Tp, _Up>
-#else
-   struct __rebind<_Tp, _Up, __void_t>>
-#endif
-   { using type = typename _Tp::template rebind<_Up>; };
+   using __rebind = __type_identity>;
 
 public:
   /// The pointer type.
@@ -176,11 +158,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using element_type = _Elt;
 
   /// The type used to represent the difference between two pointers.
-  using difference_type = typename __difference<_Ptr>::type;
+  using difference_type = __detected_or_t;
 
   /// A pointer to a different type.
   template
-using rebind = typename __rebind<_Ptr, _Up>::type;
+   using rebind = typename __detected_or_t<__replace_first_arg<_Ptr, _Up>,
+   __rebind, _Ptr, _Up>::type;
 };
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
diff --git a/libstdc++-v3/include/experimental/type_traits b/libstdc++-v3/include/experimental/type_traits
index af5970e80d0..fa25a1c2be2 100644
--- a/libstdc++-v3/include/experimental/type_traits
+++ b/libstdc++-v3/include/experimental/type_traits
@@ -223,7 +223,9 @@ template 
 
 // See C++14 20.10.6, type relations
 template 
-  constexpr bool is_same_v = is_same<_Tp, _Up>::value;
+  constexpr bool is_same_v = false;
+template 
+  constexpr bool is_same_v<_Tp, _Tp> = true;
 template 
   constexpr bool is_base_of_v = is_base_of<_Base, _Derived>::value;
 template 
@@ -266,23 +268,21 @@ struct nonesuch : private __nonesuchbase
 };
 #pragma GCC diagnostic pop
 
-template class _Op, typename... _Args>
-  using is_detected
-= typename std::__detector::value_t;
-
-template class _Op, typename... _Args>
-  constexpr bool is_detected_v = is_detected<_Op, _Args...>::value;
-
-template class _Op, typename... _Args>
-  using detected_t
-= typename std::__detector::type;
-
 template class

Re: [PATCH] Fix typo in chapter level for RISC-V attributes

2022-09-24 Thread Jeff Law via Gcc-patches



On 9/23/22 12:43, Torbjörn SVENSSON via Gcc-patches wrote:

The "RISC-V specific attributes" section should be at the same level
as "PowerPC-specific attributes".

gcc/ChangeLog:

* doc/sourcebuild.texi: Fix chapter level.


OK

jeff




Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into BIT_FIELD_REFs alone

2022-09-24 Thread Jeff Law via Gcc-patches



On 9/23/22 05:42, Tamar Christina wrote:

Hi All,

This adds a match.pd rule that can fold right shifts and bit_field_refs of
integers into just a bit_field_ref by adjusting the offset and the size of the
extract and adds an extend to the previous size.

Concretely turns:

#include <arm_neon.h>

unsigned int foor (uint32x4_t x)
{
 return x[1] >> 16;
}

which used to generate:

   _1 = BIT_FIELD_REF ;
   _3 = _1 >> 16;

into

   _4 = BIT_FIELD_REF ;
   _2 = (unsigned int) _4;

I currently limit the rewrite to only doing it if the resulting extract is in
a mode the target supports. i.e. it won't rewrite it to extract say 13-bits
because I worry that for targets that won't have a bitfield extract instruction
this may be a de-optimization.
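
The arithmetic behind the rewrite can be checked with a small model.
This is only a sketch: Vec4, bit_field_ref, sample and the two wrapper
functions are hypothetical names, and the little-endian lane numbering
is an assumption of the model, not something taken from the patch.  For
a 32-bit extract at bit 32 followed by a right shift of 16, the new
extract has size 32 - 16 = 16 and position 32 + 16 = 48.

```cpp
#include <cstdint>

// Hypothetical model of a 128-bit vector as four 32-bit lanes, with
// bits numbered from the least significant bit of lane 0.
struct Vec4 { std::uint32_t lane[4]; };

// Extract `size` bits (1..32) starting at bit `offset`, zero-extended.
// The operand order mirrors BIT_FIELD_REF (object, size, position).
inline std::uint32_t bit_field_ref(const Vec4& v, unsigned size,
                                   unsigned offset) {
  std::uint32_t result = 0;
  for (unsigned i = 0; i < size; ++i) {
    unsigned bit = offset + i;
    std::uint32_t b = (v.lane[bit / 32] >> (bit % 32)) & 1u;
    result |= b << i;
  }
  return result;
}

// Before the fold: extract lane 1, then shift right by 16.
inline std::uint32_t before_fold(const Vec4& v) {
  return bit_field_ref(v, 32, 32) >> 16;
}

// After the fold: a single 16-bit extract at position 48.
inline std::uint32_t after_fold(const Vec4& v) {
  return bit_field_ref(v, 16, 48);
}

// Sample input for trying the two forms.
inline Vec4 sample() {
  return Vec4{{0x11112222u, 0xAAAABBBBu, 0x33334444u, 0x55556666u}};
}
```

Both before_fold (sample ()) and after_fold (sample ()) yield 0xAAAA in
this model.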

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues.

Testcases are added in patch 2/2.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* match.pd: Add bitfield and shift folding.


Were you planning to handle left shifts as well?  It looks like it since 
you've got iterations for the shift opcode and corresponding adjustment 
to the field, but they currently only handle rshift/plus.



Jeff




Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into BIT_FIELD_REFs alone

2022-09-24 Thread Andrew Pinski via Gcc-patches
On Fri, Sep 23, 2022 at 4:43 AM Tamar Christina via Gcc-patches wrote:
>
> Hi All,
>
> This adds a match.pd rule that can fold right shifts and bit_field_refs of
> integers into just a bit_field_ref by adjusting the offset and the size of the
> extract and adds an extend to the previous size.
>
> Concretely turns:
>
> #include <arm_neon.h>
>
> unsigned int foor (uint32x4_t x)
> {
> return x[1] >> 16;
> }
>
> which used to generate:
>
>   _1 = BIT_FIELD_REF ;
>   _3 = _1 >> 16;
>
> into
>
>   _4 = BIT_FIELD_REF ;
>   _2 = (unsigned int) _4;
>
> I currently limit the rewrite to only doing it if the resulting extract is in
> a mode the target supports. i.e. it won't rewrite it to extract say 13-bits
> because I worry that for targets that won't have a bitfield extract instruction
> this may be a de-optimization.

It is only a de-optimization for the following case:
* vector extraction

All other cases should be handled correctly in the middle-end when
expanding to RTL because they need to be handled for bit-fields
anyways.
Plus SIGN_EXTRACT and ZERO_EXTRACT would be used in the integer case
for the RTL.
Getting SIGN_EXTRACT/ZERO_EXTRACT early on in the RTL is better than
waiting until combine really.


>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
>
> Testcases are added in patch 2/2.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * match.pd: Add bitfield and shift folding.
>
> --- inline copy of patch --
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 1d407414bee278c64c00d425d9f025c1c58d853d..b225d36dc758f1581502c8d03761544bfd499c01 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7245,6 +7245,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>&& ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0)))
>(IFN_REDUC_PLUS_WIDEN @0)))
>
> +/* Canonicalize BIT_FIELD_REFS and shifts to BIT_FIELD_REFS.  */
> +(for shift (rshift)
> + op (plus)
> + (simplify
> +  (shift (BIT_FIELD_REF @0 @1 @2) integer_pow2p@3)
> +  (if (INTEGRAL_TYPE_P (type))
> +   (with { /* Can't use wide-int here as the precision differs between
> + @1 and @3.  */
> +  unsigned HOST_WIDE_INT size = tree_to_uhwi (@1);
> +  unsigned HOST_WIDE_INT shiftc = tree_to_uhwi (@3);
> +  unsigned HOST_WIDE_INT newsize = size - shiftc;
> +  tree nsize = wide_int_to_tree (bitsizetype, newsize);
> +  tree ntype
> += build_nonstandard_integer_type (newsize, 1); }

Maybe use `build_nonstandard_integer_type (newsize, /* unsignedp = */ true);`
or better yet `build_nonstandard_integer_type (newsize, UNSIGNED);`

I had started to convert some of the unsignedp into enum signop but I
never finished or submitted the patch.

Thanks,
Andrew Pinski


> +(if (ntype)
> + (convert:type (BIT_FIELD_REF:ntype @0 { nsize; } (op @2 @3))))))))
> +
>  (simplify
>   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
>   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
>
>
>
>
> --


Re: [PATCH] Ignore debug insns with CONCAT and CONCATN for insn scheduling

2022-09-24 Thread Jeff Law via Gcc-patches



On 9/21/22 16:11, H.J. Lu wrote:

On Wed, Sep 7, 2022 at 10:03 AM Jeff Law via Gcc-patches wrote:



On 9/2/2022 8:36 AM, H.J. Lu via Gcc-patches wrote:

CONCAT and CONCATN never appear in the insn chain.  They are only used
in debug insn.  Ignore debug insns with CONCAT and CONCATN for insn
scheduling to avoid different insn orders with and without debug insn.

gcc/

   PR rtl-optimization/106746
   * sched-deps.cc (sched_analyze_2): Ignore debug insns with CONCAT
   and CONCATN.

Shouldn't we be ignoring everything in a debug insn?   I don't see why
CONCAT/CONCATN are special here.

Debug insns are processed by insn scheduling.   I think it is to improve debug
experiences.  It is just that there are no matching usages of CONCAT/CONCATN
in non-debug insns.


But from a dependency standpoint ISTM all debug insn can be ignored.  I 
still don't see why concat/concatn should be special here.



jeff




Re: [PATCH]middle-end fix floating out of constants in conditionals

2022-09-24 Thread Jeff Law via Gcc-patches



On 9/23/22 03:21, Tamar Christina wrote:

Hi All,

The following testcase:

int zoo1 (int a, int b, int c, int d)
{
return (a > b ? c : d) & 1;
}

gets de-optimized by the front-end since somewhere around GCC 4.x due to a fix
that was added to fold_binary_op_with_conditional_arg.

The folding is supposed to succeed only if we have folded at least one of the
branches, however the check doesn't test that all of the values are
non-constant.  So if one of the operands is a constant it accepts the folding.

This ends up folding

return (a > b ? c : d) & 1;

into

return (a > b ? c & 1 : d & 1);

and thus performing the AND twice.

This change makes it reject the folding if one of the arguments is a constant
and if the operations being performed are the same.

Secondly it adds a new match.pd rule to now also fold the opposite direction, so
it now also folds:

return (a > b ? c & 1 : d & 1);

into

return (a > b ? c : d) & 1;
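
The two forms compute the same value and differ only in how many AND
operations are written.  A small sketch to compare them (the function
names are hypothetical, chosen for this example):

```cpp
// Preferred form: select first, then a single AND.
inline int and_after_select(int a, int b, int c, int d) {
  return (a > b ? c : d) & 1;
}

// Form produced by the unwanted folding: the AND is duplicated into
// both arms of the conditional.
inline int and_in_each_arm(int a, int b, int c, int d) {
  return a > b ? (c & 1) : (d & 1);
}
```

For any inputs the two functions agree, which is why the new match.pd
rule can fold the second form back into the first.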

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* fold-const.cc (fold_binary_op_with_conditional_arg): Add relaxation.
* match.pd: Add ternary constant fold rule.
* tree-cfg.cc (verify_gimple_assign_ternary): RHS1 of a COND_EXPR isn't
a value but an expression itself.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/if-compare_3.c: New test.


OK

jeff




Re: [PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-24 Thread Sandra Loosemore

On 9/18/22 02:47, Palmer Dabbelt wrote:

On Fri, 09 Sep 2022 02:46:40 PDT (-0700), Palmer Dabbelt wrote:

I just happened to stumble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
---
 gcc/doc/tm.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..7590924f2ca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
 @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
 @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
 explicit excess precision that should be added depending on the
-value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.

 Note that unpredictable explicit excess precision does not make sense,
 so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
 when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},


Just pinging this one as I'm not sure if it's OK to self-approve -- no 
rush on my end, I already figured it out so I don't need the 
documentation any more.


This is fine, looks like a trivial correction.

-Sandra


Re: [PATCH] Avoid depending on destructor order

2022-09-24 Thread Iain Sandoe



> On 23 Sep 2022, at 15:30, David Edelsohn via Gcc-patches wrote:
> 
> On Fri, Sep 23, 2022 at 10:12 AM Thomas Neumann  wrote:
> 
>>> 
>>>+static const bool in_shutdown = false;
>>> 
>>> I'll let Jason or others decide if this is the right solution.  It seems
>>> that in_shutdown also could be declared outside the #ifdef and
>>> initialized as "false".
>> 
>> sure, either is fine. Moving it outside the #ifdef wastes one byte in
>> the executable (while the compiler can eliminate the const), but it does
>> not really matter.
>> 
>> I have verified that the patch below fixes builds for both fast-path and
>> non-fast-path builds. But if you prefer I will move the in_shutdown
>> definition instead.
>> 
>> Best
>> 
>> Thomas
>> 
>> PS: in_shutdown is an int here instead of a bool because non-fast-path
>> builds do not include stdbool. Not a good reason, of course, but I
>> wanted to keep the patch minimal and it makes no difference in practice.
>> 
>> 
>> When using the atomic fast path deregistering can fail during
>> program shutdown if the lookup structures are already destroyed.
>> The assert in __deregister_frame_info_bases takes that into
>> account. The non-fast-path case, however, is not aware of
>> program shutdown, which caused a compiler error on such platforms.
>> We fix that by introducing a constant for in_shutdown in
>> non-fast-path builds.
>> 
>> libgcc/ChangeLog:
>> * unwind-dw2-fde.c: Introduce a constant for in_shutdown
>> for the non-fast-path case.
>> 
>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
>> index d237179f4ea..0bcd5061d76 100644
>> --- a/libgcc/unwind-dw2-fde.c
>> +++ b/libgcc/unwind-dw2-fde.c
>> @@ -67,6 +67,8 @@ static void
>>  init_object (struct object *ob);
>> 
>>  #else
>> +/* Without fast path frame deregistration must always succeed.  */
>> +static const int in_shutdown = 0;
>> 
>>  /* The unseen_objects list contains objects that have been registered
>> but not yet categorized in any way.  The seen_objects list has had
>> 
> 
> Thanks for the patch.  I'll let you and Jason decide which style solution
> is preferred.

This also breaks bootstrap on Darwin at least, so an early solution would be
welcome (the fix here allows bootstrap to continue, testing on-going).
thanks,
Iain

> 
> Thanks, David
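
The approach discussed in this thread can be sketched as follows.  This
is a simplified model, not the libgcc source: EXAMPLE_ATOMIC_FAST_PATH
and deregister_checked are hypothetical names, and the shape of the
assertion is an assumption based on the description in the patch.

```cpp
#include <cassert>
#include <cstddef>

// EXAMPLE_ATOMIC_FAST_PATH is a hypothetical macro standing in for the
// real build-time condition that selects the fast path.
#ifdef EXAMPLE_ATOMIC_FAST_PATH
// Fast path: a real flag that would be set when shutdown begins.
static bool in_shutdown = false;
#else
// Without the fast path, deregistration must always succeed, so the
// flag is a compile-time constant the compiler can fold away.
static const int in_shutdown = 0;
#endif

// Hypothetical stand-in for the deregistration check: the shared
// assertion compiles in both configurations because in_shutdown is
// always declared.
inline const void* deregister_checked(const void* found) {
  assert(in_shutdown || found != nullptr);
  return found;
}
```

Defining the constant in the non-fast-path branch keeps the assertion
identical in both builds, at the cost of one extra declaration.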