[PATCH] wide-int: Fix estimation of buffer sizes for wide_int printing [PR111800]

2023-10-14 Thread Jakub Jelinek
Hi!

As mentioned in the PR, my estimations on needed buffer size for wide_int
and especially widest_int printing were incorrect, I've used get_len ()
in the estimations, but that is true only for !wi::neg_p (x) values.
Under the hood, we have 3 ways to print numbers.
print_decs which if
  if ((wi.get_precision () <= HOST_BITS_PER_WIDE_INT)
  || (wi.get_len () == 1))
uses sprintf which always fits into WIDE_INT_PRINT_BUFFER_SIZE (positive or
negative) and otherwise uses print_hex,
print_decu which if
  if ((wi.get_precision () <= HOST_BITS_PER_WIDE_INT)
  || (wi.get_len () == 1 && !wi::neg_p (wi)))
uses sprintf which always fits into WIDE_INT_PRINT_BUFFER_SIZE (positive
only) and print_hex, which doesn't print most significant limbs which are
zero and the first limb which is non-zero prints such that redundant 0
hex digits aren't printed, while all limbs below that are printed with
"%016" PRIx64.  For wi::neg_p (x) values, the first limb of the precision
is always non-zero, so we print all the limbs for the precision.
So, the current estimations are accurate if !wi::neg_p (x), or when
print_decs will be used and x.get_len () == 1, otherwise we need to use
estimation based on get_precision () rather than get_len ().

The following patch does that, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

The patch doesn't address what I've talked about earlier, that we might
actually stop using print_hex when asked for print_dec{s,u} - we could for
negative print_decs just negate and call print_decu, and in print_decu
e.g. in a loop UNSIGNED wi::divmod_trunc by
HOST_WIDE_INT_UC (1000) and print the 19 decimal digits of
remainder if quotient is non-zero, otherwise non-padded rest, and then
reshuffle the buffer.  And/or perhaps print_hex should also take signop
and print negative hex constants as -0x. if asked for SIGNED.
And finally, I think we should try to rewrite tree-ssa-ccp.cc bit-cp from
widest_int to wide_int, even the earlier:
PHI node value: CONSTANT 
0xffe2
 (0x19)
in the -fdump-tree-ccp-details dumps is horribly confusing when the
type is say just 32-bit or 64-bit, and with the recent widest_int changes
those are now around with > 32000 f hex digits in there.  Not to mention we 
shouldn't
really care about state of bits beyond the precision and I think we always
have the type in question around (x.val is INTEGER_CST of the right type
and we just to::widest it, just x.mask is widest_int).

2023-10-14  Jakub Jelinek  

PR tree-optimization/111800
gcc/
* wide-int.cc (assert_deceq): Use wi.get_len () for buffer size
estimation only if !wi::neg_p (wi) or if len is 1 and sgn is SIGNED,
otherwise use WIDE_INT_MAX_HWIS for wi.get_precision ().
(assert_hexeq): Use wi.get_len () for buffer size estimation only
if !wi::neg_p (wi), otherwise use WIDE_INT_MAX_HWIS for
wi.get_precision ().
* wide-int-print.cc (print_decs): Use wi.get_len () for buffer size
estimation only if !wi::neg_p (wi) or if len is 1, otherwise use
WIDE_INT_MAX_HWIS for wi.get_precision ().
(print_decu): Use wi.get_len () for buffer size estimation only if
!wi::neg_p (wi), otherwise use WIDE_INT_MAX_HWIS for
wi.get_precision ().
(print_hex): Likewise.
* value-range.cc (irange_bitmask::dump): Use get_len () for
buffer size estimation only if !wi::neg_p (wi), otherwise use
WIDE_INT_MAX_HWIS for get_precision ().
* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
Likewise.
* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
i_bound.get_len () for buffer size estimation only if
!wi::neg_p (i_bound) or if len is 1 and !TYPE_UNSIGNED, otherwise use
WIDE_INT_MAX_HWIS for i_bound.get_precision ().  Use TYPE_SIGN macro
in print_dec call argument.
gcc/c-family/
* c-warn.cc (match_case_to_enum_1): Assert w.get_precision ()
is smaller or equal to WIDE_INT_MAX_INL_PRECISION rather than
w.get_len () is smaller or equal to WIDE_INT_MAX_INL_ELTS.

--- gcc/wide-int.cc.jj  2023-10-13 19:34:44.288830022 +0200
+++ gcc/wide-int.cc 2023-10-13 20:23:12.889386810 +0200
@@ -2450,7 +2450,9 @@ static void
 assert_deceq (const char *expected, const wide_int_ref &wi, signop sgn)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
   unsigned len = wi.get_len ();
+  if ((len != 1 || sgn == UNSIGNED) && wi::neg_p (wi))
+len = WIDE_INT_MAX_HWIS (wi.get_precision ());
   if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
 p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
   print_dec (wi, p, sgn);
@@ -2463,7 +2465,11 @@ static void
 assert_hexeq (const char *expected, const wide_int_ref &wi)
 {
   char buf[WIDE_INT_PRINT_BUFFER_

Re: [PATCH] libstdc++: Workaround for LLVM-61763 in ranges

2023-10-14 Thread Jonathan Wakely
On Sat, 14 Oct 2023, 00:33 Benjamin Brock,  wrote:

> This is my first time submitting a patch, so my apologies if I'm
> submitting incorrectly or missing something.
>

Thanks for contributing!

I don't think this patch counts as legally significant, but if you
contribute again in future you should be aware of
https://gcc.gnu.org/contribute.html#legal and either complete the copyright
assignment paperwork, or add a DCO sign-off to the commit message.


> Clang is unable to compile parts of libstdc++'s ranges implementation
> due to LLVM-61763, a Clang frontend compiler bug in handling the
> declarations of constrained friends.  The problem areas are zip_view,
> zip_transform_view, and adjacent_transform_view.
>
> A simple ranges program like the following fails to compile using
> Clang trunk and libstdc++.
>
>   std::vector v = {1, 2, 3, 4};
>   int sum = 0;
>   for (auto&& [i, j] : std::ranges::views::zip(v, v))
> sum += i * j;
>
> In file included from :1:
>
> /opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/ranges:4655:14:
> error: type constraint differs in template redeclaration
>  4655 | template
>   |  ^
> . . . . . .
>
> Godbolt: https://godbolt.org/z/Ynbs15aGh
>
> This patch adds a small workaround that avoids declaring constrained
> friends when compiling with Clang, instead making some members public.
> MSVC's standard library has implemented a similar workaround.
>
> Scanning through libstdc++, there do appear to be other workarounds
> for Clang, e.g. in complex and experimental/simd.  Hopefully this kind
> of workaround is acceptable---while the core issue is a Clang compiler
> bug, it may take a while to fix, and it would be very useful for
> libstdc++ ranges to work with Clang in the meantime.
>

Yes, this is ok because the hack is limited to __clang__.


> 2023-10-13  Benjamin Brock 
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges: implement workaround for LLVM-61763 in
>   zip_view and adjacency_view
>

This should be a complete sentence, so capital letter and full stop.

I can get this pushed to trunk and gcc-13 next week, thanks again for the
patch.



> ---
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 1d529a886be..7893e3a84c9 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -4632,6 +4632,9 @@ namespace views::__adaptor
>class zip_view<_Vs...>::_Iterator
>  : public __detail::__zip_view_iter_cat<_Const, _Vs...>
>{
> +#ifdef __clang__ // LLVM-61763 workaround
> +  public:
> +#endif
>
>  __detail::__tuple_or_pair_t _Vs>>...> _M_current;
>
>  constexpr explicit
> @@ -4652,11 +4655,13 @@ namespace views::__adaptor
> return input_iterator_tag{};
>  }
>
> +#ifndef __clang__ // LLVM-61763 workaround
>  template
>requires (view<_Ws> && ...) && (sizeof...(_Ws) > 0) &&
> is_object_v<_Fp>
> && regular_invocable<_Fp&, range_reference_t<_Ws>...>
> && std::__detail::__can_reference range_reference_t<_Ws>...>>
>friend class zip_transform_view;
> +#endif
>
>public:
>  // iterator_category defined in __zip_view_iter_cat
> @@ -5327,6 +5332,9 @@ namespace views::__adaptor
>template
>class adjacent_view<_Vp, _Nm>::_Iterator
>{
> +#ifdef __clang__ // LLVM-61763 workaround
> +  public:
> +#endif
>  using _Base = __detail::__maybe_const_t<_Const, _Vp>;
>  array, _Nm> _M_current = array,
> _Nm>();
>
> @@ -5367,12 +5375,14 @@ namespace views::__adaptor
>
>  friend class adjacent_view;
>
> +#ifndef __clang__ // LLVM-61763 workaround
>  template
>requires view<_Wp> && (_Mm > 0) && is_object_v<_Fp>
>  && regular_invocable<__detail::__unarize<_Fp&, _Mm>,
> range_reference_t<_Wp>>
>  &&
> std::__detail::__can_reference _Mm>,
>
> range_reference_t<_Wp>>>
>friend class adjacent_transform_view;
> +#endif
>
>public:
>  using iterator_category = input_iterator_tag;
>


Re: [PATCH] wide-int: Fix estimation of buffer sizes for wide_int printing [PR111800]

2023-10-14 Thread Richard Biener



> Am 14.10.2023 um 10:21 schrieb Jakub Jelinek :
> 
> Hi!
> 
> As mentioned in the PR, my estimations on needed buffer size for wide_int
> and especially widest_int printing were incorrect, I've used get_len ()
> in the estimations, but that is true only for !wi::neg_p (x) values.
> Under the hood, we have 3 ways to print numbers.
> print_decs which if
>  if ((wi.get_precision () <= HOST_BITS_PER_WIDE_INT)
>  || (wi.get_len () == 1))
> uses sprintf which always fits into WIDE_INT_PRINT_BUFFER_SIZE (positive or
> negative) and otherwise uses print_hex,
> print_decu which if
>  if ((wi.get_precision () <= HOST_BITS_PER_WIDE_INT)
>  || (wi.get_len () == 1 && !wi::neg_p (wi)))
> uses sprintf which always fits into WIDE_INT_PRINT_BUFFER_SIZE (positive
> only) and print_hex, which doesn't print most significant limbs which are
> zero and the first limb which is non-zero prints such that redundant 0
> hex digits aren't printed, while all limbs below that are printed with
> "%016" PRIx64.  For wi::neg_p (x) values, the first limb of the precision
> is always non-zero, so we print all the limbs for the precision.
> So, the current estimations are accurate if !wi::neg_p (x), or when
> print_decs will be used and x.get_len () == 1, otherwise we need to use
> estimation based on get_precision () rather than get_len ().
> 
> The following patch does that, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?

Can we somehow abstract this common pattern?

> The patch doesn't address what I've talked about earlier, that we might
> actually stop using print_hex when asked for print_dec{s,u} - we could for
> negative print_decs just negate and call print_decu, and in print_decu
> e.g. in a loop UNSIGNED wi::divmod_trunc by
> HOST_WIDE_INT_UC (1000) and print the 19 decimal digits of
> remainder if quotient is non-zero, otherwise non-padded rest, and then
> reshuffle the buffer.  And/or perhaps print_hex should also take signop
> and print negative hex constants as -0x. if asked for SIGNED.
> And finally, I think we should try to rewrite tree-ssa-ccp.cc bit-cp from
> widest_int to wide_int, even the earlier:
> PHI node value: CONSTANT 
> 0xffe2
>  (0x19)
> in the -fdump-tree-ccp-details dumps is horribly confusing when the
> type is say just 32-bit or 64-bit, and with the recent widest_int changes
> those are now around with > 32000 f hex digits in there.  Not to mention we 
> shouldn't
> really care about state of bits beyond the precision and I think we always
> have the type in question around (x.val is INTEGER_CST of the right type
> and we just to::widest it, just x.mask is widest_int).
> 
> 2023-10-14  Jakub Jelinek  
> 
>PR tree-optimization/111800
> gcc/
>* wide-int.cc (assert_deceq): Use wi.get_len () for buffer size
>estimation only if !wi::neg_p (wi) or if len is 1 and sgn is SIGNED,
>otherwise use WIDE_INT_MAX_HWIS for wi.get_precision ().
>(assert_hexeq): Use wi.get_len () for buffer size estimation only
>if !wi::neg_p (wi), otherwise use WIDE_INT_MAX_HWIS for
>wi.get_precision ().
>* wide-int-print.cc (print_decs): Use wi.get_len () for buffer size
>estimation only if !wi::neg_p (wi) or if len is 1, otherwise use
>WIDE_INT_MAX_HWIS for wi.get_precision ().
>(print_decu): Use wi.get_len () for buffer size estimation only if
>!wi::neg_p (wi), otherwise use WIDE_INT_MAX_HWIS for
>wi.get_precision ().
>(print_hex): Likewise.
>* value-range.cc (irange_bitmask::dump): Use get_len () for
>buffer size estimation only if !wi::neg_p (wi), otherwise use
>WIDE_INT_MAX_HWIS for get_precision ().
>* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
>Likewise.
>* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
>i_bound.get_len () for buffer size estimation only if
>!wi::neg_p (i_bound) or if len is 1 and !TYPE_UNSIGNED, otherwise use
>WIDE_INT_MAX_HWIS for i_bound.get_precision ().  Use TYPE_SIGN macro
>in print_dec call argument.
> gcc/c-family/
>* c-warn.cc (match_case_to_enum_1): Assert w.get_precision ()
>is smaller or equal to WIDE_INT_MAX_INL_PRECISION rather than
>w.get_len () is smaller or equal to WIDE_INT_MAX_INL_ELTS.
> 
> --- gcc/wide-int.cc.jj2023-10-13 19:34:44.288830022 +0200
> +++ gcc/wide-int.cc2023-10-13 20:23:12.889386810 +0200
> @@ -2450,7 +2450,9 @@ static void
> assert_deceq (const char *expected, const wide_int_ref &wi, signop sgn)
> {
>   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
>   unsigned len = wi.get_len ();
> +  if ((len != 1 || sgn == UNSIGNED) && wi::neg_p (wi))
> +len = WIDE_INT_MAX_HWIS (wi.get_precision ());
>   if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
> p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
>   print_dec (wi, p, 

[PATCH] wide-int, v2: Fix estimation of buffer sizes for wide_int printing [PR111800]

2023-10-14 Thread Jakub Jelinek
Hi!

On Sat, Oct 14, 2023 at 10:41:28AM +0200, Richard Biener wrote:
> Can we somehow abstract this common pattern?

So like this?  With room for the future tweaks like printing decimal
instead of hex numbers by print_dec*, where we'd only need to adjust
the inlines.  The XALLOCAVEC call is left for the callers, those would
make the inlines uninlinable and not doing what they should.

2023-10-14  Jakub Jelinek  

PR tree-optimization/111800
gcc/
* wide-int-print.h (print_dec_buf_size, print_decs_buf_size,
print_decu_buf_size, print_hex_buf_size): New inline functions.
* wide-int.cc (assert_deceq): Use print_dec_buf_size.
(assert_hexeq): Use print_hex_buf_size.
* wide-int-print.cc (print_decs): Use print_decs_buf_size.
(print_decu): Use print_decu_buf_size.
(print_hex): Use print_hex_buf_size.
(pp_wide_int_large): Use print_dec_buf_size.
* value-range.cc (irange_bitmask::dump): Use print_hex_buf_size.
* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
Likewise.
* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
print_dec_buf_size.  Use TYPE_SIGN macro in print_dec call argument.
gcc/c-family/
* c-warn.cc (match_case_to_enum_1): Assert w.get_precision ()
is smaller or equal to WIDE_INT_MAX_INL_PRECISION rather than
w.get_len () is smaller or equal to WIDE_INT_MAX_INL_ELTS.

--- gcc/wide-int-print.h.jj 2023-10-13 19:34:44.283830089 +0200
+++ gcc/wide-int-print.h2023-10-14 11:21:44.190603091 +0200
@@ -36,4 +36,40 @@ extern void print_hex (const wide_int_re
 extern void print_hex (const wide_int_ref &wi, FILE *file);
 extern void pp_wide_int_large (pretty_printer *, const wide_int_ref &, signop);
 
+inline bool
+print_dec_buf_size (const wide_int_ref &wi, signop sgn, unsigned int *len)
+{
+  unsigned int l = wi.get_len ();
+  if ((l != 1 || sgn == UNSIGNED) && wi::neg_p (wi))
+l = WIDE_INT_MAX_HWIS (wi.get_precision ());
+  l = l * HOST_BITS_PER_WIDE_INT / 4 + 4;
+  *len = l;
+  return UNLIKELY (l > WIDE_INT_PRINT_BUFFER_SIZE);
+}
+
+inline bool
+print_decs_buf_size (const wide_int_ref &wi, unsigned int *len)
+{
+  return print_dec_buf_size (wi, SIGNED, len);
+}
+
+inline bool
+print_decu_buf_size (const wide_int_ref &wi, unsigned int *len)
+{
+  return print_dec_buf_size (wi, UNSIGNED, len);
+}
+
+inline bool
+print_hex_buf_size (const wide_int_ref &wi, unsigned int *len)
+{
+  unsigned int l;
+  if (wi::neg_p (wi))
+l = WIDE_INT_MAX_HWIS (wi.get_precision ());
+  else
+l = wi.get_len ();
+  l = l * HOST_BITS_PER_WIDE_INT / 4 + 4;
+  *len = l;
+  return UNLIKELY (l > WIDE_INT_PRINT_BUFFER_SIZE);
+}
+
 #endif /* WIDE_INT_PRINT_H */
--- gcc/wide-int.cc.jj  2023-10-14 11:07:52.738850767 +0200
+++ gcc/wide-int.cc 2023-10-14 11:22:03.100347386 +0200
@@ -2450,9 +2450,9 @@ static void
 assert_deceq (const char *expected, const wide_int_ref &wi, signop sgn)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
-  unsigned len = wi.get_len ();
-  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
-p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
+  unsigned len;
+  if (print_dec_buf_size (wi, sgn, &len))
+p = XALLOCAVEC (char, len);
   print_dec (wi, p, sgn);
   ASSERT_STREQ (expected, p);
 }
@@ -2463,9 +2463,9 @@ static void
 assert_hexeq (const char *expected, const wide_int_ref &wi)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
-  unsigned len = wi.get_len ();
-  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
-p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
+  unsigned len;
+  if (print_hex_buf_size (wi, &len))
+p = XALLOCAVEC (char, len);
   print_hex (wi, p);
   ASSERT_STREQ (expected, p);
 }
--- gcc/wide-int-print.cc.jj2023-10-14 11:07:52.737850781 +0200
+++ gcc/wide-int-print.cc   2023-10-14 11:37:43.994623668 +0200
@@ -75,9 +75,9 @@ void
 print_decs (const wide_int_ref &wi, FILE *file)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
-  unsigned len = wi.get_len ();
-  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
-p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
+  unsigned len;
+  if (print_decs_buf_size (wi, &len))
+p = XALLOCAVEC (char, len);
   print_decs (wi, p);
   fputs (p, file);
 }
@@ -102,9 +102,9 @@ void
 print_decu (const wide_int_ref &wi, FILE *file)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
-  unsigned len = wi.get_len ();
-  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
-p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
+  unsigned len;
+  if (print_decu_buf_size (wi, &len))
+p = XALLOCAVEC (char, len);
   print_decu (wi, p);
   fputs (p, file);
 }
@@ -141,9 +141,9 @@ void
 print_hex (const wide_int_ref &wi, FILE *file)
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
-  unsigned len = wi.get_len ();
-  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
-p = XALLOCAVEC (char, len * HOST_BITS_PER_W

Re: [PATCH] wide-int, v2: Fix estimation of buffer sizes for wide_int printing [PR111800]

2023-10-14 Thread Richard Biener



> Am 14.10.2023 um 11:50 schrieb Jakub Jelinek :
> 
> Hi!
> 
>> On Sat, Oct 14, 2023 at 10:41:28AM +0200, Richard Biener wrote:
>> Can we somehow abstract this common pattern?
> 
> So like this?  With room for the future tweaks like printing decimal
> instead of hex numbers by print_dec*, where we'd only need to adjust
> the inlines.  The XALLOCAVEC call is left for the callers, those would
> make the inlines uninlinable and not doing what they should.

LGTM.

Richard 

> 2023-10-14  Jakub Jelinek  
> 
>PR tree-optimization/111800
> gcc/
>* wide-int-print.h (print_dec_buf_size, print_decs_buf_size,
>print_decu_buf_size, print_hex_buf_size): New inline functions.
>* wide-int.cc (assert_deceq): Use print_dec_buf_size.
>(assert_hexeq): Use print_hex_buf_size.
>* wide-int-print.cc (print_decs): Use print_decs_buf_size.
>(print_decu): Use print_decu_buf_size.
>(print_hex): Use print_hex_buf_size.
>(pp_wide_int_large): Use print_dec_buf_size.
>* value-range.cc (irange_bitmask::dump): Use print_hex_buf_size.
>* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
>Likewise.
>* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
>print_dec_buf_size.  Use TYPE_SIGN macro in print_dec call argument.
> gcc/c-family/
>* c-warn.cc (match_case_to_enum_1): Assert w.get_precision ()
>is smaller or equal to WIDE_INT_MAX_INL_PRECISION rather than
>w.get_len () is smaller or equal to WIDE_INT_MAX_INL_ELTS.
> 
> --- gcc/wide-int-print.h.jj2023-10-13 19:34:44.283830089 +0200
> +++ gcc/wide-int-print.h2023-10-14 11:21:44.190603091 +0200
> @@ -36,4 +36,40 @@ extern void print_hex (const wide_int_re
> extern void print_hex (const wide_int_ref &wi, FILE *file);
> extern void pp_wide_int_large (pretty_printer *, const wide_int_ref &, 
> signop);
> 
> +inline bool
> +print_dec_buf_size (const wide_int_ref &wi, signop sgn, unsigned int *len)
> +{
> +  unsigned int l = wi.get_len ();
> +  if ((l != 1 || sgn == UNSIGNED) && wi::neg_p (wi))
> +l = WIDE_INT_MAX_HWIS (wi.get_precision ());
> +  l = l * HOST_BITS_PER_WIDE_INT / 4 + 4;
> +  *len = l;
> +  return UNLIKELY (l > WIDE_INT_PRINT_BUFFER_SIZE);
> +}
> +
> +inline bool
> +print_decs_buf_size (const wide_int_ref &wi, unsigned int *len)
> +{
> +  return print_dec_buf_size (wi, SIGNED, len);
> +}
> +
> +inline bool
> +print_decu_buf_size (const wide_int_ref &wi, unsigned int *len)
> +{
> +  return print_dec_buf_size (wi, UNSIGNED, len);
> +}
> +
> +inline bool
> +print_hex_buf_size (const wide_int_ref &wi, unsigned int *len)
> +{
> +  unsigned int l;
> +  if (wi::neg_p (wi))
> +l = WIDE_INT_MAX_HWIS (wi.get_precision ());
> +  else
> +l = wi.get_len ();
> +  l = l * HOST_BITS_PER_WIDE_INT / 4 + 4;
> +  *len = l;
> +  return UNLIKELY (l > WIDE_INT_PRINT_BUFFER_SIZE);
> +}
> +
> #endif /* WIDE_INT_PRINT_H */
> --- gcc/wide-int.cc.jj2023-10-14 11:07:52.738850767 +0200
> +++ gcc/wide-int.cc2023-10-14 11:22:03.100347386 +0200
> @@ -2450,9 +2450,9 @@ static void
> assert_deceq (const char *expected, const wide_int_ref &wi, signop sgn)
> {
>   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
> -  unsigned len = wi.get_len ();
> -  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
> -p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
> +  unsigned len;
> +  if (print_dec_buf_size (wi, sgn, &len))
> +p = XALLOCAVEC (char, len);
>   print_dec (wi, p, sgn);
>   ASSERT_STREQ (expected, p);
> }
> @@ -2463,9 +2463,9 @@ static void
> assert_hexeq (const char *expected, const wide_int_ref &wi)
> {
>   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
> -  unsigned len = wi.get_len ();
> -  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
> -p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
> +  unsigned len;
> +  if (print_hex_buf_size (wi, &len))
> +p = XALLOCAVEC (char, len);
>   print_hex (wi, p);
>   ASSERT_STREQ (expected, p);
> }
> --- gcc/wide-int-print.cc.jj2023-10-14 11:07:52.737850781 +0200
> +++ gcc/wide-int-print.cc2023-10-14 11:37:43.994623668 +0200
> @@ -75,9 +75,9 @@ void
> print_decs (const wide_int_ref &wi, FILE *file)
> {
>   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
> -  unsigned len = wi.get_len ();
> -  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
> -p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
> +  unsigned len;
> +  if (print_decs_buf_size (wi, &len))
> +p = XALLOCAVEC (char, len);
>   print_decs (wi, p);
>   fputs (p, file);
> }
> @@ -102,9 +102,9 @@ void
> print_decu (const wide_int_ref &wi, FILE *file)
> {
>   char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p = buf;
> -  unsigned len = wi.get_len ();
> -  if (UNLIKELY (len > WIDE_INT_MAX_INL_ELTS))
> -p = XALLOCAVEC (char, len * HOST_BITS_PER_WIDE_INT / 4 + 4);
> +  unsigned len;
> +  if (print_decu_buf_size (wi, &len))
> +p = XALLOCAVEC (char, len);
>   print_decu (wi, p);
>   fputs (p, file);
> }
> @@ -141,9 +141,9 @@ void
> print_hex 

[committed] d: Reduce code duplication of writing generated files.

2023-10-14 Thread Iain Buclaw
Hi,

This is a small refactoring ahead of the next merge from upstream, where
a few more front-end routines will stop doing the file handling
themselves.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* d-lang.cc (d_write_file): New function.
(d_parse_file): Reduce code duplication.
---
 gcc/d/d-lang.cc | 91 -
 1 file changed, 29 insertions(+), 62 deletions(-)

diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index 7dddcf5ac91..f290acf494b 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -978,6 +978,30 @@ d_add_builtin_module (Module *m)
   builtin_modules.push (m);
 }
 
+/* Writes to FILENAME.  DATA is the full content of the file to be written.  */
+
+static void
+d_write_file (const char *filename, const char *data)
+{
+  FILE *stream;
+
+  if (filename && (filename[0] != '-' || filename[1] != '\0'))
+stream = fopen (filename, "w");
+  else
+stream = stdout;
+
+  if (!stream)
+{
+  error ("unable to open %s for writing: %m", filename);
+  return;
+}
+
+  fprintf (stream, "%s", data);
+
+  if (stream != stdout && (ferror (stream) || fclose (stream)))
+error ("writing output file %s: %m", filename);
+}
+
 /* Implements the lang_hooks.parse_file routine for language D.  */
 
 static void
@@ -1264,8 +1288,6 @@ d_parse_file (void)
   if (d_option.deps)
 {
   obstack buffer;
-  FILE *deps_stream;
-
   gcc_obstack_init (&buffer);
 
   for (size_t i = 0; i < modules.length; i++)
@@ -1275,27 +1297,8 @@ d_parse_file (void)
   if (d_option.deps_filename_user)
d_option.deps_filename = d_option.deps_filename_user;
 
-  if (d_option.deps_filename)
-   {
- deps_stream = fopen (d_option.deps_filename, "w");
- if (!deps_stream)
-   {
- fatal_error (input_location, "opening dependency file %s: %m",
-  d_option.deps_filename);
- goto had_errors;
-   }
-   }
-  else
-   deps_stream = stdout;
-
-  fprintf (deps_stream, "%s", (char *) obstack_finish (&buffer));
-
-  if (deps_stream != stdout
- && (ferror (deps_stream) || fclose (deps_stream)))
-   {
- fatal_error (input_location, "closing dependency file %s: %m",
-  d_option.deps_filename);
-   }
+  d_write_file (d_option.deps_filename,
+   (char *) obstack_finish (&buffer));
 }
 
   if (global.params.vtemplates)
@@ -1306,29 +1309,7 @@ d_parse_file (void)
 {
   OutBuffer buf;
   json_generate (&buf, &modules);
-
-  const char *name = global.params.json.name.ptr;
-  FILE *json_stream;
-
-  if (name && (name[0] != '-' || name[1] != '\0'))
-   {
- const char *nameext
-   = FileName::defaultExt (name, json_ext.ptr);
- json_stream = fopen (nameext, "w");
- if (!json_stream)
-   {
- fatal_error (input_location, "opening json file %s: %m", nameext);
- goto had_errors;
-   }
-   }
-  else
-   json_stream = stdout;
-
-  fprintf (json_stream, "%s", buf.peekChars ());
-
-  if (json_stream != stdout
- && (ferror (json_stream) || fclose (json_stream)))
-   fatal_error (input_location, "closing json file %s: %m", name);
+  d_write_file (global.params.json.name.ptr, buf.peekChars ());
 }
 
   /* Generate Ddoc files.  */
@@ -1391,22 +1372,8 @@ d_parse_file (void)
   /* We want to write the mixin expansion file also on error.  */
   if (global.params.mixinOut.doOutput)
 {
-  FILE *mixin_stream = fopen (global.params.mixinOut.name.ptr, "w");
-
-  if (mixin_stream)
-   {
- OutBuffer *buf = global.params.mixinOut.buffer;
- fprintf (mixin_stream, "%s", buf->peekChars ());
-
- if (ferror (mixin_stream) || fclose (mixin_stream))
-   fatal_error (input_location, "closing mixin file %s: %m",
-global.params.mixinOut.name.ptr);
-   }
-  else
-   {
- fatal_error (input_location, "opening mixin file %s: %m",
-  global.params.mixinOut.name.ptr);
-   }
+  d_write_file (global.params.mixinOut.name.ptr,
+   global.params.mixinOut.buffer->peekChars ());
 }
 
   /* Remove generated .di files on error.  */
-- 
2.39.2



[committed] Fix ICE in set_cell_span, at text-art/table.cc:148 with D front-end and -fanalyzer

2023-10-14 Thread Iain Buclaw
Hi,

The internal error in analyzer turned out to be caused by a subtly
invalid tree representation of STRING_CSTs in the D front-end, fixed by
including the terminating NULL as part of the TREE_STRING_POINTER.

When adding a first analyzer test for D, it flagged up another subtle
mismatch in one assignment in the module support routines as well, fixed
by generating the correct field type for the compiler-generated struct.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, and
committed to mainline.

Regards,
Iain.

---
PR d/111537

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (StringExp *)): Include null terminator
in STRING_CST string.
* modules.cc (get_compiler_dso_type): Generate ModuleInfo** type for
the minfo fields.

gcc/testsuite/ChangeLog:

* gdc.dg/analyzer/analyzer.exp: New test.
* gdc.dg/analyzer/pr111537.d: New test.
---
 gcc/d/expr.cc  |  6 +--
 gcc/d/modules.cc   |  9 ++--
 gcc/testsuite/gdc.dg/analyzer/analyzer.exp | 51 ++
 gcc/testsuite/gdc.dg/analyzer/pr111537.d   |  7 +++
 4 files changed, 66 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/analyzer/analyzer.exp
 create mode 100644 gcc/testsuite/gdc.dg/analyzer/pr111537.d

diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index 7038655bc94..551d004c241 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -2535,13 +2535,13 @@ public:
   {
/* Copy the string contents to a null terminated string.  */
dinteger_t length = (e->len * e->sz);
-   char *string = XALLOCAVEC (char, length + 1);
+   char *string = XALLOCAVEC (char, length + e->sz);
+   memset (string, 0, length + e->sz);
if (length > 0)
  memcpy (string, e->string, length);
-   string[length] = '\0';
 
/* String value and type includes the null terminator.  */
-   tree value = build_string (length, string);
+   tree value = build_string (length + e->sz, string);
TREE_TYPE (value) = make_array_type (tb->nextOf (), length + 1);
value = build_address (value);
 
diff --git a/gcc/d/modules.cc b/gcc/d/modules.cc
index f2180d30546..8d6c8f0f9ad 100644
--- a/gcc/d/modules.cc
+++ b/gcc/d/modules.cc
@@ -277,12 +277,13 @@ get_compiler_dso_type (void)
   DECL_CHAIN (field) = fields;
   fields = field;
 
-  field = create_field_decl (build_pointer_type (get_moduleinfo_type ()),
-NULL, 1, 1);
+  tree moduleinfo_ptr_ptr_type =
+build_pointer_type (build_pointer_type (get_moduleinfo_type ()));
+
+  field = create_field_decl (moduleinfo_ptr_ptr_type, NULL, 1, 1);
   DECL_CHAIN (field) = fields;
   fields = field;
-  field = create_field_decl (build_pointer_type (get_moduleinfo_type ()),
-NULL, 1, 1);
+  field = create_field_decl (moduleinfo_ptr_ptr_type, NULL, 1, 1);
   DECL_CHAIN (field) = fields;
   fields = field;
 
diff --git a/gcc/testsuite/gdc.dg/analyzer/analyzer.exp 
b/gcc/testsuite/gdc.dg/analyzer/analyzer.exp
new file mode 100644
index 000..7b82b8e0cd1
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/analyzer/analyzer.exp
@@ -0,0 +1,51 @@
+#  Copyright (C) 2023 Free Software Foundation, Inc.
+
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it under
+#  the terms of the GNU General Public License as published by the Free
+#  Software Foundation; either version 3, or (at your option) any later
+#  version.
+#
+#  GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+#  WARRANTY; without even the implied warranty of MERCHANTABILITY or
+#  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+#  for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  .
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gdc-dg.exp
+
+# If the analyzer has not been enabled, bail.
+if { ![check_effective_target_analyzer] } {
+return
+}
+
+global DEFAULT_DFLAGS
+if [info exists DEFAULT_DFLAGS] then {
+  set save_default_dflags $DEFAULT_DFLAGS
+}
+
+# If a testcase doesn't have special options, use these.
+set DEFAULT_DFLAGS "-fanalyzer -Wanalyzer-too-complex 
-fanalyzer-call-summaries"
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+gdc-dg-runtest [lsort \
+   [glob -nocomplain $srcdir/$subdir/*.d ] ] "" $DEFAULT_DFLAGS
+
+# All done.
+dg-finish
+
+if [info exists save_default_dflags] {
+  set DEFAULT_DFLAGS $save_default_dflags
+} else {
+  unset DEFAULT_DFLAGS
+}
diff --git a/gcc/testsuite/gdc.dg/analyzer/pr111537.d 
b/gcc/testsuite/gdc.dg/analyzer/pr111537.d
new file mode 100644
index 000..e50b05a3f79
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/analyzer/pr111537.d
@@ -0,0 +1,7 @@
+// { dg-do compile }
+import core.stdc.string;
+void main()
+{
+char[5] arr;
+str

[C PATCH] error for function with external and internal linkage [PR111708]

2023-10-14 Thread Martin Uecker


Bootstrapped and regression tested on x86_64.


c: error for function with external and internal linkage [PR111708]

Declaring a function with both external and internal linkage
in the same TU is translation-time UB.  Add an error for this
case as already done for objects.

PR c/111708

gcc/c/Changelog:

* c-decl.cc (grokdeclarator): Add error.

gcc/testsuite/Changelog:

* gcc.dg/pr111708-1.c: New test.
* gcc.dg/pr111708-2.c: New test.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 5822faf01b4..52490a784d0 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -8032,6 +8032,28 @@ grokdeclarator (const struct c_declarator *declarator,
TREE_THIS_VOLATILE (decl) = 1;
  }
  }
+
+   /* C99 6.2.2p7: It is invalid (compile-time undefined
+  behavior) to create an 'extern' declaration for a
+  function if there is a global declaration that is
+  'static' and the global declaration is not visible.
+  (If the static declaration _is_ currently visible,
+  the 'extern' declaration is taken to refer to that decl.) */
+   if (!initialized
+   && storage_class != csc_static
+   && storage_class != csc_auto
+   && current_scope != file_scope)
+ {
+   tree global_decl  = identifier_global_value (declarator->u.id.id);
+   tree visible_decl = lookup_name (declarator->u.id.id);
+
+   if (global_decl
+   && global_decl != visible_decl
+   && VAR_OR_FUNCTION_DECL_P (global_decl)
+   && !TREE_PUBLIC (global_decl))
+ error_at (loc, "function previously declared % "
+   "redeclared %");
+ }
   }
 else
   {
diff --git a/gcc/testsuite/gcc.dg/pr111708-1.c 
b/gcc/testsuite/gcc.dg/pr111708-1.c
new file mode 100644
index 000..4af7f53d75f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111708-1.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+extern int a(void);// external linkage (6.2.2p4)
+static int a(void);/* { dg-error "static declaration of 'a' follows 
non-static declaration" } */
+
+static int b(void);// internal linkage (6.2.2p3)
+extern int b(void);// internal linkage (6.2.2p4)
+
+static int h0(void);
+
+void s(void)
+{
+   extern int h0(void);// internal linkage (6.2.2p4),
+   extern int h0(void);// internal linkage (6.2.2p4), redeclaration, ok
+   extern int h2(void);// external linkage (6.2.2p4)
+   extern int h2(void);// external linkage (6.2.2p4), redeclaration, 
ok.
+}
+
+
+extern int i(void);// external linkage (6.2.2p4)
+static int j(void);// internal linkage (6.2.2p3)
+
+void bar(void)
+{
+   extern int i(void); // external linkage (6.2.2p4), ok
+}
+
+void foo(void)
+{
+   extern int j(void); // internal linkage (6.2.2p4), ok, internal
+}
+
+void x(void)
+{
+   int i(void);// no linkage (6.2.2p6)
+   int j;  // no linkage (6.2.2p6)
+   {
+   extern int j(void); /* { dg-error "function previously 
declared 'static' redeclared 'extern'" } */
+   }
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr111708-2.c 
b/gcc/testsuite/gcc.dg/pr111708-2.c
new file mode 100644
index 000..065c0525c2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111708-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+/* { dg-require-effective-target trampolines } */
+
+static void pp(void)
+{
+   int pp;
+   {
+   auto void pp(void);
+   void pp(void) { }
+   }
+}
+
+static void q2(void);
+
+static void qq(void)
+{
+   auto void q2(void);
+   void q2(void) { }
+}
+



Re: [PATCH] c++: fix truncated diagnostic in C++23 [PR111272]

2023-10-14 Thread Jason Merrill

On 10/13/23 18:15, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
In C++23, since P2448, a constexpr function F that calls a non-constexpr
function N is OK as long as we don't actually call F in a constexpr
context.  So instead of giving an error in maybe_save_constexpr_fundef,
we only give an error when evaluating the call.  Unfortunately, as shown
in this PR, the diagnostic can be truncated:

z.C:10:13: note: 'constexpr Jam::Jam()' is not usable as a 'constexpr' function 
because:
10 |   constexpr Jam() { ft(); }
   | ^~~

...because what?  With this patch, we say:

z.C:10:13: note: 'constexpr Jam::Jam()' is not usable as a 'constexpr' function 
because:
10 |   constexpr Jam() { ft(); }
   | ^~~
z.C:10:23: error: call to non-'constexpr' function 'int Jam::ft()'
10 |   constexpr Jam() { ft(); }
   | ~~^~
z.C:8:7: note: 'int Jam::ft()' declared here
 8 |   int ft() { return 42; }
   |   ^~

Like maybe_save_constexpr_fundef, explain_invalid_constexpr_fn should
also check the body of a constructor, not just the mem-initializer.

PR c++/111272

gcc/cp/ChangeLog:

* constexpr.cc (explain_invalid_constexpr_fn): Also check the body of
a constructor in C++14 and up.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-diag1.C: New test.
---
  gcc/cp/constexpr.cc  | 10 +-
  gcc/testsuite/g++.dg/cpp1y/constexpr-diag1.C | 21 
  2 files changed, 30 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-diag1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 0f948db7c2d..dde4fec4a44 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1098,7 +1098,15 @@ explain_invalid_constexpr_fn (tree fun)
  body = massage_constexpr_body (fun, body);
  require_potential_rvalue_constant_expression (body);
  if (DECL_CONSTRUCTOR_P (fun))
-   cx_check_missing_mem_inits (DECL_CONTEXT (fun), body, true);
+   {
+ cx_check_missing_mem_inits (DECL_CONTEXT (fun), body, true);
+ if (cxx_dialect > cxx11)
+   {
+ /* Also check the body, not just the ctor-initializer.  */
+ body = DECL_SAVED_TREE (fun);
+ require_potential_rvalue_constant_expression (body);
+   }
+   }
}
  }
  }
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-diag1.C 
b/gcc/testsuite/g++.dg/cpp1y/constexpr-diag1.C
new file mode 100644
index 000..0e2909e83ef
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-diag1.C
@@ -0,0 +1,21 @@
+// PR c++/111272
+// { dg-do compile { target c++14 } }
+// { dg-options "-Werror=invalid-constexpr" }
+// { dg-prune-output "some warnings being treated as errors" }
+
+struct Jam
+{
+  // constexpr  // n.b.
+  int ft() { return 42; } // { dg-message "declared here" }
+
+  constexpr Jam() { ft(); } // { dg-error "call to non-.constexpr. function" }
+// { dg-message "declared here" "" { target c++20_down } .-1 }
+};
+
+constexpr bool test()
+{
+  Jam j; // { dg-error "called in a constant expression" }
+  return true;
+}
+
+static_assert(test(), ""); // { dg-error "non-constant condition" }

base-commit: d78fef5371759849944966dec65d9e987efba509




Re: [Patch] libgomp.texi: Note to 'Memory allocation' sect and missing mem-memory routines

2023-10-14 Thread Sandra Loosemore

On 10/12/23 04:53, Tobias Burnus wrote:


libgomp.texi: Note to 'Memory allocation' sect and missing mem-memory routines

This commit completes the documentation of the OpenMP memory-management
routines, except for the unimplemented TR11 additions.  It also makes clear
in the 'Memory allocation' section of the 'OpenMP-Implementation Specifics'
chapter under which condition OpenMP managed memory/allocators are used.

libgomp/ChangeLog:

* libgomp.texi: Fix some typos.
(Memory Management Routines): Document remaining 5.x routines.
(Memory allocation): Make clear when the section applies.

 libgomp/libgomp.texi | 382 +--
 1 file changed, 367 insertions(+), 15 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 0d965f96d48..3fc9c7dea23 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -1917,7 +1917,7 @@ is not supported.
 @item @emph{Description}:
 If the device number is refers to the initial device or to a device with


s/is refers/refers/


 memory accessible from the host (shared memory), the @code{omp_get_mapped_ptr}
-routines returnes the value of the passed @var{ptr}.  Otherwise, if associated
+routines returns the value of the passed @var{ptr}.  Otherwise, if associated
 storage to the passed host pointer @var{ptr} exists on device associated with


s/on device/on the device/ (or maybe /on a device/?)


+@node omp_alloc
+@subsection @code{omp_alloc} -- Memory allocation with an allocator
+@table @asis
+@item @emph{Description}:
+Allocate memory with the specified allocator, which can either be a predefined
+allocator, an allocator handle or @code{omp_null_allocator}.  If the allocators
+is @code{omp_null_allocator}, the allocator specified by the


Minimally, s/allocators is/allocator is/, or maybe something like /allocator 
argument is/, and/or add appropriate markup on allocator here to indicate that 
it is an argument.



+@var{def-allocator-var} ICV is used.  @var{size} must be a nonnegative number
+denoting the number of bytes to be allocated; if @var{size} is zero,
+@code{omp_alloc} will return a null pointer.  If successful, a pointer to the


s/will return/returns/


+allocated memory is returned, otherwise the @code{fallback} trait of the
+allocator determines the behavior.  The content of the allocated memory is
+unspecified.
+
+In @code{target} regions, either the @code{dynamic_allocators} clause must
+appear on a @code{requires} directive in the same compilation unit -- or the


s/ -- or/, or/


+@var{allocator} argument may only be a constant expression with the value of


s/may only be/must be/


+@node omp_aligned_alloc
+@subsection @code{omp_aligned_alloc} -- Memory allocation with an allocator 
and alignment
+@table @asis
+@item @emph{Description}:
+Allocate memory with the specified allocator, which can either be a predefined
+allocator, an allocator handle or @code{omp_null_allocator}.  If the allocators
+is @code{omp_null_allocator}, the allocator specified by the


Ditto above comments re "allocators is".


+@var{def-allocator-var} ICV is used.  @var{alignment} must be a positive power
+of two and @var{size} must be a nonnegative number that is a multiple of the
+alignment and denotes the number of bytes to be allocated; if @var{size} is
+zero, @code{omp_aligned_alloc} will return a null pointer.  The alignment will
+be at least the maximal value required by @code{alignment} trait of the


s/will return/return/

"The alignment will be" needs to be rewritten to avoid future tense too, but 
I'm not sure what you're trying to say here.  Is this a requirement on the 
alignment argument or a statement about the actual alignment of the allocated 
memory?



+allocator and the value of the  passed @var{alignment} argument.  If 
successful,
+a pointer to the allocated memory is returned, otherwise the @code{fallback}
+trait of the allocator determines the behavior.  The content of the allocated
+memory is unspecified.
+
+In @code{target} regions, either the @code{dynamic_allocators} clause must
+appear on a @code{requires} directive in the same compilation unit -- or the
+@var{allocator} argument may only be a constant expression with the value of
+one of the predefined allocators and may not be @code{omp_null_allocator}.


Ditto above comments re using comma and "must".


+@node omp_free
+@subsection @code{omp_free} -- Freeing memory allocated with OpenMP routines
+@table @asis
+@item @emph{Description}:
+The @code{omp_free} routine deallocates memory previously allocated by an
+OpenMP memory-management routine. The @var{ptr} argument must point to such
+memory or be a null pointer; if it is a null pointer, no operation is
+performed.  If specified, the @var{allocator} argument must be either the
+memory allocator that was used for the allocation or @code{omp_null_allocator};
+if it is @code{omp_null_allocator}, the implementation will determine the value


s/will determine/dete

[committed] libgomp.fortran/allocate-6.f90: Run with -fdump-tree-gimple (was: [Patch] OpenMP: Add ME support for 'omp allocate' stack variables)

2023-10-14 Thread Tobias Burnus

I wonder why I missed the following – but it now works :-/

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
commit fd6b17a4892684f286274d874f0029604feda7e5
Author: Tobias Burnus 
Date:   Sat Oct 14 20:09:34 2023 +0200

libgomp.fortran/allocate-6.f90: Run with -fdump-tree-gimple

libgomp/
* testsuite/libgomp.fortran/allocate-6.f90: Add missing
dg-additional-options "-fdump-tree-gimple"; fix scan.

diff --git a/libgomp/testsuite/libgomp.fortran/allocate-6.f90 b/libgomp/testsuite/libgomp.fortran/allocate-6.f90
index 5c32652f2a6..5034dd26b2e 100644
--- a/libgomp/testsuite/libgomp.fortran/allocate-6.f90
+++ b/libgomp/testsuite/libgomp.fortran/allocate-6.f90
@@ -1,3 +1,4 @@
+! { dg-additional-options "-fdump-tree-gimple" }
 module m
   use iso_c_binding
   use omp_lib
@@ -23,10 +24,10 @@ subroutine one ()
   !$omp allocate(var,var2) align(128) allocator(omp_low_lat_mem_alloc)
   var = 5
 ! { dg-final { scan-tree-dump-times "var\\.\[0-9\]+ = __builtin_GOMP_alloc \\(128, 4, 5\\);" 1 "gimple" } } */
-! { dg-final { scan-tree-dump-times "var2\\.\[0-9\]+ = __builtin_GOMP_alloc \\(128, D\\.\[0-9\]+, 5\\);" 1 "gimple" } } */
+! { dg-final { scan-tree-dump-times "var2 = __builtin_GOMP_alloc \\(128, D\\.\[0-9\]+, 5\\);" 1 "gimple" } } */
 
 ! { dg-final { scan-tree-dump-times "__builtin_GOMP_free \\(var\\.\[0-9\]+, 0B\\);" 1 "gimple" } } */
-! { dg-final { scan-tree-dump-times "__builtin_GOMP_free \\(var2\\.\[0-9\]+, 0B\\);" 1 "gimple" } } */
+! { dg-final { scan-tree-dump-times "__builtin_GOMP_free \\(var2, 0B\\);" 1 "gimple" } } */
 
   if (mod(transfer(loc(var), intptr), 128_c_intptr_t) /= 0) &
 stop 1


[patch] libgomp.texi: Update "Enabling OpenMP"

2023-10-14 Thread Tobias Burnus

When browsing libgomp doc, I came across
https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html

First, I found especially the Fortran part difficult to read. Secondly,
it missed the C++ attribute syntax. And I also missed a reference to
-fopenmp-simd.

The attached patch tries to improve this. Note that it talks about C and
C++ attributes, even though C23's [[omp::]] support has not yet landed.
(But is expected very soon.)

I also do not try to list what -fopenmp-simd supports as that's at
https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#index-fopenmp
and I bet we won't keep both in sync and "man gcc" is more likely to be
up to date.

Comments?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp.texi: Update "Enabling OpenMP"

libgomp/
	* libgomp.texi (Enabling OpenMP): Update for C/C++ attributes;
	improve wording especially for Fortran; mention -fopenmp-simd.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 526d1be2955..d8126f96fe4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -136,15 +136,22 @@ changed to GNU Offloading and Multi Processing Runtime Library.
 @node Enabling OpenMP
 @chapter Enabling OpenMP
 
-To activate the OpenMP extensions for C/C++ and Fortran, the compile-time 
-flag @command{-fopenmp} must be specified.  This enables the OpenMP directive
-@code{#pragma omp} in C/C++ and @code{!$omp} directives in free form, 
-@code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form, 
-@code{!$} conditional compilation sentinels in free form and @code{c$},
-@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
-arranges for automatic linking of the OpenMP runtime library 
+To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
+flag @command{-fopenmp} must be specified.  For C and C++, this enables
+the handling of the OpenMP directives using @code{#pragma omp} and the
+@code{[[omp::directive(...)]]}, @code{[[omp::sequence(...)]]} and
+@code{[[omp::decl(...)]]} attributes.  For Fortran, it enables for
+free source form the @code{!$omp} sentinel for directives and the
+@code{!$} conditional compilation sentinel and for fixed source form the
+@code{c$omp}, @code{*$omp} and @code{!$omp} sentinels for directives and
+the @code{c$}, @code{*$} and @code{!$} conditional compilation sentinels.
+The flag also arranges for automatic linking of the OpenMP runtime library
 (@ref{Runtime Library Routines}).
 
+The @command{-fopenmp-simd} flag can be used to enable a subset of
+OpenMP directives, which do not require the linking of neither the
+OpenMP runtime library nor the POSIX threads library.
+
 A complete description of all OpenMP directives may be found in the
 @uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
 See also @ref{OpenMP Implementation Status}.



[patch] libgomp.texi: Improve "OpenACC Environment Variables"

2023-10-14 Thread Tobias Burnus

I was recently suggesting someone to use ACC_DEVICE_TYPE; I did not point to the
documentation it only contains the title:
  https://gcc.gnu.org/onlinedocs/libgomp/ACC_005fDEVICE_005fTYPE.html

Admittedly, there is also "Reference:" pointing to the OpenACC specification,
but the latter just states:

"The allowed values of this environment variable are implementation-defined. 
See the
release notes for currently-supported values of this environment variable."

Thus, I now added documentation for the three ACC_ env vars.
I removed GCC_ACC_NOTIFY as it does not seem to exist - and it looks as if it
never did ... (The Nvidia compiler seems to support ACC_NOTIFY with a similar
usage as GOMP_DEBUG and GCN_DEBUG). It was added in GCC 6, which implies that
it also won't be added again. (I also checked OG9 and OG13 and neither contained
it; I don't know whether some older OG did.)

Comments?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp.texi: Improve "OpenACC Environment Variables"

None of the ACC_* env vars was documented; in particular, the valid valids
for ACC_DEVICE_TYPE found to be lacking as those are not document in the
OpenACC spec.
GCC_ACC_NOTIFY was removed as I find any traces of it but the addition
to the documentation in commit r6-6185-gcdf6119dad04dd ("libgomp.texi:
Updates for OpenACC.").  It seems to be planned as GCC version of
the ACC_NOTIFY env var used by another compiler for offloading debugging.

libgomp/
	* libgomp.tex (ACC_DEVICE_TYPE, ACC_DEVICE_NUM, ACC_PROFLIB):
	Actually document what the function does.
	(GCC_ACC_NOTIFY): Remove unused env var.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 526d1be2955..941525d013d 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -4989,13 +4989,11 @@ The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
 are defined by section 4 of the OpenACC specification in version 2.0.
 The variable @env{ACC_PROFLIB}
 is defined by section 4 of the OpenACC specification in version 2.6.
-The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 
 @menu
 * ACC_DEVICE_TYPE::
 * ACC_DEVICE_NUM::
 * ACC_PROFLIB::
-* GCC_ACC_NOTIFY::
 @end menu
 
 
@@ -5003,6 +5001,17 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 @node ACC_DEVICE_TYPE
 @section @code{ACC_DEVICE_TYPE}
 @table @asis
+@item @emph{Description}:
+Control the default device type to use when executing compute regions.
+If unset, the code can be run on any device type, favoring a non-host
+device type.
+
+Supported value in GCC (if compiled in) are
+@itemize
+@item @code{host}
+@item @code{nvidia}
+@item @code{radeon}
+@end itemize
 @item @emph{Reference}:
 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
 4.1.
@@ -5013,6 +5022,10 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 @node ACC_DEVICE_NUM
 @section @code{ACC_DEVICE_NUM}
 @table @asis
+@item @emph{Description}:
+Control which device, identified by device number, is the default device.
+The number must be a nonnegative integer number less that the number of
+devices.  If unset, device number zero is used.
 @item @emph{Reference}:
 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
 4.2.
@@ -5023,6 +5036,11 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 @node ACC_PROFLIB
 @section @code{ACC_PROFLIB}
 @table @asis
+@item @emph{Description}:
+Semicolon separated list of dynamic libraries to be loaded as profiling library.
+The library file is found as described by the documentation of @code{dlopen} of
+your operating system.  Each library must implement at least the routine
+@code{acc_register_library} routine.
 @item @emph{See also}:
 @ref{acc_register_library}, @ref{OpenACC Profiling Interface}
 
@@ -5033,15 +5051,6 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 
 
 
-@node GCC_ACC_NOTIFY
-@section @code{GCC_ACC_NOTIFY}
-@table @asis
-@item @emph{Description}:
-Print debug information pertaining to the accelerator.
-@end table
-
-
-
 @c -
 @c CUDA Streams Usage
 @c -


Re: [Patch] libgomp.texi: Clarify OMP_TARGET_OFFLOAD=mandatory

2023-10-14 Thread Sandra Loosemore

On 10/12/23 10:37, Tobias Burnus wrote:

@@ -3133,15 +3134,25 @@ variable can be set to one of three values - 
@code{MANDATORY}, @code{DISABLED}
 or @code{DEFAULT}.
 
 If set to @code{MANDATORY}, the program will terminate with an error if

-the offload device is not present or is not supported.  If set to
-@code{DISABLED}, then offloading is disabled and all code will run on the
-host. If set to @code{DEFAULT}, the program will try offloading to the
+any device construct or device memory routine uses a device that is unavailable
+or not supported by the implementation, or uses a non-conforming device number.
+If set to @code{DISABLED}, then offloading is disabled and all code will run on
+the host. If set to @code{DEFAULT}, the program will try offloading to the
 device first, then fall back to running code on the host if it cannot.
 
 If undefined, then the program will behave as if @code{DEFAULT} was set.
 
+Note: Even with @code{MANDATORY}, there will be no run-time termination when

+the device number in a @code{device} clause or argument to a device memory
+routine is for host, which includes using the device number in the
+@var{default-device-var} ICV.  However, the initial value of
+the @var{default-device-var} ICV is affected by @code{MANDATORY}.


Can we please rewrite this whole section in the present tense?  E.g.

s/will terminate/terminates/
s/will run/runs/
s/will try ... fall back/tries ... falls back/
s/will behave/behaves/
s/will be/is/

-Sandra




Re: [patch] libgomp.texi: Update "Enabling OpenMP"

2023-10-14 Thread Sandra Loosemore

On 10/14/23 13:43, Tobias Burnus wrote:

When browsing libgomp doc, I came across
https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html>> 
First, I found especially the Fortran part difficult to read. Secondly,

it missed the C++ attribute syntax. And I also missed a reference to
-fopenmp-simd.

The attached patch tries to improve this. Note that it talks about C and
C++ attributes, even though C23's [[omp::]] support has not yet landed.
(But is expected very soon.)


Is somebody actively working on implementing that, and expecting to get it in 
before Stage 1 closes?  I don't think we should document features that don't 
exist yet.  This syntax for C also isn't even in the draft OpenMP 6.0 document 
so at this point it's just a hypothetical extension.  To me it seems a better 
use of resources to finish implementing things that are actually in earlier 
versions of the OpenMP standard, and to fill in documentation for features that 
are actually implemented.


Other than that...


diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 526d1be2955..d8126f96fe4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -136,15 +136,22 @@ changed to GNU Offloading and Multi Processing Runtime 
Library.
 @node Enabling OpenMP
 @chapter Enabling OpenMP
 
-To activate the OpenMP extensions for C/C++ and Fortran, the compile-time 
-flag @command{-fopenmp} must be specified.  This enables the OpenMP directive
-@code{#pragma omp} in C/C++ and @code{!$omp} directives in free form, 
-@code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form, 
-@code{!$} conditional compilation sentinels in free form and @code{c$},

-@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
-arranges for automatic linking of the OpenMP runtime library 
+To activate the OpenMP extensions for C/C++ and Fortran, the compile-time

+flag @command{-fopenmp} must be specified.  For C and C++, this enables


Use @option markup on options, not @command.


+the handling of the OpenMP directives using @code{#pragma omp} and the
+@code{[[omp::directive(...)]]}, @code{[[omp::sequence(...)]]} and
+@code{[[omp::decl(...)]]} attributes.  For Fortran, it enables for
+free source form the @code{!$omp} sentinel for directives and the
+@code{!$} conditional compilation sentinel and for fixed source form the
+@code{c$omp}, @code{*$omp} and @code{!$omp} sentinels for directives and
+the @code{c$}, @code{*$} and @code{!$} conditional compilation sentinels.
+The flag also arranges for automatic linking of the OpenMP runtime library
 (@ref{Runtime Library Routines}).


And I think all those @code markups should be @samp instead, or you could just 
replace this whole blurb with something like "This flag enables recognition of 
the directive syntax documented in the OpenMP specification, and also arranges 
for automatic linking..."



+The @command{-fopenmp-simd} flag can be used to enable a subset of


This should be @option too.


+OpenMP directives, which do not require the linking of neither the


s/, which/ that/
s/neither/either/


+OpenMP runtime library nor the POSIX threads library.
+
 A complete description of all OpenMP directives may be found in the
 @uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
 See also @ref{OpenMP Implementation Status}.


-Sandra


Re: [patch] libgomp.texi: Update "Enabling OpenMP"

2023-10-14 Thread Jakub Jelinek
On Sat, Oct 14, 2023 at 03:46:52PM -0600, Sandra Loosemore wrote:
> On 10/14/23 13:43, Tobias Burnus wrote:
> > When browsing libgomp doc, I came across
> > https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html>> First, I
> > found especially the Fortran part difficult to read. Secondly,
> > it missed the C++ attribute syntax. And I also missed a reference to
> > -fopenmp-simd.
> > 
> > The attached patch tries to improve this. Note that it talks about C and
> > C++ attributes, even though C23's [[omp::]] support has not yet landed.
> > (But is expected very soon.)
> 
> Is somebody actively working on implementing that, and expecting to get it
> in before Stage 1 closes?  I don't think we should document features that

I am (attached is a WIP, which can now compile most of g++.dg/gomp/attrs-1.C
in -std=c2x -fopenmp, except for the scan/section directives).
That said, I agree it might be premature to document it before it is in.

> don't exist yet.  This syntax for C also isn't even in the draft OpenMP 6.0
> document so at this point it's just a hypothetical extension.

It is in OpenMP spec git and it is very unlikely it would be removed.

Jakub
--- gcc/cp/parser.h.jj  2023-09-20 08:42:51.987008923 +0200
+++ gcc/cp/parser.h 2023-10-12 13:32:42.503496571 +0200
@@ -408,7 +408,8 @@ struct GTY(()) cp_parser {
  identifiers) rather than an explicit template parameter list.  */
   bool fully_implicit_function_template_p;
 
-  /* TRUE if omp::directive or omp::sequence attributes may not appear.  */
+  /* TRUE if omp::directive, omp::decl or omp::sequence attributes may not
+ appear.  */
   bool omp_attrs_forbidden_p;
 
   /* Tracks the function's template parameter list when declaring a function
--- gcc/c/c-decl.cc.jj  2023-10-11 10:59:12.378170030 +0200
+++ gcc/c/c-decl.cc 2023-10-11 17:23:42.902257966 +0200
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.
 #include "context.h"  /* For 'g'.  */
 #include "omp-general.h"
 #include "omp-offload.h"  /* For offload_vars.  */
+#include "c-parser.h"
 
 #include "tree-pretty-print.h"
 
@@ -325,15 +326,34 @@ i_label_binding (tree node)
 
 /* The resulting tree type.  */
 
-union GTY((desc ("TREE_CODE (&%h.generic) == IDENTIFIER_NODE"),
+union GTY((desc ("TREE_CODE (&%h.generic) == IDENTIFIER_NODE + 2 * (TREE_CODE 
(&%h.generic) == C_TOKEN_VEC)"),
chain_next ("(union lang_tree_node *) c_tree_chain_next 
(&%h.generic)"))) lang_tree_node
  {
   union tree_node GTY ((tag ("0"),
desc ("tree_node_structure (&%h)")))
 generic;
   struct lang_identifier GTY ((tag ("1"))) identifier;
+  struct c_tree_token_vec GTY ((tag ("2"))) c_token_vec;
 };
 
+/* Langhook for tree_size.  */
+size_t
+c_tree_size (enum tree_code code)
+{
+  gcc_checking_assert (code >= NUM_TREE_CODES);
+  switch (code)
+{
+case C_TOKEN_VEC: return sizeof (c_tree_token_vec);
+default:
+  switch (TREE_CODE_CLASS (code))
+   {
+   case tcc_declaration: return sizeof (tree_decl_non_common);
+   case tcc_type: return sizeof (tree_type_non_common);
+   default: gcc_unreachable ();
+   }
+}
+}
+
 /* Track bindings and other things that matter for goto warnings.  For
efficiency, we do not gather all the decls at the point of
definition.  Instead, we point into the bindings structure.  As
--- gcc/c/c-parser.cc.jj2023-10-11 10:59:12.426169364 +0200
+++ gcc/c/c-parser.cc   2023-10-13 17:47:27.32902 +0200
@@ -247,12 +247,21 @@ struct GTY(()) c_parser {
  macro.  */
   BOOL_BITFIELD seen_string_literal : 1;
 
+  /* TRUE if omp::directive, omp::decl or omp::sequence attributes may not
+ appear.  */
+  BOOL_BITFIELD omp_attrs_forbidden_p : 1;
+
   /* Location of the last consumed token.  */
   location_t last_token_location;
 
   /* Holds state for parsing collapsed OMP_FOR loops.  Managed by
  c_parser_omp_for_loop.  */
   struct omp_for_parse_data * GTY((skip)) omp_for_parse_state;
+
+  /* If we're in the context of OpenMP directives written as C23
+ attributes turned into pragma, vector of tokens created from that,
+ otherwise NULL.  */
+  vec *in_omp_attribute_pragma;
 };
 
 /* Return a pointer to the Nth token in PARSERs tokens_buf.  */
@@ -1383,6 +1392,17 @@ c_parser_skip_to_pragma_eol (c_parser *p
 }
   while (token_type != CPP_PRAGMA_EOL);
 
+  if (parser->in_omp_attribute_pragma)
+{
+  c_token *token = c_parser_peek_token (parser);
+  if (token->type == CPP_EOF)
+   {
+ parser->tokens = &parser->tokens_buf[0];
+ parser->tokens_avail = token->flags;
+ parser->in_omp_attribute_pragma = NULL;
+   }
+}
+
   parser->error = false;
 }
 
@@ -5430,6 +5450,109 @@ c_parser_balanced_token_sequence (c_pars
 }
 }
 
+static bool c_parser_check_balanced_raw_token_sequence (c_parser *,
+   unsigned int *);
+
+/* Parse arguments of omp::directive or omp::decl attribute.
+
+  

[PATCH] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in make_compound_operation.

2023-10-14 Thread Roger Sayle

This patch is my proposed solution to PR rtl-optimization/91865.
Normally RTX simplification canonicalizes a ZERO_EXTEND of a ZERO_EXTEND
to a single ZERO_EXTEND, but as shown in this PR it is possible for
combine's make_compound_operation to unintentionally generate a
non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is unlikely to be
matched by the backend.

For the new test case:

const int table[2] = {1, 2};
int foo (char i) { return table[i]; }

compiling with -O2 -mlarge on msp430 we currently see:

Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
  REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
  REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]

which results in the following code:

foo:AND #0xff, R12
RLAM.A #4, R12 { RRAM.A #4, R12
RLAM.A  #1, R12
MOVX.W  table(R12), R12
RETA

With this patch, we now see:

Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
  REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
  REG_DEAD r25:HI
Successfully matched this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
allowing combination of insns 2 and 7
original costs 4 + 8 = 12
replacement cost 8

foo:MOV.B   R12, R12
RLAM.A  #1, R12
MOVX.W  table(R12), R12
RETA


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2023-10-14  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/91865
* combine.cc (make_compound_operation): Avoid creating a
ZERO_EXTEND of a ZERO_EXTEND.

gcc/testsuite/ChangeLog
PR rtl-optimization/91865
* gcc.target/msp430/pr91865.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 360aa2f25e6..f47ff596782 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -8453,6 +8453,9 @@ make_compound_operation (rtx x, enum rtx_code in_code)
new_rtx, GET_MODE (XEXP (x, 0)));
   if (tem)
return tem;
+  /* Avoid creating a ZERO_EXTEND of a ZERO_EXTEND.  */
+  if (GET_CODE (new_rtx) == ZERO_EXTEND)
+   new_rtx = XEXP (new_rtx, 0);
   SUBST (XEXP (x, 0), new_rtx);
   return x;
 }
diff --git a/gcc/testsuite/gcc.target/msp430/pr91865.c 
b/gcc/testsuite/gcc.target/msp430/pr91865.c
new file mode 100644
index 000..8cc21c8b9e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/pr91865.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlarge" } */
+
+const int table[2] = {1, 2};
+int foo (char i) { return table[i]; }
+
+/* { dg-final { scan-assembler-not "AND" } } */
+/* { dg-final { scan-assembler-not "RRAM" } } */


Re: [patch] libgomp.texi: Improve "OpenACC Environment Variables"

2023-10-14 Thread Sandra Loosemore

On 10/14/23 13:51, Tobias Burnus wrote:


@@ -5003,6 +5001,17 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic 
purposes.
 @node ACC_DEVICE_TYPE
 @section @code{ACC_DEVICE_TYPE}
 @table @asis
+@item @emph{Description}:
+Control the default device type to use when executing compute regions.
+If unset, the code can be run on any device type, favoring a non-host
+device type.
+
+Supported value in GCC (if compiled in) are


s/value/values/


+@itemize
+@item @code{host}
+@item @code{nvidia}
+@item @code{radeon}
+@end itemize
 @item @emph{Reference}:
 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
 4.1.
@@ -5013,6 +5022,10 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic 
purposes.
 @node ACC_DEVICE_NUM
 @section @code{ACC_DEVICE_NUM}
 @table @asis
+@item @emph{Description}:
+Control which device, identified by device number, is the default device.
+The number must be a nonnegative integer number less that the number of


Ummm, too many "number"s!

How about rephrasing that as
The value must be a nonnegative integer less than the number of


+devices.  If unset, device number zero is used.
 @item @emph{Reference}:
 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
 4.2.
@@ -5023,6 +5036,11 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic 
purposes.
 @node ACC_PROFLIB
 @section @code{ACC_PROFLIB}
 @table @asis
+@item @emph{Description}:
+Semicolon separated list of dynamic libraries to be loaded as profiling 
library.


s/Semicolon separated/Semicolon-separated/

There's also a semantic issue with "list of dynamic libraries" (plural) and 
"profiling library" (singular).  Are they all separately profiling libraries, 
or does the entire list constitute a single logical profiling library and its 
dependencies?



+The library file is found as described by the documentation of @code{dlopen} of


Probably s/The/Each/?


+your operating system.  Each library must implement at least the routine
+@code{acc_register_library} routine.


Again, I'm not sure if this applies to every library in the list, or just that 
some library in the list must provide this routine.


-Sandra


Re: [PATCH] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in make_compound_operation.

2023-10-14 Thread Jeff Law




On 10/14/23 16:14, Roger Sayle wrote:


This patch is my proposed solution to PR rtl-optimization/91865.
Normally RTX simplification canonicalizes a ZERO_EXTEND of a ZERO_EXTEND
to a single ZERO_EXTEND, but as shown in this PR it is possible for
combine's make_compound_operation to unintentionally generate a
non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is unlikely to be
matched by the backend.

For the new test case:

const int table[2] = {1, 2};
int foo (char i) { return table[i]; }

compiling with -O2 -mlarge on msp430 we currently see:

Trying 2 -> 7:
 2: r25:HI=zero_extend(R12:QI)
   REG_DEAD R12:QI
 7: r28:PSI=sign_extend(r25:HI)#0
   REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ iD.1772 ])
 (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]

which results in the following code:

foo:AND #0xff, R12
 RLAM.A #4, R12 { RRAM.A #4, R12
 RLAM.A  #1, R12
 MOVX.W  table(R12), R12
 RETA

With this patch, we now see:

Trying 2 -> 7:
 2: r25:HI=zero_extend(R12:QI)
   REG_DEAD R12:QI
 7: r28:PSI=sign_extend(r25:HI)#0
   REG_DEAD r25:HI
Successfully matched this instruction:
(set (reg:PSI 28 [ iD.1772 ])
 (zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
allowing combination of insns 2 and 7
original costs 4 + 8 = 12
replacement cost 8

foo:MOV.B   R12, R12
 RLAM.A  #1, R12
 MOVX.W  table(R12), R12
 RETA


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2023-10-14  Roger Sayle  

gcc/ChangeLog
 PR rtl-optimization/91865
 * combine.cc (make_compound_operation): Avoid creating a
 ZERO_EXTEND of a ZERO_EXTEND.

gcc/testsuite/ChangeLog
 PR rtl-optimization/91865
 * gcc.target/msp430/pr91865.c: New test case.

Neither an ACK or NAK at this point.

The bug report includes a patch from Segher which purports to fix this 
in simplify-rtx.  Any thoughts on Segher's approach and whether or not 
it should be considered?


The BZ also indicates that removal of 2 patterns from msp430.md would 
solve this too (though it may cause regressions elsewhere?).  Any 
thoughts on that approach as well?


Thanks,
jeff



[PATCH] Improved RTL expansion of 1LL << x.

2023-10-14 Thread Roger Sayle

This patch improves the initial RTL expanded for double word shifts
on architectures with conditional moves, so that later passes don't
need to clean-up unnecessary and/or unused instructions.

Consider the general case, x << y, which is expanded well as:

t1 = y & 32;
t2 = 0;
t3 = x_lo >> 1;
t4 = y ^ ~0;
t5 = t3 >> t4;
tmp_hi = x_hi << y;
tmp_hi |= t5;
tmp_lo = x_lo << y;
out_hi = t1 ? tmp_lo : tmp_hi;
out_lo = t1 ? t2 : tmp_lo;

which is nearly optimal, the only thing that can be improved is
that using a unary NOT operation "t4 = ~y" is better than XOR
with -1, on targets that support it.  [Note the one_cmpl_optab
expander didn't fall back to XOR when this code was originally
written, but has been improved since].

Now consider the relatively common idiom of 1LL << y, which
currently produces the RTL equivalent of:

t1 = y & 32;
t2 = 0;
t3 = 1 >> 1;
t4 = y ^ ~0;
t5 = t3 >> t4;
tmp_hi = 0 << y;
tmp_hi |= t5;
tmp_lo = 1 << y;
out_hi = t1 ? tmp_lo : tmp_hi;
out_lo = t1 ? t2 : tmp_lo;

Notice here that t3 is always zero, so the assignment of t5
is a variable shift of zero, which expands to a loop on many
smaller targets, a similar shift by zero in the first tmp_hi
assignment (another loop), that the value of t4 is no longer
required (as t3 is zero), and that the ultimate value of tmp_hi
is always zero.

Fortunately, for many (but perhaps not all) targets this mess
gets cleaned up by later optimization passes.  However, this
patch avoids generating unnecessary RTL at expand time, by
calling simplify_expand_binop instead of expand_binop, and
avoiding generating dead or unnecessary code when intermediate
values are known to be zero.  For the 1LL << y test case above,
we now generate:

t1 = y & 32;
t2 = 0;
tmp_hi = 0;
tmp_lo = 1 << y;
out_hi = t1 ? tmp_lo : tmp_hi;
out_lo = t1 ? t2 : tmp_lo;

On arc-elf, for example, there are 18 RTL INSN_P instructions
generated by expand before this patch, but only 12 with this patch
(improving both compile-time and memory usage).


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-10-15  Roger Sayle  

gcc/ChangeLog
* optabs.cc (expand_subword_shift): Call simplify_expand_binop
instead of expand_binop.  Optimize cases (i.e. avoid generating
RTL) when CARRIES or INTO_INPUT is zero.  Use one_cmpl_optab
(i.e. NOT) instead of xor_optab with ~0 to calculate ~OP1.


Thanks in advance,
Roger
--

diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index e1898da..f0a048a 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -533,15 +533,13 @@ expand_subword_shift (scalar_int_mode op1_mode, optab 
binoptab,
 has unknown behavior.  Do a single shift first, then shift by the
 remainder.  It's OK to use ~OP1 as the remainder if shift counts
 are truncated to the mode size.  */
-  carries = expand_binop (word_mode, reverse_unsigned_shift,
- outof_input, const1_rtx, 0, unsignedp, methods);
-  if (shift_mask == BITS_PER_WORD - 1)
-   {
- tmp = immed_wide_int_const
-   (wi::minus_one (GET_MODE_PRECISION (op1_mode)), op1_mode);
- tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp,
-  0, true, methods);
-   }
+  carries = simplify_expand_binop (word_mode, reverse_unsigned_shift,
+  outof_input, const1_rtx, 0,
+  unsignedp, methods);
+  if (carries == const0_rtx)
+   tmp = const0_rtx;
+  else if (shift_mask == BITS_PER_WORD - 1)
+   tmp = expand_unop (op1_mode, one_cmpl_optab, op1, 0, true);
   else
{
  tmp = immed_wide_int_const (wi::shwi (BITS_PER_WORD - 1,
@@ -552,22 +550,29 @@ expand_subword_shift (scalar_int_mode op1_mode, optab 
binoptab,
 }
   if (tmp == 0 || carries == 0)
 return false;
-  carries = expand_binop (word_mode, reverse_unsigned_shift,
- carries, tmp, 0, unsignedp, methods);
+  if (carries != const0_rtx && tmp != const0_rtx)
+carries = simplify_expand_binop (word_mode, reverse_unsigned_shift,
+carries, tmp, 0, unsignedp, methods);
   if (carries == 0)
 return false;
 
-  /* Shift INTO_INPUT logically by OP1.  This is the last use of INTO_INPUT
- so the result can go directly into INTO_TARGET if convenient.  */
-  tmp = expand_binop (word_mode, unsigned_shift, into_input, op1,
- into_target, unsignedp, methods);
-  if (tmp == 0)
-return false;
+  if (into_input != const0_rtx)
+{
+  /* Shift INTO_INPUT logically by OP1.  This is the last use of
+INTO_INPUT so th

[PATCH 2/2] [c] Fix PR 101364: ICE after error due to diagnose_arglist_conflict not checking for error

2023-10-14 Thread Andrew Pinski
When checking to see if we have a function declaration has a conflict due to
promotations, there is no test to see if the type was an error mark and then 
calls
c_type_promotes_to. c_type_promotes_to is not ready for error_mark and causes an
ICE.

This adds a check for error before the call of c_type_promotes_to.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR c/101364

gcc/c/ChangeLog:

* c-decl.cc (diagnose_arglist_conflict): Test for
error mark before calling of c_type_promotes_to.

gcc/testsuite/ChangeLog:

* gcc.dg/pr101364-1.c: New test.
---
 gcc/c/c-decl.cc   | 3 ++-
 gcc/testsuite/gcc.dg/pr101364-1.c | 8 
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr101364-1.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 5822faf01b4..eb2df08c0a7 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -1899,7 +1899,8 @@ diagnose_arglist_conflict (tree newdecl, tree olddecl,
  break;
}
 
-  if (c_type_promotes_to (type) != type)
+  if (!error_operand_p (type)
+ && c_type_promotes_to (type) != type)
{
  inform (input_location, "an argument type that has a default "
  "promotion cannot match an empty parameter name list "
diff --git a/gcc/testsuite/gcc.dg/pr101364-1.c 
b/gcc/testsuite/gcc.dg/pr101364-1.c
new file mode 100644
index 000..e7c94a05553
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101364-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c90 "} */
+
+void fruit(); /* { dg-message "previous declaration" } */
+void fruit( /* { dg-error "conflicting types for" } */
+int b[x], /* { dg-error "undeclared " } */
+short c)
+{} /* { dg-message "an argument type that has a" } */
-- 
2.39.3



[PATCH 1/2] Fix ICE due to c_safe_arg_type_equiv_p not checking for error_mark node

2023-10-14 Thread Andrew Pinski
This is a simple error recovery issue when c_safe_arg_type_equiv_p
was added in r8-5312-gc65e18d3331aa999. The issue is that after
an error, an argument type (of a function type) might turn
into an error mark node and c_safe_arg_type_equiv_p was not ready
for that. So this just adds a check for error operand for its
arguments before getting the main variant.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR c/101285

gcc/c/ChangeLog:

* c-typeck.cc (c_safe_arg_type_equiv_p): Return true for error
operands early.

gcc/testsuite/ChangeLog:

* gcc.dg/pr101285-1.c: New test.
---
 gcc/c/c-typeck.cc |  3 +++
 gcc/testsuite/gcc.dg/pr101285-1.c | 10 ++
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr101285-1.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e55e887da14..6e044b4afbc 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -5960,6 +5960,9 @@ handle_warn_cast_qual (location_t loc, tree type, tree 
otype)
 static bool
 c_safe_arg_type_equiv_p (tree t1, tree t2)
 {
+  if (error_operand_p (t1) || error_operand_p (t2))
+return true;
+
   t1 = TYPE_MAIN_VARIANT (t1);
   t2 = TYPE_MAIN_VARIANT (t2);
 
diff --git a/gcc/testsuite/gcc.dg/pr101285-1.c 
b/gcc/testsuite/gcc.dg/pr101285-1.c
new file mode 100644
index 000..831e35f7662
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101285-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-W -Wall" } */
+const int b;
+typedef void (*ft1)(int[b++]); /* { dg-error "read-only variable" } */
+void bar(int * z);
+void baz()
+{
+(ft1) bar; /* { dg-warning "statement with no effect" } */
+}
+
-- 
2.39.3



[Committed] RISC-V: Fix vsingle attribute

2023-10-14 Thread Juzhe-Zhong
RVVM2x2QI should be rvvm2qi instead of rvvmq1i.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Fix vsingle incorrect attribute for 
RVVM2x2QI.

---
 gcc/config/riscv/vector-iterators.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 6800f8d3d76..0850475edc1 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2230,7 +2230,7 @@
   (RVVM1x5QI "rvvm1qi") (RVVMF2x5QI "rvvmf2qi") (RVVMF4x5QI "rvvmf4qi") 
(RVVMF8x5QI "rvvmf8qi")
   (RVVM2x4QI "rvvm2qi") (RVVM1x4QI "rvvm1qi") (RVVMF2x4QI "rvvmf2qi") 
(RVVMF4x4QI "rvvmf4qi") (RVVMF8x4QI "rvvmf8qi")
   (RVVM2x3QI "rvvm2qi") (RVVM1x3QI "rvvm1qi") (RVVMF2x3QI "rvvmf2qi") 
(RVVMF4x3QI "rvvmf4qi") (RVVMF8x3QI "rvvmf8qi")
-  (RVVM4x2QI "rvvm4qi") (RVVM2x2QI "rvvm1qi") (RVVM1x2QI "rvvm1qi") 
(RVVMF2x2QI "rvvmf2qi") (RVVMF4x2QI "rvvmf4qi") (RVVMF8x2QI "rvvmf8qi")
+  (RVVM4x2QI "rvvm4qi") (RVVM2x2QI "rvvm2qi") (RVVM1x2QI "rvvm1qi") 
(RVVMF2x2QI "rvvmf2qi") (RVVMF4x2QI "rvvmf4qi") (RVVMF8x2QI "rvvmf8qi")
 
   (RVVM1x8HI "rvvm1hi") (RVVMF2x8HI "rvvmf2hi") (RVVMF4x8HI "rvvmf4hi")
   (RVVM1x7HI "rvvm1hi") (RVVMF2x7HI "rvvmf2hi") (RVVMF4x7HI "rvvmf4hi")
-- 
2.36.3



Re: [PATCH] Improved RTL expansion of 1LL << x.

2023-10-14 Thread Jeff Law




On 10/14/23 17:32, Roger Sayle wrote:


This patch improves the initial RTL expanded for double word shifts
on architectures with conditional moves, so that later passes don't
need to clean-up unnecessary and/or unused instructions.

Consider the general case, x << y, which is expanded well as:

 t1 = y & 32;
 t2 = 0;
 t3 = x_lo >> 1;
 t4 = y ^ ~0;
 t5 = t3 >> t4;
 tmp_hi = x_hi << y;
 tmp_hi |= t5;
 tmp_lo = x_lo << y;
 out_hi = t1 ? tmp_lo : tmp_hi;
 out_lo = t1 ? t2 : tmp_lo;

which is nearly optimal, the only thing that can be improved is
that using a unary NOT operation "t4 = ~y" is better than XOR
with -1, on targets that support it.  [Note the one_cmpl_optab
expander didn't fall back to XOR when this code was originally
written, but has been improved since].

Now consider the relatively common idiom of 1LL << y, which
currently produces the RTL equivalent of:

 t1 = y & 32;
 t2 = 0;
 t3 = 1 >> 1;
 t4 = y ^ ~0;
 t5 = t3 >> t4;
 tmp_hi = 0 << y;
 tmp_hi |= t5;
 tmp_lo = 1 << y;
 out_hi = t1 ? tmp_lo : tmp_hi;
 out_lo = t1 ? t2 : tmp_lo;

Notice here that t3 is always zero, so the assignment of t5
is a variable shift of zero, which expands to a loop on many
smaller targets, a similar shift by zero in the first tmp_hi
assignment (another loop), that the value of t4 is no longer
required (as t3 is zero), and that the ultimate value of tmp_hi
is always zero.

Fortunately, for many (but perhaps not all) targets this mess
gets cleaned up by later optimization passes.  However, this
patch avoids generating unnecessary RTL at expand time, by
calling simplify_expand_binop instead of expand_binop, and
avoiding generating dead or unnecessary code when intermediate
values are known to be zero.  For the 1LL << y test case above,
we now generate:

 t1 = y & 32;
 t2 = 0;
 tmp_hi = 0;
 tmp_lo = 1 << y;
 out_hi = t1 ? tmp_lo : tmp_hi;
 out_lo = t1 ? t2 : tmp_lo;

On arc-elf, for example, there are 18 RTL INSN_P instructions
generated by expand before this patch, but only 12 with this patch
(improving both compile-time and memory usage).


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-10-15  Roger Sayle  

gcc/ChangeLog
 * optabs.cc (expand_subword_shift): Call simplify_expand_binop
 instead of expand_binop.  Optimize cases (i.e. avoid generating
 RTL) when CARRIES or INTO_INPUT is zero.  Use one_cmpl_optab
 (i.e. NOT) instead of xor_optab with ~0 to calculate ~OP1.

OK.

FWIW H8 is another one of the targets where variable shifts have to be 
implemented as a loop.


Jeff