date:20230117

Re: [PATCH] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-17 Thread Max Filippov via Gcc-patches

Hi Suwa-san,

On Mon, Jan 16, 2023 at 8:54 PM Takayuki 'January June' Suwa
 wrote:
>
> Register-register move instructions that can be easily seen as
> unnecessary by the human eye may remain in the compiled result.
> For example:
>
> /* example */
> double test(double a, double b) {
>   return __builtin_copysign(a, b);
> }
>
> test:
> add.n   a3, a3, a3
> extui   a5, a5, 31, 1
> ssai    1
> ;; be in the same BB
> src a7, a5, a3  ;; No '0' in the source constraints
> ;; No CALL insns in this span
> ;; Both A3 and A7 are irrelevant to
> ;;   insns in this span
> mov.n   a3, a7  ;; An unnecessary reg-reg move
> ;; A7 is not used after this
> ret.n
>
> The last two instructions above, excluding the return instruction,
> could be done like this:
>
> src a3, a5, a3
>
> This symptom often occurs when handling DI/DFmode values with SImode
> instructions.  This patch solves the above problem using peephole2
> pattern.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the occurrence of general-purpose register used only once and for
> transferring intermediate value.
> ---
>  gcc/config/xtensa/xtensa.md | 44 +
>  1 file changed, 44 insertions(+)
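
The sequence quoted above can be modelled in portable C to see what it computes. This is only a sketch under stated assumptions: a3 and a5 are taken to hold the high 32-bit words of a and b, src with the shift amount set to 1 is taken to shift the 64-bit concatenation a5:a3 right by one bit, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the instruction sequence on the high words only.
   Assumption: a_hi and b_hi stand for registers a3 and a5.  */
static uint32_t
copysign_hi (uint32_t a_hi, uint32_t b_hi)
{
  uint32_t t = a_hi << 1;                    /* add.n a3, a3, a3: drop a's sign bit */
  uint32_t s = b_hi >> 31;                   /* extui a5, a5, 31, 1: b's sign bit */
  uint64_t pair = ((uint64_t) s << 32) | t;  /* the a5:a3 register pair */
  return (uint32_t) (pair >> 1);             /* ssai 1; src: funnel shift right by 1 */
}

/* Apply the model to whole doubles so the result can be compared with
   what copysign is expected to return.  */
static double
copysign_emulated (double a, double b)
{
  uint64_t ua, ub;
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  uint32_t hi = copysign_hi ((uint32_t) (ua >> 32), (uint32_t) (ub >> 32));
  ua = ((uint64_t) hi << 32) | (uint32_t) ua;  /* low word is untouched */
  memcpy (&a, &ua, sizeof a);
  return a;
}
```

Since the funnel shift only reads a5 and a3, writing its result straight into a3 gives the same value as going through a7 and moving it afterwards, which is the redundancy the patch description points at.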

This change results in a bunch of ICEs with the following backtrace:

gcc/libgcc/unwind-dw2.c: In function ‘execute_cfa_program_specialized’:
gcc/libgcc/unwind-dw2.c:972:1: internal compiler error: RTL check:
expected elt 2 type 'B', have '0' (rtx barrier) in BLOCK_FOR_INSN, at
rtl.h:1493
 972 | }
 | ^
0x6c3334 rtl_check_failed_type1(rtx_def const*, int, int, char const*,
int, char const*)
   gcc/gcc/rtl.cc:897
0x7bf285 BLOCK_FOR_INSN(rtx_def*)
   gcc/gcc/rtl.h:1493
0x7c448d BLOCK_FOR_INSN(rtx_def*)
   gcc/gcc/rtl.h:1509
0x7c448d gen_peephole2_4(rtx_insn*, rtx_def**)
   gcc/gcc/config/xtensa/xtensa.md:3102
0xe1cce2 peephole2_optimize
   gcc/gcc/recog.cc:4180
0xe1cce2 rest_of_handle_peephole2
   gcc/gcc/recog.cc:4331
0xe1cce2 execute
   gcc/gcc/recog.cc:4368

-- 
Thanks.
-- Max

Re: [PATCH] ada: Respect GNATMAKE

2023-01-17 Thread Arnaud Charlet via Gcc-patches



> Use the GNATMAKE variables consistently.
> Avoids failures when bootstrapping with a custom GNATMAKE value.
> 
> gcc/ada/ChangeLog:
> 
>* Make-generated.in: Use GNATMAKE.
>* gcc-interface/Makefile.in: Ditto.

Ok, thanks.

> Signed-off-by: Peter Foley 
> ---
> gcc/ada/Make-generated.in | 6 +++---
> gcc/ada/gcc-interface/Makefile.in | 2 +-
> 2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/ada/Make-generated.in b/gcc/ada/Make-generated.in
> index 948fc508a56..34c86b2cd63 100644
> --- a/gcc/ada/Make-generated.in
> +++ b/gcc/ada/Make-generated.in
> @@ -18,7 +18,7 @@ GEN_IL_FLAGS = -gnata -gnat2012 -gnatw.g -gnatyg -gnatU 
> $(GEN_IL_INCLUDES)
> ada/seinfo_tables.ads ada/seinfo_tables.adb ada/sinfo.h ada/einfo.h 
> ada/nmake.ads ada/nmake.adb ada/seinfo.ads ada/sinfo-nodes.ads 
> ada/sinfo-nodes.adb ada/einfo-entities.ads ada/einfo-entities.adb: 
> ada/stamp-gen_il ; @true
> ada/stamp-gen_il: $(fsrcdir)/ada/gen_il*
>$(MKDIR) ada/gen_il
> -cd ada/gen_il; gnatmake -q -g $(GEN_IL_FLAGS) gen_il-main
> +cd ada/gen_il; $(GNATMAKE) -q -g $(GEN_IL_FLAGS) gen_il-main
># Ignore errors to work around finalization issues in older compilers
>- cd ada/gen_il; ./gen_il-main
>$(fsrcdir)/../move-if-change ada/gen_il/seinfo_tables.ads 
> ada/seinfo_tables.ads
> @@ -39,14 +39,14 @@ ada/stamp-gen_il: $(fsrcdir)/ada/gen_il*
> # would cause bootstrapping with older compilers to fail. You can call it by
> # hand, as a sanity check that these files are legal.
> ada/seinfo_tables.o: ada/seinfo_tables.ads ada/seinfo_tables.adb
> -cd ada ; gnatmake $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX
> +cd ada ; $(GNATMAKE) $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX
> 
> ada/snames.h ada/snames.ads ada/snames.adb : ada/stamp-snames ; @true
> ada/stamp-snames : ada/snames.ads-tmpl ada/snames.adb-tmpl ada/snames.h-tmpl 
> ada/xsnamest.adb ada/xutil.ads ada/xutil.adb
>-$(MKDIR) ada/bldtools/snamest
>$(RM) $(addprefix ada/bldtools/snamest/,$(notdir $^))
>$(CP) $^ ada/bldtools/snamest
> -cd ada/bldtools/snamest; gnatmake -q xsnamest ; ./xsnamest
> +cd ada/bldtools/snamest; $(GNATMAKE) -q xsnamest ; ./xsnamest
>$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.ns ada/snames.ads
>$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nb ada/snames.adb
>$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nh ada/snames.h
> diff --git a/gcc/ada/gcc-interface/Makefile.in 
> b/gcc/ada/gcc-interface/Makefile.in
> index da6a56fcec8..c8c38acf447 100644
> --- a/gcc/ada/gcc-interface/Makefile.in
> +++ b/gcc/ada/gcc-interface/Makefile.in
> @@ -616,7 +616,7 @@ OSCONS_EXTRACT=$(GCC_FOR_ADA_RTS) $(GNATLIBCFLAGS_FOR_C) 
> -S s-oscons-tmplt.i
>-$(MKDIR) ./bldtools/oscons
>$(RM) $(addprefix ./bldtools/oscons/,$(notdir $^))
>$(CP) $^ ./bldtools/oscons
> -(cd ./bldtools/oscons ; gnatmake -q xoscons)
> +(cd ./bldtools/oscons ; $(GNATMAKE) -q xoscons)
> 
> $(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR) s-oscons-tmplt.c 
> gsocket.h ./bldtools/oscons/xoscons
>$(RM) $(RTSDIR)/s-oscons-tmplt.i $(RTSDIR)/s-oscons-tmplt.s
> -- 
> 2.39.0
>

[PATCH] libsanitizer: Fix asan SEGVs with gld on Solaris

2023-01-17 Thread Rainer Orth

When using GNU ld on Solaris, a large number of asan tests SEGV, while
Solaris ld is fine.  This happens inside the __tls_get_addr interceptor,
which is highly glibc-specific.  Therefore this patch disables that
interceptor.

Posted upstream at https://reviews.llvm.org/D141385.

Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11.

Ok to cherry-pick into libsanitizer?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2023-01-17  Rainer Orth  

libsanitizer:
* sanitizer_common/sanitizer_platform_interceptors.h: Cherry-pick
llvm-project revision 951cf656b2faaf6fc0baa867293c0cb0ab131951.

# HG changeset patch
# Parent  5c31a29beaaa8ed8a5a0dd7a6c11062a0a208c3a
libsanitizer: Don't intercept __tls_get_addr on Solaris

diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
--- a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
@@ -405,7 +405,7 @@
   (SI_FREEBSD || SI_NETBSD || SI_GLIBC || SI_SOLARIS)
 
 #define SANITIZER_INTERCEPT_TLS_GET_ADDR \
-  (SI_FREEBSD || SI_NETBSD || SI_LINUX_NOT_ANDROID || SI_SOLARIS)
+  (SI_FREEBSD || SI_NETBSD || SI_LINUX_NOT_ANDROID)
 
 #define SANITIZER_INTERCEPT_LISTXATTR SI_LINUX
 #define SANITIZER_INTERCEPT_GETXATTR SI_LINUX

Re: [PATCH,WWWDOCS] htdocs: add an Atom feed for GCC news

2023-01-17 Thread Gerald Pfeifer

On Wed, 11 Jan 2023, Thomas Schwinge wrote:
> On 2022-12-23T10:50:13+0100, "Jose E. Marchesi via Gcc-patches" 
>  wrote:
>> This patch adds an Atom feed for GCC news, which can then be easily 
>> aggregated in other sites, such as the GNU planet 
>> (https://planet.gnu.org).
> I absolutely agree that providing such an RSS feed is a good thing
> (..., and that we generally should make better use of our News section,
> and other "PR"...) -- but I'm less convinced by the prospect of manually
> editing the RSS 'news.xml' file, duplicating in a (potentially) different
> format what we've got in the HTML News section.  :-|

Agreed, yet...

> Ideally, there'd be some simple files for News items (Markdown, or
> similar), which are then converted into HTML News as well as RSS feed.
> Obviously, there needs to be some consensus on what to use, and somebody
> needs to set up the corresponding machinery...

...how are we going to get to that?


On Thu, 12 Jan 2023, Jose E. Marchesi wrote:
> I would like to point out that I have maintained this kind of feed for
> my own sites for years, and that in my humble personal experience unless
> there are a lot of updates, like more than a couple of new entries per
> month, any automated schema would be overkill, prone to rot, and not
> really worth the effort.

That is a bit of a concern. I'd love having a single source that feeds 
both the News section on our main page, rolls over into news.html, and
also feeds the Atom feed (no pun intended).

On the other hand, with less than a dozen entries per year, even if 
manually converting from one to the other takes 5 minutes, creating 
such a machinery wouldn't amortize anytime soon.

> I strongly suggest to not overengineer here [and nowhere else :)]

I am tempted to agree (even if the engineer in me would prefer to avoid 
duplication). Jose, might you be willing to help others create Atom feed
entries?

What do others think?

Gerald

Re: [PATCH,WWWDOCS] htdocs: add an Atom feed for GCC news

2023-01-17 Thread Gerald Pfeifer

On Fri, 23 Dec 2022, Jose E. Marchesi via Gcc-patches wrote:
> This patch adds an Atom feed for GCC news

I was going to approve, but would like to see a bit of consensus with 
others first.

For now some review:

>  

I recommend switching the two notes. The one on the feed feels more 
important since that aspect is easier to miss than the required rotation 
(which visually presents itself when one looks at our web site).

> +++ b/htdocs/news.xml
> @@ -0,0 +1,28 @@
> +
> +
> +
> +  
> +News about the GNU Compiler Collection
> +https://gcc.gnu.org
> +
> +  The GNU Compiler Collection includes front ends for C, C++,
> +  Objective-C, Fortran, Ada, Go, and D, as well as libraries for
> +  these languages (libstdc++,...). GCC was originally written as
> +  the compiler for the GNU operating system. The GNU system was
> +  developed to be 100% free software, free in the sense that it
> +  respects the user's freedom.
> +

Looks like we should think of updating our description (though that's 
beyond the scope of your patch), for example talking more about the many 
platforms supported and less about history.

> +
> +  GCC BPF in Compiler Explorer
> +  https://godbolt.org
> +  
> +Support for a nightly build of the bpf-unknown-none-gcc
> +compiler has been contributed to Compiler Explorer (aka
> +godbolt.org) by Marc Poulhiès

Should these be full sentences (with a full stop)?

When one adds additional entries, do those come at the beginning or the 
end? (Could there be a comment?)

> +  Fri, 23 December 2022 11:00:00 CET

On the web site we use ISO dates - any issues with that?

And does the feed require time of day, or could we omit that? (Especially 
with different timezones we all are in?)

Gerald

Re: [PATCH][2/n] LTO option handling/merging rewrite

2023-01-17 Thread Andreas Schwab via Gcc-patches

On Nov 02 2011, Richard Guenther wrote:

>   lto/
>   * lto-lang.c (lto_post_options): Do not read file options.
>   * lto.c (lto_read_all_file_options): Remove.

This fails to update the documentation.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

[PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Jakub Jelinek via Gcc-patches

Hi!

The comment above simplify_rotate roughly describes what patterns
are matched into what:
   We are looking for X with unsigned type T with bitsize B, OP being
   +, | or ^, some type T2 wider than T.  For:
   (X << CNT1) OP (X >> CNT2)   iff CNT1 + CNT2 == B
   ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2)) iff CNT1 + CNT2 == B
  
   transform these into:
   X r<< CNT1

   Or for:
   (X << Y) OP (X >> (B - Y))
   (X << (int) Y) OP (X >> (int) (B - Y))
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
   (X << Y) | (X >> ((-Y) & (B - 1)))
   (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

   transform these into (last 2 only if ranger can prove Y < B):
   X r<< Y
  
   Or for:
   (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
   (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) (Y & (B - 1)))) \
 | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
  
   transform these into:
   X r<< (Y & (B - 1))
The following testcase shows that 2 of these are problematic.
If T2 is wider than T, then the 2 which use (-Y) & (B - 1) on one
of the shift counts but Y on the other can do something different from
rotate.  E.g.:
__attribute__((noipa)) unsigned char
f7 (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & 7));
}
If y is in [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
then it is UB, but for y in [9, 31] the left shift in this case
will never leave any bits in the result, while in a rotate they are
left there.  Say for y 5 and x 0xaa the expression gives
0x55, which is the same as the rotate, while for y 19 and x 0xaa it
gives 0x5, which is different.
Now, I believe the
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
forms are ok, because B - Y still needs to be a valid shift count,
and if Y > B then B - Y should be either negative or very large
positive (for unsigned types).
And similarly the last 2 cases above which use & (B - 1) on both
shift operands are definitely ok.
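
The two values quoted above can be reproduced with a small standalone check. This is a sketch, not part of the patch: f7 is copied from the message, while rotl8 is a hypothetical reference rotate added here only for comparison.

```c
#include <assert.h>

/* The function from the message: x is widened to unsigned int (T2)
   before shifting and narrowed back to unsigned char (T) on return.  */
static unsigned char
f7 (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & 7));
}

/* Hypothetical reference: 8-bit rotate left, count reduced modulo 8.  */
static unsigned char
rotl8 (unsigned char x, unsigned int y)
{
  y &= 7;
  return (unsigned char) ((x << y) | (x >> ((8 - y) & 7)));
}
```

With these definitions f7 (0xaa, 5) and rotl8 (0xaa, 5) both give 0x55, while f7 (0xaa, 19) gives 0x05 where rotl8 (0xaa, 19) gives 0x55.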

The following patch disables the
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
unless ranger says Y is not in [B, B2 - 1] range.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
(for now)?

Aldy/Andrew, is the ranger query ok or should I use something different
when check_range_stmt is non-NULL and I know on which statement to ask?

And, looking at it again this morning, actually the Y equal to B
case is still fine, if Y is equal to 0, then it is
(T) (((T2) X << 0) | ((T2) X >> 0))
and so X, for Y == B it is
(T) (((T2) X << B) | ((T2) X >> 0))
which is the same as
(T) (0 | ((T2) X >> 0))
which is also X.  So instead of the [B, B2 - 1] range we could use
[B + 1, B2 - 1].  And, if we wanted to go further, even multiples
of B are ok if they are smaller than B2, so we could construct a detailed
int_range_max if we wanted.
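
The set of safe shift counts can be checked exhaustively for one concrete instance. The sketch below assumes T is unsigned char (B = 8) and T2 is a 32-bit unsigned int (B2 = 32); all names are illustrative and none of this is code from the patch.

```c
#include <assert.h>

enum { B = 8, B2 = 32 };

/* The problematic form: Y on the left shift, (-Y) & (B - 1) on the right.  */
static unsigned char
shift_pair (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & (B - 1)));
}

/* Reference rotate left with the count reduced modulo B.  */
static unsigned char
rotl8 (unsigned char x, unsigned int y)
{
  y &= B - 1;
  return (unsigned char) ((x << y) | (x >> ((B - y) & (B - 1))));
}

/* Return 1 if shift_pair equals the rotate for every x at this count.  */
static int
matches_rotate (unsigned int y)
{
  for (unsigned int x = 0; x < 256; x++)
    if (shift_pair ((unsigned char) x, y) != rotl8 ((unsigned char) x, y))
      return 0;
  return 1;
}

/* Bit y of the result is set when count y in [0, B2 - 1] is safe.  */
static unsigned int
safe_counts (void)
{
  unsigned int mask = 0;
  for (unsigned int y = 0; y < B2; y++)
    if (matches_rotate (y))
      mask |= 1u << y;
  return mask;
}
```

Under these assumptions the safe counts are 0 through 8 plus 16 and 24, which agrees with the observation that Y == B and other multiples of B below B2 are fine.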

2023-01-17  Jakub Jelinek  

PR tree-optimization/106523
* tree-ssa-forwprop.cc (simplify_rotate): For the
patterns with (-Y) & (B - 1) in one operand's shift
count and Y in another, if T2 has wider precision than T,
punt if Y could have a value in [B, B2 - 1] range.

* c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
__builtin_unreachable about shift count.
* c-c++-common/rotate-2b.c: New test.
* c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
__builtin_unreachable about shift count.
* c-c++-common/rotate-4b.c: New test.
* gcc.c-torture/execute/pr106523.c: New test.
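
For readers unfamiliar with the idiom named in the ChangeLog, a shift-count assertion via __builtin_unreachable could look roughly like the following. This is a guess at the general shape, assuming B = 8; the function name is hypothetical and the actual testsuite functions may differ.

```c
#include <assert.h>

/* Hypothetical example of asserting the shift count range.  Calling
   this with y >= 8 is undefined behavior, which is what lets the
   compiler assume y is in [0, 7].  */
static unsigned char
rotl_asserted (unsigned char x, unsigned int y)
{
  if (y >= 8)
    __builtin_unreachable ();
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & 7));
}
```

With the range restricted this way, the expression is an ordinary 8-bit rotate for every count the function may legally receive.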

--- gcc/tree-ssa-forwprop.cc.jj 2023-01-02 09:32:26.0 +0100
+++ gcc/tree-ssa-forwprop.cc2023-01-16 18:18:43.524443879 +0100
@@ -1837,7 +1837,7 @@ defcodefor_name (tree name, enum tree_co
((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
 
-   transform these into:
+   transform these into (last 2 only if ranger can prove Y < B):
X r<< Y
 
Or for:
@@ -1866,6 +1866,8 @@ simplify_rotate (gimple_stmt_iterator *g
   int i;
   bool swapped_p = false;
   gimple *g;
+  gimple *def_arg_stmt[2] = { NULL, NULL };
+  int wider_prec = 0;
 
   arg[0] = gimple_assign_rhs1 (stmt);
   arg[1] = gimple_assign_rhs2 (stmt);
@@ -1878,7 +1880,11 @@ simplify_rotate (gimple_stmt_iterator *g
 return false;
 
   for (i = 0; i < 2; i++)
-defcodefor_name (arg[i], &def_code[i], &def_arg1

Re: [PATCH] libsanitizer: Fix asan SEGVs with gld on Solaris

2023-01-17 Thread Richard Biener via Gcc-patches

On Tue, Jan 17, 2023 at 9:58 AM Rainer Orth  
wrote:
>
> When using GNU ld on Solaris, a large number of asan tests SEGV, while
> Solaris ld is fine.  This happens inside the __tls_get_addr interceptor,
> which is highly glibc-specific.  Therefore this patch disables that
> interceptor.
>
> Posted upstream at https://reviews.llvm.org/D141385.
>
> Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11.
>
> Ok to cherry-pick into libsanitizer?

OK.

> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>
> 2023-01-17  Rainer Orth  
>
> libsanitizer:
> * sanitizer_common/sanitizer_platform_interceptors.h: Cherry-pick
> llvm-project revision 951cf656b2faaf6fc0baa867293c0cb0ab131951.
>

Re: [PATCH,WWWDOCS] htdocs: add an Atom feed for GCC news

2023-01-17 Thread Jose E. Marchesi via Gcc-patches



> On Wed, 11 Jan 2023, Thomas Schwinge wrote:
>> On 2022-12-23T10:50:13+0100, "Jose E. Marchesi via Gcc-patches"
>>  wrote:
>>> This patch adds an Atom feed for GCC news, which can then be easily 
>>> aggregated in other sites, such as the GNU planet 
>>> (https://planet.gnu.org).
>> I absolutely agree that providing such an RSS feed is a good thing
>> (..., and that we generally should make better use of our News section,
>> and other "PR"...) -- but I'm less convinced by the prospect of manually
>> editing the RSS 'news.xml' file, duplicating in a (potentially) different
>> format what we've got in the HTML News section.  :-|
>
> Agreed, yet...
>
>> Ideally, there'd be some simple files for News items (Markdown, or
>> similar), which are then converted into HTML News as well as RSS feed.
>> Obviously, there needs to be some consensus on what to use, and somebody
>> needs to set up the corresponding machinery...
>
> ...how are we going to get to that?
>
>
> On Thu, 12 Jan 2023, Jose E. Marchesi wrote:
>> I would like to point out that I have maintained this kind of feed for
>> my own sites for years, and that in my humble personal experience unless
>> there are a lot of updates, like more than a couple of new entries per
>> month, any automated schema would be overkill, prone to rot, and not
>> really worth the effort.
>
> That is a bit of a concern. I'd love having a single source that feeds 
> both the News section on our main page, rolls over into news.html, and
> also feeds the Atom feed (no pun intended).
>
> On the other hand, with less than a dozen entries per year, even if 
> manually converting from one to the other takes 5 minutes, creating 
> such a machinery wouldn't amortize anytime soon.

Yeah I guess it all depends on how much the news section is used.

I personally think that it would be beneficial for the different GCC
projects (front-ends, back-ends, etc) to be a little more vocal, public
wise.  Releasing news items more often may help with that.

Of course one could argue that making it easier to add news to the
system (without having to manually rotate the .html file, add to the
feed if desired, etc) would help with that.  And probably would be right
:D

>> I strongly suggest to not overengineer here [and nowhere else :)]
>
> I am tempted to agree (even if the engineer in me would prefer to avoid 
> duplication). Jose, might you be willing to help others create Atom feed
> entries?

Sure.  It is as easy as adding one of these things to the .xml file:


  Rhhw Friday 16 March 2018 - Sunday 18 March 2018 @
  Frankfurt am Main
  http://jemarch.net/rhhw.html#16march2018
  
The Rabbit Herd will be meeting the weekend from 16 March to
18 March.
  
  Mon, 12 March 2018 15:00:00 CET


To be sure nothing breaks we may run an XML validator on the server to
reject pushes that break the .xml file.  There must be an XML schema for
XML Atom feeds somewhere.

> What do others think?
>
> Gerald

[PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Richard Biener via Gcc-patches

The following fixes a long-standing bug with DSE removing stores as
dead even though they are live across non-call exceptional flow.
This affects both GIMPLE and RTL DSE and the fix is similar in
making externally throwing statements uses of non-local stores.
Note this doesn't fix the GIMPLE side when the throwing statement
does not involve a load or a store because then the statement does
not have virtual operands and thus is not visited by GIMPLE DSE.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

This doesn't seem to be a regression and I'm unsure as to how
important it is for Ada (I consider it not important for C/C++),
so for now I'll queue it for next stage1.

PR middle-end/106075
* dse.cc (scan_insn): Consider externally throwing insns
to read from not frame based memory.
* tree-ssa-dse.cc (dse_classify_store): Consider externally
throwing uses to read from global memory.

* gcc.dg/torture/pr106075-1.c: New testcase.
---
 gcc/dse.cc|  5 
 gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 +++
 gcc/tree-ssa-dse.cc   |  8 -
 3 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c

diff --git a/gcc/dse.cc b/gcc/dse.cc
index a2db8d1cc32..7e258b81f66 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, int 
max_active_local_stores)
   return;
 }
 
+  /* An externally throwing statement may read any memory that is not
+ relative to the frame.  */
+  if (can_throw_external (insn))
+add_non_frame_wild_read (bb_info);
+
   /* Assuming that there are sets in these insns, we cannot delete
  them.  */
   if ((GET_CODE (PATTERN (insn)) == CLOBBER)
diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
new file mode 100644
index 000..b9affbf1082
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
@@ -0,0 +1,36 @@
+/* { dg-do run { target *-*-linux* } } */
+/* { dg-additional-options "-fnon-call-exceptions" } */
+
+#include 
+#include 
+#include 
+
+int a = 1;
+short *b;
+void __attribute__((noipa))
+test()
+{
+  a=12345;
+  *b=0;
+  a=1;
+}
+
+void check (int i)
+{
+  if (a != 12345)
+abort ();
+  exit (0);
+}
+
+int
+main ()
+{
+  struct sigaction s;
+  sigemptyset (&s.sa_mask);
+  s.sa_handler = check;
+  s.sa_flags = 0;
+  sigaction (SIGSEGV, &s, NULL);
+  test();
+  abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index 46ab57d5754..b2e2359c3da 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -960,6 +960,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
   auto_bitmap visited;
   std::unique_ptr
 dra (nullptr, free_data_ref);
+  bool maybe_global = ref_may_alias_global_p (ref, false);
 
   if (by_clobber_p)
 *by_clobber_p = true;
@@ -1038,6 +1039,11 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
  last_phi_def = as_a  (use_stmt);
}
}
+ /* If the stmt can throw externally and the store is
+visible in the context unwound to the store is live.  */
+ else if (maybe_global
+  && stmt_can_throw_external (cfun, use_stmt))
+   return DSE_STORE_LIVE;
  /* If the statement is a use the store is not dead.  */
  else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
{
@@ -1116,7 +1122,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
 just pretend the stmt makes itself dead.  Otherwise fail.  */
   if (defs.is_empty ())
{
- if (ref_may_alias_global_p (ref, false))
+ if (maybe_global)
return DSE_STORE_LIVE;
 
  if (by_clobber_p)
-- 
2.35.3

RE: [PATCH] AArch64: Gate various crypto intrinsics availability based on features

2023-01-17 Thread Kyrylo Tkachov via Gcc-patches

Hi Tejas,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Tejas Belagod
> via Gcc-patches
> Sent: Monday, January 16, 2023 7:12 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tejas Belagod ; Richard Sandiford
> ; Richard Earnshaw
> 
> Subject: [PATCH] AArch64: Gate various crypto intrinsics availability based on
> features
> 
> The 64-bit variant of PMULL{2} and AES instructions are available if FEAT_AES
> is implemented according to the Arm ARM [1].  Similarly FEAT_SHA1 and
> FEAT_SHA256 enable the use of SHA1 and SHA256 instruction variants.
> This patch fixes arm_neon.h to correctly reflect the feature availability 
> based
> on '+aes' and '+sha2' as opposed to the ambiguous catch-all '+crypto'.
> 
> [1] Section D17.2.61, C7.2.215
> 
> 2022-01-11  Tejas Belagod  
> 
> gcc/
>   * config/aarch64/arm_neon.h: Gate AES and PMULL64 intrinsics
>   under target feature +aes as opposed to +crypto. Gate SHA1 and
> SHA2
>   intrinsics under +sha2.

The ChangeLog should list the intrinsics affected, for example:
* config/aarch64/arm_neon.h (vmull_p64, vmull_high_p64): Gate under
"+nothing+aes".
OK with a fixed ChangeLog.
Thanks,
Kyrill
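
As background on the two intrinsics named above: vmull_p64 performs a 64x64 to 128-bit polynomial (carry-less) multiplication. The portable sketch below shows that operation in plain C; it is not the intrinsic and is not gated by any target feature, and the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 128-bit result type; arm_neon.h uses poly128_t instead.  */
typedef struct
{
  uint64_t lo;
  uint64_t hi;
} poly128_sketch;

/* Carry-less multiply: XOR together shifted copies of a, one for each
   set bit of b, with no carries between bit positions.  */
static poly128_sketch
clmul64 (uint64_t a, uint64_t b)
{
  poly128_sketch r = { 0, 0 };
  for (int i = 0; i < 64; i++)
    if ((b >> i) & 1)
      {
        r.lo ^= a << i;
        if (i != 0)
          r.hi ^= a >> (64 - i);  /* bits shifted out of the low half */
      }
  return r;
}
```

For example, clmul64 (3, 3) yields 5, since (x + 1) squared is x^2 + 1 when coefficients are taken modulo 2.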


> 
> testsuite/
> 
>   * gcc.target/aarch64/acle/pmull64.c: New.
>   * gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c: Replace '+crypto'
> with
>   corresponding feature flag based on the intrinsic.
>   * gcc.target/aarch64/aes-fuse-2.c: Likewise.
>   * gcc.target/aarch64/aes_1.c: Likewise.
>   * gcc.target/aarch64/aes_2.c: Likewise.
>   * gcc.target/aarch64/aes_xor_combine.c: Likewise.
>   * gcc.target/aarch64/sha1_1.c: Likewise.
>   * gcc.target/aarch64/sha256_1.c: Likewise.
>   * gcc.target/aarch64/target_attr_crypto_ice_1.c: Likewise.
> ---
>  gcc/config/aarch64/arm_neon.h | 35 ++-
>  .../gcc.target/aarch64/acle/pmull64.c | 14 
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-2.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes_1.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/aes_2.c  |  4 ++-
>  .../gcc.target/aarch64/aes_xor_combine.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha1_1.c |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha256_1.c   |  2 +-
>  .../aarch64/target_attr_crypto_ice_1.c|  2 +-
>  10 files changed, 44 insertions(+), 27 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
> 
> diff --git a/gcc/config/aarch64/arm_neon.h
> b/gcc/config/aarch64/arm_neon.h
> index cf6af728ca9..a795a387b38 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -7496,7 +7496,7 @@ vqrdmlshs_laneq_s32 (int32_t __a, int32_t __b,
> int32x4_t __c, const int __d)
>  #pragma GCC pop_options
> 
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto")
> +#pragma GCC target ("+nothing+aes")
>  /* vaes  */
> 
>  __extension__ extern __inline uint8x16_t
> @@ -7526,6 +7526,22 @@ vaesimcq_u8 (uint8x16_t data)
>  {
>return __builtin_aarch64_crypto_aesimcv16qi_uu (data);
>  }
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_p64 (poly64_t __a, poly64_t __b)
> +{
> +  return
> +__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
> +}
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
> +{
> +  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
> +}
> +
>  #pragma GCC pop_options
> 
>  /* vcage  */
> @@ -20772,7 +20788,7 @@ vrsrad_n_u64 (uint64_t __a, uint64_t __b, const
> int __c)
>  }
> 
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto")
> +#pragma GCC target ("+nothing+sha2")
> 
>  /* vsha1  */
> 
> @@ -20849,21 +20865,6 @@ vsha256su1q_u32 (uint32x4_t __tw0_3,
> uint32x4_t __w8_11, uint32x4_t __w12_15)
>  __w12_15);
>  }
> 
> -__extension__ extern __inline poly128_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -vmull_p64 (poly64_t __a, poly64_t __b)
> -{
> -  return
> -__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
> -}
> -
> -__extension__ extern __inline poly128_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
> -{
> -  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
> -}
> -
>  #pragma GCC pop_options
> 
>  /* vshl */
> diff --git a/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
> b/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
> new file mode 100644
> index 000..6a1e99e2d0d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=armv8.2-a" } */
> +
> +#pragma push_options
> +#prag

[aarch64] Use wzr/xzr for assigning vector element to 0

2023-01-17 Thread Prathamesh Kulkarni via Gcc-patches

Hi Richard,
For the following (contrived) test:

#include <arm_neon.h>

int32x4_t foo(int32x4_t v)
{
  v[3] = 0;
  return v;
}

-O2 code-gen:
foo:
fmov    s1, wzr
ins v0.s[3], v1.s[0]
ret

I suppose we could instead emit the following code-gen?
foo:
 ins v0.s[3], wzr
 ret

combine produces:
Failed to match this instruction:
(set (reg:V4SI 95 [ v ])
(vec_merge:V4SI (const_vector:V4SI [
(const_int 0 [0]) repeated x4
])
(reg:V4SI 97)
(const_int 8 [0x8])))
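
The (const_int 8) selector in the dump above can be read as a per-lane mask: in an RTL vec_merge, a set bit takes the element from the first vector and a clear bit takes it from the second. The following is a scalar model of that rule, assuming four 32-bit lanes and little-endian lane numbering; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { NLANES = 4 };

/* Scalar model of vec_merge: bit i of mask set selects a[i],
   otherwise b[i].  */
static void
vec_merge4 (int32_t out[NLANES], const int32_t a[NLANES],
            const int32_t b[NLANES], unsigned int mask)
{
  for (int i = 0; i < NLANES; i++)
    out[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Zero one lane of v by merging with an all-zero vector, which is what
   v[3] = 0 amounts to when lane is 3 (mask 1 << 3 == 8).  */
static void
set_lane_zero (int32_t v[NLANES], int lane)
{
  const int32_t zero[NLANES] = { 0, 0, 0, 0 };
  int32_t out[NLANES];
  vec_merge4 (out, zero, v, 1u << lane);
  for (int i = 0; i < NLANES; i++)
    v[i] = out[i];
}
```

In this model only the selected lane changes, which is why a single ins from wzr would be enough and no separate zeroed vector register is needed.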

So, I wrote the following pattern to match the above insn:
(define_insn "aarch64_simd_vec_set_zero"
  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
(vec_merge:VALL_F16
(match_operand:VALL_F16 1 "const_dup0_operand" "w")
(match_operand:VALL_F16 3 "register_operand" "0")
(match_operand:SI 2 "immediate_operand" "i")))]
  "TARGET_SIMD"
  {
int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
return "ins\\t%0.[%p2], wzr";
  }
)

which now matches the above insn produced by combine.
However, in the reload dump, it creates a new insn that assigns
(const_vector (const_int 0)) to a register,
which results in:
(insn 19 8 13 2 (set (reg:V4SI 33 v1 [99])
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])) "wzr-test.c":8:1 1269 {*aarch64_simd_movv4si}
 (nil))
(insn 13 19 14 2 (set (reg/i:V4SI 32 v0)
(vec_merge:V4SI (reg:V4SI 33 v1 [99])
(reg:V4SI 32 v0 [97])
(const_int 8 [0x8]))) "wzr-test.c":8:1 1808
{aarch64_simd_vec_set_zerov4si}
 (nil))

and eventually the code-gen:
foo:
movi    v1.4s, 0
ins v0.s[3], wzr
ret

To get rid of redundant assignment of 0 to v1, I tried to split the
above pattern
as in the attached patch. This works to emit code-gen:
foo:
ins v0.s[3], wzr
ret

However, I am not sure if this is the right approach. Could you suggest
whether it would be possible to get rid of UNSPEC_SETZERO in the patch?

Thanks,
Prathamesh
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 104088f67d2..5130f46c0da 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1083,6 +1083,39 @@
   [(set_attr "type" "neon_ins, neon_from_gp, neon_load1_one_lane")]
 )
 
+(define_insn "aarch64_simd_set_zero"
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+   (unspec:VALL_F16 [(match_operand:VALL_F16 1 "register_operand" "0")
+ (match_operand:SI 2 "immediate_operand" "i")]
+UNSPEC_SETZERO))]
+  "TARGET_SIMD"
+  {
+if (GET_MODE_INNER (mode) == DImode)
+  return "ins\\t%0.[%p2], xzr";
+return "ins\\t%0.[%p2], wzr";
+  }
+  [(set_attr "type" "neon_ins")]
+)
+
+(define_insn_and_split "aarch64_simd_vec_set_zero"
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+   (vec_merge:VALL_F16
+   (match_operand:VALL_F16 1 "const_dup0_operand" "w")
+   (match_operand:VALL_F16 3 "register_operand" "0")
+   (match_operand:SI 2 "immediate_operand" "i")))]
+  "TARGET_SIMD"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
+operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
+emit_insn (gen_aarch64_simd_set_zero (operands[0], operands[3], 
operands[2]));
+DONE;
+  }
+  [(set_attr "type" "neon_ins")]
+)
+
 (define_insn "@aarch64_simd_vec_copy_lane"
   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
(vec_merge:VALL_F16
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 5b26443e5b6..8064841ebb4 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -839,6 +839,7 @@
 UNSPEC_FCMUL_CONJ  ; Used in aarch64-simd.md.
 UNSPEC_FCMLA_CONJ  ; Used in aarch64-simd.md.
 UNSPEC_FCMLA180_CONJ   ; Used in aarch64-simd.md.
+UNSPEC_SETZERO ; Used in aarch64-simd.md.
 UNSPEC_ASRD; Used in aarch64-sve.md.
 UNSPEC_ADCLB   ; Used in aarch64-sve2.md.
 UNSPEC_ADCLT   ; Used in aarch64-sve2.md.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index ff7f73d3f30..901fa1bd7f9 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -49,6 +49,13 @@
   return CONST_INT_P (op) && IN_RANGE (INTVAL (op), 1, 3);
 })
 
+(define_predicate "const_dup0_operand"
+  (match_code "const_vector")
+{
+  op = unwrap_const_vec_duplicate (op);
+  return CONST_INT_P (op) && rtx_equal_p (op, const0_rtx);
+})
+
 (define_predicate "subreg_lowpart_operator"
   (ior (match_code "truncate")
(and (match_code "subreg")

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Aldy Hernandez via Gcc-patches





On 1/17/23 10:47, Jakub Jelinek wrote:

> Aldy/Andrew, is the ranger query ok or should I use something different
> when check_range_stmt is non-NULL and I know on which statement to ask?

> + int_range_max r;
> + if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> +check_range_stmt))
> +   return false;


range_of_expr will work with and without a statement.  If no statement 
is provided, it will return the global range.  So you can use the same 
range_of_expr call with a statement or without one if you don't know it.


Note that get_global_range_query () will always return a global query 
object (think SSA_NAME_RANGE_INFO).  It will never use an existing 
ranger (for example, if called within VRP or another pass that has an 
active ranger enabled).  If simplify_rotate() may be used from some of 
these passes you *may* want to use get_range_query() which will pick up 
the active ranger, or a global query object if no ranger is active.


For that matter, since get_global_range_query() uses a global query, it 
really doesn't matter if you pass a statement or not, since our global 
range store has no context (SSA_NAME_RANGE_INFO).  Although, I 
personally always pass the statement if known, because it's good form, 
and if things ever change to an active ranger, everything will just work.


Aldy

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Richard Biener via Gcc-patches

On Tue, 17 Jan 2023, Jakub Jelinek wrote:

> Hi!
> 
> The comment above simplify_rotate roughly describes what patterns
> are matched into what:
>We are looking for X with unsigned type T with bitsize B, OP being
>+, | or ^, some type T2 wider than T.  For:
>(X << CNT1) OP (X >> CNT2)   iff CNT1 + CNT2 == B
>((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2)) iff CNT1 + CNT2 == B
>   
>transform these into:
>X r<< CNT1
> 
>Or for:
>(X << Y) OP (X >> (B - Y))
>(X << (int) Y) OP (X >> (int) (B - Y))
>((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
>((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
>(X << Y) | (X >> ((-Y) & (B - 1)))
>(X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
>((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
> 
>transform these into (last 2 only if ranger can prove Y < B):
>X r<< Y
>   
>Or for:
>(X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
>(X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
>((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>((T) ((T2) X << (int) (Y & (B - 1)))) \
>  | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>   
>transform these into:
>X r<< (Y & (B - 1))
> The following testcase shows that 2 of these are problematic.
> If T2 is wider than T, then the 2 which use (-Y) & (B - 1) on one
> of the shift counts but Y on the other can do something different from
> rotate.  E.g.:
> __attribute__((noipa)) unsigned char
> f7 (unsigned char x, unsigned int y)
> {
>   unsigned int t = x;
>   return (t << y) | (t >> ((-y) & 7));
> }
> if y is [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
> then it is UB, but for y in [9, 31] the left shift in this case
> will never leave any bits in the result, while in a rotate they are
> left there.  Say for y 5 and x 0xaa the expression gives
> 0x55 which is the same thing as rotate, while for y 19 and x 0xaa
> it gives 0x5, which is different.
> Now, I believe the
>((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
>((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
> forms are ok, because B - Y still needs to be a valid shift count,
> and if Y > B then B - Y should be either negative or very large
> positive (for unsigned types).
> And similarly the last 2 cases above which use & (B - 1) on both
> shift operands are definitely ok.
> 
> The following patch disables the
>((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
> unless ranger says Y is not in [B, B2 - 1] range.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
> (for now)?

OK.

Richard.

> 
> Aldy/Andrew, is the ranger query ok or should I use something different
> when check_range_stmt is non-NULL and I know on which statement to ask?
> 
> And, looking at it again this morning, actually the Y equal to B
> case is still fine, if Y is equal to 0, then it is
> (T) (((T2) X << 0) | ((T2) X >> 0))
> and so X, for Y == B it is
> (T) (((T2) X << B) | ((T2) X >> 0))
> which is the same as
> (T) (0 | ((T2) X >> 0))
> which is also X.  So instead of the [B, B2 - 1] range we could use
> [B + 1, B2 - 1].  And, if we wanted to go further, even multiples
> of B are ok if they are smaller than B2, so we could construct a detailed
> int_range_max if we wanted.
> 
> 2023-01-17  Jakub Jelinek  
> 
>   PR tree-optimization/106523
>   * tree-ssa-forwprop.cc (simplify_rotate): For the
>   patterns with (-Y) & (B - 1) in one operand's shift
>   count and Y in another, if T2 has wider precision than T,
>   punt if Y could have a value in [B, B2 - 1] range.
> 
>   * c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
>   f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
>   __builtin_unreachable about shift count.
>   * c-c++-common/rotate-2b.c: New test.
>   * c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
>   f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
>   __builtin_unreachable about shift count.
>   * c-c++-common/rotate-4b.c: New test.
>   * gcc.c-torture/execute/pr106523.c: New test.
> 
> --- gcc/tree-ssa-forwprop.cc.jj   2023-01-02 09:32:26.0 +0100
> +++ gcc/tree-ssa-forwprop.cc  2023-01-16 18:18:43.524443879 +0100
> @@ -1837,7 +1837,7 @@ defcodefor_name (tree name, enum tree_co
> ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
> ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>  
> -   transform these into:
> +   transform these into (last 2 only if ranger can prove Y < B):
> X r<< Y
>  
> Or for:
> @@ -1866,6 +1866,8 @@ simplify_rotate (gimple_stmt_iterator *g
>int i;
>bool swapped_p = false;
>gimple *g;
> +  gimple *def_arg_stmt[2] = { NULL, NULL };
>
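
The mismatch described in the quoted message can be checked outside the
compiler with a small C sketch.  It is only an illustration, not part of
the patch: the names shift_or and rotl8 are made up here, B is fixed at 8
and B2 at 32, and shift_or mirrors the f7 testcase (shift counts of 32 or
more are undefined, so it asserts against them).

```c
#include <assert.h>
#include <stdint.h>

/* The expression from the f7 testcase: x is 8 bits wide (B = 8), but
   both shifts are done in a 32-bit type (B2 = 32).  */
uint8_t
shift_or (uint8_t x, uint32_t y)
{
  uint32_t t = x;
  assert (y < 32);  /* Larger shift counts are undefined behaviour.  */
  return (uint8_t) ((t << y) | (t >> ((-y) & 7)));
}

/* A true 8-bit rotate left; the count is taken modulo 8.  */
uint8_t
rotl8 (uint8_t x, uint32_t y)
{
  uint32_t t = x;
  y &= 7;
  return (uint8_t) ((t << y) | (t >> ((8 - y) & 7)));
}
```

With these definitions, shift_or (0xaa, 5) and rotl8 (0xaa, 5) agree, and
so do the two functions for y == 8, which matches the observation that
Y == B is still fine.  For y == 19 they differ, as in the example above.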

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 11:59:53AM +0100, Aldy Hernandez wrote:
> 
> 
> On 1/17/23 10:47, Jakub Jelinek wrote:
> 
> > Aldy/Andrew, is the ranger query ok or should I use something different
> > when check_range_stmt is non-NULL and I know on which statement to ask?
> 
> 
> 
> > + int_range_max r;
> > + if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> > +check_range_stmt))
> > +   return false;
> 
> range_of_expr will work with and without a statement.  If no statement is
> provided, it will return the global range.  So you can use the same
> range_of_expr call with a statement or without one if you don't know it.
> 
> Note that get_global_range_query () will always return a global query object
> (think SSA_NAME_RANGE_INFO).  It will never use an existing ranger (for
> example, if called within VRP or another pass that has an active ranger
> enabled).  If simplify_rotate() may be used from some of these passes you
> *may* want to use get_range_query() which will pick up the active ranger, or
> a global query object if no ranger is active.

This is always in the forwprop pass.
I think it doesn't have any active ranger instance, but I could be wrong.

A question would be if it would be worth to activate it in this spot lazily
if it isn't active yet (and destruct at the end of the pass).

> For that matter, since get_global_range_query() uses a global query, it
> really doesn't matter if you pass a statement or not, since our global range
> store has no context (SSA_NAME_RANGE_INFO).  Although, I personally always
> pass the statement if known, because it's good form, and if things ever
> change to an active ranger, everything will just work.

Jakub

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Aldy Hernandez via Gcc-patches





On 1/17/23 12:09, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 11:59:53AM +0100, Aldy Hernandez wrote:
> >
> > On 1/17/23 10:47, Jakub Jelinek wrote:
> >
> > > Aldy/Andrew, is the ranger query ok or should I use something different
> > > when check_range_stmt is non-NULL and I know on which statement to ask?
> >
> > > + int_range_max r;
> > > + if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> > > +check_range_stmt))
> > > +   return false;
> >
> > range_of_expr will work with and without a statement.  If no statement is
> > provided, it will return the global range.  So you can use the same
> > range_of_expr call with a statement or without one if you don't know it.
> >
> > Note that get_global_range_query () will always return a global query object
> > (think SSA_NAME_RANGE_INFO).  It will never use an existing ranger (for
> > example, if called within VRP or another pass that has an active ranger
> > enabled).  If simplify_rotate() may be used from some of these passes you
> > *may* want to use get_range_query() which will pick up the active ranger, or
> > a global query object if no ranger is active.
>
> This is always in the forwprop pass.
> I think it doesn't have any active ranger instance, but I could be wrong.
>
> A question would be if it would be worth to activate it in this spot lazily
> if it isn't active yet (and destruct at the end of the pass).


That's what it was designed for :).  If you're making sporadic requests, 
the on-demand mechanism should be fast enough.


Aldy

[PATCH (pushed)] Regenerate Makefile.in files.

2023-01-17 Thread Martin Liška

libbacktrace/ChangeLog:

* Makefile.in: Regenerate.

libgomp/ChangeLog:

* Makefile.in: Regenerate.
* configure: Regenerate.

libphobos/ChangeLog:

* Makefile.in: Regenerate.
* libdruntime/Makefile.in: Regenerate.

libstdc++-v3/ChangeLog:

* src/libbacktrace/Makefile.in: Regenerate.
---
 libbacktrace/Makefile.in  | 2 +-
 libgomp/Makefile.in   | 2 +-
 libgomp/configure | 2 +-
 libphobos/Makefile.in | 2 +-
 libphobos/libdruntime/Makefile.in | 2 +-
 libstdc++-v3/src/libbacktrace/Makefile.in | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in
index c79fd3bfa6b..889144acb1c 100644
--- a/libbacktrace/Makefile.in
+++ b/libbacktrace/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile.am -- Backtrace Makefile.
-# Copyright (C) 2012-2022 Free Software Foundation, Inc.
+# Copyright (C) 2012-2023 Free Software Foundation, Inc.
 
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions are
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 8ffd45c9c41..2c81ccacc1d 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -16,7 +16,7 @@
 
 # Plugins for offload execution, Makefile.am fragment.
 #
-# Copyright (C) 2014-2022 Free Software Foundation, Inc.
+# Copyright (C) 2014-2023 Free Software Foundation, Inc.
 #
 # Contributed by Mentor Embedded.
 #
diff --git a/libgomp/configure b/libgomp/configure
index 45a769eb10a..fd0e337b578 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -15053,7 +15053,7 @@ _ACEOF
 
 # Plugins for offload execution, configure.ac fragment.  -*- mode: autoconf -*-
 #
-# Copyright (C) 2014-2022 Free Software Foundation, Inc.
+# Copyright (C) 2014-2023 Free Software Foundation, Inc.
 #
 # Contributed by Mentor Embedded.
 #
diff --git a/libphobos/Makefile.in b/libphobos/Makefile.in
index 2e9360a5238..8d62c31dab0 100644
--- a/libphobos/Makefile.in
+++ b/libphobos/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile for the toplevel directory of the D Standard library.
-# Copyright (C) 2006-2022 Free Software Foundation, Inc.
+# Copyright (C) 2006-2023 Free Software Foundation, Inc.
 #
 # GCC is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/libphobos/libdruntime/Makefile.in b/libphobos/libdruntime/Makefile.in
index e86721fb3fe..3f04496bd7b 100644
--- a/libphobos/libdruntime/Makefile.in
+++ b/libphobos/libdruntime/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile for the D runtime library.
-# Copyright (C) 2012-2022 Free Software Foundation, Inc.
+# Copyright (C) 2012-2023 Free Software Foundation, Inc.
 #
 # GCC is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
diff --git a/libstdc++-v3/src/libbacktrace/Makefile.in b/libstdc++-v3/src/libbacktrace/Makefile.in
index 7c112722938..1c1242d8827 100644
--- a/libstdc++-v3/src/libbacktrace/Makefile.in
+++ b/libstdc++-v3/src/libbacktrace/Makefile.in
@@ -15,7 +15,7 @@
 @SET_MAKE@
 
 # Makefile.am -- Backtrace in libstdc++ Makefile.
-# Copyright (C) 2012-2013 Free Software Foundation, Inc.
+# Copyright (C) 2012-2023 Free Software Foundation, Inc.
 
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions are
-- 
2.39.0

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > A question would be if it would be worth to activate it in this spot lazily
> > if it isn't active yet (and destruct at the end of the pass).
> 
> That's what it was designed for :).  If you're making sporadic requests, the
> on-demand mechanism should be fast enough.

So what should be done to do the on-demand query rather than global one?

Jakub

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Aldy Hernandez via Gcc-patches





On 1/17/23 12:19, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > > A question would be if it would be worth to activate it in this spot lazily
> > > if it isn't active yet (and destruct at the end of the pass).
> >
> > That's what it was designed for :).  If you're making sporadic requests, the
> > on-demand mechanism should be fast enough.
>
> So what should be done to do the on-demand query rather than global one?


gimple_ranger ranger;
if (ranger.range_of_expr (r, .))
   // business as usual

Re: [PATCH v2] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-17 Thread Max Filippov via Gcc-patches

Hi Suwa-san,

On Mon, Jan 16, 2023 at 8:12 PM Takayuki 'January June' Suwa
 wrote:
>
> In the case of the CALL0 ABI, values that must be retained before and
> after function calls are placed in the callee-saved registers (A12
> through A15) and referenced later.  However, it is often the case that
> the save and the reference each occur only once, each as a simple
> register-register move (the frame pointer is needed to recover the
> stack pointer and must be excluded).
>
> e.g. in the following example, if there are no other occurrences of
> register A14:
>
> ;; before
> ; prologue {
>   ...
> s32i.n  a14, sp, 16
>   ...
> ; } prologue
>   ...
> mov.n   a14, a6
>   ...
> call0   foo
>   ...
> mov.n   a8, a14
>   ...
> ; epilogue {
>   ...
> l32i.n  a14, sp, 16
>   ...
> ; } epilogue
>
> It can be rewritten like this:
>
> ;; after
> ; prologue {
>   ...
> (deleted)
>   ...
> ; } prologue
>   ...
> s32i.n  a6, sp, 16
>   ...
> call0   foo
>   ...
> l32i.n  a8, sp, 16
>   ...
> ; epilogue {
>   ...
> (deleted)
>   ...
> ; } epilogue
>
> This patch introduces a new peephole2 pattern that implements the above.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the use of callee-saved register that saves and restores only once
> for other register, by using its stack slot directly.
> ---
>  gcc/config/xtensa/xtensa.md | 60 +
>  1 file changed, 60 insertions(+)

There are still a few regressions in tests with -fcompare-debug because
the code generated with -g differs from the code generated without it:

+FAIL: gcc.dg/pr41241.c (test for excess errors)
+FAIL: gcc.dg/pr48159-1.c (test for excess errors)
+FAIL: gcc.dg/pr65521.c (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c   -O2  (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c   -O3 -g  (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c   -Os  (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)

E.g. check the following test with -g0 and -g:

gcc/cc1 gcc/testsuite/gcc.dg/torture/pr42878-1.c -mlongcalls
-mtext-section-literals -fdiagnostics-plain-output -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions

-- 
Thanks.
-- Max

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 12:22:42PM +0100, Aldy Hernandez wrote:
> 
> 
> On 1/17/23 12:19, Jakub Jelinek wrote:
> > On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > > > A question would be if it would be worth to activate it in this spot 
> > > > lazily
> > > > if it isn't active yet (and destruct at the end of the pass).
> > > 
> > > That's what it was designed for :).  If you're making sporadic requests, 
> > > the
> > > on-demand mechanism should be fast enough.
> > 
> > So what should be done to do the on-demand query rather than global one?
> 
> gimple_ranger ranger;
> if (ranger.range_of_expr (r, .))
>// business as usual

So not worth making the ranger somewhere in the pass (if it is really
sporadic like this one)?

Will test together with the removal of B from the range.

Jakub

Re: [PATCH] forwprop: Fix up rotate pattern matching [PR106523]

2023-01-17 Thread Aldy Hernandez via Gcc-patches





On 1/17/23 12:33, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 12:22:42PM +0100, Aldy Hernandez wrote:
> >
> > On 1/17/23 12:19, Jakub Jelinek wrote:
> > > On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > > > > A question would be if it would be worth to activate it in this spot lazily
> > > > > if it isn't active yet (and destruct at the end of the pass).
> > > >
> > > > That's what it was designed for :).  If you're making sporadic requests, the
> > > > on-demand mechanism should be fast enough.
> > >
> > > So what should be done to do the on-demand query rather than global one?
> >
> > gimple_ranger ranger;
> > if (ranger.range_of_expr (r, .))
> >    // business as usual
>
> So not worth making the ranger somewhere in the pass (if it is really
> sporadic like this one)?


If you're going to call range_of_expr various times within a pass, 
creating a pass instance would be the way to go.


// Early in your pass.
enable_ranger (func);
...
if (get_range_query (func)->range_of_expr ())
  { stuff }

// Late in your pass:
disable_ranger (func);

Note that get_range_query() would work even without enable_ranger...it 
would just pick up the global ranger (SSA_NAME_RANGE_INFO).


Aldy



> Will test together with the removal of B from the range.
>
> Jakub

Re: Extend fold_vec_perm to fold VEC_PERM_EXPR in VLA manner

2023-01-17 Thread Prathamesh Kulkarni via Gcc-patches

On Mon, 26 Dec 2022 at 09:56, Prathamesh Kulkarni
 wrote:
>
> On Tue, 13 Dec 2022 at 11:35, Prathamesh Kulkarni
>  wrote:
> >
> > On Tue, 6 Dec 2022 at 21:00, Richard Sandiford
> >  wrote:
> > >
> > > Prathamesh Kulkarni via Gcc-patches  writes:
> > > > On Fri, 4 Nov 2022 at 14:00, Prathamesh Kulkarni
> > > >  wrote:
> > > >>
> > > >> On Mon, 31 Oct 2022 at 15:27, Richard Sandiford
> > > >>  wrote:
> > > >> >
> > > >> > Prathamesh Kulkarni  writes:
> > > >> > > On Wed, 26 Oct 2022 at 21:07, Richard Sandiford
> > > >> > >  wrote:
> > > >> > >>
> > > >> > >> Sorry for the slow response.  I wanted to find some time to think
> > > >> > >> about this a bit more.
> > > >> > >>
> > > >> > >> Prathamesh Kulkarni  writes:
> > > >> > >> > On Fri, 30 Sept 2022 at 21:38, Richard Sandiford
> > > >> > >> >  wrote:
> > > >> > >> >>
> > > >> > >> >> Richard Sandiford via Gcc-patches  
> > > >> > >> >> writes:
> > > >> > >> >> > Prathamesh Kulkarni  writes:
> > > >> > >> >> >> Sorry to ask a silly question but in which case shall we 
> > > >> > >> >> >> select 2nd vector ?
> > > >> > >> >> >> For num_poly_int_coeffs == 2,
> > > >> > >> >> >> a1 /trunc n1 == (a1 + 0x) / (n1.coeffs[0] + n1.coeffs[1]*x)
> > > >> > >> >> >> If a1/trunc n1 succeeds,
> > > >> > >> >> >> 0 / n1.coeffs[1] == a1/n1.coeffs[0] == 0.
> > > >> > >> >> >> So, a1 has to be < n1.coeffs[0] ?
> > > >> > >> >> >
> > > >> > >> >> > Remember that a1 is itself a poly_int.  It's not necessarily 
> > > >> > >> >> > a constant.
> > > >> > >> >> >
> > > >> > >> >> > E.g. the TRN1 .D instruction maps to a VEC_PERM_EXPR with 
> > > >> > >> >> > the selector:
> > > >> > >> >> >
> > > >> > >> >> >   { 0, 2 + 2x, 1, 4 + 2x, 2, 6 + 2x, ... }
> > > >> > >> >>
> > > >> > >> >> Sorry, should have been:
> > > >> > >> >>
> > > >> > >> >>   { 0, 2 + 2x, 2, 4 + 2x, 4, 6 + 2x, ... }
> > > >> > >> > Hi Richard,
> > > >> > >> > Thanks for the clarifications, and sorry for late reply.
> > > >> > >> > I have attached POC patch that tries to implement the above 
> > > >> > >> > approach.
> > > >> > >> > Passes bootstrap+test on x86_64-linux-gnu and aarch64-linux-gnu 
> > > >> > >> > for VLS vectors.
> > > >> > >> >
> > > >> > >> > For VLA vectors, I have only done limited testing so far.
> > > >> > >> > It seems to pass couple of tests written in the patch for
> > > >> > >> > nelts_per_pattern == 3,
> > > >> > >> > and folds the following svld1rq test:
> > > >> > >> > int32x4_t v = {1, 2, 3, 4};
> > > >> > >> > return svld1rq_s32 (svptrue_b8 (), &v[0])
> > > >> > >> > into:
> > > >> > >> > return {1, 2, 3, 4, ...};
> > > >> > >> > I will try to bootstrap+test it on SVE machine to test further 
> > > >> > >> > for VLA folding.
> > > >> > >> >
> > > >> > >> > I have a couple of questions:
> > > >> > >> > 1] When mask selects elements from same vector but from 
> > > >> > >> > different patterns:
> > > >> > >> > For eg:
> > > >> > >> > arg0 = {1, 11, 2, 12, 3, 13, ...},
> > > >> > >> > arg1 = {21, 31, 22, 32, 23, 33, ...},
> > > >> > >> > mask = {0, 0, 0, 1, 0, 2, ... },
> > > >> > >> > All have npatterns = 2, nelts_per_pattern = 3.
> > > >> > >> >
> > > >> > >> > With above mask,
> > > >> > >> > Pattern {0, ...} selects arg0[0], ie {1, ...}
> > > >> > >> > Pattern {0, 1, 2, ...} selects arg0[0], arg0[1], arg0[2], ie 
> > > >> > >> > {1, 11, 2, ...}
> > > >> > >> > While arg0[0] and arg0[2] belong to same pattern, arg0[1] 
> > > >> > >> > belongs to different
> > > >> > >> > pattern in arg0.
> > > >> > >> > The result is:
> > > >> > >> > res = {1, 1, 1, 11, 1, 2, ...}
> > > >> > >> > In this case, res's 2nd pattern {1, 11, 2, ...} is encoded with:
> > > >> > >> > with a0 = 1, a1 = 11, S = -9.
> > > >> > >> > Is that expected tho ? It seems to create a new encoding which
> > > >> > >> > wasn't present in the input vector. For instance, the next elem 
> > > >> > >> > in
> > > >> > >> > sequence would be -7,
> > > >> > >> > which is not present originally in arg0.
> > > >> > >>
> > > >> > >> Yeah, you're right, sorry.  Going back to:
> > > >> > >>
> > > >> > >> (2) The explicit encoding can be used to produce a sequence of 
> > > >> > >> N*Ex*Px
> > > >> > >> elements for any integer N.  This extended sequence can be 
> > > >> > >> reencoded
> > > >> > >> as having N*Px patterns, with Ex staying the same.
> > > >> > >>
> > > >> > >> I guess we need to pick an N for the selector such that each new
> > > >> > >> selector pattern (each one out of the N*Px patterns) selects from
> > > >> > >> the *same pattern* of the same data input.
> > > >> > >>
> > > >> > >> So if a particular pattern in the selector has a step S, and the 
> > > >> > >> data
> > > >> > >> input it selects from has Pi patterns, N*S must be a multiple of 
> > > >> > >> Pi.
> > > >> > >> N must be a multiple of least_common_multiple(S,Pi)/S.
> > > >> > >>
> > > >> > >> I think that means that the total number of patterns in the result
> > > >> > >> (Pr from previous messages) can safely be:
> > > >> > >>
> > > >> > >>
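
As a rough illustration of the rule quoted above, that N must be a
multiple of least_common_multiple(S,Pi)/S, the smallest such N can be
computed as Pi / gcd(|S|, Pi).  This is only a sketch under stated
assumptions: the function names are hypothetical, S is treated as a plain
integer step rather than a poly_int, and a negative step is handled by
taking its absolute value.  It says nothing about how Pr itself is chosen.

```c
#include <assert.h>

/* Greatest common divisor; gcd_ul (0, b) is b.  */
unsigned long
gcd_ul (unsigned long a, unsigned long b)
{
  while (b != 0)
    {
      unsigned long r = a % b;
      a = b;
      b = r;
    }
  return a;
}

/* Smallest N >= 1 such that N * step is a multiple of npatterns_in,
   i.e. lcm (|step|, npatterns_in) / |step|, written so that step == 0
   needs no special case.  */
unsigned long
min_selector_multiplier (long step, unsigned long npatterns_in)
{
  unsigned long abs_step
    = step < 0 ? -(unsigned long) step : (unsigned long) step;
  assert (npatterns_in != 0);
  return npatterns_in / gcd_ul (abs_step, npatterns_in);
}
```

For the selector pattern {0, 1, 2, ...} from the earlier example (step 1)
and an input with 2 patterns, this gives N = 2; for a step of 2 it gives
N = 1, since such a pattern already stays within one input pattern.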

[PATCH (pushed)] contrib: revert removal of CR character

2023-01-17 Thread Martin Liška

contrib/ChangeLog:

* gcc-changelog/test_patches.txt: The CR character was removed
with ./contrib/update-copyright.py which I'm going to change.
---
 contrib/gcc-changelog/test_patches.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 2f1cd923a13..1d120f8e472 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3631,8 +3631,7 @@ index 000..d75da75
 +pub fn main ()
 +{
 +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
-+  //! doc cr
- comment
++  //! doc cr
 comment
 +}
 -- 
 2.38.1
-- 
2.39.0

[PATCH] contrib: ignore CR in update-copyright.py

2023-01-17 Thread Martin Liška

When opening files, preserve CR characters. By default, open
accepts universal newlines, but I think we should only split with '\n'.

Ready to be installed?
Thanks,
Martin

contrib/ChangeLog:

* update-copyright.py: Split lines only with '\n'.
---
 contrib/update-copyright.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/update-copyright.py b/contrib/update-copyright.py
index ac7a94743cf..bdb5417bcec 100755
--- a/contrib/update-copyright.py
+++ b/contrib/update-copyright.py
@@ -408,7 +408,7 @@ class Copyright:
 line_filter = filter.get_line_filter (dir, filename)
 mode = None
 encoding = self.guess_encoding(pathname)
-with open (pathname, 'r', encoding=encoding) as file:
+with open (pathname, 'r', encoding=encoding, newline='\n') as file:
 prev = None
 mode = os.fstat (file.fileno()).st_mode
 for line in file:
@@ -434,7 +434,7 @@ class Copyright:
 # If something changed, write the new file out.
 if changed and self.errors.ok():
 tmp_pathname = pathname + '.tmp'
-with open (tmp_pathname, 'w', encoding=encoding) as file:
+with open (tmp_pathname, 'w', encoding=encoding, newline='\n') as file:
 for line in lines:
 file.write (line)
 os.fchmod (file.fileno(), mode)
-- 
2.39.0

Re: [PATCH (pushed)] contrib: revert removal of CR character

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 12:56:32PM +0100, Martin Liška wrote:
> contrib/ChangeLog:
> 
>   * gcc-changelog/test_patches.txt: The CR character was removed
>   with ./contrib/update-copyright.py which I'm going to change.

We shouldn't be updating that file at all IMHO, shouldn't it be listed
among skip_files ?
> ---
>  contrib/gcc-changelog/test_patches.txt | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
> index 2f1cd923a13..1d120f8e472 100644
> --- a/contrib/gcc-changelog/test_patches.txt
> +++ b/contrib/gcc-changelog/test_patches.txt
> @@ -3631,8 +3631,7 @@ index 000..d75da75
>  +pub fn main ()
>  +{
>  +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> -+  //! doc cr
> - comment
> ++  //! doc cr
>  comment
>  +}
>  -- 
>  2.38.1
> -- 
> 2.39.0

Jakub

Re: [PATCH v3 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2023-01-17 Thread Christophe Lyon via Gcc-patches


Hi Jakub,

On 1/15/23 17:54, Christophe Lyon via Gcc-patches wrote:

Hi!


On 1/13/23 16:38, Jakub Jelinek wrote:
On Wed, Jan 11, 2023 at 03:18:06PM +0100, Christophe Lyon via 
Gcc-patches wrote:

While working on enabling DFP for AArch64, I noticed new failures in
gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
caused by DFP types handling. These tests are generated during 'make
check' and enabling DFP made generation different (not sure if new
non-DFP tests are generated, or if existing ones are generated
differently, the tests in question are huge and difficult to compare).

Anyway, I reduced the problem to what I attach at the end of the new
gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same
scheme as other va_arg* AArch64 tests.  Richard Sandiford further
reduced this to a non-vararg function, added as a second testcase.

This is a tough case mixing bit-fields and alignment, where
aarch64_function_arg_alignment did not follow what its descriptive
comment says: we want to use the natural alignment of the bit-field
type only if the user didn't reduce the alignment for the bit-field
itself.

The patch also adds a comment and assert that would help someone who
has to look at this area again.

The fix would be very small, except that this introduces a new ABI
break, and we have to warn about that.  Since this actually fixes a
problem introduced in GCC 9.1, we keep the old computation to detect
when we now behave differently.

This patch adds two new tests (va_arg-17.c and
pr105549.c). va_arg-17.c contains the reduced offending testcase from
struct-layout-1.exp for reference.  We update some tests introduced by
the previous patch, where parameters with bit-fields and packed
attribute now emit a different warning.


I'm seeing
+FAIL: g++.target/aarch64/bitfield-abi-warning-align16-O2.C 
scan-assembler-times and\\tw0, w1, 1 10
+FAIL: g++.target/aarch64/bitfield-abi-warning-align32-O2.C 
scan-assembler-times and\\tw0, w1, 1 10
+FAIL: g++.target/aarch64/bitfield-abi-warning-align8-O2.C 
scan-assembler-times and\\tw0, w0, 1 11
+FAIL: g++.target/aarch64/bitfield-abi-warning-align8-O2.C 
scan-assembler-times and\\tw0, w1, 1 18
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_128.c -march=armv8.2-a+sve 
(internal compiler error: in aarch64_layout_arg, at 
config/aarch64/aarch64.cc:7696)
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_128.c -march=armv8.2-a+sve 
(test for excess errors)
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_256.c -march=armv8.2-a+sve 
(internal compiler error: in aarch64_layout_arg, at 
config/aarch64/aarch64.cc:7696)
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_256.c -march=armv8.2-a+sve 
(test for excess errors)
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_512.c -march=armv8.2-a+sve 
(internal compiler error: in aarch64_layout_arg, at 
config/aarch64/aarch64.cc:7696)
+FAIL: gcc.target/aarch64/sve/pcs/struct_3_512.c -march=armv8.2-a+sve 
(test for excess errors)

regressions with this change.



Really deeply sorry for this :-(



aarch64.cc:7696 is for me the newly added:


+  gcc_assert (alignment <= 16 * BITS_PER_UNIT
+  && (!alignment || abi_break < alignment)
+  && (!abi_break_packed || alignment < abi_break_packed));


assert.
Details in
https://kojipkgs.fedoraproject.org//work/tasks/2857/96062857/build.log
(configure line etc.), plus if you
wget 
https://kojipkgs.fedoraproject.org//work/tasks/2857/96062857/build.log
sed -n '/^begin /,/^end/p' build.log | uuencode > you get a compressed 
tarball with the testsuite *.log files.


Thanks I managed to download this (you meant uudecode rather than 
uuencode ;-) )


I see the scan-assembler-times are also failing in gcc.target, I guess 
you just forgot to paste them?


From your other message, it seems you are building with stack-protector
enabled by default, but I can't see that in the configure lines?


Indeed I just checked the generated code with/without 
-fstack-protector-all, and it obviously changes a lot, thus breaking the 
fragile scan-assembler directives. As you said, it's easy to avoid with 
-fno-stack-protector.




As a follow-up to this, I ran the full testsuite with 
-fstack-protector-all and this results in lots of failures (~65000 in 
gcc.sum alone).


Since you also mentioned -fstack-protector-strong, I ran the full 
testsuite with it, which results in more failures too but the difference 
is much smaller than with -fstack-protector-all (from 126 FAIL to 309)


For instance, I see many failures with -fstack-protector-strong in:
gcc.target/aarch64/sve/pcs/
It looks like you have them too, according to the logs I downloaded from 
your link above.


So is it worth adding -fno-stack-protector to my few new testcases?
(I can, no problem, but just wondering why you appear to notice the 
problem with my new tests, and not with the ones in 
gcc.target/aarch64/sve/pcs/)


Thanks,

Christophe


I'll check the problem with the assert.

Thanks and sorry,

Christophe



Jakub

Re: [PATCH v3 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 01:43:35PM +0100, Christophe Lyon wrote:
> As a follow-up to this, I ran the full testsuite with -fstack-protector-all
> and this results in lots of failures (~65000 in gcc.sum alone).

I guess that is way too much.

> Since you also mentioned -fstack-protector-strong, I ran the full testsuite
> with it, which results in more failures too but the difference is much
> smaller than with -fstack-protector-all (from 126 FAIL to 309)

But this could be doable by adding explicit -fno-stack-protector options
to tests that can't handle those.
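
For illustration only, a test that opts out of the stack protector might look roughly like the sketch below. The function name and the expected instruction count are hypothetical and not taken from the patch under discussion; dg-additional-options and scan-assembler-times are the usual testsuite directives.

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* Keep stack-canary setup out of the generated code so that the
   scan-assembler pattern below does not depend on the distro default.  */
/* { dg-additional-options "-fno-stack-protector" } */

/* Hypothetical function: on aarch64 the second argument arrives in w1
   and the result is returned in w0.  */
int
low_bit (int unused, int x)
{
  (void) unused;
  return x & 1;
}

/* { dg-final { scan-assembler-times {and\tw0, w1, 1} 1 } } */
```

With -fstack-protector-all every function gets canary code, which is why fragile instruction counts like the one above are better guarded explicitly.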

> For instance, I see many failures with -fstack-protector-strong in:
> gcc.target/aarch64/sve/pcs/
> It looks like you have them too, according to the logs I downloaded from
> your link above.
> 
> So is it worth adding -fno-stack-protector to my few new testcases?
> (I can, no problem, but just wondering why you appear to notice the problem
> with my new tests, and not with the ones in gcc.target/aarch64/sve/pcs/)

Because I mainly look for regressions (compare the test_summary
dumps against older gcc build); if something fails for years, it doesn't
show up in the regression diffs.
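
As a rough sketch of that comparison (not the actual scripts used; the helper names are made up for illustration), a regression is a test that FAILs in the new run but did not FAIL in the old one, so long-standing failures drop out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if NAME appears among the N entries of LIST, else 0.  */
static int
contains (const char *const *list, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (list[i], name) == 0)
      return 1;
  return 0;
}

/* Count tests that FAIL in the new run but did not FAIL in the old run.
   Tests that fail in both runs are not counted.  */
size_t
count_regressions (const char *const *old_fails, size_t n_old,
		   const char *const *new_fails, size_t n_new)
{
  assert (old_fails != NULL || n_old == 0);
  assert (new_fails != NULL || n_new == 0);

  size_t count = 0;
  for (size_t i = 0; i < n_new; i++)
    if (!contains (old_fails, n_old, new_fails[i]))
      count++;
  return count;
}
```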

Jakub

Re: [PATCH v3 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2023-01-17 Thread Christophe Lyon via Gcc-patches





On 1/17/23 13:48, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 01:43:35PM +0100, Christophe Lyon wrote:
> > As a follow-up to this, I ran the full testsuite with -fstack-protector-all
> > and this results in lots of failures (~65000 in gcc.sum alone).
>
> I guess that is way too much.
>
> > Since you also mentioned -fstack-protector-strong, I ran the full testsuite
> > with it, which results in more failures too but the difference is much
> > smaller than with -fstack-protector-all (from 126 FAIL to 309)
>
> But this could be doable by adding explicit -fno-stack-protector options
> to tests that can't handle those.
>
> > For instance, I see many failures with -fstack-protector-strong in:
> > gcc.target/aarch64/sve/pcs/
> > It looks like you have them too, according to the logs I downloaded from
> > your link above.
> >
> > So is it worth adding -fno-stack-protector to my few new testcases?
> > (I can, no problem, but just wondering why you appear to notice the problem
> > with my new tests, and not with the ones in gcc.target/aarch64/sve/pcs/)
>
> Because I mainly look for regressions (compare the test_summary
> dumps against older gcc build); if something fails for years, it doesn't
> show up in the regression diffs.

OK that's what I thought, thanks for confirming.

I'll add -fno-stack-protector to my tests.

Thanks,

Christophe

> Jakub

Re: [aarch64] Use wzr/xzr for assigning vector element to 0

2023-01-17 Thread Richard Sandiford via Gcc-patches

Prathamesh Kulkarni  writes:
> Hi Richard,
> For the following (contrived) test:
>
> int32x4_t foo(int32x4_t v)
> {
>   v[3] = 0;
>   return v;
> }
>
> -O2 code-gen:
> foo:
> fmov s1, wzr
> ins v0.s[3], v1.s[0]
> ret
>
> I suppose we can instead emit the following code-gen ?
> foo:
>  ins v0.s[3], wzr
>  ret
>
> combine produces:
> Failed to match this instruction:
> (set (reg:V4SI 95 [ v ])
> (vec_merge:V4SI (const_vector:V4SI [
> (const_int 0 [0]) repeated x4
> ])
> (reg:V4SI 97)
> (const_int 8 [0x8])))
>
> So, I wrote the following pattern to match the above insn:
> (define_insn "aarch64_simd_vec_set_zero"
>   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> (vec_merge:VALL_F16
> (match_operand:VALL_F16 1 "const_dup0_operand" "w")
> (match_operand:VALL_F16 3 "register_operand" "0")
> (match_operand:SI 2 "immediate_operand" "i")))]
>   "TARGET_SIMD"
>   {
> int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> return "ins\\t%0.[%p2], wzr";
>   }
> )
>
> which now matches the above insn produced by combine.
> However, in reload dump, it creates a new insn for assigning
> register to (const_vector (const_int 0)),
> which results in:
> (insn 19 8 13 2 (set (reg:V4SI 33 v1 [99])
> (const_vector:V4SI [
> (const_int 0 [0]) repeated x4
> ])) "wzr-test.c":8:1 1269 {*aarch64_simd_movv4si}
>  (nil))
> (insn 13 19 14 2 (set (reg/i:V4SI 32 v0)
> (vec_merge:V4SI (reg:V4SI 33 v1 [99])
> (reg:V4SI 32 v0 [97])
> (const_int 8 [0x8]))) "wzr-test.c":8:1 1808
> {aarch64_simd_vec_set_zerov4si}
>  (nil))
>
> and eventually the code-gen:
> foo:
> movi v1.4s, 0
> ins v0.s[3], wzr
> ret
>
> To get rid of redundant assignment of 0 to v1, I tried to split the
> above pattern
> as in the attached patch. This works to emit code-gen:
> foo:
> ins v0.s[3], wzr
> ret
>
> However, I am not sure if this is the right approach. Could you suggest,
> if it'd be possible to get rid of UNSPEC_SETZERO in the patch ?

The problem is with the "w" constraint on operand 1, which tells LRA
to force the zero into an FPR.  It should work if you remove the
constraint.

Also, I think you'll need to use zr for the zero, so that
it uses xzr for 64-bit elements.

I think this and the existing patterns ought to test
exact_log2 (INTVAL (operands[2])) >= 0 in the insn condition,
since there's no guarantee that RTL optimisations won't form
vec_merges that have other masks.
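
To make the last point concrete, here is a small standalone sketch of the check. exact_log2_sketch is only a stand-in that mimics GCC's exact_log2 (log2 for a power of two, -1 otherwise), and single_lane_from_mask is a hypothetical helper, not something in the backend:

```c
#include <assert.h>

/* Stand-in for GCC's exact_log2: return log2 (X) if X is a power of
   two, otherwise -1.  */
static int
exact_log2_sketch (unsigned long long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;

  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Return the lane selected by a vec_merge MASK for a vector of NUNITS
   elements, or -1 when the mask does not select exactly one lane.
   An insn condition would reject the -1 case.  */
int
single_lane_from_mask (unsigned long long mask, int nunits)
{
  assert (nunits > 0);

  int lane = exact_log2_sketch (mask);
  if (lane < 0 || lane >= nunits)
    return -1;
  return lane;
}
```

A mask of 8 on a four-element vector selects lane 3, while a mask such as 6 selects two lanes and must not match a single-lane "ins" pattern.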

Thanks,
Richard

[PATCH (pushed)] Ignore test_patches.txt in update-copyright.py.

2023-01-17 Thread Martin Liška

contrib/ChangeLog:

* update-copyright.py: Ignore test_patches.txt.
---
 contrib/update-copyright.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/update-copyright.py b/contrib/update-copyright.py
index ac7a94743cf..f3691dc11cc 100755
--- a/contrib/update-copyright.py
+++ b/contrib/update-copyright.py
@@ -693,6 +693,7 @@ class ContribFilter(GenericFilter):
 'Info.plist',
 # Contains CR (^M).
 'repro_fail',
+'test_patches.txt',
 ])
 
 class GCCCopyright (Copyright):
-- 
2.39.0

Re: [PATCH (pushed)] Ignore test_patches.txt in update-copyright.py.

2023-01-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Jan 17, 2023 at 02:02:42PM +0100, Martin Liška wrote:
> contrib/ChangeLog:
> 
>   * update-copyright.py: Ignore test_patches.txt.

LGTM.

> diff --git a/contrib/update-copyright.py b/contrib/update-copyright.py
> index ac7a94743cf..f3691dc11cc 100755
> --- a/contrib/update-copyright.py
> +++ b/contrib/update-copyright.py
> @@ -693,6 +693,7 @@ class ContribFilter(GenericFilter):
>  'Info.plist',
>  # Contains CR (^M).
>  'repro_fail',
> +'test_patches.txt',
>  ])
>  
>  class GCCCopyright (Copyright):
> -- 
> 2.39.0

Jakub

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches

> The following fixes a long-standing bug with DSE removing stores as
> dead even though they are live across non-call exceptional flow.
> This affects both GIMPLE and RTL DSE and the fix is similar in
> making externally throwing statements uses of non-local stores.
> Note this doesn't fix the GIMPLE side when the throwing statement
> does not involve a load or a store because then the statement does
> not have virtual operands and thus is not visited by GIMPLE DSE.

Thanks for looking into this.
My main motivation for poking on this is the patch to add fnspec to
throw/catch machinery
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597124.html

The eh6.C testcase is misoptimized by current trunk with the patch.
I think I can adjust it for the throwing function to have no vops and it
will still get misoptimized by DSE.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> This doesn't seem to be a regression and I'm unsure as to how
> important it is for Ada (I consider it not important for C/C++),
> so for now I'll queue it for next stage1.

According to compiler explorer testcase:
struct a{int a,b,c,d,e;};
void
test(struct a * __restrict a, struct a *b)
{
  *a = (struct a){0,1,2,3,4};
  *a = *b;
}
Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
think it is a regression. (Not a very important one for C++, as
-fnon-call-exceptions is not very common for C++.)
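
Why the first store must be kept can be shown with a portable analogy. This is only a sketch: it uses setjmp/longjmp through a call in place of a real non-call exception, and all names are made up. The store that looks dead is observed by the code that control is transferred to:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf handler;
static int state = 1;

/* Stand-in for an operation that may raise: when DO_THROW is nonzero,
   control goes straight to the handler and never returns here.  */
static void
may_throw (int do_throw)
{
  if (do_throw)
    longjmp (handler, 1);
}

/* Same shape as the testcase: a store, something that may transfer
   control out of the function, then an overwriting store.  */
static void
update (int do_throw)
{
  state = 12345;		/* Looks dead, but is live if we leave below.  */
  may_throw (do_throw);
  assert (!do_throw);		/* Only reached when nothing was raised.  */
  state = 1;
}

/* Run update and report what the "handler" side sees in STATE.  */
int
observed_state (int do_throw)
{
  if (setjmp (handler) == 0)
    update (do_throw);
  return state;
}
```

When nothing is raised the handler side sees 1; when control leaves between the two stores it sees 12345, so deleting the first store would change behaviour.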

Honza
> 
>   PR middle-end/106075
>   * dse.cc (scan_insn): Consider externally throwing insns
>   to read from not frame based memory.
>   * tree-ssa-dse.cc (dse_classify_store): Consider externally
>   throwing uses to read from global memory.
> 
>   * gcc.dg/torture/pr106075-1.c: New testcase.
> ---
>  gcc/dse.cc|  5 
>  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 +++
>  gcc/tree-ssa-dse.cc   |  8 -
>  3 files changed, 48 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> 
> diff --git a/gcc/dse.cc b/gcc/dse.cc
> index a2db8d1cc32..7e258b81f66 100644
> --- a/gcc/dse.cc
> +++ b/gcc/dse.cc
> @@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, int 
> max_active_local_stores)
>return;
>  }
>  
> +  /* An externally throwing statement may read any memory that is not
> + relative to the frame.  */
> +  if (can_throw_external (insn))
> +add_non_frame_wild_read (bb_info);
> +
>/* Assuming that there are sets in these insns, we cannot delete
>   them.  */
>if ((GET_CODE (PATTERN (insn)) == CLOBBER)
> diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
> b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> new file mode 100644
> index 000..b9affbf1082
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target *-*-linux* } } */
> +/* { dg-additional-options "-fnon-call-exceptions" } */
> +
> +#include 
> +#include 
> +#include 
> +
> +int a = 1;
> +short *b;
> +void __attribute__((noipa))
> +test()
> +{
> +  a=12345;
> +  *b=0;
> +  a=1;
> +}
> +
> +void check (int i)
> +{
> +  if (a != 12345)
> +abort ();
> +  exit (0);
> +}
> +
> +int
> +main ()
> +{
> +  struct sigaction s;
> +  sigemptyset (&s.sa_mask);
> +  s.sa_handler = check;
> +  s.sa_flags = 0;
> +  sigaction (SIGSEGV, &s, NULL);
> +  test();
> +  abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
> index 46ab57d5754..b2e2359c3da 100644
> --- a/gcc/tree-ssa-dse.cc
> +++ b/gcc/tree-ssa-dse.cc
> @@ -960,6 +960,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>auto_bitmap visited;
>std::unique_ptr
>  dra (nullptr, free_data_ref);
> +  bool maybe_global = ref_may_alias_global_p (ref, false);
>  
>if (by_clobber_p)
>  *by_clobber_p = true;
> @@ -1038,6 +1039,11 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
> last_phi_def = as_a  (use_stmt);
>   }
>   }
> +   /* If the stmt can throw externally and the store is
> +  visible in the context unwound to the store is live.  */
> +   else if (maybe_global
> +&& stmt_can_throw_external (cfun, use_stmt))
> + return DSE_STORE_LIVE;
> /* If the statement is a use the store is not dead.  */
> else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
>   {
> @@ -1116,7 +1122,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>just pretend the stmt makes itself dead.  Otherwise fail.  */
>if (defs.is_empty ())
>   {
> -   if (ref_may_alias_global_p (ref, false))
> +   if (maybe_global)
>   return DSE_STORE_LIVE;
>  
> if (by_clobber_p)
> -- 
> 2.35.3

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Richard Biener via Gcc-patches

On Tue, 17 Jan 2023, Jan Hubicka wrote:

> > The following fixes a long-standing bug with DSE removing stores as
> > dead even though they are live across non-call exceptional flow.
> > This affects both GIMPLE and RTL DSE and the fix is similar in
> > making externally throwing statements uses of non-local stores.
> > Note this doesn't fix the GIMPLE side when the throwing statement
> > does not involve a load or a store because then the statement does
> > not have virtual operands and thus is not visited by GIMPLE DSE.
> 
> Thanks for looking into this.
> My main motivation for poking on this is the patch to add fnspec to
> throw/catch machinery
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597124.html
> 
> The eh6.C testcase is misoptimized by current trunk with the patch.
> I think I can adjust it for the throwing function to have no vops and it
> will still get misoptimized by DSE.

I think conceptually the "throw" may not be 'const' - it has to be
'pure' at most since the program point the control is transferred to
can inspect memory.

There should be no other changes necessary to DSE for cxa_throw to
be 'pure'.
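
For readers less familiar with the distinction, a minimal sketch using the GCC function attributes (the functions themselves are hypothetical): 'const' promises the result depends only on the arguments, while 'pure' still allows reading global memory, so calls cannot be moved across stores to that memory.

```c
#include <assert.h>

static int table[4] = { 10, 20, 30, 40 };

/* const: the result depends only on the argument; no memory is read.  */
__attribute__ ((const)) int
square (int x)
{
  return x * x;
}

/* pure: no side effects, but the result depends on global memory, so
   a store to TABLE between two calls may change what is returned.  */
__attribute__ ((pure)) int
lookup (int i)
{
  return table[i & 3];
}

/* Plain function that modifies the memory LOOKUP reads.  */
void
set_entry (int i, int v)
{
  assert (i >= 0 && i < 4);
  table[i] = v;
}
```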

> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > This doesn't seem to be a regression and I'm unsure as to how
> > important it is for Ada (I consider it not important for C/C++),
> > so for now I'll queue it for next stage1.
> 
> According to compiler explorer testcase:
> struct a{int a,b,c,d,e;};
> void
> test(struct a * __restrict a, struct a *b)
> {
>   *a = (struct a){0,1,2,3,4};
>   *a = *b;
> }
> Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
> think it is a regression. (Not a very important one for C++, as
> -fnon-call-exceptions is not very common for C++.)

Ah, yes - RTL DSE probably is too weak for this and GIMPLE DSE
didn't handle aggregates well at some point.

Richard.

> 
> Honza
> > 
> > PR middle-end/106075
> > * dse.cc (scan_insn): Consider externally throwing insns
> > to read from not frame based memory.
> > * tree-ssa-dse.cc (dse_classify_store): Consider externally
> > throwing uses to read from global memory.
> > 
> > * gcc.dg/torture/pr106075-1.c: New testcase.
> > ---
> >  gcc/dse.cc|  5 
> >  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 +++
> >  gcc/tree-ssa-dse.cc   |  8 -
> >  3 files changed, 48 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > 
> > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > index a2db8d1cc32..7e258b81f66 100644
> > --- a/gcc/dse.cc
> > +++ b/gcc/dse.cc
> > @@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, int 
> > max_active_local_stores)
> >return;
> >  }
> >  
> > +  /* An externally throwing statement may read any memory that is not
> > + relative to the frame.  */
> > +  if (can_throw_external (insn))
> > +add_non_frame_wild_read (bb_info);
> > +
> >/* Assuming that there are sets in these insns, we cannot delete
> >   them.  */
> >if ((GET_CODE (PATTERN (insn)) == CLOBBER)
> > diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
> > b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > new file mode 100644
> > index 000..b9affbf1082
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > @@ -0,0 +1,36 @@
> > +/* { dg-do run { target *-*-linux* } } */
> > +/* { dg-additional-options "-fnon-call-exceptions" } */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +
> > +int a = 1;
> > +short *b;
> > +void __attribute__((noipa))
> > +test()
> > +{
> > +  a=12345;
> > +  *b=0;
> > +  a=1;
> > +}
> > +
> > +void check (int i)
> > +{
> > +  if (a != 12345)
> > +abort ();
> > +  exit (0);
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  struct sigaction s;
> > +  sigemptyset (&s.sa_mask);
> > +  s.sa_handler = check;
> > +  s.sa_flags = 0;
> > +  sigaction (SIGSEGV, &s, NULL);
> > +  test();
> > +  abort ();
> > +  return 0;
> > +}
> > diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
> > index 46ab57d5754..b2e2359c3da 100644
> > --- a/gcc/tree-ssa-dse.cc
> > +++ b/gcc/tree-ssa-dse.cc
> > @@ -960,6 +960,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
> >auto_bitmap visited;
> >std::unique_ptr
> >  dra (nullptr, free_data_ref);
> > +  bool maybe_global = ref_may_alias_global_p (ref, false);
> >  
> >if (by_clobber_p)
> >  *by_clobber_p = true;
> > @@ -1038,6 +1039,11 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
> >   last_phi_def = as_a  (use_stmt);
> > }
> > }
> > + /* If the stmt can throw externally and the store is
> > +visible in the context unwound to the store is live.  */
> > + else if (maybe_global
> > +  && stmt_can_throw_external (cfun, use_stmt))
> > +   return DSE_STORE_LIVE;
> >   /* If the statement is a use the st

Re: [PATCH,WWWDOCS] htdocs: add an Atom feed for GCC news

2023-01-17 Thread Arsen Arsenović via Gcc-patches

Hi,

"Jose E. Marchesi via Gcc-patches"  writes:

> Yeah I guess it all depends on how much the news section is used.
>
> I personally think that it would be beneficial for the different GCC
> projects (front-ends, back-ends, etc) to be a little more vocal, public
> wise.  Releasing news items more often may help with that.
>
> Of course one could argue that making it easier to add news to the
> system (without having to manually rotate the .html file, add to the
> feed if desired, etc) would help with that.  And probably would be right
> :D

Should that happen, it should be quite trivial to put together (for
instance) a Python+Jinja2 or Perl script to generate both forms from a
common source.  That should be very little engineering ;).

In addition, such a program would make it easy to also provide RSS feeds
besides the Atom ones, which some might benefit from (IIRC, Gnus doesn't
ship Atom support yet, for instance).

I am in favor of putting more words out there, news tends to grab
attention, which could benefit both users and the project.

>>> I strongly suggest to not overengineer here [and nowhere else :)]
>>
>> I am tempted to agree (even if the engineer in me would prefer to avoid 
>> duplication). Jose, might you be willing to help others create Atom feed
>> entries?
>
> Sure.  It is as easy as adding one of these things to the .xml file:
>
> 
>   Rhhw Friday 16 March 2018 - Sunday 18 March 2018 @
>   Frankfurt am Main
>   http://jemarch.net/rhhw.html#16march2018
>   
> The Rabbit Herd will be meeting the weekend from 16 March to
> 18 March.
>   
>   Mon, 12 March 2018 15:00:00 CET
> 
>
> To be sure nothing breaks we may run a XML validator on the server to
> reject pushes that break the .xml file.  There must be an XML schema for
> XML Atom feeds somewhere..

The W3C provides a validator for feeds too:
https://validator.w3.org/feed/ and indeed, there's a schema, see RFC4287
appendix B.

>> What do others think?
>>
>> Gerald

Hope that helps!
-- 
Arsen Arsenović


[committed] libstdc++: Fix configuration of default zoneinfo dir on linux

2023-01-17 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

The config for --with-libstdcxx-zoneinfo=yes was comparing the target
triplet to "gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu" which is only
the last component of the triplet, so failed to match and always used
the zoneinfo_dir=none default. Check $target_os instead.
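
The mismatch can be reproduced outside configure. The sketch below is an illustration only: it uses POSIX fnmatch as a stand-in for shell "case" matching, and the function name is made up. It applies the same four patterns to a full triplet and to its OS component:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if VALUE matches any pattern of the
   "gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu" case arm, else 0.  */
int
matches_gnu_like (const char *value)
{
  static const char *const patterns[] = {
    "gnu*", "linux*", "kfreebsd*-gnu", "knetbsd*-gnu"
  };

  assert (value != NULL);

  for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; i++)
    if (fnmatch (patterns[i], value, 0) == 0)
      return 1;
  return 0;
}
```

A full triplet such as "x86_64-pc-linux-gnu" matches none of the patterns because they are anchored at the start of the string, whereas an OS component such as "linux-gnu" matches "linux*".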

There was also an error in the check for native builds that tzdata.zi is
actually present in the configured directory. That meant a warning was
printed even when the file was present:

configure: zoneinfo data directory: /usr/share/zoneinfo
configure: WARNING: "/usr/share/zoneinfo does not contain tzdata.zi file"
configure: static tzdata.zi file will be compiled into the library

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ZONEINFO_DIR): Check $target_os instead
of $host. Fix check for file being present during native build.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 8 
 libstdc++-v3/configure| 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index b1608ae9237..982e979a840 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5180,17 +5180,17 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
 
   if test "x${with_libstdcxx_zoneinfo}" = xyes; then
 # Pick a default when no specific path is set.
-case "$host" in
+case "$target_os" in
   gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
# Not all distros ship tzdata.zi in this dir.
zoneinfo_dir="/usr/share/zoneinfo"
;;
-  *-*-aix*)
+  aix*)
# Binary tzfile files are in /usr/share/lib/zoneinfo
# but tzdata.zi is not present there.
zoneinfo_dir=none
;;
-  *-*-darwin2*)
+  darwin2*)
# Binary tzfile files are in /usr/share/lib/zoneinfo.default
# but tzdata.zi is not present there.
zoneinfo_dir=none
@@ -5230,7 +5230,7 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
   if test "x${zoneinfo_dir}" != xnone; then
 AC_DEFINE_UNQUOTED(_GLIBCXX_ZONEINFO_DIR, "${zoneinfo_dir}",
   [Define if a directory should be searched for tzdata files.])
-if $GLIBCXX_IS_NATIVE -a ! test -f "$zoneinfo_dir/tzdata.zi"; then
+if $GLIBCXX_IS_NATIVE && ! test -f "$zoneinfo_dir/tzdata.zi"; then
   AC_MSG_WARN("$zoneinfo_dir does not contain tzdata.zi file")
 fi
   fi
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 9c0f3a3e7c9..a298cbd45a0 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -71536,17 +71536,17 @@ fi
 
   if test "x${with_libstdcxx_zoneinfo}" = xyes; then
 # Pick a default when no specific path is set.
-case "$host" in
+case "$target_os" in
   gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
# Not all distros ship tzdata.zi in this dir.
zoneinfo_dir="/usr/share/zoneinfo"
;;
-  *-*-aix*)
+  aix*)
# Binary tzfile files are in /usr/share/lib/zoneinfo
# but tzdata.zi is not present there.
zoneinfo_dir=none
;;
-  *-*-darwin2*)
+  darwin2*)
# Binary tzfile files are in /usr/share/lib/zoneinfo.default
# but tzdata.zi is not present there.
zoneinfo_dir=none
@@ -71590,7 +71590,7 @@ cat >>confdefs.h <<_ACEOF
 #define _GLIBCXX_ZONEINFO_DIR "${zoneinfo_dir}"
 _ACEOF
 
-if $GLIBCXX_IS_NATIVE -a ! test -f "$zoneinfo_dir/tzdata.zi"; then
+if $GLIBCXX_IS_NATIVE && ! test -f "$zoneinfo_dir/tzdata.zi"; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: \"$zoneinfo_dir does 
not contain tzdata.zi file\"" >&5
 $as_echo "$as_me: WARNING: \"$zoneinfo_dir does not contain tzdata.zi file\"" 
>&2;}
 fi
-- 
2.39.0

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches

> On Tue, 17 Jan 2023, Jan Hubicka wrote:
> 
> > > The following fixes a long-standing bug with DSE removing stores as
> > > dead even though they are live across non-call exceptional flow.
> > > This affects both GIMPLE and RTL DSE and the fix is similar in
> > > making externally throwing statements uses of non-local stores.
> > > Note this doesn't fix the GIMPLE side when the throwing statement
> > > does not involve a load or a store because then the statement does
> > > not have virtual operands and thus is not visited by GIMPLE DSE.
> > 
> > Thanks for looking into this.
> > My main motivation for poking on this is the patch to add fnspec to
> > throw/catch machinery
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597124.html
> > 
> > The eh6.C testcase is misoptimized by current trunk with the patch.
> > I think I can adjust it for the throwing function to have no vops and it
> > will still get misoptimized by DSE.
> 
> I think conceptually the "throw" may not be 'const' - it has to be
> 'pure' at most since the program point the control is transferred to
> can inspect memory.

We don't use the same argument for other control flow statements.
The following:

fn()
{
  try {
    i_read_no_global_memory ();
  } catch (...)
  {
    return 1;
  }
  return 0;
}

should be detected as const.  Marking throw pure would make fn pure too.

With non-call exceptions a=b/c can also transfer control to a place that
inspects memory.  We may want all statements with can_throw_external to
have a VUSE on them (as with return) since they may cause the function to
return, but I think fnspec is the wrong place to model this.
> > According to compiler explorer testcase:
> > struct a{int a,b,c,d,e;};
> > void
> > test(struct a * __restrict a, struct a *b)
> > {
> >   *a = (struct a){0,1,2,3,4};
> >   *a = *b;
> > }
> > Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
> > think it is a regression. (Not a very important one for C++, as
> > -fnon-call-exceptions is not very common for C++.)
> 
> Ah, yes - RTL DSE probably is too weak for this and GIMPLE DSE
> didn't handle aggregates well at some point.

Yep, we never really handled it correctly, but we were weaker at optimizing
and thus also at producing wrong code :)

Honza
> 
> Richard.
> 
> > 
> > Honza
> > > 
> > >   PR middle-end/106075
> > >   * dse.cc (scan_insn): Consider externally throwing insns
> > >   to read from not frame based memory.
> > >   * tree-ssa-dse.cc (dse_classify_store): Consider externally
> > >   throwing uses to read from global memory.
> > > 
> > >   * gcc.dg/torture/pr106075-1.c: New testcase.
> > > ---
> > >  gcc/dse.cc|  5 
> > >  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 +++
> > >  gcc/tree-ssa-dse.cc   |  8 -
> > >  3 files changed, 48 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > 
> > > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > > index a2db8d1cc32..7e258b81f66 100644
> > > --- a/gcc/dse.cc
> > > +++ b/gcc/dse.cc
> > > @@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, int 
> > > max_active_local_stores)
> > >return;
> > >  }
> > >  
> > > +  /* An externally throwing statement may read any memory that is not
> > > + relative to the frame.  */
> > > +  if (can_throw_external (insn))
> > > +add_non_frame_wild_read (bb_info);
> > > +
> > >/* Assuming that there are sets in these insns, we cannot delete
> > >   them.  */
> > >if ((GET_CODE (PATTERN (insn)) == CLOBBER)
> > > diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
> > > b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > new file mode 100644
> > > index 000..b9affbf1082
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > @@ -0,0 +1,36 @@
> > > +/* { dg-do run { target *-*-linux* } } */
> > > +/* { dg-additional-options "-fnon-call-exceptions" } */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +int a = 1;
> > > +short *b;
> > > +void __attribute__((noipa))
> > > +test()
> > > +{
> > > +  a=12345;
> > > +  *b=0;
> > > +  a=1;
> > > +}
> > > +
> > > +void check (int i)
> > > +{
> > > +  if (a != 12345)
> > > +abort ();
> > > +  exit (0);
> > > +}
> > > +
> > > +int
> > > +main ()
> > > +{
> > > +  struct sigaction s;
> > > +  sigemptyset (&s.sa_mask);
> > > +  s.sa_handler = check;
> > > +  s.sa_flags = 0;
> > > +  sigaction (SIGSEGV, &s, NULL);
> > > +  test();
> > > +  abort ();
> > > +  return 0;
> > > +}
> > > diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
> > > index 46ab57d5754..b2e2359c3da 100644
> > > --- a/gcc/tree-ssa-dse.cc
> > > +++ b/gcc/tree-ssa-dse.cc
> > > @@ -960,6 +960,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
> > >auto_bitmap visited;
> > >std::unique_ptr
> > >  dra (nullptr, free_data_ref);
> > > +  bool maybe_global = ref_may_alias_global_p (ref, false);
> > >  
> > >if (by_clobb

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Richard Biener via Gcc-patches

On Tue, 17 Jan 2023, Jan Hubicka wrote:

> > On Tue, 17 Jan 2023, Jan Hubicka wrote:
> > 
> > > > The following fixes a long-standing bug with DSE removing stores as
> > > > dead even though they are live across non-call exceptional flow.
> > > > This affects both GIMPLE and RTL DSE and the fix is similar in
> > > > making externally throwing statements uses of non-local stores.
> > > > Note this doesn't fix the GIMPLE side when the throwing statement
> > > > does not involve a load or a store because then the statement does
> > > > not have virtual operands and thus is not visited by GIMPLE DSE.
> > > 
> > > Thanks for looking into this.
> > > My main motivation for poking on this is the patch to add fnspec to
> > > throw/catch machinery
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597124.html
> > > 
> > > The eh6.C testcase is misoptimized by current trunk with the patch.
> > > I think I can adjust it for the throwing function to have no vops and it
> > > will still get misoptimized by DSE.
> > 
> > I think conceptually the "throw" may not be 'const' - it has to be
> > 'pure' at most since the program point the control is transferred to
> > can inspect memory.
> 
> We don't use the same argument for other control flow statements.
> The following:
> 
> fn()
> {
>   try {
>     i_read_no_global_memory ();
>   } catch (...)
>   {
>     return 1;
>   }
>   return 0;
> }
> 
> should be detected as const.  Marking throw pure would make fn pure too.

I suppose i_read_no_global_memory is const here.  Not sure why that
should make it pure?  Only something that throws externally (not caught
in fn) should force it to be pure, no?

Of course for IPA purposes whether 'fn' is to be considered const
or pure depends on whether its exceptions are caught in the context
where that's interesting - that is, whether the EH side-effect is
explicitly or implicitly modeled.

> With non-call exceptions a=b/c can also transfer control to a place that
> inspects memory.  We may want all statements with can_throw_external to
> have a VUSE on them (as with return) since they may cause the function to
> return, but I think fnspec is the wrong place to model this.

Yes, I think all control transfer instructions need a VUSE.

Richard.

> > > According to compiler explorer testcase:
> > > struct a{int a,b,c,d,e;};
> > > void
> > > test(struct a * __restrict a, struct a *b)
> > > {
> > >   *a = (struct a){0,1,2,3,4};
> > >   *a = *b;
> > > }
> > > Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
> > > think it is a regression. (Not a very important one for C++, as
> > > -fnon-call-exceptions is not very common for C++.)
> > 
> > Ah, yes - RTL DSE probably is too weak for this and GIMPLE DSE
> > didn't handle aggregates well at some point.
> 
> Yep, we never really handled it correctly, but we were weaker at optimizing
> and thus also at producing wrong code :)
> 
> Honza
> > 
> > Richard.
> > 
> > > 
> > > Honza
> > > > 
> > > > PR middle-end/106075
> > > > * dse.cc (scan_insn): Consider externally throwing insns
> > > > to read from not frame based memory.
> > > > * tree-ssa-dse.cc (dse_classify_store): Consider externally
> > > > throwing uses to read from global memory.
> > > > 
> > > > * gcc.dg/torture/pr106075-1.c: New testcase.
> > > > ---
> > > >  gcc/dse.cc|  5 
> > > >  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 +++
> > > >  gcc/tree-ssa-dse.cc   |  8 -
> > > >  3 files changed, 48 insertions(+), 1 deletion(-)
> > > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > 
> > > > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > > > index a2db8d1cc32..7e258b81f66 100644
> > > > --- a/gcc/dse.cc
> > > > +++ b/gcc/dse.cc
> > > > @@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, 
> > > > int max_active_local_stores)
> > > >return;
> > > >  }
> > > >  
> > > > +  /* An externally throwing statement may read any memory that is not
> > > > + relative to the frame.  */
> > > > +  if (can_throw_external (insn))
> > > > +add_non_frame_wild_read (bb_info);
> > > > +
> > > >/* Assuming that there are sets in these insns, we cannot delete
> > > >   them.  */
> > > >if ((GET_CODE (PATTERN (insn)) == CLOBBER)
> > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
> > > > b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > new file mode 100644
> > > > index 000..b9affbf1082
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > @@ -0,0 +1,36 @@
> > > > +/* { dg-do run { target *-*-linux* } } */
> > > > +/* { dg-additional-options "-fnon-call-exceptions" } */
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +int a = 1;
> > > > +short *b;
> > > > +void __attribute__((noipa))
> > > > +test()
> > > > +{
> > > > +  a=12345;
> > > > +  *b=0;
> > > > +  a=1;
> > > > +}
> > > > +

Re: [PATCH v5] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-17 Thread Richard Sandiford via Gcc-patches

Lulu Cheng  writes:
> Co-authored-by: Yang Yujie 
>
> gcc/ChangeLog:
>
>   * config/loongarch/loongarch.cc (loongarch_classify_address):
>   Add processing for CONST_INT.
>   (loongarch_print_operand_reloc): Operand modifier 'c' is supported.
>   (loongarch_print_operand): Increase the processing of '%c'.
>   * doc/extend.texi: Adds documents for LoongArch operand modifiers.
>   And port the public operand modifiers information to this document.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/loongarch/tst-asm-const.c: Moved to...
>   * gcc.target/loongarch/pr107731.c: ...here.
>
> ---
> V2 -> v3:
> 1. Correct a clerical error.
> 2. Adding document for loongarch operand modifiers.
>
> v3 -> v4:
> Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.
>
> v4 -> v5:
> Move the operand modifiers description of "%c", "%n", "%a", "%l" to the top of the
> x86Operandmodifiers section.
>
> ---
>  gcc/config/loongarch/loongarch.cc |  14 ++
>  gcc/doc/extend.texi   | 135 --
>  .../loongarch/{tst-asm-const.c => pr107731.c} |   6 +-
>  3 files changed, 106 insertions(+), 49 deletions(-)
>  rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} 
> (78%)
>
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index c6b03fcf2f9..cdf190b985e 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
> loongarch_address_info *info, rtx x,
>return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
> && loongarch_valid_lo_sum_p (info->symbol_type, mode,
>  info->offset));
> +case CONST_INT:
> +  /* Small-integer addresses don't occur very often, but they
> +  are legitimate if $r0 is a valid base register.  */
> +  info->type = ADDRESS_CONST_INT;
> +  return IMM12_OPERAND (INTVAL (x));
>  
>  default:
>return false;
> @@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
> hi64_part,
>  
> 'A'   Print a _DB suffix if the memory model requires a release.
> 'b'   Print the address of a memory operand, without offset.
> +   'c'  Print an integer.
> 'C'   Print the integer branch condition for comparison OP.
> 'd'   Print CONST_INT OP in decimal.
> 'F'   Print the FPU branch condition for comparison OP.
> @@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int 
> letter)
> fputs ("_db", file);
>break;
>  
> +case 'c':
> +  if (CONST_INT_P (op))
> + fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
> +  else
> + output_operand_lossage ("unsupported operand for code '%c'", letter);
> +
> +  break;
> +
>  case 'C':
>loongarch_print_int_branch_condition (file, code, letter);
>break;
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1103e9936f7..256811cb8f5 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -10402,8 +10402,10 @@ ensures that modifying @var{a} does not affect the 
> address referenced by
>  is undefined if @var{a} is modified before using @var{b}.
>  
>  @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
> -instead of simply @samp{%2}). Typically these qualifiers are hardware 
> -dependent. The list of supported modifiers for x86 is found at 
> +instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
> +Generic Operand modifiers} lists the modifiers that are available
> +on all targets.  Other modifiers are hardware dependent.
> +For example, the list of supported modifiers for x86 is found at
>  @ref{x86Operandmodifiers,x86 Operand modifiers}.
>  
>  If the C code that follows the @code{asm} makes no use of any of the output 
> @@ -10671,8 +10673,10 @@ optimizers may discard the @code{asm} statement as 
> unneeded
>  (see @ref{Volatile}).
>  
>  @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
> -instead of simply @samp{%2}). Typically these qualifiers are hardware 
> -dependent. The list of supported modifiers for x86 is found at 
> +instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
> +Generic Operand modifiers} lists the modifiers that are available
> +on all targets.  Other modifiers are hardware dependent.
> +For example, the list of supported modifiers for x86 is found at
>  @ref{x86Operandmodifiers,x86 Operand modifiers}.
>  
>  In this example using the fictitious @code{combine} instruction, the 
> @@ -11024,9 +11028,8 @@ lab:
>  @}
>  @end example
>  
> -@anchor{x86Operandmodifiers}
> -@subsubsection x86 Operand Modifiers
> -
> +@anchor{GenericOperandmodifiers}
> +@subsubsection Generic Operand Modifiers
>  References to input, output, and goto operands in the assembler template
>  of extended @code{asm} statements can use 
>

[RFC] tree-optimization: fix optimize-out variables passed into func to alloc

2023-01-17 Thread Alexey Lapshin via Gcc-patches

After updating to a GCC newer than 11.4.0, we found that some code started
to fail if it was built with size optimization (-Os).
You can find a test case for reproduction in the attached patch.

A simplified version of the affected code looks like this:

void alloc_function (unsigned char **data_p) {
  *data_p = malloc (8);
  assert(*data_p != NULL);
}
int main () {
  int *data;
  alloc_function (&data);
  printf ("data pointer is %p", data); // prints NULL(compile with -Os)
}

If the type of the passed argument matches the type in the alloc_function
declaration, it works perfectly. Changing one or both types to void also
helps.

I found that the issue first appears with commit
d119f34c952f8718fdbabc63e2f369a16e92fa07.
I located the if-statement that leads to this issue; with it removed,
the code seems to work well.

Could you please elaborate on exactly which cases this check should
optimize?
I think it should also contain at least one more check for write
accesses to the variable's memory.


---
 gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c | 17 +
 gcc/tree-ssa-alias.cc |  2 --
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
new file mode 100644
index 000..b30c1cedcb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+
+#define assert(x) if (!(x)) __builtin_abort ()
+
+static inline void alloc_function (unsigned char **data_p)
+{
+*data_p = (unsigned char *) __builtin_malloc (10);
+assert (*data_p != (void *)0);
+}
+
+int main ()
+{
+int *data = (void *)0;
+alloc_function ((unsigned char **) &data);
+assert (data != (void *)0);
+}
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index b8f107dfa52..9068db300e5 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -2608,8 +2608,6 @@ modref_may_conflict (const gcall *stmt,
  if (num_tests >= max_tests)
return true;
  alias_stats.modref_tests++;
- if (!alias_sets_conflict_p (base_set, base_node->base))
-   continue;
  num_tests++;
}
 
-- 
2.34.1

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches

> > We don't use same argumentation about other control flow statements.
> > The following:
> > 
> > fn()
> > {
> >   try {
> > i_read_no_global_memory ();
> >   } catch (...)
> >   {
> > return 1;
> >   }
> >   return 0;
> > }
> > 
> > should be detected as const.  Marking throw pure would make fn pure too.
> 
> I suppose i_read_no_global_memory is const here.  Not sure why that
Suppose we have:

void
i_read_no_global_memory ()
{
  throw(0);
}

If cxa_throw itself was annotated as 'p' rather than 'c', ipa-modref will
believe that cxa_throw may read any global memory and will propagate that
to all callers. So fn() will also be marked as reading all global
memory.

> should make it pure?  Only if anything throws externally (not caught
> in fn) should force it to be pure, no?
> 
> Of course for IPA purposes whether 'fn' is to be considered const
> or pure depends on whether its exceptions are caught in the context
> where that's interesting - that is, whether the EH side-effect is
> explicitly or implicitly modeled.

We have two things here: const/pure attributes and 'c'/'p' fnspec
specifiers.  const/pure implies that a function call can be removed when
its result is not needed. This is not the case for functions calling
throw() (we have -fdelete-dead-exceptions for non-call exceptions and
those would be OK).  However, 'c'/'p' is about memory side effects only
and it is safe for i_read_no_global_memory.

With the C++ FE change adding fnspec to EH handling, modref will detect
both i_read_no_global_memory and fn() as 'c'. It won't infer the const
attribute; that is something I can implement later.
We are very poor at detecting scenarios where all exceptions thrown are
actually caught. Fixing that has long been on my TODO list, so next
stage1 is probably the time to look into it.

> 
> > With non-call exceptions, a=b/c can also transfer control to a place that
> > inspects memory.  We may want all statements with can_throw_external to
> > have a VUSE on them (like with return) since they may cause the function
> > to return, but I think fnspec is the wrong place to model this.
> 
> Yes, I think all control transfer instructions need a VUSE.

I think it is the right way to go.  So operands_scanner::parse_ssa_operands
can add a vuse to anything that can_throw_external_p (like it does for
GIMPLE_RETURN), and passes like DSE can test for it and understand that
on the EH path the globally accessible memory is live and thus "used" by
the statement.

I can try to cook up a patch.

Thanks,
Honza
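
The const/pure attributes mentioned above (as opposed to the internal 'c'/'p'
fnspec specifiers) can be illustrated with the GNU C function attributes. This
is only a minimal sketch, assuming GCC or a compatible compiler; the names
square, scaled, scale and demo are made up for illustration and do not come
from the thread.

```c
#include <assert.h>

int scale = 3;   /* global state read by the "pure" function below */

/* const: the result depends only on the arguments and no global
   memory is read, so unused calls may be removed.  */
__attribute__ ((const)) int square (int x)
{
  return x * x;
}

/* pure: may read global memory but has no other side effects.  */
__attribute__ ((pure)) int scaled (int x)
{
  return x * scale;
}

/* Hypothetical driver: the store to 'scale' must stay ahead of the
   call to scaled (), because a pure function may read it.  */
int demo (void)
{
  int a = square (4);   /* 16 */
  scale = 5;
  int b = scaled (2);   /* 10 */
  return a + b;
}
```

A store that feeds a pure call cannot be treated as dead, while a const call
places no such constraint on surrounding stores.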
> 
> Richard.
> 
> > > > According to compiler explorer testcase:
> > > > struct a{int a,b,c,d,e;};
> > > > void
> > > > test(struct a * __restrict a, struct a *b)
> > > > {
> > > >   *a = (struct a){0,1,2,3,4};
> > > >   *a = *b;
> > > > }
> > > > Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
> > > > think it is a regression. (For C++ not very important one as
> > > > -fnon-call-exceptions is not very common for C++)
> > > 
> > > Ah, yes - RTL DSE probably is too weak for this and GIMPLE DSE
> > > didn't handle aggregates well at some point.
> > 
> > Yep, we never handled it really correctly but were weaker on optimizing
> > and thus also producing wrong code :)
> > 
> > Honza
> > > 
> > > Richard.
> > > 
> > > > 
> > > > Honza
> > > > > 
> > > > >   PR middle-end/106075
> > > > >   * dse.cc (scan_insn): Consider externally throwing insns
> > > > >   to read from not frame based memory.
> > > > >   * tree-ssa-dse.cc (dse_classify_store): Consider externally
> > > > >   throwing uses to read from global memory.
> > > > > 
> > > > >   * gcc.dg/torture/pr106075-1.c: New testcase.
> > > > > ---
> > > > >  gcc/dse.cc|  5 
> > > > >  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 
> > > > > +++
> > > > >  gcc/tree-ssa-dse.cc   |  8 -
> > > > >  3 files changed, 48 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > > 
> > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > > > > index a2db8d1cc32..7e258b81f66 100644
> > > > > --- a/gcc/dse.cc
> > > > > +++ b/gcc/dse.cc
> > > > > @@ -2633,6 +2633,11 @@ scan_insn (bb_info_t bb_info, rtx_insn *insn, 
> > > > > int max_active_local_stores)
> > > > >return;
> > > > >  }
> > > > >  
> > > > > +  /* An externally throwing statement may read any memory that is not
> > > > > + relative to the frame.  */
> > > > > +  if (can_throw_external (insn))
> > > > > +add_non_frame_wild_read (bb_info);
> > > > > +
> > > > >/* Assuming that there are sets in these insns, we cannot delete
> > > > >   them.  */
> > > > >if ((GET_CODE (PATTERN (insn)) == CLOBBER)
> > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr106075-1.c 
> > > > > b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > > new file mode 100644
> > > > > index 000..b9affbf1082
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > >

Re: [PATCH] contrib: Partial fix for failed update-copyright --this year [PR108413]

2023-01-17 Thread Gaius Mulley via Gcc-patches

Jakub Jelinek  writes:

> Hi!
>
> As mentioned on IRC or in PR108413, the last update-copyright.py --this year
> failed and that is why we are in a strange state where some copyrights have
> been updated and others have not.
> The full list of errors I got was I think:
> gcc/m2/mc-boot/GmcOptions.c: unrecognised copyright: comment (f, (const char 
> *) "Copyright (C) ''2021'' Free Software Foundation, Inc.", 53);
> gcc/m2/mc-boot/GmcOptions.c: unrecognised copyright: comment (f, (const char 
> *) "Copyright (C) ''2021'' Free Software Foundation, Inc.", 53);
> gcc/testsuite/gm2/switches/pedantic-params/pass/Strings.mod: unrecognised 
> copyright holder: Faculty of Information Technology,
> gcc/testsuite/gm2/switches/pedantic-params/pass/Strings2.mod: unrecognised 
> copyright holder: Faculty of Information Technology,
> libphobos/libdruntime/__builtins.di: unrecognised copyright: * Copyright: 
> Copyright Digital Mars 2022
> libstdc++-v3/src/c++17/fast_float/fast_float.h: unrecognised copyright 
> holder: The fast_float authors
> libstdc++-v3/include/c_compatibility/stdatomic.h: unrecognised copyright 
> holder: The GCC developers
>
> The following patch deals with the gcc/testsuite/gm2 ones and
> with the fast_float.h one, ok for trunk?
>
> Not really sure what we should do in the GmcOptions.c case
> (perhaps obfuscate it in the source somehow by splitting
> the string literals into different substrings
> Perhaps "Copy" "right (" "C) ''..." would do it?  Or do we want
> to bump there each year (manually or by the script)?
> E.g. in gcc.cc we have
>   printf ("Copyright %s 2023 Free Software Foundation, Inc.\n",
>   _("(C)"));
> which also prints (C) nicer in Unicode if possible and is updated
> by hand each year.
>

Hi,

I've git pushed some fixes for gcc/m2/mc/mcOptions.mod to obfuscate the
copyright text.  The change to mcOptions.mod also includes the removal
of the 'YEAR' constant and it queries the system for the year.  A
summary of the ChangeLog:

gcc/m2/ChangeLog:

* mc-boot/GmcOptions.c: Rebuilt.
* mc/mcOptions.mod (displayVersion):
Split first printf into three components
* mc/mcOptions.mod (YEAR): Remove.
(getYear): New procedure function.
(displayVersion): Use result from getYear instead of YEAR.
Emit boilerplate for GPL v3.
(gplBody): Use result from getYear instead of YEAR.
(glplBody): Use result from getYear instead of YEAR.

regards,
Gaius
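
The two ideas in the message above, splitting the copyright text so the
update script does not match it and querying the system for the year, can be
sketched in C. The real change is in Modula-2 (mc/mcOptions.mod); the
functions get_year and format_notice below are hypothetical stand-ins, not a
translation of that code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Ask the system for the current year instead of using a constant.
   Returns 0 if the local time cannot be determined.  */
int get_year (void)
{
  time_t now = time (NULL);
  struct tm *tm = localtime (&now);
  return tm ? tm->tm_year + 1900 : 0;
}

/* Assemble the notice from separate pieces so that the complete
   phrase never appears as a single string literal in the source.  */
int format_notice (char *buf, size_t size, int year)
{
  return snprintf (buf, size, "%s%s %s %d Free Software Foundation, Inc.",
                   "Copy", "right", "(C)", year);
}
```

A caller would typically pass get_year () as the last argument; taking the
year as a parameter keeps the formatting step deterministic.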

Re: gcc-13/changes.html: Mention -fstrict-flex-arrays and its impact

2023-01-17 Thread Qing Zhao via Gcc-patches

Thanks for the comment. 

I just committed the following:

>From fc681f5412c421ff9609aea448310106d2570fd5 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Tue, 17 Jan 2023 15:52:15 +
Subject: [PATCH] gcc13/changes: update id 'flexible array' to
 'flexible-arrays' since ids must not contain white space

---
 htdocs/gcc-13/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 08e36fb3..ca9cd2da 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -438,7 +438,7 @@ a work-in-progress.
 Other significant improvements
 
 
-Treating trailing arrays as flexible array members
+Treating trailing arrays as flexible array 
members
 
 
  GCC can now control when to treat the trailing array of a structure as a
-- 
2.31.1


> On Jan 13, 2023, at 3:59 PM, Gerald Pfeifer  wrote:
> 
> On Tue, 20 Dec 2022, Qing Zhao via Gcc-patches wrote:
>> +Treating trailing arrays as flexible array 
>> members
> 
> Please note that ids must not contain white space.
> 
> Would you mind following up making this "flexiblearray" or similar?
> 
> Thank you,
> Gerald

Re: [PATCH 1/1] [fwprop]: Add the support of forwarding the vec_duplicate rtx

2023-01-17 Thread Richard Sandiford via Gcc-patches

lehua.d...@rivai.ai writes:
> From: Lehua Ding 
>
> ps: Resend for adjusting the width of each line of text.
>
> Hi,
>
> When I was adding the new RISC-V auto-vectorization function, I found that
> converting `vector-reg1 vop vector-reg2` to `scalar-reg3 vop vector-reg2`
> is not very easy to handle where `vector-reg1` is a vec_duplicate_expr.
> For example, the gimple IR below:
>
> ```gimple
> 
> vect_cst__51 = [vec_duplicate_expr] z_14(D);
>
> 
> vect_iftmp.13_53 = .LEN_COND_ADD(mask__40.9_47, vect__6.12_50, vect_cst__51, 
> { 0.0, ... }, curr_cnt_60);
> ```
>
> I once wanted to add corresponding functions to gimple IR, such as adding
> .LEN_COND_ADD_VS, and then convert .LEN_COND_ADD to .LEN_COND_ADD_VS in
> match.pd. This method can be realized, but it will cause too many similar
> internal functions to be added to gimple IR. It doesn't feel necessary.
> Later, I tried to combine them in the combine pass but failed. Finally, I
> thought of adding the ability to support forwarding `(vec_duplicate reg)`
> in the fwprop pass, so I have this patch.
>
> Because the current upstream does not support the RISC-V automatic
> vectorization function, I found an example in SVE that can also be
> optimized and simply tried it. For the float type, one instruction can be
> saved, for example with the C code below. The difference between the new
> and old assembly code is that the new one uses the mov instruction to
> directly move the scalar variable to the vector register. The old assembly
> code first moves the scalar variable to the vector register outside the
> loop, and then uses the sel instruction. Over the entire assembly code,
> the new version has one instruction fewer. In addition, I noticed that
> some instructions in the new assembly code are ahead of the `ble .L1`
> instruction. I debugged and found that the modification was made in the
> ce1 pass. This pass believes that moving them up is more beneficial to
> performance.
>
> In addition, for the int type, compared with the float type, the new
> assembly code will have one more `fmov s2, w2` instruction, so I can't
> judge whether the performance is better than before. In fact, I mainly do
> RISC-V development work.
>
> This patch is an exploratory patch and has not been tested too much. I mainly
> want to see your suggestions on whether this method is feasible and possible
> potential problems.
>
> Best,
> Lehua Ding
>
> ```c
> /* compiler options: -O3 -march=armv8.2-a+sve -S */
> void test1 (int *pred, float *x, float z, int n)
> {
>  for (int i = 0; i < n; i += 1)
>{
>  x[i] = pred[i] != 1 ? x[i] : z;
>}
> }
> ```
>
> The old assembly code looks like this (compiler explorer link:
> https://godbolt.org/z/hxTnEhaqY):
>
> ```asm
> test1:
>  cmp w2, 0
>  ble .L1
>  mov x3, 0
>  cntw x4
>  mov z0.s, s0
>  whilelo p0.s, wzr, w2
>  ptrue p2.b, all
> .L3:
>  ld1w z2.s, p0/z, [x0, x3, lsl 2]
>  ld1w z1.s, p0/z, [x1, x3, lsl 2]
>  cmpne p1.s, p2/z, z2.s, #1
>  sel z1.s, p1, z1.s, z0.s
>  st1w z1.s, p0, [x1, x3, lsl 2]
>  add x3, x3, x4
>  whilelo p0.s, w3, w2
>  b.any .L3
> .L1:
>  ret
> ```
>
> The new assembly code looks like this:
>
> ```asm
> test1:
>  whilelo p0.s, wzr, w2
>  mov x3, 0
>  cntw x4
>  ptrue p2.b, all
>  cmp w2, 0
>  ble .L1
> .L3:
>  ld1w z2.s, p0/z, [x0, x3, lsl 2]
>  ld1w z1.s, p0/z, [x1, x3, lsl 2]
>  cmpne p1.s, p2/z, z2.s, #1
>  mov z1.s, p1/m, s0
>  st1w z1.s, p0, [x1, x3, lsl 2]
>  add x3, x3, x4
>  whilelo p0.s, w3, w2
>  b.any .L3
> .L1:
>  ret
> ```
>
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-sve.md (@aarch64_sel_dup_vs): Add new
> pattern to capture new operands order
> * fwprop.cc (fwprop_propagation::profitable_p): Add new check
> (reg_single_def_for_src_p): Add new function for src rtx
> (forward_propagate_into): Change to new function call
>
> ---
>  gcc/config/aarch64/aarch64-sve.md | 20 
>  gcc/fwprop.cc | 16 +++-
>  2 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md 
> b/gcc/config/aarch64/aarch64-sve.md
> index b8cc47ef5fc..84d8ed0924d 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -7636,6 +7636,26 @@
>[(set_attr "movprfx" "*,*,yes,yes,yes,yes")]
>  )
>  
> +;; Swap the order of operand 1 and operand 2 so that it matches the above 
> pattern
> +(define_insn_and_split "@aarch64_sel_dup_vs"
> +  [(set (match_operand:SVE_ALL 0 "register_operand" "=?w, w, ??w, ?&w, ??&w, 
> ?&w")
> + (unspec:SVE_ALL
> +   [(match_operand: 3 "register_operand" "Upl, Upl, Upl, Upl, 
> Upl, Upl")
> +   (match_operand:SV

[COMMITTED] bpf: disable -fstack-protector in BPF

2023-01-17 Thread Jose E. Marchesi via Gcc-patches

The stack protector is not supported in BPF.  This patch disables
-fstack-protector in bpf-* targets and emits a note
indicating that the feature is not supported on this platform.

Regtested in bpf-unknown-none.

gcc/ChangeLog:

* config/bpf/bpf.cc (bpf_option_override): Disable
-fstack-protector.
---
 gcc/config/bpf/bpf.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 576a1fe8eab..b268801d00c 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -253,6 +253,14 @@ bpf_option_override (void)
   if (bpf_has_jmp32 == -1)
 bpf_has_jmp32 = (bpf_isa >= ISA_V3);
 
+  /* Disable -fstack-protector as it is not supported in BPF.  */
+  if (flag_stack_protect)
+{
+  inform (input_location,
+  "%<-fstack-protector%> does not work "
+  " on this architecture");
+  flag_stack_protect = 0;
+}
 }
 
 #undef TARGET_OPTION_OVERRIDE
-- 
2.30.2
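
The override logic in the patch above (check the flag, emit a note, clear the
flag) can be sketched outside the compiler as follows. This is only an
illustration: struct options and override_options are hypothetical stand-ins
for GCC's global flag_stack_protect and bpf_option_override, and fprintf
stands in for the inform () diagnostic call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the compiler's global option state.  */
struct options
{
  int stack_protect;   /* non-zero when -fstack-protector was requested */
};

/* Disable the unsupported feature and tell the user about it.
   Returns 1 if a note was emitted, 0 otherwise.  */
int override_options (struct options *opts)
{
  if (opts->stack_protect)
    {
      fprintf (stderr,
               "note: -fstack-protector does not work on this architecture\n");
      opts->stack_protect = 0;
      return 1;
    }
  return 0;
}
```

Clearing the flag in the option-override step means later passes never see
the request, so no target-specific checks are needed further down.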

[PATCH] modula-2, driver, Front end: Revise handling of I and L paths [PR108182].

2023-01-17 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-linux-gnu (with a 32b multilib), powerpc, i686 and
x86_64-darwin.  OK for trunk?
thanks,
Iain

--- 8< ---

This adds the includes in the FE as done in other GCC languages.
It also revises the library handling to avoid additional -L options
from hiding LIBDIR.

For the include/import paths as presented to the front end initialisation,
we capture them and then arrange to emit the 'standard library' paths in
the same order as specified for C.

The specs are tidied up.

Signed-off-by: Iain Sandoe 

PR modula2/108182

gcc/m2/ChangeLog:

* Make-lang.in: Pass libsubdir to the language init
build.
* gm2-lang.cc (INCLUDE_VECTOR): Define.
(add_one_import_path): New.
(add_m2_import_paths): New.
(gm2_langhook_post_options): Arrange to add the include
paths (and add the system ones) in the same order as C
uses.
* gm2spec.cc (build_archive_path): Remove.
(add_default_combination): Remove.
(add_default_archives): Remove.
(add_default_libs): We no longer need a '-L' option, just
emit the -l and each library in use.
(build_include_path): Remove.
(add_include): Remove.
(add_default_includes): Remove.
(library_installed): Remove.
(check_valid_library): Remove.
(check_valid_list): Remove.
(convert_abbreviation): Diagnose unhandled cases.
(lang_specific_driver): Skip options where we will add back
a validated version.
* lang-specs.h (M2CPP): Reformat, append %I when -fcpp is not
in use.  Revise the cc1gm2 spec to omit mentioning options that
are handled in the c pre-processor line.
* lang.opt: Allow preprocessing and path options as input to the
cc1gm2 invocation, so that they can be passed to the preprocessor
invocation.
---
 gcc/m2/Make-lang.in |   1 +
 gcc/m2/gm2-lang.cc  | 168 --
 gcc/m2/gm2spec.cc   | 344 
 gcc/m2/lang-specs.h |  13 +-
 gcc/m2/lang.opt |  48 +++
 5 files changed, 304 insertions(+), 270 deletions(-)

diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
index 367be8e8af7..00cca7de617 100644
--- a/gcc/m2/Make-lang.in
+++ b/gcc/m2/Make-lang.in
@@ -543,6 +543,7 @@ m2/gm2-gcc/m2configure.o: 
$(srcdir)/m2/gm2-gcc/m2configure.cc \
 
 m2/gm2-lang.o: $(srcdir)/m2/gm2-lang.cc gt-m2-gm2-lang.h 
$(GCC_HEADER_DEPENDENCIES_FOR_M2)
$(COMPILER) -c -g $(GM2GCC) $(ALL_COMPILERFLAGS) \
+   -DLIBSUBDIR=\"$(libsubdir)\" \
 $(ALL_CPPFLAGS) $(INCLUDES) $< $(OUTPUT_OPTION)
 
 m2/stor-layout.o: $(srcdir)/stor-layout.cc $(GCC_HEADER_DEPENDENCIES_FOR_M2)
diff --git a/gcc/m2/gm2-lang.cc b/gcc/m2/gm2-lang.cc
index b8123273368..98707430ef5 100644
--- a/gcc/m2/gm2-lang.cc
+++ b/gcc/m2/gm2-lang.cc
@@ -20,6 +20,7 @@ along with GNU Modula-2; see the file COPYING.  If not, write 
to the
 Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
 02110-1301, USA.  */
 
+#define INCLUDE_VECTOR
 #include "gm2-gcc/gcc-consolidation.h"
 
 #include "langhooks-def.h" /* FIXME: for lhd_set_decl_assembler_name.  */
@@ -45,6 +46,18 @@ static void write_globals (void);
 
 static int insideCppArgs = FALSE;
 
+/* We default to pim in the absence of fiso.  */
+static bool iso = false;
+
+/* The language include paths are based on the libraries in use.  */
+static bool allow_libraries = true;
+static const char *flibs = nullptr;
+static const char *iprefix = nullptr;
+static const char *imultilib = nullptr;
+static std::vectorIpaths;
+static std::vectorisystem;
+static std::vectoriquote;
+
 #define EXPR_STMT_EXPR(NODE) TREE_OPERAND (EXPR_STMT_CHECK (NODE), 0)
 
 /* start of new stuff.  */
@@ -198,34 +211,41 @@ gm2_langhook_handle_option (
   return 1;
 case OPT_I:
   if (insideCppArgs)
-{
-  const struct cl_option *option = &cl_options[scode];
-  const char *opt = (const char *)option->opt_text;
-  M2Options_CppArg (opt, arg, TRUE);
-}
+   {
+ const struct cl_option *option = &cl_options[scode];
+ const char *opt = (const char *)option->opt_text;
+ M2Options_CppArg (opt, arg, (option->flags & CL_JOINED)
+  && !(option->flags & CL_SEPARATE));
+   }
   else
-M2Options_SetSearchPath (arg);
+   Ipaths.push_back (arg);
   return 1;
 case OPT_fiso:
   M2Options_SetISO (value);
+  iso = value;
   return 1;
 case OPT_fpim:
   M2Options_SetPIM (value);
+  iso = value ? false : iso;
   return 1;
 case OPT_fpim2:
   M2Options_SetPIM2 (value);
+  iso = value ? false : iso;
   return 1;
 case OPT_fpim3:
   M2Options_SetPIM3 (value);
+  iso = value ? false : iso;
   return 1;
 case OPT_fpim4:
   M2Options_SetPIM4 (value);
+  iso = value ? false : iso;
   return 1;
 case OPT_fpositive_mod_floor_

Go patch committed: Define builtin functions

2023-01-17 Thread Ian Lance Taylor via Gcc-patches

This patch by Andrew Pinski defines two builtin functions that are
used by the middle-end.  This fixes PR 108426.  Bootstrapped and
tested on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
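
For reference, the two builtins named above behave as follows when called
from C. A minimal sketch, assuming GCC or a compatible compiler; the wrappers
trailing_zeros and leading_zeros are hypothetical, and the zero guard is there
because the builtins' result is undefined for an argument of 0.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned long in bits on the current target.  */
#define ULONG_BITS ((int) (sizeof (unsigned long) * CHAR_BIT))

/* Count trailing zero bits; returns the full width for x == 0.  */
int trailing_zeros (unsigned long x)
{
  return x ? __builtin_ctzl (x) : ULONG_BITS;
}

/* Count leading zero bits; returns the full width for x == 0.  */
int leading_zeros (unsigned long x)
{
  return x ? __builtin_clzl (x) : ULONG_BITS;
}
```

Both builtins take an unsigned long and return int, which matches the
function types built in the patch.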

PR go/108426
* go-gcc.cc (Gcc_backend::Gcc_backend): Define __builtin_ctzl and
__builtin_clzl.  Patch by Andrew Pinski.
2bee478038d75487b52e35e29e54c70e4bfa1e2b
diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index a4a0e5d903e..07c34a58241 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -627,6 +627,11 @@ Gcc_backend::Gcc_backend()
unsigned_type_node,
NULL_TREE),
   builtin_const);
+  this->define_builtin(BUILT_IN_CTZL, "__builtin_ctzl", "ctzl",
+ build_function_type_list(integer_type_node,
+  long_unsigned_type_node,
+  NULL_TREE),
+ builtin_const);
   this->define_builtin(BUILT_IN_CTZLL, "__builtin_ctzll", "ctzll",
   build_function_type_list(integer_type_node,
long_long_unsigned_type_node,
@@ -637,6 +642,11 @@ Gcc_backend::Gcc_backend()
unsigned_type_node,
NULL_TREE),
   builtin_const);
+  this->define_builtin(BUILT_IN_CLZL, "__builtin_clzl", "clzl",
+ build_function_type_list(integer_type_node,
+  long_unsigned_type_node,
+  NULL_TREE),
+ builtin_const);
   this->define_builtin(BUILT_IN_CLZLL, "__builtin_clzll", "clzll",
   build_function_type_list(integer_type_node,
long_long_unsigned_type_node,

Re: [PATCH] testsuite: Skip intrinsics test if arm

2023-01-17 Thread Richard Earnshaw via Gcc-patches





On 15/01/2023 17:06, Torbjorn SVENSSON via Gcc-patches wrote:



On 2023-01-12 16:03, Richard Earnshaw wrote:



On 19/09/2022 17:16, Torbjörn SVENSSON via Gcc-patches wrote:

In the test case, it's clearly stated that the intrinsics are not
implemented on arm*. A simple xfail does not help since there are
link errors, and that would cause an UNRESOLVED testcase rather than
XFAIL.
By changing to dg-skip-if, the entire test case is omitted.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Replace
dg-xfail-if with dg-skip-if.


Sorry for the delay reviewing this, I missed it at the time.

My problem with your suggested solution is that if these intrinsics 
are ever added this test will not automatically pick this up as it 
will have been disabled.  I presume from the comment (and the body of 
the test that contains an #ifdef for aarch64) that this is expected to 
be a temporary issue rather than something permanent.


So IMO I think it is correct to leave this as unresolved because the 
test cannot be built due to an issue with the compiler.


This patch has already been merged after Kyrill reviewed it back in 
September.


Without this change, the log would be filled with warnings about missing 
types. Maybe we could add some check that will enable the test only if 
the types are known?

Would that mitigate your concern?

Attached is the log from vld1x2.c on Cortex-A7 with -mfloat-abi=hard 
-mfpu=neon.


When I look at the result of a run, I only look at the test cases that 
are either FAIL (obviously), XPASS and UNRESOLVED. All other test cases 
are in a "good" state from what I can tell. If there are a lot of test 
cases in the UNRESOLVED state, that are not yet implemented year after 
year, it makes it harder to identify those test cases that are of 
interest. Right or wrong, that's why I suggested removing it from the
list of test cases that should be working.


Let me know what you think.


Ah, OK.  Somehow I'd misplaced v2 of the patch, which is the version 
that got approved.


R.



Kind regards,
Torbjörn



R.



Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
  gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c

index 92a139bc523..f933102be47 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
@@ -1,6 +1,6 @@
  /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
  /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
  /* { dg-options "-O3" } */
  #include

[PATCH] modula-2, testsuite: Make libs and interfaces consistent.

2023-01-17 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-linux-gnu (with a 32b multilib), powerpc, i686 and
x86_64-darwin.  OK for trunk?
thanks,
Iain

--- 8< ---

In some cases the libraries list was being set before gm2_init_xxx was
called.  In some cases it was omitted - this could lead to a difference
between the link libs and the interfaces (the effect of this would be
dependent on the order in which the .exps were run, which makes it also
depend on the -j and the system).

To avoid a mismatch between the module include paths and the added libs
we now make sure that they are both added in the gm2_init_ functions
(if finer control over granularity is needed, then we should as a TODO
add a generic gm2_init_xxx that takes a library list and ensures that the
imports and libs are matched in the same order).

Also, we cannot use a default variable in Tcl if the source for that
variable could be absent but something else follows it, since there is
no way to put an empty placeholder in.

Signed-off-by: Iain Sandoe 

gcc/testsuite/ChangeLog:

* gm2/complex/run/pass/complex-run-pass.exp: Remove gm2_link_lib.
* gm2/iso/run/pass/iso-run-pass.exp: Likewise.
* gm2/link/externalscaffold/pass/link-externalscaffold-pass.exp:
Likewise.
* gm2/pimlib/logitech/run/pass/pimlib-logitech-run-pass.exp: Likewise.
* gm2/pimlib/run/pass/pimlib-run-pass.exp: Likewise.
* gm2/projects/iso/run/pass/halma/projects-iso-run-pass-halma.exp:
Likewise.
* gm2/projects/iso/run/pass/hello/projects-iso-run-pass-hello.exp:
Likewise.
* gm2/projects/pim/run/pass/hello/projects-pim-run-pass-hello.exp:
Likewise.
* gm2/sets/run/pass/sets-run-pass.exp: Likewise.
* gm2/switches/none/run/pass/gm2-none.exp: Likewise.
* gm2/switches/pic/run/pass/switches-pic-run-pass.exp: Likewise.
* gm2/projects/pim/run/pass/random/projects-pim-run-pass-random.exp:
Likewise, and also ensure that the -g option is appended to avoid it
being taken as a path.
* lib/gm2.exp: Ensure for each gm2_init_ function that the set of
libraries added matches the set of -I and -L options.
---
 .../gm2/complex/run/pass/complex-run-pass.exp |  1 -
 .../gm2/iso/run/pass/iso-run-pass.exp |  1 -
 .../pass/link-externalscaffold-pass.exp   |  1 -
 .../run/pass/pimlib-logitech-run-pass.exp |  2 --
 .../gm2/pimlib/run/pass/pimlib-run-pass.exp   |  2 --
 .../halma/projects-iso-run-pass-halma.exp |  1 -
 .../hello/projects-iso-run-pass-hello.exp |  1 -
 .../hello/projects-pim-run-pass-hello.exp |  1 -
 .../random/projects-pim-run-pass-random.exp   | 29 +--
 .../gm2/sets/run/pass/sets-run-pass.exp   |  1 -
 .../gm2/switches/none/run/pass/gm2-none.exp   |  1 -
 .../pic/run/pass/switches-pic-run-pass.exp|  2 --
 gcc/testsuite/lib/gm2.exp | 20 -
 13 files changed, 26 insertions(+), 37 deletions(-)

diff --git a/gcc/testsuite/gm2/complex/run/pass/complex-run-pass.exp 
b/gcc/testsuite/gm2/complex/run/pass/complex-run-pass.exp
index a715ec27d82..399f30b89ef 100644
--- a/gcc/testsuite/gm2/complex/run/pass/complex-run-pass.exp
+++ b/gcc/testsuite/gm2/complex/run/pass/complex-run-pass.exp
@@ -27,7 +27,6 @@ load_lib gm2-torture.exp
 
 set gm2src ${srcdir}/../gm2
 
-gm2_link_lib "m2iso m2pim"
 gm2_init_iso "${srcdir}/gm2/complex/run/pass"
 
 
diff --git a/gcc/testsuite/gm2/iso/run/pass/iso-run-pass.exp 
b/gcc/testsuite/gm2/iso/run/pass/iso-run-pass.exp
index 95b13038cd0..09d04ee910d 100644
--- a/gcc/testsuite/gm2/iso/run/pass/iso-run-pass.exp
+++ b/gcc/testsuite/gm2/iso/run/pass/iso-run-pass.exp
@@ -24,7 +24,6 @@ if $tracelevel then {
 # load support procs
 load_lib gm2-torture.exp
 
-gm2_link_lib "m2iso m2pim"
 gm2_init_iso "${srcdir}/gm2/iso/run/pass" -fsoft-check-all
 gm2_link_obj fileio.o
 
diff --git 
a/gcc/testsuite/gm2/link/externalscaffold/pass/link-externalscaffold-pass.exp 
b/gcc/testsuite/gm2/link/externalscaffold/pass/link-externalscaffold-pass.exp
index 26c91553dfc..32d4315aebd 100644
--- 
a/gcc/testsuite/gm2/link/externalscaffold/pass/link-externalscaffold-pass.exp
+++ 
b/gcc/testsuite/gm2/link/externalscaffold/pass/link-externalscaffold-pass.exp
@@ -25,7 +25,6 @@ if $tracelevel then {
 # load support procs
 load_lib gm2-torture.exp
 
-gm2_link_lib "m2pim"
 gm2_init_pim "${srcdir}/gm2/pim/pass" -fscaffold-main -fno-scaffold-dynamic
 gm2_link_obj scaffold.o
 set output [target_compile $srcdir/$subdir/scaffold.c scaffold.o object "-g"]
diff --git 
a/gcc/testsuite/gm2/pimlib/logitech/run/pass/pimlib-logitech-run-pass.exp 
b/gcc/testsuite/gm2/pimlib/logitech/run/pass/pimlib-logitech-run-pass.exp
index 912c4570520..cfe9ff84a08 100644
--- a/gcc/testsuite/gm2/pimlib/logitech/run/pass/pimlib-logitech-run-pass.exp
+++ b/gcc/testsuite/gm2/pimlib/logitech/run/pass/pimlib-logitech-run-pass.exp
@@ -27,10 +27,8 @@ load_lib gm2-torture.exp
 
 set gm2src ${srcdir}/../m2
 
-gm2_link_lib "m2log m2pim m2iso"
 gm2_init_log
 
-
 foreac

Re: [PATCH] tree-optimization/104475 - bogus -Wstringop-overflow

2023-01-17 Thread Jason Merrill via Gcc-patches


On 12/7/22 06:25, Richard Biener wrote:

The following avoids a bogus -Wstringop-overflow diagnostic by
properly recognizing that &d->m_mutex cannot be nullptr in C++
even if m_mutex is at offset zero.  The frontend already diagnoses
a &d->m_mutex != nullptr comparison and the following transfers
this knowledge to the middle-end which sees &d->m_mutex as
simple pointer arithmetic.  The new ADDR_NONZERO flag on an
ADDR_EXPR is used to carry this information and it's checked in
the tree_expr_nonzero_p API which causes this to be folded early.

To avoid the bogus diagnostic this avoids separating the nullptr
path via jump-threading by eliminating the nullptr check.

I'd appreciate C++ folks picking this up and putting the flag on
the appropriate ADDR_EXPRs - I've avoided putting it on
all of them and didn't try hard to mimic what -Waddress warns
on (the code is big, maybe some refactoring would help, but I'm also
not sure what exactly the C++ standard constraints are here).


This is allowed by the standard, at least after CWG2535, but we need to 
check -fsanitize=null before asserting that the address is non-null. 
With that elaboration, a flag on the ADDR_EXPR may not be a convenient 
way to express the property?
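
To make the case under discussion concrete, here is a minimal sketch. The names `X`, `member_address` and `member_is_nonnull` are illustrative and do not come from the patch. With the member at offset zero, `&p->i` is numerically the same pointer as `p`, which is why the middle-end, seeing only pointer arithmetic, cannot conclude on its own that the result is non-null.

```cpp
#include <cassert>
#include <cstddef>

struct X { int i; };

// The member sits at offset zero, so &p->i and p are the same address.
static_assert(offsetof(X, i) == 0, "i is expected to be the first member");

// Returns the address of the member.  Kept as a separate function so the
// null comparison below is not written directly on &p->i in this sketch.
const int* member_address(const X* p)
{
  assert(p != nullptr);  // this sketch only handles valid objects
  return &p->i;
}

bool member_is_nonnull(const X* p)
{
  // Same question as `&p->i != nullptr`, the comparison -Waddress warns about.
  return member_address(p) != nullptr;
}
```

Whether the compiler may fold such a comparison to true is exactly what the flag (and the -fsanitize=null caveat above) is about; the sketch only shows the shape of the code.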



Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Richard.

PR tree-optimization/104475
gcc/
* tree-core.h: Document use of nothrow_flag on ADDR_EXPR.
* tree.h (ADDR_NONZERO): New.
* fold-const.cc (tree_single_nonzero_warnv_p): Check
ADDR_NONZERO.

gcc/cp/
* typeck.cc (cp_build_addr_expr_1): Set ADDR_NONZERO
on the built address if it is of a COMPONENT_REF.

* g++.dg/opt/pr104475.C: New testcase.
---
  gcc/cp/typeck.cc|  3 +++
  gcc/fold-const.cc   |  4 +++-
  gcc/testsuite/g++.dg/opt/pr104475.C | 12 
  gcc/tree-core.h |  3 +++
  gcc/tree.h  |  4 
  5 files changed, 25 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/opt/pr104475.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 7dfe5acc67e..3563750803e 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -7232,6 +7232,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
gcc_assert (same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (object), decl_type_context (field)));
val = build_address (arg);
+  if (TREE_CODE (val) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (val, 0)) == COMPONENT_REF)
+   ADDR_NONZERO (val) = 1;
  }
  
if (TYPE_PTR_P (argtype)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index e80be8049e1..cdfe3f50ae3 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15308,8 +15308,10 @@ tree_single_nonzero_warnv_p (tree t, bool 
*strict_overflow_p)
  
  case ADDR_EXPR:

{
-   tree base = TREE_OPERAND (t, 0);
+   if (ADDR_NONZERO (t))
+ return true;
  
+	tree base = TREE_OPERAND (t, 0);

if (!DECL_P (base))
  base = get_base_address (base);
  
diff --git a/gcc/testsuite/g++.dg/opt/pr104475.C b/gcc/testsuite/g++.dg/opt/pr104475.C

new file mode 100644
index 000..013c70302c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr104475.C
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-O -Waddress -fdump-tree-original" }
+
+struct X { int i; };
+
+bool foo (struct X *p)
+{
+  return &p->i != nullptr; /* { dg-warning "never be NULL" } */
+}
+
+/* { dg-final { scan-tree-dump "return  = 1;" "original" } } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index e146b133dbd..303e25b5df6 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1376,6 +1376,9 @@ struct GTY(()) tree_base {
 TREE_THIS_NOTRAP in
INDIRECT_REF, MEM_REF, TARGET_MEM_REF, ARRAY_REF, ARRAY_RANGE_REF
  
+   ADDR_NONZERO in

+ ADDR_EXPR
+
 SSA_NAME_IN_FREE_LIST in
SSA_NAME
  
diff --git a/gcc/tree.h b/gcc/tree.h

index 23223ca0c87..1c810c0b21b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -876,6 +876,10 @@ extern void omp_clause_range_check_failed (const_tree, 
const char *, int,
(TREE_CHECK5 (NODE, INDIRECT_REF, MEM_REF, TARGET_MEM_REF, ARRAY_REF,   
\
ARRAY_RANGE_REF)->base.nothrow_flag)
  
+/* Nonzero means this ADDR_EXPR is not equal to NULL.  */

+#define ADDR_NONZERO(NODE) \
+  (TREE_CHECK (NODE, ADDR_EXPR)->base.nothrow_flag)
+
  /* In a VAR_DECL, PARM_DECL or FIELD_DECL, or any kind of ..._REF node,
 nonzero means it may not be the lhs of an assignment.
 Nonzero in a FUNCTION_DECL means this function should be treated

Re: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

2023-01-17 Thread Jason Merrill via Gcc-patches


On 1/9/23 14:25, Patrick Palka via Gcc-patches wrote:

On Mon, 9 Jan 2023, Patrick Palka wrote:


On Wed, 5 Oct 2022, Patrick Palka wrote:


On Thu, 7 Jul 2022, Jonathan Wakely via Gcc-patches wrote:


This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative. The purpose is to select the Nth type from a
list of types, e.g. __builtin_type_pack_element(1, char, int, float) is
int.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory used by
the compiler is reduced by 97%. In realistic examples the gains will be
much smaller, but still relevant.

Clang has a similar built-in, __type_pack_element, but that's a
"magic template" built-in using <> syntax, which GCC doesn't support. So
this provides an equivalent feature, but as a built-in function using
parens instead of <>. I don't really like the name "type pack element"
(it gives you an element from a pack of types) but the semi-consistency
with Clang seems like a reasonable argument in favour of keeping the
name. I'd be open to alternative names though, e.g. __builtin_nth_type
or __builtin_type_at_index.
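
For reference, the recursive approach that the built-in is meant to replace can be sketched as follows. `nth_type` and `nth_type_t` are hypothetical names for this illustration, not libstdc++'s actual implementation. Selecting index N instantiates N+1 class templates, which is the cost described above.

```cpp
#include <cstddef>
#include <type_traits>

// Primary template: peel off one type per instantiation until N reaches 0.
template <std::size_t N, typename T, typename... Rest>
struct nth_type : nth_type<N - 1, Rest...> {};

// Base case: index 0 selects the first type in the list.
template <typename T, typename... Rest>
struct nth_type<0, T, Rest...> { using type = T; };

template <std::size_t N, typename... Ts>
using nth_type_t = typename nth_type<N, Ts...>::type;

static_assert(std::is_same<nth_type_t<1, char, int, float>, int>::value,
              "index 1 of (char, int, float) is int");
```

A built-in can return the selected type directly, without the chain of intermediate instantiations.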


Rather than giving the trait a different name from __type_pack_element,
I wonder if we could just special case cp_parser_trait to expect <>
instead of parens for this trait?

Btw the frontend recently got a generic TRAIT_TYPE tree code, which gets
rid of much of the boilerplate of adding a new type-yielding built-in
trait, see e.g. cp-trait.def.


Here's a tested patch based on Jonathan's original patch that implements
the built-in in terms of TRAIT_TYPE, names it __type_pack_element
instead of __builtin_type_pack_element, and treats invocations of it
like a template-id instead of a call (to match Clang).

-- >8 --

Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative.  The purpose is to select the Nth type from a
list of types, e.g. __type_pack_element<1, char, int, float> is int.
We implement it as a special kind of TRAIT_TYPE.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory used by
the compiler is reduced by 97%.  In realistic examples the gains will be
much smaller, but still relevant.

Unlike the other built-in traits, __type_pack_element uses template-id
syntax instead of call syntax and is SFINAE-enabled, matching Clang's
implementation.  And like the other built-in traits, it's not mangleable
so we can't use it directly in function signatures.

Some caveats:

   * Clang's version of the built-in seems to act like a "magic template"
 that can e.g. be used as a template template argument.  For simplicity
 we implement it in a more ad-hoc way.
   * Our parsing of the <>'s in __type_pack_element<...> is currently
 rudimentary and doesn't try to disambiguate a trailing >> vs > >
 as cp_parser_enclosed_template_argument_list does.


Hmm, this latter caveat turns out to be inconvenient (for code such as
type_pack_element3.C) and admits an easy workaround inspired by what
cp_parser_enclosed_template_argument_list does.

v2: Consider the >> in __type_pack_element<0, int, char>> to be two >'s.
 Handle non-type TRAIT_TYPE_TYPE1 in strip_typedefs (for sake of
 CPTK_TYPE_PACK_ELEMENT).
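
As a reminder of the behaviour v2 matches: since C++11, a `>>` that closes two template argument lists is treated as two `>` tokens. The sketch below uses hypothetical `first` and `Box` templates as stand-ins for the built-in, so that it does not depend on compiler support for `__type_pack_element`.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration only.
template <typename T, typename...> struct first { using type = T; };
template <typename T> struct Box { using type = T; };

// The trailing `>>` closes both first<...> and Box<...>.
using A = Box<first<int, char>>;
// Older spelling with a space between the two closers.
using B = Box<first<int, char> >;

static_assert(std::is_same<A, B>::value, "both spellings name the same type");
```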


Why not use cp_parser_enclosed_template_argument_list directly?


-- >8 --

Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative.  The purpose is to select the Nth type from a
list of types, e.g. __type_pack_element<1, char, int, float> is int.
We implement it as a special kind of TRAIT_TYPE.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory used by
the compiler is reduced by 97%.  In realistic examples the gains will be
much smaller, but still relevant.

Unlike the other built-in traits, __type_pack_element uses template-id
syntax instead of call syntax and is SFINAE-enabled, matching Clang's
implementation.  And like the other built-in traits, it's not mangleable
so we can't use it directly in function signatures.

N.B. Clang seems to implement __type_pack_element as a first-class
template that can e.g. be used as a template template argument.  For
simplicity we implement it in a more ad-hoc way.

Co-authored-by: Jonathan Wakely 

PR c++/100157

gcc/cp/ChangeLog:

* cp-trait.def (TYPE_PACK_ELEMENT): Define.
* cp-tree.h (finish_trait_type): Add complain parameter.
* cxx-pretty-print.cc (pp_cxx_trait): Handle
CPTK_TYPE_PACK_ELEMENT.

Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-17 Thread Wilco Dijkstra via Gcc-patches

Hi,

> @Wilco, can you please send the rebased patch for patch review? We would
> need it in our openSUSE package soon.

Here is an updated and rebased version:

Cheers,
Wilco

v4: rebase and add REG_UNSAVED_ARCHEXT.

A recent change only initializes the regs.how[] during Dwarf unwinding
which resulted in an uninitialized offset used in return address signing
and random failures during unwinding.  The fix is to encode the return
address signing state in REG_UNSAVED and a new state REG_UNSAVED_ARCHEXT.
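
The encoding can be pictured with a small model. This is a simplified sketch, not the libgcc code: `RegHow`, `FrameState`, `toggle_ra_state` and `ra_signing_enabled` are names invented here, and only the two states involved in the toggle are modelled (the DW_CFA_expression path is left out). The point is that the signing state lives entirely in the `how` value, so no separate offset field has to be initialized.

```cpp
#include <cassert>

// Simplified stand-in for the `how` values; only the states relevant here.
enum class RegHow { Unsaved, UnsavedArchExt };

struct FrameState {
  RegHow ra_state = RegHow::Unsaved;  // signing disabled by default
};

// Models the effect of DW_CFA_GNU_window_save on AArch64: flip the state.
void toggle_ra_state(FrameState& fs)
{
  assert(fs.ra_state == RegHow::Unsaved
         || fs.ra_state == RegHow::UnsavedArchExt);
  fs.ra_state = (fs.ra_state == RegHow::Unsaved) ? RegHow::UnsavedArchExt
                                                 : RegHow::Unsaved;
}

// Signing is considered enabled only in the "arch extension" state.
bool ra_signing_enabled(const FrameState& fs)
{
  return fs.ra_state == RegHow::UnsavedArchExt;
}
```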

Passes bootstrap & regress, OK for commit?

libgcc/
PR target/107678
* unwind-dw2.h (REG_UNSAVED_ARCHEXT): Add new enum.
* unwind-dw2.c (uw_update_context_1): Add REG_UNSAVED_ARCHEXT case.
* unwind-dw2-execute_cfa.h: Use REG_UNSAVED_ARCHEXT/REG_UNSAVED to 
encode the return address signing state.
* config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr):
Check current return address signing state.
(aarch64_frob_update_context): Remove.

---
diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
b/libgcc/config/aarch64/aarch64-unwind.h
index 
874cf6c3e77fb72d999f51b636d74cb0b5728bbd..727c27ba5da983958b3134715d9d4d7c0af5c1e2
 100644
--- a/libgcc/config/aarch64/aarch64-unwind.h
+++ b/libgcc/config/aarch64/aarch64-unwind.h
@@ -29,8 +29,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 
 #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
   aarch64_demangle_return_addr (context, fs, addr)
-#define MD_FROB_UPDATE_CONTEXT(context, fs) \
-  aarch64_frob_update_context (context, fs)
 
 static inline int
 aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
@@ -55,42 +53,27 @@ aarch64_cie_signed_with_b_key (struct _Unwind_Context 
*context)
 
 static inline void *
 aarch64_demangle_return_addr (struct _Unwind_Context *context,
- _Unwind_FrameState *fs ATTRIBUTE_UNUSED,
+ _Unwind_FrameState *fs,
  _Unwind_Word addr_word)
 {
   void *addr = (void *)addr_word;
-  if (context->flags & RA_SIGNED_BIT)
+  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
+
+  if (fs->regs.how[reg] == REG_UNSAVED)
+return addr;
+
+  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
+ REG_UNDEFINED means enabled), or set by a DW_CFA_expression.  */
+  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
+  || (_Unwind_GetGR (context, reg) & 0x1) != 0)
 {
   _Unwind_Word salt = (_Unwind_Word) context->cfa;
   if (aarch64_cie_signed_with_b_key (context) != 0)
return __builtin_aarch64_autib1716 (addr, salt);
   return __builtin_aarch64_autia1716 (addr, salt);
 }
-  else
-return addr;
-}
-
-/* Do AArch64 private initialization on CONTEXT based on frame info FS.  Mark
-   CONTEXT as return address signed if bit 0 of DWARF_REGNUM_AARCH64_RA_STATE 
is
-   set.  */
-
-static inline void
-aarch64_frob_update_context (struct _Unwind_Context *context,
-_Unwind_FrameState *fs)
-{
-  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
-  int ra_signed;
-  if (fs->regs.how[reg] == REG_UNSAVED)
-ra_signed = fs->regs.reg[reg].loc.offset & 0x1;
-  else
-ra_signed = _Unwind_GetGR (context, reg) & 0x1;
-  if (ra_signed)
-/* The flag is used for re-authenticating EH handler's address.  */
-context->flags |= RA_SIGNED_BIT;
-  else
-context->flags &= ~RA_SIGNED_BIT;
 
-  return;
+  return addr;
 }
 
 #endif /* defined AARCH64_UNWIND_H && defined __ILP32__ */
diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
index 
264c11c03ec4a09cac2c19a241c5b110b1b6b602..aef377092ceede6bdda8532679f9b081c98fadce
 100644
--- a/libgcc/unwind-dw2-execute_cfa.h
+++ b/libgcc/unwind-dw2-execute_cfa.h
@@ -278,10 +278,15 @@
case DW_CFA_GNU_window_save:
 #if defined (__aarch64__) && !defined (__ILP32__)
  /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
-return address signing status.  */
+return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
+mean RA signing is disabled/enabled.  */
  reg = DWARF_REGNUM_AARCH64_RA_STATE;
- gcc_assert (fs->regs.how[reg] == REG_UNSAVED);
- fs->regs.reg[reg].loc.offset ^= 1;
+ gcc_assert (fs->regs.how[reg] == REG_UNSAVED
+ || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
+ if (fs->regs.how[reg] == REG_UNSAVED)
+   fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
+ else
+   fs->regs.how[reg] = REG_UNSAVED;
 #else
  /* ??? Hardcoded for SPARC register window configuration.  */
  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
index 
e2f81983e1dcf3df6aebde2454630b7bee87d597..53e1b183c7d60112a14411d3356c49cb39cd0de7
 100644
--- a/libgcc/unwind-dw2.h
+++ b/libgcc/unwind-dw2.h
@@ -29,6 +29,7 @@ enum {
   REG_SAVED_EXP,
   REG_SAVED_

Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-17 Thread Richard Sandiford via Gcc-patches

Wilco Dijkstra  writes:
> Hi,
>
>> @Wilco, can you please send the rebased patch for patch review? We would
>> need it in our openSUSE package soon.
>
> Here is an updated and rebased version:
>
> Cheers,
> Wilco
>
> v4: rebase and add REG_UNSAVED_ARCHEXT.
>
> A recent change only initializes the regs.how[] during Dwarf unwinding
> which resulted in an uninitialized offset used in return address signing
> and random failures during unwinding.  The fix is to encode the return
> address signing state in REG_UNSAVED and a new state REG_UNSAVED_ARCHEXT.
>
> Passes bootstrap & regress, OK for commit?
>
> libgcc/
> PR target/107678
> * unwind-dw2.h (REG_UNSAVED_ARCHEXT): Add new enum.
> * unwind-dw2.c (uw_update_context_1): Add REG_UNSAVED_ARCHEXT case.
> * unwind-dw2-execute_cfa.h: Use REG_UNSAVED_ARCHEXT/REG_UNSAVED to
> encode the return address signing state.
> * config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr):
> Check current return address signing state.
> (aarch64_frob_update_context): Remove.
>
> ---
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index 
> 874cf6c3e77fb72d999f51b636d74cb0b5728bbd..727c27ba5da983958b3134715d9d4d7c0af5c1e2
>  100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -29,8 +29,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>
>  #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
>aarch64_demangle_return_addr (context, fs, addr)
> -#define MD_FROB_UPDATE_CONTEXT(context, fs) \
> -  aarch64_frob_update_context (context, fs)
>
>  static inline int
>  aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
> @@ -55,42 +53,27 @@ aarch64_cie_signed_with_b_key (struct _Unwind_Context 
> *context)
>
>  static inline void *
>  aarch64_demangle_return_addr (struct _Unwind_Context *context,
> - _Unwind_FrameState *fs ATTRIBUTE_UNUSED,
> + _Unwind_FrameState *fs,
>   _Unwind_Word addr_word)
>  {
>void *addr = (void *)addr_word;
> -  if (context->flags & RA_SIGNED_BIT)
> +  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
> +
> +  if (fs->regs.how[reg] == REG_UNSAVED)
> +return addr;
> +
> +  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
> + REG_UNDEFINED means enabled), or set by a DW_CFA_expression.  */

Needs updating to REG_UNSAVED_ARCHEXT.

OK with that change, thanks, and sorry for the delays & runaround.

Richard

> +  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
> +  || (_Unwind_GetGR (context, reg) & 0x1) != 0)
>  {
>_Unwind_Word salt = (_Unwind_Word) context->cfa;
>if (aarch64_cie_signed_with_b_key (context) != 0)
> return __builtin_aarch64_autib1716 (addr, salt);
>return __builtin_aarch64_autia1716 (addr, salt);
>  }
> -  else
> -return addr;
> -}
> -
> -/* Do AArch64 private initialization on CONTEXT based on frame info FS.  Mark
> -   CONTEXT as return address signed if bit 0 of 
> DWARF_REGNUM_AARCH64_RA_STATE is
> -   set.  */
> -
> -static inline void
> -aarch64_frob_update_context (struct _Unwind_Context *context,
> -_Unwind_FrameState *fs)
> -{
> -  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -  int ra_signed;
> -  if (fs->regs.how[reg] == REG_UNSAVED)
> -ra_signed = fs->regs.reg[reg].loc.offset & 0x1;
> -  else
> -ra_signed = _Unwind_GetGR (context, reg) & 0x1;
> -  if (ra_signed)
> -/* The flag is used for re-authenticating EH handler's address.  */
> -context->flags |= RA_SIGNED_BIT;
> -  else
> -context->flags &= ~RA_SIGNED_BIT;
>
> -  return;
> +  return addr;
>  }
>
>  #endif /* defined AARCH64_UNWIND_H && defined __ILP32__ */
> diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
> index 
> 264c11c03ec4a09cac2c19a241c5b110b1b6b602..aef377092ceede6bdda8532679f9b081c98fadce
>  100644
> --- a/libgcc/unwind-dw2-execute_cfa.h
> +++ b/libgcc/unwind-dw2-execute_cfa.h
> @@ -278,10 +278,15 @@
> case DW_CFA_GNU_window_save:
>  #if defined (__aarch64__) && !defined (__ILP32__)
>   /* This CFA is multiplexed with Sparc.  On AArch64 it's used to 
> toggle
> -return address signing status.  */
> +return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
> +mean RA signing is disabled/enabled.  */
>   reg = DWARF_REGNUM_AARCH64_RA_STATE;
> - gcc_assert (fs->regs.how[reg] == REG_UNSAVED);
> - fs->regs.reg[reg].loc.offset ^= 1;
> + gcc_assert (fs->regs.how[reg] == REG_UNSAVED
> + || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
> + if (fs->regs.how[reg] == REG_UNSAVED)
> +   fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
> + else
> +   fs->regs.how[reg] = REG_UNSAVED;
>  #else
>

Re: [PATCH/RFC] rs6000: Remove optimize_for_speed check for implicit TARGET_SAVE_TOC_INDIRECT [PR108184]

2023-01-17 Thread Michael Meissner via Gcc-patches

On Mon, Jan 16, 2023 at 05:39:04PM +0800, Kewen.Lin wrote:
> Hi,
> 
> Now we will check optimize_function_for_speed_p (cfun) for
> TARGET_SAVE_TOC_INDIRECT if it's implicitly enabled.  But
> the effect of -msave-toc-indirect is actually to save the
> TOC in the prologue for indirect calls rather than inline,
> it's also good for optimize_function_for_size?  So this
> patch is to remove the check of optimize_function_for_speed
> and make it work for both optimizing for size and speed.
> 
> Bootstrapped and regtested on powerpc64-linux-gnu P8,
> powerpc64le-linux-gnu P{9,10} and powerpc-ibm-aix.
> 
> Any thoughts?
> 
> Thanks in advance!

Well in terms of size, it is only a savings if we have 2 or more indirect calls
within a module, and we are not compiling for power10.

On power9, if we have just one indirect call, then it is the same size.

On power10, the -msave-toc-indirect switch does nothing, because we don't need
TOCs when we have prefixed addressing.

So I have objection to the change.  I suspect it may be better with a check for
just optimize either for speed or size, and not for speed.

The option however, can slow things down if there is an early exit to the
function since the store would always be done, even if the function exits
early.
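
The size argument can be checked with a back-of-the-envelope model. The sketch assumes, purely for illustration, that each TOC save is a single 4-byte store, that saving inline costs one such store per indirect call, and that saving in the prologue costs one store in total. The function names are made up for this example.

```cpp
#include <cstddef>

constexpr std::size_t kStoreBytes = 4;  // assumed size of one TOC save

// TOC saved next to every indirect call.
constexpr std::size_t inline_save_bytes(std::size_t indirect_calls)
{
  return indirect_calls * kStoreBytes;
}

// TOC saved once in the prologue, provided there is any indirect call.
constexpr std::size_t prologue_save_bytes(std::size_t indirect_calls)
{
  return indirect_calls > 0 ? kStoreBytes : 0;
}

static_assert(inline_save_bytes(1) == prologue_save_bytes(1),
              "one indirect call: same size");
static_assert(inline_save_bytes(2) > prologue_save_bytes(2),
              "two indirect calls: the prologue save is smaller");
```

The model says nothing about speed; the early-exit cost mentioned above is a separate consideration.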

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Re: [PATCH/RFC] rs6000: Remove optimize_for_speed check for implicit TARGET_SAVE_TOC_INDIRECT [PR108184]

2023-01-17 Thread Michael Meissner via Gcc-patches

On Tue, Jan 17, 2023 at 03:57:24PM -0500, Michael Meissner wrote:
> So I have objection to the change.  I suspect it may be better with a check 
> for
> just optimize either for speed or size, and not for speed.

Sigh.  I meant I have NO objection to the change.  Sorry about that.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

[PATCH] libstdc++: testsuite: Simplify codecvt_unicode

2023-01-17 Thread Dimitrij Mijoski via Gcc-patches

Stop using unique_ptr, create some objects directly.
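
As a rough illustration of the simplification (not the testsuite code itself): `std::codecvt_utf8` has a public destructor, so a facet object can be created directly as a local variable with no heap allocation or smart pointer. `decode_first_char` is a hypothetical helper written for this sketch. Note that `<codecvt>` is deprecated since C++17, so compilers may warn.

```cpp
#include <cassert>
#include <codecvt>
#include <cwchar>
#include <locale>

// Decode the first UTF-8 encoded character in [first, last) into `out`.
// Returns true if exactly one char32_t was produced without error.
bool decode_first_char(const std::codecvt_utf8<char32_t>& cvt,
                       const char* first, const char* last, char32_t& out)
{
  assert(first != last);
  std::mbstate_t state{};
  const char* from_next = nullptr;
  char32_t* to_next = nullptr;
  auto res = cvt.in(state, first, last, from_next, &out, &out + 1, to_next);
  return res != std::codecvt_base::error && to_next == &out + 1;
}
```

A caller would then write `std::codecvt_utf8<char32_t> cvt;` and pass `cvt` by reference, where the old code wrapped a `new` expression in a `unique_ptr`.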

libstdc++-v3/ChangeLog:

* testsuite/22_locale/codecvt/codecvt_unicode.cc: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode.h: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: Simplify.
---
 .../22_locale/codecvt/codecvt_unicode.cc   | 18 ++
 .../22_locale/codecvt/codecvt_unicode.h|  9 +
 .../codecvt/codecvt_unicode_wchar_t.cc | 12 ++--
 3 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc 
b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
index ae4b6c896..3d7393e4a 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
@@ -29,11 +29,12 @@ test_utf8_utf32_codecvts ()
   using codecvt_c32 = codecvt;
   auto loc_c = locale::classic ();
   VERIFY (has_facet (loc_c));
+
   auto &cvt = use_facet (loc_c);
   test_utf8_utf32_codecvts (cvt);
 
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_utf32_codecvts (*cvt_ptr);
+  auto cvt2 = codecvt_utf8 ();
+  test_utf8_utf32_codecvts (cvt2);
 }
 
 void
@@ -42,21 +43,22 @@ test_utf8_utf16_codecvts ()
   using codecvt_c16 = codecvt;
   auto loc_c = locale::classic ();
   VERIFY (has_facet (loc_c));
+
   auto &cvt = use_facet (loc_c);
   test_utf8_utf16_cvts (cvt);
 
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr);
+  auto cvt2 = codecvt_utf8_utf16 ();
+  test_utf8_utf16_cvts (cvt2);
 
-  auto cvt_ptr2 = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr2);
+  auto cvt3 = codecvt_utf8_utf16 ();
+  test_utf8_utf16_cvts (cvt3);
 }
 
 void
 test_utf8_ucs2_codecvts ()
 {
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_ucs2_cvts (*cvt_ptr);
+  auto cvt = codecvt_utf8 ();
+  test_utf8_ucs2_cvts (cvt);
 }
 
 int
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h 
b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
index 99d1a4684..fbdc7a35b 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
@@ -15,18 +15,11 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+#include 
 #include 
 #include 
-#include 
 #include 
 
-template 
-std::unique_ptr
-to_unique_ptr (T *ptr)
-{
-  return std::unique_ptr (ptr);
-}
-
 struct test_offsets_ok
 {
   size_t in_size, out_size;
diff --git 
a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc 
b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
index 169504939..f7a0a4fd8 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
@@ -27,8 +27,8 @@ void
 test_utf8_utf32_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ == 4
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_utf32_codecvts (*cvt_ptr);
+  auto cvt = codecvt_utf8 ();
+  test_utf8_utf32_codecvts (cvt);
 #endif
 }
 
@@ -36,8 +36,8 @@ void
 test_utf8_utf16_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ >= 2
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr);
+  auto cvt = codecvt_utf8_utf16 ();
+  test_utf8_utf16_cvts (cvt);
 #endif
 }
 
@@ -45,8 +45,8 @@ void
 test_utf8_ucs2_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ == 2
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_ucs2_cvts (*cvt_ptr);
+  auto cvt = codecvt_utf8 ();
+  test_utf8_ucs2_cvts (cvt);
 #endif
 }
 
-- 
2.34.1

[committed] wwwdocs: rsync: Remove trailing slash from tags

2023-01-17 Thread Gerald Pfeifer

Last such occurrence in the tree - gone now.

Gerald

---
 htdocs/rsync.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/htdocs/rsync.html b/htdocs/rsync.html
index e206f0b4..264ea2af 100644
--- a/htdocs/rsync.html
+++ b/htdocs/rsync.html
@@ -3,8 +3,8 @@
 
 
 
-
-
+
+
 GCC: Anonymous read-only rsync access
 https://gcc.gnu.org/gcc.css";>
 
-- 
2.38.1

[committed] wwwdocs: gcc-4.7: Adjust www.open-std.org links to https

2023-01-17 Thread Gerald Pfeifer

Pushed.

Gerald

---
 htdocs/gcc-4.7/cxx0x_status.html | 122 +++
 1 file changed, 61 insertions(+), 61 deletions(-)

diff --git a/htdocs/gcc-4.7/cxx0x_status.html b/htdocs/gcc-4.7/cxx0x_status.html
index af6a2ef8..19507d25 100644
--- a/htdocs/gcc-4.7/cxx0x_status.html
+++ b/htdocs/gcc-4.7/cxx0x_status.html
@@ -19,7 +19,7 @@
 ISO C++ committee.  The standard is available from various national
 standards bodies; working papers from before the release of the standard
 are available on the ISO C++ committee's web site
-at http://www.open-std.org/jtc1/sc22/wg21/";>http://www.open-std.org/jtc1/sc22/wg21/.
+at https://www.open-std.org/jtc1/sc22/wg21/";>https://www.open-std.org/jtc1/sc22/wg21/.
 Since this standard has only recently been completed, the feature set
 provided by the experimental C++11 mode may vary greatly from one GCC
 version to another. No attempts will be made to preserve backward
@@ -41,231 +41,231 @@ page.
 
 
   Rvalue references
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html";>N2118
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html";>N2118
Yes
 
 
   Rvalue references for *this
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm";>N2439
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm";>N2439
   No
 
 
   Initialization of class objects by rvalues
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html";>N1610
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html";>N1610
   Yes
 
 
   Non-static data member initializers
-  http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2756.htm";>N2756
+  https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2756.htm";>N2756
   Yes
 
 
   Variadic templates
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf";>N2242
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf";>N2242
Yes
 
 
   Extending variadic template template 
parameters
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2555.pdf";>N2555
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2555.pdf";>N2555
Yes
 
 
   Initializer lists
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm";>N2672
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm";>N2672
Yes
 
 
   Static assertions
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html";>N1720
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html";>N1720
Yes
 
 
   auto-typed variables
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf";>N1984
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf";>N1984
Yes
 
 
   Multi-declarator auto
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf";>N1737
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf";>N1737
Yes
 
 
   Removal of auto as a storage-class 
specifier
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2546.htm";>N2546
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2546.htm";>N2546
Yes
 
 
   New function declarator syntax
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm";>N2541
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm";>N2541
Yes
 
 
   New wording for C++11 lambdas
-  http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf";>N2927
+  https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf";>N2927
   Yes
 
 
   Declared type of an expression
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf";>N2343
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf";>N2343
Yes
 
 
   Right angle brackets
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html";>N1757
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html";>N1757
Yes
 
 
   Default template arguments for function templates
-  http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#226";>DR226
+  https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#226";>DR226
Yes
 
 
   Solving the SFINAE problem for expressions
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2634.html";>DR339
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2634.html";>DR339
Yes
 
 
   Template aliases
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2258.pdf";>N2258
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2258.pdf";>

Re: [PATCH] libstdc++: testsuite: Simplify codecvt_unicode

2023-01-17 Thread Jonathan Wakely via Gcc-patches

On Tue, 17 Jan 2023, 21:12 Dimitrij Mijoski via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Stop using unique_ptr, create some objects directly.
>

Thanks, I thought about suggesting this, but decided it was good enough.
But I'm glad to see the simplifications :-)

I'll get this committed.



> libstdc++-v3/ChangeLog:
>
> * testsuite/22_locale/codecvt/codecvt_unicode.cc: Simplify.
> * testsuite/22_locale/codecvt/codecvt_unicode.h: Simplify.
> * testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: Simplify.
> ---
>  .../22_locale/codecvt/codecvt_unicode.cc   | 18 ++
>  .../22_locale/codecvt/codecvt_unicode.h|  9 +
>  .../codecvt/codecvt_unicode_wchar_t.cc | 12 ++--
>  3 files changed, 17 insertions(+), 22 deletions(-)
>
> diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
> b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
> index ae4b6c896..3d7393e4a 100644
> --- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
> +++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
> @@ -29,11 +29,12 @@ test_utf8_utf32_codecvts ()
>using codecvt_c32 = codecvt;
>auto loc_c = locale::classic ();
>VERIFY (has_facet (loc_c));
> +
>auto &cvt = use_facet (loc_c);
>test_utf8_utf32_codecvts (cvt);
>
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
> -  test_utf8_utf32_codecvts (*cvt_ptr);
> +  auto cvt2 = codecvt_utf8 ();
> +  test_utf8_utf32_codecvts (cvt2);
>  }
>
>  void
> @@ -42,21 +43,22 @@ test_utf8_utf16_codecvts ()
>using codecvt_c16 = codecvt;
>auto loc_c = locale::classic ();
>VERIFY (has_facet (loc_c));
> +
>auto &cvt = use_facet (loc_c);
>test_utf8_utf16_cvts (cvt);
>
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
> -  test_utf8_utf16_cvts (*cvt_ptr);
> +  auto cvt2 = codecvt_utf8_utf16 ();
> +  test_utf8_utf16_cvts (cvt2);
>
> -  auto cvt_ptr2 = to_unique_ptr (new codecvt_utf8_utf16 ());
> -  test_utf8_utf16_cvts (*cvt_ptr2);
> +  auto cvt3 = codecvt_utf8_utf16 ();
> +  test_utf8_utf16_cvts (cvt3);
>  }
>
>  void
>  test_utf8_ucs2_codecvts ()
>  {
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
> -  test_utf8_ucs2_cvts (*cvt_ptr);
> +  auto cvt = codecvt_utf8 ();
> +  test_utf8_ucs2_cvts (cvt);
>  }
>
>  int
> diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
> b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
> index 99d1a4684..fbdc7a35b 100644
> --- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
> +++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
> @@ -15,18 +15,11 @@
>  // with this library; see the file COPYING3.  If not see
>  // .
>
> +#include 
>  #include 
>  #include 
> -#include 
>  #include 
>
> -template 
> -std::unique_ptr
> -to_unique_ptr (T *ptr)
> -{
> -  return std::unique_ptr (ptr);
> -}
> -
>  struct test_offsets_ok
>  {
>size_t in_size, out_size;
> diff --git
> a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
> b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
> index 169504939..f7a0a4fd8 100644
> --- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
> +++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
> @@ -27,8 +27,8 @@ void
>  test_utf8_utf32_codecvts ()
>  {
>  #if __SIZEOF_WCHAR_T__ == 4
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
> -  test_utf8_utf32_codecvts (*cvt_ptr);
> +  auto cvt = codecvt_utf8 ();
> +  test_utf8_utf32_codecvts (cvt);
>  #endif
>  }
>
> @@ -36,8 +36,8 @@ void
>  test_utf8_utf16_codecvts ()
>  {
>  #if __SIZEOF_WCHAR_T__ >= 2
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
> -  test_utf8_utf16_cvts (*cvt_ptr);
> +  auto cvt = codecvt_utf8_utf16 ();
> +  test_utf8_utf16_cvts (cvt);
>  #endif
>  }
>
> @@ -45,8 +45,8 @@ void
>  test_utf8_ucs2_codecvts ()
>  {
>  #if __SIZEOF_WCHAR_T__ == 2
> -  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
> -  test_utf8_ucs2_cvts (*cvt_ptr);
> +  auto cvt = codecvt_utf8 ();
> +  test_utf8_ucs2_cvts (cvt);
>  #endif
>  }
>
> --
> 2.34.1
>
>
>

Re: [RFC] tree-optimization: fix optimize-out variables passed into func to alloc

2023-01-17 Thread Andrew Pinski via Gcc-patches

On Tue, Jan 17, 2023 at 7:36 AM Alexey Lapshin via Gcc-patches
 wrote:
>
> After updating to GCC newer than 11.4.0 we found that some code started
> to fail if it was built with size optimization (-Os).
> You can find a test case that reproduces it in the attached patch.
>
> A simplified version of the affected code looks like this:
>
> void alloc_function (unsigned char **data_p) {
>   *data_p = malloc (8);
>   assert(*data_p != NULL);
> }
> int main () {
>   int *data;
>   alloc_function (&data);
>   printf ("data pointer is %p", data); // prints NULL(compile with -Os)
> }

This code violates the C/C++ aliasing rules.
You store to data through an "unsigned char *" lvalue and then load it
through an "int *" lvalue.
If you just want your code to work, you can use -fno-strict-aliasing.
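
A minimal sketch of one conforming way to write the example, not taken from the thread: the caller declares the out-parameter with the type that alloc_function actually stores through, and converts only the resulting pointer value. The helper name alloc_ints is hypothetical, and the sketch is written as C++, so it needs casts that the C original would not.

```cpp
#include <cassert>
#include <cstdlib>

// Same shape as the reporter's function: it stores through an
// `unsigned char *` lvalue.
static void alloc_function (unsigned char **data_p)
{
  *data_p = static_cast<unsigned char *> (std::malloc (8));
  assert (*data_p != nullptr);
}

// Hypothetical conforming caller.  The pointer object that
// alloc_function writes to really has type `unsigned char *`, so the
// store and the later load use the same type.  Only the resulting
// pointer value is converted to `int *`.
static int *alloc_ints ()
{
  unsigned char *raw = nullptr;
  alloc_function (&raw);
  return static_cast<int *> (static_cast<void *> (raw));
}
```

The caller owns the result and releases it with std::free. The other route mentioned in the reply, -fno-strict-aliasing, leaves the source unchanged and asks GCC not to apply type-based alias analysis.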

Thanks,
Andrew Pinski

>
> If the type of the passed argument matches the type in the
> alloc_function declaration, it works perfectly. Changing one or both
> types to void also helps.
>
> I found that the issue first appears with commit
> d119f34c952f8718fdbabc63e2f369a16e92fa07
> I tracked it down to one if-statement; with that statement removed,
> the code seems to work well.
>
> Could you please elaborate on exactly which cases this check is meant
> to optimize?
> I think it should also contain at least one more check for write
> accesses to the variable's memory.
>
>
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c | 17 +
>  gcc/tree-ssa-alias.cc |  2 --
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> new file mode 100644
> index 000..b30c1cedcb9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> @@ -0,0 +1,17 @@
> +/* { dg-do run } */
> +/* { dg-options "-Os" } */
> +
> +#define assert(x) if (!(x)) __builtin_abort ()
> +
> +static inline void alloc_function (unsigned char **data_p)
> +{
> +*data_p = (unsigned char *) __builtin_malloc (10);
> +assert (*data_p != (void *)0);
> +}
> +
> +int main ()
> +{
> +int *data = (void *)0;
> +alloc_function ((unsigned char **) &data);
> +assert (data != (void *)0);
> +}
> diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
> index b8f107dfa52..9068db300e5 100644
> --- a/gcc/tree-ssa-alias.cc
> +++ b/gcc/tree-ssa-alias.cc
> @@ -2608,8 +2608,6 @@ modref_may_conflict (const gcall *stmt,
>   if (num_tests >= max_tests)
> return true;
>   alias_stats.modref_tests++;
> - if (!alias_sets_conflict_p (base_set, base_node->base))
> -   continue;
>   num_tests++;
> }
>
> --
> 2.34.1
>

Re: [PATCH 1/1] [fwprop]: Add the support of forwarding the vec_duplicate rtx

2023-01-17 Thread Jeff Law via Gcc-patches





On 1/17/23 09:00, Richard Sandiford via Gcc-patches wrote:



> But the idea of the fwprop change looks OK to me in principle.
> What we have now seems conservative, based on heuristics that
> haven't been updated in a long time.  So relaxing them a bit seems
> like a good idea.  IIRC Jeff had another case in which the current
> heuristics were too strict.

Two actually, though neither is relevant to this particular problem.

Jeff

[PATCH] riscv: generate builtin macro for compilation with strict alignment

2023-01-17 Thread Vineet Gupta

This could be useful for library writers who want to write code variants
for fast vs. slow unaligned accesses.

We distinguish an explicit -mstrict-align (1) from the slow_unaligned_access
cpu tune param (2) to allow even more code diversity.
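
A rough sketch of how a library might consume the macro, assuming it is defined as proposed here (1 for an explicit -mstrict-align, 2 when only the CPU tuning reports slow unaligned accesses, undefined otherwise). The function name equal8 is made up for illustration and is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: compare eight bytes at two possibly unaligned
// addresses.
static bool equal8 (const unsigned char *a, const unsigned char *b)
{
  assert (a != nullptr && b != nullptr);
#if defined(__riscv_strict_align)
  // Unaligned accesses are forbidden (1) or slow (2): go byte by byte.
  for (int i = 0; i < 8; ++i)
    if (a[i] != b[i])
      return false;
  return true;
#else
  // Unaligned accesses are assumed cheap: let the compiler turn each
  // memcpy into a single 64-bit load.
  std::uint64_t x, y;
  std::memcpy (&x, a, sizeof x);
  std::memcpy (&y, b, sizeof y);
  return x == y;
#endif
}
```

A library that cares about the difference between the two cases could additionally test `__riscv_strict_align == 1` to separate a hard requirement from a tuning preference.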

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
  Generate __riscv_strict_align with value 1 or 2.
* config/riscv/riscv.cc: Define riscv_user_wants_strict_align.
  (riscv_option_override): Set riscv_user_wants_strict_align to
  TARGET_STRICT_ALIGN.
* config/riscv/riscv.h: Declare riscv_user_wants_strict_align.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-4.c: Check for
  __riscv_strict_align=1.
* gcc.target/riscv/predef-align-1.c: New test.
* gcc.target/riscv/predef-align-2.c: New test.
* gcc.target/riscv/predef-align-3.c: New test.
* gcc.target/riscv/predef-align-4.c: New test.
* gcc.target/riscv/predef-align-5.c: New test.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv-c.cc | 11 +++
 gcc/config/riscv/riscv.cc   |  9 +
 gcc/config/riscv/riscv.h|  1 +
 gcc/testsuite/gcc.target/riscv/attribute-4.c|  9 +
 gcc/testsuite/gcc.target/riscv/predef-align-1.c | 12 
 gcc/testsuite/gcc.target/riscv/predef-align-2.c | 11 +++
 gcc/testsuite/gcc.target/riscv/predef-align-3.c | 15 +++
 gcc/testsuite/gcc.target/riscv/predef-align-4.c | 16 
 gcc/testsuite/gcc.target/riscv/predef-align-5.c | 16 
 9 files changed, 100 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-align-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-align-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-align-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-align-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-align-5.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 826ae0067bb8..47a396501d74 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -102,6 +102,17 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
 
 }
 
+  /* TARGET_STRICT_ALIGN does not cover all cases.  */
+  if (riscv_slow_unaligned_access_p)
+{
+  /* Explicit -mstrict-align takes precedence over cpu tune param
+ slow_unaligned_access=true.  */
+  if (riscv_user_wants_strict_align)
+builtin_define_with_int_value ("__riscv_strict_align", 1);
+  else
+builtin_define_with_int_value ("__riscv_strict_align", 2);
+}
+
   if (TARGET_MIN_VLEN != 0)
 builtin_define_with_int_value ("__riscv_v_min_vlen", TARGET_MIN_VLEN);
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9a53999a39de..d6a40d043584 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -255,6 +255,9 @@ struct riscv_tune_info {
 /* Whether unaligned accesses execute very slowly.  */
 bool riscv_slow_unaligned_access_p;
 
+/* Whether the user explicitly passed -mstrict-align.  */
+bool riscv_user_wants_strict_align;
+
 /* Stack alignment to assume/maintain.  */
 unsigned riscv_stack_boundary;
 
@@ -6047,6 +6050,12 @@ riscv_option_override (void)
  -m[no-]strict-align is left unspecified, heed -mtune's advice.  */
   riscv_slow_unaligned_access_p = (cpu->tune_param->slow_unaligned_access
   || TARGET_STRICT_ALIGN);
+
+  /* Make a note if user explicitly passed -mstrict-align for later
+ builtin macro generation. Can't use  target_flags_explicit since
+ it is set even for -mno-strict-align.  */
+  riscv_user_wants_strict_align = TARGET_STRICT_ALIGN;
+
   if ((target_flags_explicit & MASK_STRICT_ALIGN) == 0
   && cpu->tune_param->slow_unaligned_access)
 target_flags |= MASK_STRICT_ALIGN;
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 0ab739bd6ebf..c55546656b7d 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1030,6 +1030,7 @@ while (0)
 #ifndef USED_FOR_TARGET
 extern const enum reg_class riscv_regno_to_class[];
 extern bool riscv_slow_unaligned_access_p;
+extern bool riscv_user_wants_strict_align;
 extern unsigned riscv_stack_boundary;
 extern unsigned riscv_bytes_per_vector_chunk;
 extern poly_uint16 riscv_vector_chunks;
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-4.c b/gcc/testsuite/gcc.target/riscv/attribute-4.c
index 7c565c4963ec..ce7f1929e6a6 100644
--- a/gcc/testsuite/gcc.target/riscv/attribute-4.c
+++ b/gcc/testsuite/gcc.target/riscv/attribute-4.c
@@ -2,5 +2,14 @@
 /* { dg-options "-mriscv-attribute -mstrict-align" } */
 int foo()
 {
+
+#if !defined(__riscv_strict_align)
+#error "__riscv_strict_align"
+#endif
+#if __riscv_strict_align != 1
+#error "__riscv_strict_align != 1"
+#endif
+
+  return 0;
 }
 /* { dg-final { scan-assembler ".attribute unaligned_access, 0" } } */
diff --git a/gcc/testsuite/gcc.target

Re: [PATCH 1/1] [fwprop]: Add the support of forwarding the vec_duplicate rtx

2023-01-17 Thread 丁乐华

> I don't think this pattern is correct, because SEL isn't commutative
> in the vector operands.


Indeed. I think I should first invert the PRED operand, or the comparison
operator that produces the PRED operand.


> I think this should be:
>
>  if (...)
>    to = XEXP (to, 0);
>
> and should be before the REG_P test.  We don't want to treat
> arbitrary duplicates as profitable.

Agreed, that adjustment is more rigorous.

> It's not obvious that vec_duplicate is special enough that we should
> treat it differently from other unary operators.  For example,
> zero_extend and sign_extend don't seem fundamentally more expensive
> than vec_duplicate.

Juzhe and I also discussed this offline recently. We also have widening vector
operators that need to be added; this can be done in RTL with forwarding
instead of adding widening GIMPLE internal functions. We think we could add a
target hook, for example:
`rtx try_forward (rtx dest, rtx src, rtx use_insn, rtx def_insn)`


If it returns NULL_RTX, the forwarding cannot be done; otherwise the dest
part of use_insn is replaced with the returned rtx.
Letting the backend decide what can be forwarded has several advantages:
1. Target-specific insns, such as unspecs, can also be forwarded, and when
    forwarding, just the relevant content can be extracted from def_insn
    instead of the complete src part.
2. By default the hook returns NULL_RTX, which reduces compatibility
    issues.


> It's a while since I looked at this code, but I assume that, even after
> this change, we will still require the new in-loop instruction to be
> no more expensive than the old in-loop instruction.  Is that right?


Yeah. Forwarding vec_duplicate may reduce the use of vector registers,
but it lengthens the live ranges of scalar registers. If scalar register
pressure is already high, this change may make the code more expensive. This
decision does not feel easy to make; is there some way to do this?


Best,
Lehua

[PATCH] RISC-V: Fix incorrect attributes of vsetvl instructions pattern

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/vector.md: Fix incorrect attributes.

---
 gcc/config/riscv/vector.md | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 4e93b7fead5..37cf4d6bcbf 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -95,13 +95,7 @@
 (const_int 32)
 (eq_attr "mode" "VNx1DI,VNx2DI,VNx4DI,VNx8DI,\
  VNx1DF,VNx2DF,VNx4DF,VNx8DF")
-(const_int 64)
-
-(eq_attr "type" "vsetvl")
-(if_then_else (eq_attr "INSN_CODE (curr_insn) == CODE_FOR_vsetvldi
-|| INSN_CODE (curr_insn) == CODE_FOR_vsetvlsi")
-  (symbol_ref "INTVAL (operands[2])")
-  (const_int INVALID_ATTRIBUTE))]
+(const_int 64)]
(const_int INVALID_ATTRIBUTE)))
 
 ;; Ditto to LMUL.
@@ -149,12 +143,7 @@
 (eq_attr "mode" "VNx4DI,VNx4DF")
   (symbol_ref "riscv_vector::get_vlmul(E_VNx4DImode)")
 (eq_attr "mode" "VNx8DI,VNx8DF")
-  (symbol_ref "riscv_vector::get_vlmul(E_VNx8DImode)")
-(eq_attr "type" "vsetvl")
-(if_then_else (eq_attr "INSN_CODE (curr_insn) == CODE_FOR_vsetvldi
-|| INSN_CODE (curr_insn) == CODE_FOR_vsetvlsi")
-  (symbol_ref "INTVAL (operands[3])")
-  (const_int INVALID_ATTRIBUTE))]
+  (symbol_ref "riscv_vector::get_vlmul(E_VNx8DImode)")]
(const_int INVALID_ATTRIBUTE)))
 
 ;; It is valid for instruction that require sew/lmul ratio.
@@ -531,7 +520,11 @@
   "TARGET_VECTOR"
   "vset%i1vli\t%0,%1,e%2,%m3,t%p4,m%p5"
   [(set_attr "type" "vsetvl")
-   (set_attr "mode" "")])
+   (set_attr "mode" "")
+   (set (attr "sew") (symbol_ref "INTVAL (operands[2])"))
+   (set (attr "vlmul") (symbol_ref "INTVAL (operands[3])"))
+   (set (attr "ta") (symbol_ref "INTVAL (operands[4])"))
+   (set (attr "ma") (symbol_ref "INTVAL (operands[5])"))])
 
 ;; vsetvl zero,zero,vtype instruction.
 ;; This pattern has no side effects and does not set X0 register.
@@ -563,7 +556,11 @@
   "TARGET_VECTOR"
   "vset%i0vli\tzero,%0,e%1,%m2,t%p3,m%p4"
   [(set_attr "type" "vsetvl")
-   (set_attr "mode" "")])
+   (set_attr "mode" "")
+   (set (attr "sew") (symbol_ref "INTVAL (operands[1])"))
+   (set (attr "vlmul") (symbol_ref "INTVAL (operands[2])"))
+   (set (attr "ta") (symbol_ref "INTVAL (operands[3])"))
+   (set (attr "ma") (symbol_ref "INTVAL (operands[4])"))])
 
 ;; It's emit by vsetvl/vsetvlmax intrinsics with no side effects.
 ;; Since we have many optmization passes from "expand" to "reload_completed",
-- 
2.36.3

[PATCH] RISC-V: Change VSETVL PASS always call split_all_insns

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

Since LCM changes the CFG, we are going to move the VSETVL pass so that it runs
at least before bbro (the block-reordering pass), which itself runs before the
split3 pass. The VSETVL pass therefore needs to call split_all_insns itself to
get the final RVV instruction patterns.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::execute): Always call 
split_all_insns.

---
 gcc/config/riscv/riscv-vsetvl.cc | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0245124e28f..d494369a603 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3092,12 +3092,10 @@ pass_vsetvl::execute (function *)
   if (n_basic_blocks_for_fn (cfun) <= 0)
 return 0;
 
-  /* The reason we have this since we didn't finish splitting yet
- when optimize == 0. In this case, we should conservatively
- split all instructions here to make sure we don't miss any
- RVV instruction.  */
-  if (!optimize)
-split_all_insns ();
+  /* An RVV instruction is not stable until it has been split, so it may
+ still change.  Split everything here to avoid potential issues,
+ since the VSETVL pass is inserted before the split pass.  */
+  split_all_insns ();
 
   /* Early return for there is no vector instructions.  */
   if (!has_vector_insn (cfun))
-- 
2.36.3

[PATCH] RISC-V: Remove DCE in VSETVL PASS

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::done): Remove DCE.
* config/riscv/t-riscv: Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 --
 gcc/config/riscv/t-riscv | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index d494369a603..c2a8b44584d 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -87,7 +87,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "riscv-vsetvl.h"
-#include "dce.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
@@ -2996,7 +2995,6 @@ pass_vsetvl::done (void)
cleanup_cfg (0);
   delete crtl->ssa;
   crtl->ssa = nullptr;
-  run_fast_dce ();
 }
   m_vector_manager->release ();
   delete m_vector_manager;
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index c95f4aff358..d30e0235356 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -54,7 +54,7 @@ riscv-c.o: $(srcdir)/config/riscv/riscv-c.cc $(CONFIG_H) $(SYSTEM_H) \
 riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \
   $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \
-  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h dce.h \
+  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h \
   predict.h profile-count.h $(srcdir)/config/riscv/riscv-vsetvl.h
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/riscv/riscv-vsetvl.cc
-- 
2.36.3

[PATCH] RISC-V: Clang-format some annotations[NFC]

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc: Clang-format.
---
 gcc/config/riscv/riscv-vsetvl.cc | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index c2a8b44584d..26d096ea939 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -33,7 +33,8 @@ along with GCC; see the file COPYING3.  If not see
 
 Assumption:
 
--  Each avl operand is either an immediate (must be in range 0 ~ 31) or reg.
+-  Each avl operand is either an immediate (must be in range 0 ~ 31) or
+   reg.
 
 This pass consists of 5 phases:
 
@@ -43,7 +44,8 @@ along with GCC; see the file COPYING3.  If not see
 -  Phase 2 - Emit vsetvl instructions within each basic block according to
demand, compute and save ANTLOC && AVLOC of each block.
 
--  Phase 3 - Backward && forward demanded info propagation and fusion across blocks.
+-  Phase 3 - Backward && forward demanded info propagation and fusion
+   across blocks.
 
 -  Phase 4 - Lazy code motion including: compute local properties,
pre_edge_lcm and vsetvl insertion && delete edges for LCM results.
-- 
2.36.3

[PATCH] RISC-V: Reorder VSETVL PASS location

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-passes.def (INSERT_PASS_BEFORE): Reorder VSETVL 
PASS.

---
 gcc/config/riscv/riscv-passes.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-passes.def b/gcc/config/riscv/riscv-passes.def
index d2d48f231aa..614b767dc8a 100644
--- a/gcc/config/riscv/riscv-passes.def
+++ b/gcc/config/riscv/riscv-passes.def
@@ -18,4 +18,4 @@
.  */
 
 INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
-INSERT_PASS_BEFORE (pass_sched2, 1, pass_vsetvl);
+INSERT_PASS_BEFORE (pass_fast_rtl_dce, 1, pass_vsetvl);
-- 
2.36.3

[PATCH] RISC-V: Change parse_insn into public for future use.

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.h: Change it into public.

---
 gcc/config/riscv/riscv-vsetvl.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index f24ad981f65..3b68bf638ae 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -260,9 +260,6 @@ private:
  Since RTL_SSA can not be enabled when optimize == 0, we don't initialize
  the m_insn.  */
   void parse_insn (rtx_insn *);
-  /* This is only called by lazy_vsetvl subroutine when optimize > 0.
- We use RTL_SSA framework to initialize the insn_info.  */
-  void parse_insn (rtl_ssa::insn_info *);
 
   friend class vector_infos_manager;
 
@@ -272,6 +269,10 @@ public:
   m_insn (nullptr)
   {}
 
+  /* This is only called by lazy_vsetvl subroutine when optimize > 0.
+ We use RTL_SSA framework to initialize the insn_info.  */
+  void parse_insn (rtl_ssa::insn_info *);
+
   bool operator> (const vector_insn_info &) const;
   bool operator>= (const vector_insn_info &) const;
   bool operator== (const vector_insn_info &) const;
-- 
2.36.3

[PATCH] RISC-V: Fix bug of before_p function

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (before_p): Test for a negative compare_with
result instead of exactly -1.
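
A small sketch of why the sign test is the safer form, assuming compare_with follows the usual three-way convention in which only the sign of the result is meaningful. The function compare_points below is a hypothetical stand-in, not the rtl-ssa implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for a three-way comparison over program
// points: negative if LHS comes first, zero if equal, positive if
// LHS comes later.  Only the sign is meaningful.
static int compare_points (int lhs, int rhs)
{
  assert (lhs >= 0 && rhs >= 0);  // keeps the subtraction from overflowing
  return lhs - rhs;
}

// Old form: true only when the result happens to be exactly -1.
static bool before_p_exact (int a, int b)
{
  return compare_points (a, b) == -1;
}

// New form: true for every negative result.
static bool before_p_sign (int a, int b)
{
  return compare_points (a, b) < 0;
}
```

With points 1 and 5 the comparison yields -4, so the first form reports false and the second reports true; the two agree only for adjacent points such as 4 and 5.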

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 26d096ea939..728a32dacd6 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -188,7 +188,7 @@ real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb)
 static bool
 before_p (const insn_info *insn1, const insn_info *insn2)
 {
-  return insn1->compare_with (insn2) == -1;
+  return insn1->compare_with (insn2) < 0;
 }
 
 static bool
-- 
2.36.3

[PATCH v6] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-17 Thread Lulu Cheng

Co-authored-by: Yang Yujie 

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_classify_address):
Add processing for CONST_INT.
(loongarch_print_operand_reloc): Operand modifier 'c' is supported.
(loongarch_print_operand): Add processing of '%c'.
* doc/extend.texi: Add documentation for LoongArch operand modifiers.
And port the public operand modifiers information to this document.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tst-asm-const.c: Moved to...
* gcc.target/loongarch/pr107731.c: ...here.
---
v2 -> v3:
1. Correct a clerical error.
2. Add documentation for LoongArch operand modifiers.

v3 -> v4:
Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.

v4 -> v5:
Move the operand modifiers description of "%c", "%n", "%a", "%l" to the top of
the x86Operandmodifiers section.

v5 -> v6:
Adjust the location of the added section in the document.

---
 gcc/config/loongarch/loongarch.cc | 14 +
 gcc/doc/extend.texi   | 51 +--
 .../loongarch/{tst-asm-const.c => pr107731.c} |  6 +--
 3 files changed, 64 insertions(+), 7 deletions(-)
 rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} (78%)

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index c6b03fcf2f9..cdf190b985e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2075,6 +2075,11 @@ loongarch_classify_address (struct loongarch_address_info *info, rtx x,
   return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
  && loongarch_valid_lo_sum_p (info->symbol_type, mode,
   info->offset));
+case CONST_INT:
+  /* Small-integer addresses don't occur very often, but they
+are legitimate if $r0 is a valid base register.  */
+  info->type = ADDRESS_CONST_INT;
+  return IMM12_OPERAND (INTVAL (x));
 
 default:
   return false;
@@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
 
'A' Print a _DB suffix if the memory model requires a release.
'b' Print the address of a memory operand, without offset.
+   'c'  Print an integer.
'C' Print the integer branch condition for comparison OP.
'd' Print CONST_INT OP in decimal.
'F' Print the FPU branch condition for comparison OP.
@@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
fputs ("_db", file);
   break;
 
+case 'c':
+  if (CONST_INT_P (op))
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+  else
+   output_operand_lossage ("unsupported operand for code '%c'", letter);
+
+  break;
+
 case 'C':
   loongarch_print_int_branch_condition (file, code, letter);
   break;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1103e9936f7..6a5d9faf2f3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10402,8 +10402,10 @@ ensures that modifying @var{a} does not affect the address referenced by
 is undefined if @var{a} is modified before using @var{b}.
 
 @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
-instead of simply @samp{%2}). Typically these qualifiers are hardware 
-dependent. The list of supported modifiers for x86 is found at 
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
 @ref{x86Operandmodifiers,x86 Operand modifiers}.
 
 If the C code that follows the @code{asm} makes no use of any of the output 
@@ -10671,8 +10673,10 @@ optimizers may discard the @code{asm} statement as unneeded
 (see @ref{Volatile}).
 
 @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
-instead of simply @samp{%2}). Typically these qualifiers are hardware 
-dependent. The list of supported modifiers for x86 is found at 
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
 @ref{x86Operandmodifiers,x86 Operand modifiers}.
 
 In this example using the fictitious @code{combine} instruction, the 
@@ -11024,6 +11028,30 @@ lab:
 @}
 @end example
 
+@anchor{GenericOperandmodifiers}
+@subsubsection Generic Operand Modifiers
+@noindent
+The following table shows the modifiers supported by all targets and their effects:
+
+@multitable {Modifier} {Print the opcode suffix for the size of th} {Operand}
+@headitem Modifier @tab Description @tab Operand
+@item @code{c}
+@tab Require a constant operand and print the constant expression with no punctuation.
+@tab @code{%c0}
+@item @code{n}
+@tab Like @samp{

[PATCH] RISC-V: Refine function args of some functions.

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): Pass the info argument by
const reference instead of by value.
(emit_vsetvl_insn): Ditto.
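
A short sketch of what the change from pass-by-value to pass-by-const-reference avoids. The type vtype_like and the counter copy_count are hypothetical stand-ins used only to make the copy observable; they are not GCC types.

```cpp
// Counts how many times the copy constructor below has run.
static int copy_count = 0;

// Hypothetical stand-in for an info object that is passed around.
struct vtype_like
{
  int sew;
  vtype_like () : sew (32) {}
  vtype_like (const vtype_like &other) : sew (other.sew) { ++copy_count; }
};

// By value: every call with an lvalue argument copies the object.
static int sew_by_value (vtype_like info)
{
  return info.sew;
}

// By const reference: the callee reads the caller's object directly.
static int sew_by_ref (const vtype_like &info)
{
  return info.sew;
}
```

Both functions return the same value; the difference is only that the by-value form runs the copy constructor once per call.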

---
 gcc/config/riscv/riscv-vsetvl.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 728a32dacd6..e11751f00af 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -580,7 +580,7 @@ has_vector_insn (function *fn)
 
 /* Emit vsetvl instruction.  */
 static rtx
-gen_vsetvl_pat (enum vsetvl_type insn_type, vl_vtype_info info, rtx vl)
+gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
 {
   rtx avl = info.get_avl ();
   rtx sew = gen_int_mode (info.get_sew (), Pmode);
@@ -600,7 +600,7 @@ gen_vsetvl_pat (enum vsetvl_type insn_type, vl_vtype_info info, rtx vl)
 }
 
 static rtx
-gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info info)
+gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info)
 {
   rtx new_pat;
   if (vsetvl_insn_p (rinsn) || vlmax_avl_p (info.get_avl ()))
@@ -617,7 +617,7 @@ gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info info)
 
 static void
 emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type,
- vl_vtype_info info, rtx vl, rtx_insn *rinsn)
+ const vl_vtype_info &info, rtx vl, rtx_insn *rinsn)
 {
   rtx pat = gen_vsetvl_pat (insn_type, info, vl);
   if (dump_file)
-- 
2.36.3

[PATCH] RISC-V: Add :: for static function calling to avoid confusing

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

The class has a member function with the same name as the static
function get_avl.  Add :: to the call to make clear which one is meant.
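
As a rough illustration of why the qualification can matter, here is a
minimal C++ sketch.  The names pass_sketch, avl_from_insn and check_lookup,
and the int stand-in for an insn, are hypothetical and not taken from
riscv-vsetvl.cc; only the name get_avl mirrors the patch.  Inside a member
function, unqualified lookup finds a same-named member first, so ::get_avl
names the file-scope function unambiguously.

```cpp
#include <cassert>

// File-scope helper, standing in for the static get_avl (rtx_insn *).
// The int parameter is only a placeholder for the real insn type.
static int get_avl(int insn) { return insn * 2; }

// Hypothetical class that also has a member named get_avl.
struct pass_sketch {
  int avl;

  int get_avl() const { return avl; }

  int avl_from_insn(int insn) const {
    // An unqualified get_avl (insn) would find the member above first
    // and fail to compile, because that member takes no arguments.
    return ::get_avl(insn);
  }
};

// Small self-check of the two lookups.
inline void check_lookup() {
  pass_sketch p{7};
  assert(p.get_avl() == 7);          // member function
  assert(p.avl_from_insn(5) == 10);  // file-scope helper via ::
}
```

The patch only changes how the call is spelled, so the unqualified form
presumably resolved to the intended function already; the sketch shows
the stricter case where a same-named member would hide it.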

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_backward_fusion_type): 
Add ::.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index e11751f00af..b33c198bbd6 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2073,7 +2073,7 @@ pass_vsetvl::get_backward_fusion_type (const bb_info *bb,
reg = get_vl (insn->rtl ());
  else
/* Check AVL operand for vsetvl zero,avl.  */
-   reg = get_avl (insn->rtl ());
+   reg = ::get_avl (insn->rtl ());
}
 }
 
-- 
2.36.3

[PATCH] RISC-V: Finalize VSETVL PASS implementation

2023-01-17 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vsetvl_insn_p): Add condition to avoid 
ICE.
(vsetvl_discard_result_insn_p): New function.
(reg_killed_by_bb_p): rename to find_reg_killed_by.
(find_reg_killed_by): New name.
(get_vl): allow it to be called by more functions.
(has_vsetvl_killed_avl_p): Add condition.
(get_avl): allow it to be called by more functions.
(insn_should_be_added_p): New function.
(get_all_nonphi_defs): Refine function.
(get_all_sets): Ditto.
(get_same_bb_set): New function.
(any_insn_in_bb_p): Ditto.
(any_set_in_bb_p): Ditto.
(get_vl_vtype_info): Add VLMAX forward optimization.
(source_equal_p): Fix issues.
(extract_single_source): Refine.
(avl_info::multiple_source_equal_p): New function.
(avl_info::operator==): Adjust for final version.
(vl_vtype_info::operator==): Ditto.
(vl_vtype_info::same_avl_p): Ditto.
(vector_insn_info::parse_insn): Ditto.
(vector_insn_info::available_p): New function.
(vector_insn_info::merge): Adjust for final version.
(vector_insn_info::dump): Add hard_empty.
(pass_vsetvl::hard_empty_block_p): New function.
(pass_vsetvl::backward_demand_fusion): Adjust for final version.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::demand_fusion): Ditto.
(pass_vsetvl::cleanup_illegal_dirty_blocks): New function.
(pass_vsetvl::compute_local_properties): Adjust for final version.
(pass_vsetvl::can_refine_vsetvl_p): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): New function.
(pass_vsetvl::lazy_vsetvl): Adjust for new version.
* config/riscv/riscv-vsetvl.h (enum def_type): New enum.

---
 gcc/config/riscv/riscv-vsetvl.cc | 930 +++
 gcc/config/riscv/riscv-vsetvl.h  |  30 +-
 2 files changed, 737 insertions(+), 223 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index b33c198bbd6..253bfc7b210 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -54,6 +54,8 @@ along with GCC; see the file COPYING3.  If not see
used any more and VL operand of VSETVL instruction if it is not used by
any non-debug instructions.
 
+-  Phase 6 - Propagate AVL between vsetvl instructions.
+
 Implementation:
 
 -  The subroutine of optimize == 0 is simple_vsetvl.
@@ -175,8 +177,20 @@ vector_config_insn_p (rtx_insn *rinsn)
 static bool
 vsetvl_insn_p (rtx_insn *rinsn)
 {
+  if (!vector_config_insn_p (rinsn))
+return false;
   return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi
-|| INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
+ || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
+}
+
+/* Return true if it is vsetvl zero, rs1.  */
+static bool
+vsetvl_discard_result_insn_p (rtx_insn *rinsn)
+{
+  if (!vector_config_insn_p (rinsn))
+return false;
+  return (INSN_CODE (rinsn) == CODE_FOR_vsetvl_discard_resultdi
+ || INSN_CODE (rinsn) == CODE_FOR_vsetvl_discard_resultsi);
 }
 
 static bool
@@ -191,15 +205,27 @@ before_p (const insn_info *insn1, const insn_info *insn2)
   return insn1->compare_with (insn2) < 0;
 }
 
-static bool
-reg_killed_by_bb_p (const bb_info *bb, rtx x)
+static insn_info *
+find_reg_killed_by (const bb_info *bb, rtx x)
 {
-  if (!x || vlmax_avl_p (x))
-return false;
-  for (const insn_info *insn : bb->real_nondebug_insns ())
+  if (!x || vlmax_avl_p (x) || !REG_P (x))
+return nullptr;
+  for (insn_info *insn : bb->reverse_real_nondebug_insns ())
 if (find_access (insn->defs (), REGNO (x)))
-  return true;
-  return false;
+  return insn;
+  return nullptr;
+}
+
+/* Helper function to get VL operand.  */
+static rtx
+get_vl (rtx_insn *rinsn)
+{
+  if (has_vl_op (rinsn))
+{
+  extract_insn_cached (rinsn);
+  return recog_data.operand[get_attr_vl_op_idx (rinsn)];
+}
+  return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0));
 }
 
 static bool
@@ -208,6 +234,9 @@ has_vsetvl_killed_avl_p (const bb_info *bb, const vector_insn_info &info)
   if (info.dirty_with_killed_avl_p ())
 {
   rtx avl = info.get_avl ();
+  if (vlmax_avl_p (avl))
+   return find_reg_killed_by (bb, get_vl (info.get_insn ()->rtl ()))
+  != nullptr;
   for (const insn_info *insn : bb->reverse_real_nondebug_insns ())
{
  def_info *def = find_access (insn->defs (), REGNO (avl));
@@ -229,18 +258,6 @@ has_vsetvl_killed_avl_p (const bb_info *bb, const vector_insn_info &info)
   return false;
 }
 
-/* Helper function to get VL operand.  */
-static rtx
-get_vl (rtx_insn *rinsn)
-{
-  if (has_vl_op (rinsn))
-{
-  extract_insn_cached (rinsn);
-  return recog_data.operand[get_attr_vl_op_idx (rinsn)];
-}
-

[PATCH v3] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches

On 2023/01/17 20:23, Max Filippov wrote:
> Hi Suwa-san,
Hi!

> There's still a few regressions in tests with -fcompare-debug because
> code generated with -g and without it is different:
> E.g. check the following test with -g0 and -g:
Again debug_insn is the problem...

=
In the case of the CALL0 ABI, values that must be preserved across
function calls are placed in the callee-saved registers (A12 through
A15) and referenced later.  However, it is often the case that such a
register is set only once and read only once, each time by a simple
register-register move (the frame pointer is needed to recover the
stack pointer and must be excluded).

For example, suppose there are no other occurrences of register A14
in the following code:

;; before
; prologue {
  ...
s32i.n  a14, sp, 16
  ...
; } prologue
  ...
mov.n   a14, a6
  ...
call0   foo
  ...
mov.n   a8, a14
  ...
; epilogue {
  ...
l32i.n  a14, sp, 16
  ...
; } epilogue

This can be rewritten as follows:

;; after
; prologue {
  ...
(deleted)
  ...
; } prologue
  ...
s32i.n  a6, sp, 16
  ...
call0   foo
  ...
l32i.n  a8, sp, 16
  ...
; epilogue {
  ...
(deleted)
  ...
; } epilogue

This patch introduces a new peephole2 pattern that implements the above.

gcc/ChangeLog:

* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the use of a callee-saved register that is saved and restored only
once on behalf of another register, by using its stack slot directly.
---
 gcc/config/xtensa/xtensa.md | 62 +
 1 file changed, 62 insertions(+)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 98f3c468f8b..2f3b2256d8b 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3024,3 +3024,65 @@ FALLTHRU:;
   operands[1] = GEN_INT (imm0);
   operands[2] = GEN_INT (imm1);
 })
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+   (match_operand:SI 1 "reload_operand"))]
+  "!TARGET_WINDOWED_ABI && df
+   && epilogue_contains (insn)
+   && ! call_used_or_fixed_reg_p (REGNO (operands[0]))
+   && (!frame_pointer_needed
+   || REGNO (operands[0]) != HARD_FRAME_POINTER_REGNUM)"
+  [(const_int 0)]
+{
+  rtx reg = operands[0], pattern;
+  rtx_insn *insnP = NULL, *insnS = NULL, *insnR = NULL;
+  df_ref ref;
+  rtx_insn *insn;
+  for (ref = DF_REG_DEF_CHAIN (REGNO (reg));
+   ref; ref = DF_REF_NEXT_REG (ref))
+if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+   || ! NONJUMP_INSN_P (insn = DF_REF_INSN (ref)))
+  continue;
+else if (insn == curr_insn)
+  continue;
+else if (GET_CODE (pattern = PATTERN (insn)) == SET
+&& rtx_equal_p (SET_DEST (pattern), reg)
+&& REG_P (SET_SRC (pattern)))
+  {
+   if (insnS)
+ FAIL;
+   insnS = insn;
+   continue;
+  }
+else
+  FAIL;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg));
+   ref; ref = DF_REF_NEXT_REG (ref))
+if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+   || ! NONJUMP_INSN_P (insn = DF_REF_INSN (ref)))
+  continue;
+else if (prologue_contains (insn))
+  {
+   insnP = insn;
+   continue;
+  }
+else if (GET_CODE (pattern = PATTERN (insn)) == SET
+&& rtx_equal_p (SET_SRC (pattern), reg)
+&& REG_P (SET_DEST (pattern)))
+  {
+   if (insnR)
+ FAIL;
+   insnR = insn;
+   continue;
+  }
+else
+  FAIL;
+  if (!insnP || !insnS || !insnR)
+FAIL;
+  SET_DEST (PATTERN (insnS)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnS);
+  SET_SRC (PATTERN (insnR)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnR);
+  set_insn_deleted (insnP);
+})
-- 
2.30.2

[PATCH v2] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches

Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:

/* example */
double test(double a, double b) {
  return __builtin_copysign(a, b);
}

test:
add.n   a3, a3, a3
extui   a5, a5, 31, 1
ssai    1
;; be in the same BB
src a7, a5, a3  ;; No '0' in the source constraints
;; No CALL insns in this span
;; Both A3 and A7 are irrelevant to
;;   insns in this span
mov.n   a3, a7  ;; An unnecessary reg-reg move
;; A7 is not used after this
ret.n

The last two instructions above, excluding the return instruction,
could be done like this:

src a3, a5, a3

This symptom often occurs when handling DI/DFmode values with SImode
instructions.  This patch solves the above problem using a peephole2
pattern.

gcc/ChangeLog:

* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the occurrence of a general-purpose register used only once and for
transferring an intermediate value.
---
 gcc/config/xtensa/xtensa.md | 43 +
 1 file changed, 43 insertions(+)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 3694d95ad..0a477e711 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3091,3 +3091,46 @@ FALLTHRU:;
   df_insn_rescan (insnR);
   set_insn_deleted (insnP);
 })
+
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+   (match_operand 1 "register_operand"))]
+  "GET_MODE_SIZE (GET_MODE (operands[0])) == 4
+   && GET_MODE_SIZE (GET_MODE (operands[1])) == 4
+   && GP_REG_P (REGNO (operands[0])) && GP_REG_P (REGNO (operands[1]))
+   && peep2_reg_dead_p (1, operands[1])"
+  [(const_int 0)]
+{
+  basic_block bb = BLOCK_FOR_INSN (curr_insn);
+  rtx_insn *head = BB_HEAD (bb), *insn;
+  rtx dest = operands[0], src = operands[1], pattern, t_dest;
+  int i;
+  for (insn = PREV_INSN (curr_insn);
+   insn && insn != head;
+   insn = PREV_INSN (insn))
+if (CALL_P (insn))
+  break;
+else if (INSN_P (insn))
+  {
+   if (GET_CODE (pattern = PATTERN (insn)) == SET
+   && REG_P (t_dest = SET_DEST (pattern))
+   && GET_MODE_SIZE (GET_MODE (t_dest)) == 4
+   && REGNO (t_dest) == REGNO (src))
+   {
+ extract_constrain_insn (insn);
+ for (i = 1; i < recog_data.n_operands; ++i)
+   if (strchr (recog_data.constraints[i], '0'))
+ goto ABORT;
+ SET_REGNO (t_dest, REGNO (dest));
+ goto FALLTHRU;
+   }
+   if (reg_overlap_mentioned_p (dest, pattern)
+   || reg_overlap_mentioned_p (src, pattern)
+   || set_of (dest, insn)
+   || set_of (src, insn))
+ break;
+  }
+ABORT:
+  FAIL;
+FALLTHRU:;
+})
-- 
2.30.2

[PATCH] xtensa: Optimize inversion of the MSB

2023-01-17 Thread Takayuki 'January June' Suwa via Gcc-patches

Inverting the MSB can be done with either a bitwise XOR or an addition
of -2147483648, but the latter is one byte shorter if TARGET_DENSITY.
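
The equivalence of the two forms can be checked with a small C++ sketch.
It is only an illustration, not part of the patch; the names kMsb,
flip_msb_xor, flip_msb_add and check_samples are made up here.  The
constant -2147483648 has the 32-bit pattern 0x80000000, so adding it
modulo 2^32 changes only bit 31: the lower 31 bits are untouched and the
carry out of bit 31 is discarded.

```cpp
#include <cassert>
#include <cstdint>

// Bit pattern of -2147483648 in 32-bit two's complement.
constexpr std::uint32_t kMsb = UINT32_C(0x80000000);

// Invert bit 31 with a bitwise XOR.
constexpr std::uint32_t flip_msb_xor(std::uint32_t x) { return x ^ kMsb; }

// Invert bit 31 with a wrapping 32-bit addition.
constexpr std::uint32_t flip_msb_add(std::uint32_t x) { return x + kMsb; }

static_assert(flip_msb_xor(0) == kMsb, "XOR sets a clear MSB");
static_assert(flip_msb_add(kMsb) == 0, "carry out of bit 31 is dropped");

// Compare both forms on a few representative values.
inline void check_samples() {
  const std::uint32_t samples[] = {0u, 1u, 0x7fffffffu,
                                   0x80000000u, 0xdeadbeefu, 0xffffffffu};
  for (std::uint32_t x : samples)
    assert(flip_msb_xor(x) == flip_msb_add(x));
}
```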

gcc/ChangeLog:

* config/xtensa/xtensa.md (xorsi3_internal):
Rename from the original of "xorsi3".
(xorsi3): New expansion pattern that emits addition rather than
bitwise-XOR when the second source is a constant of -2147483648
if TARGET_DENSITY.
---
 gcc/config/xtensa/xtensa.md | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 0a477e711..4b5899a4c 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -736,7 +736,31 @@
(set_attr "mode""SI")
(set_attr "length"  "3")])
 
-(define_insn "xorsi3"
+(define_expand "xorsi3"
+  [(set (match_operand:SI 0 "register_operand")
+   (xor:SI (match_operand:SI 1 "register_operand")
+   (match_operand:SI 2 "nonmemory_operand")))]
+  ""
+{
+  if (register_operand (operands[2], SImode))
+emit_insn (gen_xorsi3_internal (operands[0], operands[1],
+   operands[2]));
+  else
+{
+  rtx (*gen_op)(rtx, rtx, rtx);
+  if (TARGET_DENSITY
+ && CONST_INT_P (operands[2])
+ && INTVAL (operands[2]) == -2147483648L)
+   gen_op = gen_addsi3;
+  else
+   gen_op = gen_xorsi3_internal;
+  emit_insn (gen_op (operands[0], operands[1],
+force_reg (SImode, operands[2])));
+}
+  DONE;
+})
+
+(define_insn "xorsi3_internal"
   [(set (match_operand:SI 0 "register_operand" "=a")
(xor:SI (match_operand:SI 1 "register_operand" "%r")
(match_operand:SI 2 "register_operand" "r")))]
-- 
2.30.2

Re: [RFC] tree-optimization: fix optimize-out variables passed into func to alloc

2023-01-17 Thread Richard Biener via Gcc-patches

On Tue, Jan 17, 2023 at 10:41 PM Andrew Pinski via Gcc-patches
 wrote:
>
> On Tue, Jan 17, 2023 at 7:36 AM Alexey Lapshin via Gcc-patches
>  wrote:
> >
> > After updating to GCC newer than 11.4.0 we found that some code started
> > to fail if it was built with size optimization (-Os).
> > You can find testsuite for reproduction in the attached patch.
> >
> > The simplified version affected code looks like this:
> >
> > void alloc_function (unsigned char **data_p) {
> >   *data_p = malloc (8);
> >   assert(*data_p != NULL);
> > }
> > int main () {
> >   int *data;
> >   alloc_function (&data);
> >   printf ("data pointer is %p", data); // prints NULL(compile with -Os)
> > }
>
> This code is violating C/C++ aliasing rules.
> You store to data via a "unsigned char*" and then do a load via "int*".
> You can just use -fno-strict-aliasing if you want your code to just work.

You can also use void **data_p as a workaround; GCC treats void *
similarly to a character type for aliasing purposes (note that this is
a GCC extension and not guaranteed to work by the C/C++ standards).
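
A portable alternative, sketched here in C++ under the assumption that
the caller can be changed: keep the pointer object in the same type the
callee stores through, and convert only the pointer value afterwards.
alloc_ints is a hypothetical wrapper, not part of the report or of GCC;
alloc_function keeps the signature from the report.

```cpp
#include <cassert>
#include <cstdlib>

// Same signature as in the report: the callee stores through
// an lvalue of type unsigned char *.
static void alloc_function(unsigned char **data_p) {
  *data_p = static_cast<unsigned char *>(std::malloc(8));
  assert(*data_p != nullptr);
}

// Hypothetical wrapper: the pointer object 'raw' has exactly the type
// the callee writes, so the store and the later load agree.
static int *alloc_ints() {
  unsigned char *raw = nullptr;
  alloc_function(&raw);
  // Convert the pointer value, not the pointer object; malloc'ed
  // storage is suitably aligned for int.
  return reinterpret_cast<int *>(raw);
}
```

The caller then frees the result with std::free as usual.  Unlike the
original code, no int * object is ever written through an
unsigned char ** here, so the result does not depend on
-fno-strict-aliasing.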

Richard.

> Thanks,
> Andrew Pinski
>
> >
> > If the type of passed argument is equal to the type in alloc_function
> > declaration it works perfectly. Also helps change one or both types to
> > void.
> >
> > I found that issue started to appear from commit
> > d119f34c952f8718fdbabc63e2f369a16e92fa07
> > if-statement which leads to this issue was found and after being
> > removed seems it works well.
> >
> > Could you please elaborate on what cases exactly this checking should
> > optimize?
> > I think it should also contain at least one more check for accessing
> > variable's memory to write..
> >
> >
> > ---
> >  gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c | 17 +
> >  gcc/tree-ssa-alias.cc |  2 --
> >  2 files changed, 17 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> > new file mode 100644
> > index 000..b30c1cedcb9
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/alloc-in-func.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-Os" } */
> > +
> > +#define assert(x) if (!(x)) __builtin_abort ()
> > +
> > +static inline void alloc_function (unsigned char **data_p)
> > +{
> > +*data_p = (unsigned char *) __builtin_malloc (10);
> > +assert (*data_p != (void *)0);
> > +}
> > +
> > +int main ()
> > +{
> > +int *data = (void *)0;
> > +alloc_function ((unsigned char **) &data);
> > +assert (data != (void *)0);
> > +}
> > diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
> > index b8f107dfa52..9068db300e5 100644
> > --- a/gcc/tree-ssa-alias.cc
> > +++ b/gcc/tree-ssa-alias.cc
> > @@ -2608,8 +2608,6 @@ modref_may_conflict (const gcall *stmt,
> >   if (num_tests >= max_tests)
> > return true;
> >   alias_stats.modref_tests++;
> > - if (!alias_sets_conflict_p (base_set, base_node->base))
> > -   continue;
> >   num_tests++;
> > }
> >
> > --
> > 2.34.1
> >
