[Bug ipa/95336] New: Bad code gen omnetpp_r aarch64

2020-05-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

Bug ID: 95336
   Summary: Bad code gen omnetpp_r aarch64
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erick.oc...@theobroma-systems.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hello,

I have been using a configuration file to compile and run CPU2017. This
configuration file worked well with gcc9, but it doesn't seem to work well with
gcc10. I am aware of the instructions outlined in "Porting to GCC 10" [0] and I
believe I have followed them. However, at least for omnetpp_r there still
seems to be an issue. Compilation succeeded but gave several warnings, including
the following:

simulator/matchexpression.tab.cc: In function
'matchexpressionyyparse.constprop.isra':
simulator/matchexpression.tab.cc:1444:37: warning: argument 1 value
'18446744073709551615' exceeds maximum object size 9223372036854775807
[-Walloc-size-larger-than=]
 1444 |  yymsg = (char *) YYSTACK_ALLOC (yyalloc);
  |  

The resulting binary then immediately segfaulted.

I used the following compiler flags:

-flto -fcommon -O3

I noticed that if I reduce the optimization level to -O2, there is no segfault.

I did a bisection from

commit f47f687a97260b1a1305cbf2d7ee3d74b2916a74
Author: Richard Biener 
Date:   Thu Apr 25 17:58:56 2019 +

to:

commit 4945b4c2c8628bdd61b348ea5bd1f9b72537a36e (HEAD)
Author: Martin Liska 
Date:   Tue May 26 09:01:41 2020 +0200

and I found that the following commit may have introduced the error:

commit ff6686d2e5f797d6c6a36ad14a7084bc1dc350e4
Author: Martin Jambor 
Date:   Fri Sep 20 00:25:04 2019 +0200

I am not sure if this is a known issue or if I'm doing something wrong. 

This is the latest GCC version that I know reproduces the error:

[eochoa@osprey1 ~]$ $HOME/code/gcc-inst/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/eochoa/code/gcc-inst/bin/gcc
COLLECT_LTO_WRAPPER=/home/eochoa/code/gcc-inst/libexec/gcc/aarch64-unknown-linux-gnu/11.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /home/eochoa/code/gcc/configure --disable-bootstrap
--disable-libsanitizer --enable-__cxa_atexit --enable-shared
--disable-libsanitizer --enable-languages=c,c++,fortran --enable-lto
--enable-gold --enable-linker-build-id --with-cpu-emag
--prefix=/home/eochoa/code/gcc-inst/
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20200526 (experimental) (GCC)

[0] https://gcc.gnu.org/gcc-10/porting_to.html

[Bug ipa/95336] Bad code gen omnetpp_r aarch64

2020-05-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

--- Comment #2 from Erick Ochoa  ---
(In reply to Andrew Pinski from comment #1)
> Did you try -fno-strict-aliasing?

CXX  = $(CXX_PATH) -ggdb
TUNE_FAST= -mtune=emag -O3
CXXOPTIMIZE  = $(TUNE_FAST) -fno-strict-aliasing
PASS1_CFLAGS = -flto -fcommon
PASS1_CXXFLAGS   = -flto -fcommon
PASS1_FFLAGS = -flto -fcommon
PASS1_LDFLAGS= -flto -fcommon -fno-strict-aliasing

I originally did not. I added it as you suggested. I added it to both the
compiler and the linker just in case.

[Bug ipa/95336] Bad code gen omnetpp_r aarch64

2020-05-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

--- Comment #3 from Erick Ochoa  ---
(In reply to Erick Ochoa from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > Did you try -fno-strict-aliasing?
> 
> CXX  = $(CXX_PATH) -ggdb
> TUNE_FAST= -mtune=emag -O3
> CXXOPTIMIZE  = $(TUNE_FAST) -fno-strict-aliasing
> PASS1_CFLAGS = -flto -fcommon
> PASS1_CXXFLAGS   = -flto -fcommon
> PASS1_FFLAGS = -flto -fcommon
> PASS1_LDFLAGS= -flto -fcommon -fno-strict-aliasing
> 
> I originally did not. I added it as you suggested. I added it to both the
> compiler and the linker just in case.

I forgot to mention that the problem still persists.

[Bug target/95336] Bad code gen omnetpp_r aarch64

2020-05-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

--- Comment #5 from Erick Ochoa  ---
(In reply to Andrew Pinski from comment #4)
> Does -O2 -flto -ftree-vectorize fail also?

It does not fail. I will try to narrow down the problem to an optimization
later today.

[Bug target/95336] Bad code gen omnetpp_r aarch64

2020-05-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

--- Comment #7 from Erick Ochoa  ---

I have run revision 4945b4c2c8628bdd61b348ea5bd1f9b72537a36e with -O3 but with
all -O2 and -O3 optimizations disabled except

>   -finline-functions          [enabled]
>   -finline-small-functions    [enabled]

And the bug was triggered.

(In reply to Martin Jambor from comment #6)
> Can you please try the previous revision (6889a3acfee) but with option
> -fno-ipa-sra ?  If it fails, it means that the previous implementation
> of IPA-SRA hid some other error (we have already had an aliasing bug
> like that) - in that case it would be great if you could bisect again,
> this time with this option.

I ran revision 6889a3acfee

* (with) -fno-ipa-sra:    fails
* (without) -fno-ipa-sra: fails

This was weird to me because, according to the bisection, it should succeed
without -fno-ipa-sra. But then I used the same flags I used during the
bisection, which included -fprofile-generate...

* (with) -fno-ipa-sra -fprofile-generate: fails
* (with) -fprofile-generate:  succeeds

Maybe using -fprofile-generate to bisect was not the correct decision? The bug
may have been hidden by the indirection provided by the profiling functions. But
at least there's evidence of different behaviour. So: yes, I'll bisect again
with the minimal flags that trigger the error and let's see what happens.

[Bug target/95336] [10/11 Regression] Bad code gen omnetpp_r aarch64

2020-05-27 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95336

--- Comment #9 from Erick Ochoa  ---
(In reply to Martin Liška from comment #8)
> I've just tried current gcc-10 branch tip and I can't reproduce it with:
> -mtune=emag -O3 -flto=16
> 
> Can you please attach complete build log?

Hi,

I was able to get rid of the problem. Thanks for all your help and sorry for
opening this ticket. I believe the cause was a mismatch between the installed
gcc and either the linker or glibc. After rebuilding my entire toolchain (and
not just gcc), I have been able to run omnetpp successfully.

[Bug ipa/92538] New: Proposal for IPA init() constant propagation

2019-11-15 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538

Bug ID: 92538
   Summary: Proposal for IPA init() constant propagation
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erick.oc...@theobroma-systems.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47278
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47278&action=edit
Patch

Hello,

we've been working on an lto optimization and would like to receive some
feedback on this candidate patch.
At the moment the patch compiles correctly when applied to master.
We have some initial test cases in a separate repository and have compiled and
run a small subset of CPU 2017 benchmarks with comparable results.

The proposed transformation (ipa-initcall-cp) is a variant of interprocedural
constant propagation.
ipa-initcall-cp collects all variables with static lifetime that are written
only once (as in the case of initialization functions) and propagates the
written value to reads across the whole link unit.

In order to run, apply the patch and compile with `-flto -finitcall-cp`.

For this transformation to be sound, the following writes are not taken into
account:
* writes that can be reached from a function pointer,
* writes that can be reached from a function with outside visibility, and
* writes that may happen after any read.

In order to determine that no read happens before the write, we have to:
* find the main function
* find the function and basic block of the write
* for each read in another function
  * make sure that call to write dominates all callees of the read function
* for each read in the same function
  * make sure that write dominates read

Some initial drawbacks:
* We've noticed that we have to disable ipa-cp in order for ipa-initcall-cp to
run successfully.
This is most likely due to some issue with clones, and we will need to make some
design changes.
The function materialize_all_clones fails after ipa-initcall-cp if ipa-cp is
not commented out.
Suggestions are welcome.
* At the moment, I still need to clean up the code a bit, since it doesn't yet
conform to the coding standards.
* I still need to add tests to the testsuite, as opposed to running them
myself.

Some future work:
* At the moment, ipa-initcall-cp only propagates values from a single write.
However, we could conceivably extend this work to propagate the first n writes
and specialize functions based on the constants.

[Bug ipa/92538] Proposal for IPA init() constant propagation

2019-11-25 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538

Erick Ochoa  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |erick.ochoa@theobroma-systems.com

--- Comment #3 from Erick Ochoa  ---
Created attachment 47355
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47355&action=edit
Update for ipa-initcall-cp

When applied to master (commit id: d0c0f2f6d2ba374085840c79882a13a4f7bbb6f9)
this patch adds an optimization to propagate constants initialized in init
functions.

[Bug ipa/92685] New: In IPA's execute stage create_version_clone_with_body fails with non-vNULL callers

2019-11-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92685

Bug ID: 92685
   Summary: In IPA's execute stage create_version_clone_with_body
fails with non-vNULL callers
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erick.oc...@theobroma-systems.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47367
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47367&action=edit
Hello World IPA pass with call to create_version_clone_with_body

Hello,

I am developing a simple ipa pass that versions a single call site to method
`bar`.

I am using `create_version_clone_with_body` instead of `create_version_clone`
because I want to modify `foo`'s body.

In my test case I have three functions
* main
* foo
* bar

In my simple ipa pass, I have implemented the execute stage to call
`create_version_clone_with_body` for method bar.
I am compiling my test with -flto-partition=none, which if I understand
correctly means the execution stage should have access to the method bodies.

You can apply the patch to master (commit id:
17a2c588c29f089d3c2a36df47175bbdf376e399)

I have also attached my test case.
After building gcc with my patch, trigger the bug by modifying the Makefile to
point to the patched version of gcc and running `make`.

[Bug ipa/92685] In IPA's execute stage create_version_clone_with_body fails with non-vNULL callers

2019-11-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92685

--- Comment #1 from Erick Ochoa  ---
Created attachment 47368
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47368&action=edit
Test Cases

[Bug ipa/92685] In IPA's execute stage create_version_clone_with_body fails with non-vNULL callers

2019-11-26 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92685

Erick Ochoa  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #47367|0                           |1
        is obsolete|                            |

--- Comment #2 from Erick Ochoa  ---
Created attachment 47369
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47369&action=edit
Hello World IPA pass (corrected)

[Bug ipa/92685] In IPA's execute stage create_version_clone_with_body fails with non-vNULL callers

2019-11-27 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92685

--- Comment #3 from Erick Ochoa  ---
Created attachment 47373
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47373&action=edit
Possible solution

I have attached a possible solution, although I am not sure whether it breaks
the design for IPA passes. If someone more familiar with the area could take a
look, please let me know.

[Bug ipa/92538] Proposal for IPA init() constant propagation

2019-12-09 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538

Erick Ochoa  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #47278|0                           |1
        is obsolete|                            |
  Attachment #47355|0                           |1
        is obsolete|                            |

--- Comment #5 from Erick Ochoa  ---
Created attachment 47455
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47455&action=edit
Patch version 3

Hello,

I am updating the patch version.
Currently, this patch can be applied against this commit on master branch.

commit 3cce71b23f6ed221b4335b600e718d79903cc83d
Date:   Wed Dec 4 20:04:10 2019 +

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@278975 138bc75d-0d04-0410-961f-82ee72b054a4

This patch builds on the previous version and fixes bugs.
Currently, all SPEC CPU 2017 benchmarks can be compiled with this patch and run
successfully.
Furthermore, it detects several constants that can be propagated in 12 of the
SPEC CPU benchmarks.

[Bug ipa/92917] New: ipa-cp segfaults in print_all_lattices

2019-12-11 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92917

Bug ID: 92917
   Summary: ipa-cp segfaults in print_all_lattices
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erick.oc...@theobroma-systems.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47477
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47477&action=edit
Possible fix

Stacktrace
===

lto1: internal compiler error: Segmentation fault
0xbcff9b crash_signal
../../gcc/gcc/toplev.c:328
0x1533c14 print_all_lattices
../../gcc/gcc/ipa-cp.c:547
0x1536edf ipcp_propagate_stage
../../gcc/gcc/ipa-cp.c:3876
0x153c1c7 ipcp_driver
../../gcc/gcc/ipa-cp.c:5746
0x153c1c7 execute
../../gcc/gcc/ipa-cp.c:5839
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
lto-wrapper: fatal error: ../gcc2/gcc-inst/bin/gcc returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:2: all] Error 1

Code to trigger
===

int
main(int argc, char**argv)
{
  return 0;
}

How to compile
===

/path/to/gcc a.c -flto -fdump-ipa-all

How gcc was compiled
===

$ ../gcc2/gcc-inst/bin/gcc -v
Using built-in specs.
COLLECT_GCC=../gcc2/gcc-inst/bin/gcc
COLLECT_LTO_WRAPPER=/home/eochoa/code/gcc2/gcc-inst/libexec/gcc/aarch64-unknown-linux-gnu/10.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc/configure --disable-bootstrap --disable-libsanitizer
--enable-__cxa_atexit --enable-shared --disable-libsanitizer
--enable-languages=c,c++,fortran --enable-lto --enable-gold
--enable-linker-build-id --with-cpu-emag
--prefix=/home/eochoa/code/gcc2/gcc-inst : (reconfigured) ../gcc/configure
--disable-bootstrap --disable-libsanitizer --enable-__cxa_atexit
--enable-shared --disable-libsanitizer --enable-languages=c,c++,fortran
--enable-lto --enable-gold --enable-linker-build-id --with-cpu-emag
--prefix=/home/eochoa/code/gcc2/gcc-inst
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.0 20191211 (experimental) (GCC)

Which git commit id is HEAD?
===

300dae5c80ddda7ab4fedffaa0bbf53887232a53
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279250
138bc75d-0d04-0410-961f-82ee72b054a4

I have attached a possible (lazy) fix.

[Bug lto/93493] New: -flto -flto-partition=none -fipa-pass -fdump-ipa-pass dump_file still in /tmp

2020-01-29 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93493

Bug ID: 93493
   Summary: -flto -flto-partition=none -fipa-pass -fdump-ipa-pass
dump_file still in /tmp
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: erick.oc...@theobroma-systems.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47728
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47728&action=edit
Hello world IPA pass

Hello,

I am working on an optimization and using -fdump-ipa-$pass for debugging
purposes.
I was initially compiling some test programs in the following way:

```
/path/to/gcc -flto -fipa-hello-world -fdump-ipa-hello-world a.c
```

A `cc*.074i.hello-world` file with the debug information was always available
in my current working directory after compiling.

However, recently, I've needed to change my compilation scripts by adding
`-flto-partition=none`. Compiling in the following way:

```
/path/to/gcc -flto -flto-partition=none -fipa-hello-world
-fdump-ipa-hello-world a.c
```

Produces no `cc*.074i.hello-world` file in my current working directory.
However, I do see it in the /tmp/ folder.
I attach a "hello world pass" patch to replicate this issue via the commands
given in this ticket.
I have rebased the patch against the following commit:

commit f214ffb336d582a66149068a2a96b7fcf395b5de (HEAD -> gcc-bug-0,
upstream/master, gcc-master)
Author: Jonathan Wakely 
Date:   Wed Jan 29 13:56:49 2020 +

[Bug lto/93493] -flto -flto-partition=none -fipa-pass -fdump-ipa-pass dump_file still in /tmp

2020-02-04 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93493

--- Comment #2 from Erick Ochoa  ---
Hi Martin,

Thanks for the quick reply. I tried both suggestions and the dump file is still
kept in "/tmp".

I also tried a variation where I used -fipa-cp and -fdump-ipa-cp and the same
thing happens. I provide this alternative as a quicker and easier way to
test (as opposed to applying the patch).

[Bug lto/93493] -flto -flto-partition=none -fipa-pass -fdump-ipa-pass dump_file still in /tmp

2020-02-04 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93493

--- Comment #4 from Erick Ochoa  ---
[eochoa@osprey1 temp2]$ $HOME/code/gcc-inst/bin/gcc --verbose
Using built-in specs.
COLLECT_GCC=/home/eochoa/code/gcc-inst/bin/gcc
COLLECT_LTO_WRAPPER=/home/eochoa/code/gcc-inst/libexec/gcc/aarch64-unknown-linux-gnu/10.0.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /home/eochoa/code/gcc/configure --disable-bootstrap
--disable-libsanitizer --enable-__cxa_atexit --enable-shared
--disable-libsanitizer --enable-languages=c,c++,fortran --enable-lto
--enable-gold --enable-linker-build-id --with-cpu-emag
--prefix=/home/eochoa/code/gcc-inst/
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200129 (experimental) (GCC)


[eochoa@osprey1 temp2]$ ls
a.c
[eochoa@osprey1 temp2]$ cat a.c
#include <stdio.h>

int
main()
{
puts("Hello world");
}
[eochoa@osprey1 temp2]$ ls /tmp/
systemd-private-c268b114a88a493992eec3d225c36103-chronyd.service-FLfthb 
tmux-1102
[eochoa@osprey1 temp2]$ $HOME/code/gcc-inst/bin/gcc -flto -flto-partition=none
-fipa-cp -fdump-ipa-cp a.c --save-temp
[eochoa@osprey1 temp2]$ ls
a.c  a.i  a.o  a.out  a.out.lto_wrapper_args  a.res  a.s
[eochoa@osprey1 temp2]$ ls /tmp/
a.o.074i.cp  ccKNW0Wo ccQYv26D   
systemd-private-c268b114a88a493992eec3d225c36103-chronyd.service-FLfthb 
cc3qdviq.ld  cckUeRrG.le  ccTmkj8o.lto.o  tmux-1102


The file I'm looking for is a.o.074i.cp in this case.

This is the gcc commit of the build:

commit f214ffb336d582a66149068a2a96b7fcf395b5de (HEAD -> gcc-master,
upstream/master)
Author: Jonathan Wakely 
Date:   Wed Jan 29 13:56:49 2020 +

[Bug lto/93493] -flto -flto-partition=none -fipa-pass -fdump-ipa-pass dump_file still in /tmp

2020-02-05 Thread erick.oc...@theobroma-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93493

--- Comment #6 from Erick Ochoa  ---
Thanks, that helps.

I'd just like to note that the behaviour I was describing involves the use of
-flto-partition=none and not just -flto. With just -flto I never needed to add
an explicit output file. With -flto-partition=none, if one doesn't specify the
output file, the dump file remains in the /tmp folder.