[Bug tree-optimization/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

Xavier Roche changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |6.1.0, 6.2.0
      Known to fail|                            |5.3.0, 5.4.0

--- Comment #15 from Xavier Roche ---
Seems to be working fine starting from 6.1.0 (tested up to 7.0.0 20161113)
[Bug c/67283] New: GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

            Bug ID: 67283
           Summary: GCC regression over inlining of returned structures
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: roche at httrack dot com
  Target Milestone: ---

Created attachment 36219
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36219&action=edit
Sample test case (gcc -S -O3 -W -Wall)

An optimization regression appears to exist when dealing with structures
returned by inlined functions. This was working totally fine with GCC 4.4.7
(see below).

A typical example is:

struct foo {
  int flags;
  /* Let it be enough NOT to be packed in registers */
  void *opaque[2];
};

static __inline__ struct foo add_flag(struct foo foo, int flag) {
  foo.flags |= flag;
  return foo;
}

Calls to "add_flag" are inlined, but the stack usage increases with recent GCC
versions (the code should be almost identical, except for the flag being set
in place on the stack).

Tested the following GCC versions:
(grep "addq.*%rsp" to get stack usage for each function)
; tested architecture: x86-64

GCC 4.4.7: OK
        addq    $72, %rsp       # demo_1
        addq    $72, %rsp
        addq    $72, %rsp
        addq    $72, %rsp
        addq    $72, %rsp       # demo_5

GCC 4.5.3 to 4.6.4: NOK (UNTESTED between 4.4.8 and 4.5.2)
        addq    $72, %rsp       # demo_1
        addq    $136, %rsp
        addq    $168, %rsp
        addq    $200, %rsp
        addq    $232, %rsp      # demo_5

GCC 4.7.3 to 5.2.0: NOK (UNTESTED between 4.6.5 and 4.7.2)
        addq    $72, %rsp       # demo_1
        addq    $136, %rsp
        addq    $200, %rsp
        addq    $264, %rsp
        addq    $328, %rsp      # demo_5

Therefore, the test case was fine in GCC 4.4.7, first degraded between 4.4.8
and 4.5.3, and then degraded again between 4.6.5 and 4.7.3.

Note: code produced with http://gcc.godbolt.org/ with -O3 -W -Wall flags
(same results with -O1).
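For readers without access to the attachment, here is a minimal sketch of what
such a test file could look like. It is not the attached test case: the bodies
of demo_1 and demo_3, the flag values, and the stub callee that records the
flags in last_flags are all assumptions made for illustration. Only struct foo
and add_flag come from the report.

```c
#include <stddef.h>

struct foo {
  int flags;
  /* Let it be enough NOT to be packed in registers */
  void *opaque[2];
};

static __inline__ struct foo add_flag(struct foo foo, int flag) {
  foo.flags |= flag;
  return foo;
}

/* Hypothetical stand-in for an external callee: it records the flags it
   received so the struct argument cannot be optimized away entirely. */
static int last_flags = -1;
void some_unknown_function(struct foo foo) { last_flags = foo.flags; }

/* One inlined call (assumed shape of the first demo function). */
void demo_1(void) {
  struct foo foo = { 0, { NULL, NULL } };
  some_unknown_function(add_flag(foo, 1));
}

/* Three chained inlined calls; ideally the frame is no larger than demo_1's. */
void demo_3(void) {
  struct foo foo = { 0, { NULL, NULL } };
  some_unknown_function(add_flag(add_flag(add_flag(foo, 1), 2), 4));
}
```

Compiling a file like this with gcc -S -O3 -W -Wall and running
grep "addq.*%rsp" on the output gives one stack adjustment per function, which
is how the figures in the report were collected. In a real reproduction the
callee would live in another translation unit so that it cannot be inlined.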
[Bug c/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #1 from Xavier Roche ---
Created attachment 36220
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36220&action=edit
Produced assembly code with GCC 4.4.7 on x86_64
[Bug c/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #2 from Xavier Roche ---
Created attachment 36221
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36221&action=edit
Produced assembly code with GCC 4.6.4 on x86_64
[Bug c/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #3 from Xavier Roche ---
Created attachment 36222
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36222&action=edit
Produced assembly code with GCC 5.2.0 on x86_64
[Bug tree-optimization/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #6 from Xavier Roche ---
> the problem is that the structure contains an array and total scalarization is
> not implemented for them

I confirm that without any array the inlining is fine.

Side note: the same problem appears with a union (including a union with only
one void* member).
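A sketch of the union variant mentioned in the side note, under the assumption
that the union replaces the array as a member of the structure (the comment
does not say exactly where the union sits). The names foo_u, add_flag_u and
demo_union are hypothetical and are not taken from any attachment.

```c
#include <stddef.h>

/* Assumed variant: the array member is replaced by a union holding a single
   void* member. */
struct foo_u {
  int flags;
  union { void *ptr; } opaque;
};

static __inline__ struct foo_u add_flag_u(struct foo_u foo, int flag) {
  foo.flags |= flag;
  return foo;
}

/* Two chained inlined calls; returns the resulting flags. */
int demo_union(void) {
  struct foo_u foo = { 0, { NULL } };
  foo = add_flag_u(add_flag_u(foo, 1), 2);
  return foo.flags;
}
```

To compare stack usage as in the original report, the result would have to be
passed by value to a function the compiler cannot see into; this sketch only
shows the shape of the type.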
[Bug tree-optimization/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #9 from Xavier Roche ---
Created attachment 36260
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36260&action=edit
Second test case (might be useful for unit testing)
[Bug tree-optimization/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #10 from Xavier Roche ---
The attached "Second test case" should produce exactly the same code
(byte-for-byte) for the two functions demo_1 and demo_2. This check does not
rely on the stack size, which might change.

With GCC 4.4.7:

demo_2():
        subq    $72, %rsp
        movl    $0, 32(%rsp)
        movq    32(%rsp), %rax
        movq    $0, 48(%rsp)
        movq    $0, 40(%rsp)
        movq    $0, 8(%rsp)
        movq    $0, 16(%rsp)
        movq    %rax, (%rsp)
        call    some_unknown_function(foo)
        addq    $72, %rsp
        ret
demo_1():
        subq    $72, %rsp
        movl    $0, 32(%rsp)
        movq    32(%rsp), %rax
        movq    $0, 48(%rsp)
        movq    $0, 40(%rsp)
        movq    $0, 8(%rsp)
        movq    $0, 16(%rsp)
        movq    %rax, (%rsp)
        call    some_unknown_function(foo)
        addq    $72, %rsp
        ret
[Bug tree-optimization/67283] GCC regression over inlining of returned structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283

--- Comment #11 from Xavier Roche ---
PS: Shall I create a twin ticket for the structure case?