https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88589
Bug ID: 88589
Summary: ICE + seemingly wrong codegen with m68k-elf
Product: gcc
Version: 8.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: lionel_debroux at yahoo dot fr
Target Milestone: ---

Created attachment 45283
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45283&action=edit
The original source file

I looked for other bugs for the m68k-elf target, there are several ICEs but none of them looks the same. Also, no change when passing -fno-strict-aliasing -fwrapv to gcc.

I'm in the process of making https://github.com/debrouxl/tilibs/blob/master/libticalcs/trunk/src/romdump_9x/romdump.c build with standard m68k toolchains, removing a hard dependency on the GCC4TI headers and custom toolchain. The original plan was to make a pure ASM version of that program - it's small, after all - but I noticed that Debian packages a m68k ELF toolchain (in sid, binutils 2.31.1 and GCC 8.2.0), so I decided to give those a try at first.

In the process of making the code closer to what I want the assembly version to be, I hit two snags:

1) an ICE when adding a normally innocuous line of code, "*cmd = CMD_NONE;", in RecvPacket().

    during RTL pass: ira
    romdump_testcase_ice.c:603:1: internal compiler error: in form_sum, at reload.c:5331

The command-line invocation was

    gcc -c romdump_testcase_ice.c -o romdump_testcase_ice -Os -s -fdata-sections -Wall -W -Wwrite-strings -mpcrel -Wa,-l -fomit-frame-pointer -m68000 -mshort -ffreestanding -fcall-used-d0 -fcall-used-d1 -fcall-used-d2 -fcall-used-a0 -fcall-used-a1 -fcall-saved-d3 -fcall-saved-d4 -fcall-saved-d5 -fcall-saved-d6 -fcall-saved-d7 -fno-optimize-sibling-calls --verbose -save-temps -fverbose-asm

2) when that "*cmd = CMD_NONE;" line is commented out, there's a code generation issue on line 543, between the two nops.
* as it is, the compiler generates a single 16-bit read + rotate + write instead of the expected 32-bit, or possibly 2 x 16-bit, operations:

    move.w -4098(%fp),%d0   | MEM[(uint8_t[4102] *)&buf + 4B], MEM[(uint8_t[4102] *)&buf + 4B]
    ror.w #8,%d0            |, MEM[(uint8_t[4102] *)&buf + 4B]
    move.w %d0,-7982(%fp)   | MEM[(uint8_t[4102] *)&buf + 4B], %sfp

* when I'm using two 16-bit values and OR'ing them together, the compiler also generates a single 16-bit read + rotate + write;
* when I'm using the same 16-bit values and adding them together, the compiler generates more or less the code I expect.
  It's not very optimized, but I could just rewrite that part using inline ASM with C operands.

AFAICT, I'm not invoking UB by left shifts of counts larger than the width of the variable, because I use casts to uint32_t before shifting.

The command-line invocation is the same as 1).