Hi Kito.
I fixed almost all of the rv32be testcase failures simply by taking endianness into account on the first line of riscv_subword, which is used for long long handling on 32-bit.

Now, I only have one failing testcase (which does not also fail on little endian), and it's a doozy. The test in question is gcc.c-torture/compile/pr35318.c. The test in its entirety is

  double x = 4, y;
  __asm__ volatile ("# %0,%1,%2,%3" : "=r,r" (x), "=r,r" (y) : "%0,0" (x), "m,r" (8));

(the asm comment in the first argument was added by me to track what the actual assignments were.)

When compiled with -mbig-endian, this results in an ICE:

---8<---
/tmp/pr35318.c: In function 'foo':
/tmp/pr35318.c:9:1: error: unrecognizable insn:
    9 | }
      | ^
(insn 12 24 25 2 (parallel [
            (set (reg:DF 11 a1 [orig:74 x ] [74])
                (asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 0 [
                        (reg:SI 12 a2 [orig:74 x+4 ] [74])
                        (mem/c:DF (plus:SI (reg/f:SI 8 s0)
                                (const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
                    ]
                     [
                        (asm_input:DF ("%0,0") /tmp/pr35318.c:8)
                        (asm_input:SI ("m,r") /tmp/pr35318.c:8)
                    ]
                     [] /tmp/pr35318.c:8))
            (set (reg:DF 15 a5 [orig:75 y ] [75])
                (asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 1 [
                        (reg:SI 12 a2 [orig:74 x+4 ] [74])
                        (mem/c:DF (plus:SI (reg/f:SI 8 s0)
                                (const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
                    ]
                     [
                        (asm_input:DF ("%0,0") /tmp/pr35318.c:8)
                        (asm_input:SI ("m,r") /tmp/pr35318.c:8)
                    ]
                     [] /tmp/pr35318.c:8))
        ]) "/tmp/pr35318.c":8:3 -1
     (nil))
during RTL pass: reload
dump file: /tmp/pr35318b.txt
/tmp/pr35318.c:9:1: internal compiler error: in extract_constrain_insn, at recog.c:2670
0x101bf90b _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../../riscv-gcc/gcc/rtl-error.c:108
0x101bf953 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../../riscv-gcc/gcc/rtl-error.c:116
0x10a1193f extract_constrain_insn(rtx_insn*)
        ../../../riscv-gcc/gcc/recog.c:2670
0x1088fc77 check_rtl
        ../../../riscv-gcc/gcc/lra.c:2087
0x108971c7 lra(_IO_FILE*)
        ../../../riscv-gcc/gcc/lra.c:2505
0x1082fcb7 do_reload
        ../../../riscv-gcc/gcc/ira.c:5827
0x1082fcb7 execute
        ../../../riscv-gcc/gcc/ira.c:6013
---8<---

This insn looks extremely similar to one that's in the dump-rtl for little endian:

---8<---
(insn 12 20 21 2 (parallel [
            (set (reg:DF 13 a3 [orig:74 x ] [74])
                (asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 0 [
                        (reg:SI 13 a3 [orig:74 x ] [74])
                        (mem/c:DF (plus:SI (reg/f:SI 8 s0)
                                (const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
                    ]
                     [
                        (asm_input:DF ("%0,0") /tmp/pr35318.c:8)
                        (asm_input:SI ("m,r") /tmp/pr35318.c:8)
                    ]
                     [] /tmp/pr35318.c:8))
            (set (reg:DF 15 a5 [orig:75 y ] [75])
                (asm_operands/v:DF ("# %0,%1,%2,%3") ("=r,r") 1 [
                        (reg:SI 13 a3 [orig:74 x ] [74])
                        (mem/c:DF (plus:SI (reg/f:SI 8 s0)
                                (const_int -40 [0xffffffffffffffd8])) [2 %sfp+-24 S8 A64])
                    ]
                     [
                        (asm_input:DF ("%0,0") /tmp/pr35318.c:8)
                        (asm_input:SI ("m,r") /tmp/pr35318.c:8)
                    ]
                     [] /tmp/pr35318.c:8))
        ]) "/tmp/pr35318.c":8:3 -1
     (nil))
---8<---

So I don't know what's "unrecognizable" about it...

I also don't understand the code that is actually generated in the little-endian case. The way I read the asm statement, %2 should be a register (same as %0) containing the (floating point?) value "4", and %3 should be a memory location (assuming the first alternative is chosen) containing the value "8". However, looking at the generated assembler code, it seems that %2 is a register (a3) which contains the integer value "8" and %3 is a memory location (-40(s0)) which contains the floating point value "4.0". This seems mixed up.
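To make my reading of the operand numbering concrete, here is a much reduced sketch (not the testcase itself). It assumes a GCC-compatible compiler; the function names are made up for illustration, and it deliberately leaves out the "%" modifier and the second constraint alternative that pr35318.c uses. Outputs are numbered first, so %0 is the output and %1 is the input, and the matching constraint "0" asks for the input to live in the same register as operand 0:

```c
#include <assert.h>

/* Hypothetical reduced example: one output (%0) and one input (%1).
   The constraint "0" ties the input to the register chosen for
   operand 0.  The template is empty, so nothing modifies that
   register and the output simply picks up the input value. */
int tied_input_demo(void)
{
    int in = 4;
    int out;

    __asm__ volatile ("" : "=r" (out) : "0" (in));
    return out;
}

/* Small self-check: the value passed in via the tied register
   comes back out unchanged. */
void tied_input_selfcheck(void)
{
    assert(tied_input_demo() == 4);
}
```

With a single alternative and no "%", the value in the tied register is the one I would expect, which is why the little-endian output below surprises me.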

---8<---
foo:
        addi    sp,sp,-48
        sw      s0,44(sp)
        addi    s0,sp,48
        lui     a5,%hi(.LC0)
        fld     fa5,%lo(.LC0)(a5)
        fsd     fa5,-24(s0)
        fld     fa5,-24(s0)
        li      a5,8
        fsd     fa5,-40(s0)
        mv      a3,a5
#APP
# 8 "/tmp/pr35318.c" 1
        # a3,a5,a3,-40(s0)
# 0 "" 2
#NO_APP
        sw      a3,-40(s0)
        sw      a4,-36(s0)
        fld     fa5,-40(s0)
        fsd     fa5,-24(s0)
        sw      a5,-32(s0)
        sw      a6,-28(s0)
        nop
        lw      s0,44(sp)
        addi    sp,sp,48
        jr      ra
        .size   foo, .-foo
        .section        .rodata
        .align  3
.LC0:
        # little endian double "4.0"
        .word   0
        .word   1074790400
---8<---

Is this code correct, or is there some deeper issue at play here? (AFAIU the testcase only checks that the compiler doesn't ICE, not that the generated code is correct...)

If the code generated for LE is bad, I probably should not try to make BE generate the same thing. :-/

// Marcus
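
PS: For anyone following along, the idea behind the endianness fix mentioned at the top can be sketched in plain C roughly as below. This is only an illustration under my own assumptions, not the actual riscv_subword code: all names here (subword_byte_offset, store_u64, load_subword, WORD_BYTES) are invented for the example. The point is just that the byte offset of the high word of a two-word value flips with the byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { WORD_BYTES = 4 };   /* word size on a 32-bit target */

/* Byte offset of the requested half inside an 8-byte value.
   Little endian: low word at offset 0, high word at offset 4.
   Big endian: the other way around. */
unsigned subword_byte_offset(bool high_p, bool big_endian)
{
    return (high_p != big_endian) ? WORD_BYTES : 0;
}

/* Store a 64-bit value into buf using the requested byte order. */
void store_u64(unsigned char buf[8], uint64_t v, bool big_endian)
{
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Read one 32-bit half back, using the offset computed above. */
uint32_t load_subword(const unsigned char buf[8], bool high_p,
                      bool big_endian)
{
    assert(buf != NULL);
    const unsigned char *p = buf + subword_byte_offset(high_p, big_endian);
    uint32_t w = 0;

    for (int i = 0; i < WORD_BYTES; i++) {
        int shift = big_endian ? 8 * (WORD_BYTES - 1 - i) : 8 * i;
        w |= (uint32_t)p[i] << shift;
    }
    return w;
}
```

For the double 4.0 (bit pattern 0x4010000000000000), the high word 0x40100000 (decimal 1074790400, as in .LC0 above) sits at offset 4 on little endian and at offset 0 on big endian.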