Hello,

I'm considering attempting a 65816 target, but decided it would be a good idea to start with something simple in order to learn how GCC generates code. So I created a minimal machine description with just two instructions (plus the mandatory nop/jump/etc):

    (define_mode_iterator INT [QI HI SI])

    (define_insn "mov<mode>"
      [(set (match_operand:INT 0 "nonimmediate_operand" "=mr")
            (match_operand:INT 1 "general_operand" "mri"))
      ]
      ""
      "move\t%0,%1")

    (define_insn "add<mode>3"
      [(set (match_operand:INT 0 "nonimmediate_operand" "=mr")
            (plus:INT (match_operand:INT 1 "general_operand" "mri")
                      (match_operand:INT 2 "general_operand" "mri")))
      ]
      ""
      "add\t%0,%1,%2")

As you can see, all operands may be memory or registers.

I then created some simple tests and had a look at the generated code:

C:
    int data[4];
    data[0] = data[1] + data[2];

ASM:
    move r0,#data           ;# 6 movsi
    add r1,r0,#4            ;# 7 addsi3
    add r0,r0,#8            ;# 8 addsi3
    add data,(r1),(r0)      ;# 9 addsi3

C:
    int data[4];
    data[1] = data[2] + data[3];

ASM:
    add data+4,data+8,data+12   ;# 9 addsi3

This is the point where I got really confused. Why does GCC split the add and generate 4 opcodes in the first example? Changing the optimization level doesn't make a difference. I've tried setting the indirect register address cost much higher than symbol_ref/label_ref/const in an attempt to make GCC avoid those indirect loads, but it didn't help. I've tried it with GCC 4.8.3 and 4.9.1; both produce similar, but not identical, results.

The result is equally strange for a simple move:

C:
    data[1] = data[0];

ASM:
    move data+4,data        ;# 8 movsi

C:
    data[0] = data[1];

ASM:
    add r0,#data,#4         ;# 7 addsi3
    move data,(r0)          ;# 8 movsi

What's so special about the first entry in an array that causes this? I have a feeling I'm missing something obvious here :-)

Any ideas?

Best regards,
Mathias Roslund