Hello,

I am working with the c6x processor from TI. It has a VLIW architecture with 32 registers, namely a0-a15 and b0-b15; b15 is used as the SP in the current port. I am facing a problem with the GCC scheduler.
Following is the C code I was compiling -

*******************************
int mult(int a,int b)
{
  int result=0,flag;
  if(b<0)
    flag=1;
  else
    flag=-1;
  for(;b;b+=flag)
    result += a;
  return result;
}

int main()
{
  return mult(5,4);
}
********************************

Following is part of the assembly generated by GCC. Code was compiled with O2.

*********************************
mult:
        stw     .D2T1   a15, *--b15     ;D1,D2 are functional units
                                        ;T1,T2 are transmission paths
||      mvk     0, b4                   ;|| implies that this instruction is executed
                                        ;in parallel with the previous instruction
        mv      b15, a15
        ldw     .D1T2   *+a15[3], b1
        ldw     .D1T1   *+a15[2], a3
        nop     3                       ;equivalent to 3 nops
        cmplt   b1, b4, b0
[ b0]   mvkl    L2, b4                  ;[] implies conditional execution. The
                                        ;instruction is executed if b0 is TRUE
[ b0]   mvkh    L2, b4
[ b0]   b       b4
        nop     5
[ b1]   mvkl    L5, b4
||      mvk     0, a4
[ b1]   mvkh    L5, b4
[ b1]   b       b4
        nop     5
;; problem - the below instruction should have been scheduled before
;; the branch instruction because it will not be executed if the branch is
;; taken
        mvk     -1, b3
L9:
        ldw     .D1T2   *+a15[1], b14
||      mv      a15, b15
        ldw     .D2T1   *b15++, a15
        add     4, b15, b15
        nop     2
        b       .S2     b14
        nop     5
L2:
        mvk     0, a4
||      mvk     1, b3
L5:
        add     b3, b1, b1
||      add     a3, a4, a4
[ b1]   mvkl    L5, b4
[ b1]   mvkh    L5, b4
[ b1]   b       b4
        nop     5
        b       .S2     L9
        nop     5
***************************************

Following is the debugging dump by the scheduler -

**************************************
;;   ======================================================
;;   -- basic block 1 from 17 to 89 -- after reload
;;   ======================================================

;;   --------------- forward dependences: ------------

;;   --- Region Dependences --- b 1 bb 0
;;      insn  code    bb   dep  prio  cost   reservation
;;      ----  ----    --   ---  ----  ----   -----------
;;       17     5     0     0     1     1   S1  :
;;       18     5     0     0     1     1   S2  :
;;       90    67     0     0     8     1   S2  : 89 91
;;       91    66     0     1     7     1   S2  : 89
;;       89   100     0     2     6     6   S2  :

;;              Ready list after queue_to_ready: 90 18 17
;;              Ready list after ready_sort: 18 17 90
;;      Ready list (t = 0): 18 17 90
;;        0--> 90 (b1) b4=b4+low(L25)               :S2
;;              dependences resolved: insn 91 into queue with cost=1
;;              Ready-->Q: insn 91: queued for 1 cycles.
;;      Ready list (t = 0): 18 17
;;        0--> 17 a4=0x0                            :S1
;;      Ready list (t = 0): 18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t = 0):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Q-->Ready: insn 91: moving to ready without stalls
;;              Ready list after queue_to_ready: 91 18
;;              Ready list after ready_sort: 18 91
;;      Ready list (t = 1): 18 91
;;        1--> 91 (b1) {b4=high(L25);use b4;}       :S2
;;              dependences resolved: insn 89 into queue with cost=1
;;              Ready-->Q: insn 89: queued for 1 cycles.
;;      Ready list (t = 1): 18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t = 1):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Q-->Ready: insn 89: moving to ready without stalls
;;              Ready list after queue_to_ready: 89 18
;;              Ready list after ready_sort: 18 89
;;      Ready list (t = 2): 18 89
;;        2--> 89 (b1) pc=b4                        :S2
;;      Ready list (t = 2): 18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t = 2):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Ready list after queue_to_ready: 18
;;              Ready list after ready_sort: 18
;;      Ready list (t = 3): 18
;;        3--> 18 b3=0xffffffff                     :S2
;;      Ready list (t = 3):
;;              Second chance
;;      Ready list (final):
;;   total time = 3
;;   new head = 33
;;   new tail = 18
*******************************************

As can be seen in the assembly dump, one instruction is scheduled after the branch instruction. The branch is a conditionally executed branch instruction. This is incorrect because if the branch is executed then the instruction after that will not be executed.

Please help.

Thanks in advance.

Regards,
Kunal.
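
P.S. In case it is useful to confirm the problem at run time rather than by reading the assembly, below is a minimal sketch of a check. The helper names count_mismatches and run_checks are hypothetical (they are not part of the port), and the sketch assumes that assert.h and stdlib.h are available for the target. The expected value a * |b| follows from the loop in mult, which always steps b towards zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Function under test, copied from the test case above. */
int mult(int a, int b)
{
    int result = 0, flag;
    if (b < 0)
        flag = 1;
    else
        flag = -1;
    for (; b; b += flag)
        result += a;
    return result;
}

/* Hypothetical helper: counts the inputs in a small range for which
   mult() disagrees with a * |b|.  The loop in mult() runs |b| times,
   so a correctly compiled mult() returns a * |b|. */
int count_mismatches(void)
{
    int a, b, bad = 0;
    for (a = -3; a <= 3; a++)
        for (b = -4; b <= 4; b++)
            if (mult(a, b) != a * abs(b))
                bad++;
    return bad;
}

/* Hypothetical entry point for the check; call it from main().
   It aborts through assert() on the first failing condition. */
void run_checks(void)
{
    assert(mult(5, 4) == 20);          /* the call from the test case */
    assert(count_mismatches() == 0);
}
```

If flag is left unset on the path that takes the branch, the loop in mult may never reach zero, so a hang in run_checks would also point to the same problem. At O2 the compiler may inline or fold the calls, so it may be necessary to keep mult in a separate file for the check to exercise the generated code shown above.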