Hello,
I am working with the C6x processor from TI, which has a VLIW architecture.
It has 32 registers, named a0-a15 and b0-b15; b15 is used as the SP
in the current port.
I am facing a problem with the GCC instruction scheduler.

Following is the C code I was compiling:

*******************************
int mult(int a,int b) {
 int result=0,flag;

 if(b<0)
   flag=1;
 else
   flag=-1;
 for(;b;b+=flag)
   result += a;
 return result;
}

int main() {
 return mult(5,4);
}
********************************
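In case it helps anyone reproduce this, here is a self-checking variant of the test case. It is only a sketch: check_mult and its table of inputs are hypothetical names added for illustration and are not part of the port. Note that mult as written computes a * |b|, so that is the value each result is compared against.

```c
#include <stdlib.h>   /* abs, size_t */

/* Same routine as in the test case above. */
int mult(int a, int b) {
    int result = 0, flag;

    if (b < 0)
        flag = 1;
    else
        flag = -1;
    for (; b; b += flag)
        result += a;
    return result;
}

/* Hypothetical helper: returns the number of inputs for which
   mult(a, b) differs from a * |b|. Zero means every case passed. */
int check_mult(void) {
    static const int cases[][2] = { {5, 4}, {5, -4}, {5, 0}, {-3, 7} };
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int a = cases[i][0];
        int b = cases[i][1];
        if (mult(a, b) != a * abs(b))
            failures++;
    }
    return failures;
}
```

Returning check_mult() from main gives a non-zero exit status when a case fails. At -O2 the compiler may inline or constant-fold the calls, so keeping mult in a separate translation unit is probably needed to exercise the generated loop.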


Following is part of the assembly generated by GCC. The code was compiled with -O2.


*********************************
mult:
       stw     .D2T1   a15,    *--b15          ; D1, D2 are functional units; T1, T2 are transmission paths
||      mvk             0,      b4              ; || means this instruction executes in parallel with the previous one
       mv              b15,    a15
       ldw     .D1T2   *+a15[3],       b1
       ldw     .D1T1   *+a15[2],       a3
       nop             3                       ; equivalent to 3 nops
       cmplt           b1,     b4,     b0
       [ b0] mvkl              L2,     b4      ; [] means conditional execution: the instruction executes only if b0 is TRUE
       [ b0] mvkh              L2,     b4
       [ b0] b         b4
       nop             5
       [ b1] mvkl              L5,     b4
||      mvk             0,      a4
       [ b1] mvkh              L5,     b4
       [ b1] b         b4
       nop             5
            ;; problem - the instruction below should have been scheduled before
            ;; the branch instruction, because it will not be executed if the
            ;; branch is taken
       mvk             -1,     b3
L9:
       ldw     .D1T2   *+a15[1],       b14
||      mv              a15,    b15
       ldw     .D2T1   *b15++, a15
       add             4,      b15,    b15
       nop             2
       b       .S2     b14
       nop             5
L2:
       mvk             0,      a4
||      mvk             1,      b3
L5:
       add             b3,     b1,     b1
||      add             a3,     a4,     a4
       [ b1] mvkl              L5,     b4
       [ b1] mvkh              L5,     b4
       [ b1] b         b4
       nop             5
       b       .S2     L9
       nop             5
***************************************


Following is the debugging dump from the scheduler:

**************************************
;;   ======================================================
;;   -- basic block 1 from 17 to 89 -- after reload
;;   ======================================================

;;   --------------- forward dependences: ------------

;;   --- Region Dependences --- b 1 bb 0
;;      insn  code    bb   dep  prio  cost   reservation
;;      ----  ----    --   ---  ----  ----   -----------
;;       17     5     0     0     1     1   S1  :
;;       18     5     0     0     1     1   S2  :
;;       90    67     0     0     8     1   S2  : 89 91
;;       91    66     0     1     7     1   S2  : 89
;;       89   100     0     2     6     6   S2  :

;;              Ready list after queue_to_ready:    90  18  17
;;              Ready list after ready_sort:    18  17  90
;;      Ready list (t =  0):    18  17  90
;;        0--> 90   (b1) b4=b4+low(L25)                :S2
;;              dependences resolved: insn 91 into queue with cost=1
;;              Ready-->Q: insn 91: queued for 1 cycles.
;;      Ready list (t =  0):    18  17
;;        0--> 17   a4=0x0                             :S1
;;      Ready list (t =  0):    18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t =  0):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Q-->Ready: insn 91: moving to ready without stalls
;;              Ready list after queue_to_ready:    91  18
;;              Ready list after ready_sort:    18  91
;;      Ready list (t =  1):    18  91
;;        1--> 91   (b1) {b4=high(L25);use b4;}        :S2
;;              dependences resolved: insn 89 into queue with cost=1
;;              Ready-->Q: insn 89: queued for 1 cycles.
;;      Ready list (t =  1):    18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t =  1):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Q-->Ready: insn 89: moving to ready without stalls
;;              Ready list after queue_to_ready:    89  18
;;              Ready list after ready_sort:    18  89
;;      Ready list (t =  2):    18  89
;;        2--> 89   (b1) pc=b4                         :S2
;;      Ready list (t =  2):    18
;;              Ready-->Q: insn 18: queued for 1 cycles.
;;      Ready list (t =  2):
;;              Second chance
;;              Q-->Ready: insn 18: moving to ready without stalls
;;              Ready list after queue_to_ready:    18
;;              Ready list after ready_sort:    18
;;      Ready list (t =  3):    18
;;        3--> 18   b3=0xffffffff                      :S2
;;      Ready list (t =  3):
;;              Second chance
;;      Ready list (final):
;;   total time = 3
;;   new head = 33
;;   new tail = 18
*******************************************

As can be seen in the assembly dump, one instruction (mvk -1, b3, which
is insn 18 in the scheduler dump) is scheduled after the branch
instruction (insn 89). The branch is conditionally executed. This
ordering is incorrect: if the branch is taken, the instruction after it
is never executed, so b3 is not set to -1 on that path.
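To spell out the effect, the two orderings can be modelled in plain C. This is only an illustrative sketch and not generated code: the function names and the B3_STALE placeholder (standing in for whatever b3 held before the block) are hypothetical.

```c
/* Placeholder for the unknown value b3 holds on entry to the block. */
enum { B3_STALE = 12345 };

/* Order as emitted: [b1] b b4 ; nop 5 ; mvk -1, b3
   Returns the value of b3 when control leaves the block. */
int b3_after_block_emitted(int b1) {
    int b3 = B3_STALE;

    if (b1 != 0)        /* [b1] b b4: branch taken, control goes to L5 */
        return b3;      /* mvk -1, b3 is never reached */
    b3 = -1;            /* fall-through path only */
    return b3;
}

/* Order I expected: mvk -1, b3 ; [b1] b b4 ; nop 5 */
int b3_after_block_expected(int b1) {
    int b3 = B3_STALE;

    b3 = -1;            /* set before the branch, so both paths see it */
    (void)b1;           /* the branch no longer affects b3 */
    return b3;
}
```

With the emitted order, any non-zero b1 reaches L5 with b3 still holding its old value, so the loop adds the wrong step to b1.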
Please help.
Thanks in advance.
Regards,
Kunal.
