Hello, I am a student working on a project that involves building a MIPS processor. We have decided NOT to implement logic to handle branch delay slots, and instead to build a compiler that emits code without these delay slots. The compiler tool chain versions are:
    binutils:      2.15
    gcc:           3.4.5
    glibc:         2.3.6
    linux-headers: 2.6.12

So far, I have been mostly successful in removing these delay slots (nops still exist in some situations). However, the branch offsets are still relative to the branch delay slot. This is not a problem in most cases, because we just add 4 to the offset in the pipeline rather than adding the offset to the PC of the delay-slot instruction.

There is one case where this solution does not work. Here is the segment of code:

    400184: 04110001  bal   40018c <__start+0xc>
    400188: 00000000  nop
    40018c: 3c1c0fc1  lui   gp,0xfc1
    400190: 279cbc64  addiu gp,gp,-17308
    400194: 039fe021  addu  gp,gp,ra

The bal instruction here is expected to put the PC of the instruction after the delay slot (0x40018c) in $ra. This value is then added to the $gp register in the last instruction of the sequence. Because of this, putting the PC of the instruction directly after the branch in $ra results in incorrect execution.

Looking at another segment of code from the same program:

    400204: 0320f809  jalr  t9
    400208: 8fbc0010  lw    gp,16(sp)
    40020c: 8fbf0018  lw    ra,24(sp)

Notice here that if I put the PC of the instruction after the delay slot (0x40020c) in $ra, then I will skip the load that writes to $gp. I realize that in this case it is possible that 0x40020c is the expected value, but I have not noticed any repercussions from simply using 0x400208 instead.

I would like to make the changes necessary so that the compiler expects the PC of the instruction directly after the branch to be put in the $ra register. I cannot locate where it is specified that PC+8 of an "and link" instruction is to be put in $ra, so that I can change it.

-Brandon
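
P.S. To make the arithmetic in the bal case concrete, here is a small Python sketch of the first sequence. It is only an illustration, not part of our toolchain or pipeline: the function names (sign_extend16, branch_target, gp_after_sequence) are made up for this example, and the only real inputs are the addresses and immediates from the disassembly above. It shows that the branch target comes out right with our "add 4" trick, while $gp ends up 4 bytes short when $ra holds PC+4 instead of PC+8.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed immediate."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc, offset_field):
    """Target of a PC-relative branch: delay-slot PC plus (offset << 2)."""
    return (branch_pc + 4 + (sign_extend16(offset_field) << 2)) & MASK32


def gp_after_sequence(ra):
    """Replay lui/addiu/addu from the listing for a given $ra value."""
    gp = (0x0FC1 << 16) & MASK32                # lui   gp,0xfc1
    gp = (gp + sign_extend16(0xBC64)) & MASK32  # addiu gp,gp,-17308
    gp = (gp + ra) & MASK32                     # addu  gp,gp,ra
    return gp


BAL_PC = 0x400184

# What the compiled code assumes: $ra = PC + 8 (after the delay slot).
expected_gp = gp_after_sequence(BAL_PC + 8)

# What a pipeline without delay slots would naturally write: $ra = PC + 4.
actual_gp = gp_after_sequence(BAL_PC + 4)

print(hex(branch_target(BAL_PC, 0x0001)))  # 0x40018c
print(hex(expected_gp))                    # 0x1000bdf0
print(hex(actual_gp))                      # 0x1000bdec
```

The 4-byte difference in $gp is exactly the size of the removed delay slot, which is why only the link value, and not the branch target, is the problem here.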