On 27/11/2012, at 4:34 AM, Greg McGary wrote:

> On 11/25/12 23:33, Maxim Kuvyrkov wrote:
>> You essentially need a fix-up pass just before the end of compilation
>> (machine-dependent reorg, if memory serves me right) to space instructions
>> consuming values from CPRs from the CALL_INSNs that set those CPRs. I.e.,
>> for the 99% of compilation you don't care about this restriction, it's only
>> the very last VLIW bundling and delay slot passes that need to know about it.
>>
>> You, probably, want to make the 2nd scheduler pass run as machine-dependent
>> reorg (as ia64 does) and enable an additional constraint (through scheduling
>> bypass) for the scheduler DFA to space CALL_INSNs from their consumers for
>> at least 2 cycles. One challenge here is that the scheduler operates on
>> basic blocks, and it is difficult to track dependencies across basic block
>> boundaries. To work around the basic-block scope of the scheduler you could
>> emit dummy instructions at the beginning of basic blocks that have
>> predecessors that end with CALL_INSNs. These dummy instructions would set
>> the appropriate registers (probably just assign the register to itself), and
>> you will have a bypass (see define_bypass) between these dummy instructions
>> and consumers to guarantee the 2-cycle delay.
>
> Thanks for the advice. We're already on the same page--I have most of what
> you recommend: I only schedule once from machine_dependent_reorg, after
> splitting loads/stores, calls/branches into "init" and "fini" phases bound at
> fixed clock offsets by record_delay_slot_pair(). I already have a fixup pass
> to handle inter-EBB hazards. (The selective scheduler would handle interblock
> automatically, but I had trouble with it initially with split load/stores. I
> want to revisit that.)
>
> Regarding CPRs, I strongly desire to avoid kludgy fixups for schedules
> created with an incomplete dependence graph when the generic scheduler can do
> the job perfectly with a complete dependence graph.
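
As a rough illustration of the dummy-instruction scheme quoted above, the machine-description side might look something like the fragment below. This is only a sketch under assumptions: the "type" attribute values (cpr_dummy, cpr_use) and the reservation names are hypothetical placeholders, and a real port would reserve its own functional units rather than "nothing".

```
;; Hypothetical names throughout: the "type" values cpr_dummy and cpr_use
;; and both reservation names are placeholders, not taken from any real port.

;; Dummy insn emitted at the head of a basic block whose predecessor ends
;; with a CALL_INSN (e.g. a self-assignment of the CPR).  It reserves no
;; functional unit.
(define_insn_reservation "cpr_dummy_set" 1
  (eq_attr "type" "cpr_dummy")
  "nothing")

;; Insns that consume a CPR.  A real port would name its own units here.
(define_insn_reservation "cpr_use" 1
  (eq_attr "type" "cpr_use")
  "nothing")

;; The value set by the dummy insn becomes available to a CPR consumer
;; only after 2 cycles, which gives the required spacing.
(define_bypass 2 "cpr_dummy_set" "cpr_use")
```

The intra-block case could presumably be covered the same way, with a second define_bypass from the call's own reservation to "cpr_use".
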
I wonder if "kludgy fixups" refers to the dummy-instruction solution I mentioned above. The complete dependence graph is a myth. You cannot have a complete dependence graph for a function -- the scheduler works on DAG regions (and I doubt it will ever support anything more complex), so you would have to do something to account for inter-region dependencies anyway. It is simpler to have a unified solution that handles both inter- and intra-region dependencies than to implement two different approaches.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics