Inserting the nops at the end of the RTL pipeline also has to take care of branch shortening and delay-slot filling. For the given context (FPU ops only), it is probably a doable approach.
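
To make the idea concrete, here is a rough sketch of nop insertion over a straight-line instruction sequence. It is a toy model only, not GCC code: the `insn` struct, `NO_REG`, `reads_reg` and `insert_nops` are hypothetical names made up for illustration, and the sketch assumes each instruction takes one issue slot and that `latency` is the minimum number of slots between a producer and any consumer of its result. It ignores basic-block boundaries, branch shortening and delay slots.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction model (hypothetical, not GCC's RTL). */
enum { NO_REG = -1 };

typedef struct {
    int is_nop;   /* 1 for a nop, 0 otherwise */
    int dest;     /* register written, or NO_REG */
    int src[2];   /* registers read, or NO_REG */
    int latency;  /* min. issue slots between this insn and a consumer */
} insn;

static const insn NOP = { 1, NO_REG, { NO_REG, NO_REG }, 1 };

/* Return nonzero if instruction i reads register reg. */
static int reads_reg(const insn *i, int reg)
{
    return reg != NO_REG && (i->src[0] == reg || i->src[1] == reg);
}

/* Copy in[0..n) to out, inserting nops so that every consumer issues at
   least `latency` slots after its producer.  Returns the number of
   instructions written; out must have room for cap entries. */
size_t insert_nops(const insn *in, size_t n, insn *out, size_t cap)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n; i++) {
        size_t needed = 0;  /* nops required before in[i] */

        /* Check in[i] against every instruction already emitted. */
        for (size_t j = 0; j < n_out; j++) {
            const insn *p = &out[j];
            size_t distance = n_out - j;  /* slots from p to in[i] */
            size_t latency = (size_t)p->latency;

            if (!p->is_nop && reads_reg(&in[i], p->dest)
                && latency > distance && latency - distance > needed)
                needed = latency - distance;
        }

        assert(n_out + needed + 1 <= cap);
        for (; needed > 0; needed--)
            out[n_out++] = NOP;
        out[n_out++] = in[i];
    }
    return n_out;
}
```

With a producer of latency 3 immediately followed by its consumer, this emits two nops between them; an independent instruction in between reduces that to one. A real pass would also have to look at hazards across basic-block boundaries and run after anything that can still move or delete instructions.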

//Claudiu

On Thu, Jan 8, 2015 at 4:28 PM, Joel Sherrill <joel.sherr...@oarcorp.com> wrote:
>
> On 1/8/2015 9:01 AM, Eric Botcazou wrote:
>>> I've worked on a gcc target that was porting an architecture without
>>> hardware interlock support. Basically, you need to emit nop operations
>>> to avoid possible hw conflicts. At the moment, this was done by
>>> patching the gcc scheduler to do so, Another issue to keep is to check
>>> for hardware conflicts across basic-block boundaries. And not the
>>> last, is to prohibit/avoid any instruction stream modification after
>>> scheduler (e.g., peephole optimizations etc.).
>> That's an overly complex approach, this usually can be done in a simpler way
>> with a machine-specific pass that runs at the end of the RTL pipeline.
>>
> Isn't this similar to needing to fill a delay slot after a branch
> instruction? My recollection
> is that some SPARC and MIPS have to deal with that.
>
> --
> Joel Sherrill, Ph.D.             Director of Research & Development
> joel.sherr...@oarcorp.com        On-Line Applications Research
> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
> Support Available                (256) 722-9985
>