Hi all,

I'm trying to fix an infinite loop in the "reload" pass (LRA). I need early-clobber on the output of my load instructions, and LRA fails to terminate when register pressure is high.

Is there a proper way to fix this? Or do I need to do something "hacky" like fixing a register for use with reloads?

Here's the background:

AMD GCN has a mode called XNACK in which load instructions can be interrupted (by a page miss, for example) and therefore need to be written such that they are "restartable". In practice this means that the output must not overwrite the input registers. (A load can be partially successful, especially for vectors, but I believe overwriting the address and offsets is never safe, even for scalars.) Up to now we've not needed this mode, but it will be needed for Unified Shared Memory (and, in theory, for APU devices).
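To make the restart hazard concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not real GCN semantics: registers are lists of lanes, memory is a dict, and the names PageMiss, vector_load, load_with_restart and demo are all hypothetical. The load is interrupted after some lanes have been written and is then re-issued from the start.

```python
class PageMiss(Exception):
    """Hypothetical stand-in for an XNACK-style interruption mid-load."""


def vector_load(regs, dst, addr, memory, fail_at_lane=None):
    """Load memory[regs[addr][lane]] into regs[dst][lane], lane by lane.

    If fail_at_lane is given, the load is interrupted at that lane,
    after the earlier lanes have already been written.
    """
    for lane in range(len(regs[addr])):
        if lane == fail_at_lane:
            raise PageMiss(lane)
        regs[dst][lane] = memory[regs[addr][lane]]


def load_with_restart(regs, dst, addr, memory, fail_at_lane):
    """Issue the load once; on interruption, re-issue it from lane 0."""
    try:
        vector_load(regs, dst, addr, memory, fail_at_lane)
    except PageMiss:
        vector_load(regs, dst, addr, memory)


def demo(dst):
    """Run one interrupted load with addresses in v1 and return regs[dst]."""
    memory = {a: a * 10 for a in range(1000)}  # toy memory: value = 10 * address
    regs = {"v0": [0, 0, 0, 0], "v1": [4, 5, 6, 7]}
    load_with_restart(regs, dst, "v1", memory, fail_at_lane=2)
    return regs[dst]


if __name__ == "__main__":
    print("distinct output register:", demo("v0"))
    print("output aliases address:  ", demo("v1"))
```

With a distinct output register the restarted load gives the same result as an uninterrupted one. When the output aliases the address register, lanes 0 and 1 are reloaded from the values written by the first attempt, so the result is wrong. That is the situation the early-clobber is meant to rule out.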

So I have added new alternatives to my machine description that have early-clobber set on the output:

  [v   ,RF  ;flat ,*   ,12,*    ,off] flat_load%o1\t%0, %A1%O1%g1
  [&v  ,RF  ;flat ,*   ,12,*    ,on ] ^

(The "on" and "off" represent the XNACK mode.)

LRA then generates a register "Assignment" section in the dump, but for some reason it isn't satisfied with the result and generates another, and another, each with more pseudo registers and insns. This goes on forever, until the dump file is gigabytes in size and I kill it.

This is a vague description, sorry, because I don't really understand what's going on here and the dump files are huge with tens of thousands of pseudo registers to wade through. I'm hoping somebody recognises the issue without me spending days on it.

I have a workaround, because there's no known failure on devices that have the AVGPR register file (they use it as spill space and therefore don't need the memory loads), and I don't actually need XNACK on the older devices at this time. But this is probably just pushing the problem further down the road, so if there's a better solution then I'd like to find it.

Thanks in advance

Andrew
