I have a port for a multi-processor with high-latency memory accesses,
even for cache hits.  Each CPU core has a small private scratchpad RAM
with 1-cycle access.  I'd like to persuade GCC to use the scratchpad
(I'll probably allocate somewhere between 8 and 32 words) for reload,
rather than stack slots, which have much higher latency.  I have some
half-formed ideas about how to do this, which could involve describing
the scratchpad words as another class of register, movable only to and
from general registers.  I'm still trying to understand secondary reload
well enough to determine whether that's the mechanism I want.

Comments & suggestions are welcome!  Pithy clues (e.g., "Look at
the port for machine XYZ") are fine.  I can dig out the details if
given broad hints.

Greg
