Michael Hope wrote:
 but this causes trouble when setting up ACC for the likes of the add
above.  The compiler runs but the code is incorrect

Incorrect how? If you don't give us precise descriptions of the problem, then we can't give you precise answers. A precise description would be RTL excerpts from -da dumps, showing exactly when (which RTL pass) and exactly how (good and bad RTL) it breaks.

Taking a guess, I would say it breaks when the reload pass emits instructions to resolve reloads, because these instructions will clobber ACC, which generates wrong code if ACC already has a live value in it.

To avoid this problem, I think you would have to hide ACC from the compiler before reload. So instead of having an addsi3 pattern that does R11 = R11 + ACC, you write an addsi3 pattern that does R11 = R11 + R12, and which emits two assembly language instructions to achieve this. Then you write a post-reload splitter which splits it into two RTL instructions, one to do the add and one to do the move. You have a post-reload-only pattern to accept R11 = R11 + ACC. Then you hope that CSE, peephole, and other post-reload optimizations can eliminate most of the redundant instructions. If need be, you might add a machine-dependent reorg pass that cleans up the code even more. You can have post-reload patterns by checking for reload_completed in the condition.
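To make the split-then-clean-up idea concrete, here is a toy model in plain C. It is not GCC code and not a machine description: the instruction struct, the opcode names, the register numbering (ACC as register 0), and both function names are hypothetical, chosen only to illustrate the shape of the transformation. The first function plays the role of the post-reload splitter, and the second stands in for whatever post-reload cleanup removes redundant ACC loads.

```c
#include <stddef.h>

/* Toy model, not GCC code: every name below is hypothetical. */
enum { ACC = 0 };   /* assumed register number of the accumulator */

typedef enum {
    OP_ADD_PSEUDO,  /* dst = dst + src, ACC hidden (pre-reload form) */
    OP_MOV,         /* dst = src */
    OP_ADD_ACC      /* dst = dst + ACC (post-reload-only form) */
} opcode;

typedef struct { opcode op; int dst; int src; } insn;

/* Post-reload split: expand each hidden-ACC add into
   "ACC = src; dst = dst + ACC".  The out array must have room for
   2 * n entries.  Returns the number of instructions written. */
size_t split_after_reload(const insn *in, size_t n, insn *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].op == OP_ADD_PSEUDO) {
            out[k++] = (insn){ OP_MOV, ACC, in[i].src };
            out[k++] = (insn){ OP_ADD_ACC, in[i].dst, ACC };
        } else {
            out[k++] = in[i];
        }
    }
    return k;
}

/* Cleanup over straight-line code: drop "ACC = r" when ACC already
   holds a copy of r.  Compacts in place and returns the new length. */
size_t drop_redundant_acc_loads(insn *code, size_t n)
{
    int acc_copy_of = -1;   /* register that ACC currently mirrors, or -1 */
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        insn x = code[i];
        if (x.op == OP_MOV && x.dst == ACC && x.src == acc_copy_of)
            continue;                   /* ACC already holds this value */
        if (x.op == OP_ADD_PSEUDO)
            acc_copy_of = -1;           /* unsplit add clobbers ACC implicitly */
        else if (x.dst == ACC)
            acc_copy_of = (x.op == OP_MOV) ? x.src : -1;
        if (x.dst != ACC && x.dst == acc_copy_of)
            acc_copy_of = -1;           /* mirrored register was overwritten */
        code[k++] = x;
    }
    return k;
}
```

For two adds that share a source register, say R1 = R1 + R3 followed by R2 = R2 + R3, the split produces four instructions and the cleanup removes the second load of ACC, leaving three. A real port would express the same thing as patterns and splitters in the machine description, and the cleanup would have to respect basic block boundaries, which this sketch ignores.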

There have been some ports to small accumulator-based machines, but mostly they have used tricks like a page-zero register file. You just assume that you have 16 registers in page-zero memory, and you don't bother telling gcc about the actual hardware registers, taking advantage of the fact that you have fast page-zero addressing modes. This of course gives horrible code, as every operation involves two loads, an ACC operation, and a store. However, it does give a working compiler, for those desperate enough to need one.
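The cost of the page-zero approach can be seen in a small C model of such a machine. This is only a sketch under assumptions: the struct layout, the 8-bit width, the second operand latch (tmp), and the function names are all hypothetical and do not describe any particular port. The point is just that an add between two of the compiler's "registers" turns into two loads, an ACC operation, and a store.

```c
#include <stdint.h>

/* Toy model of an accumulator machine; every name here is hypothetical. */
enum { NUM_ZP_REGS = 16 };      /* the "registers" the compiler sees */

typedef struct {
    uint8_t acc;                /* real accumulator, hidden from the compiler */
    uint8_t tmp;                /* assumed latch for the second operand */
    uint8_t zp[NUM_ZP_REGS];    /* page-zero memory used as a register file */
    unsigned mem_accesses;      /* count of page-zero loads and stores */
} machine;

static uint8_t zp_load(machine *m, int r)
{
    m->mem_accesses++;
    return m->zp[r];
}

static void zp_store(machine *m, int r, uint8_t v)
{
    m->mem_accesses++;
    m->zp[r] = v;
}

/* zp[dst] = zp[a] + zp[b]: two loads, one ACC operation, one store. */
void zp_add(machine *m, int dst, int a, int b)
{
    m->acc = zp_load(m, a);                 /* load 1 */
    m->tmp = zp_load(m, b);                 /* load 2 */
    m->acc = (uint8_t)(m->acc + m->tmp);    /* ACC operation */
    zp_store(m, dst, m->acc);               /* store */
}
```

Each call to zp_add touches page-zero memory three times even when the result is consumed immediately by the next operation, which is where the horrible code comes from.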

Jim
