On Mon, 2015-03-23 at 20:10 +0100, Erik Varga wrote: > On Sun, Mar 22, 2015 at 10:10 PM, Oleg Endo <oleg.e...@t-online.de> wrote: > > The PBQP approach is indeed very tempting, but there > > are a lot more things to it than just the solver. To get good > > improvements of the generated code, the optimization also has to be able > > to reorder memory accesses and perform other transformations such as > > converting pre-inc into post-inc modes in loops etc. > > I confess there are some optimizations that the PBQP approach doesn't > take into account, like reordering the instructions. Could you > elaborate a bit on the prec-inc to post-inc conversion? I might be > missing something, but I think the PBQP algorithm should have no > problem transforming the addressing mode to post-inc if that makes the > overall cost less.
Yes, the PBQP approach tries to minimize the total cost of address sequences based on the available candidate addressing modes for a particular access. If a (cheaper) post-inc mode can be used instead of e.g. a reg+disp mode, it will most likely catch it. What I was referring to is transforming a pre-inc mode access into a post-inc mode access inside a loop (if the target can do post-inc only). The transformation itself is quite straight forward -- just place one inc before the loop and convert the pre-inc inside the loop into a post-inc access: unsigned int test (const char* s0) { const char* s1 = s0; while (*++s1); return s1 - s0 - 1; } In this particular case, the transformation is already done by something else (yet the auto-inc-dec fails to see the post-inc opportunity). I'd have to dig through my notes to see if there was another use case... > I'd like to give this a try, I'm sure a lot could be achieved in a > summer. Could you share how you planned to approach the problem? As far as I can see there are two different problems. One is providing the necessary infrastructure/framework to be able to do various optimizations on access sequences and addressing modes inside of a GCC RTL pass. The other problem is trying to find a good algorithmic solution for the optimization itself. Before any optimization can be done access sequences need to be extracted from the insn list. A simple way would be to just locating all memory accesses and use their access expressions as-is. But that won't go very far. The most important information is some sort of base-register/constant value of an access sequence, which needs to be found. Once the access sequences are there, there are multiple ways how to optimize them. For example, one option could be to first try to maximize the utilization of post/pre-inc/dec modes (assuming/knowing that they are generally a good idea on that target). Then try to see how to handle non-linear / pseudo-random accesses where displacements are out of range. At least on SH, there are two options. Either adjust the base address to make the displacements fit or load the constant displacement into a reg and use a reg+reg mode. Which one is better depends on the surrounding code. Another option could be to try to come up with a way to create good costs and access mode candidates for the (PBQP) solver. There you need to answer the question "what are the costs for this access, in this sequence, with this surrounding code, approximated register pressure,...?". Again, some modes might become cheaper after applying particular transformations (access reordering to create linear access patterns for post/pre-inc/dec). If such special transformations are already done to improve access sequences, running a solver afterwards might not result in any further improvement. > I'd > also be interested in some of the papers you found (I haven't yet been > able to find much on the subject apart from the PBQP method). I'll send you a private message with those. If anyone else is interested in that stuff, please let me know. Cheers, Oleg