Hello,

> >>In principle, I don't see anything forbidding Zdenek's idea.
> >>Unfortunately, what I neglected to mention is that TARGET_MEM_REF nodes
> >>are also transformed into pointer arithmetic and the equivalent
> >>INDIRECT_REF memory access... therefore, this is not an option, if only
> >>because of that.
> >
> >hmm... why do you do that?  Could you please describe more precisely what
> >you are trying to achieve?
>
> Sure!
> The short answer is that, though most GIMPLE tree codes closely match
> what is representable in the CIL bytecode, some do not; hence, such
> codes are "lowered" into equivalent expressions that directly match what
> is representable in the bytecode.

this seems quite close to what TARGET_MEM_REFs are designed for.  IMHO,
the best way would be to lower the memory references to TARGET_MEM_REFs (*)
just once, sometime quite late in the optimization pipeline (just after
loop optimizations, for example), so that the high-level optimizers and
alias analysis see the easy-to-understand code, while at least the
essential cleanups are still performed on the lower-level code.

Zdenek

(*) or just to pointer arithmetic, as you do now, if you have some reason
for avoiding TMRs, although then you have to rerun the pass once later to
get rid of invalid forms that may be created by the optimizers.

> The long answer is in the "CIL simplification pass" section at
> http://gcc.gnu.org/projects/cli.html :-)
>
> >>The first time this CLI-specific transformation is performed is before
> >>GIMPLE code enters SSA form
> >
> >This looks like the wrong place; I guess later optimizations will in
> >general try to revert the trees to the original form (at the moment,
> >we do not have a tree-combine pass, but if we had, it definitely would).
> >IMHO, it would make more sense to do this kind of target-specific
> >transformation as late as possible.
>
> Yes, this is precisely the reason behind running this transformation
> pass twice.
> The first pass (before SSA) simplifies GIMPLE expressions not atomically
> representable in the bytecode, in the hope that GCC middle-end passes
> optimize them further.
> The second pass makes sure to re-simplify expressions that may have been
> reverted to the original form.  Practice shows this happens quite
> rarely; still, it's a possibility that must be taken into account.
> The first pass is optional, the second is strictly required.
>
> My experiments also show that the bytecode generated by running the CIL
> simplification twice, as opposed to running the final pass only, is
> significantly better.
> For multimedia codecs with heavy array accesses
> (which, by the way, is the kind of code ST is particularly interested
> in), the recorded difference in performance is up to 40%.
>
> >Anyway, if that is the case, using TMRs is not a good idea.
> >
> >Zdenek
>
> Cheers,
> Roberto