15.03.2015 17:44, Ajit Kumar Agarwal writes: > > Hello All: > > The below example is taken from the article by Albert Cohen et.al. > > "Deep Jam : Conversion of Coarse grain Parallelism to Fine grain vector > parallelism"
It is not clear from this article, whether such optimization can be implemented in practice (i.e. automatically and within reasonable time). Authors implemented some proof-of-concept, but from their description it looks like they had to tweak their algorithm in each particular case (and they hope, that later this could be automated using FDO). Also nothing definite is said about compilation time. > I would like to propose the above heuristics for unroll and jam and renaming > which enables the loop fusion and the IF-merging > to achieve the optimize code. AFAIK, most of loop transformations in GCC are applicable to rather simple loops, which, at least have an induction variable. So, loops like while (q) { ... } for some arbitrary condition q, e.g. while (*a++ = *b++) ; are unlikely to be optimized well. The proposed transformation > For ( I = 0 ; I < 10; i++) > a1 = phi(a1,a2); > If ( p1 && p2) > a1 = ...; > a2 = ...; > Else > If (p1) > a1= ....; > Else > If (p2) > a2 = ...; > > While (q1 && q2) > { > ... = a1 + ...; > ... = a2 + .....; > } > While (q1) > ....= a1 + ...; > While ( q2) > ..... = a2 + ..; involves some sort of unroll-and-jam, nontrival code motion, loop fusion. And it obviously requires analysis of loop-carried data dependence, that can deal with nested loops having complex conditions inside them, which GCC does not yet provide. GCC includes Graphite framework - it allows to perform rather complex transformations of loops (perhaps similar to what you described), but it only works for loops which have canonical induction variables and involve only affine memory access (i.e. addresses must be represented as linear combinations of induction variables and constants) - no if's, no while's with non-trivial exit conditions. So, I think this idea is, unfortunately, not very realistic in terms of required efforts. P.S. I'm new in GCC community, but I hope that in general terms my description is close to reality. Please correct me, if something is wrong. -- Regards, Mikhail Maltsev