OK, I've actually gone a different route. Instead of waiting for the middle end to perform the unrolling, I've modified the parser stage to unroll the loop directly there.
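To make the idea concrete, here is a hand-written C sketch of what the transformation amounts to at the source level for an unroll factor of 2. It is not the parser change itself: the function names and the factor are assumptions chosen for illustration, and the assert only stands in for the trust placed in the user that the trip count is a multiple of the factor.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop, in the canonical form for (i = C1; i < C2; i++). */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical picture of the same loop after unrolling by 2 with no
   epilogue: the body and the increment are each emitted twice, and the
   condition is tested only once per pass. */
long sum_unrolled2(const int *a, size_t n)
{
    /* Assumption: the trip count is a multiple of the unroll factor.
       Without this, the second copy of the body would read past a[n-1]. */
    assert(n % 2 == 0);

    long sum = 0;
    for (size_t i = 0; i < n; ) {
        sum += a[i];   /* body, copy 1 */
        i++;           /* incr, copy 1 */
        sum += a[i];   /* body, copy 2 */
        i++;           /* incr, copy 2 */
    }
    return sum;
}
```

For an even n both functions return the same sum. For an odd n the unrolled version would be wrong, which is the case the warnings are meant to flag.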
Basically, I take the parser of the for statement and modify how it adds the various statements. Instead of doing what c_finish_loop does:

    if (body)
      add_stmt (body);
    if (clab)
      add_stmt (build1 (LABEL_EXPR, void_type_node, clab));
    if (incr)
      add_stmt (incr);
    ...

I tell it to add multiple copies of body and incr, and then to add the rest of the loop at the end. I've also added support to prevent further unrolling of these modified loops, and I will be handling the "No-unroll" pragma.

I then let the rest of the optimization passes fuse the increments together where possible, etc.

The initial results are quite good: the approach seems to work and produces good code.

Currently, there are two possibilities:

- If the loop is not in the form we want, for example:

    for (;i<n;)
      {
        ...
      }

  do we still unroll, even though we have to trust the user that the unroll factor will not break the semantics?

To handle this, I am adding warnings that will appear if the loop is anything but:

    for (i=C1; i < C2; i++)
      {
        ...
      }

Later on, once this is thoroughly tested, I will allow:

    for (i=C1; fct (i, C2); i = fct2 (i))

where fct is any comparison function involving only i and C2, and fct2 is an increment/decrement calculation using i.

Any comments? Concerns? Questions?

Thanks in advance,
Jc

On Thu, Oct 8, 2009 at 12:22 PM, Jean Christophe Beyler <jean.christophe.bey...@gmail.com> wrote:
> Hi,
>
>> such an epilogue is needed when the # of iterations is not known in the
>> compile time; it should be fairly easy to modify the unrolling not to
>> emit it when it is not necessary,
>
> Agreed, that is why I was surprised to see this in my simple example.
> It seems to me that the whole unrolling process has been made to, on
> purpose, have this epilogue in place.
>
> In the case where the unrolling would be perfect (ie. there would be
> no epilogue), the calculation of the max bound of the unrolled version
> is always done to have this epilogue (if you have 4 iterations and ask
> to unroll twice, it will actually change the max bound to 3,
> therefore, having one iteration of the unrolled version and 2
> iterations of the original...). I am currently looking at the code of
> tree_transform_and_unroll_loop to figure out how to change this and
> not have an epilogue in my cases.
>
> Jc
>
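
As a rough illustration of the bound adjustment described in the quoted message, the small C model below counts how the iterations get split. It is not GCC's actual code: the struct, the function name and the exact form of the lowered bound are assumptions, picked only so that the model reproduces the quoted example (4 iterations, unroll by 2, max bound lowered to 3).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how many passes the unrolled loop makes and
   how many iterations are left to the original loop (the epilogue). */
struct trip_counts {
    size_t unrolled_passes;
    size_t epilogue_iters;
};

/* Model of an unroller that always lowers the bound by one so that the
   epilogue runs at least once (assumed behaviour, for illustration). */
struct trip_counts split_iterations(size_t n, size_t factor)
{
    assert(factor >= 1);                 /* factor 0 would never advance */

    struct trip_counts t = {0, 0};
    size_t bound = (n > 0) ? n - 1 : 0;  /* lowered bound: 4 becomes 3 */
    size_t i = 0;

    /* Unrolled loop: one pass consumes 'factor' iterations. */
    while (i + factor <= bound) {
        t.unrolled_passes++;
        i += factor;
    }

    /* Epilogue: the original loop finishes the remaining iterations. */
    while (i < n) {
        t.epilogue_iters++;
        i++;
    }
    return t;
}
```

With n = 4 and factor = 2 the model gives one unrolled pass and two epilogue iterations, whereas a perfect unrolling would give two unrolled passes and no epilogue.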