On May 15, 2017 6:56:53 PM GMT+02:00, Steve Ellcey <sell...@cavium.com> wrote:
>On Sat, 2017-05-13 at 08:18 +0200, Richard Biener wrote:
>> On May 12, 2017 10:42:34 PM GMT+02:00, Steve Ellcey <sellcey@cavium.com> wrote:
>> >
>> > (Short version of this email, is there a way to recalculate/rebuild
>> > virtual phi nodes after modifying the CFG.)
>> >
>> > I have a question about duplicating loops and virtual phi nodes.
>> > I am trying to implement the following optimization as a pass:
>> >
>> > Transform:
>> >
>> >   for (i = 0; i < n; i++) {
>> >     A[i] = A[i] + B[i];
>> >     C[i] = C[i-1] + D[i];
>> >   }
>> >
>> > Into:
>> >
>> >   if (noalias between A&B, A&C, A&D)
>> >     for (i = 0; i < 100; i++)
>> >       A[i] = A[i] + B[i];
>> >     for (i = 0; i < 100; i++)
>> >       C[i] = C[i-1] + D[i];
>> >   else
>> >     for (i = 0; i < 100; i++) {
>> >       A[i] = A[i] + B[i];
>> >       C[i] = C[i-1] + D[i];
>> >     }
>> >
>> > Right now the vectorizer sees that 'C[i] = C[i-1] + D[i];' cannot be
>> > vectorized so it gives up and does not vectorize the loop.  If we split
>> > up the loop into two loops then the vector add with A[i] could be
>> > vectorized even if the one with C[i] could not.
>>
>> Loop distribution does this transform but it doesn't know about
>> versioning for unknown dependences.
>>
>
>Yes, I looked at loop distribution.  But it only works with global
>arrays and not with pointer arguments where it doesn't know the size of
>the array being pointed at.  I would like to be able to have it work
>with pointer arguments.  If I call a function with 2 or
>more integer pointers, and I have a loop that accesses them with
>offsets between 0 and N where N is loop invariant then I should have
>enough information (at runtime) to determine if there are overlapping
>memory accesses through the pointers and determine whether or not I can
>distribute the loop.

Not sure where you got that from.  Loop distribution works with our data
reference / dependence analysis.  The cost model might be more restricted
but that can be fixed.

>The loop splitting code seemed like a better template since it already
>knows how to split a loop based on a runtime determined condition.  That
>part seems to be working for me, it is when I try to
>distribute/duplicate one of those loops (under the unaliased condition)
>that I am running into the problem with virtual PHIs.

There's mark_virtual*for_renaming (sp?).  But as said you are performing
loop distribution so please enhance the existing pass rather than writing a
new one.

Richard.

>Steve Ellcey
>sell...@cavium.com
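
For illustration only, here is roughly what the versioned transform discussed
above looks like when written by hand at the source level.  This is a C sketch,
not GCC pass code.  The names add_and_scan and ranges_overlap are hypothetical,
the overlap test is deliberately conservative, and the loops start at i = 1 (an
assumption made here so that C[i-1] stays in bounds); none of that comes from
the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-element int ranges starting at
   p and q share any byte.  Addresses are compared as integers because
   relational comparison of pointers into different objects is not
   defined in ISO C.  */
static int ranges_overlap(const int *p, const int *q, size_t n)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)q;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return a < b + bytes && b < a + bytes;
}

/* Hypothetical example function.  Returns 1 when the distributed version
   ran and 0 when the original fused loop ran.  */
int add_and_scan(int *A, const int *B, int *C, const int *D, size_t n)
{
    size_t i;

    /* Runtime versioning condition: noalias between A&B, A&C, A&D.  */
    if (!ranges_overlap(A, B, n)
        && !ranges_overlap(A, C, n)
        && !ranges_overlap(A, D, n)) {
        /* No loop-carried dependence: a candidate for vectorization.  */
        for (i = 1; i < n; i++)
            A[i] = A[i] + B[i];
        /* Loop-carried dependence on C[i-1]: stays scalar.  */
        for (i = 1; i < n; i++)
            C[i] = C[i-1] + D[i];
        return 1;
    }

    /* Possible overlap: keep the original loop so the order of memory
       accesses is unchanged.  */
    for (i = 1; i < n; i++) {
        A[i] = A[i] + B[i];
        C[i] = C[i-1] + D[i];
    }
    return 0;
}
```

The check treats every range as n elements wide, so it may reject cases that
would actually be safe to distribute (for example A == B); a real
implementation would derive the checks from the data reference analysis
rather than from a fixed pattern like this.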