On 09/28/2015 03:30 AM, Aditya Kumar wrote:
From: hiraditya <hiradi...@msn.com>
Redesign Graphite scop detection for faster compiler time and detecting more
SCoPs.
Existing algorithm for SCoP detection in graphite was based on dominator tree
where a tree (CFG) traversal was required for analyzing an SESE. The tree
traversal is linear in the number of basic blocks and SCoP detection is
(probably) linear in number of instructions. That algorithm utilized a generic
infrastructure of SESE which does not directly represent loops. With regards to
graphite framework, we are only interested in subtrees with loops. The new
algorithm is geared towards tree traversal on loop structure. The algorithm is
linear in number of loops which is faster than the previous algorithm.
Briefly, we start the traversal at a loop-nest and analyze it recursively for
validity. Once a valid loop is found we find a valid adjacent loop. If an
adjacent loop is found and is valid, we merge both loop nests otherwise we form
a SCoP from the previous loop nest, and resume the algorithm from the adjacent
loop nest. The data structure to represent an SESE is an ordered pair of edges
(entry, exit). The new algoritm can extend a SCoP in both the directions. With
this approach, the number of instructions to be analyzed for validity reduces to
a minimal set. We start by analyzing those statements which are inside a loop,
because validity of those statements is necessary for the validity of loop. The
statements outside the loop nest can be just excluded from the SESE if they are
not valid.
I am generally fine with this, but please consider that when growing a SCoP
certain
previous analysis may become invalid (an affine expression may suddenly become
non-affine as parameters that were previously scop-invariant may now be part of
the scop. Also, how are you planning to handle non-affine regions/loops. In
polly
we can encapsulate non-affine loops and regions in bigger scops. To handle this,
I assume you would need to teach your patch to start growing regions even though
the innermost loops cannot be modeled precisely.
Best,
Tobias