All:

I am trying to place the following analysis in the GCC vectorizer. It 
classifies memory accesses as unit-stride, zero-stride, or non-unit-stride, 
which I expect to help the vectorizer considerably.

A topological sort is performed on the data dependence graph (DDG). Each node 
of the topologically sorted DDG is then assigned a timestamp using the 
following algorithm.

For each node in topologically sorted order in the DDG
{

    /* A node with no predecessors gets timestamp 1. */
    Timestamp(node) = Max(0, Timestamp of all predecessors of node) + 1;

}
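
To make the timestamp step concrete, here is a minimal sketch in Python. It 
is only an illustration, not GCC code: the dictionary-based DDG, the function 
names and the node names are all assumptions made for the example.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+


def assign_timestamps(preds):
    """preds maps each DDG node to the set of nodes it depends on (hypothetical layout)."""
    # static_order() yields every node after all of its predecessors.
    timestamp = {}
    for node in TopologicalSorter(preds).static_order():
        pred_stamps = (timestamp[p] for p in preds.get(node, ()))
        # A node with no predecessors gets timestamp 1.
        timestamp[node] = max(pred_stamps, default=0) + 1
    return timestamp


def partition_by_timestamp(timestamp):
    """Group nodes that share a timestamp into one partition."""
    parts = defaultdict(list)
    for node, stamp in timestamp.items():
        parts[stamp].append(node)
    return dict(parts)


# Hypothetical DDG: c depends on a and b, d depends on c, e depends on a.
ddg = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
stamps = assign_timestamps(ddg)
partitions = partition_by_timestamp(stamps)
```

In this example a and b land in the first partition, c and e in the second, 
and d in the third.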

Based on these timestamps, the DDG is partitioned so that each partition 
holds the nodes with the same timestamp. A node always gets a larger 
timestamp than each of its predecessors, so no dependence path connects two 
nodes in the same partition. The nodes in a partition are therefore 
independent and are candidates for vectorization.

To enable vectorization, each partition is split further into sub-partitions 
according to whether its memory accesses are contiguous or non-contiguous. 
The memory addresses of the operands of each node in the partition are sorted 
in increasing (or decreasing) order. From the sorted addresses, the 
sub-partitions are formed: unit-stride accesses, zero-stride accesses, and 
accesses that require shuffling of operands through vector instructions.
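
One possible way to form the sub-partitions is sketched below in Python. It 
is an assumption-laden illustration rather than the intended GCC 
implementation: it looks at a single operand position, takes the addresses as 
plain integers, uses a simple greedy grouping, and labels anything that is 
neither zero-stride nor unit-stride as "shuffle". The name sub_partition, the 
node names and the addresses are hypothetical.

```python
def sub_partition(accesses, elem_size):
    """Split one partition into runs by access pattern.

    accesses: list of (node, address) pairs for one operand position.
    Returns a list of (kind, [nodes]) with kind in {"unit", "zero", "shuffle"}.
    """
    ordered = sorted(accesses, key=lambda pair: pair[1])  # increasing address
    runs = []  # each entry: (kind or None, nodes, last address seen)
    for node, addr in ordered:
        if runs:
            kind, nodes, last = runs[-1]
            delta = addr - last
            step = "zero" if delta == 0 else "unit" if delta == elem_size else None
            # Extend the current run only if the stride matches its kind.
            if step is not None and kind in (None, step):
                runs[-1] = (step, nodes + [node], addr)
                continue
        runs.append((None, [node], addr))
    # Runs that never found a neighbour would need a shuffle or gather.
    return [(kind or "shuffle", nodes) for kind, nodes, _ in runs]


# Hypothetical partition: six nodes loading 4-byte elements.
loads = [("n3", 200), ("n0", 100), ("n2", 108), ("n1", 104), ("n4", 300), ("n5", 300)]
groups = sub_partition(loads, elem_size=4)
```

Here n0, n1 and n2 form a unit-stride run, n4 and n5 form a zero-stride run, 
and n3 is left as an access that needs shuffling.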

This analysis helps in performing data layout on the partitioned nodes of the 
DDG, guided by the sub-partitions formed above. Sub-partitions with 
contiguous accesses can be vectorized directly, and applying data layout to 
the non-contiguous accesses enables further vectorization opportunities.

Thoughts?

Thanks & Regards
Ajit
