All,

I am trying to place the following analysis in the GCC vectorizer. I believe it would improve the vectorizer considerably by distinguishing unit-stride, zero-stride and non-unit-stride memory accesses.

First, a topological sort is performed on the Data Dependence Graph (DDG). Each node of the topologically sorted DDG is then assigned a timestamp using the following algorithm:

    For each node N in topologically sorted order in the DDG
    {
        Timestamp(N) = Max(0, Timestamp of all predecessors of N) + 1;
    }

Based on these timestamps, the DDG is partitioned so that each partition holds the nodes with the same timestamp. The nodes in a partition do not depend on one another in the DDG, so they are candidates for vectorization.

To enable the vectorization, each partition is split further into sub-partitions based on contiguous and non-contiguous access. The memory addresses of all the operands of each node in the partition are sorted in increasing/decreasing order. From that sorted order, sub-partitions are formed for unit-stride accesses, zero-stride accesses, and accesses that require shuffling of operands through vector instructions.

This analysis helps in performing data layout on the partitioned nodes of the DDG. Sub-partitions with contiguous accesses can be vectorized directly, and performing data layout on the non-contiguous accesses enables more vectorization opportunities.

Thoughts?

Thanks & Regards
Ajit
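
P.S. To make the two steps concrete, here is a minimal sketch in Python rather than in GCC internals. Everything in it is an assumption for illustration: the function names (assign_timestamps, partition_by_timestamp, classify_stride) are hypothetical, the DDG is a plain list of (predecessor, successor) edges, operand addresses are plain integers, and elem_size stands in for the element width. None of this is GCC API.

```python
from collections import defaultdict, deque


def assign_timestamps(nodes, edges):
    """Return {node: timestamp} for a DAG given as (pred, succ) edges."""
    preds = defaultdict(list)
    succs = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm: visit the nodes in a topological order.
    ready = deque(n for n in nodes if indegree[n] == 0)
    timestamp = {}
    while ready:
        node = ready.popleft()
        # Max over predecessors (0 if there are none), plus one.
        timestamp[node] = max((timestamp[p] for p in preds[node]), default=0) + 1
        for s in succs[node]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    if len(timestamp) != len(nodes):
        raise ValueError("dependence graph has a cycle")
    return timestamp


def partition_by_timestamp(timestamp):
    """Group nodes that share a timestamp, ordered by increasing timestamp."""
    groups = defaultdict(list)
    for node, ts in timestamp.items():
        groups[ts].append(node)
    return [groups[ts] for ts in sorted(groups)]


def classify_stride(addresses, elem_size):
    """Classify operand addresses as 'zero', 'unit' or 'shuffle'."""
    if len(addresses) < 2:
        raise ValueError("need at least two addresses to measure a stride")
    ordered = sorted(addresses)
    # Set of distinct gaps between neighbouring addresses after sorting.
    deltas = {b - a for a, b in zip(ordered, ordered[1:])}
    if deltas == {0}:
        return "zero"      # every operand reads the same location
    if deltas == {elem_size}:
        return "unit"      # contiguous elements
    return "shuffle"       # anything else needs operand shuffling


if __name__ == "__main__":
    ts = assign_timestamps(["a", "b", "c", "d"],
                           [("a", "c"), ("b", "c"), ("c", "d")])
    print(partition_by_timestamp(ts))
    print(classify_stride([100, 104, 108, 112], 4))
    print(classify_stride([200, 200, 200], 4))
    print(classify_stride([100, 112, 104, 140], 4))
```

In this toy graph, a and b have no predecessors and share the first partition, c depends on both and lands in the second, and d lands in the third. A real implementation would also have to check that the nodes in a partition perform the same operation and would work on data references rather than raw integers.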