Hello,
I try to make modulo scheduling work more efficiently for our VLIW target. I 
found one serious issue that prevents current SMS algorithm from achieving high 
IPC is so-called "transitive closure" problem, where scheduling window is only 
calculated using direct predecessors and successors. Because SMS is not an 
iterative algorithm, this may cause failures in finding a valid schedule. 
Without splitting rows, some simple loops just cannot be scheduled not matter 
how big the II is. With splitting rows, schedule can be found, but only at 
bigger II. GCC wiki (http://gcc.gnu.org/wiki/SwingModuloScheduling) lists this 
as a TODO. Is there any work going on about this issue (the last wiki update 
was one year ago)? If no one is working on it, I plan to do it. My idea is to 
use the MinDist algorithm described in B. Rau's classic paper "iterative modulo 
scheduling" (http://www.hpl.hp.com/techreports/94/HPL-94-115.html). The same 
algorithm can also be used to compute better RecMII. The biggest concern is 
complexity of computing MinDist matrix, which is O(N^3). N is number of nodes 
in the loop. I remember somewhere GCC coding guide says "never write quadratic 
algorithm" :-) Is this an absolute requirement? If yes, I will keep it as our 
target-specific code (we are less concerned about compilation time). Otherwise, 
I will try to make it more generic to see if it can make into mainline in 4.5. 
Any comments? 

Cheers,
Bingfeng Mei

Broadcom UK 


Reply via email to