Paolo,

Thanks for the reply. However, I am not sure it is a simple folding issue.
For example,

B1 = B + 4;
= [A, B1]
B2 = B + 8;
= [A, B2]
B3 = B + 12;
= [A, B3]

should be transformed to

C = A + B
= [C, 4]
= [C, 8]
= [C, 12]

The loop exit condition needs to be changed accordingly.

BTW, I just added an experimental tree-level loop unrolling pass in my port, right before the ivopts pass. The results are very promising except for a few quirks, which I believe to be problems in ivopts. The produced assembly code is now as good as manual unrolling.

Cheers,
Bingfeng

-----Original Message-----
From: Paolo Bonzini [mailto:[EMAIL PROTECTED] On Behalf Of Paolo Bonzini
Sent: 10 July 2008 13:34
To: Bingfeng Mei
Cc: Steven Bosscher; gcc@gcc.gnu.org
Subject: Re: Inefficient loop unrolling.

Bingfeng Mei wrote:
> Steven,
> I just created a bug report. You should receive a CCed mail now.
>
> I can see these issues are solvable at RTL-level, but require lots of
> efforts. The main optimization in loop unrolling pass, split iv, can
> reduce dependence chain but not extra ADDs and alias issue. What is the
> main reason that loop unrolling should belong to RTL level? Is it
> fundamental?

No, it is just effectiveness of the code size expansion heuristics. Ivopts is already complex enough on the tree level, that doing it on RTL would be insane. But other low-level loop optimizations had already been written on the RTL level and since there were no compelling reasons, they were left there.

That said, this is a bug -- fwprop should have folded the ADDs, at the very least. I'll look at the PR.

Paolo
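
P.S. To make the addressing change described at the top of this mail more concrete, here is a minimal C sketch of the two forms of a loop unrolled by four. It is only an illustration, not compiler output: the function names sum_indexed and sum_based are hypothetical, the offsets are in array elements rather than bytes, and n is assumed to be a multiple of 4.

```c
#include <stddef.h>

/* Form 1: every access uses base A plus a separately computed index,
   mirroring B1 = B + 4, B2 = B + 8, B3 = B + 12 (here in elements).
   Assumption: n is a multiple of 4. */
long sum_indexed(const int *a, size_t n)
{
    long s = 0;
    for (size_t b = 0; b < n; b += 4) {
        size_t b1 = b + 1;   /* extra ADD per unrolled copy */
        size_t b2 = b + 2;
        size_t b3 = b + 3;
        s += (long)a[b] + a[b1] + a[b2] + a[b3];
    }
    return s;
}

/* Form 2: C = A + B is the only induction variable, the accesses use
   constant offsets from it, and the exit test is rewritten to compare
   the pointer against an end address instead of comparing the index. */
long sum_based(const int *a, size_t n)
{
    long s = 0;
    const int *end = a + n;
    for (const int *c = a; c != end; c += 4) {
        s += (long)c[0] + c[1] + c[2] + c[3];
    }
    return s;
}
```

Both functions return the same sum for the same input. The second form has no per-copy index additions and only one variable to update per iteration, which is the shape I would expect ivopts to produce.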