------- Comment #13 from changpeng dot fang at amd dot com  2010-06-30 00:23 -------
Here is the current status of this work:
patch1: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02956.html
patch2: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg03049.html
On my system, compiling zero_sized_1.f90 with -O3 -fprefetch-loop-arrays
-fno-unroll-loops --param max-completely-peeled-insns=2000:

original timing:      5m30s
with patch1:          1m20s
with patch1 + patch2: 1m03s
without prefetch:     0m30s

Even after the two patches, the timing with -fprefetch-loop-arrays is still
about double the timing without it. The extra 33s is mostly spent in
dependence computation for loops. For this test case, prefetching is the
only optimization that invokes "compute_all_dependences".
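
For reference, the overhead figures can be rechecked from the four timings
listed earlier. This is only a small arithmetic sketch: the names `timings`
and `overhead` are made up for illustration and are not part of GCC or of
either patch.

```python
# Timings reported in this comment, converted to seconds.
timings = {
    "original": 5 * 60 + 30,
    "patch1": 1 * 60 + 20,
    "patch1 + patch2": 1 * 60 + 3,
    "without prefetch": 30,
}

# The run without prefetching is the baseline for comparison.
baseline = timings["without prefetch"]


def overhead(label):
    """Return (extra seconds, slowdown factor) relative to the baseline."""
    seconds = timings[label]
    return seconds - baseline, seconds / baseline


for label in timings:
    extra, factor = overhead(label)
    print(f"{label:18s} {timings[label]:4d}s  +{extra:3d}s  x{factor:.1f}")
```

For "patch1 + patch2" this gives 33 extra seconds, a factor of about 2.1
over the run without prefetching.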

I am not sure whether we should tolerate this timing increase with aggressive
peeling and prefetching, or whether we should work on reducing the cost of
dependence computation.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
