------- Comment #3 from changpeng dot fang at amd dot com 2010-05-27 23:51 -------

I took a quick look at 434.zeusmp and found that prefetching for the following simple loop is responsible:

linpck.f: 131:
c
c        code for increment not equal to 1
c
      ix = 1
      smax = abs(sx(1))
      ix = ix + incx
      do 10 i = 2,n
         if(abs(sx(ix)).le.smax) go to 5
         isamax = i
         smax = abs(sx(ix))
    5    ix = ix + incx
   10 continue

Prefetching for this loop seems too aggressive when incx is unknown. It is not predictable which sx(ix) will cause a cache miss.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44297