------- Comment #13 from changpeng dot fang at amd dot com  2010-06-01 19:59 -------
(In reply to comment #12)
> Ok.  So I will let you continue to look into that and wait for your results?
>
> Do you have any feedback on separate.patch and its influence on performance?
>

+  for (; groups; groups = groups->next)
+    for (ref = groups->refs; ref; ref = ref->next)
+      {
+        if (cst_and_fits_in_hwi (ref->group->step))
+          continue;
+        if (!ref->issue_prefetch_p)
+          continue;
+        insn_to_prefetch_ratio = (unroll_factor * ninsns) / prefetch_count;
+        if (insn_to_prefetch_ratio < MIN_INSN_TO_SPECULATIVE_PREFETCH)
+          {
+            ref->issue_prefetch_p = false;
+            if (dump_file && (dump_flags & TDF_DETAILS))
+              fprintf (dump_file,
+                       "Ignoring %p-- insn to prefetch ratio (%d) too small\n",
+                       (void *) ref, insn_to_prefetch_ratio);
+          }
+      }
+ }

The patch should fix the tonto regression caused by non-constant step
prefetching.  However, the ratio does not depend on the individual reference,
so you should move the computation and comparison outside (before) the loop,
and the debug dump after the loop.

I am just thinking that for such a loop we should do nothing: no non-temporal
stores and no constant step prefetching, because nothing could be trusted.

I am doing some experiments and will let you know what I find.

Thanks.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44297
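
For illustration, the restructuring suggested in this comment could look roughly like the sketch below. It is not the committed fix. The structs are simplified stand-ins for GCC's internal types, the `constant_step` flag replaces the `cst_and_fits_in_hwi (ref->group->step)` test, the function name `drop_speculative_prefetches` is hypothetical, and the threshold value of 8 is an assumption, since the real value of MIN_INSN_TO_SPECULATIVE_PREFETCH is not given in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed value for illustration only; not taken from the patch.  */
#define MIN_INSN_TO_SPECULATIVE_PREFETCH 8

/* Simplified, hypothetical stand-ins for GCC's prefetch structures.  */
struct mem_ref
{
  struct mem_ref *next;
  bool constant_step;     /* Stands in for cst_and_fits_in_hwi (step).  */
  bool issue_prefetch_p;
};

struct mem_ref_group
{
  struct mem_ref_group *next;
  struct mem_ref *refs;
};

/* Disable prefetches for non-constant step references when the loop has
   too few instructions per prefetch.  Returns the number of references
   that were disabled.  */
unsigned
drop_speculative_prefetches (struct mem_ref_group *groups,
                             unsigned unroll_factor, unsigned ninsns,
                             unsigned prefetch_count, FILE *dump_file)
{
  unsigned ratio, dropped = 0;
  struct mem_ref *ref;

  assert (prefetch_count > 0);

  /* The ratio is loop invariant: compute and compare it once, before
     walking the references.  */
  ratio = (unroll_factor * ninsns) / prefetch_count;
  if (ratio >= MIN_INSN_TO_SPECULATIVE_PREFETCH)
    return 0;

  for (; groups; groups = groups->next)
    for (ref = groups->refs; ref; ref = ref->next)
      if (!ref->constant_step && ref->issue_prefetch_p)
        {
          ref->issue_prefetch_p = false;
          dropped++;
        }

  /* One summary dump after the walk instead of one line per reference.  */
  if (dump_file && dropped)
    fprintf (dump_file,
             "Ignoring %u non-constant step prefetches -- "
             "insn to prefetch ratio (%u) too small\n", dropped, ratio);

  return dropped;
}
```

With the ratio hoisted, a loop whose ratio is large enough returns immediately without touching any reference, and constant step references are never modified.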