http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57223
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase for the PRE issue:

typedef float test_t;

void foo(test_t * d, int n)
{
  int i, j, k;
  for (k=0; k<n; ++k) {
    for (i=0; i<n; ++i) {
      test_t t;
      j = k;
      t = d[i*n+k] + d[k*n+j];
      d[i*n+j] = (d[i*n+j] < t) ? d[i*n+j] : t;
      for (j=k+1; j<n; ++j) {
        t = d[i*n+k] + d[k*n+j];
        d[i*n+j] = (d[i*n+j] < t) ? d[i*n+j] : t;
      }
    }
  }
}

for which I think we have (simpler) duplicates.  We do not inhibit PRE from
removing partial memory redundancies just because we vectorize.
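To make the partial memory redundancy concrete, here is a hand-written source-level sketch of what PRE might plausibly do to the inner loop. It is an illustration only, not actual GCC output: foo_pre_sketch and the temporary dik are hypothetical names, and the real transformation happens on GIMPLE. The load of d[i*n+k] is available on loop entry (from the store in the peeled j = k iteration) but not on the back edge, because the compiler presumably cannot prove that the store to d[i*n+j] leaves it untouched. PRE would then insert a reload on the back edge and carry the value in a scalar, which creates a loop-carried dependence the vectorizer has to cope with.

```c
typedef float test_t;

/* Original testcase from the comment, unchanged in behaviour. */
void foo(test_t *d, int n)
{
  int i, j, k;
  for (k = 0; k < n; ++k) {
    for (i = 0; i < n; ++i) {
      test_t t;
      j = k;
      t = d[i*n+k] + d[k*n+j];
      d[i*n+j] = (d[i*n+j] < t) ? d[i*n+j] : t;
      for (j = k+1; j < n; ++j) {
        t = d[i*n+k] + d[k*n+j];
        d[i*n+j] = (d[i*n+j] < t) ? d[i*n+j] : t;
      }
    }
  }
}

/* Hypothetical model of the loop after PRE (assumption, not compiler output):
   d[i*n+k] is kept in the scalar dik across iterations of the j loop.  */
void foo_pre_sketch(test_t *d, int n)
{
  int i, j, k;
  for (k = 0; k < n; ++k) {
    for (i = 0; i < n; ++i) {
      test_t t, dik;
      /* Peeled j = k iteration. */
      t = d[i*n+k] + d[k*n+k];
      dik = (d[i*n+k] < t) ? d[i*n+k] : t;
      d[i*n+k] = dik;            /* stored value is available on loop entry */
      for (j = k+1; j < n; ++j) {
        t = dik + d[k*n+j];      /* uses the carried value, no load here */
        d[i*n+j] = (d[i*n+j] < t) ? d[i*n+j] : t;
        dik = d[i*n+k];          /* reload placed on the back edge */
      }
    }
  }
}
```

Both functions compute the same result for any n*n array d; the second only makes the loop-carried scalar explicit. Comparing vectorizer dumps with and without -fno-tree-pre on the original foo is one way to check whether PRE is what gets in the way.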