In the previous patch to reduce iteration counts, I overlooked
that, in the inner loop of s176, the array index i+m-j-1
turns negative in later iterations of the middle loop when m is small.
To avoid this, m and the upper bound of the middle loop must stay equal.
Fix PR testsuite/116271, gcc.dg/vect/tsvc/vect-tsvc-s176.c fails
2024-08-07 Joern Rennecke <joern.renne...@riscy-ip.com>
gcc/testsuite:
PR testsuite/116271
* gcc.dg/vect/tsvc/vect-tsvc-s176.c [TRUNCATE_TEST]: Make sure
that m stays the same as the loop bound of the middle loop.
* gcc.dg/vect/tsvc/tsvc.h (get_expected_result): <s176>
[TRUNCATE_TEST]: Adjust expected value.
Index: gcc.dg/vect/tsvc/tsvc.h
===================================================================
--- gcc.dg/vect/tsvc/tsvc.h (revision 6682)
+++ gcc.dg/vect/tsvc/tsvc.h (working copy)
@@ -1727,7 +1727,7 @@ real_t get_expected_result(const char * name)
#ifndef TRUNCATE_TEST
return 32021.121094f;
#else /* TRUNCATE_TEST */
- return 32024.082031f;
+ return 32023.751953f;
#endif /* TRUNCATE_TEST */
#endif /* iterations */
} else {
Index: gcc.dg/vect/tsvc/vect-tsvc-s176.c
===================================================================
--- gcc.dg/vect/tsvc/vect-tsvc-s176.c (revision 6682)
+++ gcc.dg/vect/tsvc/vect-tsvc-s176.c (working copy)
@@ -15,18 +15,20 @@ real_t s176(struct args_t * func_args)
initialise_arrays(__func__);
- int m = LEN_1D/2;
#ifdef TRUNCATE_TEST
- /* Do something equivalent to if (1) which the compiler is unlikely to
- figure out.
- FUNC_ARGS is in the caller's frame, so it shouldn't be between A and B.
- */
- if ((void *)func_args <= (void *)a || (void *)func_args >= (void *)b)
- m = 32;
+/* Reduce the iteration counts without changing what is a variable and
+ what is a constant expression.
+ 32000/50 == 640, i.e. it still has a nice power of two factor, but is
+ not a power of two itself, and still somewhat large-ish, so hopefully
+ this won't perturb the vectorizer decisions much. */
+#define M_CONST LEN_1D/50
+#else
+#define M_CONST LEN_1D/2
#endif
+ int m = M_CONST;
for (int nl = 0; nl < 4*(10*iterations/LEN_1D); nl++) {
- for (int j = 0; j < (LEN_1D/2); j++) {
+ for (int j = 0; j < (M_CONST); j++) {
for (int i = 0; i < m; i++) {
a[i] += b[i+m-j-1] * c[j];
}