Hi Thomas,

Thanks for reviewing the patch.

On Tue, 2025-10-21 at 09:24 +0200, Thomas Schwinge wrote:
> Hi Avinash!
>
> On 2025-10-21T11:46:04+0530, Avinash Jayakar <[email protected]> wrote:
> > Some targets (aarch64 and x86_64 with multilib) reported regression
> > for some test cases made for PR104116.
>
> Thanks for looking into this.
>
> I've similarly observed for '--target=amdgcn-amdhsa':

I hope the issue is the same and that this patch fixes it. Is it possible
to run this on x86_64? If so, I can run it and check.

>
> +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-ceil-umod-2.c execution test
> +FAIL: gcc.dg/vect/pr104116-ceil-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-ceil-umod-pow2.c execution test
> +FAIL: gcc.dg/vect/pr104116-ceil-umod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-div-2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-div-2.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-div-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-div-pow2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-div-pow2.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-div-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-div.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-div.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-div.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-mod-2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-mod-2.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-mod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-mod-pow2.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-mod-pow2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-mod.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-mod.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-mod.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> +PASS: gcc.dg/vect/pr104116-round-umod-2.c (test for excess errors)
> +PASS: gcc.dg/vect/pr104116-round-umod-2.c execution test
> +FAIL: gcc.dg/vect/pr104116-round-umod-2.c scan-tree-dump-times vect "optimized: loop vectorized" 1
>
> > Turned out an extra loop which was for checking results in run-time
> > was also being vectorized and the count of vect loop was 2 instead
> > of 1. In this patch I have made sure no other loop other than the
> > one in interest of test case is vectorized. Ok for master?
> >
> > The commit gcc-16-4464-g6883d51304f added 30 new tests for testing
> > vectorization of {FLOOR,MOD,ROUND}_{DIV,MOD}_EXPR. Few of them failed
> > for certain targets due to the vectorization of runtime-check loop
> > which was not intended.
> > This patch disables optimization for all of the run-time check loops
> > so that the count of vectorized loop is always 1.
> >
> > 2025-10-21  Avinash Jayakar  <[email protected]>
> >
> > gcc/testsuite/ChangeLog:
> > 	PR target/104116
> > 	* gcc.dg/vect/pr104116.h: disable optimizations.
>
> Here, you should list the individual functions that your modifying.
>
> > --- a/gcc/testsuite/gcc.dg/vect/pr104116.h
> > +++ b/gcc/testsuite/gcc.dg/vect/pr104116.h
> > @@ -106,6 +106,7 @@ int cl_div (int x, int y)
> >    return q;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  unsigned int cl_udiv (unsigned int x, unsigned int y)
> >  {
> >    unsigned int r = x % y;
>
> As far as I can tell, the standard idiom is to put '#pragma GCC
> novector'

Thanks for this info. I will use this instead, for the other loops as
well. I was using the attribute because it meant changing only one file,
and I was not aware of this idiom.

I will send the reworked patch as v2 in a separate thread.

Thanks and regards,
Avinash Jayakar

> in front of the loop that's not to be vectorized.  That's more
> expressive than enforcing '-O0'.  Or is '-O0' necessary for other
> reasons?
>
>
> Grüße
>  Thomas
>
>
> > @@ -123,6 +124,7 @@ int cl_mod (int x, int y)
> >    return r;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  unsigned int cl_umod (unsigned int x, unsigned int y)
> >  {
> >    unsigned int r = x % y;
> > @@ -141,7 +143,7 @@ int fl_div (int x, int y)
> >    return q;
> >  }
> >
> > -
> > +__attribute__((optimize("O0")))
> >  int fl_mod (int x, int y)
> >  {
> >    int r = x % y;
> > @@ -150,12 +152,14 @@ int fl_mod (int x, int y)
> >    return r;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  int abs(int x)
> >  {
> >    if (x < 0) return -x;
> >    return x;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  int rd_mod (int x, int y)
> >  {
> >    int r = x % y;
> > @@ -169,6 +173,7 @@ int rd_mod (int x, int y)
> >    return r;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  int rd_div (int x, int y)
> >  {
> >    int r = x % y;
> > @@ -183,6 +188,7 @@ int rd_div (int x, int y)
> >    return q;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  unsigned int rd_umod (unsigned int x, unsigned int y)
> >  {
> >    unsigned int r = x % y;
> > @@ -191,6 +197,7 @@ unsigned int rd_umod (unsigned int x, unsigned int y)
> >    return r;
> >  }
> >
> > +__attribute__((optimize("O0")))
> >  unsigned int rd_udiv (unsigned int x, unsigned int y)
> >  {
> >    unsigned int r = x % y;
> > --
> > 2.51.0
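
For reference, a minimal sketch of the '#pragma GCC novector' idiom discussed above, as I understand it: the pragma goes directly in front of each loop that must stay scalar (here the setup and run-time check loops), so that only the loop under test can be counted by the "optimized: loop vectorized" scan. All names (N, in, out, ref_ceil_udiv, kernel, run_check) are hypothetical and are not taken from pr104116.h; the arithmetic is only an illustration, and this assumes a GCC recent enough to recognize the pragma.

```c
/* Hypothetical sketch; none of these names come from pr104116.h.  */
#define N 1024

unsigned int in[N];
unsigned int out[N];

/* Scalar reference for ceiling division (illustrative helper).  */
static unsigned int
ref_ceil_udiv (unsigned int x, unsigned int y)
{
  unsigned int q = x / y;
  if (x % y != 0)
    q++;
  return q;
}

/* The only loop we want the vectorizer to pick up.  */
void
kernel (void)
{
  for (int i = 0; i < N; i++)
    out[i] = in[i] / 8 + (in[i] % 8 != 0);
}

/* Fills the input, runs the kernel, and compares against the scalar
   reference.  Returns 0 on success, 1 on the first mismatch.  */
int
run_check (void)
{
  /* Keep the setup loop scalar so it does not add to the count.  */
#pragma GCC novector
  for (int i = 0; i < N; i++)
    in[i] = (unsigned int) i * 3u;

  kernel ();

  /* Keep the run-time check loop scalar as well.  */
#pragma GCC novector
  for (int i = 0; i < N; i++)
    if (out[i] != ref_ceil_udiv (in[i], 8))
      return 1;

  return 0;
}
```

Compared with __attribute__((optimize("O0"))) on the helper functions, this keeps the normal optimization level everywhere and only tells the vectorizer to skip the marked loops.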
