https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
--- Comment #29 from sergey.shalnov at intel dot com ---
Richard,
Thank you for your latest patch. I would like to clarify
the multiple_p() function usage in if() clause.
First of all, I assume that architectures with fixed
size of HW registers (x86) should go to if(){} branch,
but arch with unknown registers size should go to else{}.
This is because is_constant() function used.
The problem is in multiple_p() function. In the test case
we have group_size=16 and const_nunits=8.
So, the multiple_p(16, 8) returns 1 and with
–march=skylake-avx512 (or –march=znver1) we go to
else{} branch which is not correct.
I used following change in your patch and test works as I expect.
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1923,7 +1923,7 @@ vect_analyze_slp_cost_1 (slp_instance instance, slp_tree
node,
unsigned HOST_WIDE_INT const_nunits;
unsigned nelt_limit;
if (TYPE_VECTOR_SUBPARTS (vectype).is_constant (&const_nunits)
- && ! multiple_p (group_size, const_nunits))
+ && multiple_p (group_size, const_nunits))
{
num_vects_to_check = ncopies_for_cost * const_nunits /
group_size;
nelt_limit = const_nunits;
--
Other thing here is what we should do if group_size is, for example, 17.
In this case (after my changes) wrong else{} branch will be taken.
I would propose to change this particular if() in following way.
if (TYPE_VECTOR_SUBPARTS (vectype).is_constant (&const_nunits))
{
If(multiple_p (group_size, const_nunits))
{
num_vects_to_check = ncopies_for_cost * const_nunits /
group_size;
nelt_limit = const_nunits;
}
else
{
...not clear what we should have here...
}
}
else
{
num_vects_to_check = 1;
nelt_limit = group_size;
}
Or perhaps multiple_p() should not be here at all?
Sergey