On 13/04/2023 15:00, Richard Biener wrote:
On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On 13/04/2023 11:01, Andrew Stubbs wrote:
Hi Andre,
I don't have a cascadelake device to test on, nor any knowledge about
what makes it different from regular x86_64.
Not sure you need one, but yeah I don't know either, it looks like it
fails because:
in-branch vector clones are not yet supported for integer mask modes.
A quick look tells me this is because mask_mode is not VOIDmode.
i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set
mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect
cascadelake is TARGET_AVX512F.
This is where I bail out as I really don't want to dive into the target
specific simd clone handling of x86 ;)
If the cascadelake device is supposed to work the same as other x86_64
devices for these vectors then the test has found a bug in the compiler
and you should be looking to fix that, not fudge the testcase.
Alternatively, if the device's capabilities really are different and the
tests should behave differently, then the actual expectations need to be
encoded in the dejagnu directives. If you can't tell the difference by
looking at the "x86_64*-*-*" target selector alone then the correct
solution is to invent a new "effective-target" selector. There are lots
of examples of using these throughout the testsuite (you could use
dg-require-effective-target to disable the whole testcase, or just use
the name in the scan-tree-dump-times directive to customise the
expectations), and the definitions can be found in the
lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are
fixed expressions and some run the compiler to probe the configuration,
but in this case you probably want to do something with "check-flags".
Even though I agree with you, I'm not the right person to do this
digging for such target specific stuff. So for now I'd probably suggest
xfailing this for avx512f.
For the unroll problem, you can probably tweak the optimization options
to disable that, the same as has been done for the epilogues feature
that had the same problem.
I mistaken the current behaviour for unrolling, it's actually because of
a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to
determine the vectype of a stmt. For a function like foo, that has the
same type (long long) everywhere this wouldn't be a problem, however,
because you transformed it into a MASK_CALL that has a function pointer
(which is 32-bit in -m32) that now becomes the 'smallest' type.
This is all a red-herring though, I don't think we should be calling
this function for potential simdclone calls as the type on which the
veclen is not necessarily the 'smallest' type. And some arguments (like
uniform and linear) should be ignored anyway as they won't be mapped to
vectors. So I do think this might have been broken even before your
changes, but needs further investigation.
Since these are new tests for a new feature, I don't really understand
why this is classed as a regression?
Andrew
P.S. there was a commit to these tests in the last few days, so make
sure you pull that before making changes.
The latest commit to these tests was mine, it's the one Haochen is
reporting this regression against. My commit was to fix the issue richi
had introduced that was preventing the feature you introduced from
working. The reason nobody noticed was because the tests you introduced
didn't actually test your feature, since you didn't specify 'inbranch'
the omp declare simd pragma was allowing the use of not-inbranch simd
clones and the vectorizer was being smart enough to circumvent the
conditional and was still able to use simdclones (non inbranch ones) so
when the inbranch stopped working, the test didn't notice.
The other changes to this test were already after the fix for 108888
that broke the inbranch feature you added, and so it was fixing a
cascadelake testism but for the not-inbranch simdclones. So basically
fixing a testism of a testism :/
I am working on simdclone's for AArch64 for next Stage 1 so I don't mind
looking at the issue with the vectype being chosen wrongly, as for the
other x86 specific testisms I'll leave them to someone else.
Btw, the new testsuite FAILs could be just epiloge vectorizations, so
maybe try the usual --param vect-epilogues-nomask=0 ...
It already has those, Jakub added them.
But that's not it, I've been looking at it, and there is code in place
that does what I expected which is defer the choice of vectype for simd
clones until vectorizable_simd_clone_call, unfortunately it has a
mistaken assumption that simdclones don't return :/
see vect_get_vector_types_for_stmt:
...
if (gimple_get_lhs (stmt) == NULL_TREE
/* MASK_STORE has no lhs, but is ok. */
&& !gimple_call_internal_p (stmt, IFN_MASK_STORE))
{
if (is_a <gcall *> (stmt))
{
/* Ignore calls with no lhs. These must be calls to
#pragma omp simd functions, and what vectorization factor
it really needs can't be determined until
vectorizable_simd_clone_call. */
if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
"defer to SIMD clone analysis.\n");
return opt_result::success ();
}
return opt_result::failure_at (stmt,
"not vectorized: irregular
stmt.%G", stmt);
}
...
I'm working on a patch.
Kind Regards,
Andre