On Mon, Apr 11, 2022 at 1:26 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>
> Hi all,
>
> I've been looking at implementing the complex multiply patterns for the
> amdgcn port, but I'm not getting the code I was hoping for. When I try
> to use the patterns on x86_64 or AArch64 they don't seem to work there
> either, so is there something wrong with the middle-end? I've tried both
> current HEAD and GCC 11.
>
> The example shown in the internals manual is a simple loop multiplying
> two arrays of complex numbers, and writing the results to a third. I had
> expected that it would use the largest vectorization factor available,
> with the real/imaginary numbers in even/odd lanes as described, but the
> vectorization factor is only 2 (so, a single complex number), and I have
> to set -fvect-cost-model=unlimited to get even that.
>
> I tried another example with SLP and that too uses the cmul patterns
> only for a single real/imaginary pair.
>
> Did proper vectorization of cmul ever really work? There is a case in
> the testsuite for the pattern match, but it isn't in a loop.

Check the vectorizer dump to see whether a complex pattern was
recognized.  Did you compile with -ffast-math?
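
For reference, a loop of the kind described above might look like the
following.  This is only a minimal sketch, not the attached testcase; the
function name cmul_arrays and the length N are made up for illustration.
Compiling it with something like
"gcc -O3 -ffast-math -fdump-tree-vect-details -c" writes a vect dump file
next to the object, which is where to look for whether a complex multiply
pattern was matched.

```c
#include <complex.h>

#define N 1024  /* hypothetical array length, chosen for illustration */

/* Element-wise complex multiply: c[i] = a[i] * b[i].
   The restrict qualifiers tell the compiler the arrays do not overlap,
   which removes one common obstacle to vectorization.  */
void cmul_arrays(float complex *restrict c,
                 const float complex *restrict a,
                 const float complex *restrict b)
{
    for (int i = 0; i < N; i++)
        c[i] = a[i] * b[i];
}
```

Without -ffast-math the multiply has to follow the full C rules for
complex arithmetic, including the NaN/infinity handling, so the plain
multiply pattern is less likely to apply.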

>
> Thanks
>
> Andrew
>
> P.S. I attached my testcase, in case I'm doing something stupid.
>
> P.P.S. The manual says the pattern is "cmulm4", etc., but it's actually
> "cmulm3" in the implementation.
