On Mon, Apr 11, 2022 at 1:26 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>
> Hi all,
>
> I've been looking at implementing the complex multiply patterns for the
> amdgcn port, but I'm not getting the code I was hoping for. When I try
> to use the patterns on x86_64 or AArch64 they don't seem to work there
> either, so is there something wrong with the middle-end? I've tried both
> current HEAD and GCC 11.
>
> The example shown in the internals manual is a simple loop multiplying
> two arrays of complex numbers, and writing the results to a third. I had
> expected that it would use the largest vectorization factor available,
> with the real/imaginary numbers in even/odd lanes as described, but the
> vectorization factor is only 2 (so, a single complex number), and I have
> to set -fvect-cost-model=unlimited to get even that.
>
> I tried another example with SLP and that too uses the cmul patterns
> only for a single real/imaginary pair.
>
> Did proper vectorization of cmul ever really work? There is a case in
> the testsuite for the pattern match, but it isn't in a loop.

You need to check the vectorizer dump to see whether a complex pattern
was recognized or not. Did you compile with -ffast-math?

>
> Thanks
>
> Andrew
>
> P.S. I attached my testcase, in case I'm doing something stupid.
>
> P.P.S. The manual says the pattern is "cmulm4", etc., but it's actually
> "cmulm3" in the implementation.
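
For reference, here is a minimal sketch of the kind of loop being discussed, with one way to produce the vectorizer dump. This is not the attached testcase (which is not reproduced here); the function name cmul_loop and the exact command line in the comment are my own assumptions for illustration.

```c
#include <complex.h>

/*
 * Hypothetical example, not the testcase attached to the original mail.
 *
 * One possible way to compile it and get a vectorizer dump:
 *
 *   gcc -O3 -ffast-math -fdump-tree-vect-details -c cmul.c
 *
 * The resulting *.vect dump file can then be searched for mentions of a
 * complex multiply pattern to see whether one was recognized.
 */

/* Element-wise complex multiply: out[i] = x[i] * y[i]. */
void cmul_loop(float complex *restrict out,
               const float complex *restrict x,
               const float complex *restrict y,
               int n)
{
    for (int i = 0; i < n; i++)
        out[i] = x[i] * y[i];
}
```

Without -ffast-math the multiplication has to follow the C rules for infinities and NaNs, which is why that flag is the first thing to check when the pattern does not show up in the dump.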