On Tue, 2025-09-23 at 15:56 +0200, Richard Biener wrote:
> On Tue, 23 Sep 2025, Avinash Jayakar wrote:
>
> > Hi,
> >
> > I had a question regarding the function vect_pattern_recog that is
> > triggered in the slp/vectorization pass.
> > In case the original code is already in vector form, for example
> > below
> > is the original gimple dump of a vector function
> >
> >
> > ;; Function lshift1_64 (null)
> > ;; enabled by -tree-original
> >
> >
> > {
> > __vector unsigned long long D.4059 = { 2, 2 };
> >
> > return <<< Unknown tree: compound_literal_expr
> > __vector unsigned long long D.4059 = { 2, 2 }; >>> * a;
> > }
> >
> >
> > This however does not go through the pattern recognition since
> > 1. It is already in vector form
> > 2. This check fails in slp pass
> > if (bb_vinfo->grouped_stores.is_empty ()
> > && bb_vinfo->roots.is_empty ())
> >
> > So my main question is: if this line of code (in this example,
> > v1 * {2, 2}) did go through pattern recognition, it could have
> > generated better code such as v1 + v1 or v1 << {1,1}, which could
> > allow vectorization on targets that do not have native doubleword
> > multiplication support but do have doubleword shift or add.
> > Is this intended, or is there a way to have this code pattern
> > recognized?
>
> The vectorizer is currently not set up for re-vectorizing already
> vectorized code. Instead for the situation you describe, a target
> without vector multiplication support, it would be the task of the
> vector lowering pass (tree-vect-generic.cc) to turn this into a
> supported operation.
>
I looked into tree-vect-generic.cc, specifically the
expand_vector_operations_1 function.
I encountered this while fixing PR119702. If I write simple scalar
code like this:
void lshift1_64(uint64_t *a) {
  a[0] *= 2;
  a[1] *= 2;
}
it does get vectorized as a << {1,1}. But writing the vector form directly:
vector unsigned long long
lshift1_64 (vector unsigned long long a, vector unsigned long long b)
{
  return a * (vector unsigned long long) { 2, 2 };
}
gets lowered to scalar code during the veclower pass.
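
For reference, here is a minimal sketch of the three forms side by side. It assumes GNU C generic vector extensions (vector_size) rather than the AltiVec "vector" keyword, so that it builds on any target; the names v2du, mul2, shl1, add_self and same_v2du are made up for illustration and are not from GCC.

```c
#include <stdint.h>

/* Two 64-bit unsigned lanes (GNU C generic vector extension).
   The typedef name is illustrative only.  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* The form that currently gets lowered to scalar code.  */
v2du mul2 (v2du a) { return a * (v2du) { 2, 2 }; }

/* Equivalent forms that only need a vector shift or a vector add.  */
v2du shl1 (v2du a) { return a << (v2du) { 1, 1 }; }
v2du add_self (v2du a) { return a + a; }

/* Lane-wise equality check, returns 1 if both lanes match.  */
int same_v2du (v2du x, v2du y)
{
  return x[0] == y[0] && x[1] == y[1];
}
```

All three functions compute the same value modulo 2^64 in each lane, so replacing one with another is safe for unsigned element types.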
I see two ways of fixing this for multiply expressions:
1. In expand_vector_operations_1, before lowering to scalar code, check
whether the code is MULT_EXPR and whether it can be implemented with
shifts/adds/subs (as done in pattern recognition), similar to what is
already done for LROTATE_EXPR and RROTATE_EXPR.
2. Implement mulv2di3 for this specific target (doing exactly what the
scalar code would do), and let the expand pass (expand_mult) take care
of converting the multiplication into shifts/adds/subs.
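
To make the idea behind both options concrete, here is a rough sketch of rewriting a multiplication by a constant using only vector shifts and adds. This is a deliberately simplified illustration (one shift and add per set bit of the constant), not the algorithm GCC uses, which also considers subtraction and cost; the names v2du and mul_by_const_shift_add are hypothetical, and the code assumes GNU C generic vector extensions.

```c
#include <stdint.h>

/* Two 64-bit unsigned lanes (GNU C generic vector extension).
   The typedef name is illustrative only.  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* Hypothetical helper: multiply each lane of A by the constant C
   using only vector shifts and vector adds.  For every set bit in C
   it adds (A << bit) to the accumulator, so the result equals A * C
   modulo 2^64 in each lane.  */
v2du mul_by_const_shift_add (v2du a, uint64_t c)
{
  v2du acc = { 0, 0 };
  for (unsigned bit = 0; bit < 64; bit++)
    if (c & (UINT64_C (1) << bit))
      {
        v2du count = { bit, bit };
        acc += a << count;
      }
  return acc;
}
```

For c = 2 this degenerates to a single a << {1,1}, which is the case from the example above.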
Thanks and regards,
Avinash Jayakar