On Wed, Aug 5, 2020 at 3:30 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>
> This patch improves SLP performance in combination with some patches I
> have in development to add multiple vector sizes to amdgcn.
>
> The problem is that amdgcn's preferred vector size has 64 lanes, and SLP
> does not support lane masking.  My patches will add smaller vector sizes
> (32, 16, 8, 4, 2), which make the lane masking implicit, but SLP still
> doesn't use them; it simply rejects the first size it sees and gives up.
>
> This patch detects the rejection early and looks to see if there is a
> smaller, more suitable vector size.  The result is many more successful
> SLP testcases.
>
> OK to commit? (I have an x86_64 bootstrap and test in progress.)

Is this about basic-block SLP?  There it should eventually split groups.
For loop-based SLP, did you specify the autovectorize_vector_modes
hook?  Otherwise the vectorizer only tries a single size.
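
To make the "try more than one size" idea concrete, here is a toy,
standalone sketch.  It is not GCC code and does not use the hook's real
interface: candidate_lanes and pick_lanes are hypothetical names, and the
rule "the lane count must evenly divide the group size" is only a
simplifying assumption about what working without lane masking requires.

```cpp
#include <optional>
#include <vector>

// Hypothetical candidate lane counts, widest first.  The sizes come from
// the discussion above; the ordering is an assumption.
static const std::vector<unsigned> candidate_lanes = {64, 32, 16, 8, 4, 2};

// Return the widest candidate whose lane count evenly divides the SLP
// group size, so that no lanes would need to be masked off.
// Returns std::nullopt when no candidate fits.
std::optional<unsigned> pick_lanes(unsigned group_size)
{
    if (group_size == 0)
        return std::nullopt;

    for (unsigned lanes : candidate_lanes)
        if (group_size % lanes == 0)
            return lanes;   // first fit is the widest usable size

    return std::nullopt;    // e.g. an odd-sized group
}
```

With only the 64-lane entry in the list, a group of 4 would be rejected
outright; with the smaller sizes listed, the same group gets 4 lanes.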

Richard.

> Andrew
