On Wed, Aug 5, 2020 at 3:30 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>
> This patch improves SLP performance in combination with some patches I
> have in development to add multiple vector sizes to amdgcn.
>
> The problem is that amdgcn's preferred vector size has 64 lanes, and SLP
> does not support lane masking. My patches will add smaller vector sizes
> (32, 16, 8, 4, 2) which make the lane masking implicit, but still SLP
> doesn't use them; it simply rejects the first size it sees and gives up.
>
> This patch detects the rejection early and looks to see if there is a
> smaller, more suitable vector size. The result is many more successful
> SLP testcases.
>
> OK to commit? (I have an x86_64 bootstrap and test in progress.)

Is this about basic-block SLP? There it should eventually split groups.
For loop based SLP did you specify the autovectorize_vector_modes
hook? Otherwise the vectorizer only tries a single size.

Richard.

> Andrew
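
To illustrate the point about the vectorizer trying only a single size, here is a toy model in C++. It is not GCC code and does not call the autovectorize_vector_modes hook; the function name pick_lanes and the "group must fill every lane" rule are simplifying assumptions made for this sketch. A vector mode is reduced to its lane count, and the candidates are tried in order, as a target-supplied list of modes would be.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy model only: pick_lanes is a hypothetical name, not a GCC internal.
// Each candidate "mode" is represented by nothing more than its lane count.
std::optional<unsigned> pick_lanes(const std::vector<unsigned>& candidates,
                                   unsigned group_size) {
    assert(group_size > 0);
    for (unsigned lanes : candidates) {
        // Assumed rule: with no lane masking, the SLP group has to fill
        // whole vectors, so its size must be a multiple of the lane count.
        if (lanes != 0 && lanes <= group_size && group_size % lanes == 0)
            return lanes;  // first candidate that fits wins
    }
    return std::nullopt;   // every candidate was rejected
}
```

With only the preferred size on offer, pick_lanes({64}, 8) returns an empty optional, which corresponds to the "rejects the first size and gives up" behaviour described in the quoted message. With the list {64, 32, 16, 8, 4, 2}, the same group of 8 selects the 8-lane candidate.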