Re: [committed 0/6] amdgcn: Add V32, V16, V8, V4, and V2 vectors

Andrew Stubbs Tue, 11 Oct 2022 04:53:51 -0700

On 11/10/2022 12:29, Richard Biener wrote:

On Tue, Oct 11, 2022 at 1:03 PM Andrew Stubbs <[email protected]> wrote:


This patch series adds additional vector sizes for the amdgcn backend.

The hardware supports arbitrary vector lengths of up to 64 lanes via
masking, but GCC cannot (yet) make full use of them due to middle-end
limitations.  Adding smaller "virtual" vector sizes increases the
complexity of the backend a little, but opens up more optimization
opportunities for the current middle-end implementation.  In
particular, it enables many more cases of SLP optimization.
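
To illustrate what "via masking" means for a smaller virtual vector, here is a minimal sketch of a lane mask that enables only the lowest N of the 64 lanes. The helper name lane_mask is hypothetical and is not taken from the backend, and the sketch assumes that a virtual vector occupies the lowest lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the amdgcn backend: build a 64-bit
   mask with one bit per lane, enabling the lowest N lanes (1 <= N <= 64).
   Assumption: a "virtual" vector of N lanes uses lanes 0 .. N-1.  */
static uint64_t
lane_mask (unsigned n)
{
  assert (n >= 1 && n <= 64);
  /* Shifting a 64-bit value by 64 is undefined in C, so treat the
     full-width case separately.  */
  return n == 64 ? UINT64_MAX : (UINT64_C (1) << n) - 1;
}
```

Under these assumptions a V2 vector would run with mask 0x3 and a V32 vector with mask 0xffffffff, while the full 64-lane case enables every bit.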

The patchset gives approximately 100 additional test PASSes and a few
extra FAILs.  However, the failures are not new issues, but rather
existing problems that did not show up because the code did not
previously vectorize.  Expanding the testcase to allow 64-lane vectors
shows the same problems there.

I shall backport these patches to the OG12 branch shortly.


I suppose until you change the related_vector_mode hook the PR107096 issue
will not hit you but at least it's then latent ...


How do you mean, change it?

static opt_machine_mode
gcn_related_vector_mode (machine_mode vector_mode,
                         scalar_mode element_mode, poly_uint64 nunits)
{
  int n = nunits.to_constant ();

  if (n == 0)
    n = GET_MODE_NUNITS (vector_mode);

  return VnMODE (n, element_mode);
}

It returns what it's asked for, always matching the number of lanes (not
the bitsize), which is most likely the most natural choice for GCN.
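
The difference between matching lanes and matching bitsize can be sketched with a toy model. Everything below (struct vmode, enum elem, related_by_lanes, related_by_size) is a hypothetical stand-in for illustration, not GCC's machine_mode API; the size-preserving variant is shown only for contrast.

```c
#include <assert.h>

/* Toy stand-ins for machine modes; the enum value is the element size
   in bytes.  All names here are hypothetical.  */
enum elem { ELEM_QI = 1, ELEM_HI = 2, ELEM_SI = 4, ELEM_DI = 8 };

struct vmode { int lanes; enum elem elem; };

/* Lane-preserving policy, mirroring the shape of the hook quoted above:
   a request for 0 lanes means "same lane count as the input mode".  */
static struct vmode
related_by_lanes (struct vmode v, enum elem e, int nunits)
{
  struct vmode r = { nunits ? nunits : v.lanes, e };
  return r;
}

/* Size-preserving policy, for contrast only: a request for 0 lanes
   means "same total size in bytes as the input mode".  */
static struct vmode
related_by_size (struct vmode v, enum elem e, int nunits)
{
  int bytes = v.lanes * (int) v.elem;
  assert (bytes % (int) e == 0);
  struct vmode r = { nunits ? nunits : bytes / (int) e, e };
  return r;
}
```

In this model, asking for a DI-element mode related to a 64-lane SI vector gives 64 lanes under the lane-preserving policy and 32 lanes under the size-preserving one.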


Andrew
