Re: [committed] amdgcn: Add fold_left_plus vector reductions

2020-07-09 Thread Richard Sandiford
Andrew Stubbs writes:
> On 07/07/2020 12:03, Richard Sandiford wrote:
>> Andrew Stubbs writes:
>>> This patch implements a floating-point fold_left_plus vector pattern,
>>> which gives a significant speed-up in the BabelStream "dot" benchmark.
>>>
>>> The GCN architecture can't actually do an in-

Re: [committed] amdgcn: Add fold_left_plus vector reductions

2020-07-07 Thread Andrew Stubbs
On 07/07/2020 12:03, Richard Sandiford wrote:
> Andrew Stubbs writes:
>> This patch implements a floating-point fold_left_plus vector pattern,
>> which gives a significant speed-up in the BabelStream "dot" benchmark.
>>
>> The GCN architecture can't actually do an in-order vector reduction any
>> more efficien

Re: [committed] amdgcn: Add fold_left_plus vector reductions

2020-07-07 Thread Richard Sandiford
Andrew Stubbs writes:
> This patch implements a floating-point fold_left_plus vector pattern,
> which gives a significant speed-up in the BabelStream "dot" benchmark.
>
> The GCN architecture can't actually do an in-order vector reduction any
> more efficiently than that equivalent scalar algori

[committed] amdgcn: Add fold_left_plus vector reductions

2020-07-03 Thread Andrew Stubbs
This patch implements a floating-point fold_left_plus vector pattern,
which gives a significant speed-up in the BabelStream "dot" benchmark.

The GCN architecture can't actually do an in-order vector reduction any
more efficiently than that equivalent scalar algorithm, so this is a bit
of a che
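
For context on why an "in-order" reduction is treated as its own case, here
is a minimal C sketch of the scalar semantics. It is not code from the
patch: the function name fold_left_plus_scalar, the helper tree_plus, and
the sample values are all assumptions made for illustration. The first
function adds the elements strictly left to right, starting from an
initial scalar; the second reassociates the additions into a tree, which
is what an unordered vector reduction is free to do.

```c
#include <stddef.h>

/* Hypothetical illustration, not taken from the patch.
   In-order (fold-left) sum: (((init + v[0]) + v[1]) + v[2]) + ...
   No reassociation is performed.  */
float fold_left_plus_scalar(float init, const float *v, size_t n)
{
    float acc = init;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Hypothetical illustration, not taken from the patch.
   Pairwise (tree) sum: splits the range in half and adds the two
   partial sums, so the additions are grouped differently.  */
float tree_plus(const float *v, size_t n)
{
    if (n == 0)
        return 0.0f;
    if (n == 1)
        return v[0];
    size_t half = n / 2;
    return tree_plus(v, half) + tree_plus(v + half, n - half);
}
```

With IEEE 754 single precision and no value-changing optimizations (for
example, without -ffast-math), the input {1e8f, 1.0f, -1e8f, 1.0f} is
expected to sum to 1.0f in order but to 0.0f with the tree grouping,
because 1.0f is lost whenever it is added to a value of magnitude 1e8.
That difference is why a compiler cannot substitute one form for the
other unless reassociation has been explicitly allowed.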