Andrew Stubbs writes:
> On 07/07/2020 12:03, Richard Sandiford wrote:
>> Andrew Stubbs writes:
>>> This patch implements a floating-point fold_left_plus vector pattern,
>>> which gives a significant speed-up in the BabelStream "dot" benchmark.
>>>
>>> The GCN architecture can't actually do an in-order vector reduction any
>>> more efficiently than that equivalent scalar algorithm, so this is a bit
>>> of a che
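
For readers unfamiliar with the term, here is a minimal sketch of why an
"in-order" (fold-left) floating-point reduction is a distinct operation from
a reordered one. It is not taken from the patch; the function names
fold_left_sum and pairwise_sum4 are hypothetical, and the sketch assumes IEEE
single precision and a build without -ffast-math or -fassociative-math, so
that the compiler keeps the additions in the order written.

```c
#include <stddef.h>

/* Strict left-to-right accumulation: ((x[0] + x[1]) + x[2]) + ...
   This is the ordering a fold-left reduction has to preserve.
   (Hypothetical helper, for illustration only.)  */
float
fold_left_sum (const float *x, size_t n)
{
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    acc += x[i];   /* each partial sum is rounded to float */
  return acc;
}

/* A reordered (tree-shaped) reduction over exactly four elements:
   (x[0] + x[1]) + (x[2] + x[3]).  This is the shape a vector unit
   handles naturally, but it may round differently.
   (Hypothetical helper, for illustration only.)  */
float
pairwise_sum4 (const float *x)
{
  float lo = x[0] + x[1];
  float hi = x[2] + x[3];
  return lo + hi;
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f}, the in-order sum is 1.0f while the
pairwise sum is 0.0f, because each 1.0f is lost to rounding whenever it is
added to a value of magnitude 1e8. That difference is why a compiler cannot
substitute a tree reduction for an in-order one unless reassociation has been
explicitly allowed.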