On Wed, Nov 06, 2024 at 03:27:20PM +, Andrew Stubbs wrote:
> The chunk size for SIMD loops should be right for the current device; too big
> allocates too much memory, too small is inefficient. Getting it wrong doesn't
> actually break anything though.
>
> This patch attempts to choose the optimal setting based on the context. Both
> host-fallback and device will g