On Wed, Nov 06, 2024 at 12:16:33PM +1300, David Rowley wrote:
> On Wed, 6 Nov 2024 at 04:03, Bertrand Drouvot
> <bertranddrouvot...@gmail.com> wrote:
>> Another option could be to use SIMD instructions to check that multiple
>> bytes are zero in a single operation. Maybe just an idea to keep in mind
>> and experiment with if we feel the need later on.
> 
> Could do. I just wrote it that way to give the compiler flexibility to
> do SIMD implicitly. That seemed easier than messing around with SIMD
> intrinsics. I guess the compiler won't use SIMD with the single
> size_t-at-a-time version as it can't be certain it's ok to access the
> memory beyond the first non-zero word. Because I wrote the "if" condition
> using bitwise-OR, there's no boolean short-circuiting, so the compiler
> sees it must be safe to access all the memory for the loop iteration.
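
To make the quoted point concrete, here is a minimal sketch of the two loop
shapes being compared. It is not the code from the patch: the function names
all_zeros_one_word() and all_zeros_four_words() are hypothetical, and the
choice of four words per iteration is an assumption for illustration. Whether
a given compiler actually vectorizes the second loop depends on the compiler
and its flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: one size_t per iteration.  The early return after
 * each word means the compiler cannot assume the following words are safe
 * to read, which limits what it can vectorize on its own.
 */
static bool
all_zeros_one_word(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;
    const unsigned char *end = p + len;
    size_t      w;

    for (; (size_t) (end - p) >= sizeof(w); p += sizeof(w))
    {
        memcpy(&w, p, sizeof(w));   /* alignment-safe load */
        if (w != 0)
            return false;
    }

    /* remaining tail bytes */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}

/*
 * Hypothetical helper: four size_t words combined with bitwise-OR.  There
 * is no short-circuit between the four words, so every word of the
 * iteration is read unconditionally before the single branch.
 */
static bool
all_zeros_four_words(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;
    const unsigned char *end = p + len;
    size_t      w[4];

    for (; (size_t) (end - p) >= sizeof(w); p += sizeof(w))
    {
        memcpy(w, p, sizeof(w));    /* alignment-safe load of four words */
        if ((w[0] | w[1] | w[2] | w[3]) != 0)
            return false;
    }

    /* remaining tail bytes */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}
```

Both functions return the same answer for any input; only the amount of
memory read per branch differs.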

How complex would that be compared to the latest patch proposed if
done this way?  If you can force SIMD without having to know about
these specific gcc switches (-march is not set by default in the
tree, except for some armv8 paths), then the performance gain comes
for free.  If that makes the code more readable, that's even better.
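
For a rough sense of the complexity, here is a sketch of what an explicit
version could look like with SSE2 intrinsics, which are part of the x86-64
baseline and so need no -march switch there.  This is only an illustration
under assumptions, not the proposed patch: the name all_zeros_sse2() is
hypothetical, the code uses raw <emmintrin.h> intrinsics rather than any
in-tree abstraction, and on targets without SSE2 it simply falls back to a
byte loop.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: check 16 bytes per iteration with SSE2 where it is
 * available, then finish (or do everything) with a plain byte loop.
 */
static bool
all_zeros_sse2(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;
    const unsigned char *end = p + len;

#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();

    for (; (size_t) (end - p) >= 16; p += 16)
    {
        /* unaligned 16-byte load */
        __m128i     chunk = _mm_loadu_si128((const __m128i *) p);

        /* each byte lane becomes 0xFF where the input byte equals zero */
        __m128i     eq = _mm_cmpeq_epi8(chunk, zero);

        /* one mask bit per lane; all 16 bits set means all bytes were zero */
        if (_mm_movemask_epi8(eq) != 0xFFFF)
            return false;
    }
#endif

    /* tail bytes, or the whole buffer when SSE2 is not available */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}
```

The vector loop itself is short, but it needs a per-architecture #if and a
scalar fallback, which is the readability cost to weigh against letting the
compiler vectorize a plain loop.
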
--
Michael
