On Fri, 16 Feb 2024, Richard Henderson wrote:

> Split less-than and greater-than 256 cases.
> Use unaligned accesses for head and tail.
> Avoid using out-of-bounds pointers in loop boundary conditions.

It looks like this version did not carry

  typedef uint64_t uint64_a __attribute__((may_alias));

along the way. That is not a big deal, since QEMU builds with
-fno-strict-aliasing, but I felt it was nice to be explicit about it in
the code.
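
To illustrate what I mean, here is a minimal standalone sketch of how
such a typedef is typically used. The helper name words_are_zero is
hypothetical and not part of the patch, and the attribute is a GCC/Clang
extension, so treat this as an assumption-laden example rather than the
code I am asking for:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: loads through uint64_a may alias any object. */
typedef uint64_t uint64_a __attribute__((may_alias));

/*
 * Hypothetical helper, not from the patch: OR together n 64-bit words
 * starting at the 8-byte-aligned address p and report whether all of
 * them are zero.
 */
static bool words_are_zero(const void *p, size_t n)
{
    const uint64_a *w = p;
    uint64_t t = 0;

    for (size_t i = 0; i < n; i++) {
        t |= w[i];
    }
    return t == 0;
}
```

With the typedef in place, the loads stay well-defined even if someone
builds the file without -fno-strict-aliasing.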

Am I expected to give you Reviewed-by tags? I did read the code to the
best of my ability and did not spot any issues.

Copies of the old comment will need updating:

> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
>  util/bufferiszero.c | 86 +++++++++++++++++++++++++++------------------
>  1 file changed, 52 insertions(+), 34 deletions(-)
> 
> diff --git a/util/bufferiszero.c b/util/bufferiszero.c
> index 02df82b4ff..a904b747c7 100644
> --- a/util/bufferiszero.c
> +++ b/util/bufferiszero.c
> @@ -28,40 +28,58 @@
>  
>  static bool (*buffer_is_zero_accel)(const void *, size_t);
>  
> -static bool buffer_is_zero_integer(const void *buf, size_t len)
> +static bool buffer_is_zero_int_lt256(const void *buf, size_t len)
>  {
[snip]
> +    /*
> +     * Use unaligned memory access functions to handle
> +     * the beginning and end of the buffer, with a couple
> +     * of loops handling the middle aligned section.
> +     */

... here, there is only one loop now, not two,

> +    if (unlikely(len <= 8)) {
> +        return (ldl_he_p(buf) | ldl_he_p(buf + len - 4)) == 0;
>      }
> +
> +    t = ldq_he_p(buf) | ldq_he_p(buf + len - 8);
> +    p = QEMU_ALIGN_PTR_DOWN(buf + 8, 8);
> +    e = QEMU_ALIGN_PTR_DOWN(buf + len - 1, 8);
> +
> +    while (p < e) {
> +        t |= *p++;
> +    }
> +    return t == 0;
> +}
> +
> +static bool buffer_is_zero_int_ge256(const void *buf, size_t len)
> +{
> +    /*
> +     * Use unaligned memory access functions to handle
> +     * the beginning and end of the buffer, with a couple
> +     * of loops handling the middle aligned section.
> +     */

... and likewise here.
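
For reference, the shape I have in mind when I say "one loop" can be
sketched standalone as below. This is only my reading of the lt256
structure, not the patch itself: load64 is a hypothetical memcpy-based
stand-in for ldq_he_p, the alignment is done with plain uintptr_t
arithmetic instead of QEMU_ALIGN_PTR_DOWN, and the sketch assumes
len >= 8:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t uint64_a __attribute__((may_alias));

/* Hypothetical stand-in for ldq_he_p(): unaligned host-endian load. */
static uint64_t load64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Sketch only; the caller must guarantee len >= 8. */
static bool is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    /* Head and tail: two unaligned loads that may overlap. */
    uint64_t t = load64(b) | load64(b + len - 8);
    /* First 8-byte boundary strictly above b (at most b + 8). */
    const uint64_a *p =
        (const uint64_a *)(((uintptr_t)b + 8) & ~(uintptr_t)7);
    /* Last 8-byte boundary at or below the final byte. */
    const uint64_a *e =
        (const uint64_a *)(((uintptr_t)b + len - 1) & ~(uintptr_t)7);

    /* The single loop: aligned words between head and tail. */
    while (p < e) {
        t |= *p++;
    }
    return t == 0;
}
```

Both p and e stay inside the buffer, and the bytes from e up to the end
are already covered by the tail load, so one aligned loop is enough.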

Alexander
