On Fri, 16 Feb 2024, Richard Henderson wrote:
> Split less-than and greater-than 256 cases.
> Use unaligned accesses for head and tail.
> Avoid using out-of-bounds pointers in loop boundary conditions.

I guess it did not carry

  typedef uint64_t uint64_a __attribute__((may_alias));

along the way, not a big deal since Qemu builds with -fno-strict-aliasing,
but I felt it was nice to be explicit in the code about that.

Am I expected to give Reviewed-by's to you? I did read the code to the best
of my ability and did not spot any issues.

Copies of the old comment will need updating:

> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
>  util/bufferiszero.c | 86 +++++++++++++++++++++++++++------------------
>  1 file changed, 52 insertions(+), 34 deletions(-)
>
> diff --git a/util/bufferiszero.c b/util/bufferiszero.c
> index 02df82b4ff..a904b747c7 100644
> --- a/util/bufferiszero.c
> +++ b/util/bufferiszero.c
> @@ -28,40 +28,58 @@
>
>  static bool (*buffer_is_zero_accel)(const void *, size_t);
>
> -static bool buffer_is_zero_integer(const void *buf, size_t len)
> +static bool buffer_is_zero_int_lt256(const void *buf, size_t len)
>  {
[snip]
> +    /*
> +     * Use unaligned memory access functions to handle
> +     * the beginning and end of the buffer, with a couple
> +     * of loops handling the middle aligned section.
> +     */

... here, there is only one loop now, not two,

> +    if (unlikely(len <= 8)) {
> +        return (ldl_he_p(buf) | ldl_he_p(buf + len - 4)) == 0;
>      }
> +
> +    t = ldq_he_p(buf) | ldq_he_p(buf + len - 8);
> +    p = QEMU_ALIGN_PTR_DOWN(buf + 8, 8);
> +    e = QEMU_ALIGN_PTR_DOWN(buf + len - 1, 8);
> +
> +    while (p < e) {
> +        t |= *p++;
> +    }
> +    return t == 0;
> +}
> +
> +static bool buffer_is_zero_int_ge256(const void *buf, size_t len)
> +{
> +    /*
> +     * Use unaligned memory access functions to handle
> +     * the beginning and end of the buffer, with a couple
> +     * of loops handling the middle aligned section.
> +     */

... and likewise here.

Alexander
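
P.S. To make the may_alias remark and the head/tail scheme concrete, here is
a minimal standalone sketch. It is not the code from the patch: load32() and
load64() are hypothetical memcpy-based stand-ins for ldl_he_p()/ldq_he_p(),
the pointer masking stands in for QEMU_ALIGN_PTR_DOWN, sketch_is_zero() is a
made-up name, and it assumes len >= 4 and a GCC- or Clang-compatible compiler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: loads through this type may alias any object. */
typedef uint64_t uint64_a __attribute__((may_alias));

/* Hypothetical stand-ins for QEMU's ldl_he_p()/ldq_he_p(). */
static uint32_t load32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));   /* unaligned-safe host-endian load */
    return v;
}

static uint64_t load64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Illustration only; the caller must guarantee len >= 4. */
static bool sketch_is_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    const uint64_a *p, *e;
    uint64_t t;

    if (len <= 8) {
        /* Two possibly overlapping 4-byte loads cover 4..8 bytes. */
        return (load32(b) | load32(b + len - 4)) == 0;
    }

    /* Unaligned head and tail, 8 bytes each (they may overlap). */
    t = load64(b) | load64(b + len - 8);

    /* Aligned middle: both bounds stay inside [b, b + len). */
    p = (const uint64_a *)(((uintptr_t)b + 8) & ~(uintptr_t)7);
    e = (const uint64_a *)(((uintptr_t)b + len - 1) & ~(uintptr_t)7);

    /* A single loop over the aligned words between head and tail. */
    while (p < e) {
        t |= *p++;
    }
    return t == 0;
}
```

With the typedef in place, the aligned loads in the loop stay well-defined
even when building without -fno-strict-aliasing, which is the reason I would
have liked to keep it.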