Peter Lieven <p...@kamp.de> wrote:
> this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
> optimized function that searches for non-zero content in a
> buffer.
>
> the function starts full unrolling only after the first few chunks have
> been checked one by one. analyzing real memory page data has revealed
> that non-zero pages are non-zero within the first 256-512 bits in
> most cases. as this function is also heavily used to check for zero memory
> pages this tweak has been made to avoid the high setup costs of the fully
> unrolled check for non-zero pages.
>
> due to the optimizations used in the function there are restrictions
> on buffer address and search length. the function
> can_use_buffer_find_nonzero_content() can be used to check if
> the function can be used safely.
>
> Signed-off-by: Peter Lieven <p...@kamp.de>
> ---
>  include/qemu-common.h |   13 ++++++++++++
>  util/cutils.c         |   55 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+)
>
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 9022646..7c7c244 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -472,4 +472,17 @@ void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
>  #define ALL_EQ(v1, v2) ((v1) == (v2))
>  #endif
>
> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
> +static inline bool
> +can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
> +{
> +    if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> +                * sizeof(VECTYPE)) == 0
> +            && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
> +        return true;
> +    }
> +    return false;
> +}

This can be spelled as:

    return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
                   * sizeof(VECTYPE)) == 0
            && ((uintptr_t) buf) % sizeof(VECTYPE) == 0);

But I don't care too much.
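
For anyone who wants to convince themselves the two spellings agree, here is a minimal standalone sketch. It is not the QEMU code: vectype_sketch is an assumed 16-byte stand-in for VECTYPE (which is defined elsewhere in qemu-common.h), and UNROLL_FACTOR_SKETCH, can_use_branching() and can_use_direct() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 16-byte stand-in for VECTYPE (the size of an SSE2 or
 * Altivec vector). The real type comes from qemu-common.h. */
typedef struct {
    unsigned char bytes[16];
} vectype_sketch;

/* Hypothetical name mirroring BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR. */
#define UNROLL_FACTOR_SKETCH 8

/* Branching form, shaped like the one in the quoted patch. */
static inline bool can_use_branching(const void *buf, size_t len)
{
    if (len % (UNROLL_FACTOR_SKETCH * sizeof(vectype_sketch)) == 0
        && ((uintptr_t) buf) % sizeof(vectype_sketch) == 0) {
        return true;
    }
    return false;
}

/* Single-return form, shaped like the one suggested in the review.
 * The && expression already yields 0 or 1, so no branch is needed. */
static inline bool can_use_direct(const void *buf, size_t len)
{
    return (len % (UNROLL_FACTOR_SKETCH * sizeof(vectype_sketch)) == 0
            && ((uintptr_t) buf) % sizeof(vectype_sketch) == 0);
}
```

With these assumed sizes, both helpers accept a buffer only when its address is a multiple of 16 and its length is a multiple of 128 (8 * 16) bytes. Neither helper dereferences buf, so they can be exercised with any pointer value.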