Il 11/03/2013 15:22, Peter Lieven ha scritto: > > Am 11.03.2013 um 15:14 schrieb Paolo Bonzini <pbonz...@redhat.com>: > >> Il 11/03/2013 14:44, Peter Lieven ha scritto: >>> Hi, >>> >>> I ever since had a few VMs which are very hard to migrate because of a >>> lot of memory I/O. I found that finding the next dirty bit >>> seemed to be one of the culprits (apart from removing locking which >>> Paolo is working on). >>> >>> I have to following proposal which seems to help a lot in my case. Just >>> wanted to have some feedback. >>> I applied the same unrolling idea like in buffer_is_zero(). >>> >>> Peter >>> >>> --- a/util/bitops.c >>> +++ b/util/bitops.c >>> @@ -24,12 +24,13 @@ unsigned long find_next_bit(const unsigned long >>> *addr, unsigned long size, >>> const unsigned long *p = addr + BITOP_WORD(offset); >>> unsigned long result = offset & ~(BITS_PER_LONG-1); >>> unsigned long tmp; >>> + unsigned long d0,d1,d2,d3; >>> >>> if (offset >= size) { >>> return size; >>> } >>> size -= result; >>> - offset %= BITS_PER_LONG; >>> + offset &= (BITS_PER_LONG-1); >>> if (offset) { >>> tmp = *(p++); >>> tmp &= (~0UL << offset); >>> @@ -43,6 +44,18 @@ unsigned long find_next_bit(const unsigned long >>> *addr, unsigned long size, >>> result += BITS_PER_LONG; >>> } >>> while (size & ~(BITS_PER_LONG-1)) { >>> + while (!(size & (4*BITS_PER_LONG-1))) { >> >> This really means >> >> if (!(size & (4*BITS_PER_LONG-1))) { >> while (1) { >> ... >> } >> } >> >> because the subtraction will not change the result of the "while" loop >> condition. > > Are you sure? The above is working nicely for me (wondering why ;-))
while (!(size & (4*BITS_PER_LONG-1))) => while (!(size % (4*BITS_PER_LONG)) => while ((size % (4*BITS_PER_LONG)) == 0) Subtracting 4*BITS_PER_LONG doesn't change the modulus. > I think !(size & (4*BITS_PER_LONG-1)) is the same as what you > propose. If size & (4*BITS_PER_LONG-1) is not zero its not dividable > by 4*BITS_PER_LONG. But it see it might be a problem for size == 0. In fact I'm not really sure why it works for you. :) >> What you want is probably "while (size & ~(4*BITS_PER_LONG-1))", which >> in turn means "while (size >= 4*BITS_PER_LONG). >> >> Please change both while loops to use a ">=" condition, it's easier to read. > > Good idea, its easier to understand. > >>> Please change both while loops to use a ">=" condition, it's easier to read. > > Thinking again, in case a bit is found, this might lead to unnecessary > iterations > in the while loop if the bit is in d1, d2 or d3. How would that be different in your patch? But you can solve it by making two >= loops, one checking for 4*BITS_PER_LONG and one checking BITS_PER_LONG. Paolo > > Peter > >> >> Paolo >> >>> + d0 = *p; >>> + d1 = *(p+1); >>> + d2 = *(p+2); >>> + d3 = *(p+3); >>> + if (d0 || d1 || d2 || d3) { >>> + break; >>> + } >>> + p+=4; >>> + result += 4*BITS_PER_LONG; >>> + size -= 4*BITS_PER_LONG; >>> + } >>> if ((tmp = *(p++))) { >>> goto found_middle; >>> } >> >