Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-13 Thread George Spelvin
> Ohh... I used to think I know something about optimization. I tried build > Rasmus' 'find_last_bit' against mine on MIPS, and found that sizes are as > 268 vs 320 bytes. I think the best version is suggested by George, both > readable, and effective. What about using it? If no objections, I'll >

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-13 Thread Rasmus Villemoes
On Fri, Feb 13 2015, "George Spelvin" wrote: >> the main loop is 20--3b. The test instruction at 2e seems to be >> redundant. The same at 37: the sub instruction already sets plenty of >> flags that could be used, so explicitly comparing %rbx to -1 seems >> redundant. > > Er... I think you hand-e

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread George Spelvin
> That, and if the compiler was smart enough, the AND should actually be > free (at least on x86), since it can replace a test instruction. [1] Ah, right, x86 loads don't set the flags, so you need a TEST instruction anyway. So the AND is free! Of course, not all the world's an x86. But load/st

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread Rasmus Villemoes
On Thu, Feb 12 2015, "George Spelvin" wrote: >> Rasmus, your version has ANDing by mask, and resetting the mask at each >> iteration >> of main loop. I think we can avoid it. What do you think on next? > > Yes, that's basically what I proposed (modulo checking for zero size and > my buggy LAST_W

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread George Spelvin
> Rasmus, your version has ANDing by mask, and resetting the mask at each > iteration > of main loop. I think we can avoid it. What do you think on next? Yes, that's basically what I proposed (modulo checking for zero size and my buggy LAST_WORD_MASK). But two unconditional instructions in the l

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-11 Thread Yury
On 09.02.2015 14:53, Rasmus Villemoes wrote: > [Yury, please do remember to Cc everyone who has previously > participated] > > On Mon, Feb 09 2015, "George Spelvin" wrote: > >> Two more comments on the code. Two minor, but one that >> seems like a bug, so for now, it's >> >> Nacked-by: George Sp

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-11 Thread Rasmus Villemoes
[for some reason google decided to put this in my spam folder, hrmpf] On Mon, Feb 09 2015, "George Spelvin" wrote: > Sorry, I screwed up the bit-twiddling while messing with various options. > I was trying to get size == 32 to work; that should have been: > >> tmp &= (2UL << ((size-1) % BIT

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread George Spelvin
Sorry, I screwed up the bit-twiddling while messing with various options. I was trying to get size == 32 to work; that should have been: > tmp &= (2UL << ((size-1) % BITS_PER_LONG)) - 1; /* Mask last word */ And you're right that LAST_WORD_MASK is a good wrapper. Vasrious working solutions

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread Rasmus Villemoes
[Yury, please do remember to Cc everyone who has previously participated] On Mon, Feb 09 2015, "George Spelvin" wrote: > Two more comments on the code. Two minor, but one that > seems like a bug, so for now, it's > > Nacked-by: George Spelvin > > Specifically, it seems like find_last_bit used

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread George Spelvin
Two more comments on the code. Two minor, but one that seems like a bug, so for now, it's Nacked-by: George Spelvin Specifically, it seems like find_last_bit used to ignore trailing garbage in the bitmap, but now will stop searching if the last word contains some set bits not within size. Th

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-08 Thread George Spelvin
This basically has my Reviewed-by: (I'll send it in a few hours when I have time to do a real final copy-editing), but a few minor notes: >+ /* >+ * This is an equvalent for: >+ * >+ * tmp = find_set ? addr[start / BITS_PER_LONG] >+

[PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-08 Thread Yury Norov
New implementations takes less space in source file (see diffstat) and in object. For me it's 710 vs 453 bytes of text. It also shows better performance. Signed-off-by: Yury Norov --- lib/find_last_bit.c | 30 ++ lib/find_next_bit.c | 275