Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-13 Thread George Spelvin
> Ohh... I used to think I know something about optimization. I tried build > Rasmus' 'find_last_bit' against mine on MIPS, and found that sizes are as > 268 vs 320 bytes. I think the best version is suggested by George, both > readable, and effective. What about using it? If no objections, I'll >

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-13 Thread Rasmus Villemoes
On Fri, Feb 13 2015, "George Spelvin" wrote: >> the main loop is 20--3b. The test instruction at 2e seems to be >> redundant. The same at 37: the sub instruction already sets plenty of >> flags that could be used, so explicitly comparing %rbx to -1 seems >> redundant. > > Er... I think you hand-e

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread George Spelvin
> That, and if the compiler was smart enough, the AND should actually be > free (at least on x86), since it can replace a test instruction. [1] Ah, right, x86 loads don't set the flags, so you need a TEST instruction anyway. So the AND is free! Of course, not all the world's an x86. But load/st

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread Rasmus Villemoes
On Thu, Feb 12 2015, "George Spelvin" wrote: >> Rasmus, your version has ANDing by mask, and resetting the mask at each >> iteration >> of main loop. I think we can avoid it. What do you think on next? > > Yes, that's basically what I proposed (modulo checking for zero size and > my buggy LAST_W

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-12 Thread George Spelvin
> Rasmus, your version has ANDing by mask, and resetting the mask at each > iteration > of main loop. I think we can avoid it. What do you think on next? Yes, that's basically what I proposed (modulo checking for zero size and my buggy LAST_WORD_MASK). But two unconditional instructions in the l

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-11 Thread Yury
On 09.02.2015 14:53, Rasmus Villemoes wrote: > [Yury, please do remember to Cc everyone who has previously > participated] > > On Mon, Feb 09 2015, "George Spelvin" wrote: > >> Two more comments on the code. Two minor, but one that >> seems like a bug, so for now, it's >> >> Nacked-by: George Sp

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-11 Thread Rasmus Villemoes
[for some reason google decided to put this in my spam folder, hrmpf] On Mon, Feb 09 2015, "George Spelvin" wrote: > Sorry, I screwed up the bit-twiddling while messing with various options. > I was trying to get size == 32 to work; that should have been: > >> tmp &= (2UL << ((size-1) % BIT

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread George Spelvin
Sorry, I screwed up the bit-twiddling while messing with various options. I was trying to get size == 32 to work; that should have been: > tmp &= (2UL << ((size-1) % BITS_PER_LONG)) - 1; /* Mask last word */ And you're right that LAST_WORD_MASK is a good wrapper. Vasrious working solutions

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread Rasmus Villemoes
[Yury, please do remember to Cc everyone who has previously participated] On Mon, Feb 09 2015, "George Spelvin" wrote: > Two more comments on the code. Two minor, but one that > seems like a bug, so for now, it's > > Nacked-by: George Spelvin > > Specifically, it seems like find_last_bit used

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-09 Thread George Spelvin
Two more comments on the code. Two minor, but one that seems like a bug, so for now, it's Nacked-by: George Spelvin Specifically, it seems like find_last_bit used to ignore trailing garbage in the bitmap, but now will stop searching if the last word contains some set bits not within size. Th

Re: [PATCH v3 1/3] lib: find_*_bit reimplementation

2015-02-08 Thread George Spelvin
This basically has my Reviewed-by: (I'll send it in a few hours when I have time to do a real final copy-editing), but a few minor notes: >+ /* >+ * This is an equvalent for: >+ * >+ * tmp = find_set ? addr[start / BITS_PER_LONG] >+