Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-25 Thread Denys Vlasenko
On Tue, Sep 25, 2012 at 1:44 PM, George Spelvin wrote: > Thanks to Denys Vlasenko for sending me his benchmarking code. > > I went and hacked on it to ransomize the numbers being converted more, > since repeatedly converting the same number underestimates the number > of branch mispredictions. > >

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-25 Thread George Spelvin
Thanks to Denys Vlasenko for sending me his benchmarking code. I went and hacked on it to ransomize the numbers being converted more, since repeatedly converting the same number underestimates the number of branch mispredictions. Then I tried computing the number of digits beforehand, as mentione

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread Michal Nazarewicz
On Mon, Sep 24 2012, George Spelvin wrote: > Michal Nazarewicz wrote: >> static noinline_for_stack >> char *put_dec_trunc8(char *buf, unsigned r) { >> unsigned q; >> >> if (r > 1) { >> do { >> q = r + '0'; >> r = (r * (uint64_t)

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread Michal Nazarewicz
On Mon, Sep 24 2012, Michal Nazarewicz wrote: > Ah, right. I also thought about that first but than started worrying > that it could produce unnecessary zeros if the loop iterates at least > once and exits with r being zero, but now I see that this cannot happen > since if the loop condition was t

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread Michal Nazarewicz
On Mon, Sep 24 2012, George Spelvin wrote: > The fix is straightforward: > > diff --git a/lib/vsprintf.c b/lib/vsprintf.c > index e755083..9872855 100644 > --- a/lib/vsprintf.c > +++ b/lib/vsprintf.c > @@ -180,8 +180,6 @@ char *put_dec_trunc8(char *buf, unsigned r) > *buf++ = q - 10*r

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread George Spelvin
Rabin Vincent wrote: > This patch breaks IP address printing with "%pI4" (and by extension, > nfsroot). Example: > > - Before: 10.0.0.1 > - After: 10...1 Mea culpa, and thank you for catching it! As I said in my earlier comment, I tested this most extensively wrapped by some sprintf code that

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread George Spelvin
Michal Nazarewicz wrote: > The original has it a bit awkwardly because it just copies code from > put_dec_full9() with the first iteration skipped. Yeah, it also makes the comments pretty confusing. > I guess the following should work, even though it's not so pretty: > > static noinline_for_sta

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread Michal Nazarewicz
>>> @@ -174,20 +174,12 @@ char *put_dec_trunc8(char *buf, unsigned r) >>> unsigned q; >>> /* Copy of previous function's body with added early returns */ >>> - q = (r * (uint64_t)0x199a) >> 32; >>> - *buf++ = (r - 10 * q) + '0'; /* 2 */ >>> - if (q == 0) >>> - retur

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread George Spelvin
>> lib/vsprintf.c | 20 ++-- >> 1 file changed, 6 insertions(+), 14 deletions(-) >> >> diff --git a/lib/vsprintf.c b/lib/vsprintf.c >> index a8e7392..3ca77b8 100644 >> --- a/lib/vsprintf.c >> +++ b/lib/vsprintf.c >> @@ -174,20 +174,12 @@ char *put_dec_trunc8(char *buf, unsigned r

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-24 Thread George Spelvin
Oh, joy, a new week and a nice dose of public humiliation to start it off. (Remind me never to go AFK for a weekend again.) Seriously, Rabin, thank you very much for the bug report and my apologies for inflicting the bug on you in the first place. Denys, good to hear from you. I had hoped this w

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-23 Thread Michal Nazarewicz
On Fri, Aug 03 2012, George Spelvin wrote: > If you're going to have a conditional branch after > each 32x32->64-bit multiply, might as well shrink the code > and make it a loop. > > This also avoids using the long multiply for small integers. > > (This leaves the comments in a confusing state, but

Re: [PATCH 3/4] lib: vsprintf: Optimize put_dec_trunc8

2012-09-23 Thread Rabin Vincent
2012/8/3 George Spelvin : > If you're going to have a conditional branch after > each 32x32->64-bit multiply, might as well shrink the code > and make it a loop. > > This also avoids using the long multiply for small integers. > > (This leaves the comments in a confusing state, but that's a separat