Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-20 Thread Pádraig Brady
On 18/07/10 16:23, Bruno Haible wrote:
> Hi Pádraig,
>
>> However, the first byte of a multibyte
>> UTF-8 char is the same for a lot of characters
>
> Yes. The last byte is equidistributed across the range 0x80..0xBF, whereas
> the first byte is often the same. I'm applying the commit below to ex

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-18 Thread Bruno Haible
Hi Pádraig,
> However, the first byte of a multibyte
> UTF-8 char is the same for a lot of characters
Yes. The last byte is equidistributed across the range 0x80..0xBF, whereas the first byte is often the same. I'm applying the commit below to exploit it for speed.
> I was wondering myself about
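The commit itself is not visible in this excerpt, so the following is only a rough sketch of how the observation might be exploited for a two-byte character: scan for the more selective last byte first, then confirm the lead byte just before it. The name `sketch_find_2byte` and its parameters are hypothetical, and the code assumes `c1` is a continuation byte in 0x80..0xBF (so never NUL) and that `s` is NUL-terminated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the gnulib implementation.
   Finds the two-byte sequence (c0, c1) in the NUL-terminated string S
   by scanning for the last byte C1 and then checking the byte before it.
   Assumes C1 is in 0x80..0xBF, hence non-zero.  */
const uint8_t *
sketch_find_2byte (const uint8_t *s, uint8_t c0, uint8_t c1)
{
  if (*s == 0)
    return NULL;                /* empty string: nothing to find */

  /* Start at s + 1 so that p[-1] is always inside the string.  */
  const char *p = (const char *) s + 1;
  while ((p = strchr (p, c1)) != NULL)
    {
      if ((uint8_t) p[-1] == c0)
        return (const uint8_t *) (p - 1);
      p++;                      /* false hit on the last byte; keep going */
    }
  return NULL;
}
```

Because many characters from the same script share a lead byte, scanning for the lead byte would stop on nearly every character in such text, while the last byte produces far fewer false hits.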

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-12 Thread Paolo Bonzini
On 07/12/2010 01:38 AM, Pádraig Brady wrote:
On 11/07/10 15:20, Paolo Bonzini wrote:
On 07/07/2010 03:44 PM, Pádraig Brady wrote:
Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
* lib/unistr/u8-strchr.c (u8_strchr): Use strchr() for the single byte case as it was

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-11 Thread Pádraig Brady
On 11/07/10 15:20, Paolo Bonzini wrote:
> On 07/07/2010 03:44 PM, Pádraig Brady wrote:
>> Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII
>> characters
>>
>> * lib/unistr/u8-strchr.c (u8_strchr): Use strchr() for
>> the single byte case as it was

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-11 Thread Paolo Bonzini
On 07/07/2010 03:44 PM, Pádraig Brady wrote:
Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
* lib/unistr/u8-strchr.c (u8_strchr): Use strchr() for the single byte case as it was measured to be 50% faster than the existing code on x86 linux. Also add a comment on why

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-11 Thread Bruno Haible
Hi Pádraig,
> +2010-07-07 Pádraig Brady
> +
> + * lib/unistr/u8-strchr.c (u8_strchr): Use strchr() as it's faster
Thanks for the patch. I've applied it as below, with minor changes:
- Keep around the unoptimized code, for clarity.
- Add the rationale for the change to the comments, n

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-08 Thread Pádraig Brady
On 07/07/10 23:49, Pádraig Brady wrote:
> On 07/07/10 17:07, Simon Josefsson wrote:
>> Pádraig Brady writes:
>>
>>> +/* The following is equivalent to:
>>> + return memmem (s, strlen(s), c, csize);
>>> + but faster for long S with matching UC near the start,
>>> + and also

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-08 Thread Pádraig Brady
On 08/07/10 04:24, Ralf Wildenhues wrote:
> Hi Pádraig,
>
> * Pádraig Brady wrote on Wed, Jul 07, 2010 at 03:44:29PM CEST:
>> Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
>
>> --- a/lib/unistr/u8-strchr.c
>> +++ b/lib/uni

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-07 Thread Ralf Wildenhues
Hi Pádraig,
* Pádraig Brady wrote on Wed, Jul 07, 2010 at 03:44:29PM CEST:
> Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
> --- a/lib/unistr/u8-strchr.c
> +++ b/lib/unistr/u8-strchr.c
> uint8_t *
> u8_strchr (const uint8_t *s, ucs4_t uc)
>

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-07 Thread Pádraig Brady
On 07/07/10 17:07, Simon Josefsson wrote:
> Pádraig Brady writes:
>
>> +/* The following is equivalent to:
>> + return memmem (s, strlen(s), c, csize);
>> + but faster for long S with matching UC near the start,
>> + and also memmem is sometimes buggy and inefficient. */

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-07 Thread Simon Josefsson
Pádraig Brady writes:
> +/* The following is equivalent to:
> + return memmem (s, strlen(s), c, csize);
> + but faster for long S with matching UC near the start,
> + and also memmem is sometimes buggy and inefficient. */
> switch (u8_uctomb_aux (c, uc, 6))
Don't we
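To illustrate the equivalence the quoted comment describes, here is a minimal reference sketch. It is not the patch's code: it substitutes the standard `strstr()` for `memmem()` (the two agree here because the haystack is NUL-terminated and the encoded needle contains no NUL byte), and both `sketch_encode_utf8` and `sketch_u8_strchr_reference` are hypothetical names introduced for this example. The sketch leaves the `uc == 0` case out of scope.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder: writes the UTF-8 form of UC into BUF (at least
   4 bytes) and returns its length, or -1 if UC is not encodable.  */
int
sketch_encode_utf8 (uint8_t *buf, uint32_t uc)
{
  if (uc < 0x80)
    {
      buf[0] = (uint8_t) uc;
      return 1;
    }
  if (uc < 0x800)
    {
      buf[0] = (uint8_t) (0xC0 | (uc >> 6));
      buf[1] = (uint8_t) (0x80 | (uc & 0x3F));
      return 2;
    }
  if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc < 0xE000)
        return -1;              /* surrogates are not valid characters */
      buf[0] = (uint8_t) (0xE0 | (uc >> 12));
      buf[1] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[2] = (uint8_t) (0x80 | (uc & 0x3F));
      return 3;
    }
  if (uc < 0x110000)
    {
      buf[0] = (uint8_t) (0xF0 | (uc >> 18));
      buf[1] = (uint8_t) (0x80 | ((uc >> 12) & 0x3F));
      buf[2] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      buf[3] = (uint8_t) (0x80 | (uc & 0x3F));
      return 4;
    }
  return -1;
}

/* Hypothetical reference search: encode UC, then do a substring search.
   Uses strstr() in place of memmem(); ignores the uc == 0 case.  */
const uint8_t *
sketch_u8_strchr_reference (const uint8_t *s, uint32_t uc)
{
  uint8_t needle[5];
  int n = sketch_encode_utf8 (needle, uc);

  if (n < 0 || uc == 0)
    return NULL;
  needle[n] = 0;                /* NUL-terminate the needle for strstr */
  return (const uint8_t *) strstr ((const char *) s, (const char *) needle);
}
```

A substring search like this is a convenient way to cross-check a hand-optimized search, since both should return the same pointer for any valid UTF-8 input.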

[PATCH] unistr/u8-strchr: speed up searching for ASCII characters

2010-07-07 Thread Pádraig Brady
Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
* lib/unistr/u8-strchr.c (u8_strchr): Use strchr() for the single byte case as it was measured to be 50% faster than the existing code on x86 linux. Also add a comment on why not to use memmem() for the moment for the
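The patch body is not included in this excerpt. As a minimal sketch of the idea in the ChangeLog entry, a single-byte (ASCII) search can be delegated to the C library's `strchr()`, because in UTF-8 a byte below 0x80 never occurs inside a multibyte sequence. The function name `sketch_u8_strchr_ascii` is hypothetical, and the multibyte case is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the gnulib function: handles only the
   single-byte case (uc < 0x80) by delegating to strchr().  */
const uint8_t *
sketch_u8_strchr_ascii (const uint8_t *s, uint32_t uc)
{
  if (uc >= 0x80)
    return NULL;                /* multibyte search is out of scope here */

  /* Bytes below 0x80 are never part of a multibyte UTF-8 sequence,
     so a plain byte search cannot produce a false match.  */
  return (const uint8_t *) strchr ((const char *) s, (int) uc);
}
```

For example, searching the UTF-8 string "héllo" for 'l' skips over the two bytes of "é" and returns a pointer to the first 'l'.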