On 09/21/2011 01:57 AM, Christophe wrote:
"Jonathan M Davis", in message (digitalmars.D.learn:29637), wrote:
On Tuesday, September 20, 2011 14:43 Andrej Mitrovic wrote:
On 9/20/11, Jonathan M Davis<jmdavisp...@gmx.com>  wrote:
Or std.range.walkLength. I don't know why we really have std.utf.count. It
just calls walkLength anyway. I suspect that it's a function that predates
walkLength and was made to use walkLength after walkLength was introduced.
But it's kind of pointless now.

- Jonathan M Davis

I don't think having better-named aliases is a bad thing. Although now
I'm seeing it's not just an alias but a function.


std.utf.count has one advantage: someone looking for the function will
find it. The programmer might not look in std.range to find a function
about UTF strings, and even if he did, the walkLength documentation does
not say that it works with (narrow) strings the way it does. To know you
can use walkLength, you must know:
- that popFront works differently for strings,
- that hasLength is not true for strings,
- what walkLength is.
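
The three points above can be checked with a short sketch. It assumes a Phobos where narrow strings are iterated by code point; the sample word is arbitrary and chosen only because it contains one two-byte character.

```d
import std.range : hasLength, walkLength;
import std.utf : count;

// Narrow strings are deliberately not reported as having a length,
// because .length counts code units rather than characters.
static assert(!hasLength!string);

unittest
{
    string s = "\u00e9crit";      // 'é' takes two bytes in UTF-8
    assert(s.length == 6);        // code units (bytes)
    assert(s.walkLength == 5);    // code points, via popFront
    assert(count(s) == 5);        // std.utf.count gives the same answer
}
```

With dmd this can be run as a unit test using the -unittest and -main switches.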

So yes, you experienced programmers don't need std.utf.count, but newbies
do.

Last point: walkLength is not optimized for strings.
std.utf.count should be.

This short implementation of count was 3 to 8 times faster than
walkLength in a simple benchmark:

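size_t myCount(string text)
{
   // Start from the number of code units, then subtract one for every
   // UTF-8 continuation byte (bit pattern 10xxxxxx).
   size_t n = text.length;
   for (size_t i = 0; i < text.length; ++i)
     {
       auto s = text[i] >> 6;           // top two bits of the byte: 0..3
       n -= (s >> 1) - ((s + 1) >> 2);  // 1 only when s == 2, else 0
     }
   return n;
}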

(compiled with gdc on 64-bit; the sample text was the introduction of the
French Wikipedia UTF-8 article, down to the table of contents -
http://fr.wikipedia.org/wiki/UTF-8 ).

The reason is that the loop can be unrolled by the compiler.
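
The arithmetic works because only continuation bytes have 10 as their top two bits, so counting every other byte gives the number of code points in valid UTF-8. A more explicit sketch of the same idea follows; the name countNonContinuation is made up for illustration and is not part of Phobos.

```d
import std.range : walkLength;

/// Hypothetical helper: counts the bytes that start a code point,
/// i.e. every byte that is not a continuation byte (10xxxxxx).
size_t countNonContinuation(string text)
{
    size_t n = 0;
    foreach (char c; text)            // iterate code units, no decoding
    {
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

unittest
{
    // One-, two-, three- and four-byte sequences.
    assert(countNonContinuation("abc") == 3);
    assert(countNonContinuation("\u00e9t\u00e9") == 3);
    assert(countNonContinuation("\u20ac") == 1);
    assert(countNonContinuation("\U0001F600") == 1);

    // Agrees with walkLength on valid input.
    string s = "na\u00efve \u20ac";
    assert(countNonContinuation(s) == s.walkLength);
}
```

Neither this version nor the arithmetic one validates its input, so on malformed UTF-8 they simply count bytes that are not continuation bytes.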

Very good point; you might want to file an enhancement request. It would also make the functionality different enough to prevent count from being removed: walkLength throws on an invalid UTF sequence.
