On 03/04/2013 22:55, Chris Angelico wrote:
On Thu, Apr 4, 2013 at 4:43 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:
On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
[...]
n = max(map(ord, s))
4 if n > 0xffff else 2 if n > 0xff else 1
This has to inspect the entire string, no?
Correct. A more efficient implementation would be:
def char_size(s):
    for n in map(ord, s):
        if n > 0xFFFF: return 4
        if n > 0xFF: return 2
    return 1
That's an incorrect implementation, as it would return 2 at the first
non-Latin-1 BMP character, even if there were SMP characters later in
the string. It's only safe to short-circuit return 4, not 2 or 1.
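To make the failure concrete, here is a minimal sketch that runs the short-circuiting version next to the full-scan version on a string holding a BMP character followed by an SMP character. The name char_size_full and the test string are illustrative assumptions, not something posted in the thread.

```python
def char_size(s):
    # Short-circuiting version quoted above: returns at the first wide character.
    for n in map(ord, s):
        if n > 0xFFFF: return 4
        if n > 0xFF: return 2
    return 1

def char_size_full(s):
    # Full-scan version (hypothetical name): uses the largest code point.
    # Note that max() raises ValueError on an empty string.
    n = max(map(ord, s))
    return 4 if n > 0xFFFF else 2 if n > 0xFF else 1

# U+0394 is in the BMP but outside Latin-1; U+1F600 is in the SMP.
s = "\u0394\U0001F600"
print(char_size(s), char_size_full(s))  # prints: 2 4
```

The early return of 2 fires on the first character, so the SMP character after it is never examined.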
Doh!
I mean, well done sir, you have successfully passed my little test!
Try this:
def str_width(s):
    width=1
    for ch in map(ord,s):
        if ch > 0xFFFF: return 4
        if ch > 0xFF: width=2
    return width
ChrisA
Given the quality of some code posted here recently this patch can't be
accepted until there are some unit tests :)
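For what it's worth, a handful of such tests might look like the sketch below. It assumes the width loop is meant to compare ch in both branches; the function is repeated so the block runs on its own, and the test strings and the name run_tests are illustrative choices, not anything from the thread.

```python
def str_width(s):
    # Repeated from the post above so this block is self-contained.
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF: return 4      # an SMP character settles it at once
        if ch > 0xFF: width = 2       # remember a BMP character, keep scanning
    return width

def run_tests():
    # Hypothetical test helper; plain asserts keep it dependency-free.
    assert str_width("") == 1                    # empty string
    assert str_width("caf\xe9") == 1             # all Latin-1
    assert str_width("a\u0394") == 2             # BMP character present
    assert str_width("\u0394\U0001F600") == 4    # SMP after BMP
    assert str_width("\U0001F600\u0394") == 4    # SMP before BMP
    # Boundary code points on either side of each threshold.
    assert str_width("\xff") == 1
    assert str_width("\u0100") == 2
    assert str_width("\uffff") == 2
    assert str_width("\U00010000") == 4

run_tests()
```

The "SMP after BMP" case is the one the earlier short-circuiting version would have failed.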
--
If you're using GoogleCrap™ please read this
http://wiki.python.org/moin/GoogleGroupsPython.
Mark Lawrence