New submission from Chris Angelico:

As of PEP 393, a string's width is recorded in its header - effectively, a 
marker that says whether the highest codepoint in the string is >0xFFFF, >0xFF, 
or <=0xFF. This is occasionally useful to know; for instance, when testing 
string performance, it's handy to be able to quickly write something that 
identifies the spread of widths without scanning the contents of every string 
used.

A similar facility is provided by Pike, which has a similar flexible string 
representation; it is accessible to a script as String.width():
http://pike.lysator.liu.se/generated/manual/modref/ex/7.2_3A_3A/String/width.html

Since this is not something frequently needed, it would make sense to hide it 
away in the sys or inspect modules, or possibly in the string module, or as a 
method on the string itself.

Currently, the best way to do this is something like:

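def str_width(s):
    # Return the PEP 393 width (1, 2 or 4 bytes per character) of s.
    width = 1
    for n in map(ord, s):
        if n > 0xFFFF:
            return 4  # astral character: widest form, stop early
        if n > 0xFF:
            width = 2
    return width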

which necessitates a scan of the entire string, unless it has an astral 
character.
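
A shorter variant, as a minimal sketch only: it lets max() find the highest 
codepoint, so the loop runs in C, but it still visits every character and 
cannot stop early on an astral one. The name str_width_max is hypothetical, 
and returning 1 for the empty string is an assumption, not something PEP 393 
is quoted on here.

```python
def str_width_max(s):
    """Return 1, 2 or 4: the bytes per character needed for s."""
    if not s:
        return 1  # assumption: treat the empty string as the narrowest width
    top = ord(max(s))  # highest codepoint; max() scans the whole string
    if top > 0xFFFF:
        return 4
    if top > 0xFF:
        return 2
    return 1


print(str_width_max("abc"))         # all codepoints <= 0xFF
print(str_width_max("\u20ac"))      # BMP codepoint above 0xFF
print(str_width_max("\U0001F600"))  # astral codepoint
```

Either way the cost is proportional to the length of the string, which is 
what reading the width from the header would avoid.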

----------
components: Library (Lib)
messages: 185963
nosy: Rosuav
priority: normal
severity: normal
status: open
title: Expose string width to Python
versions: Python 3.4, Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17629>
_______________________________________