On Thu, Jun 11, 2015 at 12:26 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> No, despite the name, that is not a space character, it is a formatting
> character. Due to Unicode's stability policy, the name is stuck forever,
> but it should not be treated as a space character:
>
> py> unicodedata.category(' ')
> 'Zs'
> py> unicodedata.category('\u00A0') # NBSP
> 'Zs'
> py> unicodedata.category('\uFEFF') # ZWNBSP
> 'Cf'
>
> Ideally, outside of the BOM, you should never come across a ZWNBSP. You
> should use U+2060 WORD JOINER instead. But if you do come across one
> outside of the BOM, it should be treated as a legitimate non-space
> character:
>
> http://www.unicode.org/faq/utf_bom.html#bom6
>
> Although ZWNBSP is a "default ignorable" code point, I believe that the font
> is well within its rights to show it with a visible glyph:
>
> "Fonts can contain glyphs intended for visible display of
> default ignorable code points that would otherwise be
> rendered invisibly when not supported."
>
> http://www.unicode.org/faq/unsup_char.html
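
The category checks quoted above can be reproduced, and extended to U+2060, with a short Python 3 script. This is only a sketch: `describe` and `replace_interior_zwnbsp` are hypothetical helper names, not part of any library, and swapping every non-leading U+FEFF for WORD JOINER is an assumption drawn from the advice above rather than something the Unicode FAQ requires.

```python
import unicodedata

ZWNBSP = "\ufeff"       # ZERO WIDTH NO-BREAK SPACE, also used as the BOM
WORD_JOINER = "\u2060"  # the replacement recommended in the quoted text


def describe(ch):
    """Return (Unicode name, general category) for a single character."""
    return unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch)


def replace_interior_zwnbsp(text):
    """Hypothetical helper: leave the first character alone (it may be a
    BOM) and swap any later U+FEFF for U+2060 WORD JOINER."""
    if not text:
        return text
    return text[0] + text[1:].replace(ZWNBSP, WORD_JOINER)


# Print the code point, name and category of each character discussed.
for ch in (" ", "\u00a0", ZWNBSP, WORD_JOINER):
    print("U+%04X" % ord(ch), *describe(ch))
```

The two real spaces should report category Zs, while both U+FEFF and U+2060 report Cf, which is the distinction being made above.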

Huh. Okay, my bad. I was under the impression that it was supposed to
take up no width, as the name implies, but stability trumps logic
sometimes. Learn something new every day.

>> notable because it's also used as
>> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
>> been fighting with VLC Media Player over the font it uses for subtitles;
>> for some bizarre reason, that font represents U+FEFF not with zero pixels
>> of emptiness, but with a box containing the letters "ZWN" "BSP" on two
>> lines. Yeah, because that totally takes up zero width and looks like blank
>> space.
>
> Why do the subtitles contain ZWNBSP in the first place? Surely they're not
> English subtitles?

No, they're not :) The character comes up in the Cantonese and
Japanese subs for Once Upon A December.

http://youtu.be/CEpcUeWP0bg
http://youtu.be/WFZAaHrHens

Possibly some others in the series as well. It may well be a fault in
the subtitles, but most programs I've seen don't show U+FEFF as a big
fat box.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list