On 25-09-17 09:53, Richard Sargent wrote:
> Rather than off-the-cuffing anything, please honour the Unicode Character
> Properties. Refer to
> https://en.wikipedia.org/wiki/Unicode_character_property#Whitespace, among
> others.

That is a good idea. But it won't help you if you scrape data from the web, where you'll find plenty of bad encoding. It is also often unclear which version of which standard was used (see the Mongolian vowel separator, U+180E, which was dropped from the whitespace set in Unicode 6.3).
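
To make the version problem concrete, here is a rough sketch in Python (not the language of this thread, just convenient for illustration). It assumes you pin the White_Space set yourself instead of trusting whatever Unicode version your runtime ships with. The names WHITE_SPACE, is_white_space, collapse_white_space and the legacy_mvs flag are made up for this example.

```python
# Code points carrying the White_Space property in Unicode 6.3 and later.
WHITE_SPACE = frozenset([
    *range(0x0009, 0x000E),   # TAB, LF, VT, FF, CR
    0x0020, 0x0085, 0x00A0, 0x1680,
    *range(0x2000, 0x200B),   # EN QUAD .. HAIR SPACE
    0x2028, 0x2029, 0x202F, 0x205F, 0x3000,
])

# Mongolian vowel separator: counted as white space by some Unicode
# versions before 6.3, not by later ones.
MONGOLIAN_VOWEL_SEPARATOR = 0x180E


def is_white_space(ch, legacy_mvs=False):
    """True if ch is white space under the pinned set (hypothetical helper)."""
    cp = ord(ch)
    return cp in WHITE_SPACE or (legacy_mvs and cp == MONGOLIAN_VOWEL_SEPARATOR)


def collapse_white_space(text, legacy_mvs=False):
    """Replace each run of white space with one ASCII space and trim the ends."""
    words, current = [], []
    for ch in text:
        if is_white_space(ch, legacy_mvs):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return " ".join(words)
```

With legacy_mvs=False the string "a\u00a0\u2003b\u180ec" comes out as "a b\u180ec"; with legacy_mvs=True it comes out as "a b c". Which of the two is right depends on which Unicode version the producer of the data had in mind, and that is exactly what scraped data doesn't tell you.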

Stephan

