> On Mar 19, 2021, at 9:42 AM, Grant Edwards <grant.b.edwa...@gmail.com> wrote:
> 
> On 2021-03-19, Skip Montanaro <skip.montan...@gmail.com> wrote:
>>> 
>>> That's annoying. You have to roll your own solution!
>>> 
>> 
>> Certainly seems like a known issue:
>> 
>> https://bugs.python.org/issue12737
> 
> While that is an issue with string.title(), I don't see how it's
> related to what the OP is reporting. Issue 12737 is about Unicode
> combining marks.

Hi,
I’ve been frustrated by my experiences processing unstructured multilingual 
text with Python. I’ve always assumed this was due to my insufficient 
experience with Python 3 text processing. I’ve recently begun coding in Go 
(while continuing to code in Python), and Go has an exceptionally crisp and 
clear capacity to process unstructured multilingual UTF-8 encoded text.

In just a few days of working with text processing in Go, using the book “The 
Go Programming Language” by Donovan and Kernighan, along with the Go language 
specification and other free online help, I have acquired a clear and crisp 
understanding of how to work effectively with unstructured, multilingual UTF-8 
encoded text (and emojis) and any Unicode code point, even invalid Unicode 
code points.

To see some of these issues firsthand, write a palindrome detector that works 
with any sequence of UTF-8 encoded code points, including invalid code points. 
I’m sure it can be done in Python, although I’ve not done it. It’s a trivial 
exercise in Go.
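
For comparison, here is a minimal Python sketch of such a detector. It is one 
possible approach, not a definitive one. The function name is_palindrome is 
made up for illustration, and the choice of errors="surrogateescape" is an 
assumption about how invalid input should be treated: each undecodable byte 
becomes its own lone surrogate code point instead of raising an error. Go, by 
contrast, yields U+FFFD for invalid bytes when ranging over a string, and 
errors="replace" would be the closer Python analogue of that behavior.

```python
def is_palindrome(data: bytes) -> bool:
    """Return True if the code points decoded from data read the same
    forwards and backwards."""
    # Assumption: "surrogateescape" maps every byte that is not valid UTF-8
    # to a lone surrogate (U+DC80..U+DCFF), so invalid input never raises
    # UnicodeDecodeError and distinct bad bytes stay distinct.
    text = data.decode("utf-8", errors="surrogateescape")
    # A str is a sequence of code points, so reversing it reverses code
    # points rather than the underlying bytes.
    return text == text[::-1]


if __name__ == "__main__":
    print(is_palindrome("été".encode("utf-8")))   # multi-byte code points
    print(is_palindrome(b"\xffa\xff"))            # invalid bytes at both ends
    print(is_palindrome(b"\xffa\xfe"))            # different invalid bytes
```

Note that this compares code points, not user-perceived characters, so text 
containing combining marks (the subject of issue 12737 above) would need 
additional normalization or grapheme segmentation.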

I’m not bashing Python here. I will continue to code with Python. It’s an 
exceptional language and community. Just commenting on my experience.

humbly,
Karen

-- 
https://mail.python.org/mailman/listinfo/python-list
