I get data from various sources: client emails, spreadsheets, and web applications. I find that I can do some_string.decode('latin1') to get Unicode that I can use with xlsxwriter, or put <meta charset="latin1"> in the header of a web page to display European characters correctly. UTF-8 is normally the recommended encoding today, yet latin1 works correctly more often with the data I get from the wild. It's frustrating that I have to play a guessing game to figure out how to handle incoming text. I'm just wondering if anyone has thoughts on this. What if we all just decided to use UTF-8 everywhere? Could that ever happen?
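
For what it's worth, part of the reason latin1 seems to "work" more often is that it maps every possible byte to a character, so decoding never raises an error even when the result is wrong. A strict UTF-8 decode, on the other hand, fails loudly on most non-UTF-8 input. One way to shrink the guessing game is to try UTF-8 first and only fall back to latin1. This is a minimal sketch, assuming Python 3 and raw bytes as input; decode_incoming is a made-up helper name, not part of any library:

```python
def decode_incoming(raw: bytes) -> str:
    """Decode bytes of unknown origin (hypothetical helper).

    Try strict UTF-8 first, since non-ASCII latin1 text is rarely
    valid UTF-8 by accident. Fall back to latin1, which accepts any
    byte sequence but may give the wrong characters if the data was
    really in some other encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin1")


# The same word, encoded two different ways by two different senders.
utf8_bytes = "caf\u00e9".encode("utf-8")
latin1_bytes = "caf\u00e9".encode("latin1")

print(decode_incoming(utf8_bytes))
print(decode_incoming(latin1_bytes))
```

This is still a guess, not detection: anything that is neither UTF-8 nor latin1 (cp1252 from Windows, for example) will come out garbled rather than raise an error.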
-- https://mail.python.org/mailman/listinfo/python-list