On Apr 17, 10:10 am, [EMAIL PROTECTED] wrote:
> Thank you Martin and John, for your excellent explanations.
>
> I think I understand the basic principles of Unicode; what confuses me is
> the use different applications make of it.
>
> For example, I got that EN DASH out of a web page which states
> <?xml version="1.0" encoding="ISO-8859-1"?> at the beginning. That's why I
> went for that encoding. But if the browser can properly decode that
> character using that encoding, how come other applications can't?
>
> I might need to go for Python's htmllib to avoid this, not sure. But if I
> don't, if I only want to copy and paste the text of some web pages into a
> tkinter Text widget, what should I do to successfully make every single
> character go all the way from the widget and out of tkinter into a Python
> string variable? How did my browser know it should render an EN DASH
> instead of a circumflexed lowercase u?
>
> This is the web page in case you are interested; the EN DASH is on the
> 4th line of the first paragraph:
> http://www.pagina12.com.ar/diario/elmundo/subnotas/102453-32303-2008-...
>
> Thanks a lot.
>
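One possible explanation for the browser question, offered as a guess rather than a diagnosis: browsers commonly treat a page labelled ISO-8859-1 as if it were Windows-1252, where the byte 0x96 is EN DASH. In strict ISO-8859-1 that byte is an unnamed control character, and in the DOS code page cp850 it happens to be the circumflexed lowercase u. The sketch below (Python 3 syntax; the byte value and the choice of cp850 are my assumptions, nothing in the message confirms them) decodes that one byte three ways:

```python
import unicodedata

# Assumption: the page stores EN DASH as the single byte 0x96,
# which is how Windows-1252 (cp1252) encodes it.
raw = b"\x96"

as_cp1252 = raw.decode("cp1252")      # how a browser would likely read it
as_latin1 = raw.decode("iso-8859-1")  # what the XML declaration literally says
as_cp850 = raw.decode("cp850")        # a common Windows console code page

for label, ch in [("cp1252", as_cp1252),
                  ("iso-8859-1", as_latin1),
                  ("cp850", as_cp850)]:
    # Control characters have no Unicode name, so supply a default.
    name = unicodedata.name(ch, "<unnamed control character>")
    print("%-10s U+%04X %s" % (label, ord(ch), name))
```

If this is what is going on, decoding the downloaded bytes as cp1252 instead of ISO-8859-1 should give you the EN DASH. Which step in your setup produced the û I cannot tell from the message alone.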
Just write in English. Like this, see? No encoding mess.

--
http://mail.python.org/mailman/listinfo/python-list