On Sun, 24 Apr 2005 11:26:20 +0200, Ivan Voras <[EMAIL PROTECTED]> wrote:
>Jp Calderone wrote:
>
>> You don't have a string fetched from a database, in iso-8859-2, alas.
>> That is the root of the problem you're having. What you have is a
>> unicode string.
>
>Yes, you're right :) I actually did have iso-8859-2 data, but, as I
>found out late last night, the data got converted to unicode along the way.

Just a thought: I noticed from the traceback that you are running this on a
Windows box. Profound apologies in advance if this question is an insult to
your intelligence, but you do know that Windows code page 1250 (Latin 2) --
which I guess is the code page that you would be using -- is *NOT* the same
as iso-8859-2, don't you?

>>> (Does anyone else feel that python's unicode handling is, well...
>>> suboptimal at least?)
>>
>> Hmm. Not really. The only problem I've found with it is misguided
>> attempt to "do the right thing" by implicitly encoding unicode strings,
>> and this isn't so much of a problem once you figure things out, because
>> you can always do things explicitly and avoid invoking the implicit
>> behavior.
>
>I'm learning that, the hard way :)
>
>One thing that I always wanted to do (but probably can't be done?) is to
>set the default/implicit encoding to the one I'm using... I often have
>to deal with 8-bit encodings and rarely with unicode. Can it be done
>per-program?

It's a bit difficult to understand what you are trying to do, but I'd
suggest that you forget about setting the default encoding; if you need to
deal with Unicode, then set up the encoding explicitly on a per-file or
per-socket basis. The default ASCII encoding is then there as a trap when
(sorry to rub it in) you don't know what type of data you have.

HTH,
John
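
To make the two points above concrete, here is a minimal sketch. It assumes
Python 3 syntax (the thread predates it; on the Python of the time,
codecs.open would play the role of open with an encoding argument) and uses
only standard-library codecs. The names differing_bytes, round_trip_demo and
sample.txt are made up for illustration. The first function lists the byte
values that cp1250 and iso-8859-2 decode differently; the second writes and
reads a file with an explicitly named encoding, then shows what happens when
the same bytes are decoded with the wrong code page.

```python
import os
import tempfile


def differing_bytes(enc_a="cp1250", enc_b="iso-8859-2"):
    """Return (byte, char_in_a, char_in_b) for high bytes the codecs disagree on."""
    diffs = []
    for value in range(0x80, 0x100):
        raw = bytes([value])
        decoded = []
        for enc in (enc_a, enc_b):
            try:
                decoded.append(raw.decode(enc))
            except UnicodeDecodeError:
                decoded.append(None)  # byte is not assigned in this codec
        if decoded[0] != decoded[1]:
            diffs.append((value, decoded[0], decoded[1]))
    return diffs


def round_trip_demo(text="š ž č"):
    """Write text as iso-8859-2, then read it back correctly and incorrectly."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")  # hypothetical file name
        # Name the encoding explicitly instead of relying on any default.
        with open(path, "w", encoding="iso-8859-2") as f:
            f.write(text)
        with open(path, "r", encoding="iso-8859-2") as f:
            right = f.read()
        with open(path, "rb") as f:
            raw = f.read()
    # Decoding the same bytes as cp1250 raises no error; it just gives other letters.
    wrong = raw.decode("cp1250")
    return right, wrong


diffs = differing_bytes()
right, wrong = round_trip_demo()
print(len(diffs), "byte values differ between cp1250 and iso-8859-2")
print(repr(right), "vs", repr(wrong))
```

Note that the wrong decode fails silently, which is why naming the encoding
at each file or socket boundary is safer than guessing after the fact.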