"Jerry Hill" <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > Here's one example. The others are similar -- they have the same
> > things that look like problems to me.
> > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
> > I tried doing track_id.encode("utf-8"), but it doesn't seem to make
> > any difference at all.

> I don't have anything to say about your iTunes problems, but encode()
> is the wrong method to turn a byte string into a unicode string.
> Instead, use decode(), like this:

Awesome... Thank you! I had my mental model of Python turned around
backwards. That's an odd feeling.

Okay, so you decode to go from raw bytes in a given encoding to a
unicode string, and you encode to go from a unicode string to raw bytes
in a given encoding. Not what I thought it was, but that's cool, makes
sense.

At first I thought this fixed my problem, but I had to tweak the obvious
fix to make it work, and I don't understand why.

Fix #1:

    track_id = track_id.decode('utf-8')
    track_id = url2pathname(urlparse(track_id).path)

That doesn't work -- it produces no error, but the raw bytes appear in
the unicode string.

Fix #2:

    track_id = url2pathname(urlparse(track_id).path)
    track_id = track_id.decode('utf-8')

This one appears to work. (Although I can't confirm it for sure: all my
debug prints are now correct, but the overall application still fails in
the same way it did before I put in the debug prints. I'm going to spend
some time assuming that the problem is elsewhere in my code, since at
least I definitely fixed one serious problem.)

I've got a few questions for Python-XML-Unicode experts...

1. Why does the order of those statements matter?

2. Shouldn't it be more correct to decode BEFORE transforming the
   string? Why does that kill the decoding?

3. Why is ElementTree dumping raw bytes on me instead of decoding them
   as UTF-8? The XML file has its encoding set to:
   <?xml version="1.0" encoding="UTF-8"?>, so it seems like it should
   know what codec to use.

> Jerry

-Wm
--
http://mail.python.org/mailman/listinfo/python-list
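
For reference, here is a minimal sketch of the decode/encode direction and of one plausible reason the order of the two fixes matters. It rests on assumptions that are not confirmed in the thread: that track_id is a percent-encoded URL, and that unquoting an already-decoded string turns each escape into its own character. The sample name is hypothetical, and the sketch is written for Python 3, where latin-1 is used only to emulate that per-escape behaviour.

```python
from urllib.parse import unquote, unquote_to_bytes

# decode: bytes in a known encoding -> text; encode: text -> bytes.
raw = b"Shareholders\xc2\xa0L.mp3"
text = raw.decode("utf-8")
assert text == "Shareholders\u00a0L.mp3"
assert text.encode("utf-8") == raw

# Hypothetical percent-encoded name (an assumption; the real
# track_id values are not shown in full in the thread).
quoted = "Annual%20Shareholders%C2%A0L.mp3"

# Order of Fix #2: undo the percent-escapes to get bytes,
# then decode those bytes as UTF-8.
bytes_first = unquote_to_bytes(quoted).decode("utf-8")

# Order of Fix #1, emulated: each escape becomes one character,
# so the two UTF-8 bytes C2 A0 end up as two separate characters.
text_first = unquote(quoted, encoding="latin-1")

print(repr(bytes_first))
print(repr(text_first))
```

If that assumption holds, the second result carries an extra U+00C2 character in front of the no-break space, which would match the "raw bytes appear in the unicode string" symptom.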