Thank you for the response. Here's some more info, including a little that you didn't ask me for but which might be useful.
John Machin <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > To ask another way: how do I convert from a file:// URL to a local
> > path in a standard way, so that filepaths from two different sources
> > will work the same way in a dictionary?
>
> > The problems occur when the filenames have non-ascii characters in
> > them -- I suspect that the URLs are having some encoding placed on
> > them that Python's decoder doesn't know about.
>
> # track_id = url2pathname(urlparse(track_id).path)
> print repr(track_id)
> parse_result = urlparse(track_id).path
> print repr(parse_result)
> track_id_replacement = url2pathname(parse_result)
> print repr(track_id_replacement)

The "important" value here is track_id_replacement; it contains the
data that's throwing me. It appears that some UTF-8 characters are
being read as multiple bytes by ElementTree rather than being decoded
into Unicode. Could this be a bug in ElementTree's Unicode support? If
so, can I work around it?

Here's one example. The others are similar -- they have the same
things that look like problems to me.

    "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

Note some problems here:

1. This isn't Unicode; it's missing the u"" (I printed using repr).

2. It's got the UTF-8 bytes there in the middle.

I tried doing track_id.encode("utf-8"), but it doesn't seem to make
any difference at all.

Of course, my ultimate goal is to compare the track_id to the track_id
I get from iTunes' COM interface, including hashing to the same value
for dict lookups.

> and copy/paste the results into your next posting.
In addition to the above results, while trying to get more diagnostic
printouts I got the following warning from Python:

    C:\projects\podcasts\podstrand\podcast.py:280: UnicodeWarning: Unicode
    equal comparison failed to convert both arguments to Unicode -
    interpreting them as being unequal
      return track.databaseID == trackLocation

The code that triggered this is as follows:

    if trackLocation in self.podcasts:
        track = self.podcasts[trackLocation]
        if trackRelease:
            track.release_date = trackRelease
        elif track.is_podcast:
            print "No release date:", repr(track.name)
    else:
        # For the sake of diagnostics, try to find the track.
        def track_has_location(track):
            return track.databaseID == trackLocation
        fillers = filter(track_has_location, self.fillers)
        if len(fillers):
            return
        disabled = filter(track_has_location, self.deferred)
        if len(disabled):
            return
        print "Location not known:", repr(trackLocation)

-Wm
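
P.S. To make the comparison problem concrete, here is a minimal sketch of
the direction I suspect is needed: decoding the byte string to Unicode
(rather than encoding it) before using it as a dict key. Everything named
here is hypothetical: as_text_key, from_xml, from_com and the podcasts
dict are illustrations, and I am assuming that the bytes really are UTF-8
and that the COM interface hands back Unicode strings. Wherever the bytes
are introduced, \xc2\xa0 is the UTF-8 encoding of U+00A0 (no-break space).
The sketch is written to run on Python 2.6+ as well as Python 3.

```python
from __future__ import print_function

import unicodedata


def as_text_key(value, encoding="utf-8"):
    """Return value as a Unicode string suitable for use as a dict key.

    Hypothetical helper: byte strings are decoded (UTF-8 is an assumption),
    Unicode strings pass through unchanged. The NFC normalization is only a
    precaution; nothing in the output above shows that it is required.
    """
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return unicodedata.normalize("NFC", value)


# The byte string from the repr() output above.
from_xml = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
# What a Unicode-returning source would presumably give for the same file.
from_com = u"Buffett Time - Annual Shareholders\u00a0L.mp3"

# Keys are stored as Unicode, so lookups from either source agree.
podcasts = {as_text_key(from_com): "track object"}
print(as_text_key(from_xml) in podcasts)  # prints True
```

If both sides are converted this way before they are compared or hashed,
the mixed byte-string/Unicode comparison behind the UnicodeWarning should
no longer occur, assuming the decoded values really do match.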