On Jul 31, 12:58 am, william tanksley <[EMAIL PROTECTED]> wrote:
> Thank you for the response. Here's some more info, including a little
> that you didn't ask me for but which might be useful.
>
> John Machin <[EMAIL PROTECTED]> wrote:
> > william tanksley <[EMAIL PROTECTED]> wrote:
> > > To ask another way: how do I convert from a file:// URL to a local
> > > path in a standard way, so that filepaths from two different sources
> > > will work the same way in a dictionary?
> > > The problems occur when the filenames have non-ascii characters in
> > > them -- I suspect that the URLs are having some encoding placed on
> > > them that Python's decoder doesn't know about.
> > # track_id = url2pathname(urlparse(track_id).path)
> > print repr(track_id)
> > parse_result = urlparse(track_id).path
> > print repr(parse_result)
> > track_id_replacement = url2pathname(parse_result)
> > print repr(track_id_replacement)
>
> The "important" value here is track_id_replacement; it contains the
> data that's throwing me. It appears that some UTF-8 characters are
> being read as multiple bytes by ElementTree rather than being decoded
> into Unicode.

Appearances can be deceptive. You present no evidence.

> Could this be a bug in ElementTree's Unicode support?

It could, yes, but the probability is extremely low.

> If
> so, can I work around it?
>
> Here's one example. The others are similar -- they have the same
> things that look like problems to me.
>
> "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
>
> Note some problems here:

Where?

> 1. This isn't Unicode; it's missing the u"" (I printed using repr).
> 2. It's got the UTF-8 bytes there in the middle.
>
> I tried doing track_id.encode("utf-8"), but it doesn't seem to make
> any difference at all.
>
> Of course, my ultimate goal is to compare the track_id to the track_id
> I get from iTunes' COM interface, including hashing to the same value
> for dict lookups.
>
> > and copy/paste the results into your next posting.
>
> In addition to the above results,

*WHAT* results? I don't see any repr() output, just your interpretation
of what you think you saw!
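
For reference, here is a minimal sketch of the decoding step under discussion. It is written in Python 3 spelling (urllib.parse), not the Python 2 used in the thread. The helper name file_url_to_key and the URL are hypothetical; only the file name comes from the post. The bytes \xc2\xa0 are the UTF-8 encoding of U+00A0 (NO-BREAK SPACE), so turning them into text is a decode, not an encode. Whether this matches what iTunes' COM interface returns is an assumption that would still need checking with repr() on both sides.

```python
from urllib.parse import urlparse, unquote

# Byte string as printed in the post: \xc2\xa0 is the UTF-8 encoding of U+00A0.
raw_name = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
decoded_name = raw_name.decode("utf-8")  # bytes -> Unicode text


def file_url_to_key(url):
    """Hypothetical helper: percent-decode the path of a file:// URL as UTF-8."""
    return unquote(urlparse(url).path, encoding="utf-8")


# Hypothetical URL; only the file name is taken from the post.
url = "file://localhost/music/Buffett%20Time%20-%20Annual%20Shareholders%C2%A0L.mp3"
key = file_url_to_key(url)

# Both sides are now Unicode text, so they compare and hash alike.
tracks = {key: "track info"}
lookup = "/music/" + decoded_name

print(repr(key))
print(lookup in tracks)
```

The sketch deliberately leaves out url2pathname, whose result depends on the platform; on Windows it would be applied to the path to get a drive-letter form before comparing.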