Re: Python parsing iTunes XML/COM

2008-08-01 Thread John Machin
On Aug 2, 10:02 am, william tanksley <[EMAIL PROTECTED]> wrote: > Given that the input file was > Unicode, You mean something like "encoded in UTF-8". Here's another reference for you to read: http://www.amk.ca/python/howto/unicode

Re: Python parsing iTunes XML/COM

2008-08-01 Thread william tanksley
John Machin <[EMAIL PROTECTED]> wrote: > william tanksley <[EMAIL PROTECTED]> wrote: > > Cool. Sorry for the misunderstanding. Thank you for helping again! > > Postscript: your request to print the actual data did the trick. > I'd back inspecting actual data against armchair philosophy any > time

Re: Python parsing iTunes XML/COM

2008-07-31 Thread John Machin
On Aug 1, 7:44 am, william tanksley <[EMAIL PROTECTED]> wrote: > John Machin <[EMAIL PROTECTED]> wrote: > > william tanksley <[EMAIL PROTECTED]> wrote: > > Let's try again: > > Cool. Sorry for the misunderstanding. Thank you for helping again! > > Postscript: your request to print the actual data d

Re: Python parsing iTunes XML/COM

2008-07-31 Thread Jerry Hill
On Thu, Jul 31, 2008 at 9:44 AM, william tanksley <[EMAIL PROTECTED]> wrote: > I'm using a file, a file that's correctly encoded as UTF-8, and it > returns some text elements that are raw bytes (undecoded). I have to > manually decode them. I can't reproduce this behavior. Here's a simple test ca

Re: Python parsing iTunes XML/COM

2008-07-31 Thread william tanksley
John Machin <[EMAIL PROTECTED]> wrote: > william tanksley <[EMAIL PROTECTED]> wrote: > Let's try again: Cool. Sorry for the misunderstanding. Thank you for helping again! Postscript: your request to print the actual data did the trick. I'm including the rest of my reply just to provide context, b

Re: Python parsing iTunes XML/COM

2008-07-31 Thread John Machin
On Jul 31, 11:54 pm, william tanksley <[EMAIL PROTECTED]> wrote: > John Machin <[EMAIL PROTECTED]> wrote: > > william tanksley <[EMAIL PROTECTED]> wrote: > > > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3" > > > 1. This isn't Unicode; it's missing the u"" (I printed using repr). > > > 2. It's g

Re: Python parsing iTunes XML/COM

2008-07-31 Thread Stefan Behnel
william tanksley wrote: > I didn't > pass a string. I passed a file. It didn't error out; instead, it > produced bytestring-encoded output (not Unicode). From my experience (and from the source code I have seen so far), ElementTree does not return UTF-8 encoded strings at the API level. Can you p

Re: Python parsing iTunes XML/COM

2008-07-31 Thread william tanksley
John Machin <[EMAIL PROTECTED]> wrote: > william tanksley <[EMAIL PROTECTED]> wrote: > > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3" > > 1. This isn't Unicode; it's missing the u"" (I printed using repr). > > 2. It's got the UTF-8 bytes there in the middle. > > In addition to the above result

Re: Python parsing iTunes XML/COM

2008-07-31 Thread william tanksley
Stefan Behnel <[EMAIL PROTECTED]> wrote: > william tanksley wrote: > > Okay, my answer is that ElementTree (in Python 2.5) is simply > > deranged when it comes to Unicode. It assumes everything's ASCII. > It does not "assume" that. It *requires* byte strings to be ASCII. You can't encode Unicode

Re: Python parsing iTunes XML/COM

2008-07-31 Thread John Machin
On Jul 31, 12:58 am, william tanksley <[EMAIL PROTECTED]> wrote: > Thank you for the response. Here's some more info, including a little > that you didn't ask me for but which might be useful. > > John Machin <[EMAIL PROTECTED]> wrote: > > william tanksley <[EMAIL PROTECTED]> wrote: > > > To ask an

Re: Python parsing iTunes XML/COM

2008-07-30 Thread Stefan Behnel
william tanksley wrote: > william tanksley <[EMAIL PROTECTED]> wrote: >> I'm still puzzled why I'm getting some non-Unicode out of an >> ElementTree's text, though. > > Now I know. > > Okay, my answer is that cElementTree (in Python 2.5) is simply > deranged when it comes to Unicode. It assumes e

Re: Python parsing iTunes XML/COM

2008-07-30 Thread william tanksley
william tanksley <[EMAIL PROTECTED]> wrote: > I'm still puzzled why I'm getting some non-Unicode out of an > ElementTree's text, though. Now I know. Okay, my answer is that cElementTree (in Python 2.5) is simply deranged when it comes to Unicode. It assumes everything's ASCII. Reference: http://

Re: Python parsing iTunes XML/COM

2008-07-30 Thread william tanksley
"Jerry Hill" <[EMAIL PROTECTED]> wrote: > On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <[EMAIL PROTECTED]> wrote: > > Awesome... Thank you! I had my mental model of Python turned around > > backwards. That's an odd feeling. Okay, so you decode to go from raw > > bytes into a given encoding, and

Re: Python parsing iTunes XML/COM

2008-07-30 Thread Jerry Hill
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <[EMAIL PROTECTED]> wrote: > Awesome... Thank you! I had my mental model of Python turned around > backwards. That's an odd feeling. Okay, so you decode to go from raw > bytes into a given encoding, and you encode to go from a given encoding > to raw

Re: Python parsing iTunes XML/COM

2008-07-30 Thread Stefan Behnel
william tanksley wrote: > Okay, so you decode to go from raw > bytes into a given encoding, and you encode to go from a given encoding > to raw bytes. No, decoding goes from a byte sequence to a Unicode string and encoding goes from a Unicode string to a byte sequence. Unicode is not an encoding.
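
The direction described in this reply can be sketched with the filename quoted elsewhere in the thread. This is only a minimal sketch, written for Python 3, where the two types are `bytes` and `str`; the thread itself concerns Python 2.5, where the corresponding types are `str` and `unicode`. The variable names are illustrative and do not come from the original posts.

```python
# The raw bytes reported in the thread: UTF-8 text containing a
# no-break space (U+00A0), which UTF-8 stores as the two bytes C2 A0.
raw = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

# decode: byte sequence -> Unicode string
text = raw.decode("utf-8")

# encode: Unicode string -> byte sequence
round_tripped = text.encode("utf-8")

print(repr(text))
print(round_tripped == raw)
```

The decoded string is one item shorter than the byte sequence, because the two UTF-8 bytes collapse into a single code point.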

Re: Python parsing iTunes XML/COM

2008-07-30 Thread william tanksley
"Jerry Hill" <[EMAIL PROTECTED]> wrote: > william tanksley <[EMAIL PROTECTED]> wrote: > > Here's one example. The others are similar -- they have the same > > things that look like problems to me. > > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3" > > I tried doing track_id.encode("utf-8"), but

Re: Python parsing iTunes XML/COM

2008-07-30 Thread Jerry Hill
On Wed, Jul 30, 2008 at 10:58 AM, william tanksley <[EMAIL PROTECTED]> wrote: > Here's one example. The others are similar -- they have the same > things that look like problems to me. > > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3" > > Note some problems here: > > 1. This isn't Unicode; it's

Re: Python parsing iTunes XML/COM

2008-07-30 Thread william tanksley
Thank you for the response. Here's some more info, including a little that you didn't ask me for but which might be useful. John Machin <[EMAIL PROTECTED]> wrote: > william tanksley <[EMAIL PROTECTED]> wrote: > > To ask another way: how do I convert from a file:// URL to a local > > path in a stan

Re: Python parsing iTunes XML/COM

2008-07-30 Thread pyshib
If you want to convert the file names which use standard URL encoding (with %20 for space, etc) use: from urllib import unquote new_filename = unquote(filename) I have found this does not convert encoded characters of the form '&#CC;' so you may have to do that manually. I think these are just as
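
A rough sketch of the two-step conversion suggested here, under some assumptions: it uses the Python 3 locations of these functions (`urllib.parse.unquote` instead of `from urllib import unquote`), and it uses `html.unescape` as one possible way to resolve numeric character references by hand. The helper name `decode_filename` and the sample string are hypothetical. An XML parser such as ElementTree normally resolves character references itself, so the second step may only be needed when the file is read as plain text.

```python
from html import unescape
from urllib.parse import unquote


def decode_filename(filename):
    """Resolve character references, then undo URL percent-encoding.

    Character references are resolved first, mirroring what an XML
    parser would do before the URL is ever seen by application code.
    """
    return unquote(unescape(filename))


# Hypothetical sample: %20 is a space, &#38; is an ampersand.
decoded = decode_filename("My%20Podcast%20&#38;%20More.mp3")
print(decoded)
```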

Re: Python parsing iTunes XML/COM

2008-07-29 Thread John Machin
On Jul 30, 3:53 am, william tanksley <[EMAIL PROTECTED]> wrote: > To ask another way: how do I convert from a file:// URL to a local > path in a standard way, so that filepaths from two different sources > will work the same way in a dictionary? > > Right now I'm using the following source: > > tra

Re: Python parsing iTunes XML/COM

2008-07-29 Thread william tanksley
To ask another way: how do I convert from a file:// URL to a local path in a standard way, so that filepaths from two different sources will work the same way in a dictionary? Right now I'm using the following source: track_id = url2pathname(urlparse(track_id).path) url2pathname is from urllib;
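
One possible shape for this conversion, sketched for Python 3 (where `urlparse` lives in `urllib.parse` and `url2pathname` in `urllib.request`). The `os.path.normpath` and `os.path.normcase` calls are an added assumption, not something from the original post: they are one way to make paths from two sources compare equal as dictionary keys on a case-insensitive file system. The function name `url_to_key` and the sample URL are hypothetical.

```python
import os.path
from urllib.parse import urlparse
from urllib.request import url2pathname


def url_to_key(track_url):
    """Convert a file:// URL to a normalized local path usable as a dict key."""
    # Same core call as in the post: take the URL's path component and
    # convert it to the local path syntax (this also undoes %XX escapes).
    local_path = url2pathname(urlparse(track_url).path)
    # Assumption: normalize separators and case so that equivalent paths
    # from different sources hash to the same key. On Windows normcase
    # lowercases the path; on POSIX it leaves it unchanged.
    return os.path.normcase(os.path.normpath(local_path))


# Hypothetical sample URL in the style iTunes writes on Windows.
key = url_to_key("file://localhost/C:/Music/Some%20Track.mp3")
print(key)
```

The same normalization would have to be applied to the filenames coming from the COM interface before they are used as keys, otherwise the two sides still will not match. The exact result of `url2pathname` depends on the platform the code runs on.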

Python parsing iTunes XML/COM

2008-07-28 Thread william tanksley
I'm trying to convert the URLs contained in iTunes' XML file into a form comparable with the filenames returned by iTunes' COM interface. I'm writing a podcast sorter in Python; I'm using iTunes under Windows right now. iTunes' COM provides most of my data input and all of my mp3/aac editing capab