On Aug 2, 10:02 am, william tanksley <[EMAIL PROTECTED]> wrote:
> Given that the input file was
> Unicode,
You mean something like "encoded in UTF-8".
Here's another reference for you to read: http://www.amk.ca/python/howto/unicode
John Machin <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > Cool. Sorry for the misunderstanding. Thank you for helping again!
> > Postscript: your request to print the actual data did the trick.
> I'd back inspecting actual data against armchair philosophy any
> time
On Aug 1, 7:44 am, william tanksley <[EMAIL PROTECTED]> wrote:
> John Machin <[EMAIL PROTECTED]> wrote:
> > william tanksley <[EMAIL PROTECTED]> wrote:
> > Let's try again:
>
> Cool. Sorry for the misunderstanding. Thank you for helping again!
>
> Postscript: your request to print the actual data d
On Thu, Jul 31, 2008 at 9:44 AM, william tanksley <[EMAIL PROTECTED]> wrote:
> I'm using a file, a file that's correctly encoded as UTF-8, and it
> returns some text elements that are raw bytes (undecoded). I have to
> manually decode them.
I can't reproduce this behavior. Here's a simple test ca
John Machin <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> Let's try again:
Cool. Sorry for the misunderstanding. Thank you for helping again!
Postscript: your request to print the actual data did the trick. I'm
including the rest of my reply just to provide context, b
On Jul 31, 11:54 pm, william tanksley <[EMAIL PROTECTED]> wrote:
> John Machin <[EMAIL PROTECTED]> wrote:
> > william tanksley <[EMAIL PROTECTED]> wrote:
> > > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
> > > 1. This isn't Unicode; it's missing the u"" (I printed using repr).
> > > 2. It's g
william tanksley wrote:
> I didn't
> pass a string. I passed a file. It didn't error out; instead, it
> produced bytestring-encoded output (not Unicode).
From my experience (and from the source code I have seen so far), ElementTree
does not return UTF-8 encoded strings at the API level. Can you p
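One way to check a claim like this is to parse a small UTF-8 document and look at repr() of the text that comes back. The sketch below is an assumption-laden illustration, not the test case anyone in the thread posted: the element names and the XML are made up, and it is written for Python 3, where str is always Unicode, so it does not reproduce the Python 2.5 behaviour under discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, encoded as UTF-8 bytes. The file name contains a
# no-break space (U+00A0), which UTF-8 stores as the two bytes C2 A0.
xml_bytes = (
    "<track><name>Buffett Time - Annual Shareholders\u00a0L.mp3</name></track>"
).encode("utf-8")

root = ET.fromstring(xml_bytes)
name = root.find("name").text

# repr() shows the actual data: one U+00A0 character, not the raw bytes.
print(type(name), repr(name))
```

If the text came back undecoded, the repr would show the two bytes \xc2\xa0 rather than a single \xa0 character.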
John Machin <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
> > 1. This isn't Unicode; it's missing the u"" (I printed using repr).
> > 2. It's got the UTF-8 bytes there in the middle.
> > In addition to the above result
Stefan Behnel <[EMAIL PROTECTED]> wrote:
> william tanksley wrote:
> > Okay, my answer is that ElementTree (in Python 2.5) is simply
> > deranged when it comes to Unicode. It assumes everything's ASCII.
> It does not "assume" that. It *requires* byte strings to be ASCII.
You can't encode Unicode
On Jul 31, 12:58 am, william tanksley <[EMAIL PROTECTED]> wrote:
> Thank you for the response. Here's some more info, including a little
> that you didn't ask me for but which might be useful.
>
> John Machin <[EMAIL PROTECTED]> wrote:
> > william tanksley <[EMAIL PROTECTED]> wrote:
> > > To ask an
william tanksley wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
>> I'm still puzzled why I'm getting some non-Unicode out of an
>> ElementTree's text, though.
>
> Now I know.
>
> Okay, my answer is that cElementTree (in Python 2.5) is simply
> deranged when it comes to Unicode. It assumes e
william tanksley <[EMAIL PROTECTED]> wrote:
> I'm still puzzled why I'm getting some non-Unicode out of an
> ElementTree's text, though.
Now I know.
Okay, my answer is that cElementTree (in Python 2.5) is simply
deranged when it comes to Unicode. It assumes everything's ASCII.
Reference: http://
"Jerry Hill" <[EMAIL PROTECTED]> wrote:
> On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <[EMAIL PROTECTED]> wrote:
> > Awesome... Thank you! I had my mental model of Python turned around
> > backwards. That's an odd feeling. Okay, so you decode to go from raw
> > bytes into a given encoding, and
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley <[EMAIL PROTECTED]> wrote:
> Awesome... Thank you! I had my mental model of Python turned around
> backwards. That's an odd feeling. Okay, so you decode to go from raw
> bytes into a given encoding, and you encode to go from a given encoding
> to raw
william tanksley wrote:
> Okay, so you decode to go from raw
> bytes into a given encoding, and you encode to go from a given encoding
> to raw bytes.
No, decoding goes from a byte sequence to a Unicode string and encoding goes
from a Unicode string to a byte sequence.
Unicode is not an encoding.
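A minimal sketch of that direction rule, using the file name quoted elsewhere in this thread. It is written in Python 3 syntax (b"..." for bytes, plain "..." for Unicode); in Python 2.5 the same two types were spelled "..." and u"...". The variable names are only for illustration.

```python
# Bytes as they might be read from a UTF-8 encoded file.
raw = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

# decode: byte sequence -> Unicode string
text = raw.decode("utf-8")

# encode: Unicode string -> byte sequence
round_trip = text.encode("utf-8")

print(repr(text))              # contains a single '\xa0' character
print(len(raw), len(text))     # the byte form is one unit longer
print(round_trip == raw)
```

The two bytes \xc2\xa0 collapse into one character (U+00A0, no-break space) on decoding, which is why the byte string is one unit longer than the Unicode string.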
"Jerry Hill" <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > Here's one example. The others are similar -- they have the same
> > things that look like problems to me.
> > "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
> > I tried doing track_id.encode("utf-8"), but
On Wed, Jul 30, 2008 at 10:58 AM, william tanksley
<[EMAIL PROTECTED]> wrote:
> Here's one example. The others are similar -- they have the same
> things that look like problems to me.
>
> "Buffett Time - Annual Shareholders\xc2\xa0L.mp3"
>
> Note some problems here:
>
> 1. This isn't Unicode; it's
Thank you for the response. Here's some more info, including a little
that you didn't ask me for but which might be useful.
John Machin <[EMAIL PROTECTED]> wrote:
> william tanksley <[EMAIL PROTECTED]> wrote:
> > To ask another way: how do I convert from a file:// URL to a local
> > path in a stan
If you want to convert the file names which use standard URL encoding
(with %20 for space, etc) use:
from urllib import unquote
new_filename = unquote(filename)
I have found this does not convert encoded characters of the form
'CC;' so you may have to do that manually. I think these are just
as
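For reference, a sketch of the unquote call in Python 3, where the function lives in urllib.parse rather than urllib. The sample file name is hypothetical. In Python 3, unquote decodes the percent-escaped bytes as UTF-8 by default, so the result is a Unicode string.

```python
from urllib.parse import unquote  # Python 2: from urllib import unquote

# Hypothetical percent-encoded name; %20 is a space and %C2%A0 is the
# UTF-8 encoding of a no-break space (U+00A0).
filename = "Buffett%20Time%20-%20Annual%20Shareholders%C2%A0L.mp3"

new_filename = unquote(filename)
print(repr(new_filename))
```

unquote only handles %XX escapes; any other escaping scheme in the input would still need separate treatment.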
On Jul 30, 3:53 am, william tanksley <[EMAIL PROTECTED]> wrote:
> To ask another way: how do I convert from a file:// URL to a local
> path in a standard way, so that filepaths from two different sources
> will work the same way in a dictionary?
>
> Right now I'm using the following source:
>
> tra
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?
Right now I'm using the following source:
track_id = url2pathname(urlparse(track_id).path)
url2pathname is from urllib;
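A self-contained sketch of the same conversion, assuming Python 3 module locations (urllib.parse and urllib.request; in Python 2.5 both functions came from urlparse and urllib). The helper name and the sample URL are made up for illustration. The exact path returned depends on the platform: on Windows url2pathname also produces a drive letter and backslashes.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def url_to_local_path(file_url):
    """Return a local path for a file:// URL, with %XX escapes decoded."""
    # urlparse splits off the scheme and host; url2pathname converts the
    # remaining path component to the local platform's path syntax.
    return url2pathname(urlparse(file_url).path)


# Hypothetical URL in the style of an iTunes library entry.
track_url = "file://localhost/C:/Music/Buffett%20Time.mp3"
track_path = url_to_local_path(track_url)
print(repr(track_path))
```

For use as a dictionary key, paths from both sources would likely also need the same case and separator normalization, for example with os.path.normcase, before they compare equal.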
I'm trying to convert the URLs contained in iTunes' XML file into a
form comparable with the filenames returned by iTunes' COM interface.
I'm writing a podcast sorter in Python; I'm using iTunes under Windows
right now. iTunes' COM provides most of my data input and all of my
mp3/aac editing capab