Re: Mysterious xml.sax Encoding Exception

2008-02-05 Thread JKPeck
On Feb 4, 4:09 pm, John Machin <[EMAIL PROTECTED]> wrote: > On Feb 5, 9:02 am, JKPeck <[EMAIL PROTECTED]> wrote: > > > > > On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <[EMAIL PROTECTED] > > > nomine.org> wrote: > > > -On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote: > > > > >In both of

Re: Mysterious xml.sax Encoding Exception

2008-02-04 Thread John Machin
On Feb 5, 9:02 am, JKPeck <[EMAIL PROTECTED]> wrote: > On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <[EMAIL PROTECTED] > > nomine.org> wrote: > > -On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote: > > > >In both of these cases, there are only plain, 7-bit ascii characters > > >in the xml,

Re: Mysterious xml.sax Encoding Exception

2008-02-04 Thread JKPeck
On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <[EMAIL PROTECTED] nomine.org> wrote: > -On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote: > > >In both of these cases, there are only plain, 7-bit ascii characters > >in the xml, and it really is valid utf-16 as far as I can tell. > > Did you

Re: Mysterious xml.sax Encoding Exception

2008-02-02 Thread Stefan Behnel
Hi, Peck, Jon top-posted: >> Stefan Behnel wrote: >> No. The internal representation of unicode characters is platform >> dependent, and is either 2 or 4 bytes per character. If you want UTF-16, >> use ".encode()". > > Thanks. The two users having the problem are on Windows, so I think Python >
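Stefan's advice to use ".encode()" can be illustrated with a small sketch. It assumes Python 3 (the thread dates from 2008 and concerns Python 2, where the equivalent call is on a unicode object), and the sample string is made up for illustration rather than taken from the thread.

```python
import codecs

# Hypothetical sample text; not taken from the thread.
text = "<?xml"

# Encoding explicitly yields real UTF-16 bytes, independent of how the
# interpreter happens to store the string internally.
encoded = text.encode("utf-16")

# The generic "utf-16" codec prepends a byte order mark in native byte order.
has_bom = encoded.startswith(codecs.BOM_UTF16)

# Two bytes per character in the Basic Multilingual Plane, plus two for the BOM.
expected_length = 2 + 2 * len(text)
```

Decoding the result with the same codec gives back the original text, which is a quick way to confirm that the bytes really are UTF-16.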

Re: Mysterious xml.sax Encoding Exception

2008-02-02 Thread Stefan Behnel
Peck, Jon schrieb: > Yes, the characters were from the 0-127 ascii block but encoded as utf-16, so > there is a null byte with each nonzero character. I.e., > \x00?\x00x\x00m\x00l\x00 > > Here is something weird I found while experimenting with ElementTree with > this same XML string. > > Con
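The byte pattern quoted above (a null byte paired with each ASCII character) is what little-endian UTF-16 looks like for text from the 0-127 block. A minimal sketch, assuming Python 3; the sample string and the helper name looks_like_ascii_in_utf16le are hypothetical and only serve the illustration.

```python
# Hypothetical sample: the first characters of an XML declaration.
text = "<?xml"

# "utf-16-le" fixes little-endian order and writes no byte order mark.
encoded = text.encode("utf-16-le")


def looks_like_ascii_in_utf16le(data: bytes) -> bool:
    """Return True if every second byte is zero and the others are 7-bit."""
    if len(data) % 2:
        return False
    low, high = data[0::2], data[1::2]
    return all(b < 128 for b in low) and not any(high)
```

A check like this only describes the byte layout; it does not prove that a parser will accept the data.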

RE: Mysterious xml.sax Encoding Exception

2008-02-02 Thread Peck, Jon
Unicode string, so it is actually encoded as utf-16 and as a string containing utf-16 bytes. That is u'mailto:[EMAIL PROTECTED] Sent: Saturday, February 02, 2008 12:57 AM To: JKPeck Cc: python-list@python.org Subject: Re: Mysterious xml.sax Encoding Exception -On [20080201 19:06], JKPeck ([

Re: Mysterious xml.sax Encoding Exception

2008-02-02 Thread John Machin
On Feb 2, 8:12 am, JKPeck <[EMAIL PROTECTED]> wrote: > On Feb 1, 1:51 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > > > They sent me the actual file, which was created on Windows, as an > > > email attachment. They had also sent the actual dataset from which > > > the XML was generated so

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread Jeroen Ruigrok van der Werven
-On [20080201 19:06], JKPeck ([EMAIL PROTECTED]) wrote: >In both of these cases, there are only plain, 7-bit ascii characters >in the xml, and it really is valid utf-16 as far as I can tell. Did you mean to say that the only characters they used in the UTF-16 encoded file are characters from the B

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread Martin v. Löwis
> The basic fact, though, remains, the same code works for me with the > same input but not for two particular users (out of hundreds). I see. That's mysterious. Regards, Martin

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
On Feb 1, 1:51 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > They sent me the actual file, which was created on Windows, as an > > email attachment. They had also sent the actual dataset from which > > the XML was generated so that I could generate it myself using the > > same version of o

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread Martin v. Löwis
> They sent me the actual file, which was created on Windows, as an > email attachment. They had also sent the actual dataset from which > the XML was generated so that I could generate it myself using the > same version of our app as the user has. I did that but did not get > an exception. So

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
On Feb 1, 1:22 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > In both of these cases, there are only plain, 7-bit ascii characters > > in the xml, and it really is valid utf-16 as far as I can tell. > > What do you mean by "7-bit ascii characters"? If it means what I think > it means (namely,

Re: Mysterious xml.sax Encoding Exception

2008-02-01 Thread Martin v. Löwis
> In both of these cases, there are only plain, 7-bit ascii characters > in the xml, and it really is valid utf-16 as far as I can tell. What do you mean by "7-bit ascii characters"? If it means what I think it means (namely, a sequence of bytes whose values are between 1 and 127), then it is *no

Mysterious xml.sax Encoding Exception

2008-02-01 Thread JKPeck
I have a module that uses xml.sax and feeds it a string of xml, as in xml.sax.parseString(dictfile,handler). The xml is always encoded in utf-16, and the XML string always starts with This almost always works fine, but two users of this module get an exception whatever input they use it on. (The
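For readers who want to try the call described in the original post, here is a minimal sketch. It assumes Python 3, where xml.sax.parseString accepts bytes. The TagCollector handler and the sample document are hypothetical; the real handler, declaration, and data are not shown in the thread.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# Hypothetical document; the declaration used in the thread is not known.
document = '<?xml version="1.0" encoding="UTF-16"?><root><item>a</item></root>'

# Encode to UTF-16 bytes (with a byte order mark) before parsing.
data = document.encode("utf-16")

handler = TagCollector()
xml.sax.parseString(data, handler)
```

After the call, handler.names holds the element names that were seen. This does not reproduce the exception reported by the two users; it only shows the happy path.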