Re: Unicode characters in btye-strings

2010-03-12 Thread Martin v. Loewis
Michael Rudolf wrote:
> On 12.03.2010 21:56, Martin v. Loewis wrote:
>> (*) If a source encoding was given, the source is actually recoded to
>> UTF-8, parsed, and then re-encoded back into the original encoding.
>
> Why is that?

Why is what? That string literals get reencoded into the source e

Re: Unicode characters in btye-strings

2010-03-12 Thread John Bokma
Michael Rudolf writes:
> On 12.03.2010 21:56, Martin v. Loewis wrote:
>> (*) If a source encoding was given, the source is actually recoded to
>> UTF-8, parsed, and then re-encoded back into the original encoding.
>
> Why is that? So "unicode"-strings (as in u"string") are not really
> unicode-

Re: Unicode characters in btye-strings

2010-03-12 Thread Michael Rudolf
On 12.03.2010 21:56, Martin v. Loewis wrote:
> (*) If a source encoding was given, the source is actually recoded to
> UTF-8, parsed, and then re-encoded back into the original encoding.

Why is that? So "unicode"-strings (as in u"string") are not really unicode-, but utf8-strings? Need citatio

Re: Unicode characters in btye-strings

2010-03-12 Thread Martin v. Loewis
>> Can somebody explain what happens when I put non-ASCII characters into a
>> non-unicode string? My guess is that the result will depend on the
>> current encoding of my terminal.
>
> Exactly right.

To elaborate on the "what happens" part: the string that gets entered is typically passed as a b
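To illustrate the point about the terminal encoding, here is a minimal sketch. It is written for Python 3 (the thread itself is about Python 2.x), and the variable names `text` and `byte_forms` are made up for the example. It shows that the same three characters turn into different byte sequences depending on which encoding is in effect, which is roughly what a Python 2 byte string ends up holding when the terminal delivers the characters:

```python
# The same three characters (e-acute, a-circumflex, A-diaeresis),
# written with escapes so the example does not depend on this file's encoding.
text = "\u00e9\u00e2\u00c4"

# Encode under a few encodings a terminal might plausibly be using.
byte_forms = {enc: text.encode(enc) for enc in ("utf-8", "latin-1", "cp1252")}

for enc, data in byte_forms.items():
    # Print the encoding name, the byte count, and the byte values.
    print(enc, len(data), list(data))
```

Under UTF-8 each of these characters takes two bytes, so the byte string has length 6; under Latin-1 or cp1252 each takes one byte, so the length is 3.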

Re: Unicode characters in btye-strings

2010-03-12 Thread Robert Kern
On 2010-03-12 06:35 AM, Steven D'Aprano wrote:
> I know this is wrong, but I'm not sure just how wrong it is, or why.
> Using Python 2.x:
>
> >>> s = "éâÄ"
> >>> print s
> éâÄ
> >>> len(s)
> 6
> >>> list(s)
> ['\xc3', '\xa9', '\xc3', '\xa2', '\xc3', '\x84']
>
> Can somebody explain what happens when I put non-ASCII characters i
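The six bytes in the quoted session look like the UTF-8 encoding of the three characters. A small sketch to check that reading, again in Python 3 rather than the Python 2.x of the original post (`raw` and `decoded` are names chosen for this example):

```python
# The six byte values copied from the list(s) output quoted above.
raw = b"\xc3\xa9\xc3\xa2\xc3\x84"

# Decoding them as UTF-8 recovers three characters, not six.
decoded = raw.decode("utf-8")

print(len(raw), len(decoded))
print([hex(ord(ch)) for ch in decoded])
```

So `len(s)` reported 6 because it counted bytes, while the decoded text has 3 code points (U+00E9, U+00E2, U+00C4).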

Re: Unicode characters, XML/RSS

2008-07-30 Thread Stefan Behnel
Adam W. wrote:
>   File "C:\Python25\lib\xml\sax\expatreader.py", line 207, in feed
>     self._parser.Parse(data, isFinal)
>   File "C:\Users\Adam\Desktop\Rev3 DL\XMLWorkspace.py", line 51, in characters
>     self.data.append(string)
> UnicodeEncodeError: 'ascii' codec can't encode character u'
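The quoted traceback is cut off, so the offending character is not known. As a hedged illustration of this class of error only, the Python 3 sketch below uses a made-up sample string (`title`) and shows that encoding non-ASCII text with the `ascii` codec fails, while an explicit encoding such as UTF-8 succeeds. In Python 2 the same failure typically appears when a unicode object is implicitly converted to a byte string:

```python
# Hypothetical sample text; the real data from the traceback is not shown.
title = "Caf\u00e9"

try:
    # The ascii codec cannot represent U+00E9, so this raises.
    title.encode("ascii")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)

# Choosing an encoding explicitly avoids the implicit ascii fallback.
encoded = title.encode("utf-8")
print(encoded)
```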

Re: Unicode characters

2006-09-04 Thread Diez B. Roggisch
Paul Johnston wrote:
> Hi
> I have a string which I convert into a list then read through it
> printing its glyph and numeric representation
>
> #-*- coding: utf-8 -*-
>
> thestring = "abcd"
> thelist = list(thestring)
>
> for c in thelist:
>     print c,
>     print ord(c)
>
> Works fine fo

Re: Unicode characters

2006-09-04 Thread limodou
On 9/4/06, Paul Johnston <[EMAIL PROTECTED]> wrote:
> Hi
> I have a string which I convert into a list then read through it
> printing its glyph and numeric representation
>
> #-*- coding: utf-8 -*-
>
> thestring = "abcd"
> thelist = list(thestring)
>
> for c in thelist:
>     print c,
>     prin