On Thu, Dec 22, 2011 at 15:25, Stan Iverson <iversons...@gmail.com> wrote:
> On Thu, Dec 22, 2011 at 10:58 AM, Chris Angelico <ros...@gmail.com> wrote:
>
>> Firstly, are you using Python 2 or Python 3? Things will be slightly
>> different, since the default 'str' object in Py3 is Unicode.
>>
>
> 2
>
>>
>> I would guess that your page is being output as UTF-8; you may find
>> that the solution is as easy as declaring the encoding of your text
>> file when you read it in.
>>
>
> So I tried this:
>
> file = open(p + "2.txt")
> for line in file:
>     print unicode(line, 'utf-8')
>

Could you try using the 'open' function from the 'codecs' module? Note
that the encoding is the third argument, after the mode:

    import codecs

    # "utf-8" here, or whatever encoding your file is written in
    file = codecs.open(p + "2.txt", "r", "utf-8")
    for line in file:
        print line

>
> and got this error:
>
> 142 print unicode(line, 'utf-8')
> 143
> 144 print '''<br /><br /><form id="signup" action="
> http://13gems.com/Sign_Up.py" method="post" target="_blank">
> *builtin* *unicode* = <type 'unicode'>, *line* = '<span class="text">\r\n
> ' /usr/lib64/python2.4/encodings/utf_8.py in *decode*(input=<read-only
> buffer ptr 0x2b197e378454, size 21>, errors='strict') 14
> 15 def decode(input, errors='strict'):
> 16 return codecs.utf_16_decode(input, errors, True)
> 17
> 18 class StreamWriter(codecs.StreamWriter):
> *global* *codecs* = <module 'codecs' from
> '/usr/lib64/python2.4/codecs.pyc'>, codecs.*utf_16_decode* = <built-in
> function utf_16_decode>, *input* = <read-only buffer ptr 0x2b197e378454,
> size 21>, *errors* = 'strict', *builtin* *True* = True
>
> *UnicodeDecodeError*: 'utf16' codec can't decode byte 0x0a in position
> 20: truncated data
> args = ('utf16', '<span class="text">\r\n', 20, 21, 'truncated
> data')
> encoding = 'utf16'
> end = 21
> object = '<span class="text">\r\n'
> reason = 'truncated data'
> start = 20
>
> Tried it with utf-16 with same results.
>
> TIA,
>
> Stan
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
>

--
Rami Chowdhury
"Never assume malice when stupidity will suffice." -- Hanlon's Razor
+44-7581-430-517 / +1-408-597-7068 / +88-0189-245544
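
A minimal sketch of the codecs suggestion, plus a guess at why the quoted traceback says "truncated data". Assumptions: the sample bytes and the temporary file are hypothetical stand-ins for the real "2.txt", read_lines is an illustrative helper name, and the code is written for a current Python rather than the 2.4 shown in the traceback. The line that failed is 21 bytes long, and UTF-16 consumes bytes in pairs, so an odd-length input leaves one byte over at position 20.

```python
import codecs
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Return the file's lines as text, decoding while reading."""
    # codecs.open takes the mode second and the encoding third.
    f = codecs.open(path, "r", encoding)
    try:
        return [line for line in f]
    finally:
        f.close()


# Hypothetical sample data: the line from the traceback plus one
# line containing a non-ASCII character, encoded as UTF-8.
first_line_bytes = b'<span class="text">\r\n'
sample = first_line_bytes + u"caf\xe9\r\n".encode("utf-8")

# Write the sample to a temporary file standing in for "2.txt".
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
out = open(path, "wb")
out.write(sample)
out.close()

lines = read_lines(path)
os.remove(path)

# Decoding the 21-byte line as UTF-16 fails: the last byte has no partner.
reason = None
try:
    first_line_bytes.decode("utf-16")
except UnicodeDecodeError as exc:
    reason = exc.reason
```

If the file really is UTF-8, read_lines(p + "2.txt") should yield unicode objects with no further decoding step needed. If decoding still fails, the file is probably in some other encoding (Latin-1 or cp1252 are common guesses), which is worth checking before trying UTF-16 again.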