Re: encoding hell - any chance of salvation ?

Terry Reedy Mon, 07 Mar 2011 12:17:01 -0800

On 3/7/2011 6:24 AM, southof40 wrote:

Hi - I've got some code which uses array
(http://docs.python.org/library/array.html) to store characters read
from a file (it's not my code; it comes from here:
http://sourceforge.net/projects/pygold/)


The read is done in GrammarReader.py like this ...

     def readString(self, maxsize=-1):
         result = array('u')
         char = None
         while True:
             if (maxsize >= 0) and (len(result) >= maxsize):
                 break
             char = self.reader.read(2)
             if (char == '') or (char == '\x00\x00'):
                 break

             print(type(char), char)  # to see what is going on

             result.append(char)
         return result.tounicode()

... and it raises the error "TypeError: array item must be unicode
character" (full stack trace at bottom).

The whole unicode thing is a bit strange because the input file is a
compiled grammar and so not a text file at all (the file can be
downloaded from here: http://kubadev.com/share/VBScript.cgt)

Can anyone make a suggestion as to the best way to allow the array
object to accept what is in essence a binary file ?

Here's the full stack trace ...

p=pygold.Parser('C:/data/Gold-Parser-VBScript-Grammar/VBScript-Test0-UTF8.cgt','utf-8')

Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "pygold\Parser.py", line 100, in __init__
     self.loadTables(filename)
   File "pygold\Parser.py", line 365, in loadTables
     reader = GrammarReader(filename, self.encoding)
   File "pygold\GrammarReader.py", line 14, in __init__
     if not self.hasValidHeader():
   File "pygold\GrammarReader.py", line 43, in hasValidHeader
     header = self.readString(64) ## read max 64 chars
   File "pygold\GrammarReader.py", line 68, in readString
     result.append(char)
TypeError: array item must be unicode character
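
The append fails because array('u') accepts only a single unicode
character per item, while read(2) hands back a two-item chunk. Here is a
minimal sketch of one way around that. It assumes the grammar file stores
each string as 2-byte little-endian UTF-16 code units ending in a null
unit (the 2-byte reads and the '\x00\x00' check in readString suggest
this, but it is an assumption), and that the reader is opened in binary
mode. The name read_utf16_string and the sample data are made up for
illustration; this is not pygold's own code.

```python
import io


def read_utf16_string(reader, maxsize=-1):
    """Read a null-terminated string of 2-byte units from a binary stream.

    Hypothetical helper: assumes little-endian UTF-16 storage.
    """
    units = bytearray()
    # maxsize counts 2-byte units; a negative value means "no limit".
    while maxsize < 0 or len(units) // 2 < maxsize:
        chunk = reader.read(2)
        # Stop at end of file, a truncated unit, or the null terminator.
        if len(chunk) < 2 or chunk == b"\x00\x00":
            break
        units += chunk
    # Decode all collected bytes in one step instead of per character.
    return units.decode("utf-16-le")


# Made-up sample data: a string, its null terminator, then other bytes.
sample = "sample header".encode("utf-16-le") + b"\x00\x00" + b"rest"
stream = io.BytesIO(sample)
header = read_utf16_string(stream, 64)
```

Collecting raw bytes and decoding once avoids array('u') altogether, so
the file can stay a binary file until a string is actually needed.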



--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
