Here's a traceback that's not helping:

    Traceback (most recent call last):
      File "InfoCompaniesHouse.py", line 255, in <module>
        main()
      File "InfoCompaniesHouse.py", line 251, in main
        loader.dofile(infile) # load this file
      File "InfoCompaniesHouse.py", line 213, in dofile
        self.dofilezip(infilename) # do ZIP file
      File "InfoCompaniesHouse.py", line 198, in dofilezip
        self.dofilecsv(infile, infd) # as a CSV file
      File "InfoCompaniesHouse.py", line 182, in dofilecsv
        for fields in reader : # read entire CSV file
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 14: ordinal not in range(128)
This is weird, because "for fields in reader" isn't directly encoding anything. That happens further down somewhere, and the backtrace didn't tell me where.

The program is converting some .CSV files that come packaged in .ZIP files. The files are big, so rather than expanding them, they're read directly from the ZIP files and processed through the ZIP and CSV modules.

Here's the code that's causing the error above:

    decoder = codecs.getreader('utf-8')
    with decoder(infdraw,errors="replace") as infd :
        with codecs.open(outfilename, encoding='utf-8', mode='w') as outfd :
            headerline = infd.readline()
            self.doheaderline(headerline)
            reader = csv.reader(infd, delimiter=',', quotechar='"')
            for fields in reader :
                pass

Normally, the "pass" is a call to something that uses the data, but for test purposes, I put a "pass" in there. It still fails. With that "pass", nothing is ever written to the output file, and no "encoding" should be taking place.

"infdraw" is a stream from the zip module, created like this:

    with inzip.open(zipelt.filename,"r") as infd :
        self.dofilecsv(infile, infd)

This works for data records that are pure ASCII, but as soon as some non-ASCII character comes through, it fails.

Where is the error being generated? I'm not seeing any place where there's a conversion to ASCII. Not even a print.

John Nagle
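
For comparison, here is a minimal sketch of the same ZIP-to-CSV pipeline. It assumes Python 3, where csv.reader accepts decoded text directly. The code above is Python 2, whose csv module is documented as not supporting Unicode input; that mismatch may be where the implicit ASCII encode comes from, but that is a guess, not a confirmed diagnosis. The names read_csv_from_zip, make_sample_zip and sample.csv are made up for the example.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_bytes, member):
    """Return (header, rows) for one CSV member of a ZIP, decoded as UTF-8."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as inzip:
        with inzip.open(member, "r") as infdraw:
            # Wrap the raw byte stream so csv.reader sees text, not bytes.
            with io.TextIOWrapper(infdraw, encoding="utf-8",
                                  errors="replace", newline="") as infd:
                reader = csv.reader(infd, delimiter=",", quotechar='"')
                header = next(reader)   # first record is the header line
                rows = list(reader)     # fine for a demo; iterate for big files
    return header, rows


def make_sample_zip():
    """Build a tiny in-memory ZIP with one CSV containing a pound sign."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as outzip:
        data = "name,amount\nAcme Ltd,\u00a3100\n".encode("utf-8")
        outzip.writestr("sample.csv", data)
    return buf.getvalue()


header, rows = read_csv_from_zip(make_sample_zip(), "sample.csv")
```

Here the decoding happens once, in the text wrapper around the ZIP member, so the CSV reader never has to convert between bytes and text on its own.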