On Wed, 2007-10-10 at 16:03 -0500, Robert Dailey wrote:
> I've tried everything to make the original CSV module work. It just
> doesn't. I've tried UTF-16 encoding
What do you mean, "tried"? Don't you know what the file is encoded in?

> (which works fine with codecs.open()) but when I pass in the file
> object returned from codecs.open() into csv.reader(), the call to
> reader.next() fails because it says something isn't in the range of
> range(128) or something (Not really an expert on Unicode so I'm not
> sure of the meaning). I would use CSV if I could!

That's because the codec-file object feeds it decoded Unicode strings,
but the CSV module wants to work with encoded octet strings, so it tries
to encode each Unicode string with the default codec. The default codec
is ASCII, which can't represent characters with code points greater
than 127.

Instead of passing the file object directly to the csv parser, pass in a
generator that reads from the file and explicitly encodes the strings
into UTF-8, along these lines:

    def encode_to_utf8(f):
        for line in f:
            yield line.encode("utf-8")

There may be a fundamental problem with this approach that I can't
foresee at the moment, but it's worth a try when your alternative is to
build a Unicode-aware CSV parser from scratch.

Hope this helps,

-- 
Carsten Haese
http://informixdb.sourceforge.net
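
For what it's worth, here is a slightly fuller sketch of the generator idea above. It is only an illustration under stated assumptions: `decode_row` is a hypothetical helper that is not part of the csv module, the sample data is made up, and an `io.StringIO` object stands in for the file returned by codecs.open().

```python
import io


def encode_to_utf8(lines):
    """Yield each decoded (Unicode) line re-encoded as a UTF-8 byte string."""
    for line in lines:
        yield line.encode("utf-8")


def decode_row(row, encoding="utf-8"):
    """Hypothetical helper: decode the byte-string cells of one parsed row."""
    return [cell.decode(encoding) for cell in row]


# Stand-in for the object returned by codecs.open(); the data is made up.
decoded_file = io.StringIO(u"name,city\nJos\u00e9,K\u00f6ln\n")

# Each Unicode line becomes a UTF-8 byte string, so nothing downstream
# has to fall back on the ASCII default codec.
encoded_lines = list(encode_to_utf8(decoded_file))

# Cells that come back as UTF-8 bytes can be decoded again afterwards.
cells = decode_row([b"Jos\xc3\xa9", b"K\xc3\xb6ln"])
```

On the Python version discussed in this thread, the generator would be handed to csv.reader() in place of the file object, and each row it returns would then be passed through something like `decode_row` to get Unicode strings back. The round trip through UTF-8 is safe for the parser because UTF-8 never reuses the ASCII bytes for commas, quotes, or newlines inside a multi-byte character.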