Peter Otten wrote:
> I'd preprocess the rows as I tend to prefer the simplest approach I can come
> up with. Example:
>
> def recode_rows(rows, source_encoding, target_encoding):
>     def recode(field):
>         if isinstance(field, unicode):
>             return field.encode(target_encoding)
>         elif isinstance(field, str):
>             return unicode(field, source_encoding).encode(target_encoding)
>         return unicode(field).encode(target_encoding)
>
>     return (map(recode, row) for row in rows)
>
For this case isinstance really seems quite reasonable. And it was silly of
me not to think of sys.stdout as the file object for the example!

> rows = [[1.23], [u"äöü"], [u"ÄÖÜ".encode("latin1")], [1, 2, 3]]
> writer = csv.writer(sys.stdout)
> writer.writerows(recode_rows(rows, "latin1", "utf-8"))
>
> The only limitation I can see: target_encoding probably has to be a superset
> of ASCII.

Coping with umlauts and accents is quite enough for me.

This problem really goes away with Python 3 (I tried it on another machine),
but something else changes too: in Python 2.6 the documentation for the csv
module explicitly says "If csvfile is a file object, it must be opened with
the ‘b’ flag on platforms where that makes a difference." The documentation
for Python 3.1 doesn't have this sentence, and if I open the file with the
‘b’ flag in Python 3.1, I get the following error for all sorts of data, even
for a list containing only one integer literal:

TypeError: must be bytes or buffer, not str

I don't really understand that.

Regards,
Sibylle
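
P.S. For anyone who wants to reproduce this, here is a minimal sketch. It rests on the assumption that the csv writer in Python 3 passes str objects to the target's write() method, so a binary-mode target rejects them while a text-mode target accepts them. io.StringIO, io.BytesIO and io.TextIOWrapper stand in for real files here, and the sample rows are made up for illustration. The exact wording of the TypeError differs between Python 3 versions.

```python
import csv
import io

# Illustrative sample data (not from the original thread).
rows = [[1.23], ["äöü"], [1, 2, 3]]

# Text-mode target: csv.writer hands str to write(), which is accepted.
text_target = io.StringIO(newline="")
csv.writer(text_target).writerows(rows)
text_output = text_target.getvalue()

# Binary-mode target: write() receives str and raises TypeError,
# even for a row containing a single integer.
binary_target = io.BytesIO()
try:
    csv.writer(binary_target).writerow([1])
    binary_error = None
except TypeError as exc:
    binary_error = exc

# To end up with encoded bytes, let a text layer do the encoding.
raw = io.BytesIO()
wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="")
csv.writer(wrapper).writerows(rows)
wrapper.flush()
encoded = raw.getvalue()

print(repr(text_output))
print(type(binary_error).__name__)
print(encoded)
```

With a real file, the text-mode equivalent would presumably be open(path, "w", newline="", encoding="utf-8"), where path is a placeholder name.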