On 18 Dec, 00:06, John Machin <sjmac...@lexicon.net> wrote:
> On Dec 18, 3:15 am, aka <alexoploca...@gmail.com> wrote:
> Do you mean that this file was created by whatever.UnicodeWriter? If
> so, did you just now discover this information?
> How do you know that "the UnicodeWriter is functioning perfectly"?
> What does "functioning perfectly mean to you"? In particular, what
> encoding is it using?
> Which do you mean:
> (a) you typed those lines into Notepad yourself
> (b) you took a copy of a file created by whatever.UnicodeWriter,
> opened it with Notepad, trimmed off some rows and columns, and saved
> it again
> ?
> Here's a likely hypothesis: the file was written in utf16. In that
> case:
> either (i) you really want utf16 (why?), so:
> (1) the csv module will not cope with it, and is not expected to cope
> with it
> (2) the whatever.UnicodeReader should (in order of preference):
> (a) be allowed to find out for itself that 'utf16' is the go
> (b) be told explicitly that 'utf16' is the go
> (c) be served with a bug report
> OR (ii) you really want utf8, so:
> (1) the csv module should be happy
> (2) the whatever.UnicodeWriter should be told to use 'utf8'
> (3) the whatever.UnicodeReader should (in order of preference):
> [as above but s/16/8/]

The csv file was originally created by the UnicodeWriter class and was used for a mail-merge function with Microsoft Word, all of which worked perfectly. The reverse, reading the output file back in, did not work, so in the end I edited the file in Notepad and cut off some columns. I didn't realize that the encoding would survive that edit, which is why the file still caused problems. After testing from the Python command line with a csv file generated from Excel, I got it working, so it had to be the encoding.
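For anyone who runs into the same thing: a quick way to check what encoding a file really has is to look at its first bytes. The following is only a minimal sketch, assuming Python 3 and a file that starts with a byte order mark (BOM), as UTF-16 output normally does. The helper names `guess_encoding_from_bom` and `read_rows` and the sample data are made up for illustration; they are not part of the csv module or of the UnicodeWriter/UnicodeReader classes.

```python
import codecs
import csv
import io


def guess_encoding_from_bom(data):
    """Return a codec name if data starts with a known BOM, else None.

    Hypothetical helper for illustration only.
    """
    # UTF-32 must be checked first: its little-endian BOM begins with
    # the same two bytes as the UTF-16 little-endian BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None  # no BOM: could be plain UTF-8, Latin-1, ...


def read_rows(data, default="utf-8"):
    """Decode raw bytes using the BOM (or a default) and parse them as CSV."""
    encoding = guess_encoding_from_bom(data) or default
    text = data.decode(encoding)  # these codecs strip the BOM while decoding
    return list(csv.reader(io.StringIO(text, newline="")))


# Made-up sample row with Dutch characters.
sample = "naam,functie\r\nZoë,coördinator\r\n"

for codec in ("utf-16", "utf-8-sig", "utf-8"):
    raw = sample.encode(codec)
    print(codec, guess_encoding_from_bom(raw), read_rows(raw))
```

Once the bytes are decoded with the right codec, the csv module gets ordinary text and the special characters come through unchanged, whichever encoding was used on disk.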
Because the write side of my code, which uses the UnicodeWriter, was OK, I didn't pay attention to the fact that I had changed the UW class from UTF-8 to UTF-16 because of difficulties with Dutch characters like ë and ö. Then at last I tried changing back to UTF-8 and noticed that both output and input worked, including those special characters. So it was my unjustified conclusion, that I couldn't handle these special characters on the write side without UTF-16, that ultimately got me into trouble on the read side. With your help I got it straight. Once again, reducing the problem to its bare basics and avoiding big steps is the key. Thanks a lot for your help, John.

BTW, the TurboGears code is not very different from Python; it just uses some extra identifiers around the Python code.

--
http://mail.python.org/mailman/listinfo/python-list