On 2 May, 17:29, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> On 2 May 2007 09:19:25 -0700, [EMAIL PROTECTED] wrote:
> >The code:
> >
> >import codecs
> >
> >udlASCII = file("c:\\temp\\CSVDB.udl",'r')
> >udlUNI = codecs.open("c:\\temp\\CSVDB2.udl",'w',"utf_16")
> >
> >udlUNI.write(udlASCII.read())
> >
> >udlUNI.close()
> >udlASCII.close()
> >
> >This doesn't seem to generate the correct line endings. Instead of
> >converting 0x0D/0x0A to 0x0D/0x00/0x0A/0x00, it leaves it as
> >0x0D/0x0A
> >
> >I have tried various 2 byte unicode encodings but it doesn't seem to
> >make a difference. I have also tried modifying the code to read and
> >convert a line at a time, but that didn't make any difference either.
> >
> >I have tried to understand the unicode docs but nothing seems to
> >indicate why a seemingly incorrect conversion is being done.
> >Obviously I am missing something blindingly obvious here, any help
> >much appreciated.
>
> Consider this simple example:
>
> >>> import codecs
> >>> f = codecs.open('test-newlines-file', 'w', 'utf16')
> >>> f.write('\r\n')
> >>> f.close()
> >>> f = file('test-newlines-file')
> >>> f.read()
> '\xff\xfe\r\x00\n\x00'
> >>>
>
> And how it differs from your example. Are you sure you're examining
> the resulting output properly?
>
> By the way, "\r\0\n\0" isn't a "unicode line ending", it's just the UTF-16
> encoding of "\r\n".
>
> Jean-Paul

I am not sure what you are driving at here, since I started with an
ASCII file, whereas you just write a Unicode file to start with. I
guess the direct question is "Is there a simple way to convert my ASCII
file to a UTF-16 file?" I thought either string.encode() or writing to
a UTF-16 file would do the trick, but it probably isn't that simple!

To get the hex values, I used a binary file editor that I have used a
great deal for all sorts of things.

Dom
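
For reference, the conversion asked about above can be sketched as
follows. This is only a minimal sketch under stated assumptions: it
uses Python 3's built-in open() rather than the file() and codecs.open()
calls from the thread, the helper name ascii_to_utf16 and the sample
file names are hypothetical, and the sample content stands in for the
real .udl file. Passing newline="" is intended to keep "\r\n" from being
translated on the way in or out, so each character should come out as
two bytes after the byte-order mark.

```python
import os
import tempfile


def ascii_to_utf16(src_path, dst_path):
    """Re-encode an ASCII text file as UTF-16 (hypothetical helper)."""
    # newline="" turns off newline translation, so "\r\n" is read and
    # written exactly as stored rather than being altered in text mode.
    with open(src_path, "r", encoding="ascii", newline="") as src:
        text = src.read()
    # The "utf-16" codec writes a byte-order mark, then two bytes per
    # character in the platform's native byte order.
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)


# Demonstration on throwaway files (assumed names, not the poster's paths).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "sample_ascii.udl")
dst_path = os.path.join(workdir, "sample_utf16.udl")

# Write the sample in binary mode so the CR/LF bytes are exact.
with open(src_path, "wb") as f:
    f.write(b"[oledb]\r\n")

ascii_to_utf16(src_path, dst_path)

# Read the result back as raw bytes to inspect what was written.
with open(dst_path, "rb") as f:
    raw = f.read()

print(repr(raw))
print(raw.decode("utf-16") == "[oledb]\r\n")
```

Reading the output back in binary mode, as in the last step, is one way
to check the bytes without relying on a hex editor.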