On Dec 21, 7:41 am, Wojciech Gryc <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Python 2.5, on Windows XP. Actually, I think you may be right about
> \x1a -- there's a few lines that definitely have some strange
> character sequences, so this would make sense... Would you happen to
> know how I can actually fix this (e.g. replace the character)? Since
> Python doesn't see the rest of the file, I don't even know how to get
> to it to fix the problem... Due to the nature of the data I'm working
> with, manual editing is also not an option.
>

Please don't top-post.

Quick hack to remove all occurrences of '\x1a' (untested):

fin = open('old_file', 'rb') # note b BINARY
fout = open('new_file', 'wb')
blksz = 1024 * 1024
while True:
    blk = fin.read(blksz)
    if not blk:
        break
    fout.write(blk.replace('\x1a', ''))
fout.close()
fin.close()

You may however want to investigate the "strange character sequences"
that have somehow appeared in your file after you built it yourself :-)

HTH,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
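
For anyone reading this later on Python 3: the same idea, sketched as a small function. This is only an illustration, not a tested tool. The name strip_byte and its parameters are made up for this example, and the demo file names are placeholders. It relies on binary mode so that Windows never treats \x1a (Ctrl-Z) as end-of-file, and it only handles a single-byte target, since a longer sequence could straddle two blocks and be missed.

```python
import os
import tempfile


def strip_byte(src_path, dst_path, bad=b"\x1a", block_size=1024 * 1024):
    """Copy src_path to dst_path, dropping every occurrence of `bad`.

    Hypothetical helper for illustration. Returns the number of bytes removed.
    """
    if len(bad) != 1:
        # A multi-byte pattern could be split across two blocks.
        raise ValueError("bad must be exactly one byte")
    removed = 0
    # Binary mode on both ends: no EOF or newline translation happens.
    with open(src_path, "rb") as fin, open(dst_path, "wb") as fout:
        while True:
            blk = fin.read(block_size)
            if not blk:
                break
            cleaned = blk.replace(bad, b"")
            removed += len(blk) - len(cleaned)
            fout.write(cleaned)
    return removed


if __name__ == "__main__":
    # Small self-contained demo using a throwaway directory.
    workdir = tempfile.mkdtemp()
    old_file = os.path.join(workdir, "old_file")
    new_file = os.path.join(workdir, "new_file")
    with open(old_file, "wb") as f:
        f.write(b"line one\r\nbad\x1aline\r\nline three\r\n")
    count = strip_byte(old_file, new_file)
    print("removed", count, "byte(s)")
```

Writing to a separate output file, as in the hack above, keeps the original intact in case the "strange character sequences" turn out to matter.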