On Mar 2, 10:09 am, [EMAIL PROTECTED] wrote:
> Folks,
>
> I've a Python 2.5 app running on 32 bit Win 2k SP4 (NTFS volume).
> Reading a file of 13 GBytes, one line at a time. It appears that,
> once the read line passes the 4 GByte boundary, I am getting
> occasional random line concatenations. Input file is confirmed good
> via UltraEdit. Groovy version of the same app runs fine.
>
> Any ideas?
>
> Cheers

It appears to be a bug. I am able to reproduce the problem with the
code fragment below. It creates a 12 GB file by writing lines of 0 to
126 bytes and repeating that set of lines 1,500,000 times. It fails on
W2K SP4 with both Python 2.4 and 2.5. It works correctly on Linux
(Ubuntu 6.10).

I have reported it on SourceForge as bug 1672853.

    # Read and write a huge file.
    import sys

    def write_file(end = 126, loops = 150, fname='bigfile'):
        # Each pass writes lines of length 0, 1, ..., end.
        fh = open(fname, 'w')
        buff = 'A' * end
        for k in range(loops):
            for t in range(end+1):
                fh.write(buff[:t]+'\n')
        fh.close()

    def read_file(end = 126, fname = 'bigfile'):
        fh = open(fname, 'r')
        offset = 0      # expected length of the next line
        loops = 0       # completed passes over lengths 0..end
        for rec in fh:
            if offset != len(rec.strip()):
                print 'Error at loop:', loops
                print 'Expected record length:', offset
                print 'Actual record length:', len(rec.strip())
                sys.exit(1)     # non-zero status: a mismatch was found
            offset += 1
            if offset > end:
                offset = 0
                loops += 1
                if not loops % 10000:
                    print loops     # progress report
        fh.close()

    if __name__ == '__main__':
        write_file(loops=1500000)
        read_file()

casevh
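
For anyone rerunning the fragment: the loop number it prints on failure can be turned into an approximate byte position with a little arithmetic. The sketch below is only that arithmetic. The helper names and the newline_len parameter are made up for illustration and are not part of the original fragment. It assumes one byte per newline on Linux and two on Windows, where text mode writes '\n' as '\r\n'.

```python
def cycle_bytes(end=126, newline_len=1):
    """Bytes written by one pass of the inner loop in write_file."""
    # Line t holds t 'A' characters plus the newline, for t = 0..end.
    return sum(t + newline_len for t in range(end + 1))


def loops_before_offset(byte_offset, end=126, newline_len=1):
    """Number of complete passes that fit entirely below byte_offset."""
    return byte_offset // cycle_bytes(end, newline_len)


if __name__ == '__main__':
    four_gib = 4 * 1024 ** 3
    total_loops = 1500000  # same value the fragment passes to write_file
    for label, nl in (('LF', 1), ('CRLF', 2)):
        per_cycle = cycle_bytes(newline_len=nl)
        print('%s newlines: %d bytes per pass, %d bytes in total' %
              (label, per_cycle, per_cycle * total_loops))
        print('  passes completed below 4 GiB: %d' %
              loops_before_offset(four_gib, newline_len=nl))
```

If the loop number reported by read_file is well past the figure printed for your platform, the first bad line lies beyond the 4 GByte mark described in the original report. That does not by itself show where the underlying fault is.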