Hi, Python 2.5, on Windows XP. Actually, I think you may be right about \x1a -- there are a few lines that definitely have some strange character sequences, so this would make sense... Would you happen to know how I can actually fix this (e.g. replace the character)? Since Python doesn't see the rest of the file, I don't even know how to get to it to fix the problem... Due to the nature of the data I'm working with, manual editing is also not an option.
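
For what it's worth, here is a minimal sketch of one way this might be approached, assuming the stray bytes really are '\x1a' (Ctrl-Z). Opening the file in binary mode ("rb") should keep Windows from treating Ctrl-Z as end-of-file, so the whole file can be read and copied with the offending byte replaced. The function name strip_ctrl_z, the file names in the usage lines, and the choice of a space as the replacement are all hypothetical, picked only for illustration. It avoids the with statement and bytes literals in the hope that it also runs on older interpreters, but it has not been tried on 2.5.

```python
# Build the Ctrl-Z byte without a b"" literal (assumption: this keeps the
# sketch usable on both old and new Python versions).
CTRL_Z = "\x1a".encode("ascii")


def strip_ctrl_z(src_path, dst_path, replacement=None, chunk_size=1 << 20):
    """Copy src_path to dst_path, replacing every Ctrl-Z byte.

    Returns the number of Ctrl-Z bytes that were replaced.
    """
    if replacement is None:
        replacement = " ".encode("ascii")  # hypothetical default: one space
    count = 0
    # Binary mode: the bytes are read as-is, with no end-of-file or
    # newline translation applied.
    src = open(src_path, "rb")
    try:
        dst = open(dst_path, "wb")
        try:
            while True:
                # Read in chunks so an 800 MB file never sits in memory whole.
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                count += chunk.count(CTRL_Z)
                dst.write(chunk.replace(CTRL_Z, replacement))
        finally:
            dst.close()
    finally:
        src.close()
    return count


# Hypothetical usage (file names are placeholders):
# replaced = strip_ctrl_z("data.txt", "data_clean.txt")
# print(replaced)
```

Because only a single byte is being replaced, chunk boundaries cannot split a match, so the chunked copy should be safe. The cleaned copy could then be read line by line in the usual way.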

Thanks,
Wojciech

On Dec 20, 3:30 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Dec 21, 6:48 am, Wojciech Gryc <[EMAIL PROTECTED]> wrote:
>
> > Hi,
>
> > I'm currently using Python to deal with a fairly large text file (800
> > MB), which I know has about 85,000 lines of text. I can confirm this
> > because (1) I built the file myself, and (2) running a basic Java
> > program to count lines yields a number in that range.
>
> > However, when I use Python's various methods -- readline(),
> > readlines(), or xreadlines() and loop through the lines of the file,
> > the line program exits at 16,000 lines. No error output or anything --
> > it seems the end of the loop was reached, and the code was executed
> > successfully.
>
> > I'm baffled and confused, and would be grateful for any advice as to
> > what I'm doing wrong, or why this may be happening.
>
> What platform, what version of python?
>
> One possibility: you are running this on Windows and the file contains
> Ctrl-Z aka chr(26) aka '\x1a'.
--
http://mail.python.org/mailman/listinfo/python-list