"John Salerno" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > John Salerno wrote: > > What is the best way of altering something (in my case, a file) while > > you are iterating over it? I've tried this before by accident and got an > > error, naturally. > > > > I'm trying to read the lines of a file and remove all the blank ones. > > One solution I tried is to open the file and use readlines(), then copy > > that list into another variable, but this doesn't seem very efficient to > > have two variables representing the file. > > > > Perhaps there's also some better to do it than this, including using > > readlines(), but I'm most interested in just how you edit something as > > you are iterating with it. > > > > Thanks. > > Slightly new question as well. here's my code: > > phonelist = open('file').readlines() > new_phonelist = phonelist > > for line in phonelist: > if line == '\n': > new_phonelist.remove(line) > > import pprint > pprint.pprint(new_phonelist) > > But I notice that there are still several lines that print out as '\n', > so why doesn't it work for all lines?

Okay, so it looks like you are moving away from modifying a list while iterating over it. In general it is good practice *not* to modify a list while iterating over it. (If you *must* do this, it is possible: iterate by index from back to front instead of front to back, so that deletions don't disturb the positions you have yet to visit.)

Your coding style is a little dated - are you using an old version of Python? This style is the old-fashioned way:

    noblanklines = []
    lines = open("filename.dat").readlines()
    for line in lines:
        if line != '\n':
            noblanklines.append(line)

1. open("xxx") is fine, and it is still the recommended way to open a file. (Older Python 2 code sometimes calls the file class directly; that spelling was removed in Python 3.)
2. The file object is itself an iterator over its lines, so there is no need to invoke readlines().
3. There is no need for such a simple for loop; a list comprehension will do the trick, or even a generator expression passed to the list constructor.

So this construct collapses down to:

    noblanklines = [line for line in open("filename.dat") if line != '\n']

Now to your question about why '\n' lines persist into your new list. The answer is - you are STILL UPDATING THE LIST YOU ARE ITERATING OVER!!! Here's your code:

    new_phonelist = phonelist

    for line in phonelist:
        if line == '\n':
            new_phonelist.remove(line)

phonelist and new_phonelist are just two names bound to the same list! If you have two consecutive '\n's in the file (say lines 3 and 4), then removing the first (line 3) shortens the list by one, so that the old line 4 becomes the new line 3. The loop then advances to position 4, which now holds the old line 5, and the second '\n' has been skipped over.

Also, don't confuse remove with del. new_phonelist.remove(line) searches new_phonelist for the first entry equal to line. We know line == '\n', so all this does is scan through new_phonelist and remove the first occurrence of '\n'.
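
To see the skipping for yourself, here is a small sketch. The function names and the four-line sample list are made up for illustration; the first function reproduces the bug, and the other two show the "iterate over a copy" and "iterate back to front" alternatives mentioned above.

```python
def remove_blank_buggy(lines):
    """Reproduce the bug: 'alias' is the same list object as 'lines'."""
    alias = lines                # no copy is made, just a second name
    for line in lines:
        if line == '\n':
            alias.remove(line)   # shrinks the list we are iterating over
    return alias


def remove_blank_with_copy(lines):
    """Iterate over a snapshot, so removals cannot disturb the loop."""
    for line in list(lines):     # list(lines) builds a separate list
        if line == '\n':
            lines.remove(line)
    return lines


def remove_blank_back_to_front(lines):
    """Delete in place by index, walking from the last item to the first."""
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == '\n':
            del lines[i]
    return lines


print(remove_blank_buggy(['a\n', '\n', '\n', 'b\n']))
# ['a\n', '\n', 'b\n']   <- the second blank line was skipped
print(remove_blank_with_copy(['a\n', '\n', '\n', 'b\n']))
# ['a\n', 'b\n']
print(remove_blank_back_to_front(['a\n', '\n', '\n', 'b\n']))
# ['a\n', 'b\n']
```

The copy version makes a second list only for the duration of the loop; the back-to-front version needs no copy at all.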

Since remove() only ever deletes the first '\n' it finds, you'd do just as well with:

    numEmptyLines = lines.count('\n')
    for i in range(numEmptyLines):
        lines.remove('\n')

You could also fold the count into the loop header:

    for i in range(lines.count('\n')):
        lines.remove('\n')

The argument to range() is evaluated only once, before the loop starts, so the two forms behave the same. Either way, every remove() call rescans the list from the front, so with many blank lines this gets slow. Talk about sucky performance! The list comprehension above makes a single pass.

You might also want to strip whitespace from your lines - I expect that while you are removing blank lines, a line composed of all spaces and/or tabs would be equally removable. Try this:

    lines = list(map(str.rstrip, open("XYZZY.DAT")))

Note that rstrip() removes the trailing '\n' as well, so a blank line now shows up as '' rather than '\n'.

-- Paul
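
P.S. A minimal sketch of the whitespace idea, combining the strip with the filter in one pass. The function name and the sample phone data are invented for illustration, and io.StringIO merely stands in for a real file opened with open(), so that the example runs on its own.

```python
import io


def drop_blank_lines(lines):
    """Return the lines that contain something other than whitespace.

    'lines' can be any iterable of strings, including an open file.
    line.strip() is '' (which is false) for empty or whitespace-only lines.
    """
    return [line for line in lines if line.strip()]


# Hypothetical file contents: one empty line and one line of spaces and a tab.
fake_file = io.StringIO("alice 555-1234\n\n   \t\nbob 555-9876\n")

print(drop_blank_lines(fake_file))
# ['alice 555-1234\n', 'bob 555-9876\n']
```

The surviving lines keep their trailing '\n', so they can be written straight back out with writelines().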