On 12/05/2006 6:11 PM, Ravi Teja wrote:
> [EMAIL PROTECTED] wrote:
>> hi
>> say i have a text file
>>
>> line1
[snip]
>> line6
>> abc
>> line8 <---to be delete
[snip]
>> line13 <---to be delete
>> xyz
>> line15
[snip]
>> line18
>>
>> I wish to delete lines that are in between 'abc' and 'xyz' and print
>> the rest of the lines. Which is the best way to do it? Should i get
>> everything into a list, get the index of abc and xyz, then pop the
>> elements out? or any other better methods?
>> thanks
>
> In other words ...
> lines = open('test.txt').readlines()
> for line in lines[lines.index('abc\n') + 1:lines.index('xyz\n')]:
>     lines.remove(line)

I don't think that's what you really meant.

>>> lines = ['blah', 'fubar', 'abc\n', 'blah', 'fubar', 'xyz\n', 'xyzzy']
>>> for line in lines[lines.index('abc\n') + 1:lines.index('xyz\n')]:
...     lines.remove(line)
...
>>> lines
['abc\n', 'blah', 'fubar', 'xyz\n', 'xyzzy']

Uh-oh.

Try this:

>>> lines = ['blah', 'fubar', 'abc\n', 'blah', 'fubar', 'xyz\n', 'xyzzy']
>>> del lines[lines.index('abc\n') + 1:lines.index('xyz\n')]
>>> lines
['blah', 'fubar', 'abc\n', 'xyz\n', 'xyzzy']
>>>

Of course wrapping it in try/except would be a good idea, not for the
slicing, which behaves itself and does nothing if the 'abc\n' appears
AFTER the 'xyz\n', but for the index() in case the sought markers aren't
there. Perhaps it might be a good idea even to do it carefully one piece
at a time: is the abc there? is the xyz there? is the xyz after the abc?
If so, then del lines[index1+1:index2].

I wonder what the OP wants to happen in a case like this:

guff1
xyz
guff2
abc
guff2
xyz
guff3

or this:

guff1
abc
guff2
abc
guff2
xyz
guff3

> for line in lines:
>     print line,
>
> Regular expressions are better in this case

Famous last words.

> import re
> pat = re.compile('abc\n.*?xyz\n', re.DOTALL)
> print re.sub(pat, '', open('test.txt').read())
>

I don't think you really meant that either.

>>> lines = ['blah', 'fubar', 'abc\n', 'blah', 'fubar', 'xyz\n', 'xyzzy']
>>> linestr = "".join(lines)
>>> linestr
'blahfubarabc\nblahfubarxyz\nxyzzy'
>>> import re
>>> pat = re.compile('abc\n.*?xyz\n', re.DOTALL)
>>> print re.sub(pat, '', linestr)
blahfubarxyzzy
>>>

Uh-oh.

Try this:

>>> pat = re.compile('(?<=abc\n).*?(?=xyz\n)', re.DOTALL)
>>> re.sub(pat, '', linestr)
'blahfubarabc\nxyz\nxyzzy'

... and I can't imagine why you're using the confusing [IMHO]
undocumented [AFAICT] feature that the first arg of the module-level
functions like sub and friends can be a compiled regular expression
object.
Why not use this:

>>> pat.sub('', linestr)
'blahfubarabc\nxyz\nxyzzy'
>>>

One-liner fanboys might prefer this:

>>> re.sub('(?s)(?<=abc\n).*?(?=xyz\n)', '', linestr)
'blahfubarabc\nxyz\nxyzzy'
>>>

HTH,
John
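
For what it's worth, here is a minimal sketch of the careful, one-piece-at-a-time approach described above: check that both markers are present, check their order, and only then delete the slice. The function name delete_between, its parameters, and the decision to return a copy are assumptions made for illustration, not anything from the thread. It also takes one particular position on the ambiguous inputs: it looks for the end marker only after the first start marker, which may or may not be what the OP wants.

```python
def delete_between(lines, start='abc\n', end='xyz\n'):
    """Return a copy of lines with everything strictly between the first
    start marker and the next end marker removed. The markers are kept.

    If either marker is missing (or no end marker follows the start
    marker), the copy is returned unchanged.
    """
    result = list(lines)  # work on a copy; the caller's list is untouched
    try:
        first = result.index(start)
        # Search for the end marker only after the start marker, so an
        # earlier stray end marker is ignored (an assumption about intent).
        last = result.index(end, first + 1)
    except ValueError:
        # index() raises ValueError when a marker is not found.
        return result
    del result[first + 1:last]
    return result


sample = ['blah', 'fubar', 'abc\n', 'blah', 'fubar', 'xyz\n', 'xyzzy']
cleaned = delete_between(sample)
```

With the sample list from the sessions above, cleaned is ['blah', 'fubar', 'abc\n', 'xyz\n', 'xyzzy'], the same result as the del version, and a list with a missing marker comes back unchanged instead of raising ValueError.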