>>>>> "sf" == sf <[EMAIL PROTECTED]> writes:
    sf> Just started thinking about learning python. Is there any
    sf> place where I can get some free examples, especially for
    sf> following kind of problem ( it must be trivial for those using
    sf> python)

    sf> I have files A, and B each containing say 100,000 lines (each
    sf> line=one string without any space)

    sf> I want to do

    sf> " A - (A intersection B) "

    sf> Essentially, want to do efficient grep, i.e. from A remove
    sf> those lines which are also present in file B.

If you're only talking about 100K lines or so, and you have a
reasonably modern computer, you can do this all in memory. If order
doesn't matter (it probably does) you can use a set to get all the
lines in file A that are not in B

    from sets import Set
    A = Set(file('test1.dat').readlines())
    B = Set(file('test2.dat').readlines())
    # set difference: lines of A that do not appear in B
    print A-B

To preserve order, you should use a dictionary that maps lines to line
numbers. You can later use these numbers to sort

    A = dict([(line, num) for num,line in enumerate(file('test1.dat'))])
    B = dict([(line, num) for num,line in enumerate(file('test2.dat'))])
    # keep the lines of A that are not keys of B, tagged with line number
    keep = [(num, line) for line,num in A.items() if not B.has_key(line)]
    # sorting on the line number restores the original order of A
    keep.sort()
    for num, line in keep:
        print line,

Now someone else will come along and tell you all this functionality
is already in the standard library. But it's always fun to hack this
out yourself once because python makes such things so damned easy.

JDH
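
The "from A remove those lines which are also present in file B" filtering can also be sketched for Python 3, where set is a builtin and the sets module, file() and has_key() are no longer available. This is only a rough sketch, not the method from the post: instead of the dictionary-and-sort approach it loads B into a set and makes a single ordered pass over A, so lines that repeat in A are kept each time they occur. The function name lines_only_in_a and its parameters are made up for illustration.

```python
def lines_only_in_a(path_a, path_b):
    """Return the lines of path_a that do not occur in path_b, in file order.

    Hypothetical helper for illustration; assumes both files fit in memory.
    """
    with open(path_b) as fb:
        # Strip the newline so a final line without a trailing newline
        # still compares equal to the same text in the other file.
        seen_in_b = {line.rstrip("\n") for line in fb}

    kept = []
    with open(path_a) as fa:
        for line in fa:
            text = line.rstrip("\n")
            # Set membership is a hash lookup, so this is one pass over A.
            if text not in seen_in_b:
                kept.append(text)
    return kept
```

With the file names used above, lines_only_in_a('test1.dat', 'test2.dat') would return the surviving lines of test1.dat as a list, ready to print one per line.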