Hi,

I wrote some code to compare two files. One is the base file; the other is a file I got from somewhere else. I need to compare this file against the base. For example, the base file contains:

    abc
    def
    ghi

and the other file contains:

    abc
    def
    ghi
    jkl

After the compare, the base file is overwritten with "jkl". Both files also tend to grow beyond 20 MB. Here is my code, using difflib:

    import difflib
    import re

    pat = re.compile(r'^\+')  # I want to get rid of the '+' from the difflib output

    def difference(filename, basename):
        with open(basename) as base:
            a = base.readlines()
        with open(filename) as infile:
            b = infile.readlines()
        d = difflib.Differ()
        diff = list(d.compare(a, b))
        if len(diff) > 0:
            # write a new base file containing only the added lines
            with open(basename, "w") as o:
                for i in diff:
                    if pat.search(i):
                        o.write(i[2:])  # drop the "+ " prefix that Differ adds
        with open(basename) as g:
            return g.readlines()

Whenever the two files get very large, the comparison becomes very slow. Is there any good advice for speeding things up? I thought of dropping readlines() and comparing line by line instead. Is that a better way?

Thanks
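
For comparison, here is a rough sketch of one possible alternative to difflib. It assumes that the only thing needed is the set of lines in the other file that appear nowhere in the base, and that diff-style alignment (line order, repeated lines) does not matter. The function name new_lines is made up for illustration; it is not part of difflib.

```python
def new_lines(filename, basename):
    """Return the lines of `filename` that do not occur anywhere in `basename`.

    Hypothetical helper: this is a set-membership check, not a real diff,
    so a line that merely moved to another position is not reported.
    """
    with open(basename) as base:
        seen = set(base)  # one pass over the base file
    with open(filename) as other:
        # stream the second file and keep only lines missing from the base
        return [line for line in other if line not in seen]
```

Each file is read once and every lookup is a hash check, so the work grows roughly linearly with file size, whereas Differ does sequence matching over both line lists. The result could then be written over the base file with writelines() if that behaviour is still wanted.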