Hello Pythonistas, I have a very large text file with contents like:

@INBOOK{Ackermann1999-b,
  author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann},
  year = {1980},
  timestamp = {1995-12-02}
}

I want to delete the duplicate rows, except for rows containing the
brackets { or }. The result should look like:

@INBOOK{Ackermann1999-b,
  author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann},
  year = {1980},
  timestamp = {1995-12-02}
}

I came across this Python script:

    lines_seen = set()  # holds lines already seen
    outfile = open("literatur_clean.txt", "w")
    for line in open("literatur_dupl.txt", "r"):
        if line not in lines_seen:  # not a duplicate
            outfile.write(line)
            lines_seen.add(line)
    outfile.close()

But it also deletes the lines with a closing bracket } and the lines
with the same author data. Therefore I need a condition for the brackets.
Could someone show me how to add this condition?

Thanks in advance,
Joon
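
For reference, the condition asked about above might look roughly like the following. This is only a minimal sketch, not a tested solution for the real file: it assumes that "rows containing the brackets" means any line with a literal { or } anywhere in it, and the names dedupe_lines and clean_file are made up for illustration. The file names are the ones from the script in the post.

```python
def dedupe_lines(lines):
    """Yield lines in order, dropping repeats of lines already seen.

    Assumption: any line containing '{' or '}' is always kept,
    even if an identical line appeared earlier.
    """
    seen = set()  # holds brace-free lines already written
    for line in lines:
        if "{" in line or "}" in line:
            # Bracket lines are exempt from duplicate removal.
            yield line
        elif line not in seen:
            # First occurrence of a brace-free line: keep and remember it.
            seen.add(line)
            yield line
        # Otherwise: a repeated brace-free line, so it is skipped.


def clean_file(src="literatur_dupl.txt", dst="literatur_clean.txt"):
    """Copy src to dst, applying dedupe_lines to its lines."""
    with open(src, "r") as infile, open(dst, "w") as outfile:
        outfile.writelines(dedupe_lines(infile))
```

Calling clean_file() with no arguments would read literatur_dupl.txt and write literatur_clean.txt. Note that the set of seen lines covers the whole file, so a brace-free line repeated in a later entry is dropped as well; if duplicates should only be removed within one entry, the set would need to be reset at each line starting with @.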