dirknbr wrote:
> Hi
>
> I have 2 files (done and outf), and I want to choose unique elements
> from the 2nd column in outf which are not in done. This code works but
> is not efficient, can you think of a quicker way? The a=1 is just a
> redundant task obviously, I put it this way around because I think
> 'in' is quicker than 'not in' - is that true?
>
> done_={}
> for line in done:
>     done_[line.strip()]=0
>
> print len(done_)
>
> universe={}
> for line in outf:
>     if line.split(',')[1].strip() in universe.keys():
>         a=1
>     else:
>         if line.split(',')[1].strip() in done_.keys():
>             a=1
>         else:
>             universe[line.split(',')[1].strip()]=0

Instead of

    if key in some_dict.keys():
        #...

which converts the keys in the dictionary to a list and then performs an
O(N) lookup on that list, you should use

    if key in some_dict:
        #...

which doesn't build a list and looks up the key in constant time.

Peter
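
Applied to the whole loop, the same idea might look like the sketch below. It is only one possible version, not the original poster's code: it is written for Python 3, it swaps the dicts with dummy 0 values for sets, and the function name unseen_second_column is made up for illustration. It takes any iterables of lines (open file objects would work). The key is split and stripped once per line rather than up to three times, and because adding to a set is idempotent, the separate check against universe is no longer needed. As far as I know, 'in' and 'not in' cost essentially the same, so the a=1 branches can simply go.

```python
def unseen_second_column(done_lines, outf_lines):
    """Collect unique second-column values of outf_lines that are not in done_lines.

    Hypothetical helper for illustration; both arguments are iterables of
    text lines, e.g. open file objects.
    """
    # Keys that were already processed; set membership is O(1) on average.
    done = {line.strip() for line in done_lines}

    universe = set()
    for line in outf_lines:
        # Split and strip once per line.
        key = line.split(',')[1].strip()
        if key not in done:
            # Adding an existing element is a no-op, so duplicates collapse.
            universe.add(key)
    return universe
```

With the two files from the question this could be called as unseen_second_column(done, outf). Like the original loop, it raises IndexError on a line that contains no comma.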