Re: deduping

2010-06-21 Thread Paul Rubin
dirknbr writes:
> done_={}
> for line in done:
>     done_[line.strip()]=0
> ...

Maybe you mean:

    done_ = set(line.strip() for line in done)
    outf_ = set(line.split(',')[1] for line in outf)
    universe = done_ & outf_  # this finds the set intersection

-- http://mail.python.org/mailman/
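The original question asks for the elements of outf's 2nd column that are not in done, which is a set difference rather than an intersection. A minimal sketch of that variant follows. The helper name `unique_not_done` and the sample data are hypothetical, and it assumes every non-blank line in outf has at least two comma-separated fields.

```python
import io

def unique_not_done(done_lines, outf_lines):
    """Return the 2nd-column values of outf_lines that do not appear in done_lines."""
    done_ = set(line.strip() for line in done_lines)
    outf_ = set(
        line.split(',')[1].strip()
        for line in outf_lines
        if line.strip()  # skip blank lines; other lines are assumed to have a 2nd column
    )
    return outf_ - done_  # set difference: in outf_ but not in done_

# Hypothetical in-memory stand-ins for the two files from the thread.
done = io.StringIO("a\nb\n")
outf = io.StringIO("1,a,x\n2,c,x\n3,c,y\n4,d,z\n")

universe = unique_not_done(done, outf)
print(sorted(universe))
```

Because the result is a set, duplicates in outf collapse automatically, but the original line order is lost.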

Re: deduping

2010-06-21 Thread Peter Otten
dirknbr wrote:
> Hi
>
> I have 2 files (done and outf), and I want to choose unique elements
> from the 2nd column in outf which are not in done. This code works but
> is not efficient, can you think of a quicker way? The a=1 is just a
> redundant task obviously, I put it this way around because I

Re: deduping

2010-06-21 Thread Dave Angel
dirknbr wrote: Hi I have 2 files (done and outf), and I want to choose unique elements from the 2nd column in outf which are not in done. This code works but is not efficient, can you think of a quicker way? The a=1 is just a redundant task obviously, I put it this way around because I think 'in'

Re: deduping

2010-06-21 Thread python
Use a set instead of a dictionary for done keys? Malcolm -- http://mail.python.org/mailman/listinfo/python-list

Re: deduping

2010-06-21 Thread Thomas Lehmann
> universe={}
> for line in outf:
>     if line.split(',')[1].strip() in universe.keys():
>         a=1
>     else:
>         if line.split(',')[1].strip() in done_.keys():
>             a=1
>         else:
>             universe[line.split(',')[1].strip()]=0
>

I can not say too much because I don
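One likely cost in the quoted loop is `in universe.keys()`: in Python 2, which was current when this thread was written, `keys()` builds a list, so each test is a linear scan, and the line is also split three times. The sketch below is one way to restructure it, not the original poster's code. It splits each line once, tests membership against sets directly, and keeps first-seen order. The function name `dedupe_in_order` and the sample data are hypothetical, and skipping lines without a 2nd column is an assumption.

```python
def dedupe_in_order(outf_lines, done_keys):
    """Return unique 2nd-column values not in done_keys, in first-seen order."""
    done_ = set(done_keys)   # hash-based membership tests
    seen = set()             # keys already emitted
    universe = []            # result list, preserves input order
    for line in outf_lines:
        fields = line.split(',')      # split once per line
        if len(fields) < 2:
            continue                  # assumption: ignore lines with no 2nd column
        key = fields[1].strip()
        if key in seen or key in done_:
            continue                  # replaces the redundant a=1 branches
        seen.add(key)
        universe.append(key)
    return universe

# Hypothetical sample data.
sample_outf = ["1,a,x", "2,c,x", "3,c,y", "4,d,z", "5,b,q"]
sample_done = ["a", "b"]
result = dedupe_in_order(sample_outf, sample_done)
print(result)
```

If order does not matter, a plain set difference is shorter; the explicit loop is mainly useful when the output should follow the order of outf.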

deduping

2010-06-21 Thread dirknbr
Hi I have 2 files (done and outf), and I want to choose unique elements from the 2nd column in outf which are not in done. This code works but is not efficient, can you think of a quicker way? The a=1 is just a redundant task obviously, I put it this way around because I think 'in' is quicker than