On 2007-08-01, at 06:50, Astan Chee wrote:

Hi,
I have a dictionary which looks something like this:

elem = {"co":[1,2,3],"cp":[3,2,1],"pp":[5,6,7],"cl":[1,2,3],"qw": [6,7,8],"qa":[8,7,6]}

what Im trying to do is find all keys in the list that have the same value and delete those (except one); thereby removing all redundant keys so that the above dict will look like this:

elem = {"co":[1,2,3],"pp":[5,6,7],"qa":[8,7,6]}

my code so far looks like this:

for out1 in elem.keys():
    for out2 in elem.keys():
        if out1 != out2:
            elem[out1].sort()
            elem[out2].sort()
            if elem[out1] == elem[out2]:
                del elem[out1]

This returns in a KeyError exception when sorting. What am I missing here?

Consider a dict {'a': [1], 'b': [1], 'c': [1]}.

First time we reach the innerloop, out1 == 'a'. Then we iterate over the list ['a', 'b', 'c'].

'a' is skipped. At 'b', we remove 'a' from the dictionary. And then it all goes wrong.

The last time we go through the inner loop, out1 is still 'a' -- an element which does not exist in the dictionary anymore. Then your attempts to reference it fail at both the sorting and comparison stages. You should break out of the inner loop immediately after you've deleted the item or delete out2 instead..

Furthermore, the algorithm is somewhat inefficient due to the fact that it sorts the lists O(n^2) times when you only need to do them n times. If your data allows it, you might like to use sets instead of lists. This would avoid the need to sort the lists.

However, here is a very quick version that works in my tests:
for out_list in elem.values():
    out_list.sort()

for out1 in elem.keys():
    for out2 in elem.keys():
        if out1 != out2 and elem[out1] == elem[out2]:
            del elem[out1]
            break

There are probably several ways to improve on that, however.

--
[ [EMAIL PROTECTED] <*> Antti Rasinen ]

"Pay the man his wage, he's making paper to fuel the Information Age."


-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to