John Henry wrote:
> Hi list,
>
> I am sure there are many ways of doing comparison but I like to see
> what you would do if you have 2 dictionary sets (containing lots of
> data - like 20000 keys and each key contains a dozen or so of records)
> and you want to build a list of differences about these two sets.
>
> I like to end up with 3 lists: what's in A and not in B, what's in B
> and not in A, and of course, what's in both A and B.
>
> What do you think is the cleanest way to do it?  (I am sure you will
> come up with ways that astonish me :=) )
>
Paddy has already pointed out a necessary addition to your requirement
definition: common keys with different values.

Here's another possible addition: you say that "each key contains a
dozen or so of records". I presume that you mean something like this:

    a = {1: ['rec1a', 'rec1b'], 42: ['rec42a', 'rec42b']}  # "dozen" -> 2 to save typing :-)

Now what happens if the other dictionary contains:

    b = {1: ['rec1a', 'rec1b'], 42: ['rec42b', 'rec42a']}

Key 42 would be marked as different by Paddy's classification, but the
values are the same, just not in the same order. How do you want to
treat that? avalue == bvalue? sorted(avalue) == sorted(bvalue)? Oh, and
are you sure the buckets don't contain duplicates? Maybe you need
set(avalue) == set(bvalue). What about 'rec1a' vs 'Rec1a' vs 'REC1A'?

All comparisons are equal, but some comparisons are more equal than
others :-)

Cheers,
John
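
For illustration, here is a minimal sketch of how the choice of
comparison could be made pluggable. It assumes the values are lists of
strings, as in the a/b example above; the names diff_dicts, same,
exact, ignore_order, ignore_order_and_dups and ignore_case are made up
for this sketch and do not come from the thread. It is one possible
arrangement, not the definitive way to do it.

```python
def diff_dicts(a, b, same=lambda x, y: x == y):
    """Classify the keys of two dicts.

    `same` (a hypothetical hook) decides whether two values count as equal.
    Returns four sets of keys: only in a, only in b, in both with matching
    values, in both with differing values.
    """
    a_keys, b_keys = set(a), set(b)
    common = a_keys & b_keys
    equal = set(k for k in common if same(a[k], b[k]))
    return a_keys - b_keys, b_keys - a_keys, equal, common - equal


# Candidate definitions of "equal", one per question raised above.
def exact(x, y):
    return x == y                       # order and duplicates both matter

def ignore_order(x, y):
    return sorted(x) == sorted(y)       # duplicates still matter

def ignore_order_and_dups(x, y):
    return set(x) == set(y)             # records must be hashable

def ignore_case(x, y):
    # Assumes every record is a string; also ignores order.
    return sorted(s.lower() for s in x) == sorted(s.lower() for s in y)


a = {1: ['rec1a', 'rec1b'], 42: ['rec42a', 'rec42b']}
b = {1: ['rec1a', 'rec1b'], 42: ['rec42b', 'rec42a']}

# With exact comparison, key 42 lands in the "differ" set;
# with ignore_order, both keys land in the "equal" set.
strict_result = diff_dicts(a, b, exact)
loose_result = diff_dicts(a, b, ignore_order)
```

Passing the comparison in as a function keeps the key classification
(three set operations) separate from the question of what "the same
value" means, so the latter can be changed without touching the former.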