Is this reasonable to do on 10^8 elements with repeats in the keys? I guess I should just try and see for myself.
Yeah, that's usually the right solution. I didn't comment on space/speed issues because they're so data-dependent in a situation like this, and without actually looking at your data, I doubt anyone here can even really ballpark an answer for you. And if we had your data, we'd probably just try to load it and see what happens anyway. ;)
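
If it helps, here is one rough way to "try and see" on scaled-down samples before committing to all 10^8 elements. This is only a sketch under assumptions: I'm guessing the operation is grouping values by key into a dict of lists, and the random integer keys, the `measure_grouping` name, and the sample sizes are all made up for illustration. Your real keys and values will behave differently, so treat the numbers as a trend, not a prediction.

```python
import random
import time
import tracemalloc
from collections import defaultdict


def measure_grouping(n, distinct_keys, seed=0):
    """Group n values under randomly repeated keys.

    Hypothetical stand-in for the real workload. Returns
    (elapsed_seconds, peak_traced_bytes, groups).
    """
    rng = random.Random(seed)
    tracemalloc.start()
    start = time.perf_counter()
    groups = defaultdict(list)
    for value in range(n):
        # Keys repeat because they are drawn from a small range.
        groups[rng.randrange(distinct_keys)].append(value)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak, groups


if __name__ == "__main__":
    # Grow the sample by 10x each step and watch how time and memory scale.
    for n in (10**3, 10**4, 10**5):
        elapsed, peak, _ = measure_grouping(n, distinct_keys=n // 10)
        print(f"n={n:>7}  {elapsed:.3f}s  peak={peak / 1e6:.1f} MB")
```

Note that tracemalloc slows the loop down noticeably, so the timings are pessimistic; the point is to see whether cost grows roughly linearly as the sample grows.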
Steve