On Sep 19, 2019, at 12:09, Richard Higginbotham <[email protected]> wrote:
>
> It might / probably? even be adapted to be used in the built in set
> operations.
How?
I’m not sure if you get how sets work, and maybe you think they’re based on a
red-black tree or skiplist or some other logarithmic collection, like C++ and
Java sets? It that were true, then implementing set.intersection as a
step-compare instead of just {x for x in self if x in y} would be an
optimization—it would replace O(nlogm) time with O(n+m). (But then such a set
would almost certainly already be using step-comparison—or, in fact, would
already be optimized it with subtree pruning, which has the same worst case but
better best case—before anyone added it as a builtin.)
But sets actually use a hash table, just like dicts do.
This means they work on objects that can’t be compared (e.g., you can throw a
bunch of complex numbers in a set). So you couldn’t possibly use an algorithm
that requires comparison to implement a set method, because it would raise a
TypeError on perfectly valid sets.
This also means that lookup is an O(1) operation, not O(logn). So, intersection
is O(m) time, which is better than your O(n+m), not worse.
It also means that insert is amortized O(1) time, so you can build a set in
O(n) time. So, if the collections aren’t prepared in advance, building a set to
intersect with an iterable is O(n+m) time, which is better than sorting two
iterables to step-compare them in O(nlogn+mlogm+n+m) as you’re proposing.
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/[email protected]/message/Q52WP2SWXAJDBP25OA6Z5WTGJXJEBZD5/
Code of Conduct: http://python.org/psf/codeofconduct/