David Eppstein <[EMAIL PROTECTED]> writes: > In article <[EMAIL PROTECTED]>, > "Xah Lee" <[EMAIL PROTECTED]> wrote: > >> parti(aList, equalFunc) >> >> given a list aList of n elements, we want to return a list that is a >> range of numbers from 1 to n, partition by the predicate function of >> equivalence equalFunc. (a predicate function is a function that >> takes two arguments, and returns either True or False.) > > In Python it is much more natural to use ranges from 0 to n-1. > In the worst case, this is going to have to take quadratic time > (consider an equalFunc that always returns false) so we might as well do > something really simple rather than trying to be clever.
As you say, with the spec as it stands, you can't do better than quadratic time (although it's O(n*m) where m is the number of partitions, rather than O(n^2)). You can do a lot better if you can use a "key" function, rather than an "equivalence" function, much as list.sort has a "key" argument, and itertools.groupby (which is pretty close in function to this partitioning problem) uses a key argument. In fact, I'd have difficulty thinking of an example where I'd want a partition function as specified, in Python. In Perl, it makes a lot of sense, as Perl's array indexing operations lend themselves to slinging round lists of indices like this. But in Python, I'd be far more likely to use list.sort followed by itertools.groupby - sort is stable (so doesn't alter the relative order within equivalence classes), and groupby then picks out the equivalence classes: >>> elements = [['x', 'x', 'x', '1'], ... ['x', 'x', 'x', '2'], ... ['x', 'x', 'x', '2'], ... ['x', 'x', 'x', '2'], ... ['x', 'x', 'x', '3'], ... ['x', 'x', 'x', '4'], ... ['x', 'x', 'x', '5'], ... ['x', 'x', 'x', '5']] >>> # No need to sort here, as the elements are already sorted! >>> from pprint import pprint >>> pprint([(k, list(v)) for k, v in groupby(elements, itemgetter(3))]) [('1', [['x', 'x', 'x', '1']]), ('2', [['x', 'x', 'x', '2'], ['x', 'x', 'x', '2'], ['x', 'x', 'x', '2']]), ('3', [['x', 'x', 'x', '3']]), ('4', [['x', 'x', 'x', '4']]), ('5', [['x', 'x', 'x', '5'], ['x', 'x', 'x', '5']])] If you avoid the sort, the whole thing is highly memory efficient, as well, because by using iterators, we don't ever take a copy of the original list. Having cleverly redefined the question so that it fits the answer I wanted to give, I'll shut up now :-) Paul. -- To attain knowledge, add things every day; to attain wisdom, remove things every day. -- Lao-Tse -- http://mail.python.org/mailman/listinfo/python-list