On Thu, Dec 12, 2013 at 6:25 PM, Robert Voigtländer <r.voigtlaen...@gmail.com> wrote:
> I need to find a -performant- way to transform this into a list with tuples
> (a[0],[a[0][1]min],[a[0][1]max]).
>
> Hard to explain what I mean .. [0] of the first three tuples is 52. [1] is
> 193,193 and 192.
> What I need as result for these three tuples is: (52,192,193).
>
> For the next five tuples it is (51,188,193).
>
>
> Extra challenges:
> - This list is sorted. For performance reasons I would like to keep it
> unsorted.
> - There may be tuples where min=max.
> - There may be tuples where [0] only exists once. So min is automatically max

Yep, I see what you mean! Apart from the first of the challenges, which is
ambiguous: do you mean you'd rather be able to work with it unsorted, or is
that a typo, "keep it sorted"?

This is a common aggregation task. Your list is of (key, value) tuples, and
you want to do some per-key statistics. Here are three variants on the code:

    # Fastest version, depends on the keys being already grouped
    # and the values sorted in descending order within each group
    # (as your sample appears to be). It actually returns the last
    # and first value of each group, not the smallest and largest.
    def min_max_1(lst):
        prev_key = None
        for key, value in lst:
            if key != prev_key:
                if prev_key is not None:
                    yield prev_key, prev_value, key_max
                prev_key = key
                key_max = value  # first value seen for this key
            prev_value = value   # last value seen so far for this key
        if prev_key is not None:
            yield prev_key, prev_value, key_max

    # This version depends on the keys being grouped, but
    # not on the values being sorted within the groups.
    def min_max_2(lst):
        prev_key = None
        for key, value in lst:
            if key != prev_key:
                if prev_key is not None:
                    yield prev_key, key_min, key_max
                prev_key = key
                key_min = key_max = value
            else:
                key_min = min(key_min, value)
                key_max = max(key_max, value)
        if prev_key is not None:
            yield prev_key, key_min, key_max

    # Slowest version, does not depend on either the keys
    # or the values being sorted. Will iterate over the entire
    # list before producing any results. Returns tuples in
    # dict order (arbitrary on older Pythons, first-seen order on
    # 3.7+), unlike the others (which retain the input order).
    def min_max_3(lst):
        data = {}
        for key, value in lst:
            if key not in data:
                # A list, not a tuple, so it can be updated in place.
                data[key] = [value, value]
            else:
                data[key][0] = min(data[key][0], value)
                data[key][1] = max(data[key][1], value)
        for key, minmax in data.items():
            yield key, minmax[0], minmax[1]

Each of these is a generator that yields (key, min, max) tuples. The first
two use None as the "no key yet" marker, so they assume None is never a key.
The third one needs the most memory and execution time, since it has to hold
every key in a dict before it can yield anything; the others simply take the
input as it comes and yield each group as soon as it ends. None of them
actually requires that the input be a list - any iterable will do.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
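
P.S. For comparison, here is a minimal sketch of the "keys already grouped,
values in any order" case written with itertools.groupby from the standard
library. The function name min_max_grouped and the sample data are made up for
illustration; the values for key 51 in particular are invented so that they
match the (51,188,193) result quoted above, since the original list was not
posted in full. It collects each group's values into a temporary list, so it
trades a little memory per group for brevity.

```python
from itertools import groupby
from operator import itemgetter


def min_max_grouped(pairs):
    """Yield (key, min, max) for each run of consecutive equal keys."""
    # groupby only merges *adjacent* items with the same key, so the
    # input must already be grouped by key (it need not be sorted).
    for key, group in groupby(pairs, key=itemgetter(0)):
        values = [value for _, value in group]
        yield key, min(values), max(values)


# Hypothetical sample: three tuples for key 52, five for key 51.
sample = [(52, 193), (52, 193), (52, 192),
          (51, 193), (51, 192), (51, 191), (51, 190), (51, 188)]

print(list(min_max_grouped(sample)))
# prints [(52, 192, 193), (51, 188, 193)]
```

If the keys are not grouped, the same key will show up in more than one
output tuple, so in that case either sort the input first or use a
dict-based approach.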