david jensen wrote: > Hi all, > > I'm trying to find a good way of doing the following: > > Each n-tuple in combinations( range( 2 ** m ), n ) has a corresponding > value n-tuple (call them "scores" for clarity later). I'm currently > storing them in a dictionary, by doing: > > ---- > res={} > for i in itertools.combinations( range( 2**m ) , n): > res[ i ] = getValues( i ) # getValues() is computationally > expensive > ---- > > For each (n-1)-tuple, I need to find the two numbers that have the > highest scores versus them. I know this isn't crystal clear, but > hopefully an example will help: with m=n=3: > > Looking at only the (1, 3) case, assuming: > getValues( (1, 2, 3) ) == ( -200, 125, 75 ) # this contains the > highest "other" score, where 2 scores 125 > getValues( (1, 3, 4) ) == ( 50, -50, 0 ) > getValues( (1, 3, 5) ) == ( 25, 300, -325 ) > getValues( (1, 3, 6) ) == ( -100, 0, 100 ) # this contains the > second-highest, where 6 scores 100 > getValues( (1, 3, 7) ) == ( 80, -90, 10 ) > getValues( (1, 3, 8) ) == ( 10, -5, -5 ) > > I'd like to return ( (2, 125), (6, 100) ). > > The most obvious (to me) way to do this would be not to generate the > res dictionary at the beginning, but just to go through each > combinations( range( 2**m), n-1) and try every possibility... this > will test each combination n times, however, and generating those > values is expensive. [e.g. (1,2,3)'s scores will be generated when > finding the best possibilities for (1,2), (1,3) and (2,3)] > > What I'm doing now is ugly, and i think is where i'm confusing myself: > > ---- > best2={} > for i in itertools.combinations( range( 2**m), n-1): > scorelist=[] > for j in range( 2**m ): > if j not in i: > k=list(i) > k.append(j) > k=tuple(sorted(k)) #gets the key for looking up the > scores in res > scorelist.append((j,res[k][k.index(j)])) > best2[i]=sorted(scorelist,key=lambda x: -x[1])[:2] > ---- > > Am I missing an obviously better way?
After some tinkering I came up with def calculate(getvalues): best2 = defaultdict(list) for t in combinations(range(2**m), n): values = getvalues(t) for i, k in enumerate(t): best2[t[:i] + t[i+1:]].append((k, values[i])) return dict((k, nlargest(2, v, key=itemgetter(1))) for k, v in best2.iteritems()) which is concise but slower than your code with Raymond's improvements. I then tried inlining the nlargest() operation: def calculate_inlined(getvalues): best2 = {} for t in combinations(range(2**m), n): values = getvalues(t) for i, k in enumerate(t): key = t[:i] + t[i+1:] value = values[i] if key not in best2: best2[key] = [(k, value), (None, None)] else: a, b = best2[key] if value > a[1]: best2[key] = [(k, value), a] elif value > b[1]: best2[key] = [a, (k, value)] return best2 This gives a speed boost (I measured with m=n=5) but is both messy and brittle. I've got a hunch that there is a better way; I just don't see it at the moment... Peter -- http://mail.python.org/mailman/listinfo/python-list