[EMAIL PROTECTED]: > My code follows, and it seems a bit clumsy - is there a cleaner way to do it?
The code doesn't look bad. I can suggest few things: - When you have "paragraphs" of code that do something definite then the comment before them can be written without indentation, to denote it regards the whole little block. Or you can even call a subfunction (a function inside another function/method) - Finding good names for variables is important. So this code: > d = {} #identify unique instances of x and y > for xx,yy in zip(x,y): > if xx in d: > d[xx].append(yy) > else: > d[xx] = [yy] May become (untested): > # identify unique instances of x and y > d = {} > for xx, yy in zip(x, y): > if xx in d: > d[xx].append(yy) > else: > d[xx] = [yy] Or better: > # identify unique instances of seqx and seqx > d = defaultdict(list) > for x, y in izip(seqx, seqx): > d[x].append(y) Or even something like: > def unique_x(seqx, seqx): > """identify unique instances of seqx and seqx""" > result = defaultdict(list) > for x, y in izip(seqx, seqx): > result[x].append(y) > return result > ... > d = unique_x(x, y) The following line is too much heavy, it looks like a misuse of the inline if syntax: > y1.append( (yy[l//2-1] + yy[l//2])/2.0 if l % 2 == 0 else > yy[l//2] ) It was not added to python for that long lines. This looks better to me: l = len(yy) if l == 1: y_median = yy[0]) else: yy.sort() if l & 1: y_median = yy[l // 2] else: y_median = yy[l // 2 - 1] + yy[l // 2]) / 2.0 y1.append(y_median) But even better is to use a stronger library code, for example a median() able to run in O(n), and a more strong unique(), like: http://code.activestate.com/recipes/438599/ http://code.activestate.com/recipes/466330/ But using a stronger unique() this time isn't useful, so I like this compromise: # identify unique instances of seqx and seqx uniques_x = defaultdict(list) for x, y in izip(seqx, seqx): uniques_x[x].append(y) result_x = [] result_y = [] for x, ys in uniques_x.iteritems(): result_x.append(x) result_y.append(median(ys)) If you want to use the last bit of semantics of CPython you can even write that code like this: # identify unique instances of seqx and seqx uniques_x = defaultdict(list) for x, y in izip(seqx, seqx): uniques_x[x].append(y) result_x = uniques_x.keys() result_y = map(median, uniques_x.itervalues()) I think it works because keys and values are given in the same order, but in real code I tend to avoid using such subtle things. Because if you translate that code to another language, or you use another Python implementation it may not work anymore, and lot of code sooner or later becomes translated... Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list