"Paul McGuire" <[EMAIL PROTECTED]> writes: > My original question was in response to your post, that sort() wasn't > required but only a temp variable. I am very interested in seeing your > solution that does not require the data to be sorted. (This is not just an > academic exercise - given a large historical data set, sorting the data is > one of the costliest parts of computing the median, and I would greatly > appreciate seeing an alternative algorithm.)
There are well known algorithms for finding the median in expected O(n) time, the most straightforward of which is a modified quicksort. You do the traditional quicksort pivot step, figure out which partition the median is in, and recurse on just that partition instead of on both of them. Expected running time is n + n/2 + n/4 + ... which is approx 2n. Tarjan discovered a guaranteed O(n) algorithm in the 1970's(?) whose operation is much different and quite complex. But all of these need more than one temp var. See an algorithms book like CLRS or Knuth for more info. -- http://mail.python.org/mailman/listinfo/python-list