On Sun, 07 Jan 2007 20:55:19 -0500, Dan Sommers wrote:

> On Sun, 07 Jan 2007 22:23:22 +0100,
> "Michael M." <[EMAIL PROTECTED]> wrote:
>
>> How do I find the longest list in a list of lists?
>> I think there should be an easier way than this:
>
>>   s1 = ["q", "e", "d"]
>>   s2 = ["a", "b"]
>>   s3 = ["a", "b", "c", "d"]
>
> [ snip ]
>
> One more thing to think about: if your list of lists grows (i.e., if
> you end up with thousands of lists instead of just three), then
> sorting may not be the way to go. Assuming that list_of_lists is your
> list of lists, then something like this:
>
>     longest_list = list_of_lists[0]
>     longest_length = len(longest_list)
>     for a_list in list_of_lists[1:]:
>         a_length = len(a_list)
>         if a_length > longest_length:
>             longest_list, longest_length = a_list, a_length
>
> will run faster than sorting the list just to pick off one element
> (O(n) vs. O(n log n) for all of you Big-Oh notation fans out there;
> you know who you are!).
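[Editor's aside: the hand-rolled scan quoted above can also be written as a one-liner with the built-in max() and a key function (available since Python 2.5), which does the same O(n) pass but with the loop in C:]

```python
# max() with key=len does the O(n) scan in C; on a tie it returns
# the first of the longest lists.
list_of_lists = [["q", "e", "d"], ["a", "b"], ["a", "b", "c", "d"]]
longest = max(list_of_lists, key=len)
print(longest)  # ['a', 'b', 'c', 'd']
```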
But your O(n) code is running in relatively slow Python, while the sort
method, while O(n log n), is some of the fastest, most highly optimized
C code out there. Unless your list is truly gigantic, chances are the
sort version will win. Here's my timing code:

import timeit

def getlongest1(list_of_lists):
    longest_list = list_of_lists[0]
    longest_length = len(longest_list)
    for a_list in list_of_lists[1:]:
        a_length = len(a_list)
        if a_length > longest_length:
            longest_list, longest_length = a_list, a_length
    return longest_list

def getlongest2(list_of_lists):
    # sort ascending by length, so the longest list ends up last
    list_of_lists.sort(key=len)
    return list_of_lists[-1]

def make_list_of_lists(length):
    return [[None]*i for i in xrange(length)]

t1 = timeit.Timer("getlongest1(L)", "from __main__ import getlongest1, L")
t2 = timeit.Timer("getlongest2(L)", "from __main__ import getlongest2, L")

Note that my test list_of_lists grows very big quite fast, like O(n**2).
Assuming Python pointers are eight bytes, a mere length=10000 will
require over 380MB just for the pointers. More realistic data may allow
more extensive testing.
And here are my timing results:

>>> L = make_list_of_lists(1)
>>> print t1.timeit(1000), t2.timeit(1000)
0.00209903717041 0.00367403030396
>>> L = make_list_of_lists(10)
>>> print t1.timeit(1000), t2.timeit(1000)
0.00871086120605 0.00775289535522
>>> L = make_list_of_lists(100)
>>> print t1.timeit(1000), t2.timeit(1000)
0.121382951736 0.0518100261688
>>> L = make_list_of_lists(1000)
>>> print t1.timeit(1000), t2.timeit(1000)
0.809508085251 0.508343935013
>>> L = make_list_of_lists(10000)
>>> print t1.timeit(100), t2.timeit(100)
0.906499147415 0.732254981995
>>> L = make_list_of_lists(20000)
>>> print t1.timeit(100), t2.timeit(100)
1.83560800552 1.58732700348

For a list of 1 item, sorting is 1.8 times SLOWER;
for a list of 10 items, sorting is 1.1 times FASTER;
for 100 items, sorting is 2.3 times faster;
for 1000 items, sorting is 1.6 times faster;
for 10,000 items, sorting is 1.2 times faster;
for 20,000 items, sorting is 1.1 times faster.

The precise results depend on the version of Python you're running, the
amount of memory you have, other processes running, and the details of
what's in the list you are trying to sort. But as my test shows, sort
has some overhead that makes it a trivial amount slower for sufficiently
small lists, but for everything else you're highly unlikely to beat it.

-- 
Steven.
-- 
http://mail.python.org/mailman/listinfo/python-list
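[Editor's aside: for readers who want to repeat the experiment today, here is a sketch of the same benchmark modernized for Python 3 (range and the print function instead of xrange and the print statement; the function names are my own). Absolute timings are machine-dependent, so the sketch also cross-checks that all three approaches agree on the answer:]

```python
import timeit

def getlongest_scan(list_of_lists):
    # O(n) linear scan in pure Python, as in Dan's version.
    longest_list = list_of_lists[0]
    for a_list in list_of_lists[1:]:
        if len(a_list) > len(longest_list):
            longest_list = a_list
    return longest_list

def getlongest_sort(list_of_lists):
    # O(n log n), but the sort runs in optimized C; the longest
    # list is last after an ascending sort by length.
    return sorted(list_of_lists, key=len)[-1]

def getlongest_max(list_of_lists):
    # O(n) with the loop in C: usually the best of both worlds.
    return max(list_of_lists, key=len)

L = [[None] * i for i in range(100)]

# All three must find the same (unique) longest list.
assert getlongest_scan(L) == getlongest_sort(L) == getlongest_max(L)

for fn in (getlongest_scan, getlongest_sort, getlongest_max):
    t = timeit.timeit(lambda: fn(L), number=1000)
    print("%s: %.4fs" % (fn.__name__, t))
```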