Steven D'Aprano wrote:
> On Sun, 07 Jan 2007 20:55:19 -0500, Dan Sommers wrote:
>
>> On Sun, 07 Jan 2007 22:23:22 +0100,
>> "Michael M." <[EMAIL PROTECTED]> wrote:
>>
>>> How to find the longest element in a list of lists?
>>> I think there should be an easier way than this:
>>
>>> s1 = ["q", "e", "d"]
>>> s2 = ["a", "b"]
>>> s3 = ["a", "b", "c", "d"]
>>
>> [ snip ]
>>
>> One more thing to think about: if your list of lists grows (i.e., if
>> you end up with thousands of lists instead of just three), then sorting
>> may not be the way to go. Assuming that list_of_lists is your list of
>> lists, then something like this:
>>
>> longest_list, longest_length = list_of_lists[ 0 ], len( list_of_lists[ 0 ] )
>> for a_list in list_of_lists[ 1 : ]:
>>     a_length = len( a_list )
>>     if a_length > longest_length:
>>         longest_list, longest_length = a_list, a_length
>>
>> will run faster than sorting the list just to pick off one element (O(n)
>> vs. O(n log n) for all of you Big-Oh notation fans out there; you know
>> who you are!).
>
> But your O(n) code is running in relatively slow Python, while the sort
> method, while O(n log n), is some of the fastest, most highly optimized C
> code out there. Unless your list is truly gigantic, chances are the sort
> version will win.
>
> Here's my timing code:
>
> import timeit
>
> def getlongest1(list_of_lists):
>     longest_list = list_of_lists[ 0 ]
>     longest_length = len( longest_list )
>     for a_list in list_of_lists[ 1 : ]:
>         a_length = len( a_list )
>         if a_length > longest_length:
>             longest_list, longest_length = a_list, a_length
>     return longest_list
>
> def getlongest2(list_of_lists):
>     list_of_lists.sort(key=len)
>     return list_of_lists[-1]
>
> def make_list_of_lists(length):
>     return [[None]*i for i in xrange(length)]
>
> t1 = timeit.Timer("getlongest1(L)", "from __main__ import getlongest1, L")
> t2 = timeit.Timer("getlongest2(L)", "from __main__ import getlongest2, L")
>
> Note that my test list_of_lists grows very big quite fast, like O(n**2).
> Assuming Python pointers are eight bytes, a mere length=10000 will
> require close to 400MB just for the pointers. More realistic data may
> allow more extensive testing.
>
> And here are my timing results:
>
>>>> L = make_list_of_lists(1)
>>>> print t1.timeit(1000), t2.timeit(1000)
> 0.00209903717041 0.00367403030396
>
>>>> L = make_list_of_lists(10)
>>>> print t1.timeit(1000), t2.timeit(1000)
> 0.00871086120605 0.00775289535522
>
>>>> L = make_list_of_lists(100)
>>>> print t1.timeit(1000), t2.timeit(1000)
> 0.121382951736 0.0518100261688
>
>>>> L = make_list_of_lists(1000)
>>>> print t1.timeit(1000), t2.timeit(1000)
> 0.809508085251 0.508343935013
>
>>>> L = make_list_of_lists(10000)
>>>> print t1.timeit(100), t2.timeit(100)
> 0.906499147415 0.732254981995
>
>>>> L = make_list_of_lists(20000)
>>>> print t1.timeit(100), t2.timeit(100)
> 1.83560800552 1.58732700348
>
> For a list of 1 item, sorting is 1.8 times SLOWER;
> For a list of 10 items, sorting is 1.1 times FASTER;
> For 100 items, sorting is 2.3 times faster;
> For 1000 items, sorting is 1.6 times faster;
> For 10,000 items, sorting is 1.2 times faster;
> For 20,000 items, sorting is 1.1 times faster.
>
> The precise results depend on the version of Python you're running, the
> amount of memory you have, other processes running, and the details of
> what's in the list you are trying to sort. But as my test shows, sort has
> some overhead that makes it a trivial amount slower for sufficiently
> small lists, but for everything else you're highly unlikely to beat it.
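One more spelling that is not benchmarked anywhere in the thread: the
built-in max() runs the same O(n) scan as the hand-written loop, but in C
(its key argument needs Python 2.5 or later). A minimal sketch using
Michael's original lists:

# Same linear scan as getlongest1(), with the loop pushed into C.
s1 = ["q", "e", "d"]
s2 = ["a", "b"]
s3 = ["a", "b", "c", "d"]

longest = max([s1, s2, s3], key=len)
print longest    # -> ['a', 'b', 'c', 'd']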
Try again with tN.timeit(1) and a second list that is random.shuffle()d
and copied to L before each measurement. Because getlongest2() sorts L
in place, every timeit() pass after the first one sees an already sorted
list, and list.sort() treats already sorted input specially (it finishes
in close to linear time).

Peter

--
http://mail.python.org/mailman/listinfo/python-list
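A minimal sketch of the kind of re-timing Peter suggests, assuming
Python 2 to match the xrange/print in the quoted code: Steven's three
functions are repeated so the script stands alone, a freshly shuffled
copy of the data is bound to L before every single-run measurement, and
the retime() helper with its repetition counts is an illustrative choice,
not something taken from the thread.

import random
import timeit

# Steven's functions, repeated here so the sketch stands alone.
def getlongest1(list_of_lists):
    longest_list = list_of_lists[0]
    longest_length = len(longest_list)
    for a_list in list_of_lists[1:]:
        a_length = len(a_list)
        if a_length > longest_length:
            longest_list, longest_length = a_list, a_length
    return longest_list

def getlongest2(list_of_lists):
    list_of_lists.sort(key=len)
    return list_of_lists[-1]

def make_list_of_lists(length):
    return [[None] * i for i in xrange(length)]

def retime(length, repetitions=100):
    # timeit's setup line re-imports L at the start of every .timeit()
    # call, so rebinding the global L between calls feeds each run fresh data.
    global L
    t1 = timeit.Timer("getlongest1(L)", "from __main__ import getlongest1, L")
    t2 = timeit.Timer("getlongest2(L)", "from __main__ import getlongest2, L")
    master = make_list_of_lists(length)
    total1 = total2 = 0.0
    for _ in xrange(repetitions):
        # Fresh shuffled copy for every pass, so getlongest2()'s in-place
        # sort never receives a list that an earlier pass already sorted.
        L = master[:]
        random.shuffle(L)
        total1 += t1.timeit(1)   # getlongest1() leaves L untouched,
        total2 += t2.timeit(1)   # so both see the same shuffled order.
    return total1, total2

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print n, retime(n)

The shuffling happens outside the timed statement and the setup line is
not part of what timeit measures, so only the getlongestN(L) calls
themselves contribute to the reported totals.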