On 22/04/2013 13:51, Dave Angel wrote:
> On 04/22/2013 07:58 AM, Blind Anagram wrote:
>> I would be grateful for any advice people can offer on the fastest way
>> to count items in a sub-sequence of a large list.
>>
>> I have a list of boolean values that can contain many hundreds of
>> millions of elements for which I want to count the number of True values
>> in a sub-sequence, one from the start up to some value (say hi).
>>
>> I am currently using:
>>
>>    sieve[:hi].count(True)
>>
>> but I believe this may be costly because it copies a possibly large part
>> of the sieve.
>>
>> Ideally I would like to be able to use:
>>
>>    sieve.count(True, hi)
>>
>> where 'hi' sets the end of the count but this function is, sadly, not
>> available for lists.
>>
>> The use of a bytearray with a memoryview object instead of a list solves
>> this particular problem but it is not a solution for me as it creates
>> more problems than it solves in other aspects of the program.
>>
>> Can I assume that one possible solution would be to sub-class list and
>> create a C based extension to provide list.count(value, limit)?
>>
>> Are there any other solutions that will avoid copying a large part of
>> the list?
>>
>
> Instead of using the default slice notation, why not use
> itertools.islice() ?
>
> Something like (untested):
>
>    import itertools
>
>    it = itertools.islice(sieve, 0, hi)
>    sum(itertools.imap(bool, it))
>
> I only broke it into two lines for clarity.  It could also be:
>
>    sum(itertools.imap(bool, itertools.islice(sieve, 0, hi)))
>
> If you're using Python 3.x, say so, and I'm sure somebody can simplify
> these, since in Python 3, many functions already produce iterators
> instead of lists.
Thanks, I'll look at these ideas.  And, yes, my interest is mainly in Python 3.
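
For reference, a Python 3 spelling of the islice suggestion quoted above might look something like the sketch below. In Python 3 itertools.imap no longer exists and the built-in map is already lazy, so no copy of the list is made. The helper name count_true and the small sample sieve are made up purely for illustration, and this says nothing about how it compares in speed with the slice-and-count version on a very large list.

```python
from itertools import islice


def count_true(seq, hi):
    """Count the truthy items in seq[:hi] without building the slice.

    count_true is a hypothetical helper name, not an existing list method.
    """
    # islice walks the first hi items lazily; map(bool, ...) turns each
    # item into True/False, and sum treats True as 1 and False as 0.
    return sum(map(bool, islice(seq, hi)))


if __name__ == "__main__":
    # Tiny stand-in for the real sieve, for illustration only.
    sieve = [False, False, True, True, False, True, False, True]
    hi = 6
    # Should agree with the copying version from the original post.
    print(count_true(sieve, hi) == sieve[:hi].count(True))
```

If every element is known to be a genuine bool, sum(islice(sieve, hi)) should give the same count, since True is equal to 1.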