On 04/22/2013 07:58 AM, Blind Anagram wrote:
I would be grateful for any advice people can offer on the fastest way
to count items in a sub-sequence of a large list.

I have a list of boolean values that can contain many hundreds of
millions of elements for which I want to count the number of True values
in a sub-sequence, one from the start up to some value (say hi).

I am currently using:

    sieve[:hi].count(True)

but I believe this may be costly because it copies a possibly large part
of the sieve.

Ideally I would like to be able to use:

    sieve.count(True, hi)

where 'hi' sets the end of the count but this function is, sadly, not
available for lists.

The use of a bytearray with a memoryview object instead of a list solves
this particular problem but it is not a solution for me as it creates
more problems than it solves in other aspects of the program.

Can I assume that one possible solution would be to sub-class list and
create a C-based extension to provide list.count(value, limit)?

Are there any other solutions that will avoid copying a large part of
the list?


Instead of using the default slice notation, why not use itertools.islice()?

Something like (untested):

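    import itertools

    # Python 2: islice walks the list lazily, so no slice copy is made
    it = itertools.islice(sieve, 0, hi)
    # each True contributes 1 to the sum
    sum(itertools.imap(bool, it))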

I only broke it into two lines for clarity.  It could also be:

    sum(itertools.imap(bool, itertools.islice(sieve, 0, hi)))

If you're using Python 3.x, say so, and I'm sure somebody can simplify these, since in Python 3 many built-ins (map(), for example) already return iterators instead of lists.
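
For what it's worth, here is a rough Python 3 sketch of the same idea. The helper name count_true is made up for illustration, and it assumes the sieve holds only True/False values (bool is a subclass of int, so sum() can add them directly). It avoids the copy, but it still loops over the items one at a time, so it is worth timing against the slice version before relying on it:

```python
from itertools import islice


def count_true(sieve, hi):
    """Count the True values in sieve[:hi] without building the slice."""
    # islice yields the first hi items lazily; no new list is created.
    # Each True adds 1 to the sum and each False adds 0.
    return sum(islice(sieve, hi))


sieve = [False, False, True, True, False, True, False, True]
print(count_true(sieve, 5))  # same answer as sieve[:5].count(True)
```

If the list could ever hold values other than True and False, sum(1 for x in islice(sieve, hi) if x is True) would be the safer spelling.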


--
DaveA
