On 23/04/2013 00:28, Steven D'Aprano wrote:
> On Mon, 22 Apr 2013 22:25:50 +0100, Blind Anagram wrote:
>
>> I have looked at solutions based on listing primes and here I have found
>> that they are very much slower than my existing solution when the sieve
>> is not large (which is the majority use case).
>
> Yes. This is hardly surprising. Algorithms suitable for dealing with the
> first million primes are not suitable for dealing with the first trillion
> primes, and vice versa. We like to pretend that computer programming is
> an abstraction, and for small enough data we often can get away with
> that, but like all abstractions eventually it breaks and the cost of
> dealing with real hardware becomes significant.
>
> But I must ask, given that the primes are so widely distributed, why are
> you storing them in a list instead of a sparse array (i.e. a dict)? There
> are 50,847,534 primes less than or equal to 1,000,000,000, so you are
> storing roughly 18 False values for every True value. That ratio will
> only get bigger. With a billion entries, you are using 18 times more
> memory than necessary.
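To make the trade-off Steven describes concrete before I answer: a dense sieve keeps one flag for every candidate, while a sparse structure keeps only the primes themselves. A rough, untested sketch of the two representations (the function names here are mine, not from my class):

    def sieve_flags(limit):
        # Dense representation: one boolean per candidate, so memory
        # grows with the sieve limit no matter how sparse the primes are.
        flags = [True] * (limit + 1)
        flags[0] = flags[1] = False
        for i in range(2, int(limit ** 0.5) + 1):
            if flags[i]:
                # Knock out multiples of i, starting at i*i.
                flags[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
        return flags

    def prime_set(limit):
        # Sparse representation: store only the primes, e.g. the
        # 50,847,534 primes below 10**9 rather than 10**9 flags.
        return {i for i, f in enumerate(sieve_flags(limit)) if f}

Back of the envelope, at a 1-in-18 density the sparse form should come out smaller, though by how much depends on the per-entry overheads of list slots versus set entries in a given Python build.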
Because the majority use case for my Prime class is a sieve that is not large. I am just pushing the envelope for a minority use case, so that the class still works, albeit inefficiently, for huge sieves.

I accept that it is inefficient. But the fact remains that I can produce a sieve that yields and counts a billion primes in a reasonable time, yet it fails when I wish to count over only part of the sieve, because for lack of a list.count(value, limit) function this can double the memory requirement. I would not dream of doing this job by copying a slice in any other language I have used, so I was simply asking whether the copy could be avoided while staying with the simple sieve design (a sketch of the kind of workaround I have in mind follows below).

Thank you for your input.

Brian

--
http://mail.python.org/mailman/listinfo/python-list
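For the record, here is the kind of workaround I have been considering. It is a rough, untested sketch, and count_true and flags are my own names, not part of the class: itertools.islice walks the list without materialising a slice, and a bytearray-based sieve gets a ranged count for free, since bytearray.count() accepts start/end positions (and, from Python 3.3, an integer to count).

    import itertools

    def count_true(flags, start, stop):
        # Count True entries in flags[start:stop] without copying the
        # slice.  islice skips the first `start` items and stops at
        # `stop`, so extra memory is O(1), although the scan still has
        # to step over the skipped prefix.
        return sum(itertools.islice(flags, start, stop))

    # Alternative: store the sieve as a bytearray of 0/1 flags rather
    # than a list of bools.  bytearray.count() takes optional start and
    # end arguments, so the ranged count needs no copy at all and runs
    # at C speed:
    #
    #     sieve = bytearray(...)          # 1 marks a prime
    #     n = sieve.count(1, start, stop)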