Dennis Lee Bieber wrote:
> On Thu, 29 Apr 2010 11:38:28 +0200, "Karin Lagesen"
> <karin.lage...@bio.uio.no> declaimed the following in comp.lang.python:
> 
>> Hello.
>>
>> I have approx 83 million strings, all 14 characters long. I need to be
>> able to take another string and find out whether this one is present
>> within the 83 million strings.
>>

>>
>       So don't load them into memory... First use a file-based (not memory
>
> 
>       That lets you do a binary search on the file. Much faster than a
> linear search (linear search will average out to 41.5M read operations;
> binary should be around 10000 reads)

Don't you meant 27 reads instead of 41.5 M reads?

>>> from math import log
>>> log(83e6)/log(2)
26.306608000671101
>>>


N
-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to