Re: MultiSearcher to overcome the Integer.MAX_VALUE limit

Mark Miller Sat, 08 Mar 2008 10:02:23 -0800

Generating random text can often be pretty slow when done per word.

I think you will have to modify the MultiSearcher a bit. The MultiSearcher takes a global id space and converts to and from an individual Searcher id space. The MultiSearcher's id space is limited to an int as well, but I think if you change it to a float/double, you should be all set.
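
A minimal sketch of the global-to-local id conversion described above. GlobalIdMapper and its methods are hypothetical names, not part of Lucene or of MultiSearcher itself. The sketch assumes a 64-bit long for the global id, since an integer type keeps every id exact; that choice is an assumption of the sketch.

```java
/**
 * Hypothetical helper (not a Lucene class): maps a 64-bit global document id
 * to a (sub-searcher index, local int id) pair and back.
 */
class GlobalIdMapper {
    // starts[i] is the global id of the first document of sub-searcher i;
    // starts[n] holds the total document count across all sub-searchers.
    private final long[] starts;

    GlobalIdMapper(int[] maxDocs) {
        starts = new long[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    long totalDocs() {
        return starts[starts.length - 1];
    }

    /** Converts a sub-searcher's local id to the global id space. */
    long toGlobal(int searcher, int localId) {
        return starts[searcher] + localId;
    }

    /** Returns the index of the sub-searcher that owns the given global id. */
    int searcherIndex(long globalId) {
        if (globalId < 0 || globalId >= totalDocs()) {
            throw new IllegalArgumentException("id out of range: " + globalId);
        }
        // Linear scan from the end; a binary search would also work.
        int i = starts.length - 2;
        while (starts[i] > globalId) {
            i--;
        }
        return i;
    }

    /** Converts a global id to the owning sub-searcher's local id. */
    int toLocal(long globalId) {
        return (int) (globalId - starts[searcherIndex(globalId)]);
    }

    public static void main(String[] args) {
        GlobalIdMapper m = new GlobalIdMapper(
                new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE, 10});
        long id = m.toGlobal(1, 5);
        System.out.println(id + " -> searcher " + m.searcherIndex(id)
                + ", local id " + m.toLocal(id));
    }
}
```

The per-searcher ids stay ints; only the combined id space is widened.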


- Mark

Toke Eskildsen wrote:

On Fri, 2008-03-07 at 00:03 +0100, Ray wrote:

I am currently running a small random text indexer with 400 docs/second.
It will reach 2 billion in around 45 days.


If you are just doing it to test large indexes (in terms of document
count), then you need to look into your index-generation code. I tried
making an ultra-simple index builder, where each document contains a
unique id and one of nine fixed strings. The index-building speed on my
desktop computer is 40,000 documents/second (tested with 100 million
documents).

I would suspect that your random text generator is where all the
time-intensive processing occurs. Either that or you're flushing after
each document addition (which lowers my execution speed to about 100
documents/second).
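
To put the indexing rates mentioned in this thread side by side, here is a rough sketch of the arithmetic. It assumes a constant rate and an index that starts empty, so it is only an estimate. IndexFillTime and daysToReach are hypothetical names used for illustration.

```java
/** Hypothetical helper: estimates the time to index a target number of documents. */
class IndexFillTime {
    private static final double SECONDS_PER_DAY = 86_400.0;

    /** Days needed to add targetDocs documents at a constant docsPerSecond rate. */
    static double daysToReach(long targetDocs, double docsPerSecond) {
        return targetDocs / docsPerSecond / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long limit = Integer.MAX_VALUE; // the int document-id limit discussed here
        System.out.printf("400 docs/s:    %.1f days%n", daysToReach(limit, 400));
        System.out.printf("40,000 docs/s: %.1f hours%n", daysToReach(limit, 40_000) * 24);
    }
}
```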


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

