Re: Retrieving large numbers of documents from several disks in parallel

2011-12-21 Thread Paul Libbrecht
Michael, from a physical point of view, it would seem that the order in which the documents are read is very significant for the reading speed (I feel the random-access jumps are the issue). You could: - move to a RAM disk or SSD to make a difference? - use something different than a searcher w
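
To make the point about read order concrete, here is a rough sketch (plain Python, not Lucene code) of grouping hits by disk and fetching each group in ascending document-ID order, one worker per disk. It assumes that ascending document IDs roughly follow on-disk order within an index, which is worth verifying for your setup. The names `group_and_sort`, `fetch_all` and the `load_doc` callback are hypothetical, made up for illustration; `load_doc` stands in for whatever actually reads a stored document.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_and_sort(hits):
    """Group (disk_id, doc_id) pairs by disk and sort the IDs on each disk.

    Sorting is meant to turn scattered reads into a mostly forward pass,
    under the assumption that ID order approximates on-disk order.
    """
    by_disk = defaultdict(list)
    for disk_id, doc_id in hits:
        by_disk[disk_id].append(doc_id)
    return {disk: sorted(ids) for disk, ids in by_disk.items()}


def fetch_all(hits, load_doc):
    """Fetch documents with one worker per disk, in sorted ID order.

    load_doc(disk_id, doc_id) is a hypothetical callback supplied by the
    caller; it is the only place that would touch the real index.
    """
    plan = group_and_sort(hits)
    if not plan:
        return {}

    def fetch_disk(item):
        disk, ids = item
        return [load_doc(disk, doc_id) for doc_id in ids]

    # One thread per disk so the disks are read concurrently,
    # while each individual disk sees a sequential-ish access pattern.
    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        results = list(pool.map(fetch_disk, plan.items()))
    return dict(zip(plan.keys(), results))
```

Whether this helps in practice depends on how fragmented the index is and how much of it is already in the OS cache, so it would need measuring rather than assuming.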

Retrieving large numbers of documents from several disks in parallel

2011-12-21 Thread Robert Bart
Hi All, I am running Lucene 3.4 in an application that indexes about 1 billion factual assertions (Documents) from the web over four separate disks, so that each disk has a separate index of about 250 million documents. The Documents are relatively small, less than 1KB each. These indexes provide