On 2 Jul 2010 at 08:32, Li Li wrote:
I have an index of about 8,000,000 documents and the current index size is about 30GB. Is it possible to use this contrib to speed up my search? I have enough memory for it.
In order to answer your question you'll need to benchmark using a lot of typical queries. My guess is that it will probably be about as fast as a RAMDirectory while consuming a lot more memory. It's hard to say for sure though.
InstantiatedIndex (II) is faster than RAMDirectory (RD) mainly because RD has to unmarshal information from a byte stream into Java instances, while II already holds those instances in memory, hence the name. As the index grows, the share of time RD spends unmarshalling shrinks compared to the time spent seeking (mainly in DocsEnum/DocsAndPositionsEnum) and scoring documents. Thus queries on a large index using terms that occur in only a small portion of the documents should be faster on II than on RD, while queries using frequently occurring terms will take about as much time on either.
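To make the unmarshalling cost concrete, here is a toy sketch. It is not Lucene's file format or API; the class PostingsSketch and its methods are made up for illustration. One side stores a posting list as delta-encoded variable-length bytes that must be decoded on every read (roughly the RD situation), the other simply keeps the int[] around (roughly the II situation).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; none of these names exist in Lucene.
final class PostingsSketch {

    // Serialize ascending doc ids as deltas, 7 bits per byte,
    // with the high bit set on every byte except the last of each value.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int id : sortedDocIds) {
            int delta = id - prev;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = id;
        }
        return out.toByteArray();
    }

    // The "unmarshalling" step: rebuild the doc ids from the byte stream.
    // A byte-based store pays this on each read; an instantiated
    // structure would just hand back the int[] it already holds.
    static int[] decode(byte[] bytes, int count) {
        int[] ids = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int delta = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
            }
            prev += delta;
            ids[i] = prev;
        }
        return ids;
    }
}
```

For a rare term the list is short, so the decoding overhead is a noticeable part of the total work; for a frequent term, iterating and scoring the many hits dominates either way.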
(Perhaps the documentation should explain it this way rather than just state "Mileage may vary depending on term saturation".)
While benchmarking, remember that RD might require a warm-up period while II does not.
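A minimal timing harness along those lines could look like the sketch below. The names QueryBenchmark and meanNanosPerQuery are hypothetical, and the search callback is a stand-in for whatever runs one query against the index under test (for example a call into an IndexSearcher that returns the hit count). The warm-up rounds are executed but not timed.

```java
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical helper, not part of Lucene.
final class QueryBenchmark {

    // Runs all queries warmupRounds times untimed, then measuredRounds
    // times timed, and returns the mean nanoseconds per query execution.
    static <Q> double meanNanosPerQuery(List<Q> queries,
                                        ToLongFunction<Q> search,
                                        int warmupRounds,
                                        int measuredRounds) {
        if (queries.isEmpty() || measuredRounds <= 0) {
            throw new IllegalArgumentException("need queries and at least one measured round");
        }
        long sink = 0; // accumulate results so the calls cannot be optimized away
        for (int r = 0; r < warmupRounds; r++) {
            for (Q q : queries) {
                sink += search.applyAsLong(q);
            }
        }
        long start = System.nanoTime();
        for (int r = 0; r < measuredRounds; r++) {
            for (Q q : queries) {
                sink += search.applyAsLong(q);
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never true; only keeps sink live
        }
        return (double) elapsed / ((long) measuredRounds * queries.size());
    }
}
```

Run it once per index type with the same list of typical queries, and compare rare-term and frequent-term queries separately, since the two are expected to behave differently.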
Feel free to report back with any findings.

karl