Hi,

I am new to Lucene and am reading the book “Lucene in Action”. As we know, there are two kinds of directory that can hold an index: a file system directory and a RAM directory. The book has a sample that compares the performance of these two kinds of directory, and another piece of code for “Batch indexing by using RAMDirectory as a buffer”. While working through these samples, I found something interesting about indexing performance.

I combined the two pieces of code and timed indexing with each kind of directory. (Please refer to the attachment for the detailed process.) I load 3000 docs with 5 words per doc.

First, I indexed these docs directly into a file system directory and into a RAM directory. The times were 10737 ms and 1575 ms respectively. Then I used a RAM directory as a buffer for indexing and called the “addIndexes” method of a new IndexWriter, which finally holds the index in a file system directory. That took 1348 ms.

How could this be? I would expect the buffered indexing time to build on the RAM indexing time, that is, to be at least as long. I wonder why buffered indexing performs even better than plain RAM indexing. So interesting!

Best regards,
Flik Shen