Hi,

We plan to upgrade the Lucene library in our application from 2.4.1 to 3.5.0. I have been running the benchmark tests that ship with Lucene. To my surprise, indexing in 3.5.0 is significantly slower than in 2.4.1 on the Wikipedia data.
Attached is the algorithm file for the tests. The tests used the default Lucene settings for flush memory size and merge factor, and the tasks ran with 512M of memory. The test machine is a 64-bit Windows 7 machine with an Intel Core i7.

The command:

  %ant -Dtask.alg=conf/wikipedia-default.alg -Dtask.mem=512M run-task

Here are the test results:

Lucene 2.4.1

  [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
  [java] Operation        round  flush  mrg  runCnt  recsPerRun     rec/s  elapsedSec   avgUsedMem  avgTotalMem
  [java] MAddDocs_200000      0  16.00   10       1      200000   1,609.1      124.29   89,218,496  241,631,232
  [java] MAddDocs_200000      1  16.00   10       1      200000   1,746.4      114.52  102,365,864  241,762,304
  [java] MAddDocs_200000      2  16.00   10       1      200000   1,566.8      127.65   69,428,144  174,194,688

Lucene 2.9.4

  [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
  [java] Operation        round  flush  mrg  runCnt  recsPerRun     rec/s  elapsedSec   avgUsedMem  avgTotalMem
  [java] MAddDocs_200000      0  16.00   10       1      200000  1,046.49      191.12   82,676,152  139,657,216
  [java] MAddDocs_200000      1  16.00   10       1      200000  1,165.35      171.62  119,364,128  156,762,112
  [java] MAddDocs_200000      2  16.00   10       1      200000  1,245.86      160.53   50,361,760  137,625,600

Lucene 3.5.0

  [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
  [java] Operation        round  flush  mrg  runCnt  recsPerRun     rec/s  elapsedSec   avgUsedMem  avgTotalMem
  [java] MAddDocs_200000      0  16.00   10       1      200000    676.48      295.65   70,917,592  129,695,744
  [java] MAddDocs_200000      1  16.00   10       1      200000    626.13      319.42   50,329,552   94,240,768
  [java] MAddDocs_200000      2  16.00   10       1      200000    687.68      290.83   57,732,640   92,864,512

Indexing with 2.4.1 is roughly 2.3x as fast as with 3.5.0. Did I miss any settings or configurations?

Thanks,
Sean
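
For anyone who wants to double-check the ratio, here is a minimal Python sketch that recomputes it from the rec/s figures reported above. The dictionary layout and the `slowdown` helper are just illustrative names, not part of the Lucene benchmark package. Averaged over the three rounds it comes out near 2.5x, while the best 3.5.0 round against the matching 2.4.1 round gives about 2.3x.

```python
# rec/s values copied from the MAddDocs_200000 rows of the three reports.
REC_PER_SEC = {
    "2.4.1": [1609.1, 1746.4, 1566.8],
    "2.9.4": [1046.49, 1165.35, 1245.86],
    "3.5.0": [676.48, 626.13, 687.68],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def slowdown(baseline, candidate):
    """Mean throughput of `baseline` divided by mean throughput of `candidate`."""
    return mean(REC_PER_SEC[baseline]) / mean(REC_PER_SEC[candidate])


def per_round_slowdown(baseline, candidate):
    """Same ratio, computed round by round instead of on the means."""
    return [b / c for b, c in zip(REC_PER_SEC[baseline], REC_PER_SEC[candidate])]


if __name__ == "__main__":
    for version in ("2.9.4", "3.5.0"):
        print(f"2.4.1 vs {version}: {slowdown('2.4.1', version):.2f}x on average")
    rounds = ", ".join(f"{r:.2f}x" for r in per_round_slowdown("2.4.1", "3.5.0"))
    print(f"2.4.1 vs 3.5.0 per round: {rounds}")
```

The same numbers show that 2.9.4 already sits in between, so the slowdown is not specific to the 3.x line.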