Hi,

We plan to upgrade the Lucene library in our application from 2.4.1 to 3.5.0. I
have been running the benchmark tests that come with Lucene. To my surprise, I
found that indexing in 3.5.0 is significantly slower than in 2.4.1 for the
Wikipedia data.

Attached is the algorithm file for the tests. The tests used the default Lucene
settings for flush memory size and merge factor (16 MB and 10, as shown in the
flush and mrg columns of the reports). The tasks were run with 512M of memory.
The test machine is a 64-bit Windows 7 machine with an Intel Core i7.

The command:
%ant -Dtask.alg=conf/wikipedia-default.alg -Dtask.mem=512M run-task

Here are the test results for 2.4.1, 2.9.4, and 3.5.0:

Lucene 2.4.1

     [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
     [java] Operation       round flush mrg   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
     [java] MAddDocs_200000     0 16.00  10        1       200000      1,609.1      124.29    89,218,496    241,631,232
     [java] MAddDocs_200000 -   1 16.00  10 -  -   1 -  -  200000 -  - 1,746.4 -  - 114.52 - 102,365,864 -  241,762,304
     [java] MAddDocs_200000     2 16.00  10        1       200000      1,566.8      127.65    69,428,144    174,194,688

Lucene 2.9.4

     [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
     [java] Operation       round flush mrg   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
     [java] MAddDocs_200000     0 16.00  10        1       200000     1,046.49      191.12    82,676,152    139,657,216
     [java] MAddDocs_200000 -   1 16.00  10 -  -   1 -  -  200000 -   1,165.35 -  - 171.62 - 119,364,128 -  156,762,112
     [java] MAddDocs_200000     2 16.00  10        1       200000     1,245.86      160.53    50,361,760    137,625,600

Lucene 3.5.0

     [java] ------------> Report sum by Prefix (MAddDocs) and Round (3 about 3 out of 14)
     [java] Operation       round flush mrg   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
     [java] MAddDocs_200000     0 16.00  10        1       200000       676.48      295.65    70,917,592    129,695,744
     [java] MAddDocs_200000 -   1 16.00  10 -  -   1 -  -  200000 -  -  626.13 -  - 319.42 -  50,329,552 -   94,240,768
     [java] MAddDocs_200000     2 16.00  10        1       200000       687.68      290.83    57,732,640     92,864,512


Measured in rec/s, the indexing speed with 2.4.1 is roughly 2.3x to 2.8x the
speed with 3.5.0, depending on the round. Did I miss any settings or
configurations?
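
For anyone who wants to check the arithmetic, here is a minimal Java sketch that recomputes the ratios from the rec/s columns of the three reports. The class and method names are made up for illustration and are not part of Lucene or its benchmark module; the arrays are simply the reported rec/s values for rounds 0, 1, and 2.

```java
// Illustrative only: recomputes throughput ratios from the reported rec/s values.
// SpeedRatio and its members are hypothetical names, not Lucene APIs.
final class SpeedRatio {
    // rec/s for rounds 0, 1, 2, copied from the MAddDocs_200000 rows.
    static final double[] LUCENE_241 = {1609.1, 1746.4, 1566.8};
    static final double[] LUCENE_294 = {1046.49, 1165.35, 1245.86};
    static final double[] LUCENE_350 = {676.48, 626.13, 687.68};

    // Arithmetic mean of the per-round throughput values.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // How many times faster the baseline is than the candidate, on mean rec/s.
    static double ratioOfMeans(double[] baseline, double[] candidate) {
        return mean(baseline) / mean(candidate);
    }

    public static void main(String[] args) {
        // Per-round comparison of 2.4.1 against 3.5.0.
        for (int round = 0; round < LUCENE_241.length; round++) {
            System.out.printf("round %d: 2.4.1 / 3.5.0 = %.2f%n",
                    round, LUCENE_241[round] / LUCENE_350[round]);
        }
        // Comparison on the mean of the three rounds.
        System.out.printf("mean: 2.4.1 / 2.9.4 = %.2f%n",
                ratioOfMeans(LUCENE_241, LUCENE_294));
        System.out.printf("mean: 2.4.1 / 3.5.0 = %.2f%n",
                ratioOfMeans(LUCENE_241, LUCENE_350));
    }
}
```

On these numbers, 2.4.1 comes out at roughly 2.3x to 2.8x the 3.5.0 throughput per round and about 2.5x on the means, with 2.9.4 falling in between.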

Thanks,

Sean

