Hi Mike,

I retested and the results are the same:

1/ I did not use sorting (so FieldCache should not enter the picture?).
2/ I created the indexed data from scratch, separately for 3.6.1 and
   4.3, from the same text files, and I ran the test from the command
   line separately against each index folder, so it seems a pretty
   fair test.
3/ In each test I created the searcher from scratch (to measure
   creation time). I did not include JVM start time in either case.
   The tests ran on the same box.
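
For reference, the timing approach in point 3 can be sketched roughly as below. This is a minimal sketch using only the JDK, not the actual test code: the class name `WarmupTimer`, the method `timeRuns`, and the stand-in workload are all hypothetical. In a real test the task would open the index, create the searcher, and run one query. Timing starts inside `main`, so JVM start time is excluded, and the first (cold) call is reported separately from the later calls.

```java
import java.util.concurrent.Callable;

public class WarmupTimer {

    /**
     * Calls the task `runs` times and returns the wall-clock time of each
     * call in nanoseconds. Entry 0 is the cold first call; later entries
     * are calls made after whatever warming the first call triggered.
     */
    public static <T> long[] timeRuns(Callable<T> task, int runs) throws Exception {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in workload. In a real test this would open the
        // index, create the searcher, and run one query.
        Callable<Long> task = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += Integer.toString(i).hashCode();
            }
            return sum;
        };

        long[] nanos = timeRuns(task, 5);
        for (int i = 0; i < nanos.length; i++) {
            System.out.printf("run %d: %d us%n", i, nanos[i] / 1000);
        }
    }
}
```

Comparing run 0 against the later runs for each index shows how much of the difference comes from the first call alone.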

From the indexed data it seems that 4.3 generated a lot more data in
the folder; below I list the (ls -ltr) results. (I always pass in the
LUCENE_43 version, so the Lucene 4.2 codec should be used; why Lucene41?)

Thanks very much for the help, Lisheng


///////////////////
36:
total 1228
-rw-r--r-- 1 root root     68 Jul 31 15:50 _0.fdx
-rw-r--r-- 1 root root     44 Jul 31 15:50 _0.fdt
-rw-r--r-- 1 root root    132 Jul 31 15:50 _0.tvx
-rw-r--r-- 1 root root 260207 Jul 31 15:50 _0.tvf
-rw-r--r-- 1 root root     20 Jul 31 15:50 _0.tvd
-rw-r--r-- 1 root root 345803 Jul 31 15:50 _0.tis
-rw-r--r-- 1 root root   4899 Jul 31 15:50 _0.tii
-rw-r--r-- 1 root root 539098 Jul 31 15:50 _0.prx
-rw-r--r-- 1 root root     12 Jul 31 15:50 _0.nrm
-rw-r--r-- 1 root root  61703 Jul 31 15:50 _0.frq
-rw-r--r-- 1 root root     29 Jul 31 15:50 _0.fnm
-rw-r--r-- 1 root root    252 Jul 31 15:50 segments_1
-rw-r--r-- 1 root root     20 Jul 31 15:50 segments.gen

43:
total 1828
-rw-r--r-- 1 root root      45 Jul 31 15:51 _0.fdx
-rw-r--r-- 1 root root      66 Jul 31 15:51 _0.fdt
-rw-r--r-- 1 root root      60 Jul 31 15:51 _0.tvx
-rw-r--r-- 1 root root  176845 Jul 31 15:51 _0.tvd
-rw-r--r-- 1 root root   10980 Jul 31 15:51 _0_Lucene41_0.tip
-rw-r--r-- 1 root root  401339 Jul 31 15:51 _0_Lucene41_0.tim
-rw-r--r-- 1 root root 1007621 Jul 31 15:51 _0_Lucene41_0.pos
-rw-r--r-- 1 root root  216711 Jul 31 15:51 _0_Lucene41_0.pay
-rw-r--r-- 1 root root   12048 Jul 31 15:51 _0_Lucene41_0.doc
-rw-r--r-- 1 root root      46 Jul 31 15:51 _0.nvm
-rw-r--r-- 1 root root      34 Jul 31 15:51 _0.nvd
-rw-r--r-- 1 root root     205 Jul 31 15:51 _0.fnm
-rw-r--r-- 1 root root     395 Jul 31 15:51 _0.si
-rw-r--r-- 1 root root      69 Jul 31 15:51 segments_1
-rw-r--r-- 1 root root      20 Jul 31 15:51 segments.gen
///////////////////
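
To put a number on "a lot more data", the size column of the two listings can be totalled. The following is a small JDK-only sketch; the class name `IndexSizeCompare` and the method `totalBytes` are hypothetical, and the input strings are simply the listings above copied in.

```java
public class IndexSizeCompare {

    // `ls -ltr` output for the 3.6 index, copied from the listing above.
    static final String LS_36 = String.join("\n",
        "total 1228",
        "-rw-r--r-- 1 root root     68 Jul 31 15:50 _0.fdx",
        "-rw-r--r-- 1 root root     44 Jul 31 15:50 _0.fdt",
        "-rw-r--r-- 1 root root    132 Jul 31 15:50 _0.tvx",
        "-rw-r--r-- 1 root root 260207 Jul 31 15:50 _0.tvf",
        "-rw-r--r-- 1 root root     20 Jul 31 15:50 _0.tvd",
        "-rw-r--r-- 1 root root 345803 Jul 31 15:50 _0.tis",
        "-rw-r--r-- 1 root root   4899 Jul 31 15:50 _0.tii",
        "-rw-r--r-- 1 root root 539098 Jul 31 15:50 _0.prx",
        "-rw-r--r-- 1 root root     12 Jul 31 15:50 _0.nrm",
        "-rw-r--r-- 1 root root  61703 Jul 31 15:50 _0.frq",
        "-rw-r--r-- 1 root root     29 Jul 31 15:50 _0.fnm",
        "-rw-r--r-- 1 root root    252 Jul 31 15:50 segments_1",
        "-rw-r--r-- 1 root root     20 Jul 31 15:50 segments.gen");

    // `ls -ltr` output for the 4.3 index, copied from the listing above.
    static final String LS_43 = String.join("\n",
        "total 1828",
        "-rw-r--r-- 1 root root      45 Jul 31 15:51 _0.fdx",
        "-rw-r--r-- 1 root root      66 Jul 31 15:51 _0.fdt",
        "-rw-r--r-- 1 root root      60 Jul 31 15:51 _0.tvx",
        "-rw-r--r-- 1 root root  176845 Jul 31 15:51 _0.tvd",
        "-rw-r--r-- 1 root root   10980 Jul 31 15:51 _0_Lucene41_0.tip",
        "-rw-r--r-- 1 root root  401339 Jul 31 15:51 _0_Lucene41_0.tim",
        "-rw-r--r-- 1 root root 1007621 Jul 31 15:51 _0_Lucene41_0.pos",
        "-rw-r--r-- 1 root root  216711 Jul 31 15:51 _0_Lucene41_0.pay",
        "-rw-r--r-- 1 root root   12048 Jul 31 15:51 _0_Lucene41_0.doc",
        "-rw-r--r-- 1 root root      46 Jul 31 15:51 _0.nvm",
        "-rw-r--r-- 1 root root      34 Jul 31 15:51 _0.nvd",
        "-rw-r--r-- 1 root root     205 Jul 31 15:51 _0.fnm",
        "-rw-r--r-- 1 root root     395 Jul 31 15:51 _0.si",
        "-rw-r--r-- 1 root root      69 Jul 31 15:51 segments_1",
        "-rw-r--r-- 1 root root      20 Jul 31 15:51 segments.gen");

    /** Sums the size column (5th field) of regular-file lines in `ls -l` output. */
    public static long totalBytes(String lsOutput) {
        long sum = 0;
        for (String line : lsOutput.split("\n")) {
            String[] fields = line.trim().split("\\s+");
            // Skip the "total N" line and anything that is not a regular file.
            if (fields.length < 9 || !fields[0].startsWith("-")) {
                continue;
            }
            sum += Long.parseLong(fields[4]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long bytes36 = totalBytes(LS_36);
        long bytes43 = totalBytes(LS_43);
        System.out.println("3.6 index bytes: " + bytes36); // 1212287
        System.out.println("4.3 index bytes: " + bytes43); // 1826484
        System.out.println("4.3 / 3.6 ratio: " + (double) bytes43 / bytes36);
    }
}
```

By this count the 4.3 index is about 1.5 times the size of the 3.6 one. Most of the growth is in the positions data: _0.prx is 539098 bytes in 3.6, while _0_Lucene41_0.pos and _0_Lucene41_0.pay together are 1224332 bytes in 4.3.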


-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, July 31, 2013 11:31 AM
To: Lucene Users
Subject: Re: lucene 4.3 seems to be much slower in indexing than lucene 3.6?


On Tue, Jul 30, 2013 at 6:13 PM, Zhang, Lisheng
<lisheng.zh...@broadvision.com> wrote:
> Hi Mike,
>
> I did more tests with realistic text from different languages (typical
> text for 8 different languages, English one is novel "Animal Farm").
>
> What I found seems to be:
>
> ## Indexing:
> 36 and 43 comparable (your previous comment is very correct).
>
> ## Search:
> 43 seems to be slower (30%); checking the details, it seems to be all due
> to initial searcher creation and first search (warming), as if 43 did much
> more in warming?

Hmm, I'm not sure off hand why searcher warming would be slower in 4.3.

Are you relying on FieldCache (e.g. sorting by a field instead of by
relevance)?  Switching to doc values should make warming much faster.

Are you sure the test was fair?  Ie, in both cases the index was
either hot or cold?

For your 4.3 test you fully reindexed right?  Ie, searched against a
4.3 (not 3.6) index?

Mike McCandless

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

