Re: Lucene Performance issue

2009-01-21 Thread Anshul jain
@Erick: Yes, I changed the default field; it is "bagofwords" now. @Ian: Yes, both indexes were optimized, and I didn't do any deletions. Version 2.4.0. I'll repeat the experiment, just to be sure. Meanwhile, do you have any documentation on Lucene fields? What I need to know is how Lucene is storing field

Re: Lucene Performance issue

2009-01-21 Thread Ian Lea
> ... > I can for sure say that multiple copies are not indexed. But the number of > fields in which text is divided are many. Can that be a reason? Not for that amount of difference. You may be sure that you are not indexing multiple copies, but I'm not. Convince me - create 2 new indexes via the

Re: Lucene Performance issue

2009-01-21 Thread Erick Erickson
Note that your two queries are different unless you've changed the default operator. Also, your bagOfWords query is searching across your default field for the second and third terms. Your bagOfWords query is really something like bagOfWords:Alexander OR <default field>:history OR <default field>:Macedon. Best, Erick On Wed, Jan 21, 20
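Erick's point about unqualified terms can be illustrated with a toy expansion. This is only a sketch of the rule he describes, not Lucene's actual QueryParser: the class `DefaultFieldSketch`, the method `expand`, and the default field name `contents` are all hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultFieldSketch {

    // Toy model only, NOT Lucene's QueryParser.
    // A token written as "field:term" keeps its field; any other token is
    // assigned the default field. Clauses are joined with the default operator.
    public static String expand(String query, String defaultField, String defaultOperator) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (token.indexOf(':') > 0) {
                clauses.add(token);                      // explicitly fielded term
            } else {
                clauses.add(defaultField + ":" + token); // falls to the default field
            }
        }
        return String.join(" " + defaultOperator + " ", clauses);
    }

    public static void main(String[] args) {
        // "contents" is an assumed default field name, used here for illustration.
        System.out.println(expand("bagOfWords:Alexander history Macedon", "contents", "OR"));
    }
}
```

Only the first term is tied to bagOfWords; the other two are searched in whatever the default field happens to be, and the clauses are combined with the default operator rather than AND.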

Re: Lucene Performance issue

2009-01-21 Thread Erick Erickson
I agree with Ian that these times sound way too high. I'd also ask whether you fire a few warmup searches at your server before measuring the increased time; you might just be seeing the cache being populated. Best, Erick On Wed, Jan 21, 2009 at 10:42 AM, Ian Lea wrote: > Hi > > > Space: 700Mb v
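The warmup suggestion can be sketched as a small timing harness. This is a minimal sketch under assumptions: `WarmupTimingSketch` and `averageNanos` are hypothetical names, and the `Runnable` stands in for whatever code actually runs the search; no Lucene calls are shown.

```java
public class WarmupTimingSketch {

    // Runs the search a few times without timing it, so that caches are
    // populated, then returns the average time per timed run in nanoseconds.
    public static long averageNanos(Runnable search, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            search.run(); // untimed warmup
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            search.run(); // timed run
        }
        return (System.nanoTime() - start) / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the real search call.
        Runnable placeholderSearch = () -> {
            double sink = 0;
            for (int i = 1; i <= 100_000; i++) {
                sink += Math.sqrt(i);
            }
            if (sink < 0) {
                System.out.println(sink); // keeps the loop from being optimized away
            }
        };
        long avg = averageNanos(placeholderSearch, 5, 20);
        System.out.println("average ns per search: " + avg);
    }
}
```

Comparing the single-field and multi-field queries with the same warmup and run counts keeps a cold cache from being counted against one of them.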

Re: Lucene Performance issue

2009-01-21 Thread Anshul jain
Hi, thanks for the reply. For the document in my last mail: Multi-field query: name: Alexander AND domain: history AND first_sentence: Macedon. Single-field query: bagOfWords: Alexander history Macedon. I can say for sure that multiple copies are not indexed. But the number of fields in which text

Re: Lucene Performance issue

2009-01-21 Thread Ian Lea
Hi, Space: 700 MB vs 4.5 GB sounds like way too big a difference. Are you sure you aren't loading multiple copies of the data or something like that? Queries: a 20 times slowdown for a multi-field query also sounds way too big. What do the simple and multi-field queries look like? -- Ian. On Wed,

Lucene Performance issue

2009-01-21 Thread Anshul jain
Hi, I've indexed around half a million XML documents. Here is the document sample: cogito:Name Alexander the Great cogito:domain ancient history cogito:first_sentence Alexander the Great (Greek: or Megas Alexandros; July 20 356 BC June 10 323 BC), also known as Alexander III