Re: Boosting a query

2008-01-20 Thread Yonik Seeley
Boosts on query clauses are relative to other clauses. To see boosting really work, try comparing foo bar vs. foo bar^10.

-Yonik

On Jan 20, 2008 8:13 PM, <[EMAIL PROTECTED]> wrote:
> I am trying to submit a query to Lucene with just one term trying to
> understand how the boost of a term influe
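To make the "relative to other clauses" point concrete, here is a minimal sketch of classic TF-IDF scoring with query normalization. It is not Lucene code: the class `BoostSketch`, its `score` method and all tf/idf numbers are invented for illustration, and it leaves out coord, length norms and the square root normally applied to term frequency. It assumes the query norm is 1/sqrt(sum of (idf * boost)^2), which is my understanding of the default similarity of that era.

```java
// Simplified, hypothetical model of TF-IDF scoring with query normalization.
// Names and numbers are illustrative only; this is not the Lucene API.
final class BoostSketch {

    /** Score of one document: queryNorm * sum over clauses of (tf * idf^2 * boost). */
    static double score(double[] tf, double[] idf, double[] boost) {
        // The normalization factor depends on every clause's idf and boost.
        double sumOfSquaredWeights = 0.0;
        for (int i = 0; i < idf.length; i++) {
            double w = idf[i] * boost[i];
            sumOfSquaredWeights += w * w;
        }
        double queryNorm = 1.0 / Math.sqrt(sumOfSquaredWeights);

        double sum = 0.0;
        for (int i = 0; i < tf.length; i++) {
            sum += tf[i] * idf[i] * idf[i] * boost[i];
        }
        return queryNorm * sum;
    }

    public static void main(String[] args) {
        // One clause: the boost appears in both the sum and the norm, so it cancels.
        double[] tf = {2.0};
        double[] idf = {1.5};
        System.out.println("computer   : " + score(tf, idf, new double[] {1.0}));
        System.out.println("computer^5 : " + score(tf, idf, new double[] {5.0}));

        // Two clauses: boosting "bar" changes its weight relative to "foo".
        double[] idf2 = {1.0, 1.0};
        double[] boosts = {1.0, 10.0};
        System.out.println("foo-only doc, foo bar^10: " + score(new double[] {1.0, 0.0}, idf2, boosts));
        System.out.println("bar-only doc, foo bar^10: " + score(new double[] {0.0, 1.0}, idf2, boosts));
    }
}
```

With a single clause the two scores come out equal up to floating-point rounding, which matches what the original poster observed. With two clauses, the document matching the boosted term scores higher than the one matching the unboosted term.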

Boosting a query

2008-01-20 Thread angelica
I am trying to submit a query to Lucene with just one term trying to understand how the boost of a term influences the final document score: for example "computer" and "computer^5" (using query.setBoost()). Lucene returns the same documents with the same document score values for both queries. Whe

Re: Flush by RAM size question...

2008-01-20 Thread Erick Erickson
Michael:

Thanks, that's what I figured, but it's nice to have confirmed.

Erick

On Jan 20, 2008 11:59 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:
>
> Hi Erick,
>
> Yes, you do still need to guard against this case in 2.3. IndexWriter
> checks the RAM usage after each doc is processed and

Re: Flush by RAM size question...

2008-01-20 Thread Michael McCandless
Hi Erick, Yes, you do still need to guard against this case in 2.3. IndexWriter checks the RAM usage after each doc is processed and flushes when that's over the limit. However, the memory consumed by a very large doc should be quite a bit less than before, because in 2.3 IndexWriter makes mor

Flush by RAM size question...

2008-01-20 Thread Erick Erickson
About flush by RAM: I was playing around with something similar on the 2.1 codebase (roll-my-own) and had the quirk of a possible *very* large incoming document. As in 250M. So I had to put some logic in to say, in effect, "if the incoming doc is completely ridiculous, flush now". I should say
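The guard discussed in this thread can be sketched roughly as follows. This is a toy simulation, not IndexWriter: the class `RamFlushSketch`, its byte accounting and the "huge document" threshold are all hypothetical. It only models two ideas from the messages above: RAM usage is checked after each document is processed, and an unusually large incoming document triggers a flush first so that it does not pile on top of what is already buffered.

```java
// Hypothetical simulation of flush-by-RAM with a guard for oversized documents.
// It tracks byte counts only; nothing here is the real Lucene IndexWriter API.
final class RamFlushSketch {
    private final long ramLimitBytes; // flush once the buffer reaches this size
    private final long hugeDocBytes;  // assumed threshold for a "ridiculous" document
    private long bufferedBytes = 0;
    private int flushCount = 0;

    RamFlushSketch(long ramLimitBytes, long hugeDocBytes) {
        this.ramLimitBytes = ramLimitBytes;
        this.hugeDocBytes = hugeDocBytes;
    }

    void addDocument(long docBytes) {
        // Guard: empty the buffer before taking on a very large document.
        if (docBytes >= hugeDocBytes && bufferedBytes > 0) {
            flush();
        }
        bufferedBytes += docBytes;
        // Regular check, performed after the document has been processed.
        if (bufferedBytes >= ramLimitBytes) {
            flush();
        }
    }

    void flush() {
        bufferedBytes = 0;
        flushCount++;
    }

    int flushCount() { return flushCount; }

    long bufferedBytes() { return bufferedBytes; }

    public static void main(String[] args) {
        RamFlushSketch writer = new RamFlushSketch(16L << 20, 8L << 20);
        for (int i = 0; i < 3; i++) {
            writer.addDocument(1L << 20); // three ordinary 1 MB documents
        }
        writer.addDocument(250L << 20);   // one 250 MB document
        System.out.println("flushes: " + writer.flushCount());
        System.out.println("still buffered: " + writer.bufferedBytes());
    }
}
```

Because the limit is only checked after a document is added, the buffer can still exceed the limit by up to the size of one document, which is why the pre-flush guard is useful when documents can be very large.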

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-20 Thread Mark Miller
Anyone tried using this on Lucene yet? http://www.alphaworks.ibm.com/tech/jla

Michael McCandless wrote:
> These results are very interesting. With 3 threads on SSD your searches run 87% faster if you use 3 IndexSearchers instead of sharing a single one. This means, for your test, there are

Re: Optimize for large index size

2008-01-20 Thread Michael McCandless
On upgrading to 2.3, it's best to flush by RAM (writer.setRAMBufferSizeMB) instead of by document count. Generally, the more RAM the better, up to a point. You should also be sure not to use so much RAM that your JVM must GC too often or hits an OOM error, or that your machine starts swapping.

Re: Optimize for large index size

2008-01-20 Thread vivek sar
My maxBufferedDocs is 1000; do you recommend a bigger value than that? What's a good number for a very high indexing rate (10K new documents every minute)?

On Jan 19, 2008 10:30 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> In addition to what Mike already said:
>
> maxMergeDocs=9 -- do yo

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-20 Thread Michael McCandless
These results are very interesting. With 3 threads on SSD your searches run 87% faster if you use 3 IndexSearchers instead of sharing a single one. This means, for your test, there are some crazy synchronization bottlenecks when searching, which I think we should ferret out and fix. Ha
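For readers who want to reproduce the "several searchers instead of one shared searcher" setup, one possible arrangement is a small round-robin pool. The sketch below is an assumption about how such a test could be wired up, not what was actually benchmarked: `SearcherPoolSketch` is a hypothetical class, and the type parameter `S` stands in for whatever searcher object you create (for example one IndexSearcher per slot).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin pool; S is a placeholder for a searcher type.
final class SearcherPoolSketch<S> {
    private final List<S> searchers;
    private final AtomicInteger next = new AtomicInteger();

    SearcherPoolSketch(List<S> searchers) {
        if (searchers.isEmpty()) {
            throw new IllegalArgumentException("need at least one searcher");
        }
        // Defensive copy so later changes to the caller's list do not affect the pool.
        this.searchers = new ArrayList<>(searchers);
    }

    /** Hands out instances in rotation so concurrent callers spread across them. */
    S acquire() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), searchers.size());
        return searchers.get(index);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("searcher-0");
        names.add("searcher-1");
        names.add("searcher-2");
        SearcherPoolSketch<String> pool = new SearcherPoolSketch<>(names);
        for (int i = 0; i < 4; i++) {
            System.out.println(pool.acquire());
        }
    }
}
```

Each additional searcher holds its own resources, so this trades memory and open files for less contention. If the synchronization bottlenecks mentioned above are fixed, a single shared searcher may be the simpler choice again.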