Results of setting LogMergePolicy "calibrateSizeByDeletes=true"

2009-09-30 Thread Jibo John
Hello, I am in the process of trying out the Lucene patch LUCENE-1634, but I'm not getting the expected behavior. I see that the segments are not getting merged even after all the documents are deleted from them. Because of this, the index size grows very large. The expec

Re: ThreadedIndexWriter vs. IndexWriter

2009-08-11 Thread Jibo John
ng because the thread pool was already told to shut down. Larger queues made it much more likely to happen. Can you try the new version (attached)? Also, make sure you add 'doc.reuse.fields=false' to your alg (on trunk). Mike On Tue, Aug 11, 2009 at 12:39 PM, Jibo John wrote: Mik

Re: ThreadedIndexWriter vs. IndexWriter

2009-08-03 Thread Jibo John
ng from earlier releases, which could explain what you're seeing). If you are missing that, can you download the current code from http://www.manning.com/hatcher3 and try again? If that's not the problem... can you post the benchmark alg you are using in each case? Mike On Fri, Jul 31, 200

Re: ThreadedIndexWriter vs. IndexWriter

2009-07-31 Thread Jibo John
Hi Phil, It's 5 threads for IndexWriter. For ThreadedIndexWriter, I used: writer.num.threads=16 writer.max.thread.queue.size=80 Thanks, -Jibo On Jul 31, 2009, at 5:01 PM, Phil Whelan wrote: Hi Jibo, Your mergeFactor is different, and the resulting numFiles (segment files) is different. May

Re: ThreadedIndexWriter vs. IndexWriter

2009-07-31 Thread Jibo John
On Jul 31, 2009, at 2:52 PM, Michael McCandless wrote: Hmmm... can you run CheckIndex on both indexes and post the results? java org.apache.lucene.index.CheckIndex /path/to/index Mike On Fri, Jul 31, 2009 at 2:38 PM, Jibo John wrote: Number of docs are the same in the index for both the

Re: ThreadedIndexWriter vs. IndexWriter

2009-07-31 Thread Jibo John
or were they different? If different, how so (e.g., missing terms, etc.)? Later, Jim On Fri, Jul 31, 2009 at 2:38 PM, Jibo John wrote: Number of docs are the same in the index for both the cases (200,000). I haven't altered the benchmark/ code, but used a profiler to verify

Re: ThreadedIndexWriter vs. IndexWriter

2009-07-31 Thread Jibo John
a smaller index. Can you sanity check the index? E.g., is numDocs() the same for both? You definitely called close() on the writer, right? That method waits for all threads to finish their work before actually closing. Mike On Thu, Jul 30, 2009 at 8:01 PM, Jibo John wrote: While trying out a few tuni

ThreadedIndexWriter vs. IndexWriter

2009-07-30 Thread Jibo John
While trying out a few tuning options using contrib/benchmark as described in the LIA (2nd edition) book, I had an interesting observation. If I use a ThreadedIndexWriter (picked the example from lia2e, page 356) instead of IndexWriter, the index size was reduced by 40% compared to using IndexWr