RE: Number of documents in each segment before a merge occurs

2009-07-26 Thread Venkat Rangan
Mike, Yes, it is with Lucene 2.2.0. Thanks for the response. -venkat -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Sunday, July 26, 2009 2:33 AM To: java-user@lucene.apache.org Subject: Re: Number of documents in each segment before a merge occurs

Multiline Regex with Lucene

2009-07-26 Thread ba3
I was trying to do a regex search with Lucene and JavaUtilRegexCapabilities. The code used is: RegexQuery query = new RegexQuery(new Term("contents","(?m)hello.*(\r[^#]*)This is to be searched.*(\r[^#]*)#")); query.setRegexImplementation(new JavaUtilRegexCapabilities()); I verified the regex
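One way to separate the regex itself from the indexing side of this question is to test the pattern with plain java.util.regex, first against a whole block of text and then against individual tokens. The sketch below is only an illustration under stated assumptions: the sample text is made up, and splitting on whitespace is a stand-in for whatever analyzer the index actually uses. It matters because RegexQuery, as far as I know, tests the pattern against individual indexed terms rather than against the full field text, so a pattern that spans several words or lines may match the raw text yet match no single term of a tokenized field.

```java
import java.util.regex.Pattern;

public class MultilineRegexSketch {
    // The pattern from the message above, unchanged.
    static final Pattern PATTERN = Pattern.compile(
        "(?m)hello.*(\r[^#]*)This is to be searched.*(\r[^#]*)#");

    // Hypothetical sample text (an assumption, not taken from the original post).
    static final String SAMPLE =
        "hello world\rsome line\rThis is to be searched here\rmore text\r#";

    // True if the pattern matches the entire text as one string.
    static boolean matchesWholeText(String text) {
        return PATTERN.matcher(text).matches();
    }

    // True if any single whitespace-separated token matches the pattern.
    // Whitespace splitting only approximates a real analyzer.
    static boolean anyTokenMatches(String text) {
        for (String token : text.split("\\s+")) {
            if (PATTERN.matcher(token).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("whole text matches: " + matchesWholeText(SAMPLE));
        System.out.println("any token matches:  " + anyTokenMatches(SAMPLE));
    }
}
```

If the whole text matches but no token does, the pattern is probably fine and the mismatch is more likely in how the "contents" field was tokenized at index time.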

Index html sites using IndexHtml

2009-07-26 Thread starz10de
Hi, I am indexing a set of HTML websites using Lucene (IndexHtml). The indexer works fine and I can find the indexed terms, but the problem is that this class (IndexHtml) indexes all the text inside the HTML page, even the advertisements. I am interested just in the body text and not interested in the adverti

Re: Number of documents in each segment before a merge occurs

2009-07-26 Thread Michael McCandless
It looks like you're using a version of Lucene before 2.3? Before 2.3, every document was written to its own RAM segment, and then these segments were merged during flushing. Mike On Sun, Jul 26, 2009 at 2:42 AM, Venkat Rangan wrote: > Shai, > > Thanks for your response. There isn't any specific

Re: Number of documents in each segment before a merge occurs

2009-07-26 Thread Shai Erera
Which Lucene version is it? Do you perhaps call commit() or flush() after every document (just a long shot)? On Sun, Jul 26, 2009 at 9:42 AM, Venkat Rangan < venkat.ran...@clearwellsystems.com> wrote: > Shai, > > Thanks for your response. There isn't any specific options I am setting > and am leaving eve