Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi I think that would be good. Probably a silly thing to ask but I guess there is a performance implication by setting it to max value. Is there a general setting that other developers use? Cheers Amin On 12 Mar 2009, at 22:03, Michael McCandless wrote: IndexWriter has such behavi

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Michael McCandless
IndexWriter has such behavior too, and because it was such a common trap (developers could not understand why their content was being truncated), we made that setting explicit, up front so you were aware of it. I think this in general is a reasonable approach for settings that "lose" stuff

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
I did the following: highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE); which works. On Thu, Mar 12, 2009 at 6:41 PM, Amin Mohammed-Coleman wrote: > JIRA updated. Includes new testcase which shows highlighter not working as > expected. > > > On Thu, Mar 12, 2009 at 5:56 PM, Amin Mohammed
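For readers following this thread, here is a rough sketch of why a cap on analyzed characters can make a highlighter miss a term that is present in the stored text. It is a toy in plain Java, not Lucene's Highlighter: the class `TruncatedHighlightDemo`, the method `highlight`, and the parameter `maxCharsToAnalyze` are hypothetical names invented for illustration. The only assumption carried over from the thread is that a match beyond the analyzed window goes unhighlighted unless the limit is raised.

```java
public class TruncatedHighlightDemo {

    // Hypothetical toy highlighter (not Lucene's API): wraps each occurrence
    // of `term` in <B> tags, but only considers matches that lie entirely
    // within the first maxCharsToAnalyze characters of the text.
    public static String highlight(String text, String term, int maxCharsToAnalyze) {
        if (term.isEmpty()) {
            return text; // nothing to highlight; also avoids an endless loop
        }
        int limit = Math.min(text.length(), maxCharsToAnalyze);
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int hit = text.indexOf(term, pos);
            // Stop when there is no further match, or when the match
            // extends past the analyzed window.
            if (hit < 0 || hit + term.length() > limit) {
                break;
            }
            out.append(text, pos, hit).append("<B>").append(term).append("</B>");
            pos = hit + term.length();
        }
        // Everything after the last highlighted match is copied unchanged.
        out.append(text.substring(pos));
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "aaaa bbbb cccc lucene";
        // Small window: the term sits beyond character 10, so it is missed.
        System.out.println(highlight(text, "lucene", 10));
        // Effectively unlimited window: the term is found and wrapped.
        System.out.println(highlight(text, "lucene", Integer.MAX_VALUE));
    }
}
```

In this toy the cost of a larger window is simply scanning more text, which is the kind of trade-off the follow-up question about performance is asking about.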

Re: search problem when indexed using Field.setOmitTf()

2009-03-12 Thread Otis Gospodnetic
I bet omitTf will be confusing to people. When I see omitTf I read that as "aha, don't store term frequency". I don't read that as "don't store term frequency and don't store positional information". We'll have to document this well or maybe even consider renaming this so it's more self-desc

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
JIRA updated. Includes new test case which shows highlighter not working as expected. On Thu, Mar 12, 2009 at 5:56 PM, Amin Mohammed-Coleman wrote: > Hi > > I have found that it is not an issue with POI. I extracted text using POI but > differently and the term is extracted properly. When I store t

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi, I have found that it is not an issue with POI. I extracted text using POI but differently and the term is extracted properly. When I store the text and retrieve it, the term exists. However, running the text through the highlighter doesn't work. I will post a test case with a plain text file on JIR

Re: Memory during Indexing

2009-03-12 Thread Grant Ingersoll
On Mar 12, 2009, at 10:47 AM, Niels Ott wrote: Michael McCandless wrote: When RAM is full, IW flushes the pending changes to disk, but does not commit them, meaning external (newly opened or reopened) readers will not see the changes. Is there a built-in mechanism in the IndexReader to

Getting Field details on a hit

2009-03-12 Thread NickHirst
Hello Experts, I am using a MultiFieldQueryParser to search my index. The index has been set up with the following structure: design: [designcode] att1: [att1Value] att2: [att2Value] ... attn: [attnValue] Where the attvalues all correspond to the designcode. The search works well, and it ret

What kind of performance to expect from a MultiTermQuery being used in BooleanQuery?

2009-03-12 Thread ArtemGr
Hi! I have this NotEmptyQuery class (http://gist.github.com/78115) which extends the MultiTermQuery. The class is added into a BooleanQuery, after some other queries (e.g. after TermQuery and LongTrieRangeFilter queries). I wonder: does Lucene need to scan all the terms in the inverted index and th

Re: Memory during Indexing

2009-03-12 Thread Niels Ott
Michael McCandless wrote: When RAM is full, IW flushes the pending changes to disk, but does not commit them, meaning external (newly opened or reopened) readers will not see the changes. Is there a built-in mechanism in the IndexReader to reload the index every now and then, after having c

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
JIRA raised: https://issues.apache.org/jira/browse/LUCENE-1559 Thanks On Thu, Mar 12, 2009 at 11:29 AM, Amin Mohammed-Coleman wrote: > Hi > > Did both attachments not come through? > > Cheers > Amin > > > On Thu, Mar 12, 2009 at 9:52 AM, mark harwood wrote: > >> The attachment didn't make it th

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi Did both attachments not come through? Cheers Amin On Thu, Mar 12, 2009 at 9:52 AM, mark harwood wrote: > The attachment didn't make it through here. Can you add it as an attachment > to a new JIRA issue? > > Thanks, > Mark > > > > > > > From: Amin Mohammed-C

StandardTokenizer issue ?

2009-03-12 Thread iMe
I spotted an unexpected behavior when using the StandardAnalyzer. This analyzer uses the StandardTokenizer, whose javadoc states: Splits words at hyphens, unless there's a number in the token, in which case the whole token is interpreted as a product number and is not split. But looking to

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread mark harwood
The attachment didn't make it through here. Can you add it as an attachment to a new JIRA issue? Thanks, Mark From: Amin Mohammed-Coleman To: java-user@lucene.apache.org Sent: Thursday, 12 March, 2009 7:47:20 Subject: Re: Lucene Highlighting and Dynamic Summ

Re: Memory during Indexing

2009-03-12 Thread Michael McCandless
Niels Ott wrote: Hi Mark, markharw00d wrote: Hi Niels, See the javadocs for IndexWriter.setRAMBufferSizeMB() I tried different settings. Apart from the fact that my memory issue seems to be my own fault, I'm wondering what Lucene does in the background. Apparently it does flush(), but

Re: Lucene Highlighting and Dynamic Summaries

2009-03-12 Thread Amin Mohammed-Coleman
Hi, Please find attached a test case plus a document. Just to mention this occurs sometimes for other files. Cheers Amin On Wed, Mar 11, 2009 at 6:11 PM, markharw00d wrote: > If you can supply a JUnit test that recreates the problem I think we can > start to make progress on this. > > > > Amin