Re: new to lucene- some questions regarding internals

2015-08-11 Thread Erick Erickson
Questions 1-3 are really answered by the same explanation: when you open a searcher, Lucene "knows" what all the closed segments are (i.e., the last commit point). And you can't commit when only part of a document has been written to the current segment. You can think of commits as atomic at the document le
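
To make the point-in-time idea above concrete, here is a minimal toy sketch in plain Java. It does not use the Lucene API: `ToyIndex`, `ToySearcher`, and their methods are hypothetical names invented for illustration, and the model ignores deletions, merges, and near-real-time readers. It only shows that a searcher opened on one commit point keeps seeing that snapshot, while documents added later become visible only to a searcher opened after the next commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SnapshotDemo {

    /** Hypothetical stand-in for an index: committed docs plus uncommitted pending docs. */
    static class ToyIndex {
        private List<String> committed = Collections.emptyList();
        private final List<String> pending = new ArrayList<>();

        /** Buffers a whole document; nothing is visible to searchers until commit(). */
        void addDocument(String doc) {
            pending.add(doc);
        }

        /** Publishes all pending documents as a new immutable commit point. */
        void commit() {
            List<String> next = new ArrayList<>(committed);
            next.addAll(pending);
            pending.clear();
            // A new list is created each time, so older snapshots are never mutated.
            committed = Collections.unmodifiableList(next);
        }

        /** Opens a searcher bound to the latest commit point. */
        ToySearcher openSearcher() {
            return new ToySearcher(committed);
        }
    }

    /** Hypothetical searcher that only ever sees the snapshot it was opened on. */
    static class ToySearcher {
        private final List<String> snapshot;

        ToySearcher(List<String> snapshot) {
            this.snapshot = snapshot;
        }

        int numDocs() {
            return snapshot.size();
        }

        String doc(int docId) {
            return snapshot.get(docId);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("a");
        index.commit();

        ToySearcher first = index.openSearcher();
        index.addDocument("b");                              // buffered, not yet committed
        System.out.println(first.numDocs());                 // prints 1

        index.commit();
        System.out.println(first.numDocs());                 // prints 1 (old snapshot)
        System.out.println(index.openSearcher().numDocs());  // prints 2 (new snapshot)
    }
}
```

In this toy model a docId obtained from `first` stays valid for as long as `first` is used, because its snapshot is never modified.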

Re: Standard highlighter returns whole document as a fragment

2015-08-11 Thread Duke DAI
It seems we are encountering the same problem (thread: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?). When debugging, is your fragmenter SimpleSpanFragmenter? Does isNewFragment() return true due to the logic below? boolean isNewFrag = offsetAtt.endOffset() >= (fragmentSize * c

Re: bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-11 Thread Duke DAI
Greetings! Does anybody have input on this? Best regards, Duke If not now, when? If not me, who? On Fri, Aug 7, 2015 at 10:58 AM, Duke DAI wrote: > Hi experts, > > I'm trying to reproduce a bug from the Lucene side, and found something. > > In the latest codeline, 5.2.1, I modified test > case HighlighterT

Re: Re: memory cost in forceMerge(1)

2015-08-11 Thread Duke DAI
From my experience, you are probably hitting a system issue. Check disk performance first (disk queue length on Windows), or enable verbose GC logging to see the GC activity in detail. I designed an auto-upgrade mechanism in an application that calls forceMerge(1) to eradicate hybrid index for

new to lucene- some questions regarding internals

2015-08-11 Thread Yechiel Feffer
Hi, 1. As I understand it, Lucene prepares the documents of a search result lazily, using the docId in the ScoreDoc. What happens if the document "pointed" to by the ScoreDoc is deleted in the meantime, i.e., the docId is no longer valid (maybe assigned to a different document)? 2. when a docu

Re: Re: memory cost in forceMerge(1)

2015-08-11 Thread Phaneendra N
Could there be other applications running on the machine with 24 GB of memory? That would leave less available memory than is required. In that case there may be disk swapping, which would take a long time. In theory, if you run this test on machines with memory 50 GB and 100 GB in this case

Re: Re: memory cost in forceMerge(1)

2015-08-11 Thread 丁儒
The index will not change often, so we call forceMerge at the end. Will forceMerge(1) cost too much memory? The final size of the index is 15 GB. I just want to know why different machines take different amounts of time in forceMerge: they have the same CPU and disk, but different amounts of memory. One