Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Robert Muir
Yes, there is also a safety check, but IMO it should be removed. See the patch on the issue; the test passes now. On Wed, Sep 10, 2014 at 9:31 PM, Vitaly Funstein wrote: > Seems to me the bug occurs regardless of whether the passed in newer reader > is NRT or non-NRT. This is because the user op

Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Vitaly Funstein
Seems to me the bug occurs regardless of whether the passed-in newer reader is NRT or non-NRT. This is because the user operates at the level of DirectoryReader, not SegmentReader, and modifying the test code to do the following reproduces the bug: writer.commit(); DirectoryReader latest =
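
[Editor's note: the preview cuts off mid-snippet. Purely as illustration (this is not the actual test code from the Lucene issue), a sequence along the lines Vitaly describes might look like the following; field names, versions and the in-memory directory are arbitrary choices.]

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    public class ReopenAtLaterCommitSketch {
      public static void main(String[] args) throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(
            Version.LUCENE_4_10_0, new StandardAnalyzer(Version.LUCENE_4_10_0)));

        Document doc = new Document();
        doc.add(new StringField("id", "1", Field.Store.YES));
        writer.addDocument(doc);
        writer.commit();                                     // commit point T0
        DirectoryReader atT0 = DirectoryReader.open(dir);    // reader pinned at T0

        writer.addDocument(doc);
        writer.commit();                                     // commit point T1
        DirectoryReader latest = DirectoryReader.open(dir);  // ordinary (non-NRT) reader at T1

        // Reopen the T0 reader forwards to T1 through the non-NRT reopen path
        // discussed in this thread; non-null here because the index has changed.
        DirectoryReader reopened =
            DirectoryReader.openIfChanged(atT0, latest.getIndexCommit());

        reopened.close();
        latest.close();
        atT0.close();
        writer.close();
        dir.close();
      }
    }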

Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Robert Muir
That's because there are 3 constructors in SegmentReader: 1. one used for opening new (checks hasDeletions, only reads liveDocs if so) 2. one used for non-NRT reopen <-- problem one for you 3. one used for NRT reopen (takes liveDocs as a param, so no bug) so personally I think you should be able

Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Vitaly Funstein
One other observation - if instead of a reader opened at a later commit point (T1) I pass in an NRT reader *without* first doing the second commit on the index, then there is no exception. This probably also hinges on the assumption that no buffered docs have been flushed after T0, thus creating n

Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Vitaly Funstein
> > Normally, reopens only go forwards in time, so if you could ensure > that when you reopen one reader to another, the 2nd one is always > "newer", then I think you should never hit this issue Mike, I'm not sure if I fully understand your suggestion. In a nutshell, the use case here is as follo
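
[Editor's note: for context, a minimal sketch of the forward-only reopen idiom Mike refers to - refresh an existing reader with openIfChanged instead of opening readers at older commit points. This is generic Lucene 4.x usage, not Vitaly's actual code.]

    import java.io.IOException;
    import org.apache.lucene.index.DirectoryReader;

    public final class ReaderRefresh {
      private ReaderRefresh() {}

      /** Returns the freshest reader, closing the old one only if a newer one was opened. */
      public static DirectoryReader refresh(DirectoryReader current) throws IOException {
        DirectoryReader newer = DirectoryReader.openIfChanged(current);
        if (newer == null) {
          return current;      // nothing changed; keep using the existing reader
        }
        current.close();       // unchanged segments are shared between old and new reader
        return newer;
      }
    }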

Re: BlockTreeTermsReader consumes crazy amount of memory

2014-09-10 Thread Michael McCandless
Thanks, I'll look at the issue soon. Right, segment merging won't spontaneously create deletes. Deletes are only made if you explicitly delete OR (tricky) there is a non-aborting exception (e.g. an analysis problem) hit while indexing a document; in that case IW indexes a portion of the document
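
[Editor's note: as a small illustration of what "explicitly delete" means at the user level; the field name and term value are made up.]

    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;

    class DeleteExamples {
      static void deleteOrUpdate(IndexWriter writer, Document newDoc) throws IOException {
        writer.deleteDocuments(new Term("id", "42"));         // explicit delete by term
        writer.updateDocument(new Term("id", "42"), newDoc);  // update = delete + add
      }
    }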

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
Hi, we looked into earlier releases: the index version number of 4.0-ALPHA was "4.0", the index version number of 4.0-BETA was "4.0.0.1", and the index version number of 4.0 final was "4.0.0.2". Ian's index is therefore a real official 4.0 index. Unfortunately the version comparison logic in Lucene 4

Re: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Robert Muir
Ian, it's a supported version. It wouldn't matter if it's 4.0 alpha or beta anyway, because we support index back compat for those. In your case, it's actually the final version. I will open an issue. Thank you for reporting this! On Wed, Sep 10, 2014 at 7:54 AM, Ian Lea wrote: > Yes, quite possibl

Re: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Ian Lea
Yes, quite possible. I do sometimes download and test beta versions. This isn't really a problem for me - it has only happened on test indexes that I don't care about, but there might be live indexes out there that are also affected and having them made unusable would be undesirable, to put it mi

Re: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Ian Lea
Sent to your personal email address. -- Ian. On Wed, Sep 10, 2014 at 12:36 PM, Robert Muir wrote: > Ian, this looks terrible, thanks for reporting this. Is there any > possible way I could have a copy of that "working" index to make it > easier to reproduce? > > On Wed, Sep 10, 2014 at 7:01 AM

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
If you want to upgrade the index, you may try to run IndexUpgrader on Lucene 4.9 to bring it up to date. But index upgrading may fail because of the beta status of the version that originally created the index. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
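
[Editor's note: a minimal sketch of what running IndexUpgrader from code on 4.9 might look like; the index path is a placeholder. IndexUpgrader also has a main() and can be run from the command line as org.apache.lucene.index.IndexUpgrader [-delete-prior-commits] [-verbose] <indexDir> with lucene-core on the classpath.]

    import java.io.File;
    import org.apache.lucene.index.IndexUpgrader;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class UpgradeIndex {
      public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/index"));  // placeholder path
        // Rewrites all segments in the current (4.9) format; as noted above, this
        // can still fail if the index was created by a 4.0 ALPHA/BETA build.
        new IndexUpgrader(dir, Version.LUCENE_4_9).upgrade();
        dir.close();
      }
    }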

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
Hi Ian, this index was created with the BETA version of Lucene 4.0: Segments file=segments_2 numSegments=1 version=4.0.0.2 format= 1 of 1: name=_0 docCount=15730 "4.0.0.2" was the index version number of Lucene 4.0-BETA. This is not a supported version and may not open correctly. In Lucene 4.
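
[Editor's note: the segments summary quoted above is the kind of output CheckIndex produces. Purely as a sketch (the path is a placeholder), it can be run from code as below, or from the command line via org.apache.lucene.index.CheckIndex.]

    import java.io.File;
    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class InspectIndex {
      public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/index"));  // placeholder path
        CheckIndex checker = new CheckIndex(dir);
        checker.setInfoStream(System.out);           // prints the per-segment summary
        CheckIndex.Status status = checker.checkIndex();
        System.out.println("index clean? " + status.clean);
        dir.close();
      }
    }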

Re: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Robert Muir
Ian, this looks terrible, thanks for reporting this. Is there any possible way I could have a copy of that "working" index to make it easier to reproduce? On Wed, Sep 10, 2014 at 7:01 AM, Ian Lea wrote: > Hi > > > On running a quick test after a handful of minor code changes to deal > with 4.10 d

4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Ian Lea
Hi. On running a quick test after a handful of minor code changes to deal with 4.10 deprecations, a program that updates an existing index failed with: Exception in thread "main" java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40) at org.apache.luc
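
[Editor's note: not Ian's actual program, but a minimal sketch of the kind of update code described - opening an IndexWriter over an existing index with 4.10 and updating a document. The analyzer, field names and path are illustrative.]

    import java.io.File;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class UpdateExistingIndex {
      public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/existing/index"));
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_10_0,
            new StandardAnalyzer(Version.LUCENE_4_10_0));
        iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);   // reuse the existing index

        IndexWriter writer = new IndexWriter(dir, iwc);
        Document doc = new Document();
        doc.add(new StringField("id", "42", Field.Store.YES));
        writer.updateDocument(new Term("id", "42"), doc);

        // With the affected 4.0-era indexes, the reported IllegalStateException
        // surfaces while the writer rewrites segment metadata.
        writer.close();
        dir.close();
      }
    }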

IndexReader.document method - Why is this made final?

2014-09-10 Thread Buddhavarapu, Suresh
Hi, I'm working on a project to upgrade Lucene from 2.9.3 to 4.10. We have a need to implement the IndexReader interface to create an abstraction over two disparate indexes. First, I found that IndexReader can no longer be extended. Instead I chose to extend the CompositeReader abstract class.
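
[Editor's note: not necessarily what the poster needs, but a minimal sketch of the stock way in 4.x to present two separate indexes as one reader without subclassing IndexReader - wrap them in a MultiReader. Paths are illustrative.]

    import java.io.File;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.MultiReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    public class TwoIndexesOneReader {
      public static void main(String[] args) throws Exception {
        DirectoryReader r1 = DirectoryReader.open(FSDirectory.open(new File("/path/to/indexA")));
        DirectoryReader r2 = DirectoryReader.open(FSDirectory.open(new File("/path/to/indexB")));

        MultiReader both = new MultiReader(r1, r2);  // one composite view over both indexes
        IndexSearcher searcher = new IndexSearcher(both);
        // ... search as usual; doc IDs are remapped across the two sub-readers ...

        both.close();  // with this constructor, closing the MultiReader also closes r1 and r2
      }
    }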