Re: Searching while optimizing

2009-11-29 Thread Michael McCandless
OK I opened https://issues.apache.org/jira/browse/LUCENE-2097 to track this. Thanks v.sevel! Mike On Sun, Nov 29, 2009 at 5:57 AM, Michael McCandless wrote: > OK I dug down on this one... it's actually a bug in IndexWriter, when > used in near real-time mode *and* when CFS is enabled.  In that

Re: Searching while optimizing

2009-11-29 Thread Michael McCandless
OK I dug down on this one... it's actually a bug in IndexWriter, when used in near real-time mode *and* when CFS is enabled. In that case, internally IndexWriter holds open the wrong SegmentReader, thus tying up more disk space than it should. Functionally, the bug is harmless -- it's just tying

Re: Searching while optimizing

2009-11-28 Thread vsevel
Hi, thanks for the explanations. Though I had no luck... I now close the reader before the commit. But still, only the close gets us back to nominal. Here is the complete test: @Test public void optimize() throws Exception { final File dir = new File("lucene_work/optimiz

Re: Searching while optimizing

2009-11-27 Thread Michael McCandless
Phew, thanks for testing! It's all explainable... When you have a reader open, it prevents the segments it had opened from being deleted. When you close that reader, the segments can be deleted; however, that won't happen until the writer next tries to delete, which it does only periodically (
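
To make the behaviour described above concrete, here is a minimal toy model in plain Java. It is not Lucene code and does not reflect IndexWriter internals; the names ToyIndex, ToyReader and deleteUnreferenced() are hypothetical, and the writer's periodic cleanup is assumed to be an explicit call. It only illustrates the two points made in the message: an open reader keeps its segments on disk, and closing the reader does not free them until the writer next runs its delete pass.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: hypothetical names, not Lucene's API or implementation. */
public class SegmentRefCountSketch {

    static class ToyIndex {
        final Set<String> onDisk = new HashSet<>();          // segment files present on disk
        Set<String> live = new HashSet<>();                  // segments the index currently uses
        final Map<String, Integer> readerRefs = new HashMap<>(); // open readers per segment

        void addSegment(String name) {
            onDisk.add(name);
            live.add(name);
        }

        /** A reader pins the segments that are live at the moment it is opened. */
        ToyReader openReader() {
            Set<String> snapshot = new HashSet<>(live);
            for (String s : snapshot) {
                readerRefs.merge(s, 1, Integer::sum);
            }
            return new ToyReader(this, snapshot);
        }

        /** Merging writes one new segment; the old ones become obsolete but stay on disk. */
        void optimize(String mergedName) {
            onDisk.add(mergedName);
            live = new HashSet<>();
            live.add(mergedName);
        }

        /** Assumed stand-in for the writer's periodic delete pass. */
        void deleteUnreferenced() {
            onDisk.removeIf(s -> !live.contains(s) && readerRefs.getOrDefault(s, 0) == 0);
        }
    }

    static class ToyReader {
        private final ToyIndex index;
        private final Set<String> segments;
        private boolean closed;

        ToyReader(ToyIndex index, Set<String> segments) {
            this.index = index;
            this.segments = segments;
        }

        /** Releases the pins; note that nothing is deleted here. */
        void close() {
            if (closed) {
                return;
            }
            closed = true;
            for (String s : segments) {
                index.readerRefs.merge(s, -1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addSegment("_0");
        index.addSegment("_1");

        ToyReader reader = index.openReader();
        index.optimize("_2");
        index.deleteUnreferenced();
        System.out.println("reader open, after delete pass: " + index.onDisk.size() + " segments");

        reader.close();
        System.out.println("reader closed, no delete pass yet: " + index.onDisk.size() + " segments");

        index.deleteUnreferenced();
        System.out.println("after next delete pass: " + index.onDisk.size() + " segments");
    }
}
```

In this sketch the old segments survive both the first delete pass (the reader still pins them) and the close() call (no delete pass has run since), and only disappear on the following delete pass.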

Re: Searching while optimizing

2009-11-27 Thread vsevel
Hi, I have done some testing that I would like to share with you. I am starting my tests with an unoptimized 40Mb index. I have 3 test cases: 1) open a writer, optimize, commit, close 2) open a writer, open a reader from the writer, optimize, commit, close 3) same as 2) except the reader is opene

Re: Searching while optimizing

2009-11-24 Thread Michael McCandless
OK, I'll add that to the javadocs; thanks. But the fact that you weren't closing the old readers was probably also tying up lots of disk space... Mike On Tue, Nov 24, 2009 at 3:31 PM, vsevel wrote: > > Hi, this is good information. as I read your post I realized that I am > supposed to commit a

Re: Searching while optimizing

2009-11-24 Thread vsevel
Hi, this is good information. As I read your post I realized that I am supposed to commit after an optimize, which is something I do not currently do. That would probably lead to the extra disk space I saw being consumed. If this is correct, then the optimize javadoc could be improved to say that

Re: Searching while optimizing

2009-11-24 Thread Michael McCandless
On Tue, Nov 24, 2009 at 9:08 AM, vsevel wrote: > Hi, just to make sure I understand correctly... After an optimize, without > any reader, my index takes 30Gb on the disk. Are you saying that if I can > ensure there is only one reader at a time, it could take up to 120Gb on the > disk if searching

Re: Searching while optimizing

2009-11-24 Thread vsevel
Hi, just to make sure I understand correctly... After an optimize, without any reader, my index takes 30Gb on the disk. Are you saying that if I can ensure there is only one reader at a time, it could take up to 120Gb on the disk if searching while an optimize is going on? I did not get your 3X

Re: Searching while optimizing

2009-11-24 Thread Michael McCandless

RE: Searching while optimizing

2009-11-24 Thread Uwe Schindler
> On Tue, Nov 24, 2009 at 1:44 AM, vsevel wrote: > > 1) correct: I am using IndexWriter.getReader(). I guess I was assuming that > > was a privately owned object and I had no business dealing with its

Re: Searching while optimizing

2009-11-24 Thread Michael McCandless
On Tue, Nov 24, 2009 at 1:44 AM, vsevel wrote: > > 1) correct: I am using IndexWriter.getReader(). I guess I was assuming that > was a privately owned object and I had no business dealing with its > lifecycle. the api would be clearer to rename the operation createReader(). I just committed an ad

Re: Searching while optimizing

2009-11-23 Thread vsevel
1) Correct: I am using IndexWriter.getReader(). I guess I was assuming that was a privately owned object and I had no business dealing with its lifecycle. The API would be clearer if the operation were renamed createReader(). 2) How much transient disk space should I expect? isn't this pretty much what

Re: Searching while optimizing

2009-11-23 Thread Michael McCandless
When you say "getting a reader of the writer" do you mean writer.getReader()? I.e. the new near real-time API in 2.9? For that API (and in general whenever you open a reader), you must close it. I think all those files remain because you're not closing your old readers. Reopening readers during optimiz