Re: debugging growing index size

2015-11-13 Thread Michael McCandless
I can see that segments that are deleted never decRef to 0, e.g. _1dtf, I assume because NRT readers are not being closed. I think you should fix that bug, so every NRT reader ever opened is also closed (SearcherManager is a simple way to ensure this), and then let's regroup if disk space is still
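To illustrate why an unclosed NRT reader can keep deleted segment files on disk, here is a minimal reference-counting sketch in plain Java. It is not Lucene's implementation: `Segment`, `Reader`, and the second segment name `_y` are hypothetical stand-ins chosen for illustration, and the only thing modelled is the rule that files are removed once the count reaches 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class NrtRefCountSketch {

    /** Hypothetical stand-in for a segment's files; not a Lucene class. */
    static final class Segment {
        final String name;
        // Starts at 1: the index itself holds one reference.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private boolean filesDeleted = false;

        Segment(String name) { this.name = name; }

        void incRef() { refCount.incrementAndGet(); }

        // Files may only be removed once nobody references the segment.
        void decRef() {
            if (refCount.decrementAndGet() == 0) {
                filesDeleted = true;
            }
        }

        boolean filesDeleted() { return filesDeleted; }
        int refCount() { return refCount.get(); }
    }

    /** Hypothetical reader that pins a segment until it is closed. */
    static final class Reader implements AutoCloseable {
        private final Segment segment;
        private boolean closed = false;

        Reader(Segment segment) {
            this.segment = segment;
            segment.incRef();
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                segment.decRef();
            }
        }
    }

    public static void main(String[] args) {
        // Leaked reader: opened but never closed.
        Segment leaked = new Segment("_1dtf");
        new Reader(leaked);
        leaked.decRef(); // the index drops its reference, e.g. after a merge
        System.out.println(leaked.name + " deleted: " + leaked.filesDeleted()); // false

        // Properly released reader: try-with-resources guarantees close().
        Segment released = new Segment("_y");
        try (Reader r = new Reader(released)) {
            // searches would run here
        }
        released.decRef();
        System.out.println(released.name + " deleted: " + released.filesDeleted()); // true
    }
}
```

In the leaked case the count stays at 1 after the index drops its own reference, so the files are never removed. Pairing every open with a close is the guarantee the message above says SearcherManager provides.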

Re: debugging growing index size

2015-11-13 Thread Michael McCandless
So with MMapDir at defaults (unmap is enabled) you see old files, with no open file handles as reported by lsof, still existing in your index directory, taking lots of space. But with NIOFSDirectory the issue doesn't happen? Are you sure? I'll look at the 6.6 GB infoStream to see what it says ab

RE: debugging growing index size

2015-11-13 Thread Rob Audenaerde
I haven't disabled unmapping, and I am running the out-of-the-box FSDirectory.open(). As far as I can see, it tries to pick MMap. For the test I explicitly constructed an NIOFSDirectory. OS is (off the top of my head) CentOS 6.x, Java 1.8.0u33. I can check later for more details. On Nov 13, 2015 18:07,

RE: debugging growing index size

2015-11-13 Thread Uwe Schindler
Hi, Lucene has the workaround, so it should not happen, UNLESS you explicitly disable the hack using MMapDirectory#setEnableUnmap(false). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: will martin

RE: debugging growing index size

2015-11-13 Thread Uwe Schindler
Did you disable unmapping using MMapDirectory#setEnableUnmap() ? By default it should be enabled, but maybe you disabled it for some reason? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Rob Auden

Re: debugging growing index size

2015-11-13 Thread will martin
Hi Rob: Doesn't this look like the known Java SE issue JDK-4724038, discussed by Peter Levart and Uwe Schindler on a lucene-dev thread on 9/9/2015? MappedByteBuffer … What OS are you on, Rob? What JVM? http://bugs.java.com/view_bug.do?bug_id=4724038 http://mail-archives.apache.org/mod_mbox/lucene-dev/

Re: debugging growing index size

2015-11-13 Thread Rob Audenaerde
I'm currently running using NIOFS. It seems to prevent the issue from appearing. This is a second run (with applied deletes etc.):
raudenaerd@:/<6>index/index$sudo ls -lSra *.dvd
-rw-r--r--. 1 apache apache 7993 Nov 13 16:09 _y_Lucene50_0.dvd
-rw-r--r--. 1 apache apache 39048886 Nov 13 17:12

Re: Query documents where Field Doesn't Exist

2015-11-13 Thread Adrien Grand
Hi Vlad, This is something that you generally can't do. If you have doc values enabled on your fields, you can use Lucene's FieldValueQuery, but beware that this query is very slow. Otherwise if your field is indexed, you can run a TermRangeQuery that has both bounds open but this will be even slo

Re: debugging growing index size

2015-11-13 Thread Rob Audenaerde
I got the data (beware, it is about a 180 MB download, xz-compressed; unpacked it is about 6.6 GB). Unfortunately, I accidentally restarted the application, so the index files and lsof output could not be captured for this run. Hopefully the infoStream log with the extra logging will provide enough inf