Hi Kameron,

> It's clear to me now - I guess that puts garbage collection out of the
> picture.
> 
> But what is then more confusing - especially if, as you say, Apache Lucene
> forcefully unmaps all mapped byte buffers when it closes the IndexInputs.
> So, it must mean that for some reason the IndexInputs are not getting
> closed.  Is there a way to see that?  I guess you very clearly outlined
> these possible causes, which will require code checking:

A quick way to find that out is "lsof" (the "list open files" tool on the
Linux command line). If you see a growing number of open files (especially
ones marked "deleted"), you have a leak in your software.

Another trick is to limit the number of open files with "ulimit". Set it to
something low and run your app. If it runs out of file handles, you have a leak
:-)
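
For example (a sketch; <pid> and yourapp.jar are placeholders for your own
process id and application):

  # list the files the process still holds open; watch for "(deleted)" entries
  lsof -p <pid>

  # cap the number of open files for a test run - a leaking app will hit
  # the limit quickly
  ulimit -n 1024
  java -jar yourapp.jar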

> 1) If you do not close IndexWriter and DirectoryReaders when required, the
> index files stay open.
> 2) If indexing goes on and you reopen the DirectoryReader (e.g. with the
> near realtime functions of IndexWriter to see the actual state), be sure
> to close the "old" reader. Otherwise it will open more and more files.

Yes. If you just have an IndexWriter and sequentially index your files, there
is no risk of leaking file handles/mappings. But as soon as you use the NRT
functions of IndexWriter to get a DirectoryReader on the current state, you may
produce a leak. Be sure to close the old reader once you have obtained a new
one (after the index has been updated).
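
A minimal sketch of that pattern (variable names are mine, the calls are the
standard Lucene API; 'writer' is your existing IndexWriter):

  import org.apache.lucene.index.DirectoryReader;
  import org.apache.lucene.index.IndexWriter;

  DirectoryReader reader = DirectoryReader.open(writer);   // NRT reader

  // ... later, after more documents were indexed ...
  DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
  if (newReader != null) {     // null means nothing has changed
    reader.close();            // close the OLD one, or you leak files/mappings
    reader = newReader;
  }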

The Lucene team recommends using SearcherManager to reopen
DirectoryReaders/IndexSearchers. This class can be connected to the
IndexWriter, and your software can acquire/release NRT (near-real-time)
searchers to execute searches. When you ask it to refresh, the class checks
whether the current reader is up to date (with everything that was indexed so
far through the IndexWriter). If the DirectoryReader is outdated, it
*correctly* opens a new one, without leaking file handles; acquire() then
simply hands out the freshest searcher.
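
Roughly like this (again a sketch, not production code):

  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.SearcherFactory;
  import org.apache.lucene.search.SearcherManager;

  SearcherManager mgr = new SearcherManager(writer, new SearcherFactory());

  // after indexing (or periodically from a background thread):
  mgr.maybeRefresh();          // reopens and closes old readers correctly

  // per search request:
  IndexSearcher searcher = mgr.acquire();
  try {
    // ... run your searches with 'searcher' ...
  } finally {
    mgr.release(searcher);     // never close the searcher yourself
  }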

> In our case, indexing (actually, re-indexing) happens a lot!  The people
> managing this installation have a need to keep the large index updated. Is
> there just a kind of fundamental "race condition" that comes from indexing
> back-to-back?  Clearly, fewer rebuilds in a day lessens the danger of
> machine crash.  We can be fairly certain of that.  Still, I don't see why
> it should be necessary to worry about too many index builds.  The OS
> should be able to handle this.

If you reindex a lot while using the NRT functions (see above), that risk is there.

If you have some bulk indexing going on, but you don't require NRT visibility
of index changes *during indexing*, just keep the DirectoryReader used for
searches open (it will stay on the snapshot from before indexing started),
index your stuff, and once everything is committed, reopen the
DirectoryReader. And don't forget to close the old one!
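
Something like this ('docs', 'writer' and 'reader' are placeholders for your
own objects):

  // searches keep using 'reader' (the pre-indexing snapshot) meanwhile
  for (Document doc : docs) {
    writer.addDocument(doc);
  }
  writer.commit();             // make the changes durable and visible

  // now switch the searches over to the new snapshot:
  DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
  if (newReader != null) {
    reader.close();            // the important part!
    reader = newReader;
  }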

The risk of leaking file handles is always there if you index and search at
the same time and want to see updates to the index ASAP (near real time).
SearcherManager, Apache Solr, and Elasticsearch handle this correctly; maybe
look at their code.

> I keep coming back to this, though - can this have anything to do with
> Windows virtual memory management?  I kind of specialized way back in
> college in OS level functions, and Windows has a completely different
> paradigm than Unix for memory management.  Throughout a pretty long
> career
> in IT development, I have seen time and time again - including in this
> case - that when you reboot Windows, the memory problems are gone.  I
> have
> almost never seen or heard of rebooting Linux or AIX in this regard. That
> said, I guess that any discussion of ulimits is moot, right?

Oh, Windows. That makes debugging the problem harder (no lsof or ulimit tools -
see my first recommendation). Nevertheless, you can still verify the
correctness of your app by running it on Linux (with lsof and an open-files
ulimit) and watching for leaks there.

But otherwise Windows is not a problem in itself; only some huge installations
may get into trouble: on Windows, the usable 64-bit virtual address space is
quite limited. Linux gives you about 16 terabytes of virtual address space,
Windows only around 2-4 terabytes. Due to fragmentation and huge indexes, the
address space may not be large enough. Lucene mitigates this by memory-mapping
in chunks of 1 gigabyte, so it is unlikely, but if your index is really huge,
you may still run out of address space. I have heard of customers who managed
to crash their machines with that. In any case, if you have indexes that huge,
you should think about sharding (e.g. by using Solr or Elasticsearch on
multiple machines).

The bad thing on Windows is that you cannot fall back to another IO
implementation:
- NIOFSDirectory is too slow there (a Windows/JVM bug serializes multithreaded
reads) - on Linux one may use NIOFSDirectory instead of MMapDirectory when
running out of virtual address space.
- SimpleFSDirectory is just too slow.
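
For completeness, choosing the directory implementation explicitly looks
roughly like this (the index path is a placeholder; the 1 GiB chunk size shown
is Lucene's 64-bit default anyway):

  import java.nio.file.Paths;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.store.MMapDirectory;

  // picks the best implementation for your platform/JVM automatically:
  Directory dir = FSDirectory.open(Paths.get("/path/to/index"));

  // or force mmap; mapping in 1 GiB chunks limits address space
  // fragmentation:
  Directory mmap = new MMapDirectory(Paths.get("/path/to/index"), 1 << 30);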

Uwe

> From:   "Uwe Schindler" <u...@thetaphi.de>
> To:     <java-user@lucene.apache.org>
> Date:   02/24/2017 06:22 PM
> Subject:        RE: MappedByteBuffer duplicates
> 
> 
> 
> Hi,
> 
> You did not give us all information. So I can only give some hints,
> because there could be multiple causes for your problems. There is for
> sure no bug in Apache Lucene as there are thousands of Solr and
> Elasticsearch instances running without such problems.
> 
> > Actually, at a certain point, they have crashed the machine. The native
> > file mappings are deallocated (unmapped) by the JVM when the
> > MappedByteBuffers are eligible for garbage collection. The problem we're
> > seeing is that there are thousands of MappedByteBuffers which are not
> > eligible for garbage collection. The native memory is retained because the
> > Lucene code is still referencing the MappedByteBuffer objects on the Java
> > heap. This isn't the fault of Windows or the JVM. It appears to be a fault
> > in Lucene, but we can't diagnose it - we can't see why the MappedByteBuffer
> > objects are being retained.
> 
> For Apache Lucene this is not true:
> 
> Apache Lucene forcefully unmaps all mapped byte buffers when it closes the
> IndexInputs. Without that, we would need to wait for Garbage Collection
> for this to happen, which not only brings problems for virtual address
> space (your problem), but also disk usage (files that have mapped contents
> cannot be deleted). So your statement is not true. Lucene does not need to
> wait for Garbage Collector, it forces unmapping!
> 
> If forceful unmapping does not work (requires Oracle JDK, OpenJDK or IBM
> J9 - version [7 for Lucene 5], Java 8, Java 9 b150+), MMapDirectory is not
> used by default. This happens on JVMs which do not expose the internal
> APIs that are needed to do that. To check this, print the contents of:
> 
> http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_SUPPORTED
> 
> http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_NOT_SUPPORTED_REASON
> 
> 
> If you use FSDirectory.open() to get a directory instance (factory
> method), it will not choose MMapDir if unmapping is not supported. So it
> may happen that you forcefully use MMapDirectory although unmapping does
> not work for your JVM?
> 
> Nevertheless, you say that you see many MappedByteBuffers that are not
> eligible for garbage collection. Of course Lucene will not unmap those
> because they are still in use. The reason for this could be incorrect code
> on your side. If you do not close IndexWriter and DirectoryReaders when
> required, the index files stay open. If indexing goes on and you reopen
> the DirectoryReader (e.g. with the near realtime functions of IndexWriter
> to see the actual state), be sure to close the "old" reader. Otherwise it
> will open more and more files. Depending on the maximum open files limit,
> you can run out of file handles, or (if you have many file handles) it may
> crash the machine, because you use up all virtual address space.
> 
> To fully analyze your problem, we need more information. Please also
> provide:
> - Lucene version
> - Operating System version
> - "ulimit -a" output (POSIX operating systems)
> - Java version and vendor
> - Crash report
> - Source code to show what you are doing: Just indexing (your problem is
> impossible), indexing and searching in parallel, do you use NRT readers
> for realtime visibility of indexed content
> 
> Uwe
> 
> > From:   "Uwe Schindler" <u...@thetaphi.de>
> > To:     <java-user@lucene.apache.org>
> > Date:   02/24/2017 01:39 PM
> > Subject:        RE: MappedByteBuffer duplicates
> >
> >
> >
> > Hi,
> >
> > that is not an issue; the duplicates are required for so-called IndexInput
> > clones and slices. Every search request will create many of them. But
> > there is no need to worry, they are just thin wrappers - they don't
> > allocate any extra off-heap memory. They are just there to have a separate
> > position(), limit() and other settings for each searcher thread.
> >
> > Why do you worry?
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Kameron Cole [mailto:kameronc...@us.ibm.com]
> > > Sent: Friday, February 24, 2017 7:19 PM
> > > To: java-user@lucene.apache.org
> > > Subject: MappedByteBuffer duplicates
> > >
> > > We have a Lucene engine that creates MappedByteBuffer objects when
> > > creating the Lucene index.  I don't know Lucene well enough to know if
> > > this is standard behavior.
> > >
> > > The mapped files are being created by Lucene via the JRE's NIO APIs, with
> > > a native file mapping underneath each MappedByteBuffer object. We see an
> > > issue where duplicate MappedByteBuffer objects are being created.  Has
> > > anyone seen this?
> > >
> > > Thank you!
> > >
> >
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org