Hi,
  Thank you very much for your reply!

   After analyzing the heap dump, I found a RAMFile instance whose size is
1,670,583,296 bytes (about 1.6 GB). I have limited maxCachedMB to 60 MB, so
why does NRTCachingDirectory decide to cache such a big file? This file was
obviously produced by the merge scheduler.
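   From a quick look at the 4.x source, NRTCachingDirectory seems to decide
whether to cache a file once, when the file is created, based on the
*estimated* merge or flush size in the IOContext rather than on the file's
actual final size. A rough paraphrase from memory (so field names may be
slightly off, this is a sketch, not a verbatim copy of the source):
  -----------------------------------
  // Rough paraphrase of NRTCachingDirectory.doCacheWrite() in Lucene 4.x.
  protected boolean doCacheWrite(String name, IOContext context) {
      long bytes = 0;
      if (context.mergeInfo != null) {
          bytes = context.mergeInfo.estimatedMergeBytes;   // an estimate only
      } else if (context.flushInfo != null) {
          bytes = context.flushInfo.estimatedSegmentSize;
      }
      // If the estimate fits, the file is cached in RAM; it can then grow
      // past the estimate, since this check is not repeated while writing.
      return bytes <= maxMergeSizeBytes
          && bytes + cache.sizeInBytes() <= maxCachedBytes;
  }
  -----------------------------------
If that is how 4.7 behaves, an underestimated merge size could explain how
such a large file ended up cached despite the 5 MB / 60 MB limits.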



   The code that initializes Lucene is as follows; you can see that
maxCachedMB is 60 and maxMergeSizeMB is 5.
   -----------------------------------
   // Open the on-disk index and wrap it in an NRT cache:
   // maxMergeSizeMB = 5.0, maxCachedMB = 60.0
   indexDir = FSDirectory.open(new File(indexDirName));
   NRTCachingDirectory cachedFSDir =
       new NRTCachingDirectory(indexDir, 5.0, 60.0);

   IndexWriterConfig iwc =
       new IndexWriterConfig(Version.LUCENE_47, analyzer);
   iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
   indexWriter = new IndexWriter(cachedFSDir, iwc);

   // NRT searcher manager on top of the writer (applyAllDeletes = true)
   searcherMgr = new SearcherManager(indexWriter, true,
       new SearcherFactory());
   searcherMgrList.add(searcherMgr);

   // Facets: the taxonomy index lives in a subdirectory of the main index
   facetsConfig = new FacetsConfig();
   taxoDir = FSDirectory.open(new File(indexDirName + "/taxo"));
   taxoWriter = new DirectoryTaxonomyWriter(taxoDir);
   -----------------------------------
  Thanks & Best Regards!

------------------ Original ------------------
From: "Uwe Schindler" <u...@thetaphi.de>
Date: Sat, Jun 28, 2014 05:41 PM
To: "java-user" <java-user@lucene.apache.org>
Subject: RE: RE: About lucene memory consumption

Hi,

what does your configuration for NRTCachingDirectory look like? There are two
constructor params: one is maxMergeSizeMB, the other is maxCachedMB. If you
correctly close (or release, in the case of ReaderManager/SearcherManager) all
indexes, this should limit the memory use.
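For reference, the usual SearcherManager pattern pairs every acquire() with a
release() in a finally block (query below is just a placeholder):

  IndexSearcher searcher = searcherMgr.acquire();  // ref-counted searcher
  try {
    TopDocs hits = searcher.search(query, 10);
    // use hits while the searcher is still acquired
  } finally {
    // Without this, old SegmentReaders (and any RAMFiles they reference)
    // stay reachable and can never be garbage collected.
    searcherMgr.release(searcher);
  }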

There is no caching using LinkedHashMap<byte[],byte[]> in Lucene. RAMDirectory
(used by NRTCachingDirectory) uses a ConcurrentHashMap. It looks like you have
some other stuff referenced.

> Hi,
>    I got a heap dump, used a tool to analyze it, and found that almost all
> of the byte[] instances are indirectly referenced by SearcherManager.
> The memory path is as follows:
> 
> 
>     byte[]
>     [Ljava/lang/Object;
>     java/util/ArrayList
>     org/apache/lucene/store/RAMFile
>     org/apache/lucene/store/RAMInputStream
>     org/apache/lucene/store/Directory$SlicedIndexInput
>     org/apache/lucene/codecs/compressing/CompressingStoredFieldsReader
>     org/apache/lucene/index/SegmentCoreReaders
>     org/apache/lucene/index/SegmentReader
>     [Lorg/apache/lucene/index/SegmentReader;
>     org/apache/lucene/index/StandardDirectoryReader
>     org/apache/lucene/search/IndexSearcher
>     org/apache/lucene/search/SearcherManager
>     com/crm/lucene/LuceneMailIndex

It looks like you give too much heap space to NRTCachingDirectory. For NRT
caching only very little memory is needed (its main purpose is just to avoid
flushing all data to disk on every reopen). FYI, the defaults in Apache Solr
are:
DEFAULT_MAX_MERGE_SIZE_MB = 4;
DEFAULT_MAX_CACHED_MB = 48;

I would use those as defaults. As this is backed by RAMDirectory, it's a bad
idea to use too much heap: it allocates in blocks of 1024 bytes, so with large
caches this creates millions of byte[] instances and puts pressure on the GC.
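A minimal sketch of wiring in those defaults (the index path is just a
placeholder):

  Directory fsDir = FSDirectory.open(new File("/path/to/index"));  // placeholder
  NRTCachingDirectory dir = new NRTCachingDirectory(fsDir, 4.0, 48.0);
  // Even a full 48 MB cache is already ~49,000 byte[1024] blocks;
  // a 1.6 GB RAMFile would be roughly 1.6 million such blocks.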

Uwe


