RE: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-05 Thread thturk
Thank you for all the answers. The basic solution was to commit rarely, via a scheduled task or manually, and to keep the heap size to a minimum so that the GC runs often and in parallel to release memory.
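The scheduled-commit approach can be sketched in plain Java. Here the periodic task only counts invocations as a stand-in; in a real application it would call IndexWriter.commit() (the wiring below is illustrative, not the poster's actual code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicCommit {
    // Runs `commitTask` every `periodMillis` for roughly `runMillis`,
    // returning how many times it fired. In a real application the task
    // would call IndexWriter.commit(); here it only counts.
    static int runFor(Runnable commitTask, long periodMillis, long runMillis)
            throws InterruptedException {
        AtomicInteger fired = new AtomicInteger();
        Runnable counted = () -> { commitTask.run(); fired.incrementAndGet(); };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(counted, 0, periodMillis, TimeUnit.MILLISECONDS);
        Thread.sleep(runMillis);
        scheduler.shutdown();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        return fired.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 50 ms period here for demonstration; hourly or daily in practice.
        int commits = runFor(() -> { /* IndexWriter.commit() would go here */ }, 50, 300);
        System.out.println(commits >= 2);
    }
}
```

scheduleWithFixedDelay (rather than scheduleAtFixedRate) is the safer choice for commits, since it waits for the previous commit to finish before starting the delay for the next one.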

RE: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-04 Thread Uwe Schindler
Thursday, April 4, 2019 11:49 AM > To: java-user@lucene.apache.org > Subject: RE: Why does Lucene 7.4.0 commit() Increase Memory Usage x2 > > Hi, > > Thanks Adrien. With current JVM versions (Java 8 or Java 11), the garbage > collector never gives back memory to the operating sys

RE: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-04 Thread Uwe Schindler
-Original Message- > From: Adrien Grand > Sent: Thursday, April 4, 2019 10:00 AM > To: Lucene Users Mailing List > Subject: Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2 > > I think what you are experiencing is just due to how the JVM works: it > happily reserves

Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-04 Thread Adrien Grand
I think what you are experiencing is just due to how the JVM works: it happily reserves memory from the operating system if it thinks it might need it, and then it's reluctant to give it back because it assumes that if it has needed so much memory in the past, it might need it again in the future. If
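The distinction Adrien draws between memory the JVM has reserved from the OS and memory the application actually uses can be observed from inside the process (a minimal sketch; task managers only ever show the reserved figure):

```java
public class HeapSnapshot {
    // Returns true when the expected ordering used <= reserved <= max holds.
    static boolean ordered() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();               // ceiling set by -Xmx
        long reserved = rt.totalMemory();        // heap currently claimed from the OS
        long used = reserved - rt.freeMemory();  // live objects plus uncollected garbage
        return used <= reserved && reserved <= max;
    }

    public static void main(String[] args) {
        System.out.println(ordered());
    }
}
```

The "x2 after commit" effect lives in the gap between `used` and `reserved`: the collector grows `reserved` under allocation pressure and is slow to shrink it again.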

Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-03 Thread thturk
I have tried Java VisualVM to watch GC status on each commit and release variables for the Reader, Writer and Searcher. But as a result the GC works like in the photo below. After 16.40 I called GC manually but the heap size didn't

Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-02 Thread Erick Erickson
Task manager is almost useless for this kind of measurement. You never quite know how much garbage that hasn’t been collected is in that total. You can attach something like jconsole to the running Solr process and hit the “perform full GC” to get a more accurate number. Or you can look at GCVi

Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-02 Thread thturk
I am watching via Task Manager. Now I have tried to handle this in a hard-coded way: I create a new index, and a commit in the small index costs little memory. But I don't think that is a good way to do this; it makes the indexes harder to manage.

Re: Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-04-02 Thread Adrien Grand
How do you measure memory usage? On Mon, Apr 1, 2019 at 8:33 AM thturk wrote: > > Hello, > -For a while I have been trying to figure out why RAM usage increases x2 from before > after committing one single document. > > -Lucene Version 7.4.0 > -Writer Directory FSDirectory > -Reader

Why does Lucene 7.4.0 commit() Increase Memory Usage x2

2019-03-31 Thread thturk
-After commit I create a new searcher for real-time data, and also close the older one. Closing the old index decreases memory usage, but not down to what it was when the app started. I know that I shouldn't commit that often, but I am trying to test how it reacts on each commit. Even committing per hour or day, the memory u
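The swap-and-close cycle described here (new searcher after each commit, old one closed) can be modeled in plain Java. In real Lucene code, SearcherManager does this and additionally reference-counts searchers still in use by in-flight queries; this sketch omits that:

```java
import java.util.concurrent.atomic.AtomicReference;

public class SearcherSwap {
    // Minimal stand-in for an IndexSearcher over a point-in-time reader.
    static final class Searcher implements AutoCloseable {
        final int generation;
        volatile boolean closed;
        Searcher(int generation) { this.generation = generation; }
        @Override public void close() { closed = true; }
    }

    static final AtomicReference<Searcher> current = new AtomicReference<>(new Searcher(0));

    // After each commit: publish a fresh searcher, then close the old one
    // so the memory its segment readers hold can be reclaimed.
    static Searcher refresh() {
        Searcher fresh = new Searcher(current.get().generation + 1);
        Searcher old = current.getAndSet(fresh);
        old.close();
        return old;
    }

    public static void main(String[] args) {
        Searcher old = refresh();
        System.out.println(old.closed && current.get().generation >= 1);
    }
}
```

Closing the old searcher only makes its memory reclaimable; the heap will not visibly shrink until the collector runs, which matches the observation in this thread.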

Re: Crazy increase of MultiPhraseQuery memory usage in Lucene 5 (compared with 3)

2016-10-05 Thread Trejkaz
Thought I would try some thread necromancy here, because nobody replied about this a year ago. Now we're on 5.4.1 and the numbers changed a bit again. Recording best times for each operation. Indexing: 5.723 s SpanQuery: 25.13 s MultiPhraseQuery: (waited 10 minutes and it hasn't compl

Re: Crazy increase of MultiPhraseQuery memory usage in Lucene 5 (compared with 3)

2015-08-23 Thread Trejkaz
I spent some time carving out a quick test of the bits that matter and put them up here: https://gist.github.com/trejkaz/a72b87277b1aec800c2e The tests index 1,000,000 docs with just one instance of the field/sub-field trick we're using, plus one unique value. So it's a bit of an artificial test,

Crazy increase of MultiPhraseQuery memory usage in Lucene 5 (compared with 3)

2015-08-23 Thread Trejkaz
There is a MultiPhraseQuery we use which looks a bit like: MultiPhraseQuery query = new MultiPhraseQuery(); query.add(new Term[] { "first" }); query.add(new Term[] { "second1", "second2", ... }); The actual number of terms in this particular case is 207087. The size of the index itsel

Re: Avoid automaton Memory Usage

2013-08-13 Thread Michael McCandless
On Tue, Aug 13, 2013 at 9:44 AM, Anna Björk Nikulásdóttir wrote: > I created these 3 issues for the discussed items: Thanks! If you (or anyone!) want to work up a patch that would be great ... > Thanks a lot for your suggestions (pun intended) ;) ;) Mike McCandless http://blog.mikemccandless

Re: Avoid automaton Memory Usage

2013-08-13 Thread Anna Björk Nikulásdóttir
I created these 3 issues for the discussed items: On disk FST objects: https://issues.apache.org/jira/browse/LUCENE-5174 FuzzySuggester should boost terms with minimal Levenshtein Distance: https://issues.apache.org/jira/browse/LUCENE-5172 AnalyzingSuggester and FuzzySuggester should be able to

Re: WeakIdentityMap high memory usage

2013-08-10 Thread Robert Muir
On Thu, Aug 8, 2013 at 11:31 AM, Michael McCandless wrote: > A number of users have complained about the apparent RAM usage of > WeakIdentityMap, and it adds complexity to ByteBufferIndexInput to do > this tracking ... I think defaulting the unmap hack to off is best for > users of MMapDir. > Fo

Re: WeakIdentityMap high memory usage

2013-08-09 Thread Denis Bazhenov
Yes, definitely. Our typical setup is 16Gb physical RAM and -Xmx4G per node (index size is about 1-1.5Gb per node). So there is plenty of room for OS cache, I guess. I'll take a closer look at the number of major page faults, but at the moment iostat says that everything is pretty fine. On the

Re: Avoid automaton Memory Usage

2013-08-08 Thread Michael McCandless
On Thu, Aug 8, 2013 at 12:54 PM, Anna Björk Nikulásdóttir wrote: > > Am 8.8.2013 um 12:37 schrieb Michael McCandless : > >> >>> What would help in my case as I use the same FST for both analyzers, if the >>> same FST object could be shared among both analyzers. So what I am doing is >>> to use

Re: Avoid automaton Memory Usage

2013-08-08 Thread Anna Björk Nikulásdóttir
Am 8.8.2013 um 12:37 schrieb Michael McCandless : > >> What would help in my case as I use the same FST for both analyzers, if the >> same FST object could be shared among both analyzers. So what I am doing is >> to use AnalyzingSuggester.store() and use the stored file for >> AnalyzingSugges

Re: WeakIdentityMap high memory usage

2013-08-08 Thread Michael McCandless
de > > >> -Original Message- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Thursday, August 08, 2013 2:18 PM >> To: Lucene Users >> Subject: Re: WeakIdentityMap high memory usage >> >> Thanks for bringing closure. >>

RE: WeakIdentityMap high memory usage

2013-08-08 Thread Uwe Schindler
-Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Thursday, August 08, 2013 2:18 PM > To: Lucene Users > Subject: Re: WeakIdentityMap high memory usage > > Thanks for bringing closure. > > Note that you should still run a tight ship, ie don't

Re: Avoid automaton Memory Usage

2013-08-08 Thread Michael McCandless
On Wed, Aug 7, 2013 at 1:18 PM, Anna Björk Nikulásdóttir wrote: > Ah I see. I will look into the AnalyzingInfixSuggester. I suppose it could be > useful as an alternative rather to AnalyzingSuggester instead of > FuzzySuggestor ? Yes, but it's very different (it does no fuzzing, and it matches

Re: WeakIdentityMap high memory usage

2013-08-08 Thread Michael McCandless
? >> >> Uwe >> >> - >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >>> -Original Message- >>> From: Michael McCandless [mailto:luc...@mikemccandle

Re: WeakIdentityMap high memory usage

2013-08-07 Thread Denis Bazhenov
>> -Original Message----- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Wednesday, August 07, 2013 3:45 PM >> To: Lucene Users >> Subject: Re: WeakIdentityMap high memory usage >> >> This map is used to track all cloned open file

RE: WeakIdentityMap high memory usage

2013-08-07 Thread Uwe Schindler
August 07, 2013 3:45 PM > To: Lucene Users > Subject: Re: WeakIdentityMap high memory usage > > This map is used to track all cloned open files, which can be a very large > number over time (each search will create maybe 3 of them). > > This is done as a "best effort"

Re: Avoid automaton Memory Usage

2013-08-07 Thread Anna Björk Nikulásdóttir
Ah I see. I will look into the AnalyzingInfixSuggester. I suppose it could be useful as an alternative to AnalyzingSuggester rather than FuzzySuggester? What would help in my case, as I use the same FST for both analyzers, is if the same FST object could be shared among both analyzers. So wh
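The sharing idea can be modeled in plain Java: build the expensive read-only structure once and hand the same instance to both consumers, instead of each suggester deserializing its own copy. A TreeMap stands in for the FST here; the names are illustrative, not Lucene API:

```java
import java.util.Map;
import java.util.TreeMap;

public class SharedStructure {
    // Stand-in for an FST: an expensive-to-build, read-only weighted term map.
    static Map<String, Long> buildOnce() {
        Map<String, Long> m = new TreeMap<>();
        m.put("lucene", 10L);
        m.put("lucid", 5L);
        return m;
    }

    // Stand-in for a suggester that holds a reference to the structure.
    static final class Suggester {
        final Map<String, Long> fst;  // shared, never copied
        Suggester(Map<String, Long> fst) { this.fst = fst; }
    }

    public static void main(String[] args) {
        Map<String, Long> shared = buildOnce();
        Suggester analyzing = new Suggester(shared);
        Suggester fuzzy = new Suggester(shared);
        // Both suggesters point at the one in-heap structure: one copy of the RAM cost.
        System.out.println(analyzing.fst == fuzzy.fst);
    }
}
```

With two suggesters each loading the same half-million-term structure, sharing one instance roughly halves the dominant part of the 69 MB footprint described in the original post.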

Re: Avoid automaton Memory Usage

2013-08-07 Thread Michael McCandless
Unfortunately, the FST based suggesters currently must be HEAP resident. In theory this is fixable, e.g. if we could map the FST and then access it via DirectByteBuffer ... maybe open a Jira issue to explore this possibility? You could also try AnalyzingInfixSuggester; it uses a "normal" Lucene i

Avoid automaton Memory Usage

2013-08-07 Thread Anna Björk Nikulásdóttir
Hi, I am using Lucene 4.3 on Android for terms auto suggestions (>500.000). I am using both FuzzySuggester and AnalyzingSuggester, each for their specific strengths. Everything works great but my app consumes 69MB of RAM with most of that dedicated to the suggester classes. This is too much for

Re: WeakIdentityMap high memory usage

2013-08-07 Thread Michael McCandless
). Here is screenshot of the JProfiler output: > https://dl.dropboxusercontent.com/u/16254496/Screen%20Shot%202013-08-07%20at%205.35.22%20PM.png. > > The keys of the map are MMapIndexInput. What this map is for and how ca

WeakIdentityMap high memory usage

2013-08-07 Thread Denis Bazhenov
even retained size). Here is a screenshot of the JProfiler output: https://dl.dropboxusercontent.com/u/16254496/Screen%20Shot%202013-08-07%20at%205.35.22%20PM.png. The keys of the map are MMapIndexInput. What is this map for and how can I reduce its memory usage? --- Denis Bazhenov FarPost
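The behavior being profiled comes down to weak-keyed maps: entries stay visible until the collector actually runs, so a profiler can show many entries that are already reclaimable. A plain-Java model using WeakHashMap (Lucene's WeakIdentityMap behaves similarly but compares keys by identity):

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeys {
    // Waits (bounded) for the collector to evict weak-only keys.
    static boolean drained(Map<?, ?> map) throws InterruptedException {
        for (int i = 0; i < 50 && !map.isEmpty(); i++) {
            System.gc();     // a request, not a guarantee
            Thread.sleep(10);
        }
        return map.isEmpty();
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> clones = new WeakHashMap<>();
        Object key = new Object();           // stands in for an MMapIndexInput clone
        clones.put(key, "clone state");
        System.out.println(clones.size());   // 1: entry pinned by the strong reference

        key = null;                          // drop the last strong reference
        System.out.println(drained(clones)); // usually true once GC has noticed
    }
}
```

So the map's apparent size in a heap dump overstates its real retained cost: most entries vanish as soon as a collection runs.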

Re: DocValues memory usage

2013-03-28 Thread Peter Keegan
This is weird. I indexed using DiskDocValuesFormat as the default codec and observed 16K qps with BinaryDocValuesField. But with a simple StoredField, I observed a much higher 30K qps. When I added both fields (BinaryDocValuesField and StoredField) to the index, I observed only 100 qps on each fiel

Re: high memory usage by indexreader

2013-03-27 Thread ash nix
Hi Ian, Thanks once again. I think the problem is mainly due to index being on NFS. Currently, there are other process running on the server with heavy disk IO on NFS. This has resulted in sharing of disk IO. Thanks, Ashwin On Fri, Mar 22, 2013 at 5:45 AM, Ian Lea wrote: > I did ask if there wa

Re: DocValues memory usage

2013-03-26 Thread Michael McCandless
DiskDocValuesFormat is the right thing to use: it loads certain things into RAM, eg the compressed bits that tell it the addresses of the bytes on disk, but then leaves the actual bytes on disk. I believe the old DirectSource was more extreme as it left the addresses on disk too, so there were 2 s

Re: DocValues memory usage

2013-03-26 Thread Duke
I made the same experiment and got same result. Then I used per-field codec with DiskDocValuesFormat, it works like DirectSource in 4.0.0, but I'm not feeling confident with this usage. Anyone can say more about removing DirectSource API? On 2013-3-26, at 22:59, Peter Keegan wrote: > Inspir

DocValues memory usage

2013-03-26 Thread Peter Keegan
Inspired by this presentation of DocValues: http://www.slideshare.net/lucenerevolution/willnauer-simon-doc-values-column-stride-fields-in-lucene I decided to try them out in 4.2. I created a 1M document index with one DocValues field: BinaryDocValuesField conceptsDV = new BinaryDocValuesField("con

Re: high memory usage by indexreader

2013-03-22 Thread Ian Lea
I did ask if there was anything else relevant you'd forgotten to mention ... How fast are general file operations on the NFS files? Your times are still extremely long and my guess is that your network/NFS setup are to blame. Can you run your code on the server that is exporting the index, if on

Re: high memory usage by indexreader

2013-03-21 Thread ash nix
Hi Ian, Thanks for your reply. The index is on NFS and there is no storage local/near to machine. Operating system is CentOS 6.3 with linux 2.6. It has 16 Gigs of memory. By initializing the Indexreader, I mean opening the IndexReader. I timed my operations using System.currentTimeMillis and exe

Re: high memory usage by indexreader

2013-03-21 Thread Ian Lea
That number of docs is far more than I've ever worked with but I'm still surprised it takes 4 minutes to initialize an index reader. What exactly do you mean by initialization? Show us the code that takes 4 minutes. What version of lucene? What OS? What disks? -- Ian. On Wed, Mar 20, 2013

Re: high memory usage by indexreader

2013-03-20 Thread ash nix
Thanks Ian. Number of documents in index is 381,153,828. The data set size is 1.9TB. The index size of this dataset is 290G. It is single index. The following are the fields indexed for each of the document. 1. Document id : It is StoredField and is generally around 128 chars or more. 2. Text fie

Re: high memory usage by indexreader

2013-03-20 Thread Ian Lea
Searching doesn't usually use that much memory, even on large indexes. What version of lucene are you on? How many docs in the index? What does a slow query look like (q.toString()) and what search method are you calling? Anything else relevant you forgot to tell us? Or google "lucene shardin

high memory usage by indexreader

2013-03-20 Thread ash nix
Hi Everybody, I have created a single compound index which is 250 GB in size. I open a single index reader to search simple boolean queries. The process is consuming a lot of memory and search is painfully slow. It seems that I will have to create multiple indexes and have multiple index readers. Can an

Re: Limiting IndexWriters memory usage?

2012-05-02 Thread Maxim Terletsky
I suggest using the setRAMBufferSizeMB method of IndexWriter (or setMaxBufferedDocs, they are interchangeable).   Maxim From: Clemens Wyss To: "java-user@lucene.apache.org" Sent: Wednesday, May 2, 2012 4:16 PM Subject: Limiting IndexWriters memory u

Limiting IndexWriters memory usage?

2012-05-02 Thread Clemens Wyss
Is there a way to limit IndexWriter's memory usage? While indexing many, many documents my IndexWriter occupies > 30MB in memory. Is there a way to limit this "usage"? Thx Clemens

RE: Best practices for searcher memory usage?

2010-07-16 Thread Toke Eskildsen
On Thu, 2010-07-15 at 20:53 +0200, Christopher Condit wrote: [Toke: 140GB single segment is huge] > Sorry - I wasn't clear here. The total index size ends up being 140GB > but to try to help improve performance we build 50 separate indexes > (which end up being a bit under 3gb each) and then ope

RE: Best practices for searcher memory usage?

2010-07-15 Thread Christopher Condit
> [Toke: No frequent updates] > > So everything is rebuild from scratch each time? Or do you mean that you're > only adding new documents, not changing old ones? Everything is reindexed from scratch - indexing speed is not essential to us... > Either way, optimizing to a single 140GB segment is

RE: Best practices for searcher memory usage?

2010-07-15 Thread Toke Eskildsen
ddrives. Together with the unexpected high memory requirement my guess is that there's something going on with your terms. If you try opening the index with luke, it'll tell you the number of terms. If that is very high for the fields you search on, this would explain the memory usage. Yo

Re: Best practices for searcher memory usage?

2010-07-14 Thread Lance Norskog
Glen, thank you for this very thorough and informative post. Lance Norskog

Re: Best practices for searcher memory usage?

2010-07-14 Thread Glen Newton
There are a number of strategies, on the Java or OS side of things: - Use huge pages[1]. Esp on 64 bit and lots of ram. For long running, large memory (and GC busy) applications, this has achieved significant improvements. Like 300% on EJBs. See [2],[3],[4]. For a great article introducing and benc
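Glen's huge-pages suggestion boils down to OS-level plus JVM-level configuration. A sketch of what that involves on Linux; the page count and heap sizes below are placeholders to adapt, not recommendations:

```
# Reserve 2 GB of 2 MB huge pages at the OS level (Linux)
sysctl -w vm.nr_hugepages=1024

# Ask HotSpot to back the heap with them; pre-size the heap so it is
# allocated from the huge-page pool up front
java -Xms4g -Xmx4g -XX:+UseLargePages -jar app.jar
```

If the huge-page pool is too small, the JVM silently falls back to regular pages, so it is worth checking /proc/meminfo (HugePages_Free) after startup.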

RE: Best practices for searcher memory usage?

2010-07-14 Thread Christopher Condit
Hi Toke- > > * 20 million documents [...] > > * 140GB total index size > > * Optimized into a single segment > > I take it that you do not have frequent updates? Have you tried to see if you > can get by with more segments without significant slowdown? Correct - in fact there are no updates and n

Re: Best practices for searcher memory usage?

2010-07-14 Thread Michael McCandless
You can also set the termsIndexDivisor when opening the IndexReader. The terms index is an in-memory data structure and it can consume a LOT of RAM when your index has many unique terms. Flex (only on Lucene's trunk / next major release (4.0)) has reduced this RAM usage (as well as the RAM required

Re: Best practices for searcher memory usage?

2010-07-14 Thread Toke Eskildsen
On Tue, 2010-07-13 at 23:49 +0200, Christopher Condit wrote: > * 20 million documents [...] > * 140GB total index size > * Optimized into a single segment I take it that you do not have frequent updates? Have you tried to see if you can get by with more segments without significant slowdown? > Th

Re: Best practices for searcher memory usage?

2010-07-13 Thread Paul Libbrecht
Le 13-juil.-10 à 23:49, Christopher Condit a écrit : * are there performance optimizations that I haven't thought of? The first and most important one I'd think of is get rid of NFS. You can happily do a local copy which might, even for 10 Gb take less than 30 seconds at server start. pa

Best practices for searcher memory usage?

2010-07-13 Thread Christopher Condit
My next step was to deploy the shards as RemoteSearchables for the same ParallelMultiSearcher over RMI - but before I do that I'm curious: * are there other ways to get that memory usage down? * are there performance optimizations that I haven't thought

Re: IndexWriter and memory usage

2010-05-19 Thread Michael McCandless
Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Friday, May 14, 2010 11:23 AM > To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory usage > > The patch looks correct. > > The 16 MB RAM buffer means the sum of the shared char[], byte[] and > Post

RE: IndexWriter and memory usage

2010-05-19 Thread Woolf, Ross
memory during our large document indexing runs. Thanks for your help Michael in resolving this. -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, May 14, 2010 11:23 AM To: java-user@lucene.apache.org Subject: Re: IndexWriter and memory usage The

Re: IndexWriter and memory usage

2010-05-14 Thread Michael McCandless
> of memory is used up in objects of this sort.  We only know that byte[] total > for all is at 197891887. > > However, I have provided another image that breaks down the memory usage from > the heap.   A big question we have is that we talk about the 16 mb buffer, > but is ther

Re: IndexWriter and memory usage

2010-05-10 Thread Michael McCandless
Hmm... Your usage (searching for old doc & updating it, to add new fields) is fine. But: what memory usage do you see if you open a searcher, and search for all docs, but don't open an IndexWriter? We need to tease apart the IndexReader vs IndexWriter memory usage you are seeing.

RE: IndexWriter and memory usage

2010-05-06 Thread Woolf, Ross
y but at 5 threads it gets into trouble. We still see a trend up in memory usage, but not as severe as when we use the multiple threads. http://tinypic.com/view.php?pic=2w6bf68&s=5 There is another piece of the picture that I think might be coming into play. We have plugged Lucene into a lega

Re: IndexWriter and memory usage

2010-04-29 Thread Michael McCandless
ing this issue. > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Tuesday, April 27, 2010 4:40 AM > To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory usage > > Oooh -- I suspect you are hitting this issue: > &

Re: IndexWriter and memory usage

2010-04-27 Thread Michael McCandless
:28 PM, Woolf, Ross wrote: >> How do I get to the 2.9.x branch?  Every link I take from the Lucene site >> takes me to the trunk which I assume is the 3.x version.  I've tried to look >> around svn but can't find anything labeled 2.9.x.  Is there a daily build of &

RE: IndexWriter and memory usage

2010-04-26 Thread Woolf, Ross
s things down, so we don't want to run like this, but we wanted to test the behavior if we did so). Thanks, Ross -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, April 14, 2010 2:52 PM To: java-user@lucene.apache.org Subject:

Re: IndexWriter and memory usage

2010-04-14 Thread Michael McCandless
the fix you > put into it, but I'm not sure where I get it from. > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wednesday, April 14, 2010 4:12 AM > To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory

RE: IndexWriter and memory usage

2010-04-14 Thread Woolf, Ross
ry out the fix you put into it, but I'm not sure where I get it from. -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, April 14, 2010 4:12 AM To: java-user@lucene.apache.org Subject: Re: IndexWriter and memory usage It looks like t

Re: IndexWriter and memory usage

2010-04-14 Thread Michael McCandless
sage- > From: Woolf, Ross [mailto:ross_wo...@bmc.com] > Sent: Tuesday, April 13, 2010 1:29 PM > To: java-user@lucene.apache.org > Subject: RE: IndexWriter and memory usage > > Are these fixes in 2.9x branch?  We are using 2.9x and can't move to 3x just > yet.  If so,

RE: IndexWriter and memory usage

2010-04-13 Thread Woolf, Ross
apache.org Subject: RE: IndexWriter and memory usage Are these fixes in 2.9x branch? We are using 2.9x and can't move to 3x just yet. If so, where do I specifically pick this up from? -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Monday, April 12, 2010 10

RE: IndexWriter and memory usage

2010-04-13 Thread Woolf, Ross
t: Re: IndexWriter and memory usage There is some bugs where the writer data structures retain data after it is flushed. They are committed as of maybe the past week. If you can pull the trunk and try it with your use case, that would be great. On Mon, Apr 12, 2010 at 8:54 AM, Woolf, Ross wrote: >

Re: IndexWriter and memory usage

2010-04-13 Thread Michael McCandless
attachment).  I'll see what I can do about reducing the >> heap dump (It was supplied by a colleague). >> >> >> -Original Message- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Saturday, April 03, 2010 3:39 AM >> To

Re: IndexWriter and memory usage

2010-04-13 Thread Michael McCandless
supplied by a colleague). > > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Saturday, April 03, 2010 3:39 AM > To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory usage > > Hmm why is the heap dump so immense?

Re: IndexWriter and memory usage

2010-04-12 Thread Lance Norskog
urday, April 03, 2010 3:39 AM > To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory usage > > Hmm why is the heap dump so immense?  Normally it contains the top N > (eg 100) object types and their count/aggregate RAM usage. > > Can you attach the infoStream

Re: IndexWriter and memory usage

2010-04-03 Thread Michael McCandless
To: java-user@lucene.apache.org > Subject: Re: IndexWriter and memory usage > > Hmm, not good.  Can you post a heap dump?  Also, can you turn on > infoStream, index up to the OOM @ 512 MB, and post the output? > > IndexWriter should not hang onto much beyond the RAM buffer.

RE: IndexWriter and memory usage

2010-04-02 Thread Woolf, Ross
5:21 PM To: java-user@lucene.apache.org Subject: Re: IndexWriter and memory usage Hmm, not good. Can you post a heap dump? Also, can you turn on infoStream, index up to the OOM @ 512 MB, and post the output? IndexWriter should not hang onto much beyond the RAM buffer. But, it does allocate and

Re: IndexWriter and memory usage

2010-04-01 Thread Michael McCandless
e (basically in an idle state).  Why would this > be?  Is the only way to totally clean up the memory is to close the writer?   > Our index is also used for real time indexing so the IndexWriter is intended > to remain open for the lifetime of the app. > > Any help in understand

IndexWriter and memory usage

2010-04-01 Thread Woolf, Ross
so the IndexWriter is intended to remain open for the lifetime of the app. Any help in understanding why the IndexWriter is maxing out our heap space or what is expected from memory usage of the IndexWriter would be appreciated.

Re: Sort memory usage

2010-02-03 Thread Jake Mannix
On Wed, Feb 3, 2010 at 1:33 PM, tsuraan wrote: > > The FieldCache loads per segment, and the NRT reader is reloading only > > new segments from disk, so yes, it's "smarter" about this caching in this > > case. > > Ok, so the cache is tied to the index, and not to any particular > reader. The act

Re: Sort memory usage

2010-02-03 Thread tsuraan
> The FieldCache loads per segment, and the NRT reader is reloading only > new segments from disk, so yes, it's "smarter" about this caching in this > case. Ok, so the cache is tied to the index, and not to any particular reader. The actual FieldCacheImpl keeps a mapping from Reader to its terms,

Re: Sort memory usage

2010-02-03 Thread Jake Mannix
The FieldCache loads per segment, and the NRT reader is reloading only new segments from disk, so yes, it's "smarter" about this caching in this case. -jake On Wed, Feb 3, 2010 at 1:07 PM, tsuraan wrote: > Is the cache used by sorting on strings separated by reader, or is it > a global thing?

Sort memory usage

2010-02-03 Thread tsuraan
Is the cache used by sorting on strings separated by reader, or is it a global thing? I'm trying to use the near-realtime search, and I have a few indices with a million docs apiece. If I'm opening a new reader every minute, am I going to have every term in every sort field read into RAM for each

Re: Lucene memory usage

2009-12-25 Thread tsuraan
> Have you tried setting the termInfosIndexDivisor when opening the > IndexReader? EG a setting of 2 would load every 256th term (instead > of every 128th term) into RAM, halving RAM usage, with the downside > being that looking up a term will generally take longer since it'll > require more scann
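The divisor arithmetic in this reply is easy to sanity-check: with the default index interval of 128, a divisor of d means one in-RAM entry per 128*d terms. The 100M unique-term dictionary below is a hypothetical figure for illustration:

```java
public class TermsIndexMath {
    // Number of index terms held in RAM for a dictionary of `totalTerms`
    // when every (interval * divisor)-th term is loaded.
    static long loaded(long totalTerms, int interval, int divisor) {
        return totalTerms / ((long) interval * divisor);
    }

    public static void main(String[] args) {
        long total = 100_000_000L;                 // hypothetical unique-term count
        System.out.println(loaded(total, 128, 1)); // default: every 128th term -> 781250
        System.out.println(loaded(total, 128, 2)); // divisor 2: half as many -> 390625
    }
}
```

The trade-off is exactly as described in the thread: each halving of RAM doubles the span that must be scanned sequentially to locate a term.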

Re: Lucene memory usage

2009-12-25 Thread Michael McCandless
Sorry, LUCENE-1458 is "continuing" under LUCENE-2111 (ie, flexible indexing is not yet committed). I've just added a comment to LUCENE-1458 to that effect. Lucene, even with flexible indexing, loads the terms index entirely into RAM (it's just that the terms index in flexible indexing has less RA

Re: Lucene memory usage

2009-12-23 Thread tsuraan
> This (very large number of unique terms) is a problem for Lucene currently. > > There are some simple improvements we could make to the terms dict > format to not require so much RAM per term in the terms index... > LUCENE-1458 (flexible indexing) has these improvements, but > unfortunately tied

Re: updateDocument and high Memory Usage

2009-06-24 Thread Michael McCandless
Likely this is because under the hood when IndexWriter flushes your deletes, it opens readers. It closes the readers as soon as the deletes are done, thus creating a fair amount of garbage (which looks like memory used by the JVM). How are you measuring the memory usage? Likely it's m

updateDocument and high Memory Usage

2009-06-24 Thread Kris Leite
updating existing documents maxes out the JVM memory allocation. Is there a configuration option that can be used to adjust the memory usage? I have tried doing separate delete/add code and get similar results so it appears to be more of a Lucene delete document issue? Any help would be

Re: Lucene memory usage

2009-06-11 Thread Jaison Sunny
how to run lucene on eclipse

Re: Lucene memory usage

2009-06-10 Thread Jason Rutherglen
hat slower > >> searching. > >> > >> Also: you should peek at your index, eg using Luke, to understand why > >> you have so many terms. It could be legitimate (indexing a massive > >> catalog with eg part numbers), or, it could be your document filtering &

Re: Lucene memory usage

2009-06-10 Thread Michael McCandless
>> catalog with eg part numbers), or, it could be your document filtering >> / analyzer are accidentally producing garbage terms. >> >> Mike >> >> On Wed, Jun 10, 2009 at 8:23 AM, Benedikt Boss wrote: >> > Hej hej, >> > >> > i

Re: Lucene memory usage

2009-06-10 Thread Jason Rutherglen
talog with eg part numbers), or, it could be your document filtering > / analyzer are accidentally producing garbage terms. > > Mike > > On Wed, Jun 10, 2009 at 8:23 AM, Benedikt Boss wrote: > > Hej hej, > > > > i have a question regarding lucenes memory usage >

Re: Lucene memory usage

2009-06-10 Thread Michael McCandless
TopScoreDocCollector collector = new TopScoreDocCollector(10); > > if we do: > >> TopScoreDocCollector collector = new TopScoreDocCollector(2); > > instead (only see top two documents), could memory usage be less? > > Best regards, Lisheng > > -Original

RE: Lucene memory usage

2009-06-10 Thread Zhang, Lisheng
Hi, Does this issue have anything to do with the line: > TopScoreDocCollector collector = new TopScoreDocCollector(10); if we do: > TopScoreDocCollector collector = new TopScoreDocCollector(2); instead (only see top two documents), could memory usage be less? Best regards, L

Re: Lucene memory usage

2009-06-10 Thread Michael McCandless
/ analyzer are accidentally producing garbage terms. Mike On Wed, Jun 10, 2009 at 8:23 AM, Benedikt Boss wrote: > Hej hej, > > i have a question regarding lucenes memory usage > when launching a query. When i execute my query > lucene eats up over 1gig of heap-memory even > when m

Lucene memory usage

2009-06-10 Thread Benedikt Boss
Hej hej, I have a question regarding Lucene's memory usage when launching a query. When I execute my query, Lucene eats up over 1 GB of heap memory even when my result set is only a single hit. I found out that this is due to the "ensureIndexIsRead()" method call in the "TermInf

Re: Memory Usage

2008-07-03 Thread Keith Watson
Thanks very much for this; I'll give it a shot. Keith. On 4 Jul 2008, at 00:02, Paul Smith wrote: (there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I g

Re: Memory Usage

2008-07-03 Thread Paul Smith
(there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I guess I would have understood if I was seeing the usage double for sure, or even a little more; no idea
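The blow-up follows from term cardinality: sort caches and the terms index grow with the number of unique terms, and second resolution mints a new term for nearly every post, where day resolution yields a handful. A plain-Java check (the post count and spacing below are made up for illustration):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;

public class DateTermCardinality {
    // Counts distinct indexed terms for `events` timestamps spaced
    // `secondsApart` seconds apart, encoded with the given pattern.
    static int distinctTerms(String pattern, int events, int secondsApart) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern(pattern);
        Set<String> terms = new HashSet<>();
        LocalDateTime t = LocalDateTime.of(2008, 7, 3, 0, 0, 0);
        for (int i = 0; i < events; i++) {
            terms.add(t.format(f));
            t = t.plusSeconds(secondsApart);
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // 10,000 posts spaced one minute apart: just under 7 days of data.
        System.out.println(distinctTerms("yyMMdd", 10_000, 60));       // 7 distinct days
        System.out.println(distinctTerms("yyMMddHHmmss", 10_000, 60)); // 10,000 distinct terms
    }
}
```

Scaled to the 6,000,000 posts in the thread, day encoding stays at a few thousand unique terms while second encoding approaches one term per post, which is consistent with the 30 MB vs 400 MB observation.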

Memory Usage

2008-07-03 Thread Keith Watson
the second, rather than just the day. I didn't pay much attention to memory usage until I started getting out of heap space errors... When I looked into the usage I found: (there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using arou

Re: lucene memory usage, caching...

2006-06-01 Thread Yonik Seeley
On 5/31/06, Heng Mei <[EMAIL PROTECTED]> wrote: Does Lucene do any caching of Document fields during a search? No... but it wouldn't be too difficult to make your own cache, or use Solr which does have a Document cache (among other types of caches). -Yonik http://incubator.apache.org/solr Solr

lucene memory usage, caching...

2006-05-31 Thread Heng Mei
retrieving a large text field from each Document, but the memory usage by the Lucene java process seems very low compared to the total size of all those fields. Just curious if there are any config params or other way to tweak Lucene's caching strategies...

RE: Storing large text or binary source documents in the index and memory usage

2006-01-21 Thread George Washington
thank you Daniel, but the best I get from MaxBufferedDocs(1) is an OOM error after trying 5 iterations of 10MB each in the JUnit test provided by Chris, running inside Eclipse 3.1. I had already tried MaxBufferedDocs(2) with no success before I posted the original post. I also tried: write

RE: Storing large text or binary source documents in the index and memory usage

2006-01-21 Thread George Washington
ext or binary source documents in the index and memory usage Date: Fri, 20 Jan 2006 18:35:41 -0800 (PST) : otherwise I would have done so already. My real question is question number : one, which did not receive a reply, is there a formula that can tell me if : what is happening is reasonable and

RE: Storing large text or binary source documents in the index and memory usage

2006-01-20 Thread Chris Hostetter
: otherwise I would have done so already. My real question is question number : one, which did not receive a reply, is there a formula that can tell me if : what is happening is reasonable and to be expected, or am I doing something I've never played with the binary fields much, nor have i ever t

RE: Storing large text or binary source documents in the index and memory usage

2006-01-20 Thread George Washington
answer to question one is that there is no other alternative. Cheers From: "George Washington" <[EMAIL PROTECTED]> Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Storing large text or binary source documents in the index and memory usage Date: F

RE: Storing large text or binary source documents in the index and memory usage

2006-01-20 Thread John Powers
@lucene.apache.org Subject: Storing large text or binary source documents in the index and memory usage I would like to store large source documents (>10MB) in the index in their original form, i.e. as text for text documents or as byte[] for binary documents. I have no difficulty adding the sou

Storing large text or binary source documents in the index and memory usage

2006-01-19 Thread George Washington
I would like to store large source documents (>10MB) in the index in their original form, i.e. as text for text documents or as byte[] for binary documents. I have no difficulty adding the source document as a field to the Lucene index document, but when I write the index document to the index I
