hmmm,

Well, in production (1024M heap) it seems that after a while (a few hundred user queries) memory starts approaching the max threshold, and at some point the application becomes unresponsive. I'd rather it was slightly less performant (cleaning up memory more frequently) than freezing up when "many" users (not really that many...) are doing searches concurrently and the memory threshold is reached.

In dev I'm not able to recreate this behaviour, but by reducing available memory to, say, 384M, things like OOMEs start to show up when running a number of concurrent users.

...as mentioned earlier, though, it might not just be Lucene; it could be Lucene in combination with a lack of db connections or other things that actually causes the freeze-up in production. I still think it's suspicious to grab all available resources and only deal with the problem (with gc) when the maximum is reached.

I'd love to be able to tell Lucene: hey, you have 300MB for your cache... deal with it as best you can. Somewhat similar to db connections, where you configure a pool and that is what the application has available, rather than creating connections until the db server starts hiccuping and then releasing a few.

Anyway, I guess we have a lot of tuning potential in the way we index (and what we index), and in how we search as well.

magnus




On 4 Dec 2008, at 09:47, Khawaja Shams wrote:

Magnus, please feel free to ignore my last email; I see that you had this setup earlier.

As for using up all the memory it can get its hands on: this is actually a good thing. It allows Lucene and other Java applications to keep more things in cache when more memory is available. Also, if you throw more memory at the program, the GC will spend little effort cleaning up until it is necessary. By setting -Xmx to 1536M you are effectively telling the JVM that you have this much memory available for the Java program, so there is no reason for the GC to waste resources while the program is using 1300M of RAM. I would not worry until you start seeing OOMEs even after you have allocated the 1536M.
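Incidentally, if you want to see from inside the app what JConsole is showing you, a quick sanity check is something like the snippet below (used vs. committed vs. max heap). This is just standard java.lang.Runtime, nothing Lucene-specific:

public class HeapLog {
    // Logs the same numbers JConsole shows: used vs. committed vs. max (-Xmx) heap.
    public static void log(String label) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();   // live objects plus uncollected garbage
        System.out.println(label
                + ": used=" + (used >> 20) + "M"
                + ", committed=" + (rt.totalMemory() >> 20) + "M"
                + ", max=" + (rt.maxMemory() >> 20) + "M");
    }
}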


Is the program still responsive even after it hits the "peak" memory utilization? You said that the GC request was not being honored, but it is unclear whether queries are still returning at that point. Lastly, I highly recommend against ever making explicit GC requests.



Regards,
Khawaja Shams

On Thu, Dec 4, 2008 at 12:30 AM, Khawaja Shams <[EMAIL PROTECTED]> wrote:

Magnus, if you get a chance, can you try setting different -Xms and -Xmx values? For instance, try -Xms384M and -Xmx1024M.


The "forced" GC [request] will almost always reduce the memory footprint simply because of the weak references that lucene leverages, but I bet subsequent queries are not as fast and you basically need to warm up your
server after the GC (which would boost up the footprint again :) ).
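By "warm up" I mean nothing more than re-running a handful of representative queries once, so the field caches used for sorting (and the OS file cache) get re-populated. A rough sketch; the field name and sample terms are just placeholders, not from your app:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Sort;

public class Warmup {
    // Re-runs a few typical searches so the FieldCache used for sorting
    // (and the OS file cache) are primed before real traffic arrives.
    public static void warm(IndexSearcher searcher) throws Exception {
        QueryParser parser = new QueryParser("content", new StandardAnalyzer());
        Sort byTitle = new Sort("title");
        String[] samples = { "er", "og" };   // terms mentioned earlier in the thread
        for (String q : samples) {
            searcher.search(parser.parse(q), null, 10, byTitle);
        }
    }
}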



Regards,
Khawaja


On Wed, Dec 3, 2008 at 10:27 PM, Magnus Rundberget <[EMAIL PROTECTED]> wrote:

Well...

after various tests I downgraded to Lucene 1.9.1 to see if that had any effect... doesn't seem that way.

I have set up a JMeter test with 5 concurrent users doing a search (a silly search for a two-letter word) every 3 seconds (with a random offset of +/-500ms).

- With 512MB -Xms/-Xmx, memory usage settles between 400 and 500MB after a few iterations, but no OOME. At the end of the run memory usually settles somewhere between 200 and 300MB (it really depends), but no cleanup occurs for minutes... unless I do a forced GC.

- Did the same run with 384MB and hit 2 OOMEs.

- Did the same run with 256MB and hit 5 or 6 OOMEs.

Tried running Tomcat with JDK 1.6 and the -server option as well, but that didn't seem to help at all either.

Then finally I ran the test scenario above with 1536MB -Xms/-Xmx... and guess what: it used it all pretty quickly. It used between 1000 and 1400/1500MB for most of the run. At the end of the run memory usage settled at about 750MB... until I did a forced gc.
This does bother me. If the solution had been to throw more memory at the problem I could live with that, but it just seems to consume all the memory it can get its hands on :-(
Is there any way to limit the memory usage in Lucene (configuration)?

I'm obviously not sure if Lucene is the culprit, as I'm using Spring (2.5), Hibernate (3.3), open-session-in-view and lots of other stuff in my app. So I guess my next step is to create a very limited web app with just a servlet calling the Lucene API, and then do some profiling on that.
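Something along these lines is what I have in mind for the stripped-down test app (the index path is made up and error handling is stripped; just a single shared searcher and nothing else):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

// Bare-bones search servlet: no Spring, no Hibernate, one shared searcher,
// so any memory growth under load should be attributable to Lucene alone.
public class SearchServlet extends HttpServlet {

    private IndexSearcher searcher;   // opened once in init(), reused by every request

    public void init() throws ServletException {
        try {
            searcher = new IndexSearcher("/path/to/index");   // made-up path
        } catch (IOException e) {
            throw new ServletException(e);
        }
    }

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        try {
            Query query = new QueryParser("content", new StandardAnalyzer())
                    .parse(req.getParameter("q"));
            TopDocs top = searcher.search(query, null, 50);   // only ever touch the top 50 hits
            resp.getWriter().println(top.totalHits + " hits");
        } catch (Exception e) {
            throw new IOException(e.toString());
        }
    }

    public void destroy() {
        try { searcher.close(); } catch (IOException ignored) { }
    }
}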

cheers
Magnus










On 3 Dec 2008, at 14:45, Michael McCandless wrote:


Are you actually hitting OOME?

Or, you're watching heap usage and it bothers you that the GC is taking a long time (allowing too much garbage to use up heap space) before sweeping?

One thing to try (only for testing) might be a lower and lower -Xmx until you do hit OOME; then you'll know the "real" memory usage of the app.

Mike

Magnus Rundberget wrote:

Sure,

Tried with the following:
Java version: build 1.5.0_16-b06-284 (dev), 1.5.0_12 (production)
OS: Mac OS X Leopard (dev) and Windows XP (dev), Windows 2003 (production)
Container: Jetty 6.1 and Tomcat 5.5 (the latter is used both in dev and production)


Current JVM options:
-Xms512m -Xmx1024M -XX:MaxPermSize=256m
...tried a few GC settings as well, but nothing has helped (some rather slowed things down).

Production hardware runs 2 dual-core Xeon processors.

In production our memory reaches the 1024M limit after a while (a few hours), and at some point it stops responding to forced GC (via JConsole).

I need to dig quite a bit more to figure out the exact production settings, but it is safe to say the memory usage pattern can be recreated on different hardware configs, with different OSes, different 1.5 JVMs and different containers (Jetty and Tomcat).



cheers
Magnus



On 3 Dec 2008, at 13:10, Glen Newton wrote:

Hi Magnus,

Could you post the OS, version, RAM size, swap size, Java VM version, hardware, #cores, VM command line parameters, etc.? This can be very relevant.

Have you tried other garbage collectors and/or tuning as described in
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html?

2008/12/3 Magnus Rundberget <[EMAIL PROTECTED]>:

Hi,

We have an application using Tomcat, Spring etc. and Lucene 2.4.0. Our index is about 100MB (in test) and has about 20 indexed fields.

Performance is pretty good, but we are experiencing very high memory usage when searching.

Looking at JConsole during a somewhat silly scenario (but it illustrates the problem), with a 512MB min heap and 1024MB max:

0. Initially memory usage is about 70MB
1. Search for word "er", heap memory usage goes up by 100-150MB
1.1 Wait for 30 seconds... memory usage stays the same (i.e. no gc)
2. Search for the word "og", heap memory usage goes up another 50-100MB
2.1 See 1.1

...and so on until it seems to reach the 512MB limit, and then a garbage collection is performed, i.e. garbage collection doesn't seem to occur until it "hits the roof".

We believe the scenario is similar in production, where our heap space is limited to 1.5GB.


Our search is basically as follows (a rough sketch in code follows the list)
----------------------------------------------
1. Open an IndexSearcher
2. Build a BooleanQuery searching across 4 fields (title, summary, content and a date-range string, YYYYMMDD)
2.1 Sort on title
3. Perform the search
4. Iterate over the hits to build a set of custom result objects (pretty small, as we don't include content in these)
5. Close the searcher
6. Return the result objects.
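In (made-up, simplified) code the flow is roughly this; the field names are as above, the index path is invented and error handling is left out:

import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopFieldDocs;

public class SearchFlow {
    public static void search(String word) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index");      // 1. open searcher
        BooleanQuery query = new BooleanQuery();                           // 2. query across 4 fields
        for (String field : new String[] { "title", "summary", "content", "daterange" }) {
            query.add(new TermQuery(new Term(field, word)), BooleanClause.Occur.SHOULD);
        }
        Sort byTitle = new Sort("title");                                  // 2.1 sort on title
        TopFieldDocs hits = searcher.search(query, null, 100, byTitle);    // 3. perform search
        for (ScoreDoc sd : hits.scoreDocs) {                               // 4. build small result objects
            Document doc = searcher.doc(sd.doc);
            // ... copy title/summary (not content) into a custom result object ...
        }
        searcher.close();                                                  // 5. close searcher
    }                                                                      // 6. return result objects
}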


You should not close the searcher: it can be shared by all queries. What happens when you warm Lucene with a (large) number of queries: do things stabilize over time?
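For example (just a sketch, class and path names invented): keep a single reader/searcher around for all queries and only swap it when the index has actually changed; 2.4 has IndexReader.reopen() for exactly this.

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

// One reader/searcher shared by all queries; reopened only when the index changes.
public class SearcherHolder {

    private IndexReader reader;
    private IndexSearcher searcher;

    public SearcherHolder(String indexPath) throws IOException {
        reader = IndexReader.open(indexPath);
        searcher = new IndexSearcher(reader);
    }

    public synchronized IndexSearcher getSearcher() {
        return searcher;
    }

    // Call after index updates. In real code you would wait until in-flight
    // searches on the old reader have finished before closing it.
    public synchronized void maybeReopen() throws IOException {
        IndexReader newReader = reader.reopen();   // returns the same instance if nothing changed
        if (newReader != reader) {
            reader.close();
            reader = newReader;
            searcher = new IndexSearcher(reader);
        }
    }
}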

A 100MB index is (relatively) very small for Lucene (I have indexes >100GB). What kind of response times are you getting, independent of memory usage?

-glen


We have tried various options based on entries on this mailing list:
a) Cache the IndexSearcher - same result
b) Remove sorting - same result
c) In point 4, only iterate over a limited number of hits rather than the whole collection - same result in terms of memory usage, but obviously increased performance
d) Use RAMDirectory instead of FSDirectory (see the small snippet below) - same result, only the initial heap usage is higher with RAMDirectory (in conjunction with a cached IndexSearcher)
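For reference, loading the index into RAM for option (d) looks roughly like this (a sketch; the path is made up):

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class RamIndex {
    // Copies the on-disk index into the heap once at startup; the heap cost is
    // roughly the index size (~100MB here), paid up front instead of per query.
    public static IndexSearcher openInRam(String path) throws Exception {
        Directory disk = FSDirectory.getDirectory(path);
        Directory ram = new RAMDirectory(disk);   // RAMDirectory(Directory) copies all the index files
        disk.close();
        return new IndexSearcher(ram);
    }
}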


Doing some profiling with YourKit shows a huge number of char[], int[] and String[] instances, and an ever-increasing number of Lucene-related objects.



Reading through the mailing lists, our suspicion is that the problem is related to ThreadLocals and memory not being released. I noticed that there was a related patch for this in 2.4.0, but it doesn't seem to help us much.

Any ideas?

kind regards
Magnus
