You do not want to be using Hits. Frankly, the way pagination should
normally be done, Hits caching means almost nobody really wants to be
using Hits, but in your case it's even worse. Look into a Hit collector
-- not only are you caching every single document in your search
results, but you are re-querying many times for a single query by
running through all of the result in Hits! I don't have the time at the
moment to explain further (I am sure someone else will) but you need to
drop Hits like a bad habit and look into HitCollector.
- Mark
John Powers wrote:
Thanks for the response. Its definitely the user search object's
search(). I have to iterate through all the hits that come back to get
all the categories used in the results, so the number that hits gets
really doesn't matter--ill need them all.
-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Sunday, May 27, 2007 10:26 PM
To: java-user@lucene.apache.org
Subject: Re: synchronize hits variable?
Hi John,
20M sounds suspicious. Without seeing the code, it's hard to tell. My
guess is the problem lies elsewhere or some piece of Lucene is being
incorrectly used. Or maybe your Lucene Documents are just very large.
Are they? You could go modify Hits source and change the number of hits
that Hits instance loads. I think it caches either 100 or 200 (haven't
looked at it in a while). There is no API for changing it, so you could
change it in the sources, recompile, redeploy, and see if that helps
you.
Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share
----- Original Message ----
From: John Powers <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, May 27, 2007 9:17:59 PM
Subject: synchronize hits variable?
In a j2ee webapp we have a search object that stores a user's search
preferences (items/page, detail level, etc). it has a search() that
calls a static method getSearcher() that returns a static IndexSearcher
that all these user search objects use.....searching with that gives us
a Hits object that this user object iterates through to find out what
categories are used, which ones to display, etc. this hits object is
huge. Its fluffing up each user session by 20M in some cases. This
is unacceptable of course. I am sure many have run into this issue,
and I was curious what you did to solve it. If I use a local
variable to that search method, null it after I'm done and call gc(), it
still is bad. I can't put the hits variable into a static object or
attribute cause of course there are multiple of these user search
objects using it at any time. Even if I isolated the user search
part to a singleton, others may use it at the same time as well. I
imagine some sort of synchronization of that static variable is in order
but am not quite sure. does someone have an example of this? I would
image everyone has had to deal with this problem. 20M is just to big.
When we use a profiler we find that it's a char[] that seems to be
holding a lot of the data..like 16M of it. For different indexes in
different servers, this size is different, but for the most part its way
to big everywhere.
Thoughts? I appreciate any help on this.
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]