Looking into the memory problems further, I read:
"Every time you open an IndexSearcher/IndexReader, resources are used which
take up memory. For an application pointed at a static index, you only
ever need one IndexReader/IndexSearcher that can be shared among multiple
threads issuing queries. If your index is being incrementally updated,
you should never need more than two searcher/reader pairs open at a time"
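For reference, a minimal sketch of that single-shared-searcher pattern
(the class name and index path are illustrative, not from our code):

import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;

public class SearcherHolder {
    // one searcher shared by all query threads; IndexSearcher is thread-safe
    private static Searcher searcher;

    public static synchronized Searcher get() throws IOException {
        if (searcher == null) {
            // open exactly once for a static index
            searcher = new IndexSearcher("/path/to/index");
        }
        return searcher;
    }
}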
We may have many different segments of our index, and it seems below that
we are using one IndexSearcher per segment. Could this explain why we run
out of memory when using more than two or three segments?
Anyone else have any comments on the below?
Many thanks
Leon
P.S. At the moment I think it is set to only look at two segments.
private Searcher getSearcher() throws IOException {
    if (mSearcher == null) {
        synchronized (Monitor) {
            if (mSearcher == null) { // re-check inside the lock so two
                                     // threads cannot both build a searcher
                int maxI = 2;
                // Size the array to the number of searchers actually opened;
                // sizing it to SearchersDir.size() leaves null slots that
                // make MultiSearcher throw a NullPointerException.
                int count = Math.min(maxI, SearchersDir.size());
                Searcher[] srs = new IndexSearcher[count];
                int i = 0;
                for (Iterator iter = SearchersDir.iterator();
                        iter.hasNext() && i < count; i++) {
                    String dir = (String) iter.next();
                    try {
                        srs[i] = new IndexSearcher(IndexDir + dir);
                    } catch (IOException e) {
                        // note: a failed open also leaves srs[i] null
                        log.error(ClassTool.getClassNameOnly(e) + ": "
                                + e.getMessage(), e);
                    }
                }
                mSearcher = new MultiSearcher(srs);
                changeTime = System.currentTimeMillis();
            }
        }
    }
    return mSearcher;
}
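For the incrementally updated case the quote describes, a refresh method
along the lines below (the name is my own; it assumes the same class and
fields as getSearcher() above) would keep at most two searcher/reader
pairs open during a swap. In-flight queries on the old searcher would
need to complete before close(), which is usually handled with a
reference count:

private void refreshSearcher() throws IOException {
    synchronized (Monitor) {
        Searcher old = mSearcher;
        mSearcher = null; // the next getSearcher() call rebuilds it
        if (old != null) {
            old.close(); // frees the old reader's resources
        }
    }
}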
----- Original Message -----
From: "Leon Chaddock" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Wednesday, February 15, 2006 9:28 AM
Subject: Re: Size + memory restrictions
Hi Greg,
Thanks. We are actually running against 4 segments of 4 GB each, so about
20 million docs. We can't merge the segments, as there seem to be problems
on our Linux box with files over about 4 GB. Not sure why that is.
If I were to upgrade to 8 GB of RAM, does it seem likely this would double
the number of docs we can handle, or would it provide an exponential
increase?
Thanks
Leon
----- Original Message -----
From: "Greg Gershman" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Wednesday, February 15, 2006 12:41 AM
Subject: Re: Size + memory restrictions
You may consider incrementally adding documents to
your index; I'm not sure why there would be problems
adding to an existing one, but you can always append
additional documents and optimize later to get
everything back into a single segment.
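A sketch of that with the 1.4-era API (the path, analyzer, and field
below are assumptions):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// open for append (create == false) rather than rebuilding from scratch
IndexWriter writer = new IndexWriter("/path/to/index",
        new StandardAnalyzer(), false);
Document doc = new Document();
doc.add(Field.Text("body", "new content")); // stored, indexed, tokenized
writer.addDocument(doc);
writer.optimize(); // merges all segments back into a single segment
writer.close();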
Querying is a different story; if you are using the
Sort API, you will need enough memory to hold the
sort field values for every document in the index.
If you're trying to sort on a string, or anything
other than an int or float, this could require a lot
of memory.
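For example, sorting on an int field caches one int per document, which
is far cheaper than the String objects a string sort keeps around
("date" is an assumed field name; searcher and query as elsewhere):

import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;

Sort byDate = new Sort(new SortField("date", SortField.INT));
Hits hits = searcher.search(query, byDate);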
I've used indices much bigger than 5 million docs / 3.5 GB
with less than 4 GB of RAM and had no problems.
Greg
--- Leon Chaddock <[EMAIL PROTECTED]> wrote:
Hi,
we are having tremendous problems building a large
Lucene index and querying it.
The programmers are telling me that when the index
file reaches 3.5 GB or 5 million docs, the index file
can no longer grow any larger.
To rectify this they have built index files in
multiple directories. Now, apparently, my 4 GB of
memory is not enough to query them.
Does this seem right to people, or does anyone have
experience with largish-scale projects?
I am completely tearing my hair out here and don't
know what to do.
Thanks