OK new version of SearcherManager, that fixes maybeReopen() so that it
can be called from multiple threads.
NOTE: it's still untested!
Mike
package lia.admin;
import java.io.IOException;
import java.util.HashMap;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
/** Utility class to get/refresh searchers when you are
* using multiple threads. */
public class SearcherManager {
private IndexSearcher currentSearcher; //A
private Directory dir;
public SearcherManager(Directory dir) throws IOException {
this.dir = dir;
currentSearcher = new IndexSearcher(IndexReader.open(dir)); //B
}
public void warm(IndexSearcher searcher) {} //C
private boolean reopening;
private synchronized void startReopen() //D
throws InterruptedException {
while (reopening) {
wait();
}
reopening = true;
}
private synchronized void doneReopen() { //E
reopening = false;
notifyAll();
}
public void maybeReopen() throws InterruptedException, IOException
{ //F
startReopen();
try {
final IndexSearcher searcher = get();
try {
long currentVersion =
currentSearcher.getIndexReader().getVersion(); //G
if (IndexReader.getCurrentVersion(dir) != currentVersion)
{ //G
IndexReader newReader =
currentSearcher.getIndexReader().reopen(); //G
assert newReader !=
currentSearcher.getIndexReader(); //G
IndexSearcher newSearcher = new
IndexSearcher(newReader); //G
warm(newSearcher); //G
swapSearcher(newSearcher); //G
}
} finally {
release(searcher);
}
} finally {
doneReopen();
}
}
public synchronized IndexSearcher get() { //H
currentSearcher.getIndexReader().incRef();
return currentSearcher;
}
public synchronized void release(IndexSearcher searcher) //I
throws IOException {
searcher.getIndexReader().decRef();
}
private synchronized void swapSearcher(IndexSearcher newSearcher) //J
throws IOException {
release(currentSearcher);
currentSearcher = newSearcher;
}
}
/*
#A Current IndexSearcher
#B Create initial searcher
#C Implement in subclass to warm new searcher
#D Pauses until no other thread is reopening
#E Finish reopen and notify other threads
#F Reopen searcher if there are changes
#G Check index version and reopen, warm, swap if needed
#H Returns current searcher
#I Release searcher
#J Swaps currentSearcher to new searcher
*/
Mike
On Mar 1, 2009, at 8:27 AM, Amin Mohammed-Coleman wrote:
just a quick point:
public void maybeReopen() throws IOException { //D
long currentVersion = currentSearcher.getIndexReader().getVersion();
if (IndexReader.getCurrentVersion(dir) != currentVersion) {
IndexReader newReader = currentSearcher.getIndexReader().reopen();
assert newReader != currentSearcher.getIndexReader();
IndexSearcher newSearcher = new IndexSearcher(newReader);
warm(newSearcher);
swapSearcher(newSearcher);
}
}
should the above be synchronised?
On Sun, Mar 1, 2009 at 1:25 PM, Amin Mohammed-Coleman <ami...@gmail.com
>wrote:
thanks. i will rewrite..in between giving my baby her feed and
playing
with the other child and my wife who wants me to do several other
things!
On Sun, Mar 1, 2009 at 1:20 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
Amin Mohammed-Coleman wrote:
Hi
Thanks for your input. I would like to have a go at doing this
myself
first, Solr may be an option.
* You are creating a new Analyzer & QueryParser every time, also
creating unnecessary garbage; instead, they should be created once
& reused.
-- I can moved the code out so that it is only created once and
reused.
* You always make a new IndexSearcher and a new MultiSearcher even
when nothing has changed. This just generates unnecessary garbage
which GC then must sweep up.
-- This was something I thought about. I could move it out so
that it's
created once. However I presume inside my code i need to check
whether
the
indexreaders are update to date. This needs to be synchronized
as well I
guess(?)
Yes you should synchronize the check for whether the IndexReader is
current.
* I don't see any synchronization -- it looks like two search
requests are allowed into this method at the same time? Which is
dangerous... eg both (or, more) will wastefully reopen the
readers.
-- So i need to extract the logic for reopening and provide a
synchronisation mechanism.
Yes.
Ok. So I have some work to do. I'll refactor the code and see if
I can
get
inline to your recommendations.
On Sun, Mar 1, 2009 at 12:11 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
On a quick look, I think there are a few problems with the code:
* I don't see any synchronization -- it looks like two search
requests are allowed into this method at the same time? Which is
dangerous... eg both (or, more) will wastefully reopen the
readers.
* You are over-incRef'ing (the reader.incRef inside the loop) -- I
don't see a corresponding decRef.
* You reopen and warm your searchers "live" (vs with BG thread);
meaning the unlucky search request that hits a reopen pays the
cost. This might be OK if the index is small enough that
reopening & warming takes very little time. But if index gets
large, making a random search pay that warming cost is not nice to
the end user. It erodes their trust in you.
* You always make a new IndexSearcher and a new MultiSearcher even
when nothing has changed. This just generates unnecessary garbage
which GC then must sweep up.
* You are creating a new Analyzer & QueryParser every time, also
creating unnecessary garbage; instead, they should be created once
& reused.
You should consider simply using Solr -- it handles all this
logic for
you and has been well debugged with time...
Mike
Amin Mohammed-Coleman wrote:
The reason for the indexreader.reopen is because I have a webapp
which
enables users to upload files and then search for the
documents. If I
don't
reopen i'm concerned that the facet hit counter won't be updated.
On Tue, Feb 24, 2009 at 8:32 PM, Amin Mohammed-Coleman <
ami...@gmail.com
wrote:
Hi
I have been able to get the code working for my scenario,
however I
have
a
question and I was wondering if I could get some help. I have
a list
of
IndexSearchers which are used in a MultiSearcher class. I use
the
indexsearchers to get each indexreader and put them into a
MultiIndexReader.
IndexReader[] readers = new IndexReader[searchables.length];
for (int i =0 ; i < searchables.length;i++) {
IndexSearcher indexSearcher = (IndexSearcher)searchables[i];
readers[i] = indexSearcher.getIndexReader();
IndexReader newReader = readers[i].reopen();
if (newReader != readers[i]) {
readers[i].close();
}
readers[i] = newReader;
}
multiReader = new MultiReader(readers);
OpenBitSetFacetHitCounter facetHitCounter =
newOpenBitSetFacetHitCounter();
IndexSearcher indexSearcher = new IndexSearcher(multiReader);
I then use the indexseacher to do the facet stuff. I end the
code
with
closing the multireader. This is causing problems in another
method
where I
do some other search as the indexreaders are closed. Is it ok
to not
close
the multiindexreader or should I do some additional checks in
the
other
method to see if the indexreader is closed?
Cheers
P.S. Hope that made sense...!
On Mon, Feb 23, 2009 at 7:20 AM, Amin Mohammed-Coleman <
ami...@gmail.com
wrote:
Hi
Thanks just what I needed!
Cheers
Amin
On 22 Feb 2009, at 16:11, Marcelo Ochoa <marcelo.oc...@gmail.com
>
wrote:
Hi Amin:
Please take a look a this blog post:
http://sujitpal.blogspot.com/2007/04/lucene-search-within-search-with.html
Best regards, Marcelo.
On Sun, Feb 22, 2009 at 1:18 PM, Amin Mohammed-Coleman <
ami...@gmail.com>
wrote:
Hi
Sorry to re send this email but I was wondering if I could
get some
advice
on this.
Cheers
Amin
On 16 Feb 2009, at 20:37, Amin Mohammed-Coleman <ami...@gmail.com
>
wrote:
Hi
I am looking at building a faceted search using Lucene. I
know
that
Solr
comes with this built in, however I would like to try this
by
myself
(something to add to my CV!). I have been looking around
and I
found
that
you can use the IndexReader and use TermVectors. This
looks ok
but
I'm
not
sure how to filter the results so that a particular user
can only
see
a
subset of results. The next option I was looking at was
something
like
Term term1 = new Term("brand", "ford");
Term term2 = new Term("brand", "vw");
Term[] termsArray = new Term[] { term1, term2 };un
int[] docFreqs = indexSearcher.docFreqs(termsArray);
The only problem here is that I have to provide the brand
type
each
time a
new brand is created. Again I'm not sure how I can filter
the
results
here.
It may be that I'm using the wrong api methods to do this.
I would be grateful if I could get some advice on this.
Cheers
Amin
P.S. I am basically trying to do something that displays
the
following
Personal Contact (23) Business Contact (45) and so on..
--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Want to integrate Lucene and Oracle?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
Is Oracle 11g REST ready?
http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-
unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org