Re: A couple of questions regarding load balancing and failover

Jeff Rodenburg Wed, 30 Nov 2005 19:14:09 -0800

On 11/30/05, Daniel Pfeifer <[EMAIL PROTECTED]> wrote:
>
>
> 1.) Does Lucene's MultiSearcher implement some kind of automatic failover
> and/or load-balancing mechanism if both Searchables which I supply in
> MultiSearcher's constructor go to two different servers but to the very same
> index-files?



We're on version 1.4.3, which may not match the current trunk, and
load balancing is not built into the remote searchers.  In your scenario,
with two servers pointing to the same index-files, if you issue a search
against both servers (where both servers are participating), you'll get
duplicate results.


> if server 1 crashes there is still server 2 and thus at least one server
> will be able to complete the request. Will the MultiSearcher which is using
> RemoteSearchables from two servers automatically detect that Searchable
> number 1 (server 1) does not respond and then try Searchable 2? If not, what
> is the recommended way of doing this?


No.  The recommended way is very dependent on your underlying
infrastructure.  My personal approach is to hardware load balance in front
of the remote searcher systems, but this can be achieved a number of ways.


> And the second part of the question is: If both Searchables are available
> and working, will the MultiSearcher automatically distribute requests to
> both Searchables or is there a risk that we get duplicates since both
> Searchables actually expose the same indexes? If this isn't the case, what
> would be the recommended way of implementing load distribution over several
> servers?


Again no.  I would recommend a load balancing implementation external to
Lucene; don't make Lucene figure out what systems are/aren't available.  If
I were inclined to add failover functionality to Lucene, it would be within
ParallelMultiSearcher to gracefully allow (expect?) failure in searches.

I'm speaking from the v1.4.3 codebase, and the trunk is currently at v1.9.
Check the current MultiSearcher and ParallelMultiSearcher classes to see if
any improvements have been made.
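
To illustrate what "external to Lucene" could look like on the client side,
here is a minimal sketch, not anything Lucene ships.  FailoverClient and its
methods are hypothetical names, and each Function stands in for a call to one
remote searcher.  A real RMI call throws the checked java.rmi.RemoteException,
which would have to be wrapped in an unchecked exception for this to apply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical wrapper: round-robin over replicas, failing over on error. */
class FailoverClient<Q, R> {
    // Each entry stands in for one remote searcher over the same index.
    private final List<Function<Q, R>> replicas;
    private final AtomicInteger next = new AtomicInteger();

    FailoverClient(List<Function<Q, R>> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("need at least one replica");
        }
        this.replicas = new ArrayList<>(replicas);
    }

    R search(Q query) {
        // Rotate the starting replica so load is spread across servers.
        int start = Math.floorMod(next.getAndIncrement(), replicas.size());
        RuntimeException last = null;
        for (int i = 0; i < replicas.size(); i++) {
            Function<Q, R> replica = replicas.get((start + i) % replicas.size());
            try {
                // Only one replica answers each query, so results are not duplicated.
                return replica.apply(query);
            } catch (RuntimeException e) {
                last = e; // this replica is unavailable: try the next one
            }
        }
        throw new IllegalStateException("all replicas failed", last);
    }
}
```

A hardware load balancer does the same job without any client code, which is
why I prefer it; the sketch only shows the logic involved.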


> 2.) On our index-servers which expose the underlying index as a
> RemoteSearchable we do have four dualcore processors each. Since we thus
> have great multithreading-capabilities I do use the ParallelMultiSearcher
> instead of the MultiSearcher. On the client side (the application which
> connects to the Index RMI-Server), should I therefore also be using a
> ParallelMultiSearcher or is it ok if I use the standard MultiSearcher? And if
> so, why?


That sounds like a lot of power for Lucene, possibly more than it needs.
ParallelMultiSearcher spawns a thread for each subsearcher, so a
multi-processor system can execute those searches in parallel, whereas
MultiSearcher runs them one after another.  The more subsearchers, the more
this makes a difference.  I would base the choice of searcher on whether the
client can actually use that parallelism; if your client has multiple
processors, I'd go with ParallelMultiSearcher.
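
The difference between the two approaches can be sketched roughly as follows.
This is an illustration of the general fan-out idea using java.util.concurrent,
not Lucene's actual implementation; FanOut and its methods are hypothetical
names, and each Function stands in for one subsearcher returning a list of
hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class FanOut {
    /** Sequential style: query each subsearcher in turn and merge. */
    static <Q, H> List<H> searchSequential(List<Function<Q, List<H>>> subs, Q query) {
        List<H> merged = new ArrayList<>();
        for (Function<Q, List<H>> sub : subs) {
            merged.addAll(sub.apply(query));
        }
        return merged;
    }

    /** Parallel style: one task per subsearcher, merged in submission order. */
    static <Q, H> List<H> searchParallel(List<Function<Q, List<H>>> subs, Q query) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subs.size()));
        try {
            List<Callable<List<H>>> tasks = new ArrayList<>();
            for (Function<Q, List<H>> sub : subs) {
                tasks.add(() -> sub.apply(query));
            }
            List<H> merged = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps task order.
            for (Future<List<H>> f : pool.invokeAll(tasks)) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Both methods return the same merged list; the parallel one only pays off when
the client has spare processors and more than one subsearcher.  Note that in
either style, two subsearchers reading the same index would each contribute
the same hits, so the merged list would contain them twice.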

> 3.) Currently - to increase speed - we are loading the entire index into
> memory (using RAMDirectory rather than the FSDirectory). We found out that
> the RAMDirectory will not update itself if the files in the directory from
> where the RAMDirectory is loading the index are updated.
> Therefore I simply coded a Thread which every 10 minutes instantiates new
> RAMDirectories, unbinds the current RemoteSearchable and then rebinds to the
> RMI Registry with the new Searchable which uses the new RAMDirectories. This
> certainly doesn't feel like a good solution, even though the time under
> which the RMI service will not be able to answer is minimal, there is still
> a small chance that this very moment a client application tries to find
> something in the index. Is there a way to refresh the RAMDirectory without
> having to create new instances of all classes and bind these classes to the
> RMI Registry? If so, how?


Not that I'm aware of.  The remote searchers need to re-open to reflect the
changes, and those have to be disconnected from the RMI binding in order to
make that change.
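
If the gap during unbind/rebind is the main worry, one idea that might narrow
it is to keep a single long-lived front object and swap what it delegates to.
This is only a sketch under assumptions: SwappableSearcher is a hypothetical
name, the Function stands in for a searcher, and I have not verified that
RemoteSearchable in 1.4.3 can be wrapped this way.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical front object: callers keep one reference while the delegate is swapped. */
class SwappableSearcher<Q, R> implements Function<Q, R> {
    private final AtomicReference<Function<Q, R>> current;

    SwappableSearcher(Function<Q, R> initial) {
        this.current = new AtomicReference<>(initial);
    }

    /**
     * Install a replacement that has already been fully built (for example,
     * over freshly loaded in-memory directories).  Returns the previous
     * delegate so the caller can close it once it is no longer needed.
     */
    Function<Q, R> swap(Function<Q, R> fresh) {
        return current.getAndSet(fresh);
    }

    @Override
    public R apply(Q query) {
        // Each call reads the delegate once and runs entirely against it.
        return current.get().apply(query);
    }
}
```

The refresh thread would build the new searcher first and call swap() last, so
there is no moment at which no delegate is installed.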

Hope this helps.
