I have been browsing the archives concerning this particular topic.
I'm in the same boat and the customer has clustering requirements.
To give some background:
I have a constant flow of incoming messages flying over the network that
need to be archived in db, indexed and dispatched to thousand of clients
(rich client console).
the backend architecture needs to be clustered meaning that:
- the message broker needs to be clustered
- the database needs to be replicated and support failover
- the search engine index needs to be replicated
This is for a 24x7 operation.
My main problem is that there is a constant flow of write just about
everywhere meaning that the lucene index keeps changing, and that I have
a very small window available to replicate the data across the network.
(As of now, I have 2 messages / minute and should go over 50 in the
medium-term).
Concerning the index, being able to replicate is cool, but if one node
goes down, it must be able to resynchronize when you bring it up on the
cluster...that's a hell of problem.
As it is acceptable to have downtime on the search engine, I was
thinking it was much easier to:
1) rely on a shared index via NFS for each node.
2) dedicate a box to the search engine and access it via rpc from each node
Considering the messages I have seen in the archives, 1) seems to be a
no-go.
Option 2) is generally not recommended but think it could fit my needs
quite well. IMHO it should work quite well to bring the box in operation
if it goes down. Synchronizing the index for me is just a matter of
going through the database to reindex the archived content, this will
take sometime but as I said, running in degraded mode is acceptable.
As anyone any suggestion/recommendation/experience/thoughts concerning
the problems mentionned above ?
Cheers,
Stephane
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]