Re: Index Replication / Clustering

2005-06-30 Thread Stephane Bailliez
Nader Henein wrote: Considerations that you may want to think about when sanitizing your clustered indices: 1) Number of documents available vs. number of documents in the persistent store. 2) Are all the documents up to date (involves comparing the existence and the last date updated of Luce

Re: Index Replication / Clustering

2005-06-27 Thread Chris Lu
DBSight, a J2EE search engine for databases, meets most of your requirements. It has clustering support. Basically, you can configure one DBSight server specifically for indexing database content, and one or more other DBSight servers devoted to search, which subscribe to the indexing server

Re: Index Replication / Clustering

2005-06-27 Thread Nader Henein
Considerations that you may want to think about when sanitizing your clustered indices: 1) Number of documents available vs. number of documents in the persistent store. 2) Are all the documents up to date (involves comparing the existence and the last date updated of Lucene documents to persi

Re: Index Replication / Clustering

2005-06-27 Thread Paul Smith
On 27/06/2005, at 7:14 PM, Nader Henein wrote: I implemented a JMS-based solution about a year ago because I thought it would solve my atomicity problem and give me a centralized way of indexing. You'll have to use the pluggable persistence (if you use ActiveMQ) to be able to recover from

Re: Index Replication / Clustering

2005-06-27 Thread Paul Smith
If you use ActiveMQ for JMS, you can take advantage of its Composite Destination feature and have a virtual Queue/Topic that is actually several Queues/Topics. This is what we use to keep a mirror index server completely in sync. The application sends an update message to a queue

Re: Index Replication / Clustering

2005-06-27 Thread Nader Henein
I implemented a JMS-based solution about a year ago because I thought it would solve my atomicity problem and give me a centralized way of indexing. You'll have to use the pluggable persistence (if you use ActiveMQ) to be able to recover from a failure and you'll also need some way of maintaini

Re: Index Replication / Clustering

2005-06-27 Thread Stephane Bailliez
Hi Paul, Thanks for the reply. Many interesting points. Paul Smith wrote: Why not try using JMS messaging to send messages to the indexing server that Document X needs to be updated via a JMS queue? This gives you the flexibility to have the indexing system down but not lose the message t

Re: Index Replication / Clustering

2005-06-26 Thread Paul Smith
Why not try using JMS messaging to send messages to the indexing server that Document X needs to be updated via a JMS queue? This gives you the flexibility to have the indexing system down but not lose the message that it needs to be indexed, and also allows the indexing server to be 'busy

Re: Index Replication / Clustering

2005-06-26 Thread Nader Henein
As far as indexing is concerned, a simple way of tracking a clustered system is to create autonomous indices that report to a central repository, creating a table in the DB with a row per document (you have a unique document ID, right?) and then a column per server node (the columns act as
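
The tracking table described in this message can be pictured with a small in-memory stand-in. This is only a sketch under stated assumptions: the snippet is cut off before it says what the columns hold, so the code assumes each column is a boolean "this node has indexed this document" flag. The class name `ReplicationTracker` and all of its methods are hypothetical, and a real setup would keep the rows in the database rather than in a map.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the DB table: one row per document, one flag per node. */
class ReplicationTracker {
    private final List<String> nodes;                       // column order = node order
    private final Map<String, BitSet> rows = new LinkedHashMap<>();

    ReplicationTracker(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    /** A document was added or updated: (re)create its row with every flag cleared. */
    void documentChanged(String docId) {
        rows.put(docId, new BitSet(nodes.size()));
    }

    /** A node reports that it has indexed the current version of the document. */
    void markIndexed(String docId, String node) {
        int col = nodes.indexOf(node);
        BitSet row = rows.get(docId);
        if (col < 0 || row == null) {
            throw new IllegalArgumentException("unknown node or document");
        }
        row.set(col);
    }

    /** Documents the given node still has to index (flag not yet set). */
    List<String> pendingFor(String node) {
        int col = nodes.indexOf(node);
        if (col < 0) {
            throw new IllegalArgumentException("unknown node");
        }
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, BitSet> e : rows.entrySet()) {
            if (!e.getValue().get(col)) {
                pending.add(e.getKey());
            }
        }
        return pending;
    }

    /** True once every node has set its flag for the document. */
    boolean fullyReplicated(String docId) {
        BitSet row = rows.get(docId);
        return row != null && row.cardinality() == nodes.size();
    }
}
```

With this layout, each node can ask `pendingFor` for its own backlog, and a monitoring job can use `fullyReplicated` to spot documents that some index has not caught up on.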

Re: Index Replication / Clustering

2005-06-26 Thread Stephane Bailliez
Nader Henein wrote: Our setup is quite similar to yours, but in all honesty, you will need to do some form of batching on your updates simply because you don't want to keep the IndexWriter open all the time. For now, the index writer is closed after each added document. It does not seem to

Re: Index Replication / Clustering

2005-06-26 Thread Nader Henein
Our setup is quite similar to yours, but in all honesty, you will need to do some form of batching on your updates simply because you don't want to keep the IndexWriter open all the time. As for clustering, we went through three iterations, that keep x indexes parallelized on x servers all o
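
The batching advice above can be sketched roughly as follows. The snippet does not say how the batching was actually done, so this is an illustration only: `UpdateBatcher`, its batch-size trigger, and the `flushAction` callback are all hypothetical names. In a Lucene setup the callback would be the place to open the writer, add the batched documents, and close it again; here it is just a `Consumer` so the example has no external dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects document IDs and hands them over in batches. */
class UpdateBatcher {
    private final int batchSize;
    private final Consumer<List<String>> flushAction;   // e.g. open writer, index batch, close writer
    private final List<String> pending = new ArrayList<>();

    UpdateBatcher(int batchSize, Consumer<List<String>> flushAction) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.flushAction = flushAction;
    }

    /** Queue one update; flush automatically once the batch is full. */
    void add(String docId) {
        pending.add(docId);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hand the pending updates to the flush action; does nothing if none are pending. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        flushAction.accept(batch);
    }
}
```

A size trigger alone can leave a partial batch waiting indefinitely, so a caller would likely also invoke `flush()` on a timer or at shutdown.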