Hi Mark - Having gone down this path for the past year, I echo the comments from others that scalability/availability/failover is a lot of work. We migrated away from a custom system based on Lucene running on Windows to Solr running on Linux. It took us 6 months to get our system to a solid five-nines in availability. Having done this before, I would advise anyone not to underestimate the effort involved. We would have taken the simple route had it been available.
We shifted to Solr because of the operational elements that allow us to achieve clustering and failover within the Linux/Apache/Tomcat (our flavor) mix. It just works better for us than our home-brew.

-- j

On 7/27/06, Mark Miller <[EMAIL PROTECTED]> wrote:
I know there has been a lot of discussion on distributed search... I am looking for a cross-platform solution, which seems to kill Solr's approach... Everyone seems to have implemented this, but only as proprietary code... It would seem that just using the RMI searcher would allow a simple solution? Is this the case? Could you easily provide clustering and failover using a variety of indexes and searching them all with the RMI searcher? Is it all really that complicated?

I have read that Lucene tops out at about 10m docs for a single server... I want to hit 100m. I have a beautiful app that allows realtime updating/searching (updates are rare but should be instant)... and I just want it to scale up to 100m docs or so. Is that going to be a really advanced project no matter how I slice it? I have done a lot of custom work with the Lucene stuff, so it would seem difficult to adapt it to Nutch (but what do I know about Nutch)...

I have seen a lot of talk but not much on a simple RMI searcher solution... any idea?

- Mark