Broadband here is fairly stable; to be honest, I don't remember the last time I had problems such as higher-than-expected latency or downtime (ISP: Bethere / UK). My application can cope fine with up to 10 minutes of lag (data freshness). However, taking your input into consideration, I agree with you: I don't think I should trust the setup and assume my data will stay synced across the two clusters, as failures will surely occur.
Regarding Hadoop, it requires a lot more coding than what I need to get my app working with Cassandra. Also, Hadoop's single point of failure scares me, because I have neither the budget nor the time to work on a fail-safe kind of solution.

Once again, thanks for your feedback.

Marco

On 14 November 2011 09:43, Radim Kolar <h...@sendmail.cz> wrote:

>> Well to be honest I was thinking of using that connection in production,
>> not for a backup node.
>
> For production, there are several problems. Added network latency, which
> is inconsistent and varies greatly during the day; sometimes you will face
> network lags which will break the cluster for a while (about 1-2 minutes).
> Network bandwidth is also a problem, especially during peak hours. It might
> not be a problem if you don't have an interactive workload: an app can wait,
> a human can't. Be sure to use connection pooling to different servers at the
> client. Over WAN you can have about a 4:1 ratio in available bandwidth in
> peak hours/night hours. You need to schedule anti-entropy repairs at night.
>
>> My Cassandra deployment works just like an expensive file caching and
>> replication - I mean, all I use it for is to replicate some 5 million files
>> of 2M each across a few nodes and intensively read/write.
>
> For mass replication of large files, Hadoop is really better than
> Cassandra because there are no compactions.
>
>> Not only the files themselves but I also need to attach some tags to each
>> file (see them as key=value), so I thought of Hadoop but in the end settled
>> for Cassandra because of better consistency, community support, no single
>> point of failure and some!
>
> Hadoop is far better than Cassandra for batch processing if your batch
> processing changes the majority of the data set. SPOF is not a problem, but
> it is way harder to write optimised applications for Hadoop; it's kind of
> low level.
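
As a footnote to the advice above about scheduling anti-entropy repairs at night: a minimal sketch of how a small wrapper could gate `nodetool repair` on an off-peak window. The 01:00-05:00 window, the keyspace name `files`, and the helper names `in_off_peak` and `maybe_repair` are all assumptions for illustration, not anything from this thread; in practice a plain cron entry may be all that is needed.

```python
import subprocess
from datetime import datetime, time
from typing import Optional

# Assumed off-peak window (hypothetical values; tune to your own link).
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(5, 0)


def in_off_peak(now: time,
                start: time = OFF_PEAK_START,
                end: time = OFF_PEAK_END) -> bool:
    """True if `now` is in [start, end); handles windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def maybe_repair(keyspace: str,
                 now: Optional[datetime] = None,
                 dry_run: bool = True) -> bool:
    """Run `nodetool repair <keyspace>` only inside the off-peak window.

    Returns True if a repair was (or, in dry-run mode, would have been) started.
    """
    now = now or datetime.now()
    if not in_off_peak(now.time()):
        return False
    cmd = ["nodetool", "repair", keyspace]
    if dry_run:
        # Default is a dry run so the sketch is safe to try without a cluster.
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return True
```

Calling `maybe_repair("files")` from an hourly job would then skip the repair during the day and start it only at night; pass `dry_run=False` on a node where `nodetool` is actually on the PATH.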