Re: multi datacenter cluster, without fibre speeds

Radim Kolar Mon, 14 Nov 2011 01:44:20 -0800

Well, to be honest, I was thinking of using that connection in production, not for a backup node.

For production there are several problems. The added network latency is inconsistent and varies greatly during the day; sometimes you will hit network lags that break the cluster for a while (about 1-2 minutes). Network bandwidth is also a problem, especially during peak hours. It might not be a problem if you don't have an interactive workload: an app can wait, a human can't. Be sure to use connection pooling to different servers at the client. Over a WAN you can see about a 4:1 ratio in available bandwidth between peak hours and night hours, so you need to schedule anti-entropy repairs at night.
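As a rough illustration of scheduling repairs at night, here is a minimal Python sketch of a guard that a cron wrapper could call before starting a repair (for example via nodetool repair). The function name in_repair_window and the 01:00-05:00 default window are assumptions made for this example, not something from the thread; pick the window from your own link measurements.

```python
from datetime import datetime, time


def in_repair_window(now, start=time(1, 0), end=time(5, 0)):
    """Return True if `now` falls inside the nightly window [start, end).

    The default window (01:00-05:00 local time) is an assumed example value.
    """
    t = now.time()
    if start <= end:
        # Window lies within a single day, e.g. 01:00 -> 05:00.
        return start <= t < end
    # Window wraps past midnight, e.g. 23:00 -> 04:00.
    return t >= start or t < end


if __name__ == "__main__":
    if in_repair_window(datetime.now()):
        print("inside the night window: OK to start a repair")
    else:
        print("outside the night window: skip the repair for now")
```

A wrapper script would run the repair only when the function returns True, so a job that fires late or is started by hand does not compete with peak-hour traffic.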

My Cassandra deployment works just like an expensive file caching and replication system. I mean, all I use it for is to replicate some 5 million files of 2M each across a few nodes and intensively read/write them.

For mass replication of large files, Hadoop is really better than Cassandra because there are no compactions.
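To get a feel for the scale involved, here is a back-of-the-envelope sketch in Python for how long one full copy of the data set mentioned above would take over a WAN link. It assumes "2M" means 2 MB (decimal), that the night-time link gives 100 Mbit/s, and that the 4:1 ratio means peak hours have a quarter of the night bandwidth. The link speed is an invented figure; substitute your own measurements.

```python
def transfer_days(total_bytes, link_bits_per_s):
    """Days needed to move total_bytes over a link, ignoring protocol overhead."""
    seconds = total_bytes * 8 / link_bits_per_s
    return seconds / 86400


# 5 million files of 2 MB each (assumes decimal megabytes): 10 TB in total.
total_bytes = 5_000_000 * 2 * 10**6

night_bw = 100 * 10**6   # assumed: 100 Mbit/s available at night
peak_bw = night_bw / 4   # assumed reading of the 4:1 ratio

print(f"night: {transfer_days(total_bytes, night_bw):.1f} days")  # night: 9.3 days
print(f"peak:  {transfer_days(total_bytes, peak_bw):.1f} days")   # peak:  37.0 days
```

Under these assumptions a full re-replication or a large repair takes days even at night, which is why both the tool choice and the repair schedule matter on a slow link.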

Not only the files themselves, but I also need to attach some tags to each file (see them as key=value), so I thought of Hadoop but in the end settled for Cassandra because of better consistency, community support, no single point of failure and some!

Hadoop is far better than Cassandra for batch processing if your batch processing changes the majority of the data set. The SPOF is not a problem, but it is way harder to write optimised applications for Hadoop; it's kind of low level.
