Re: multi datacenter cluster, without fibre speeds

M Vieira Mon, 14 Nov 2011 04:11:56 -0800

Broadband here is fairly stable; to be honest, I don't remember the last
time I had problems such as larger-than-expected latency or downtime (ISP:
Bethere, UK).
My application can cope fine with up to 10 minutes of lag (data
freshness). However, taking your input into consideration, I agree with
you: I shouldn't trust the setup and assume my data will stay synced
across the two clusters, as failures will surely occur.
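
For a rough sense of scale, here is a back-of-envelope sketch of what a
10 minute freshness budget allows over a slow link. The file size and file
count are the figures from my earlier message quoted below; the link speeds
are purely hypothetical, and I am reading "2M" as 2 MiB.

```python
FILE_SIZE_BYTES = 2 * 1024 * 1024  # "2M each", read as 2 MiB (assumption)
FILE_COUNT = 5_000_000             # "some 5million files"
LAG_BUDGET_S = 10 * 60             # 10 minute data-freshness budget


def seconds_to_ship(num_bytes, link_mbit_s):
    """Seconds needed to push num_bytes over a link of link_mbit_s Mbit/s."""
    return num_bytes * 8 / (link_mbit_s * 1_000_000)


def files_within_budget(link_mbit_s, budget_s=LAG_BUDGET_S):
    """How many whole files can be replicated inside the lag budget."""
    return int(budget_s // seconds_to_ship(FILE_SIZE_BYTES, link_mbit_s))


if __name__ == "__main__":
    for mbit in (1, 10, 100):  # hypothetical usable link speeds
        per_file = seconds_to_ship(FILE_SIZE_BYTES, mbit)
        full_copy_days = seconds_to_ship(FILE_SIZE_BYTES * FILE_COUNT, mbit) / 86400
        print(f"{mbit:>3} Mbit/s: {per_file:6.2f} s per file, "
              f"{files_within_budget(mbit)} files per 10 min, "
              f"{full_copy_days:.1f} days for a full copy")
```

This ignores protocol overhead, compaction traffic and repairs, so real
numbers would be worse.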


Regarding Hadoop, it requires a lot more coding compared to what I need to
get my app working with Cassandra. Also, Hadoop's single point of failure
scares me because I have neither the budget nor the time to work on a
fail-safe solution.

Once again, thanks for your feedback

Marco



On 14 November 2011 09:43, Radim Kolar <h...@sendmail.cz> wrote:

>
>> Well to be honest I was thinking of using that connection in production,
>> not for a backup node.
>
> For production, there are several problems. The added network latency is
> inconsistent and varies greatly during the day; sometimes you will face
> network lags which will break the cluster for a while (about 1-2 minutes).
> Network bandwidth is also a problem, especially during peak hours. It
> might not be a problem if you don't have an interactive workload: an app
> can wait, a human can't. Be sure to use connection pooling to different
> servers at the client. Over a WAN, available bandwidth can differ by about
> 4:1 between peak hours and night hours, so you need to schedule
> anti-entropy repairs at night.
>
>
>> My Cassandra deployment works just like an expensive file caching and
>> replication layer - I mean, all I use it for is to replicate some
>> 5 million files of 2M each across a few nodes and intensively read/write.
>
> For mass replication of large files Hadoop is really better than
> Cassandra because there are no compactions.
>
>
>> Not only the files themselves but I also need to attach some tags to each
>> file (see them as key=value), so I thought of Hadoop but in the end
>> settled for Cassandra because of better consistency, community support,
>> no single point of failure and some!
>
> Hadoop is far better than Cassandra for batch processing if your batch
> processing changes the majority of the data set. The SPOF is not a problem,
> but it is way harder to write optimised applications for Hadoop; it's
> kind of low level.
>
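
The advice above to run anti-entropy repairs at night could be scripted
roughly like this. It is only a sketch: the night window, host and keyspace
names are hypothetical, and it assumes `nodetool` is on the PATH of the
machine running it. By default it only returns the command it would run.

```python
import subprocess
from datetime import datetime, time

NIGHT_START = time(1, 0)  # hypothetical off-peak window start
NIGHT_END = time(5, 0)    # hypothetical off-peak window end


def in_window(now, start=NIGHT_START, end=NIGHT_END):
    """True if `now` falls in [start, end); handles windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def repair_command(host, keyspace):
    """Build the nodetool invocation for repairing one keyspace on one node."""
    return ["nodetool", "-h", host, "repair", keyspace]


def maybe_repair(host, keyspace, now=None, dry_run=True):
    """Run (or, with dry_run, just return) the repair command if it is night."""
    if now is None:
        now = datetime.now().time()
    if not in_window(now):
        return None
    cmd = repair_command(host, keyspace)
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd


if __name__ == "__main__":
    # "10.0.0.1" and "files" are placeholder values.
    print(maybe_repair("10.0.0.1", "files"))
```

In practice a plain cron entry per node, staggered so that nodes don't all
repair at once, would do the same job.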
