Hello Eric,

You make a good point about resiliency being applied at a higher level in
the stack.

Thanks

Jabbar Azam

On 8 November 2014 14:24, Eric Stevens <migh...@gmail.com> wrote:

> > They don't use RAID 10 on the node, and they don't use dual power
> > either, because it's not cheap in a cluster of many nodes
>
> I think the point here is that money spent on traditional failure
> avoidance models is better spent in a Cassandra cluster by instead having
> more nodes of less expensive hardware. Rather than redundant disks,
> network ports, and power supplies, spend that money on another set of
> nodes in a different topological (and probably physical) rack. The
> parallel to having redundant disk arrays is to increase the replication
> factor (RF=3 is already one replica better than RAID 10, and with fewer
> SPOFs).
>
> The only reason I can think you'd want to double down on hardware failover
> like the traditional model is if you are constrained in your data center
> (e.g., space or cooling) and you'd rather run machines which are
> individually more physically resilient in exchange for running a lower RF.
>
> On Sat Nov 08 2014 at 5:32:22 AM Plotnik, Alexey <aplot...@rhonda.ru>
> wrote:
>
>>  Let me speak from my heart. I maintain a 200+ TB Cassandra cluster. The
>> problem is money. If your IT people have the $$$, they can of course
>> deploy Cassandra on super-robust hardware with triple power supplies. But
>> why would you need Cassandra then? Only for scalability?
>>
>> The idea of highly available clusters is to get robustness from
>> availability, not from hardware reliability. The more availability (more
>> nodes) you have, the more money you need to buy hardware. Cassandra is
>> the most highly available system on the planet - it scales horizontally
>> to any number of nodes. You have time series data, so you can set the
>> replication factor > 3 if needed.
>>
>> There is a concept of network topology in Cassandra - you can specify
>> which *failure domain* (rack or independent power line) each node is
>> installed in, and replication will then be computed accordingly so that
>> replicas of a given piece of data are stored in different failure
>> domains. The same goes for DCs - there is a concept of data center in
>> Cassandra's topology, so it knows about your data centers.
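>>
>> As a sketch (the keyspace and data center names below are made-up
>> examples, not from this thread), a keyspace replicated three ways in
>> each of two data centers looks like this in CQL:
>>
>>     CREATE KEYSPACE example_ks
>>       WITH replication = {
>>         'class': 'NetworkTopologyStrategy',
>>         'DC1': 3,
>>         'DC2': 3
>>       };
>>
>> With NetworkTopologyStrategy, the snitch's rack information is used to
>> place those replicas on different racks (failure domains) where possible.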
>>
>> You should think not about hardware but about your data model - is
>> Cassandra applicable to your domain? Think about the queries against
>> your data. Cassandra is actually a key-value store (the documentation
>> says it's a column-based store, but that's just a CQL abstraction over a
>> key and binary values, nothing special except counters), so be very
>> careful in designing your data model.
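>>
>> For time series data, for example, a common shape is one partition per
>> source with readings clustered by time (the table and column names here
>> are hypothetical):
>>
>>     CREATE TABLE example_ks.readings (
>>       sensor_id text,
>>       ts        timestamp,
>>       value     blob,
>>       PRIMARY KEY (sensor_id, ts)
>>     );
>>
>> Under the hood that is still one key (sensor_id) mapping to sorted
>> cells, which is why the queries have to be known up front.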
>>
>> Anyway, let me answer your original question:
>> > what do people use in the real world in terms of node resiliency when
>> running a Cassandra cluster?
>>
>> Nothing, because Cassandra is a highly available system. They use SSDs
>> if they need speed. They don't use RAID 10 on the node, and they don't
>> use dual power either, because it's not cheap in a cluster of many nodes
>> and makes no sense: reliability is ensured by replication in large
>> clusters. I'm not sure about dual NICs; network reliability is ensured
>> by distributing your cluster across multiple data centers.
>>
>> We're using a single SSD and a single HDD on each node (we symlink some
>> CF folders to the other disk): the SSD for CFs where we need low
>> latency, the HDD for binary data. If one of them fails, replication
>> saves us, and we have time to deploy a new node and load the data back
>> from replicas with Cassandra's repair feature. We have no problem with
>> it - nodes fail sometimes, but it doesn't affect customers. That's it.
>>
>>
>> ------ Original Message ------
>> From: "Jabbar Azam" <aja...@gmail.com>
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Sent: 08.11.2014 19:43:18
>> Subject: Re: Redundancy inside a cassandra node
>>
>>
>> Hello Alexey,
>>
>> The node count is 20 per site and there will be two sites. RF=3. But
>> since the software isn't complete and the database code is going through
>> a rewrite, we aren't sure about space requirements. The node count is
>> only a guess, based on the number of dev nodes in use. We will have
>> better information when the rewrite is done and testing resumes.
>>
>> The data will be time series data. It was binary blobs originally, but
>> we have found that the new DataStax C# drivers have improved a lot in
>> terms of read performance.
>>
>> I'm curious: what is your definition of commodity? My IT people seem to
>> think that the servers must be super robust. Personally, I'm not sure
>> that should be the case.
>>
>> The node
>>
>>  Thanks
>>
>> Jabbar Azam
>>
>> On 8 November 2014 02:56, Plotnik, Alexey <aplot...@rhonda.ru> wrote:
>>
>>> Cassandra is a cluster itself; it's not necessary to make each node
>>> redundant. Cassandra has replication for that. Cassandra is also
>>> designed to run in multiple data centers - I think that redundancy
>>> policy is applicable for you. The only thing from your list worth
>>> deploying is RAID 10; the others don't make any sense. As you are at
>>> the stage of designing your cluster, please provide some numbers: how
>>> much data will be stored on each node, and how many nodes would you
>>> have? What type of data will be stored in the cluster: binary objects
>>> or something time series?
>>>
>>> Cassandra is designed to run on commodity hardware.
>>>
>>> Sent from my iPad
>>>
>>> > On 8 Nov 2014, at 6:26, Jabbar Azam <aja...@gmail.com> wrote:
>>>  >
>>> > Hello all,
>>> >
>>> > My work will be deploying a Cassandra cluster next year. Due to
>>> internal wrangling we can't seem to agree on the hardware. The software
>>> hasn't been finished, but management are asking for a ballpark figure
>>> for the hardware costs.
>>> >
>>> > The problem is the IT team are saying the nodes need to have multiple
>>> points of redundancy,
>>> >
>>> > e.g. dual power supplies, dual NICs, and SSDs configured in RAID 10.
>>> >
>>> >
>>> > The software team is saying that, due to Cassandra's resilient
>>> nature, the way data is distributed, and its scalability, lots of cheap
>>> boxes should be used. So they have been talking about self-build
>>> consumer-grade boxes with single NICs, single PSUs, single SSDs, etc.
>>> >
>>> > Obviously the self-build boxes will cost a fraction of the price, but
>>> each box is not as resilient as the first option.
>>> >
>>> > We don't use any cloud technologies, so that's out of the question.
>>> >
>>> > My question is: what do people use in the real world in terms of node
>>> resiliency when running a Cassandra cluster?
>>> >
>>> > Right now the team is only thinking of hosting Cassandra on the
>>> nodes. I'll see if I can twist their arms and get them to see the light
>>> with Apache Spark.
>>> >
>>> > Obviously there are other tiers of servers, but they won't be running
>>> Cassandra.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > Thanks
>>> >
>>> > Jabbar Azam
>>>
>>
>>
