Hello Eric, You make a good point about resiliency being applied at a higher level in the stack.
Thanks

Jabbar Azam

On 8 November 2014 14:24, Eric Stevens <migh...@gmail.com> wrote:

> > They do not use Raid10 on the node, they don't use dual power as well,
> > because it's not cheap in cluster of many nodes
>
> I think the point here is that money spent on traditional failure
> avoidance models is better spent in a Cassandra cluster by instead having
> more nodes of less expensive hardware. Rather than redundant disks, network
> ports, and power supplies, spend that money on another set of nodes in a
> different topological (and probably physical) rack. The parallel to
> having redundant disk arrays is to increase the replication factor (RF=3 is
> already one replica better than RAID 10, and with fewer SPOFs).
>
> The only reason I can think you'd want to double down on hardware failover
> like the traditional model is if you are constrained in your data center
> (e.g., space or cooling) and you'd rather run machines which are individually
> physically more resilient in exchange for running a lower RF.
>
> On Sat Nov 08 2014 at 5:32:22 AM Plotnik, Alexey <aplot...@rhonda.ru>
> wrote:
>
>> Let me speak from my heart. I maintain a 200+ TB Cassandra cluster. The
>> problem is money. If your IT people have the money, they can of course
>> deploy Cassandra on super robust hardware with triple power supplies. But
>> then why do you need Cassandra? Only for scalability?
>>
>> The idea of highly available clusters is to get robustness from
>> availability (not from hardware reliability). The more availability (more
>> nodes) you have, the more money you need to buy hardware. Cassandra is the
>> most highly available system on the planet: it scales horizontally to any
>> number of nodes. You have time series data, and you can set a replication
>> factor > 3 if needed.
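
The comparison above between RAID 10 and RF=3 can be put into rough numbers. This is only a minimal sketch under loud assumptions: every copy of the data fails independently with the same probability, RAID 10 is treated as two mirrored copies, and the value of `P_FAIL` is hypothetical, not a measurement from any cluster in this thread. Copies inside one chassis share a power supply and a controller, so the independence assumption flatters RAID 10, which is the "fewer SPOFs" point.

```python
def all_copies_lost(p_fail: float, copies: int) -> float:
    """Probability that every copy is unavailable at the same time,
    assuming each copy fails independently with probability p_fail."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_fail ** copies


# Hypothetical per-copy failure probability over some time window.
P_FAIL = 0.01

raid10_loss = all_copies_lost(P_FAIL, copies=2)  # two mirrored disks in one node
rf3_loss = all_copies_lost(P_FAIL, copies=3)     # three replicas on separate nodes

print(f"two copies (RAID 10 mirror): {raid10_loss:.2e}")
print(f"three copies (RF=3):         {rf3_loss:.2e}")
```

Under these assumptions each extra copy multiplies the chance of losing everything by `P_FAIL`, which is why adding a replica on another node can buy more than hardening a single node.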
>>
>> There is a concept of network topology in Cassandra: you can specify
>> which *failure domain* (rack or independent power line) each node is
>> installed in, and replica placement is then computed so that replicas of
>> a given piece of data are stored in different failure domains. The same
>> applies to data centers: Cassandra's topology has a concept of data
>> center, and it knows about your data centers.
>>
>> You should think not about hardware but about your data model: is
>> Cassandra applicable to your domain? Think about the queries you will run
>> against your data. Cassandra is actually a key-value store (the
>> documentation says it's a column-based store, but that is just a CQL
>> abstraction over a key and a binary value, nothing special except
>> counters), so be very careful in designing your data model.
>>
>> Anyway, let me answer your original question:
>> > what do people use in the real world in terms of node resiliency when
>> > running a cassandra cluster?
>>
>> Nothing, because Cassandra is a highly available system. They use SSDs if
>> they need speed. They do not use RAID 10 on the node, and they don't use
>> dual power supplies either, because it's not cheap in a cluster of many
>> nodes and makes no sense when reliability is ensured by replication in
>> large clusters. I'm not sure about dual NICs; network reliability is
>> ensured by distributing your cluster across multiple data centers.
>>
>> We're using a single SSD and a single HDD on each node (we symlink some CF
>> folders to the other disk): SSD for CFs where we need low latency, HDD for
>> binary data. If one of them fails, replication saves us and we have
>> time to deploy a new node and load the data back onto it from the replicas
>> with Cassandra's repair feature. We have no problem with it; nodes fail
>> sometimes, but it doesn't affect customers. That's all.
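
The topology-aware replication described above is configured per keyspace. A minimal sketch follows, assuming the two-site, RF=3 layout mentioned elsewhere in this thread: it only builds the CQL text for a keyspace that uses `NetworkTopologyStrategy` and does not connect to a cluster. The keyspace name `timeseries` and the data center names `site_a` and `site_b` are hypothetical; real data center names must match whatever the cluster's snitch reports, and rack placement also comes from the snitch configuration rather than from this statement.

```python
from typing import Mapping


def keyspace_cql(name: str, rf_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement with one replication factor per data center."""
    if not rf_per_dc:
        raise ValueError("at least one data center is required")
    # Sort by data center name so the generated statement is deterministic.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in sorted(rf_per_dc.items()))
    replication = "{'class': 'NetworkTopologyStrategy', " + options + "}"
    return f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = {replication};"


# Hypothetical names: two sites, three replicas in each.
statement = keyspace_cql("timeseries", {"site_a": 3, "site_b": 3})
print(statement)
```

The resulting statement could be run from cqlsh or passed to a driver session. With three replicas per site, a node, and with rack-aware placement a whole rack, can be lost in one site while the data is still served locally.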
>>
>>
>> ------ Original Message ------
>> From: "Jabbar Azam" <aja...@gmail.com>
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Sent: 08.11.2014 19:43:18
>> Subject: Re: Redundancy inside a cassandra node
>>
>> Hello Alexey,
>>
>> The node count is 20 per site and there will be two sites. RF=3. But
>> since the software isn't complete and the database code is going through a
>> rewrite, we aren't sure about space requirements. The node count is only a
>> guess, based on the number of dev nodes in use. We will have better
>> information when the rewrite is done and testing resumes.
>>
>> The data will be time series data. It was binary blobs originally, but we
>> have found that the new DataStax C# drivers have improved a lot in terms
>> of read performance.
>>
>> I'm curious: what is your definition of commodity? My IT people seem to
>> think that the servers must be super robust. Personally I'm not sure if
>> that should be the case.
>>
>> The node
>>
>> Thanks
>>
>> Jabbar Azam
>>
>> On 8 November 2014 02:56, Plotnik, Alexey <aplot...@rhonda.ru> wrote:
>>
>>> Cassandra is a cluster itself; it's not necessary to make each node
>>> redundant. Cassandra has replication for that. Also, Cassandra is
>>> designed to run in multiple data centers; I think that redundancy policy
>>> is applicable to you. The only thing from your list that you could
>>> deploy is RAID 10; the others don't make any sense. As you are at the
>>> stage of designing your cluster, please provide some numbers: how much
>>> data will be stored on each node, and how many nodes will you have? What
>>> type of data will be stored in the cluster: binary objects or something
>>> like time series?
>>>
>>> Cassandra is designed to run on commodity hardware.
>>>
>>> Sent from iPad
>>>
>>> > On 8 Nov 2014, at 6:26, Jabbar Azam <aja...@gmail.com> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > My work will be deploying a cassandra cluster next year. Due to
>>> > internal wrangling we can't seem to agree on the hardware.
>>> > The software hasn't been finished, but management are asking for a
>>> > ballpark figure for the hardware costs.
>>> >
>>> > The problem is the IT team are saying the nodes need to have multiple
>>> > points of redundancy,
>>> >
>>> > e.g. dual power supplies, dual NICs, SSDs configured in RAID 10.
>>> >
>>> > The software team is saying that, because of Cassandra's resilient
>>> > nature, the way data is distributed, and its scalability, lots of cheap
>>> > boxes should be used. So they have been talking about self-built
>>> > consumer-grade boxes with single NICs, single PSUs, single SSDs, etc.
>>> >
>>> > Obviously the self-built boxes will cost a fraction of the price, but
>>> > each box is not as resilient as the first option.
>>> >
>>> > We don't use any cloud technologies, so that's out of the question.
>>> >
>>> > My question is: what do people use in the real world in terms of node
>>> > resiliency when running a Cassandra cluster?
>>> >
>>> > Right now the team is only thinking of hosting Cassandra on the nodes.
>>> > I'll see if I can twist their arms and get them to see the light with
>>> > Apache Spark.
>>> >
>>> > Obviously there are other tiers of servers, but they won't be running
>>> > Cassandra.
>>> >
>>> > Thanks
>>> >
>>> > Jabbar Azam