Re: Node Size

2021-01-20 Thread Joe Obernberger
Does anyone know where I could find more information on this? Thanks! -Joe On 1/13/2021 8:42 AM, Joe Obernberger wrote: Reading the documentation on Cassandra 3.x, there are recommendations that node size should be ~1TByte of data.  Modern servers can have 24 SSDs, each at 2TBytes in size for dat

Re: Node Size

2021-01-20 Thread Yakir Gibraltar
It is possible to use large nodes and it will work; the problems with large nodes will be: - Maintenance like join/remove nodes will take more time. - Larger heap - etc. On Wed, Jan 20, 2021 at 3:54 PM Joe Obernberger < joseph.obernber...@gmail.com> wrote: > Anyone know where I could find ou

RE: Node Size

2021-01-20 Thread Durity, Sean R
Yakir is correct. While it is feasible to have large disk nodes, the practical aspect of managing them is an issue. With the current technology, I do not build nodes with more than about 3.5 TB of disk available. I prefer 1-2 TB, but costs/number of nodes can change the considerations. Putting

Re: Node Size

2021-01-20 Thread Joe Obernberger
Thank you Sean and Yakir.  Is 4.x the same? So if you were to build a 1PByte system, you would want 512-1024 nodes?  Doesn't seem space efficient vs say 48TByte nodes where you would need ~21 machines. What would you do to build a 1PByte configuration?  I know there are a lot of - it depends -
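
The node counts in the message above can be checked with a short calculation. This is only a sketch: it assumes 1 PByte = 1024 TBytes (the convention that matches the 512-1024 figure), and the function name and the replication_factor parameter are hypothetical, added for illustration. The message counts raw capacity only, so replication defaults to 1 here. Rounding up gives 22 machines at 48 TBytes per node, where the message rounds 21.3 to ~21.

```python
import math

TB_PER_PB = 1024  # assumption: binary convention, which matches the 512-1024 node figure


def nodes_needed(total_tb, per_node_tb, replication_factor=1):
    """Smallest node count whose combined capacity holds the data.

    replication_factor is a hypothetical knob for illustration; the
    original message does not mention replication in its numbers.
    """
    return math.ceil(total_tb * replication_factor / per_node_tb)


if __name__ == "__main__":
    # Per-node sizes discussed in the thread: 1-2 TB and 48 TB.
    for per_node_tb in (1, 2, 48):
        count = nodes_needed(TB_PER_PB, per_node_tb)
        print(f"{per_node_tb} TB per node: {count} nodes")
```

Passing replication_factor=3 multiplies the stored volume before dividing, which shows how quickly the count grows once copies of the data are included.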

Re: Node Size

2021-01-20 Thread Jeff Jirsa
Not going to give a number other than to say that 1TB/instance is probably super super super conservative in 2021. The modern number is likely considerably higher. But let's look at this from first principles. There are basically two things to worry about here: 1) Can you get enough CPU/memory to su

RE: Node Size

2021-01-20 Thread Durity, Sean R
This is a great way to think through the problem and solution. I will add that part of my calculation on failure time is how long does it take to actually replace a drive and/or a server with (however many) drives? We pay for very fast vendor SLAs. However, in reality, there has been quite a bit

Re: Node Size

2021-01-20 Thread Joe Obernberger
This is great information - thank you! I'm coming from HDFS+HBase, lots of nodes, nodes with many spindles.  When a drive fails in this environment (which happens a lot with 16-24 drives per node), HDFS removes that one failed volume and then maintains the 3x replication with the rest of the c