(Yes, just somewhat less likely to be the same order of speed-up in STCS,
where sstables are more likely to cross token boundaries, modulo some stuff
around sstable splitting at token ranges à la CASSANDRA-6696.)
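For reference, the zero copy / "faster streaming" being discussed here is the entire-sstable streaming path in 4.0+. A minimal way to check it is enabled and to adjust the throughput cap, assuming a package-style install with the config under /etc/cassandra (path and the 800 Mb/s figure are just examples):

grep stream_entire_sstables /etc/cassandra/cassandra.yaml   # expect: stream_entire_sstables: true
nodetool getstreamthroughput                                # current cap, in megabits per second
nodetool setstreamthroughput 800                            # raise the cap if the NICs and disks can take it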
On Mon, Aug 21, 2023 at 11:35 AM Dinesh Joshi wrote:
> Minor correction, zero copy streaming aka faster streaming also works for STCS.
Minor correction, zero copy streaming aka faster streaming also works for STCS.

Dinesh

On Aug 21, 2023, at 8:01 AM, Jeff Jirsa wrote:
> There's a lot of questionable advice scattered in this thread. Set aside
> most of the guidance like 2TB/node; it's old and super nuanced. If you're
> bare metal, do what ...
- k8s
1. Depending on the version and networking, number of containers per
node, node pooling, etc., you can expect to see 1-2% additional storage IO
latency (depending on whether everything shares one network vs. a separate
storage IO TCP network); a quick fio comparison is sketched after this list.
2. System overhead may be 3-15% depending ...
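A rough way to put a number on that storage IO overhead is to run the same fio profile on the bare host and again inside the pod against the Cassandra data volume, then compare completion-latency percentiles. The path, size, and runtime below are assumptions:

fio --name=randread --filename=/var/lib/cassandra/fio-test \
    --rw=randread --bs=4k --iodepth=32 --direct=1 --ioengine=libaio \
    --size=2G --runtime=60 --time_based --group_reporting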
...and a shameless plug for the Cassandra Summit in December. We have a
talk from somebody who is doing 70 TB per node and will be digging into all
the aspects that make that work for them. I hope everyone in this thread is
at that talk! I can't wait to hear all the questions.
Patrick
On Mon, Aug
There's a lot of questionable advice scattered in this thread. Set aside
most of the guidance like 2TB/node; it's old and super nuanced.
If you're bare metal, do what your organization is good at. If you have
millions of dollars in SAN equipment and you know how SANs work and fail
and get backed up ...
For our scenario, the goal is to minimize downtime for a single (at
least initially) data center system. Data loss is basically
unacceptable. I wouldn't say we have a "rusty slow data center" - we
can certainly use SSDs and have servers connected via 10G copper to a
fast back-plane. For our
... , even in a single data
center scenario. Otherwise, there are other data options.
Sean R. Durity
DB Solutions
Staff Systems Engineer – Cassandra
From: daemeon reiydelle
Sent: Thursday, August 17, 2023 7:38 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Big Data Question
I started to respond, then realized I and the other OP posters are not
thinking the same: what is the business case for availability and data
loss/reload/recoverability? You all argue for higher availability and damn
the cost. But no one asked "can you lose access, for 20 minutes, to a
portion of the data ...
I was assuming Reaper did incremental repair? That was probably a bad assumption.
nodetool repair -pr
I know it well now!
:)
-Joe
On 8/17/2023 4:47 PM, Bowen Song via user wrote:
I don't have experience with Cassandra on Kubernetes, so I can't
comment on that.
For repairs, may I interest you in incremental repairs? ...
I don't have experience with Cassandra on Kubernetes, so I can't comment
on that.
For repairs, may I interest you in incremental repairs? They will make
repairs a hell of a lot faster. Of course, an occasional full repair is
still needed, but that's another story.
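For anyone who wants the concrete commands, a minimal sketch (the keyspace name is a placeholder; on recent versions, repair without --full is incremental):

nodetool repair -pr my_keyspace          # incremental repair, primary ranges only
nodetool repair -pr --full my_keyspace   # the occasional full repair mentioned above

Run the -pr variant on every node in turn so each token range is repaired exactly once per cycle.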
On 17/08/2023 21:36, Joe Obernberger wrote:
Thank you. Enjoying this conversation.
Agree on blade servers, where each blade has a small number of SSDs.
Yeh/nah to a Kubernetes approach assuming fast persistent storage? I
think that might be easier to manage.
In my current benchmarks, the performance is excellent, but the repairs
are ...
From my experience, that's not entirely true. For large nodes, the
bottleneck is usually the JVM garbage collector. The GC pauses can
easily get out of control on very large heaps, and long STW pauses may
also result in nodes flapping up and down from other nodes' perspective,
which often renders ...
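To make that concrete, the usual levers are capping the heap well below the node's RAM and bounding GC pause targets. A minimal sketch; the file path and the exact values below are assumptions, not recommendations:

# Example heap/GC settings for the 4.x JVM options file:
cat >> /etc/cassandra/jvm11-server.options <<'EOF'
-Xms24G
-Xmx24G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=300
EOF
# After a restart, watch pause totals and counts over time:
nodetool gcstats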
A lot of these (actually all) seem to be based on local nodes with 1 Gb
networks of spinning rust. Much of what is mentioned below is TOTALLY wrong
for cloud. So clarify whether you are "real world" or rusty slow data center
world (definitely not modern DC either).
E.g. should not handle more than 2 TB of ...
The optimal node size largely depends on the table schema and read/write
pattern. In some cases 500 GB per node is too large, but in some other
cases 10TB per node works totally fine. It's hard to estimate that
without benchmarking.
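One cheap way to get those schema- and workload-specific numbers before committing to a node size is cassandra-stress. A minimal sketch; the contact point, thread count, duration, and write:read ratio are made-up assumptions:

cassandra-stress write n=1000000 -rate threads=64 -node 10.0.0.1
cassandra-stress mixed ratio\(write=1,read=3\) duration=30m -rate threads=64 -node 10.0.0.1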
Again, just pointing out the obvious, you did not count the o...
Sean R. Durity
From: Joe Obernberger
Sent: Thursday, August 17, 2023 10:46 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Big Data Question
Thanks for this - yeah - duh - forgot about replication in my example! So - is
2TBytes per Cassandra instance advisable? Better to
A few thoughts on this:
– 80TB per machine is pretty dense. Consider the amount of data you'd need
to re-replicate in the event of a hardware failure that takes down all 80TB
(DIMM failure requiring replacement, non-redundant PSU failure, NIC, etc.);
rough numbers are sketched after this list.
– 24GB of heap is also pretty generous. Depending ...
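To put a rough number on the re-replication point: assuming the replacement node can sustain about 500 MB/s of aggregate inbound streaming (that throughput is purely an assumption), refilling 80 TB looks like this:

echo "$(( 80 * 1000 * 1000 / 500 / 3600 )) hours"   # 80 TB in MB, at 500 MB/s -> ~44 hours

That is nearly two days of streaming for a single chassis replacement.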
Thanks for this - yeah - duh - forgot about replication in my example!
So - is 2 TBytes per Cassandra instance advisable? Better to use
more/less? Modern 2U servers can be had with 24 3.8-TByte SSDs; so
assuming 80 TBytes per server, you could do:
(1024*3)/80 = 39 servers, but you'd have to run 40 ...
Just pointing out the obvious: for 1PB of data on nodes with a 2TB disk
each, you will need far more than 500 nodes.
1, it is unwise to run Cassandra with replication factor 1. It usually
makes sense to use RF=3, so 1PB of data will cost 3PB of storage space,
a minimum of 1500 such nodes (the arithmetic is sketched below).
2, depend ...
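The arithmetic behind that, noting that the 60% disk-utilisation figure in the second line is purely an assumption for compaction and repair headroom:

echo $(( 1024 * 3 / 2 ))              # 1 PB (=1024 TB) x RF=3 / 2 TB per node = 1536 nodes
echo $(( 1024 * 3 * 100 / 2 / 60 ))   # at ~60% target disk utilisation: ~2560 nodes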
A lot of things depend on the actual cluster config - compaction settings (LCS
vs STCS vs TWCS) and token allocation (single token, vnodes, etc.) matter a
ton.
With 4.0 and LCS, streaming for replacement is MUCH faster, so much so that
most people should be fine with 4-8TB/node, because the rebuild time ...
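For completeness, compaction strategy is set per table and token allocation per node. A minimal sketch; the keyspace/table names and the yaml path are placeholders:

cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};"
grep -E 'num_tokens|allocate_tokens_for' /etc/cassandra/cassandra.yaml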