Most of the issues around big nodes are related to streaming, which is currently quite slow (it should be a bit better in 4.0). HBase is built on top of Hadoop, which is much better suited to large files/very dense nodes, and tends to be quite average for transactional data. ScyllaDB I don't know about; I'd assume they just sorted out streaming by learning from C*'s mistakes.
On 29 August 2018 at 19:43, onmstester onmstester <onmstes...@zoho.com> wrote:

> Thanks Kurt,
> Actually my cluster has > 10 nodes, so there is a tiny chance to stream a
> complete SSTable.
> While logically any columnar NoSQL db like Cassandra always needs to
> re-sort grouped data for later fast reads, and having nodes with a big
> amount of data (> 2 TB) would be annoying for this background process,
> how is it possible that some of these databases, like HBase and ScyllaDB,
> do not emphasise small nodes (like Cassandra does)?
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
> ============ Forwarded message ============
> From : kurt greaves <k...@instaclustr.com>
> To : "User" <user@cassandra.apache.org>
> Date : Wed, 29 Aug 2018 12:03:47 +0430
> Subject : Re: bigger data density with Cassandra 4.0?
> ============ Forwarded message ============
>
> My reasoning was if you have a small cluster with vnodes you're more
> likely to have enough overlap between nodes that whole SSTables will be
> streamed on major ops. As N gets > RF you'll have less common ranges and
> thus less likely to be streaming complete SSTables. Correct me if I've
> misunderstood.
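To make the range-overlap argument concrete, here's a toy simulation (my own sketch, not Cassandra's actual placement code — it mimics SimpleStrategy-style replica selection with random vnode tokens, whereas real clusters use Murmur3 hashing and usually NetworkTopologyStrategy). It counts how many distinct peers share at least one token range with a given node: when N is close to RF a node shares every range with the same few peers (so whole SSTables overlap), and as N grows past RF the shared ranges spread across many more peers.

```python
import random

def replica_peers(num_nodes, rf=3, vnodes=16, seed=42):
    """Place vnodes randomly on a unit ring, replicate each token range
    to the next RF distinct nodes clockwise, and return the set of nodes
    that share at least one replica set with node 0."""
    rng = random.Random(seed)
    # each node contributes `vnodes` random tokens to the ring
    ring = sorted((rng.random(), n) for n in range(num_nodes) for _ in range(vnodes))
    owners = [node for _, node in ring]
    peers = set()
    for i in range(len(owners)):
        # walk clockwise from this token, collecting RF distinct nodes
        replica_set, j = [], i
        while len(replica_set) < rf:
            node = owners[j % len(owners)]
            if node not in replica_set:
                replica_set.append(node)
            j += 1
        if 0 in replica_set:
            peers.update(replica_set)
    peers.discard(0)
    return peers

# N close to RF: node 0 can only overlap with the 3 other nodes,
# so its ranges (and hence whole SSTables) are common with each of them.
small = replica_peers(num_nodes=4)
# N >> RF: node 0's ranges are scattered across many more peers,
# so any one peer shares only a sliver of its data.
large = replica_peers(num_nodes=12)
```

With the small cluster `small` contains all three other nodes, while `large` contains far more peers, each sharing proportionally less — which is why whole-SSTable streaming gets unlikely as N grows past RF.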