Re: Large partitions

Jonathan Haddad Thu, 13 Sep 2018 10:13:35 -0700

It depends on a number of factors, such as compaction strategy and read
patterns.  I recommend sticking to the 100MB per partition limit (and I aim
for significantly less than that).

If you're doing time series with TWCS & TTL'ed data and small enough
windows, and you're only querying for a small subset of the data, sure, you
could do it.  Outside of that, I don't see a reason why you'd want to.  I
wrote a blog post a while back on how to scale time series workloads in
Cassandra; it might be worth a read:
http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-massive-scale.html
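
To make the 100MB guideline concrete, here is a rough sketch of the
arithmetic for a time series table.  It assumes the partition key includes a
time bucket, which is a common approach but is not spelled out in this
email.  The function names, the candidate bucket sizes, and the example
write rate and row size are all hypothetical; the estimate also ignores
compression and per-row storage overhead.

```python
from datetime import datetime, timezone

# The 100MB-per-partition rule of thumb discussed in this thread.
TARGET_PARTITION_BYTES = 100 * 1024 * 1024


def estimated_partition_bytes(rows_per_second, avg_row_bytes, bucket_seconds):
    """Rough uncompressed size of one partition covering one time bucket."""
    return rows_per_second * avg_row_bytes * bucket_seconds


def largest_bucket_seconds(rows_per_second, avg_row_bytes,
                           candidates=(3600, 86400, 604800),
                           target_bytes=TARGET_PARTITION_BYTES):
    """Pick the largest candidate bucket whose estimated size fits the target.

    Returns None if even the smallest candidate is too large.
    """
    fitting = [
        c for c in candidates
        if estimated_partition_bytes(rows_per_second, avg_row_bytes, c) <= target_bytes
    ]
    return max(fitting) if fitting else None


def bucket_for(ts, bucket_seconds):
    """Start of the bucket containing ts, in epoch seconds.

    This value would be stored as part of the partition key
    (hypothetical schema, e.g. PRIMARY KEY ((sensor_id, bucket), ts)).
    """
    epoch = int(ts.timestamp())
    return epoch - (epoch % bucket_seconds)


if __name__ == "__main__":
    # Hypothetical workload: 10 rows/s per series, about 200 bytes per row.
    bucket = largest_bucket_seconds(rows_per_second=10, avg_row_bytes=200)
    print("bucket size (s):", bucket)
    print("bucket start:", bucket_for(datetime.now(timezone.utc), bucket))
```

With these example numbers a daily bucket would be roughly 173MB, so the
sketch falls back to hourly buckets.  Real sizes should be checked against
what the cluster actually reports rather than trusted from an estimate
like this.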

Regarding your write performance, since you're only bound by commit log
performance + memtable inserts, if your writes are slow there's a good
chance you're hitting long GC pauses.  Those *could* be caused by
compaction.  If your compaction throughput is too high you could see high
rates of object allocation which lead to long GC pauses, slowing down your
writes.  There are other things that can cause long GC pauses too; sometimes
you just need some basic tuning.  I recommend reading up on it:
http://thelastpickle.com/blog/2018/04/11/gc-tuning.html

Jon

On Thu, Sep 13, 2018 at 9:47 AM Mun Dega <mundeg...@gmail.com> wrote:

> I disagree.
>
> We had several partitions over 150MB in 3.11 and we were able to break the
> cluster by doing reads and writes against these partitions in a short
> period of time.
>
> On Thu, Sep 13, 2018, 12:42 Gedeon Kamga <gka...@gmail.com> wrote:
>
>> Folks,
>>
>> Based on the information found here
>> https://docs.datastax.com/en/dse-planning/doc/planning/planningPartitionSize.html ,
>> the recommended limit for a partition size is 100MB. Even though DataStax
>> clearly states that this is a rule of thumb, some team members are claiming
>> that our Cassandra *write* performance is very slow because the partitions
>> on some tables are over 100MB. I know for a fact that this rule has changed
>> since 2.2. Starting with Cassandra 2.2, the new rule of thumb for partition
>> size is *a few hundred MB*, given the improvements to the architecture.
>> Now, I am unable to find the reference (maybe I got it at a Cassandra
>> training by DataStax). I would like to share it with my team. Did anyone
>> come across this information? If yes, can you please share it?
>>
>> Thanks!
>>
>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade
