There are a few, and there have been various proposals (some in progress) to
deal with them. The two most obvious problems are:

The primary problem for most people is that wide partitions cause JVM heap
pressure on reads (CASSANDRA-11206, CASSANDRA-9754). This is because we
break the wide partitions into 64k chunks for indexing, and then load the
entire index for a partition into memory at once. Your 820MB partition would
then create ~12000 index objects, each with 2 clustering keys (start of
index, end of index). When the read is done, the objects are released, and
the JVM has to clean them up - that's expensive (and can lead to GC
pauses). CASSANDRA-11206 lazily loads these objects for 3.6 and higher;
CASSANDRA-9754 will make the index a b-tree on disk - look for #9754 in the
4.0 era. Also in this category: you can end up with a huge addition to your
key cache that is either immediately invalidated, or evicts a number of
other rows. The key cache is one of the most important caches in Cassandra,
so having a huge row wipe it out is bad.
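
To put a number on the index overhead described above, here is a rough
back-of-the-envelope sketch. It assumes the 64k chunk size means 64 KiB
(65,536 bytes) and reads the reported max of 820000000 as bytes; the function
names are made up for illustration and are not part of Cassandra.

```python
import math

# Assumption: "64k chunks" means 64 KiB index intervals.
CHUNK_BYTES = 64 * 1024


def index_entries(partition_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Estimate how many index objects a partition of this size produces."""
    return math.ceil(partition_bytes / chunk_bytes)


def clustering_keys_held(partition_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Each index object carries two clustering keys (start and end)."""
    return 2 * index_entries(partition_bytes, chunk_bytes)


if __name__ == "__main__":
    for label, size in [("99th percentile", 85_000), ("max", 820_000_000)]:
        print(label, index_entries(size), clustering_keys_held(size))
```

The estimate for the max partition lands in the same ballpark as the ~12000
figure above, while a partition at the 99th percentile needs only a couple of
index entries.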

The second problem is repair, both anti-entropy and read repair. The unit
we use for repair is a partition. If you have huge partitions, when you
repair, you repair the whole partition. Say you've got 820MB of data, but
maybe 100 bytes of difference. For anti-entropy repairs right now, we stream
the whole partition, so nearly all of that 820MB is redundant, and let
compaction clean it up; CASSANDRA-8911 is a proposal to do that more
efficiently. For read repairs: we'll end up reading most of the partition
and sending mutations for the whole thing all at once, which can be a lot
of updates if you're very out of sync.
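
As a rough illustration of the anti-entropy case, the sketch below works out
how much of the streamed data is redundant when the whole partition is sent to
fix a small difference. It is simple arithmetic under the assumptions in this
mail (whole-partition streaming, sizes in bytes) and ignores compression and
any other on-the-wire detail; `repair_overhead` is a hypothetical helper, not
a Cassandra API.

```python
def repair_overhead(partition_bytes: int, differing_bytes: int) -> tuple[int, float]:
    """Return (redundant bytes streamed, amplification factor).

    Assumes the whole partition is streamed regardless of how little differs.
    """
    if differing_bytes <= 0 or differing_bytes > partition_bytes:
        raise ValueError("differing_bytes must be in (0, partition_bytes]")
    redundant = partition_bytes - differing_bytes
    amplification = partition_bytes / differing_bytes
    return redundant, amplification


if __name__ == "__main__":
    redundant, factor = repair_overhead(820_000_000, 100)
    print(f"redundant bytes: {redundant}, amplification: {factor:.0f}x")
```

The same 100-byte difference in an 85000-byte partition streams 850 times the
data that actually differs, rather than millions of times.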

The typical recommendation is to keep partitions around 10-100MB. In your
case, you're at ~800MB. Whether or not that's "too big" is based on your read
latency requirements, read concurrency, and whether or not 800MB is the upper
bound. It may be ok if you're rarely reading it and it doesn't grow. Or it
may be that you're reading it a lot and you need to re-model your data.



On Mon, Dec 11, 2017 at 5:44 AM, Micha <mich...@fantasymail.de> wrote:

> Hi,
>
> What are the effects of large partitions?
>
> I have a few tables which have partitions sizes as:
>
> 95%  24000
> 98%  42000
> 99%  85000
>
> Max  820000000
>
>
> So, should I redesign the schema to get this max smaller or doesn't it
> matter much, since 99% of the partitions are <= 85000 ?
>
> Thanks for answering
>  Michael
>
