There are a few, and there have been various proposals (some in progress) to deal with them. The two most obvious problems are:

The primary problem for most people is that wide partitions cause JVM heap pressure on reads (CASSANDRA-11206, CASSANDRA-9754). This is because we break wide partitions into 64k chunks for indexing, and then load the entire index for a partition into memory at once. Your 820MB partition would then create ~12000 index objects, each with 2 clustering keys (start of index, end of index). When the read is done, the objects are released, and the JVM has to clean them up - that's expensive (and can lead to GC pauses). CASSANDRA-11206 lazily loads these objects for 3.6 and higher, and CASSANDRA-9754 will make it a b-tree on disk - look for #9754 in the 4.0 era.

In this category, you can also end up with a huge addition to your key cache that is either immediately invalidated, or invalidates a number of other rows - key cache is one of the most important caches in Cassandra, so having a huge row wipe it out is bad.

The second problem is repair, both anti-entropy and read repair. The unit we use for repair is a partition. If you have huge partitions, when you repair, you repair the whole partition. You've got 820MB of data, but maybe 100 bytes of difference? For anti-entropy repairs right now: we're streaming the whole 820MB to fix those 100 bytes, and letting compaction clean it up. CASSANDRA-8911 is a proposal to do that more efficiently. For read repairs: we'll end up reading most of the partition and sending mutations for the whole thing all at once, which can be a lot of updates if you're very out of sync.

The typical recommendation is to keep partitions around 10-100MB. In your case, you're at ~800MB. Whether or not that's "too big" depends on your read latency requirements, read concurrency, and whether or not 800MB is the upper bound. It may be ok if you're rarely reading it and it doesn't grow. Or it may be that you're reading it a lot and you need to re-model your data.

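For a rough feel of where the ~12000 figure comes from, here is a back-of-the-envelope sketch. It assumes the sizes in your histogram are bytes and that the index granularity is 64 KiB (the "64k chunks" above; as far as I know this is the default of column_index_size_in_kb in cassandra.yaml). The function name is just for illustration, and the real on-heap cost per entry depends on your clustering key sizes, so treat this as an estimate only.

```python
import math

# Assumption: 64 KiB index granularity, matching the "64k chunks" above.
INDEX_CHUNK_BYTES = 64 * 1024

# Each index entry holds a start and an end clustering key.
CLUSTERING_KEYS_PER_ENTRY = 2


def estimate_index_entries(partition_bytes, chunk_bytes=INDEX_CHUNK_BYTES):
    """Rough number of index entries needed to cover one partition."""
    return math.ceil(partition_bytes / chunk_bytes)


if __name__ == "__main__":
    # Partition sizes taken from the histogram in the question (assumed bytes).
    for label, size in [("99%", 85_000), ("Max", 820_000_000)]:
        entries = estimate_index_entries(size)
        keys = entries * CLUSTERING_KEYS_PER_ENTRY
        print(f"{label}: {size} bytes -> ~{entries} index entries, ~{keys} clustering keys")
```

Under those assumptions the max partition works out to roughly 12.5k index entries (about 25k clustering key objects) materialized per read on versions before 3.6, while the 99th percentile partition needs only a couple.
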
On Mon, Dec 11, 2017 at 5:44 AM, Micha <mich...@fantasymail.de> wrote:

> Hi,
>
> What are the effects of large partitions?
>
> I have a few tables which have partitions sizes as:
>
> 95% 24000
> 98% 42000
> 99% 85000
> Max 820000000
>
> So, should I redesign the schema to get this max smaller or doesn't it
> matter much, since 99% of the partitions are <= 85000 ?
>
> Thanks for answering
> Michael