Nothing magic; the number was just put in there based on experience. You can
find the story behind the original recommendation here:

https://issues.apache.org/jira/browse/CASSANDRA-6487

The key reasoning for the default comes from Patrick McFadin:

"Yes that was in bytes. Just in my own experience, I don't recommend more
than ~100 mutations per batch. Doing some quick math I came up with 5k as
100 x 50 byte mutations.

Totally up for debate."
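
To spell out the arithmetic in that quote, here is a rough sketch in Python.
The helper names are made up for illustration, and treating 1 KB as 1024
bytes when comparing against the threshold is my assumption, not something
stated in the ticket:

```python
# Numbers taken from the quote above; helper names are hypothetical.
MUTATIONS_PER_BATCH = 100  # "~100 mutations per batch"
BYTES_PER_MUTATION = 50    # "50 byte mutations"


def estimated_batch_bytes(mutations: int, bytes_per_mutation: int) -> int:
    """Rough batch size: mutation count times average mutation size."""
    return mutations * bytes_per_mutation


def exceeds_warn_threshold(batch_bytes: int, threshold_kb: int = 5) -> bool:
    """True if a batch of this size is over the warn threshold.

    Assumption: the threshold in KB is converted as 1 KB = 1024 bytes.
    """
    return batch_bytes > threshold_kb * 1024


size = estimated_batch_bytes(MUTATIONS_PER_BATCH, BYTES_PER_MUTATION)
print(size)                          # 5000
print(exceeds_warn_threshold(size))  # False
```

So 100 mutations of about 50 bytes each land right around the 5k default,
which is all the "quick math" amounts to.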

It's totally changeable. However, it's there in no small part because so
many people mistake the BATCH keyword for a performance optimization, and
the warning helps flag those cases of misuse.

On Thu, Dec 11, 2014 at 2:43 PM, Mohammed Guller <moham...@glassbeam.com>
wrote:
>
>   Hi –
>
> The cassandra.yaml file has a property called
> *batch_size_warn_threshold_in_kb*.
>
> The default size is 5kb and according to the comments in the yaml file, it
> is used to log WARN on any batch size exceeding this value in kilobytes. It
> says caution should be taken on increasing the size of this threshold as it
> can lead to node instability.
>
>
>
> Does anybody know the significance of this magic number 5kb? Why would a
> higher number (say 10kb) lead to node instability?
>
>
>
> Mohammed
>


-- 

Ryan Svihla

Solution Architect
