Re: batch_size_warn_threshold_in_kb

2014-12-16 Thread Eric Stevens
tax.com/dev/blog/cassandra-2-1-now-over-50-faster) which massively helps performance. It provides the benefit of batches but without the coordinator overhead. Ca

Re: batch_size_warn_threshold_in_kb

2014-12-15 Thread Jonathan Haddad
have 100 servers, and perform a mutation on 100 partitions, you could have a coordinator that's 1) talking to every machine in the cluster and

Re: batch_size_warn_threshold_in_kb

2014-12-15 Thread Eric Stevens
Jonathan says “It is absolutely not going to help you if you're trying to lump queries together to reduce network & server overhead - in fact it'll do the opposite”, but I w

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jonathan Haddad
trying to lump queries together to reduce network & server overhead - in fact it'll do the opposite”, but I would note that the CQL3 spec says “The BATCH statement ... serves

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jonathan Haddad
ps between the client and the server (and sometimes between the server coordinator and the replicas) when batching multiple updates.” Is the spec inaccurate? I mean, it seems in conflict with your stateme

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Eric Stevens
t with your statement. See: https://cassandra.apache.org/doc/cql3/CQL.html I see the spec as gospel – if it’s not accurate, let’s propose a

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Eric Stevens
-- Jack Krupansky *From:* Jonathan Haddad *Sent:* Friday, December 12, 2014 12:58 PM *To:* user@cassandra.apache.org ; Ryan Svihla *Subj

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Eric Stevens
formance is usually not successful, as described in Using and misusing batches section. For information about the fastest way to load data, see "Cassandra: Batch loading without the Batch keyword."”

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jonathan Haddad
ptimize performance. Using batches to optimize performance is usually not successful, as described in Using and misusing batches section. For information about the fastest way to load data, see "Cassandra: Batch loading without th

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Eric Stevens
atch”, which is simply a way to collect “batches” of operations in the client/driver and then let the driver determine what degree of batching and asynchronous operation is appropriate. It might also be nice t

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
o hit a lot of problems if you use them excessively (timeouts / failures). tl;dr: you probably don't want batch, you most likely want many async calls. On Thu Dec 11 2014 at 11:15:00 PM Mohammed Guller <
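
A minimal sketch of the "many async calls" approach from the tl;dr above, in Python. The names `write_row` and `write_all` are hypothetical, and the thread pool only stands in for a driver's asynchronous execute call; no real Cassandra driver API is used here. The point it illustrates is that each row is sent as its own small request, with a cap on how many are in flight, instead of being lumped into one large batch.

```python
from concurrent.futures import ThreadPoolExecutor


def write_row(row):
    """Hypothetical stand-in for one single-partition write.

    A real client would send one INSERT, UPDATE, or DELETE here.
    """
    return row["id"]


def write_all(rows, max_in_flight=32):
    """Issue one independent write per row, at most max_in_flight at a time."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() preserves input order and re-raises a failed write's
        # exception when its result is consumed.
        return list(pool.map(write_row, rows))


rows = [{"id": i, "value": f"v{i}"} for i in range(100)]
written = write_all(rows)
```

The `max_in_flight` cap is an assumption of this sketch; a sensible value depends on the cluster and the client.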

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jonathan Haddad
connections, and to have that be dynamic based on overall cluster load. I would also note that the example in the spec has multiple inserts with different partition key values, which flies in the face of the admonition to refrain from using server-sid

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
verhead - in fact it'll do the opposite. If you're trying to do that, instead perform many async queries. The overhead of batches in cassandra is significant and you're going to hit a lot of problems if you use them excessively (timeou

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
Ryan, Thanks for the quick response. I did see that jira before posting my question on this list. However, I didn’t see any information about why 5kb+ data will cause instability. 5kb or e

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jonathan Haddad
ore clear statement of intent and non-intent for BATCH. -- Jack Krupansky *From:* Jonathan Haddad *Sent:* Friday, December 12, 2014 12:58 PM *To:* user@cassandra.apache.org ; Ryan Svihla *Subject:* Re: batch_size_warn_threshold_in_kb The really important thing

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Eric Stevens
In addition, Patrick is saying that he does not recommend more than 100 mutations per batch. So why not warn users just on the # of mutations in a batch? Mohammed *From:* Ryan Svi

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jack Krupansky
, 2014 12:58 PM To: user@cassandra.apache.org ; Ryan Svihla Subject: Re: batch_size_warn_threshold_in_kb The really important thing to really take away from Ryan's original post is that batches are not there for performance. The only case I consider batches to be useful for is when you absolut

Re: batch_size_warn_threshold_in_kb

2014-12-12 Thread Jonathan Haddad
*From:* Ryan Svihla [mailto:rsvi...@datastax.com] *Sent:* Thursday, December 11, 2014 12:56 PM *To:* user@cassandra.apache.org *Subject:* Re: batch_size_warn_threshold_in_kb Nothing magic, just put in there based on experience. You can find the s

Re: batch_size_warn_threshold_in_kb

2014-12-12 Thread Ryan Svihla
Any insert, update, or delete. On Fri, Dec 12, 2014 at 1:31 AM, Jens Rantil wrote: Maybe slightly off-topic, but what is a mutation? Is it equivalent to a CQL row? Or maybe a column in a row? Does it include tombstones within the selected range? Thanks, Jens On Thu, Dec 11, 2014

Re: batch_size_warn_threshold_in_kb

2014-12-12 Thread Ryan Svihla
ohammed *From:* Ryan Svihla [mailto:rsvi...@datastax.com] *Sent:* Thursday, December 11, 2014 12:56 PM *To:* user@cassandra.apache.org *Subject:* Re: batch_size_warn_threshold_in_kb Nothing magic, just put in there based on experience. Y

Re: batch_size_warn_threshold_in_kb

2014-12-11 Thread Jens Rantil
Maybe slightly off-topic, but what is a mutation? Is it equivalent to a CQL row? Or maybe a column in a row? Does it include tombstones within the selected range? Thanks, Jens On Thu, Dec 11, 2014 at 9:56 PM, Ryan Svihla wrote: Nothing magic, just put in there based on experience. You can find

RE: batch_size_warn_threshold_in_kb

2014-12-11 Thread Mohammed Guller
@cassandra.apache.org Subject: Re: batch_size_warn_threshold_in_kb Nothing magic, just put in there based on experience. You can find the story behind the original recommendation here https://issues.apache.org/jira/browse/CASSANDRA-6487 Key reasoning for the desire comes from Patrick McFadden: "Yes

Re: batch_size_warn_threshold_in_kb

2014-12-11 Thread Shane Hansen
I don't know why 5kb was chosen. The general trend is that larger batches will put more stress on the coordinator node. The precise point at which things fall over will vary. On Thu, Dec 11, 2014 at 1:43 PM, Mohammed Guller wrote: Hi – The cassandra.yaml file has a property called *batch_

Re: batch_size_warn_threshold_in_kb

2014-12-11 Thread Ryan Svihla
Nothing magic, just put in there based on experience. You can find the story behind the original recommendation here https://issues.apache.org/jira/browse/CASSANDRA-6487 Key reasoning for the desire comes from Patrick McFadden: "Yes that was in bytes. Just in my own experience, I don't recommend