tax.com/dev/blog/cassandra-2-1-now-
>>>>>>>> over-50-faster) which massively helps performance. It provides
>>>>>>>> the benefit of batches but without the coordinator overhead.
>>>>>>>>
>>>>>>>> Ca
>>>>>>>> have 100 servers, and perform a mutation on 100 partitions, you could
>>>>>>>> have
>>>>>>>> a coordinator that's
>>>>>>>>
>>>>>>>> 1) talking to every machine in the cluster and
>>>>>>>>
>>>> Jonathan says “It is absolutely not going to help you if you're
>>>>>>>> trying to lump queries together to reduce network & server overhead -
>>>>>>>> in
>>>>>>>> fact it'll do the opposite”, but I would note that the CQL3 spec says “
>>>>>>>> The BATCH statement ... serves
ps between the client and the server (and sometimes
>>>>>>> between the server coordinator and the replicas) when batching multiple
>>>>>>> updates.” Is the spec inaccurate? I mean, it seems in conflict with your
>>>>>>> statement.
>>>>>>>
>>>>>>> See:
>>>>>>> https://cassandra.apache.org/doc/cql3/CQL.html
>>>>>>>
>>>>>>> I see the spec as gospel – if it’s not accurate, let’s propose a
>
>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> *From:* Jonathan Haddad
>>>>>> *Sent:* Friday, December 12, 2014 12:58 PM
>>>>>> *To:* user@cassandra.apache.org ; Ryan Svihla
>>>>>> *Subj
ptimize performance. Using batches to optimize performance is usually not
>>>>> successful, as described in Using and misusing batches section. For
>>>>> information about the fastest way to load data, see "Cassandra: Batch
>>>>> loading without the Batch keyword."”
>>>>>
atch”, which is
>>>> simply a way to collect “batches” of operations in the client/driver and
>>>> then let the driver determine what degree of batching and asynchronous
>>>> operation is appropriate.
>>>>
>>>> It might also be nice t
o hit a lot of problems if you use them excessively (timeouts /
>>> failures).
>>>
>>> tl;dr: you probably don't want batch, you most likely want many async
>>> calls
>>>
>>> On Thu Dec 11 2014 at 11:15:00 PM Mohammed Guller <
>>>
connections, and to have that be dynamic based
>> on overall cluster load.
>>
>> I would also note that the example in the spec has multiple inserts with
>> different partition key values, which flies in the face of the admonition
>> to refrain from using server-sid
verhead - in fact it'll do the
>>> opposite. If you're trying to do that, instead perform many async
>>> queries. The overhead of batches in cassandra is significant and you're
>>> going to hit a lot of problems if you use them excessively (timeou
Ryan,
>>>
>>> Thanks for the quick response.
>>>
>>>
>>>
>>> I did see that jira before posting my question on this list. However, I
>>> didn’t see any information about why 5kb+ data will cause instability. 5kb
>>> or e
ore clear statement of intent and
> non-intent for BATCH.
>
> -- Jack Krupansky
>
> *From:* Jonathan Haddad
> *Sent:* Friday, December 12, 2014 12:58 PM
> *To:* user@cassandra.apache.org ; Ryan Svihla
> *Subject:* Re: batch_size_warn_threshold_in_kb
>
> The really important thing
>>
>>
>> In addition, Patrick is saying that he does not recommend more than 100
>> mutations per batch. So why not warn users just on the # of mutations in a
>> batch?
>>
>>
>>
>> Mohammed
>>
>>
>>
>> *From:* Ryan Svi
, 2014 12:58 PM
To: user@cassandra.apache.org ; Ryan Svihla
Subject: Re: batch_size_warn_threshold_in_kb
The really important thing to take away from Ryan's original post is
that batches are not there for performance. The only case I consider batches
to be useful for is when you absolut
>
>
> *From:* Ryan Svihla [mailto:rsvi...@datastax.com]
> *Sent:* Thursday, December 11, 2014 12:56 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: batch_size_warn_threshold_in_kb
>
>
>
> Nothing magic, just put in there based on experience. You can find the
> s
Any insert, update, or delete
On Fri, Dec 12, 2014 at 1:31 AM, Jens Rantil wrote:
>
> Maybe slightly off-topic, but what is a mutation? Is it equivalent to a
> CQL row? Or maybe a column in a row? Does it include tombstones within the
> selected range?
>
> Thanks,
> Jens
>
>
>
> On Thu, Dec 11, 2014
Maybe slightly off-topic, but what is a mutation? Is it equivalent to a CQL
row? Or maybe a column in a row? Does it include tombstones within the selected
range?
Thanks,
Jens
On Thu, Dec 11, 2014 at 9:56 PM, Ryan Svihla wrote:
> Nothing magic, just put in there based on experience. You can find
I don't know why 5kb was chosen.
The general trend is that larger batches will put more stress on the
coordinator node. The precise point at which
things fall over will vary.
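
As a rough illustration of the point about batch size and coordinator load, here is a minimal Python sketch that estimates the payload size of a prospective batch on the client side and compares it with the 5 KB figure discussed in this thread. The function names, the (partition_key, values) tuple layout, and the byte-counting estimate are all assumptions made for illustration; the server measures batch size in its own way, so the number should be treated as approximate.

```python
from collections import defaultdict

# Assumption: mirrors the 5 KB warning figure discussed in this thread.
WARN_THRESHOLD_BYTES = 5 * 1024


def estimate_batch_bytes(mutations):
    """Rough client-side size estimate: sum of the encoded value lengths.

    `mutations` is a hypothetical list of (partition_key, values) tuples,
    where `values` is a list of str or bytes. This is not how the server
    measures a batch; it is only an approximation.
    """
    total = 0
    for _partition_key, values in mutations:
        for value in values:
            data = value if isinstance(value, bytes) else str(value).encode("utf-8")
            total += len(data)
    return total


def partitions_touched(mutations):
    """Count mutations per partition key; more keys means more coordinator fan-out."""
    counts = defaultdict(int)
    for partition_key, _values in mutations:
        counts[partition_key] += 1
    return dict(counts)


def should_split(mutations, threshold=WARN_THRESHOLD_BYTES):
    """Return True when the estimated payload exceeds the threshold."""
    return estimate_batch_bytes(mutations) > threshold
```

A batch that trips `should_split`, or that spans many partition keys according to `partitions_touched`, is a candidate for being sent as separate asynchronous statements instead, in line with the advice given elsewhere in this thread.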
On Thu, Dec 11, 2014 at 1:43 PM, Mohammed Guller
wrote:
> Hi –
>
> The cassandra.yaml file has a property called *batch_
Nothing magic, just put in there based on experience. You can find the
story behind the original recommendation here
https://issues.apache.org/jira/browse/CASSANDRA-6487
Key reasoning for the desire comes from Patrick McFadden:
"Yes that was in bytes. Just in my own experience, I don't recommend