Re: Batch : Isolation and Atomicity for same partition on multiple table

Jeff Jirsa Wed, 13 Dec 2017 09:18:34 -0800

The entry point is here:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java#L346
which calls through to:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L938-L953


I believe the guarantees are weaker than the blog suggests, but it's
nuanced, and a lot of these types of questions come down to the data model
(you can model things so that you avoid the problems caused by weak
isolation, but that requires a detailed explanation of your use case, etc.).
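
To make the "same key, different tables" point from the quoted thread concrete,
here is a minimal Python sketch. It is not Cassandra code: the function name
partitions_targeted and the (table, partition_key, row) tuple layout are made
up for illustration. It only assumes what Nicolas states below, namely that a
partition belongs to the table it lives in, so it is identified by the table
and the key together.

```python
from collections import defaultdict


def partitions_targeted(statements):
    """Group batch statements by the partition they touch.

    Each statement is a hypothetical (table, partition_key, row) tuple.
    """
    groups = defaultdict(list)
    for table, partition_key, row in statements:
        # Assumption: a partition is identified by table AND key,
        # not by the key alone.
        groups[(table, partition_key)].append(row)
    return dict(groups)


# The batch from the original question, with user_id '1234' in both tables.
batch = [
    ("tableA", "1234", {"clustering": "c1", "value": "val1"}),
    ("tableB", "1234", {"clustering1": "cl1", "clustering2": "cl2", "value": "avalue"}),
]

targets = partitions_targeted(batch)
for (table, key), rows in targets.items():
    print(table, key, len(rows))
```

Under that assumption the batch touches two partitions even though user_id is
the same, which is why only the multi-partition guarantee (atomicity) would
apply.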




On Wed, Dec 13, 2017 at 8:56 AM, Mickael Delanoë <delanoe...@gmail.com>
wrote:

> Hi Nicolas,
> Thanks for your answer.
> Are you 100% sure of that assumption?
> The few tests I did, using nodetool getendpoints, showed that the
> data for the two tables went to the same nodes when I used the same
> partition key. So I would have expected Cassandra to be smart enough to
> apply them to the memtable in a single operation to achieve isolation, as
> the whole batch will be executed on a single node.
> Does anybody know where the batch operations are processed in the
> Cassandra source code, so I could check how all this is handled?
>
> Regards,
> Mickaël
>
>
>
> 2017-12-13 11:18 GMT+01:00 Nicolas Guyomar <nicolas.guyo...@gmail.com>:
>
>> Hi Mickael,
>>
>> Partitions are tied to the table they exist in, so in your case you
>> are targeting 2 partitions in 2 different tables.
>> Therefore, IMHO, you will only get atomicity from your batch statement.
>>
>> On 11 December 2017 at 15:59, Mickael Delanoë <delanoe...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I have a question regarding batch isolation and atomicity for queries
>>> using the same partition key.
>>>
>>> The Datastax documentation says this about batches:
>>> "Combines multiple DML statements to achieve atomicity and isolation
>>> when targeting a single partition or only atomicity when targeting multiple
>>> partitions. A batch applies all DMLs within a single partition before the
>>> data is available, ensuring atomicity and isolation."
>>>
>>> But I am trying to find out exactly what counts as a "single partition"
>>> and I cannot find a clear answer yet. The examples and explanations
>>> always talk about a partition with only one table used inside the batch.
>>> My concern is about partitions when we use different tables in a batch,
>>> so I would like some clarification.
>>>
>>> Here is my use case: I have 2 tables with the same partition key, which
>>> is "user_id":
>>>
>>> CREATE TABLE tableA (
>>>    user_id text,
>>>    clustering text,
>>>    value text,
>>>    PRIMARY KEY (user_id, clustering));
>>>
>>> CREATE TABLE tableB (
>>>    user_id text,
>>>    clustering1 text,
>>>    clustering2 text,
>>>    value text,
>>>    PRIMARY KEY (user_id, clustering1, clustering2));
>>>
>>> If I do a batch query like this :
>>>
>>> BEGIN BATCH
>>> INSERT INTO tableA (user_id, clustering, value) VALUES ('1234', 'c1',
>>> 'val1');
>>> INSERT INTO tableB (user_id, clustering1, clustering2, value) VALUES
>>> ('1234', 'cl1', 'cl2', 'avalue');
>>> APPLY BATCH;
>>>
>>> The DML statements use the same partition key. Can we say they are
>>> targeting the same partition or, as the partition keys are for different
>>> tables, should we consider these to be different partitions? And so does
>>> this batch ensure atomicity and isolation (in the sense described in the
>>> Datastax doc)? Or only atomicity?
>>>
>>> Thanks for your help,
>>> Mickaël Delanoë
>>>
>>
>>
>
>
> --
> Mickaël Delanoë
>
