"This simple example shows how hard it is to implement multi-partition
Paxos rounds. The fact that you have multiple Paxos rounds that are
dependent on each other break the safety guarantees of the original Paxos
paper. "
What if this dependency is explicitly specified in proposal.
Assume that each proposal consists of all mutations in this case: {M1,M2}.
So N1,N2,N3 receive proposal {M1,M2} and nodes N4,N5,N6 receive proposal
{M1,M2}. (Yes, in this case I assume that nodes will get more data than
they need, but that's the cost).

If C2 wins with higher ballot at nodes N4, N5, N6  after N1,N2,N3  accepted
{M1,M2} from C1, then C2 sees in progress proposal {M1, M2} which it has to
complete, so it will try nodes N1,N2,N3 and get agreement as well as nodes
N4,N5,N6 and eventually commit unless another coordinator jumps in and
trumps paxos rounds.

Do you think it could work?

"To guarantee you need a distributed lock or a different design like
datomic. Look at what rich hickey has done with datomic "
I'll look into that, thanks.

2015-09-08 12:52 GMT+02:00 <wool...@gmail.com>:

>
> There's quite a bit of literature on the topic. Look at what is in
> acmqueue and you'll see what others are saying is accurate.
>
> To guarantee you need a distributed lock or a different design like
> datomic. Look at what rich hickey has done with datomic
>
>
>
> Sent from my iPhone
>
> On Sep 8, 2015, at 5:54 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
> "I could imagine that multiple paxos rounds could be played for different
> partitions and these rounds would be dependent on each other"
>
> Example of cluster of 10 nodes (N1 ... N10) and RF=3.
>
> Suppose a LWT with 2 partitions and 2 mutations M1 & M2, coordinator C1.
> It will imply 2 Paxos rounds with the same ballot B1, one round affecting
> N1, N2, N3 for mutation M1 and the other round affecting N4, N5 and N6 for
> mutation M2.
>
> Suppose that the prepare/promise, read/results phases are successful for
> both Paxos rounds (see here for different LWT phases
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
> )
>
> The coordinator C1 then sends an propose() message to N1 ... N6 with
> respective mutations M1 and M2. N1, N2 and N3 will reply accept() but
> imagine another coordinator C2 has just proposed a higher ballot B2 (B2 >
> B1) for nodes N4, N5 & N6 only. Those node will reply NoAck() (or won't
> reply) to C1.
>
> Then the whole multi-partition LWT will need to be aborted because we
> cannot proceed with mutation M2. But N1, N2 and N3 already accepted the
> mutation M1 and it could be committed by any subsequent Paxos round on N1,
> N2 and N3
>
> We certainly don't want that.
>
> So how to abort safely mutation M1 on N1, N2 and N3 in this case ? Of
> course the coordinator C1 could send itself another prepare() message with
> ballot B3 higher than B1 to abort the accepted value in N1, N2 and N3, but
> we do not have ANY GUARANTEE that in the meantime, there is another Paxos
> round impacting N1, N2 and N3 with ballot higher than B1 that will commit
> the undesirable mutation M1...
>
> This simple example shows how hard it is to implement multi-partition
> Paxos rounds. The fact that you have multiple Paxos rounds that are
> dependent on each other break the safety guarantees of the original Paxos
> paper.
>
>  The only way I can see that will ensure safety in this case is to forbid
> any Paxos round on N1 ... N6 as long as the current rounds are not finished
> yet, and this is exactly what a distributed lock does.
>
>
>
>
>
>
>
>
>
> On Tue, Sep 8, 2015 at 8:15 AM, Marek Lewandowski <
> marekmlewandow...@gmail.com> wrote:
>
>> Are you absolutely sure that lock is required? I could imagine that
>> multiple paxos rounds could be played for different partitions and these
>> rounds would be dependent on each other.
>>
>> Performance aside, can you please elaborate where do you see such need
>> for lock?
>> On 8 Sep 2015 00:05, "DuyHai Doan" <doanduy...@gmail.com> wrote:
>>
>>> Multi partitions LWT is not supported currently on purpose. To support
>>> it, we would have to emulate a distributed lock which is pretty bad for
>>> performance.
>>>
>>> On Mon, Sep 7, 2015 at 10:38 PM, Marek Lewandowski <
>>> marekmlewandow...@gmail.com> wrote:
>>>
>>>> Hello there,
>>>>
>>>> would you be interested in having multi-partition support for
>>>> lightweight transactions in order words to have ability to express
>>>> something like:
>>>>
>>>> INSERT INTO … IF NOT EXISTS AND
>>>> UPDATE … IF EXISTS AND
>>>> UPDATE … IF colX = ‘xyz’
>>>>
>>>> where each statement refers to a row living potentially on different
>>>> set of nodes.
>>>> In yet another words: imagine batch with conditions, but which allows
>>>> to specify multiple statements with conditions for rows in different
>>>> partitions.
>>>>
>>>> Do you find it very useful, moderately useful or you don’t need that
>>>> because you have some superb business logic handling of such cases for
>>>> example?
>>>>
>>>> Let me know.
>>>> Regards,
>>>> Marek
>>>
>>>
>>>
>


-- 
Marek Lewandowski

Reply via email to