> I think the general idea is that if Commit is WAL logged, then the
> operation is considered to be committed on the local node, and commit should
> happen on any node only once prepare from all nodes is successful.
> And after that the transaction is not supposed to abort. But I think you are
> trying to optimize the DTM in some way to not follow that kind of protocol.
> By the way, how will the arbiter do recovery in a scenario where it
> crashes? Won't it need to contact all nodes for the status of in-progress or
> prepared transactions?
> I think it would be better if a more detailed design of DTM with respect to
> transaction management and recovery could be updated on the wiki for having
> discussion on this topic. I have seen that you have already updated many
> details of the system, but still the complete picture of DTM is not clear.
I agree. I have not been following this discussion closely, but from what I have read above, I think the recovery model in this design is broken. You have to follow some commit protocol, whichever one you choose. If you are looking for a more reliable model but don't want the overhead of 3PC, you could try something like Paxos.
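
To make the ordering requirement concrete, here is a minimal Python sketch of plain two-phase commit with arbiter recovery. It is only an illustration of the textbook protocol, not the DTM design under discussion. All names (`Participant`, `Arbiter`, `recover`, the in-memory `log` and `decisions` dicts standing in for WAL and a durable decision log) are hypothetical, and it assumes the arbiter's decision log survives a crash.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical node; `log` stands in for its WAL."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.log = {}  # xid -> State, assumed to survive a crash

    def prepare(self, xid):
        # Vote on the transaction and record the vote durably.
        if not self.can_prepare:
            self.log[xid] = State.ABORTED
            return False
        self.log[xid] = State.PREPARED
        return True

    def commit(self, xid):
        # Commit is only legal after a successful prepare; it is idempotent.
        assert self.log.get(xid) in (State.PREPARED, State.COMMITTED)
        self.log[xid] = State.COMMITTED

    def abort(self, xid):
        # A committed transaction must never abort.
        assert self.log.get(xid) is not State.COMMITTED
        self.log[xid] = State.ABORTED

    def status(self, xid):
        return self.log.get(xid, State.IDLE)


class Arbiter:
    """Hypothetical coordinator; `decisions` stands in for a durable log."""

    def __init__(self, participants):
        self.participants = participants
        self.decisions = {}  # xid -> State.COMMITTED or State.ABORTED

    def run(self, xid, crash_after_decision=False):
        # Phase 1: collect a vote from every participant.
        votes = [p.prepare(xid) for p in self.participants]
        decision = State.COMMITTED if all(votes) else State.ABORTED
        # Log the decision before telling any participant about it.
        self.decisions[xid] = decision
        if crash_after_decision:
            return decision  # simulate a crash before phase 2
        self.finish(xid)
        return decision

    def finish(self, xid):
        # Phase 2: push the logged decision to every participant.
        for p in self.participants:
            if self.decisions[xid] is State.COMMITTED:
                p.commit(xid)
            else:
                p.abort(xid)

    def recover(self):
        # After a restart, ask every node which transactions it knows about.
        xids = set(self.decisions)
        for p in self.participants:
            xids.update(p.log)
        for xid in xids:
            if xid not in self.decisions:
                # No logged decision means no node was told to commit,
                # so aborting is safe (presumed abort).
                self.decisions[xid] = State.ABORTED
            self.finish(xid)
```

The point of the sketch is the two invariants: no participant commits before every participant has prepared, and the arbiter logs its decision before acting on it, so that recovery can re-drive the same outcome after contacting the nodes.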