On Tue, Jun 05, 2007 at 09:56:29PM +0300, Timo Sirainen wrote:
> On Tue, 2007-05-22 at 09:58 -0500, Troy Benjegerdes wrote:
> > Best case, when all the nodes, and the network is up, locking latency
> > shouldn't be much longer than say twice the RTT. But what really
> > matters, and causes all the nasty bugs that even single-master
> > replication systems have to deal with is the *worst case* latency. So
> > everything is going along fine, and then due to a surge in incoming
> > spam, one of your switches starts dropping 2% of the packets, and the
> > server holding a lock starts taking 50ms instead of 1ms to respond to an
> > incoming packet.
> >
> > Now your previous lock latency of 1ms could easily extend into seconds if
> > a couple of responses to lock requests don't get through. And your 16
> > node imap cluster is now 8 times slower than a single server, instead of
> > 8 times faster ;)
>
> If you're so worried about that, you could create another internal
> network just for replication :)

Things are worse if the internal network for replication is the one that
started having errors ;) Your machine is accessible to the world, but
you can't reliably communicate to get a lock.

> > The nasty part about this for imap is that we can't ever have a UID be
> > handed out without *confirming* that it's been replicated to another
> > server before sending out the packet. Otherwise you can get in the
> > situation where node A sends out a new UID to a client out its public
> > NIC, while in the meantime, its internal NIC melted so the update
> > never got propagated, so nodes B, C, and D decide "ooops, node A is
> > dead, we are stealing his lock", and B takes over the lock and allocates
> > the same UID to a different message, and now the CEO didn't get that
> > notice from the SEC to save all his emails.
>
> When the servers sync up again they'll notice the duplicated UID and
> both of the emails will be assigned a new UID to fix the situation. This
> conflict handling will have to be done in any case.

That sounds like a pretty clean solution, and makes a lot of the things
that make replication hard go away.
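
For what it's worth, here is a rough sketch of how I read that conflict
handling. It is only an illustration, not anything from the Dovecot
code: the Message type, the guid field (some globally unique per-message
id that both replicas agree on), and merge_mailboxes() are all names I
made up for this example. It assumes uid_next is already at least as
large as the next-UID counter on either replica, so the reassigned UIDs
are never ones a client has seen before.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    uid: int   # IMAP UID as handed out by one replica
    guid: str  # hypothetical globally unique message id (assumption)


def merge_mailboxes(a, b, uid_next):
    """Merge two replicas' message lists after a network split.

    If the same UID refers to different messages (different guid) on the
    two sides, every message involved gets a fresh UID starting at
    uid_next. Returns (merged_messages, new_uid_next).
    """
    # Group by UID; the inner dict dedupes messages both sides already share.
    by_uid = {}
    for msg in list(a) + list(b):
        by_uid.setdefault(msg.uid, {})[msg.guid] = msg

    merged = []
    conflicted = []
    for uid in sorted(by_uid):
        msgs = by_uid[uid]
        if len(msgs) == 1:
            merged.extend(msgs.values())
        else:
            # Sort by guid so every replica reassigns in the same order.
            conflicted.extend(sorted(msgs.values(), key=lambda m: m.guid))

    # Both conflicting messages get new UIDs, as described above.
    for msg in conflicted:
        merged.append(Message(uid=uid_next, guid=msg.guid))
        uid_next += 1
    return merged, uid_next


# Node A and node B each handed out UID 2 to a different message.
node_a = [Message(1, "m1"), Message(2, "from-A")]
node_b = [Message(1, "m1"), Message(2, "from-B")]
merged, uid_next = merge_mailboxes(node_a, node_b, uid_next=3)
print(merged, uid_next)
```

The point of sorting the conflicting messages is that each server can run
the merge on its own and still end up with the same UID assignment,
without needing a lock to sort out the mess.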