On Sun, Nov 15, 2015 at 5:41 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Hmm, if that's where we're at, I'll summarize my thoughts.
>
> All of this discussion presupposes we are distributing/load balancing
> queries so that reads and writes might occur on different nodes.
Agreed.  I think that's a pretty common pattern, though certainly not the only one.

> We need a good balancer. Any discussion of this that ignores the balancer
> component is only talking about half the solution. What we need to do is
> decide whether functionality should live in the balancer or the core.

I'm all in favor of having a load balancer in core, but that seems completely unrelated to the patch at hand.

> Your option (1) is viable, but only in certain cases. We could add support
> for some token/wait mechanism but as you say, this would require application
> changes not pooler changes.

Agreed.

> Your option (2) is wider but also worse in some ways. It can be implemented
> in a pooler.
>
> Your option (3) doesn't excite me much. You've got a load of stuff that
> really should happen in a pooler. And at its core we have synchronous_commit
> = apply but with a timeout rather than a wait. So anyway, consider me nudged
> to finish my patch to provide capability for that by 1 Jan.

I don't see how either option (2) or option (3) could be implemented in a pooler.  How would that work?

To be frank, it's starting to seem to me like you are just trying to block this patch so you can have time to develop your own version instead.  I hope that's not the case, because it would be quite unfair.  When Thomas originally posted the patch, you complained that "This causes every writer to wait. What we want is to isolate the wait only to people performing a write-read sequence, so I think it should be readers that wait. Let's have that debate up front before we start reviewing the patch."  Now, you seem to be saying that's OK, because you want to post a patch to do exactly the same thing under the name synchronous_commit=apply, but you want it to be your own patch, leaving out the other stuff that Thomas has put into this one.  That could be the right thing to do, but how about we discuss it a bit?

The timeout stuff that Thomas has added here is really useful.  Without it, if a machine goes down, we wait forever.  That's the right thing to do if we're replicating to make sure transactions can never be lost, but it's a bad idea if we're replicating for load balancing.  In the load-balancing case, you want to drop sync slaves quickly to ensure the cluster remains available, and you need them to know they are out of sync so that the load balancer doesn't get confused.  That's exactly what is implemented here.  If you have an idea for a simpler implementation, great, but I think we need something.  I don't see how it's going to work to make it entirely the pooler's job to figure out whether the cluster is in sync - it needs a push from the core server.  Here, that push is easy to find: if a particular replica starts returning the "I'm out of sync" error when you query it, then stop routing queries to that replica until the error clears (which the pooler can find out by polling it with some trivial query).  That's a great deal more useful than synchronous_commit=apply without such a feature.

> On a related note, any further things like "GUC causal_reads_standby_names"
> should be implemented by Node Registry as a named group of nodes. We can
> have as many arbitrary groups of nodes as we want. If that sounds strange
> look back at exactly why GUCs are called GUCs.

I think a node registry is a good idea, and my impression from the session in Vienna is that quite a few other hackers do, too.  But I also don't think it's remotely reasonable to make that a precondition for accepting this patch.
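For concreteness, here is a minimal sketch of the polling loop a pooler might run against each replica, in Python with psycopg2.  It is illustrative only: the DSNs, the one-second polling interval, the connect_timeout, and the detail of simply treating any psycopg2.Error on a causal-reads probe as "stop routing reads here" are my assumptions, not something taken from the patch (the causal_reads GUC is the one the patch proposes, but a real pooler would want to distinguish the out-of-sync error from other failures by SQLSTATE).

    import time
    import psycopg2

    # Hypothetical replica DSNs known to the pooler.
    REPLICAS = {
        "replica1": "host=replica1 dbname=app",
        "replica2": "host=replica2 dbname=app",
    }

    def replica_is_usable(dsn):
        """Probe one replica with a trivial query; treat any error
        (replica unreachable, or the server's out-of-sync error on a
        causal-reads query) as a signal to stop routing reads there."""
        try:
            conn = psycopg2.connect(dsn, connect_timeout=2)
            try:
                with conn.cursor() as cur:
                    cur.execute("SET causal_reads = on")  # GUC from the patch
                    cur.execute("SELECT 1")               # trivial probe query
                return True
            finally:
                conn.close()
        except psycopg2.Error:
            return False

    available = set()
    while True:
        # Recompute the set of replicas eligible for read routing; a
        # replica comes back automatically once the error clears.
        available = {name for name, dsn in REPLICAS.items()
                     if replica_is_usable(dsn)}
        time.sleep(1)  # hypothetical polling interval

The point is just that the pooler's logic stays trivial: the server raises the error while the standby lags, and the pooler only has to poll and route around it.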
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company