On Mon, Nov 16, 2015 at 5:44 AM, Simon Riggs <si...@2ndquadrant.com> wrote: > On 15 November 2015 at 14:50, Robert Haas <robertmh...@gmail.com> wrote: >> On Sun, Nov 15, 2015 at 5:41 AM, Simon Riggs <si...@2ndquadrant.com> >> wrote: >> > Hmm, if that's where we're at, I'll summarize my thoughts. >> > >> > All of this discussion presupposes we are distributing/load balancing >> > queries so that reads and writes might occur on different nodes. >> >> Agreed. I think that's a pretty common pattern, though certainly not >> the only one. > It looks to me this functionality is only of use in a pooler. Please explain > how else this would be used.
I think you're right. I mean, you could have the pooling built into the application, but some place connections have to be being farmed out to different nodes, or there's no point to using this feature. Some people may not want to use this feature, but those who do are using some form of pooling at some level. >> > Your option (2) is wider but also worse in some ways. It can be >> > implemented >> > in a pooler. >> > >> > Your option (3) doesn't excite me much. You've got a load of stuff that >> > really should happen in a pooler. And at its core we have >> > synchronous_commit >> > = apply but with a timeout rather than a wait. >> >> I don't see how either option (2) or option (3) could be implemented >> in a pooler. How would that work? > > My starting thought was that (1) was the only way forwards. Through > discussion, I now see that its not the best solution for the general case. > > The pooler knows which statements are reads and writes, it also knows about > transaction boundaries, so it is possible for it to perform the waits for > either (2) or (3). As Craig says, it may not: pgbouncer, for example, won't. pgpool will, except when it's wrong because some apparently read-only function is actually writing data. But even if the pooler does know, that isn't enough for it to perform the waits for (2) or (3) without some support for the server. If it wants to wait for a particular transaction to be applied on the standby, it needs to know how long to wait, and without some support for the server, it has no way of knowing. Now that could be done by doing (1) and then having the pooler perform the waits, but now every pooler has to be taught how to do that. pgpool needs to know, pgbouncer needs to know, every JDBC-based connection pooler needs to know. Uggh. Thomas's solution is nice because it works with any pooler. The other point I'd make, which I think may be what you said before but I think is worth making very explicit, is that (1) supposes that we know which reads are dependent on which previous writes. In the real world, that's probably frequently untrue. If someone does SELECT sum(salary) FROM emp on the standby, there's no particular write to the emp table that they want to wait for: they want to wait for ALL such writes previously acknowledged as committed. Now, even when the dependent writes can be identified, it may be convenient to just wait for all of them instead of a particular subset that we know are related. But I bet there will be many cases where identification is impractical or impossible, and thus I suspect (1) won't be very useful. I think (2) and (3) both have good prospects for being useful, but I suspect that the performance consequences of (3), which is what Thomas actually implemented, although possibly severe, are still likely to be only a fraction of the cost of (2). Having to potentially wait every time a standby takes a snapshot just sounds awful to me. > I would like to see a load balancing pooler in Postgres. Me, too, but I don't expect that to be possible in the near future, and I think this is awfully useful until it does. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers