Josh Berkus <j...@agliodbs.com> wrote:

>>> You'd also need a way to let the connection nodes know when a replica
>>> has fallen behind so that they can be taken out of
>>> load-balancing/sharding for read queries. For the synchronous model,
>>> that would be "fallen behind at all"; for asynchronous it would be
>>> "fallen more than ### behind".
>>
>> How is that different from the previous thing? Just that we'd treat
>> "lagging" as "down" beyond some threshold? That doesn't seem like a
>> mandatory feature.
>
> It's a mandatory feature if you want to load-balance reads. We have to
> know which nodes not to send reads to because they are out of sync.

There is another approach to this, and we should consider how (if?) we
are going to cover it: database affinity. I have seen cases where
multiple databases are targets of asynchronous replication, with a web
application load balancing among them. The application kept track of
which copy each connection was using, so that when the copies were not
exactly in sync the user never saw "time moving backward". Two
different users might see versions of the data from different points
in time, but that generally doesn't matter, especially if the
difference is just a few minutes.

If one copy got too far behind for some reason, they would load-shift
to the other servers (time still moves forward, only there is a "jump"
forward at the shift). This would allow the tardy database to be
dedicated to catching up again.

Bottom line is that this very smooth behavior required two features --
the ability for the application to control database affinity, and the
ability to shift that affinity gracefully (with no down time).

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
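
To make the affinity idea above concrete, here is a rough Python sketch
of what the application-side bookkeeping might look like. It is only an
illustration under assumptions of mine, not what the applications I saw
actually ran: the names Replica, AffinityRouter, max_lag_seconds and
replica_for are all hypothetical, and the lag figure is assumed to be
supplied by some external monitor.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    lag_seconds: float  # assumed to be reported by an external monitor


@dataclass
class AffinityRouter:
    replicas: dict          # replica name -> Replica
    max_lag_seconds: float  # hypothetical "too far behind" threshold
    affinity: dict = field(default_factory=dict)  # session id -> replica name

    def replica_for(self, session_id):
        """Return the replica name this session should read from."""
        current = self.replicas.get(self.affinity.get(session_id))
        if current is not None and current.lag_seconds <= self.max_lag_seconds:
            # Stay on the same copy so the user never sees time move backward.
            return current.name

        # New session, or the current copy is tardy: shift affinity.
        # Any copy within the threshold is fresher than a copy that
        # exceeded it, so the session only ever "jumps" forward.
        candidates = [r for r in self.replicas.values()
                      if r.lag_seconds <= self.max_lag_seconds]
        if not candidates:
            raise RuntimeError("no replica is within the lag threshold")

        # Balance load: pick the candidate with the fewest pinned sessions,
        # breaking ties by name so the choice is deterministic.
        load = Counter(self.affinity.values())
        chosen = min(candidates, key=lambda r: (load[r.name], r.name))
        self.affinity[session_id] = chosen.name
        return chosen.name
```

The point of the sketch is that both features live in one place: the
affinity map gives the application control over which copy a session
uses, and the shift happens on the next read without dropping anything.
A tardy copy simply stops receiving new or shifted sessions until its
lag is back under the threshold.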