On 2011-03-09 15:10, Simon Riggs wrote:
On Wed, 2011-03-09 at 16:38 +0900, Fujii Masao wrote:
On Wed, Mar 9, 2011 at 2:14 PM, Jaime Casanova<ja...@2ndquadrant.com>  wrote:
On Tue, Mar 8, 2011 at 11:58 AM, Robert Haas<robertmh...@gmail.com>  wrote:
The fast shutdown handling seems fine, but why not just handle smart
shutdown the same way?
Currently, smart shutdown means no new connections, then wait until
existing ones close normally. For consistency, it should behave the
same for sync rep.
Agreed. I think a user who requests smart shutdown expects all the
existing connections to be closed normally by the client. So it
doesn't seem to be a good idea to forcibly close the connection and prevent
the COMMIT from being returned in the smart shutdown case. But I'm all ears
for better suggestions.

Anyway, we got the consensus about how fast shutdown should work with
sync rep. So I created the patch. Please feel free to comment and commit
the patch first ;)
We're just about to publish Alpha4 with this feature in.

If we release waiters too early we will cause effective data loss, that
part is agreed. We've also accepted that there are few ways to release
the waiters.

I want to release the first version as "safe" and then relax from there
after feedback.
The following is not safe, and it is possible in the first version:

1) issue stop on master when no sync standby is connected:
mgrid@mg73:~$ pg_ctl -D /data stop
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down

2) start the standby that failed
mgrid@mg72:~$ pg_ctl -D /data start
pg_ctl: another server might be running; trying to start server anyway
LOG: 00000: database system was interrupted while in recovery at log time 2011-03-09 15:22:31 CET
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG:  00000: entering standby mode
LOG:  00000: redo starts at 57/1A000078
LOG:  00000: consistent recovery state reached at 57/1A0000A0
FATAL: XX000: could not connect to the primary server: FATAL: the database system is shutting down

LOCATION:  libpqrcv_connect, libpqwalreceiver.c:102
server starting
mgrid@mg72:~$ FATAL: XX000: could not connect to the primary server: FATAL: the database system is shutting down

A safe solution would be to prevent smart shutdown on the master if it is in sync mode and there are no sync standbys connected.
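
To make that proposal concrete, here is a minimal sketch of the check I have in mind. It is not taken from the PostgreSQL source or from any patch: the type and function names (ShutdownMode, MasterState, shutdown_request_allowed) are made up for illustration, and where such a check would actually live in the postmaster is left open.

```c
#include <stdbool.h>

/* Hypothetical shutdown modes, mirroring pg_ctl's -m smart|fast|immediate. */
typedef enum {
    SHUTDOWN_SMART,
    SHUTDOWN_FAST,
    SHUTDOWN_IMMEDIATE
} ShutdownMode;

/* Hypothetical snapshot of the master's replication state. */
typedef struct {
    bool sync_rep_enabled;        /* master is running in sync mode */
    int  connected_sync_standbys; /* sync standbys currently connected */
} MasterState;

/*
 * Return true if the shutdown request may proceed.
 *
 * Only smart shutdown is refused, and only when the master is in sync
 * mode with no sync standby connected: in that state any backend that
 * is waiting for a standby acknowledgement can never finish normally,
 * so a smart shutdown could never complete.
 */
bool
shutdown_request_allowed(const MasterState *state, ShutdownMode mode)
{
    if (mode != SHUTDOWN_SMART)
        return true;

    if (state->sync_rep_enabled && state->connected_sync_standbys == 0)
        return false;

    return true;
}
```

With a check like this, the stop in step 1 would be rejected right away instead of leaving the master half shut down and refusing the standby's connection. It only covers the smart case; fast shutdown still needs the handling discussed upthread.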

The current situation is definitely unsafe because it forces people who are in this state to do a fast shutdown, but that fails as well, so they are left with only immediate.

mgrid@mg73:~$ pg_ctl -D /data stop -m fast
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
mgrid@mg73:~$ pg_ctl -D /data stop -m immediate
waiting for server to shut down.... done
server stopped

regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
