On 02.01.2011 14:47, Dimitri Fontaine wrote:
Heikki Linnakangas<heikki.linnakan...@enterprisedb.com>  writes:
BTW, there's a bunch of replication related stuff that we should work to
close, that are IMHO more important than synchronous replication. Like
making the standby follow timeline changes, to make failovers smoother, and
the facility to stream a base-backup over the wire. I wish someone worked on
those...

So, we've been talking about base backup streaming at conferences and we
have a working prototype.  We even have a needed piece of it in core
now, that's the pg_read_binary_file() function.  What we still miss is
an overall design and some integration effort.  Let's design first.

We even have a rudimentary patch to add the required backend support:

http://archives.postgresql.org/message-id/4c80d9b8.2020...@enterprisedb.com

That just needs to be polished into shape and documented.

I propose the following new pg_ctl command to initiate the cloning:

  pg_ctl clone [-D datadir] [-s on|off] [-t filename]  "primary_conninfo"

As far as users are concerned, that would be the only novelty.  Once that
command is finished (successfully) they would edit postgresql.conf and
start the service as usual.  A basic recovery.conf file is created with
the given options, standby_mode is driven by -s and defaults to off, and
trigger_file defaults to being omitted and is given by -t.  Of course
the primary_conninfo given on the command line is what ends up in the
recovery.conf file.
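
To make the option mapping above concrete, here is a minimal sketch of the recovery.conf that such a command might write. The helper name and its arguments are hypothetical and not taken from any patch; it only mirrors the defaults described above (standby_mode off unless -s on, trigger_file omitted unless -t is given).

```python
def build_recovery_conf(primary_conninfo, standby_mode=False, trigger_file=None):
    """Return recovery.conf text for the proposed clone command (hypothetical helper)."""

    def quote(value):
        # recovery.conf values are single-quoted; an embedded quote is doubled.
        return "'" + value.replace("'", "''") + "'"

    lines = [
        # -s on|off drives standby_mode; the default is off.
        "standby_mode = " + quote("on" if standby_mode else "off"),
        # The conninfo given on the command line is copied through.
        "primary_conninfo = " + quote(primary_conninfo),
    ]
    # trigger_file is written only when -t was given.
    if trigger_file is not None:
        lines.append("trigger_file = " + quote(trigger_file))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_recovery_conf("host=primary port=5432", standby_mode=True,
                              trigger_file="/tmp/pg_failover"), end="")
```

The host, port and trigger path in the usage line are placeholders.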

That alone would allow for making base backups for recovery purposes and
for preparing a standby.

+1. Or maybe it would be better to make it a separate binary, rather than part of pg_ctl.

To support this new tool, the simplest would be to just copy what
I've been doing in the prototype, that is run a query to get the primary
file listing (per tablespace, not done in the prototype) then get their
bytea content over the wire.  That means there's no further backend
support code to write.
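
The query-based approach described above could look roughly like the sketch below. It is not the prototype's code: run_query is a hypothetical callable (for example a thin wrapper around a database driver's cursor) that takes SQL plus parameters and returns a list of row tuples. The sketch ignores tablespaces and backup consistency, and relies only on the existing pg_ls_dir(), pg_stat_file() and pg_read_binary_file() functions.

```python
def list_files(run_query, directory="."):
    """Yield (path, size) for each regular file below `directory` on the primary."""
    for (name,) in run_query("SELECT pg_ls_dir(%s)", (directory,)):
        path = name if directory == "." else directory + "/" + name
        size, isdir = run_query(
            "SELECT size, isdir FROM pg_stat_file(%s)", (path,))[0]
        if isdir:
            # Recurse into subdirectories such as base/ or pg_xlog/.
            yield from list_files(run_query, path)
        else:
            yield path, size


def fetch_file(run_query, path, size, chunk=1024 * 1024):
    """Return the file's bytes, read in chunks via pg_read_binary_file()."""
    parts = []
    for offset in range(0, size, chunk):
        (data,) = run_query(
            "SELECT pg_read_binary_file(%s, %s, %s)", (path, offset, chunk))[0]
        parts.append(bytes(data))  # bytea may arrive as bytes or memoryview
    return b"".join(parts)
```

Every call here is an ordinary SQL query, so it needs a regular database connection with sufficient privileges, not a replication connection.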

It would be so much nicer to have something more integrated, like the patch I linked above. Running queries requires connecting to a real database, which means that the user needs to have privileges to do that and you need to know the name of a valid database. Ideally this would all work through a replication connection. I think we should go with that from day one.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)