On 06/23/2015 05:03 PM, Fujii Masao wrote:
On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
On 06/23/2015 07:51 AM, Michael Paquier wrote:
So... Attached is a set of patches dedicated to fixing this issue:
Thanks for working on this!
- 0001, add if_not_exists to pg_tablespace_location, returning NULL if
path does not exist
- 0002, same with pg_stat_file, returning NULL if file does not exist
- 0003, same with pg_read_*file. I added them to all the existing
functions for consistency.
- 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
(thanks Robert for the naming!)
- 0005, as things get complex, a set of regression tests aimed at
covering those things. pg_tablespace_location is platform-dependent,
so there are no tests for it.
- 0006, the fix for pg_rewind, using what has been implemented before.
With these patches, pg_read_file() will return NULL for any failure to open
the file, which makes pg_rewind assume that the file doesn't exist in the
source server, so it will remove the file from the destination. That's
dangerous; those functions should check specifically for ENOENT.
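To illustrate the ENOENT point, here is a minimal standalone sketch in plain
C. It is not code from the patches: the open_result enum and the
open_if_exists() helper are hypothetical names made up for this example. The
idea is only that a missing file is reported separately from every other
open failure, so a caller never deletes a file because of, say, a permission
error.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch; not taken from the patches. */
typedef enum
{
    OPEN_OK,        /* file opened; *fp is valid */
    OPEN_MISSING,   /* file does not exist (ENOENT) */
    OPEN_FAILED     /* any other error; errno holds the cause */
} open_result;

/* Hypothetical helper: open a file, telling "missing" apart from "failed". */
static open_result
open_if_exists(const char *path, FILE **fp)
{
    assert(path != NULL && fp != NULL);

    *fp = fopen(path, "rb");
    if (*fp != NULL)
        return OPEN_OK;

    /* Only a genuinely missing file may be treated as "not there". */
    if (errno == ENOENT)
        return OPEN_MISSING;

    /* EACCES, ENOTDIR, EMFILE, ...: must not be mistaken for a missing file. */
    return OPEN_FAILED;
}
```

A caller in the spirit of pg_rewind would remove the destination file only
on OPEN_MISSING and would abort with an error on OPEN_FAILED.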
I'm wondering if using pg_read_file() to copy the file from the source
server is reasonable. ISTM that it has two problems as follows.
1. It cannot read a very large file, such as a 1GB file. So if such a large
file was created in the source server after failover, pg_rewind would not
be able to copy the file. No?
pg_read_binary_file() handles large files just fine. It cannot return
more than 1GB in one call, but you can call it several times and
retrieve the file in chunks. That's what pg_rewind does, except for
reading the control file, which is known to be small.
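As a rough sketch of the chunking arithmetic (not pg_rewind's actual code),
the following plain C splits a file of known size into (offset, length)
pairs and prints one pg_read_binary_file(filename, offset, length) call per
chunk. CHUNK_SIZE and the helper names are assumptions chosen for the
example; any chunk size below the 1GB limit would do. The path is printed
unescaped, which is fine for illustration only; real code would pass it as a
query parameter.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical chunk size for this sketch; must stay below the 1GB limit. */
#define CHUNK_SIZE ((uint64_t) 1000000)

/* Number of calls needed to fetch a file of file_size bytes. */
static uint64_t
chunk_count(uint64_t file_size)
{
    return (file_size + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Length of the chunk starting at offset; 0 once offset is at or past EOF. */
static uint64_t
chunk_length(uint64_t file_size, uint64_t offset)
{
    uint64_t remaining;

    if (offset >= file_size)
        return 0;
    remaining = file_size - offset;
    return remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
}

/* Print one pg_read_binary_file() call per chunk; returns the call count. */
static uint64_t
print_chunk_queries(const char *path, uint64_t file_size)
{
    uint64_t offset = 0;
    uint64_t len;
    uint64_t calls = 0;

    assert(path != NULL);

    while ((len = chunk_length(file_size, offset)) > 0)
    {
        printf("SELECT pg_read_binary_file('%s', %" PRIu64 ", %" PRIu64 ");\n",
               path, offset, len);
        offset += len;
        calls++;
    }
    return calls;
}
```

With these assumptions, a 2,500,000-byte file is fetched in three calls: two
full chunks and a final one of 500,000 bytes.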
2. Many users may not allow a remote client to connect to the
PostgreSQL server as a superuser for security reasons. IOW,
there would be no entry in pg_hba.conf for such a connection.
In this case, pg_rewind always fails because pg_read_file() needs
superuser privilege. No?
I'm tempted to implement the replication command version of
pg_read_file(). That is, it reads and sends the data like the BASE_BACKUP
replication command does...
Yeah, that would definitely be nice. Peter suggested it back in January
(http://www.postgresql.org/message-id/54ac4801.7050...@gmx.net). I think
it's way too late to do that for 9.5, however. I'm particularly worried
that if we design the required API in a rush, we're not going to get it
right, and will have to change it again soon. That might be difficult in
a minor release. Using pg_read_file() and friends is quite flexible,
even though we just found out that they're not quite flexible enough
right now (the ENOENT problem).
- Heikki