On Wed, Jun 24, 2015 at 3:36 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Wed, Jun 24, 2015 at 1:40 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Tue, Jun 23, 2015 at 11:21 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
>>> On 06/23/2015 05:03 PM, Fujii Masao wrote:
>>>> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
>>>>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>>>> So... Attached are a set of patches dedicated to fixing this issue:
>>>>>
>>>>> Thanks for working on this!
>>>>>
>>>>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL
>>>>>>   if the path does not exist
>>>>>> - 0002, same with pg_stat_file, returning NULL if the file does not
>>>>>>   exist
>>>>>> - 0003, same with pg_read_*file. I added them to all the existing
>>>>>>   functions for consistency.
>>>>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>>>>>   (thanks Robert for the naming!)
>>>>>> - 0005, as things get complex, a set of regression tests aimed at
>>>>>>   covering those things. pg_tablespace_location is
>>>>>>   platform-dependent, so there are no tests for it.
>>>>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>>>>
>>>>> With these patches, pg_read_file() will return NULL on any failure to
>>>>> open the file, which makes pg_rewind assume that the file doesn't
>>>>> exist in the source server, so it will remove the file from the
>>>>> destination. That's dangerous; those functions should check
>>>>> specifically for ENOENT.
>>>>
>>>> I'm wondering whether using pg_read_file() to copy the file from the
>>>> source server is reasonable. ISTM that it has two problems, as follows.
>>>>
>>>> 1. It cannot read a very large file, like a 1GB file. So if such a
>>>>    large file was created in the source server after failover,
>>>>    pg_rewind would not be able to copy the file. No?
>>>
>>> pg_read_binary_file() handles large files just fine. It cannot return
>>> more than 1GB in one call, but you can call it several times and
>>> retrieve the file in chunks. That's what pg_rewind does, except for
>>> reading the control file, which is known to be small.
>>
>> Yeah, you're right.
>>
>> I found that pg_rewind creates a temporary table to fetch the file in
>> chunks. This would prevent pg_rewind from using a *hot standby* server
>> as a source server at all. This is of course a limitation of pg_rewind,
>> but we might want to alleviate it in the future.
>
> This is something that a replication command could address properly.
>
>>>> 2. Many users may not allow a remote client to connect to the
>>>>    PostgreSQL server as a superuser for security reasons. IOW, there
>>>>    would be no entry in pg_hba.conf for such a connection. In this
>>>>    case, pg_rewind always fails because pg_read_file() needs superuser
>>>>    privilege. No?
>>>>
>>>> I'm tempted to implement a replication command version of
>>>> pg_read_file(). That is, it would read and send the data like the
>>>> BASE_BACKUP replication command does...
>>>
>>> Yeah, that would definitely be nice. Peter suggested it back in January
>>> (http://www.postgresql.org/message-id/54ac4801.7050...@gmx.net). I
>>> think it's way too late to do that for 9.5, however. I'm particularly
>>> worried that if we design the required API in a rush, we're not going
>>> to get it right, and will have to change it again soon. That might be
>>> difficult in a minor release. Using pg_read_file() and friends is quite
>>> flexible, even though we just found out that they're not quite flexible
>>> enough right now (the ENOENT problem).
>>
>> I agree that it's too late to do what I said...
>>
>> But just using pg_read_file() cannot address the #2 problem that I
>> pointed out in my previous email. Also, requiring superuser privilege
>> for pg_rewind really conflicts with the motivation for adding the
>> replication privilege in the first place.
>>
>> So should we change pg_read_file() so that even a replication user can
>> read the file?
>
> From the security perspective, a replication user can take a base
> backup, so it can already easily retrieve the contents of PGDATA. Hence
> I guess that it would be fine. However, what about cases where
> pg_hba.conf authorizes access for a given replication user via psql and
> blocks it for the replication protocol? We could say that the OP should
> not give out replication access that easily, but in this case the user
> would have access to the contents of PGDATA even though he should not.
> Is that unrealistic?
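(As an aside, the chunk-by-chunk fetch Heikki describes above can be
driven entirely from SQL. A minimal sketch; the file path
'base/12345/67890' and the 1MB chunk size are only illustrative:

    -- fetch bytes [0, 1048576) of a file, path relative to the data
    -- directory
    SELECT pg_read_binary_file('base/12345/67890', 0, 1048576);
    -- advance the offset and repeat until the returned bytea is shorter
    -- than the requested length
    SELECT pg_read_binary_file('base/12345/67890', 1048576, 1048576);

pg_rewind drives this loop through a temporary table on the source, which
is exactly what prevents it from using a hot standby as the source.)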
The most realistic case is that both the source and target servers have a
pg_hba.conf containing the following authentication setting for
replication. That is, each server allows the other to use the replication
user to connect via the replication protocol.

    # TYPE  DATABASE     USER     ADDRESS    METHOD
    host    replication  repuser  X.X.X.X/Y  md5

This case makes me think that allowing even a replication user to call
pg_read_file() may not be a good solution for us. In that case the
replication user would need to log in to a real database like "postgres"
to call pg_read_file(), but usually there would be no entry in
pg_hba.conf allowing the replication user to connect to any real
database. So if we want to address this problem, a replication command
version of pg_read_file() would be required. However, it's too late to do
that now...

>> Or should a replication user version of pg_read_file() be implemented?
>
> You mean a new function? How is it different from authorizing
> pg_read_file() usage for a replication user?
>
> Honestly, I can live with this superuser restriction in 9.5. And come
> back to the replication user restriction in 9.6 once things cool down
> a bit.

Yeah, finally I agree with you. This seems to be the only approach we can
adopt for now.

Regards,

-- 
Fujii Masao
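P.S. For clarity, here is how the new flags proposed in Michael's patches
would look from SQL. This is only a sketch based on the patch
descriptions above; the exact argument names and positions may differ in
the final patches:

    -- with patch 0003, a missing file yields NULL instead of an error
    -- when the new boolean flag is passed
    SELECT pg_read_file('file_that_does_not_exist', 0, 8192, true);

    -- with patch 0004, pg_ls_dir takes if_not_exists (a missing
    -- directory is not an error) and include_dot_dirs (whether '.' and
    -- '..' are listed)
    SELECT pg_ls_dir('some_missing_dir', true, false);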