On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote: > On 06/23/2015 07:51 AM, Michael Paquier wrote: >> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if >> path does not exist >> - 0002, same with pg_stat_file, returning NULL if file does not exist >> - 0003, same with pg_read_*file. I added them to all the existing >> functions for consistency. >> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs >> (thanks Robert for the naming!) >> - 0005, as things get complex, a set of regression tests aimed to >> covering those things. pg_tablespace_location is platform-dependent, >> so there are no tests for it. >> - 0006, the fix for pg_rewind, using what has been implemented before. > > > With these patches, pg_read_file() will return NULL for any failure to open > the file, which makes pg_rewind to assume that the file doesn't exist in the > source server, and will remove the file from the destination. That's > dangerous, those functions should check specifically for ENOENT.
This makes sense. I changed all those functions to do so. > There's still a small race condition with tablespaces. If you run CREATE > TABLESPACE in the source server while pg_rewind is running, it's possible > that the recursive query that pg_rewind uses sees the symlink in pg_tblspc/ > directory, but its snapshot doesn't see the row in pg_tablespace yet. It > will think that the symlink is a regular file, try to read it, and fail (if > we checked for ENOENT). > Actually, I think we need try to deal with symlinks a bit harder. Currently, > pg_rewind assumes that anything in pg_tblspace that has a matching row in > pg_tablespace is a symlink, and nothing else is. I think symlinks to > directories. I just noticed that pg_rewind fails miserable if pg_xlog is a > symlink, because of that: > > ---- > The servers diverged at WAL position 0/3023F08 on timeline 1. > Rewinding from last common checkpoint at 0/2000060 on timeline 1 > > "data-master//pg_xlog" is not a directory > Failure, exiting > ---- It may be possible that in this case the path is a symlink on the source but not on the target, and vice-versa, so this looks too restrictive to me if we begin to use pg_readlink. Think for example of a symlink of pg_xlog that is not included in a base backup, we ignore it in copy_fetch.c for pg_xlog and the others, I think that here as well we should ignore those errors except for tablespaces. > I think we need to add a column to pg_stat_file output, to indicate symbolic > links, and add a pg_readlink() function. That still leaves a race condition > if the type of a file changes, i.e. a file is deleted and a directory with > the same name is created in its place, but that seems acceptable. I don't > think PostgreSQL ever does such a thing, so that could only happen if you > mess with the data directory manually while the server is running. Hm. pg_stat_file uses now stat(), and not lstat() so it cannot make the difference between what is a link or not. I have changed pg_stat_file to use lstat instead to cover this case in my set of patches, but a new function may be a better answer here. I have added as well a pg_readlink() function on the stack, and actually the if_not_exists mode of pg_tablespace_location is not needed anymore if we rely on pg_readlink in this case. I have let it in the set of patches though. This still looks useful for other utility tools. > I just realized another problem: We recently learned the hard way that some > people have files in the data directory that are not writeable by the > 'postgres' user > (http://www.postgresql.org/message-id/20150523172627.ga24...@msg.df7cb.de). > pg_rewind will try to overwrite all files it doesn't recognize as relation > files, so it's going to fail on those. A straightforward fix would be to > first open the destination file in read-only mode, and compare its contents, > and only open the file in write mode if it has changed. It would still fail > when the files really differ, but I think that's acceptable. If I am missing nothing, two code paths need to be patched here: copy_file_range and receiveFileChunks. copy_file_range is straight-forward. Now wouldn't it be better to write the contents into a temporary file, compare their content, and then switch if necessary for receiveFileChunks? > I note that pg_rewind doesn't need to distinguish between an empty and a > non-existent directory, so it's quite silly for it to pass > include_dot_dirs=true, and then filter out "." and ".." from the result set. > The documentation should mention the main reason for including "." and "..": > to distinguish between an empty and non-existent directory. OK. Switched to that in the first patch for pg_rewind. Attached is a new set of patches. Except for the last ones that addresses one issue of pg_rewind (symlink management when streaming PGDATA), all the others introduce if_not_exists options for the functions of genfile.c. The pg_rewind stuff could be more polished though. Feel free to comment. Regards, -- Michael
From 228d17d6808d350507a2ef137c7bdb496dc765d3 Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 10:32:01 +0900 Subject: [PATCH 1/8] Extend pg_tablespace_location with if_not_exists option This is useful for code paths that prefer receive an empty response instead of an ERROR if tablespace specified does not exist. --- doc/src/sgml/func.sgml | 8 +++++-- src/backend/utils/adt/misc.c | 49 ++++++++++++++++++++++++++++++++++++------- src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 4 files changed, 50 insertions(+), 10 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 650051b..7929e7e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -15844,9 +15844,13 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); <entry>get the set of database OIDs that have objects in the tablespace</entry> </row> <row> - <entry><literal><function>pg_tablespace_location(<parameter>tablespace_oid</parameter>)</function></literal></entry> + <entry><literal><function>pg_tablespace_location(<parameter>tablespace_oid</parameter> [, <parameter>if_not_exists</> <type>boolean</type>])</function></literal></entry> <entry><type>text</type></entry> - <entry>get the path in the file system that this tablespace is located in</entry> + <entry> + Get the path in the file system that this tablespace is located in. + If <parameter>if_not_exists</parameter> is true, <literal>NULL</literal> + is returned if tablespace does not exist instead of an error. + </entry> </row> <row> <entry><literal><function>pg_typeof(<parameter>any</parameter>)</function></literal></entry> diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c index c0495d9..dc19c2c 100644 --- a/src/backend/utils/adt/misc.c +++ b/src/backend/utils/adt/misc.c @@ -336,17 +336,24 @@ pg_tablespace_databases(PG_FUNCTION_ARGS) SRF_RETURN_DONE(funcctx); } - /* - * pg_tablespace_location - get location for a tablespace + * Wrapper function for pg_tablespace_location and + * pg_tablespace_location_extended. */ -Datum -pg_tablespace_location(PG_FUNCTION_ARGS) +static Datum +tablespace_location_wrapper(FunctionCallInfo fcinfo) { Oid tablespaceOid = PG_GETARG_OID(0); char sourcepath[MAXPGPATH]; char targetpath[MAXPGPATH]; int rllen; + bool if_not_exists = false; + + /* + * Check for IF NOT EXISTS option. + */ + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); /* * It's useful to apply this function to pg_class.reltablespace, wherein @@ -373,15 +380,22 @@ pg_tablespace_location(PG_FUNCTION_ARGS) rllen = readlink(sourcepath, targetpath, sizeof(targetpath)); if (rllen < 0) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not read symbolic link \"%s\": %m", - sourcepath))); + { + if (if_not_exists && errno == ENOENT) + PG_RETURN_NULL(); + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read symbolic link \"%s\": %m", + sourcepath))); + } if (rllen >= sizeof(targetpath)) + { ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("symbolic link \"%s\" target is too long", sourcepath))); + } targetpath[rllen] = '\0'; PG_RETURN_TEXT_P(cstring_to_text(targetpath)); @@ -394,6 +408,25 @@ pg_tablespace_location(PG_FUNCTION_ARGS) } /* + * pg_tablespace_location - get location for a tablespace + */ +Datum +pg_tablespace_location(PG_FUNCTION_ARGS) +{ + return tablespace_location_wrapper(fcinfo); +} + +/* + * pg_tablespace_location - get location for a tablespace, with option to + * bypass errors in case of a non-existing tablespace. + */ +Datum +pg_tablespace_location_extended(PG_FUNCTION_ARGS) +{ + return tablespace_location_wrapper(fcinfo); +} + +/* * pg_sleep - delay for N seconds */ Datum diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 6b3d194..1f163f0 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2922,6 +2922,8 @@ DESCR("current trigger depth"); DATA(insert OID = 3778 ( pg_tablespace_location PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 25 "26" _null_ _null_ _null_ _null_ _null_ pg_tablespace_location _null_ _null_ _null_ )); DESCR("tablespace location"); +DATA(insert OID = 3579 ( pg_tablespace_location PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25 "26 16" _null_ _null_ _null_ _null_ _null_ pg_tablespace_location_extended _null_ _null_ _null_ )); +DESCR("tablespace location"); DATA(insert OID = 1946 ( encode PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 25 "17 25" _null_ _null_ _null_ _null_ _null_ binary_encode _null_ _null_ _null_ )); DESCR("convert bytea value into some ascii-only text string"); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 51f25a2..ab78ec2 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -486,6 +486,7 @@ extern Datum pg_terminate_backend(PG_FUNCTION_ARGS); extern Datum pg_reload_conf(PG_FUNCTION_ARGS); extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS); extern Datum pg_tablespace_location(PG_FUNCTION_ARGS); +extern Datum pg_tablespace_location_extended(PG_FUNCTION_ARGS); extern Datum pg_rotate_logfile(PG_FUNCTION_ARGS); extern Datum pg_sleep(PG_FUNCTION_ARGS); extern Datum pg_get_keywords(PG_FUNCTION_ARGS); -- 2.4.4
From 2804fdc5fd974e688a930ab58559dbb7047cca8e Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 10:34:17 +0900 Subject: [PATCH 2/8] Extend pg_stat_file with if_not_exists option This is useful for code paths that prefer receiving a NULL result instead of an ERROR if file specified does not exist. --- doc/src/sgml/func.sgml | 9 ++++-- src/backend/utils/adt/genfile.c | 65 +++++++++++++++++++++++++++++++---------- src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 4 files changed, 59 insertions(+), 18 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7929e7e..0f70985 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17830,10 +17830,15 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); </row> <row> <entry> - <literal><function>pg_stat_file(<parameter>filename</> <type>text</>)</function></literal> + <literal><function>pg_stat_file(<parameter>filename</> <type>text</>[, <parameter>if_not_exists</> <type>boolean</type>])</function></literal> </entry> <entry><type>record</type></entry> - <entry>Return information about a file</entry> + <entry> + Return information about a file. If + <parameter>if_not_exists</parameter> is true, <literal>NULL</literal> + is returned for all result fields if file does not exist instead of + an error. + </entry> </row> </tbody> </tgroup> diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index f3f3cca..dd97d83 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -254,10 +254,10 @@ pg_read_binary_file_all(PG_FUNCTION_ARGS) } /* - * stat a file + * Wrapper for pg_stat_file and pg_stat_file_extended. */ -Datum -pg_stat_file(PG_FUNCTION_ARGS) +static Datum +stat_file_wrapper(PG_FUNCTION_ARGS) { text *filename_t = PG_GETARG_TEXT_P(0); char *filename; @@ -266,18 +266,29 @@ pg_stat_file(PG_FUNCTION_ARGS) bool isnull[6]; HeapTuple tuple; TupleDesc tupdesc; + bool if_not_exists = false; + bool return_null = false; if (!superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to get file information")))); + /* Check for option if_not_exists */ + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); + filename = convert_and_check_filename(filename_t); if (stat(filename, &fst) < 0) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", filename))); + { + if (if_not_exists && errno == ENOENT) + return_null = true; + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", filename))); + } /* * This record type had better match the output parameters declared for me @@ -298,20 +309,23 @@ pg_stat_file(PG_FUNCTION_ARGS) "isdir", BOOLOID, -1, 0); BlessTupleDesc(tupdesc); - memset(isnull, false, sizeof(isnull)); + memset(isnull, return_null, sizeof(isnull)); - values[0] = Int64GetDatum((int64) fst.st_size); - values[1] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_atime)); - values[2] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_mtime)); - /* Unix has file status change time, while Win32 has creation time */ + if (!return_null) + { + values[0] = Int64GetDatum((int64) fst.st_size); + values[1] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_atime)); + values[2] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_mtime)); + /* Unix has file status change time, while Win32 has creation time */ #if !defined(WIN32) && !defined(__CYGWIN__) - values[3] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_ctime)); - isnull[4] = true; + values[3] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_ctime)); + isnull[4] = true; #else - isnull[3] = true; - values[4] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_ctime)); + isnull[3] = true; + values[4] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_ctime)); #endif - values[5] = BoolGetDatum(S_ISDIR(fst.st_mode)); + values[5] = BoolGetDatum(S_ISDIR(fst.st_mode)); + } tuple = heap_form_tuple(tupdesc, values, isnull); @@ -320,6 +334,25 @@ pg_stat_file(PG_FUNCTION_ARGS) PG_RETURN_DATUM(HeapTupleGetDatum(tuple)); } +/* + * pg_stat_file_extended + * Stat a file, with possibility to bypass error in case of a missing file. + */ +Datum +pg_stat_file_extended(PG_FUNCTION_ARGS) +{ + return stat_file_wrapper(fcinfo); +} + +/* + * stat a file + */ +Datum +pg_stat_file(PG_FUNCTION_ARGS) +{ + return stat_file_wrapper(fcinfo); +} + /* * List a directory (returns the filenames only) diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 1f163f0..8da1781 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -3185,6 +3185,8 @@ DESCR("rotate log file"); DATA(insert OID = 2623 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir}" _null_ _null_ pg_stat_file _null_ _null_ _null_ )); DESCR("get information about file"); +DATA(insert OID = 3307 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 2249 "25 16" "{25,16,20,1184,1184,1184,1184,16}" "{i,i,o,o,o,o,o,o}" "{filename,if_not_exists,size,access,modification,change,creation,isdir}" _null_ _null_ pg_stat_file_extended _null_ _null_ _null_ )); +DESCR("get information about file"); DATA(insert OID = 2624 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ )); DESCR("read text from a file"); DATA(insert OID = 3826 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ pg_read_file_all _null_ _null_ _null_ )); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index ab78ec2..e46650a 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -472,6 +472,7 @@ extern Datum pg_relation_filepath(PG_FUNCTION_ARGS); extern bytea *read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read); extern Datum pg_stat_file(PG_FUNCTION_ARGS); +extern Datum pg_stat_file_extended(PG_FUNCTION_ARGS); extern Datum pg_read_file(PG_FUNCTION_ARGS); extern Datum pg_read_file_all(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file(PG_FUNCTION_ARGS); -- 2.4.4
From 0b6b6b438d6a17573f929e50c3adfd726f0b3948 Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 10:37:31 +0900 Subject: [PATCH 3/8] Add IF NOT EXISTS to pg_read_file and pg_read_binary_file This is useful for code paths that do not want to fail if the file queried does not exist. --- doc/src/sgml/func.sgml | 16 +++- src/backend/commands/extension.c | 2 +- src/backend/utils/adt/genfile.c | 174 +++++++++++++++++++++++++++++++++------ src/include/catalog/pg_proc.h | 8 ++ src/include/utils/builtins.h | 8 +- 5 files changed, 179 insertions(+), 29 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 0f70985..edfea8b 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17816,17 +17816,25 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); </row> <row> <entry> - <literal><function>pg_read_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>])</function></literal> + <literal><function>pg_read_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</> [, <parameter>if_not_exists</> <type>boolean</>] ])</function></literal> </entry> <entry><type>text</type></entry> - <entry>Return the contents of a text file</entry> + <entry> + Return the contents of a text file. If + <parameter>if_not_exists</parameter> is true, <literal>NULL</literal> + is returned if file does not exist instead of an error. + </entry> </row> <row> <entry> - <literal><function>pg_read_binary_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</>])</function></literal> + <literal><function>pg_read_binary_file(<parameter>filename</> <type>text</> [, <parameter>offset</> <type>bigint</>, <parameter>length</> <type>bigint</> [, <parameter>if_not_exists</> <type>boolean</>] ])</function></literal> </entry> <entry><type>bytea</type></entry> - <entry>Return the contents of a file</entry> + <entry> + Return the contents of a file. If + <parameter>if_not_exists</parameter> is true, <literal>NULL</literal> + is returned if file does not exist instead of an error. + </entry> </row> <row> <entry> diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index 5cc74d0..25bbe01 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -640,7 +640,7 @@ read_extension_script_file(const ExtensionControlFile *control, char *dest_str; int len; - content = read_binary_file(filename, 0, -1); + content = read_binary_file(filename, 0, -1, false); /* use database encoding if not given */ if (control->encoding < 0) diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index dd97d83..d45c66e 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -88,7 +88,8 @@ convert_and_check_filename(text *arg) * We read the whole of the file when bytes_to_read is negative. */ bytea * -read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read) +read_binary_file(const char *filename, int64 seek_offset, + int64 bytes_to_read, bool if_not_exists) { bytea *buf; size_t nbytes; @@ -103,9 +104,14 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read) struct stat fst; if (stat(filename, &fst) < 0) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", filename))); + { + if (if_not_exists && errno == ENOENT) + return NULL; + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", filename))); + } bytes_to_read = fst.st_size - seek_offset; } @@ -118,10 +124,15 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read) errmsg("requested length too large"))); if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not open file \"%s\" for reading: %m", - filename))); + { + if (if_not_exists && errno == ENOENT) + return NULL; + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open file \"%s\" for reading: %m", + filename))); + } if (fseeko(file, (off_t) seek_offset, (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0) @@ -150,11 +161,16 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read) * in the database encoding. */ static text * -read_text_file(const char *filename, int64 seek_offset, int64 bytes_to_read) +read_text_file(const char *filename, int64 seek_offset, + int64 bytes_to_read, bool if_not_exists) { bytea *buf; - buf = read_binary_file(filename, seek_offset, bytes_to_read); + buf = read_binary_file(filename, seek_offset, + bytes_to_read, if_not_exists); + + if (buf == NULL) + return NULL; /* Make sure the input is valid */ pg_verifymbstr(VARDATA(buf), VARSIZE(buf) - VARHDRSZ, false); @@ -164,21 +180,27 @@ read_text_file(const char *filename, int64 seek_offset, int64 bytes_to_read) } /* - * Read a section of a file, returning it as text + * Wrapper function for pg_read_file and pg_read_file_extended. */ -Datum -pg_read_file(PG_FUNCTION_ARGS) +static Datum +read_file_wrapper(PG_FUNCTION_ARGS) { text *filename_t = PG_GETARG_TEXT_P(0); int64 seek_offset = PG_GETARG_INT64(1); int64 bytes_to_read = PG_GETARG_INT64(2); + bool if_not_exists = false; char *filename; + bytea *result; if (!superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to read files")))); + /* Check for option if_not_exists */ + if (PG_NARGS() == 4 && !PG_ARGISNULL(3)) + if_not_exists = PG_GETARG_BOOL(3); + filename = convert_and_check_filename(filename_t); if (bytes_to_read < 0) @@ -186,44 +208,100 @@ pg_read_file(PG_FUNCTION_ARGS) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("requested length cannot be negative"))); - PG_RETURN_TEXT_P(read_text_file(filename, seek_offset, bytes_to_read)); + result = read_text_file(filename, seek_offset, bytes_to_read, + if_not_exists); + if (result == NULL) + PG_RETURN_NULL(); + PG_RETURN_TEXT_P(result); } /* - * Read the whole of a file, returning it as text + * Read a section of a file, returning it as text */ Datum -pg_read_file_all(PG_FUNCTION_ARGS) +pg_read_file(PG_FUNCTION_ARGS) +{ + return read_file_wrapper(fcinfo); +} + +/* + * Similar to pg_read_file, with IF NOT EXISTS option. + */ +Datum +pg_read_file_extended(PG_FUNCTION_ARGS) +{ + return read_file_wrapper(fcinfo); +} + +/* + * Wrapper function for pg_read_file_all and pg_read_file_all_extended. + */ +static Datum +read_file_all_wrapper(PG_FUNCTION_ARGS) { text *filename_t = PG_GETARG_TEXT_P(0); char *filename; + text *result; + bool if_not_exists = false; if (!superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to read files")))); + /* Check for option if_not_exists */ + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); + filename = convert_and_check_filename(filename_t); - PG_RETURN_TEXT_P(read_text_file(filename, 0, -1)); + result = read_text_file(filename, 0, -1, if_not_exists); + if (result == NULL) + PG_RETURN_NULL(); + + PG_RETURN_TEXT_P(result); } /* - * Read a section of a file, returning it as bytea + * Read the whole of a file, returning it as text */ Datum -pg_read_binary_file(PG_FUNCTION_ARGS) +pg_read_file_all(PG_FUNCTION_ARGS) +{ + return read_file_all_wrapper(fcinfo); +} + +/* + * Similar to pg_read_file_all, with IF NOT EXISTS option. + */ +Datum +pg_read_file_all_extended(PG_FUNCTION_ARGS) +{ + return read_file_all_wrapper(fcinfo); +} + +/* + * Wrapper for pg_read_binary_file and pg_read_binary_file_extended. + */ +static Datum +read_binary_file_wrapper(PG_FUNCTION_ARGS) { text *filename_t = PG_GETARG_TEXT_P(0); int64 seek_offset = PG_GETARG_INT64(1); int64 bytes_to_read = PG_GETARG_INT64(2); char *filename; + bool if_not_exists = false; + bytea *result; if (!superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to read files")))); + /* Check for option if_not_exists */ + if (PG_NARGS() == 4 && !PG_ARGISNULL(3)) + if_not_exists = PG_GETARG_BOOL(3); + filename = convert_and_check_filename(filename_t); if (bytes_to_read < 0) @@ -231,26 +309,76 @@ pg_read_binary_file(PG_FUNCTION_ARGS) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("requested length cannot be negative"))); - PG_RETURN_BYTEA_P(read_binary_file(filename, seek_offset, bytes_to_read)); + result = read_binary_file(filename, seek_offset, + bytes_to_read, if_not_exists); + if (result == NULL) + PG_RETURN_NULL(); + PG_RETURN_BYTEA_P(result); } /* - * Read the whole of a file, returning it as bytea + * Read a section of a file, returning it as bytea */ Datum -pg_read_binary_file_all(PG_FUNCTION_ARGS) +pg_read_binary_file(PG_FUNCTION_ARGS) +{ + return read_binary_file_wrapper(fcinfo); +} + +/* + * Similar to pg_read_binary_file, with IF NOT EXISTS option. + */ +Datum +pg_read_binary_file_extended(PG_FUNCTION_ARGS) +{ + return read_binary_file_wrapper(fcinfo); +} + +/* + * Wrapper for pg_read_binary_file_all and pg_read_binary_file_all_extended. + */ +static Datum +read_binary_file_all_wrapper(PG_FUNCTION_ARGS) { text *filename_t = PG_GETARG_TEXT_P(0); char *filename; + bool if_not_exists = false; + text *result; if (!superuser()) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to read files")))); + /* Check for option if_not_exists */ + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); + filename = convert_and_check_filename(filename_t); - PG_RETURN_BYTEA_P(read_binary_file(filename, 0, -1)); + result = read_binary_file(filename, 0, -1, if_not_exists); + if (result == NULL) + PG_RETURN_NULL(); + + PG_RETURN_BYTEA_P(result); +} + +/* + * Read the whole of a file, returning it as bytea + */ +Datum +pg_read_binary_file_all(PG_FUNCTION_ARGS) +{ + return read_binary_file_all_wrapper(fcinfo); +} + +/* + * Similar to pg_read_binary_file_all, with IF NOT EXISTS option. + */ +Datum +pg_read_binary_file_all_extended(PG_FUNCTION_ARGS) +{ + return read_binary_file_all_wrapper(fcinfo); } /* diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 8da1781..c805a9a 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -3189,12 +3189,20 @@ DATA(insert OID = 3307 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 DESCR("get information about file"); DATA(insert OID = 2624 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ )); DESCR("read text from a file"); +DATA(insert OID = 3293 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 4 0 25 "25 20 20 16" _null_ _null_ _null_ _null_ _null_ pg_read_file_extended _null_ _null_ _null_ )); +DESCR("read text from a file"); DATA(insert OID = 3826 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ pg_read_file_all _null_ _null_ _null_ )); DESCR("read text from a file"); +DATA(insert OID = 3294 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 25 "25 16" _null_ _null_ _null_ _null_ _null_ pg_read_file_all_extended _null_ _null_ _null_ )); +DESCR("read text from a file"); DATA(insert OID = 3827 ( pg_read_binary_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 17 "25 20 20" _null_ _null_ _null_ _null_ _null_ pg_read_binary_file _null_ _null_ _null_ )); DESCR("read bytea from a file"); +DATA(insert OID = 3295 ( pg_read_binary_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 4 0 17 "25 20 20 16" _null_ _null_ _null_ _null_ _null_ pg_read_binary_file_extended _null_ _null_ _null_ )); +DESCR("read bytea from a file"); DATA(insert OID = 3828 ( pg_read_binary_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 17 "25" _null_ _null_ _null_ _null_ _null_ pg_read_binary_file_all _null_ _null_ _null_ )); DESCR("read bytea from a file"); +DATA(insert OID = 3296 ( pg_read_binary_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 17 "25 16" _null_ _null_ _null_ _null_ _null_ pg_read_binary_file_all_extended _null_ _null_ _null_ )); +DESCR("read bytea from a file"); DATA(insert OID = 2625 ( pg_ls_dir PGNSP PGUID 12 1 1000 0 0 f f f f t t v 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ pg_ls_dir _null_ _null_ _null_ )); DESCR("list all files in a directory"); DATA(insert OID = 2626 ( pg_sleep PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ )); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index e46650a..82d0e18 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -470,13 +470,19 @@ extern Datum pg_relation_filepath(PG_FUNCTION_ARGS); /* genfile.c */ extern bytea *read_binary_file(const char *filename, - int64 seek_offset, int64 bytes_to_read); + int64 seek_offset, + int64 bytes_to_read, + bool if_not_exists); extern Datum pg_stat_file(PG_FUNCTION_ARGS); extern Datum pg_stat_file_extended(PG_FUNCTION_ARGS); extern Datum pg_read_file(PG_FUNCTION_ARGS); +extern Datum pg_read_file_extended(PG_FUNCTION_ARGS); extern Datum pg_read_file_all(PG_FUNCTION_ARGS); +extern Datum pg_read_file_all_extended(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file(PG_FUNCTION_ARGS); +extern Datum pg_read_binary_file_extended(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file_all(PG_FUNCTION_ARGS); +extern Datum pg_read_binary_file_all_extended(PG_FUNCTION_ARGS); extern Datum pg_ls_dir(PG_FUNCTION_ARGS); /* misc.c */ -- 2.4.4
From 3299e47802759d362a034d718492420e3bea62dd Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 10:59:16 +0900 Subject: [PATCH 4/8] Extend pg_ls_dir with include_dot_dirs and if_not_exists If include_dot_dirs is set to true, "." and ".." are listed in the list of files listed. If if_not_exists is true, an empty list of files is returned to caller instead of erroring out if the folder specified does not exist. --- doc/src/sgml/func.sgml | 11 +++++-- src/backend/utils/adt/genfile.c | 65 ++++++++++++++++++++++++++++++++++------- src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 4 files changed, 66 insertions(+), 13 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index edfea8b..3b5161a 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17809,10 +17809,17 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); <tbody> <row> <entry> - <literal><function>pg_ls_dir(<parameter>dirname</> <type>text</>)</function></literal> + <literal><function>pg_ls_dir(<parameter>dirname</> <type>text</> [, <parameter>if_not_exists</> <type>boolean</>, <parameter>include_dot_dirs</> <type>boolean</>] )</function></literal> </entry> <entry><type>setof text</type></entry> - <entry>List the contents of a directory</entry> + <entry> + List the contents of a directory. If + <parameter>if_not_exists</parameter> is true, an empty result is + returned instead of an error. If <parameter>include_dot_dirs</> + is true, <quote>.</> and <quote>..</> are included as well in + the returned list, which is useful to distinguish an empty + and an non-existent directory. + </entry> </row> <row> <entry> diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index d45c66e..12866dd 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -35,6 +35,8 @@ typedef struct { char *location; DIR *dirdesc; + bool include_dot_dirs; + bool leave_early; } directory_fctx; @@ -483,14 +485,15 @@ pg_stat_file(PG_FUNCTION_ARGS) /* - * List a directory (returns the filenames only) + * Wrapper for pg_ls_dir and pg_ls_dir_extended. */ -Datum -pg_ls_dir(PG_FUNCTION_ARGS) +static Datum +ls_dir_wrapper(PG_FUNCTION_ARGS) { FuncCallContext *funcctx; struct dirent *de; directory_fctx *fctx; + MemoryContext oldcontext; if (!superuser()) ereport(ERROR, @@ -499,7 +502,17 @@ pg_ls_dir(PG_FUNCTION_ARGS) if (SRF_IS_FIRSTCALL()) { - MemoryContext oldcontext; + bool if_not_exists = false; + bool include_dot_dirs = false; + + /* see if custom parameters have been passed down */ + if (PG_NARGS() == 3) + { + if (!PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); + if (!PG_ARGISNULL(2)) + include_dot_dirs = PG_GETARG_BOOL(2); + } funcctx = SRF_FIRSTCALL_INIT(); oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); @@ -507,14 +520,20 @@ pg_ls_dir(PG_FUNCTION_ARGS) fctx = palloc(sizeof(directory_fctx)); fctx->location = convert_and_check_filename(PG_GETARG_TEXT_P(0)); + fctx->include_dot_dirs = include_dot_dirs; + fctx->leave_early = false; fctx->dirdesc = AllocateDir(fctx->location); if (!fctx->dirdesc) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not open directory \"%s\": %m", - fctx->location))); - + { + if (if_not_exists && errno == ENOENT) + fctx->leave_early = true; + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open directory \"%s\": %m", + fctx->location))); + } funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); } @@ -522,10 +541,15 @@ pg_ls_dir(PG_FUNCTION_ARGS) funcctx = SRF_PERCALL_SETUP(); fctx = (directory_fctx *) funcctx->user_fctx; + /* Leave earlier if needed */ + if (fctx->leave_early) + SRF_RETURN_DONE(funcctx); + while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL) { - if (strcmp(de->d_name, ".") == 0 || - strcmp(de->d_name, "..") == 0) + if (!fctx->include_dot_dirs && + (strcmp(de->d_name, ".") == 0 || + strcmp(de->d_name, "..") == 0)) continue; SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(de->d_name)); @@ -535,3 +559,22 @@ pg_ls_dir(PG_FUNCTION_ARGS) SRF_RETURN_DONE(funcctx); } + +/* + * List a directory (returns the filenames only) + */ +Datum +pg_ls_dir(PG_FUNCTION_ARGS) +{ + return ls_dir_wrapper(fcinfo); +} + +/* + * List a directory (can optionally return dot directories or + * use IF NOT EXISTS). + */ +Datum +pg_ls_dir_extended(PG_FUNCTION_ARGS) +{ + return ls_dir_wrapper(fcinfo); +} diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index c805a9a..b559572 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -3205,6 +3205,8 @@ DATA(insert OID = 3296 ( pg_read_binary_file PGNSP PGUID 12 1 0 0 0 f f f f t f DESCR("read bytea from a file"); DATA(insert OID = 2625 ( pg_ls_dir PGNSP PGUID 12 1 1000 0 0 f f f f t t v 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ pg_ls_dir _null_ _null_ _null_ )); DESCR("list all files in a directory"); +DATA(insert OID = 3297 ( pg_ls_dir PGNSP PGUID 12 1 1000 0 0 f f f f t t v 3 0 25 "25 16 16" _null_ _null_ _null_ _null_ _null_ pg_ls_dir_extended _null_ _null_ _null_ )); +DESCR("list all files in a directory"); DATA(insert OID = 2626 ( pg_sleep PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ )); DESCR("sleep for the specified time in seconds"); DATA(insert OID = 3935 ( pg_sleep_for PGNSP PGUID 14 1 0 0 0 f f f f t f v 1 0 2278 "1186" _null_ _null_ _null_ _null_ _null_ "select pg_catalog.pg_sleep(extract(epoch from pg_catalog.clock_timestamp() operator(pg_catalog.+) $1) operator(pg_catalog.-) extract(epoch from pg_catalog.clock_timestamp()))" _null_ _null_ _null_ )); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 82d0e18..446cda6 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -484,6 +484,7 @@ extern Datum pg_read_binary_file_extended(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file_all(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file_all_extended(PG_FUNCTION_ARGS); extern Datum pg_ls_dir(PG_FUNCTION_ARGS); +extern Datum pg_ls_dir_extended(PG_FUNCTION_ARGS); /* misc.c */ extern Datum current_database(PG_FUNCTION_ARGS); -- 2.4.4
From fae83e948c7f8bd9f8f33ef6f088f8d7b5cf465a Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 11:26:02 +0900 Subject: [PATCH 5/8] Add new column islink in pg_stat_file This allows to detect if the file path evaluated is a synlink or not. pg_stat_file is switched to use as well lstat to enable soft link detection. --- doc/src/sgml/func.sgml | 5 +++-- src/backend/utils/adt/genfile.c | 15 +++++++++++---- src/include/catalog/pg_proc.h | 4 ++-- 3 files changed, 16 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 3b5161a..20b91ab 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17903,8 +17903,9 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8'); <function>pg_stat_file</> returns a record containing the file size, last accessed time stamp, last modified time stamp, last file status change time stamp (Unix platforms only), - file creation time stamp (Windows only), and a <type>boolean</type> - indicating if it is a directory. Typical usages include: + file creation time stamp (Windows only), a <type>boolean</type> + indicating if it is a directory, and a <type>boolean</type> + indicating if it is a symbolic link. Typical usages include: <programlisting> SELECT * FROM pg_stat_file('filename'); SELECT (pg_stat_file('filename')).modification; diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index 12866dd..456061b 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -392,8 +392,8 @@ stat_file_wrapper(PG_FUNCTION_ARGS) text *filename_t = PG_GETARG_TEXT_P(0); char *filename; struct stat fst; - Datum values[6]; - bool isnull[6]; + Datum values[7]; + bool isnull[7]; HeapTuple tuple; TupleDesc tupdesc; bool if_not_exists = false; @@ -410,7 +410,7 @@ stat_file_wrapper(PG_FUNCTION_ARGS) filename = convert_and_check_filename(filename_t); - if (stat(filename, &fst) < 0) + if (lstat(filename, &fst) < 0) { if (if_not_exists && errno == ENOENT) return_null = true; @@ -424,7 +424,7 @@ stat_file_wrapper(PG_FUNCTION_ARGS) * This record type had better match the output parameters declared for me * in pg_proc.h. */ - tupdesc = CreateTemplateTupleDesc(6, false); + tupdesc = CreateTemplateTupleDesc(7, false); TupleDescInitEntry(tupdesc, (AttrNumber) 1, "size", INT8OID, -1, 0); TupleDescInitEntry(tupdesc, (AttrNumber) 2, @@ -437,6 +437,8 @@ stat_file_wrapper(PG_FUNCTION_ARGS) "creation", TIMESTAMPTZOID, -1, 0); TupleDescInitEntry(tupdesc, (AttrNumber) 6, "isdir", BOOLOID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 7, + "islink", BOOLOID, -1, 0); BlessTupleDesc(tupdesc); memset(isnull, return_null, sizeof(isnull)); @@ -455,6 +457,11 @@ stat_file_wrapper(PG_FUNCTION_ARGS) values[4] = TimestampTzGetDatum(time_t_to_timestamptz(fst.st_ctime)); #endif values[5] = BoolGetDatum(S_ISDIR(fst.st_mode)); +#ifndef WIN32 + values[6] = BoolGetDatum(S_ISLNK(fst.st_mode)); +#else + values[6] = BoolGetDatum(pgwin32_is_junction(filename)); +#endif } tuple = heap_form_tuple(tupdesc, values, isnull); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b559572..36d3e15 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -3183,9 +3183,9 @@ DESCR("reload configuration files"); DATA(insert OID = 2622 ( pg_rotate_logfile PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ )); DESCR("rotate log file"); -DATA(insert OID = 2623 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir}" _null_ _null_ pg_stat_file _null_ _null_ _null_ )); +DATA(insert OID = 2623 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16,16}" "{i,o,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir,islink}" _null_ _null_ pg_stat_file _null_ _null_ _null_ )); DESCR("get information about file"); -DATA(insert OID = 3307 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 2249 "25 16" "{25,16,20,1184,1184,1184,1184,16}" "{i,i,o,o,o,o,o,o}" "{filename,if_not_exists,size,access,modification,change,creation,isdir}" _null_ _null_ pg_stat_file_extended _null_ _null_ _null_ )); +DATA(insert OID = 3307 ( pg_stat_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 2249 "25 16" "{25,16,20,1184,1184,1184,1184,16,16}" "{i,i,o,o,o,o,o,o,o}" "{filename,if_not_exists,size,access,modification,change,creation,isdir,islink}" _null_ _null_ pg_stat_file_extended _null_ _null_ _null_ )); DESCR("get information about file"); DATA(insert OID = 2624 ( pg_read_file PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 25 "25 20 20" _null_ _null_ _null_ _null_ _null_ pg_read_file _null_ _null_ _null_ )); DESCR("read text from a file"); -- 2.4.4
From ef78401e6b136577fb1703a6820743ab404631b1 Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 13:03:08 +0900 Subject: [PATCH 6/8] Add pg_readlink to get value of a symbolic link This is similar to Posix's readlink, except that PostgreSQL restrictions are applied to the source path queried, similarly to other functions of genfile.c. --- doc/src/sgml/func.sgml | 9 +++++++ src/backend/utils/adt/genfile.c | 54 +++++++++++++++++++++++++++++++++++++++++ src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 4 files changed, 66 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 20b91ab..81a8090 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17855,6 +17855,15 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); an error. </entry> </row> + <row> + <entry> + <literal><function>pg_readlink(<parameter>filename</> <type>text</>)</function></literal> + </entry> + <entry><type>text</type></entry> + <entry> + Return value of a symbolic link. + </entry> + </row> </tbody> </tgroup> </table> diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index 456061b..94ff7d6 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -585,3 +585,57 @@ pg_ls_dir_extended(PG_FUNCTION_ARGS) { return ls_dir_wrapper(fcinfo); } + + +/* + * pg_readlink - Find the real location of the provided symbolic link. + */ +Datum +pg_readlink(PG_FUNCTION_ARGS) +{ + text *sourcepath_t = PG_GETARG_TEXT_P(0); + char *sourcepath; + char targetpath[MAXPGPATH]; + int rllen; + bool if_not_exists = false; + + if (!superuser()) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + (errmsg("must be superuser to read files")))); + + /* Check for option if_not_exists */ + if (PG_NARGS() == 2 && !PG_ARGISNULL(1)) + if_not_exists = PG_GETARG_BOOL(1); + +#if defined(HAVE_READLINK) || defined(WIN32) + sourcepath = convert_and_check_filename(sourcepath_t); + + rllen = readlink(sourcepath, targetpath, sizeof(targetpath)); + if (rllen < 0) + { + if (if_not_exists && errno == ENOENT) + PG_RETURN_NULL(); + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read symbolic link \"%s\": %m", + sourcepath))); + } + if (rllen >= sizeof(targetpath)) + { + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), + errmsg("symbolic link \"%s\" target is too long", + sourcepath))); + } + targetpath[rllen] = '\0'; + + PG_RETURN_TEXT_P(cstring_to_text(targetpath)); +#else + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("readline is not available on this platform"))); + PG_RETURN_NULL(); +#endif +} diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 36d3e15..11dd80d 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -3207,6 +3207,8 @@ DATA(insert OID = 2625 ( pg_ls_dir PGNSP PGUID 12 1 1000 0 0 f f f f t t v 1 0 DESCR("list all files in a directory"); DATA(insert OID = 3297 ( pg_ls_dir PGNSP PGUID 12 1 1000 0 0 f f f f t t v 3 0 25 "25 16 16" _null_ _null_ _null_ _null_ _null_ pg_ls_dir_extended _null_ _null_ _null_ )); DESCR("list all files in a directory"); +DATA(insert OID = 3298 ( pg_readlink PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 25 "25" _null_ _null_ _null_ _null_ _null_ pg_readlink _null_ _null_ _null_ )); +DESCR("read value of a symbolic link"); DATA(insert OID = 2626 ( pg_sleep PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "701" _null_ _null_ _null_ _null_ _null_ pg_sleep _null_ _null_ _null_ )); DESCR("sleep for the specified time in seconds"); DATA(insert OID = 3935 ( pg_sleep_for PGNSP PGUID 14 1 0 0 0 f f f f t f v 1 0 2278 "1186" _null_ _null_ _null_ _null_ _null_ "select pg_catalog.pg_sleep(extract(epoch from pg_catalog.clock_timestamp() operator(pg_catalog.+) $1) operator(pg_catalog.-) extract(epoch from pg_catalog.clock_timestamp()))" _null_ _null_ _null_ )); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 446cda6..27a4fd8 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -485,6 +485,7 @@ extern Datum pg_read_binary_file_all(PG_FUNCTION_ARGS); extern Datum pg_read_binary_file_all_extended(PG_FUNCTION_ARGS); extern Datum pg_ls_dir(PG_FUNCTION_ARGS); extern Datum pg_ls_dir_extended(PG_FUNCTION_ARGS); +extern Datum pg_readlink(PG_FUNCTION_ARGS); /* misc.c */ extern Datum current_database(PG_FUNCTION_ARGS); -- 2.4.4
From e39f8148f2baa95a6426f46f407eeb0fa243a478 Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Tue, 23 Jun 2015 10:42:48 +0900 Subject: [PATCH 7/8] Add regression tests for pg_ls_dir and pg_read_[binary_]file With the new options if_not_exists that have been added, this looks worth adding. --- src/test/regress/expected/file.out | 84 ++++++++++++++++++++++++++++++++++++++ src/test/regress/parallel_schedule | 3 ++ src/test/regress/sql/file.sql | 29 +++++++++++++ 3 files changed, 116 insertions(+) create mode 100644 src/test/regress/expected/file.out create mode 100644 src/test/regress/sql/file.sql diff --git a/src/test/regress/expected/file.out b/src/test/regress/expected/file.out new file mode 100644 index 0000000..0ddae30 --- /dev/null +++ b/src/test/regress/expected/file.out @@ -0,0 +1,84 @@ +-- +-- FILE +-- +-- Tests of SQL functions interacting directly with files of PGDATA +-- pg_read_file +SELECT pg_read_file('postgresql.auto.conf', 0, 23); -- OK + pg_read_file +------------------------- + # Do not edit this file +(1 row) + +SELECT pg_read_file('postgresql_not_exists', 0, 23); -- not OK +ERROR: could not open file "postgresql_not_exists" for reading: No such file or directory +SELECT pg_read_file('postgresql_not_exists', 0, 23, false); -- not OK +ERROR: could not open file "postgresql_not_exists" for reading: No such file or directory +SELECT pg_read_file('postgresql_not_exists', 0, 23, true); -- OK + pg_read_file +-------------- + +(1 row) + +-- pg_read_binary_file +SELECT pg_read_binary_file('postgresql.auto.conf', 0, 23); -- OK + pg_read_binary_file +-------------------------------------------------- + \x2320446f206e6f74206564697420746869732066696c65 +(1 row) + +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23); -- not OK +ERROR: could not open file "postgresql_not_exists" for reading: No such file or directory +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23, false); -- not OK +ERROR: could not open file "postgresql_not_exists" for reading: No such file or directory +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23, true); -- OK + pg_read_binary_file +--------------------- + +(1 row) + +-- pg_ls_dir +SELECT pg_ls_dir('postgresql.conf'); -- not OK, not a directory +ERROR: could not open directory "postgresql.conf": Not a directory +SELECT pg_ls_dir('pg_twophase'); -- empty directory + pg_ls_dir +----------- +(0 rows) + +SELECT pg_ls_dir('pg_twophase', true, true); -- OK, not empty list + pg_ls_dir +----------- + . + .. +(2 rows) + +SELECT pg_ls_dir('pg_twophase', false, false); -- OK, empty list + pg_ls_dir +----------- +(0 rows) + +SELECT pg_ls_dir('pg_twophase', false, true); -- OK, not empty list + pg_ls_dir +----------- + . + .. +(2 rows) + +SELECT pg_ls_dir('pg_twophase', true, false); -- OK, empty list + pg_ls_dir +----------- +(0 rows) + +SELECT pg_ls_dir('postgresql_not_exists', false, true); -- not OK +ERROR: could not open directory "postgresql_not_exists": No such file or directory +SELECT pg_ls_dir('postgresql_not_exists', false, false); -- not OK +ERROR: could not open directory "postgresql_not_exists": No such file or directory +SELECT pg_ls_dir('postgresql_not_exists', true, true); -- OK, empty list + pg_ls_dir +----------- +(0 rows) + +SELECT pg_ls_dir('postgresql_not_exists', true, false); -- OK, empty list + pg_ls_dir +----------- +(0 rows) + diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 91780cd..b1739fa 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -108,5 +108,8 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid c # event triggers cannot run concurrently with any test that runs DDL test: event_trigger +# Files read during this test may be modified in parallel by other tests +test: file + # run stats by itself because its delay may be insufficient under heavy load test: stats diff --git a/src/test/regress/sql/file.sql b/src/test/regress/sql/file.sql new file mode 100644 index 0000000..a0238f2 --- /dev/null +++ b/src/test/regress/sql/file.sql @@ -0,0 +1,29 @@ +-- +-- FILE +-- + +-- Tests of SQL functions interacting directly with files of PGDATA + +-- pg_read_file +SELECT pg_read_file('postgresql.auto.conf', 0, 23); -- OK +SELECT pg_read_file('postgresql_not_exists', 0, 23); -- not OK +SELECT pg_read_file('postgresql_not_exists', 0, 23, false); -- not OK +SELECT pg_read_file('postgresql_not_exists', 0, 23, true); -- OK + +-- pg_read_binary_file +SELECT pg_read_binary_file('postgresql.auto.conf', 0, 23); -- OK +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23); -- not OK +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23, false); -- not OK +SELECT pg_read_binary_file('postgresql_not_exists', 0, 23, true); -- OK + +-- pg_ls_dir +SELECT pg_ls_dir('postgresql.conf'); -- not OK, not a directory +SELECT pg_ls_dir('pg_twophase'); -- empty directory +SELECT pg_ls_dir('pg_twophase', true, true); -- OK, not empty list +SELECT pg_ls_dir('pg_twophase', false, false); -- OK, empty list +SELECT pg_ls_dir('pg_twophase', false, true); -- OK, not empty list +SELECT pg_ls_dir('pg_twophase', true, false); -- OK, empty list +SELECT pg_ls_dir('postgresql_not_exists', false, true); -- not OK +SELECT pg_ls_dir('postgresql_not_exists', false, false); -- not OK +SELECT pg_ls_dir('postgresql_not_exists', true, true); -- OK, empty list +SELECT pg_ls_dir('postgresql_not_exists', true, false); -- OK, empty list -- 2.4.4
From 370cc091048a2259e633ca65473ed1a835e13451 Mon Sep 17 00:00:00 2001 From: Michael Paquier <michael@otacoo.com> Date: Wed, 24 Jun 2015 15:10:16 +0900 Subject: [PATCH 8/8] Fix symlink usage in pg_rewind Make use of pg_readlink to determine where a symlink in a source PGDATA streamed is from. Also remove restrictions related to symlink comparisons on source and target instances, those need to be only limited to tablespaces. --- src/bin/pg_rewind/file_ops.c | 2 ++ src/bin/pg_rewind/filemap.c | 11 ++----- src/bin/pg_rewind/filemap.h | 1 + src/bin/pg_rewind/libpq_fetch.c | 63 +++++++++++++++++++++++++---------------- 4 files changed, 45 insertions(+), 32 deletions(-) diff --git a/src/bin/pg_rewind/file_ops.c b/src/bin/pg_rewind/file_ops.c index c2d8aa1..29ddb1d 100644 --- a/src/bin/pg_rewind/file_ops.c +++ b/src/bin/pg_rewind/file_ops.c @@ -139,6 +139,7 @@ remove_target(file_entry_t *entry) remove_target_file(entry->path); break; + case FILE_TYPE_TABLESPACE: case FILE_TYPE_SYMLINK: remove_target_symlink(entry->path); break; @@ -156,6 +157,7 @@ create_target(file_entry_t *entry) create_target_dir(entry->path); break; + case FILE_TYPE_TABLESPACE: case FILE_TYPE_SYMLINK: create_target_symlink(entry->path, entry->link_target); break; diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index 05eff68..6ef2342 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -112,12 +112,6 @@ process_source_file(const char *path, file_type_t type, size_t newsize, switch (type) { case FILE_TYPE_DIRECTORY: - if (exists && !S_ISDIR(statbuf.st_mode)) - { - /* it's a directory in source, but not in target. Strange.. */ - pg_fatal("\"%s\" is not a directory\n", localpath); - } - if (!exists) action = FILE_ACTION_CREATE; else @@ -125,7 +119,7 @@ process_source_file(const char *path, file_type_t type, size_t newsize, oldsize = 0; break; - case FILE_TYPE_SYMLINK: + case FILE_TYPE_TABLESPACE: if (exists && #ifndef WIN32 !S_ISLNK(statbuf.st_mode) @@ -140,7 +134,8 @@ process_source_file(const char *path, file_type_t type, size_t newsize, */ pg_fatal("\"%s\" is not a symbolic link\n", localpath); } - + /* fallback to default */ + case FILE_TYPE_SYMLINK: if (!exists) action = FILE_ACTION_CREATE; else diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h index 9943ec5..35540d6 100644 --- a/src/bin/pg_rewind/filemap.h +++ b/src/bin/pg_rewind/filemap.h @@ -36,6 +36,7 @@ typedef enum { FILE_TYPE_REGULAR, FILE_TYPE_DIRECTORY, + FILE_TYPE_TABLESPACE, FILE_TYPE_SYMLINK } file_type_t; diff --git a/src/bin/pg_rewind/libpq_fetch.c b/src/bin/pg_rewind/libpq_fetch.c index df71069..49950b9 100644 --- a/src/bin/pg_rewind/libpq_fetch.c +++ b/src/bin/pg_rewind/libpq_fetch.c @@ -140,30 +140,28 @@ libpqProcessFileList(void) * Create a recursive directory listing of the whole data directory. * * The WITH RECURSIVE part does most of the work. The second part gets the - * targets of the symlinks in pg_tblspc directory. - * - * XXX: There is no backend function to get a symbolic link's target in - * general, so if the admin has put any custom symbolic links in the data - * directory, they won't be copied correctly. + * targets of the symlinks in the data directory. */ sql = "WITH RECURSIVE files (path, filename, size, isdir) AS (\n" - " SELECT '' AS path, filename, size, isdir FROM\n" - " (SELECT pg_ls_dir('.') AS filename) AS fn,\n" - " pg_stat_file(fn.filename) AS this\n" + " SELECT '' AS path, filename, size, isdir, islink FROM\n" + " (SELECT pg_ls_dir('.', true, false) AS filename) AS fn,\n" + " pg_stat_file(fn.filename, true) AS this\n" + " WHERE this.size IS NOT NULL\n" " UNION ALL\n" " SELECT parent.path || parent.filename || '/' AS path,\n" - " fn, this.size, this.isdir\n" + " fn, this.size, this.isdir, this.islink\n" " FROM files AS parent,\n" - " pg_ls_dir(parent.path || parent.filename) AS fn,\n" - " pg_stat_file(parent.path || parent.filename || '/' || fn) AS this\n" - " WHERE parent.isdir = 't'\n" - ")\n" - "SELECT path || filename, size, isdir,\n" - " pg_tablespace_location(pg_tablespace.oid) AS link_target\n" - "FROM files\n" - "LEFT OUTER JOIN pg_tablespace ON files.path = 'pg_tblspc/'\n" - " AND oid::text = files.filename\n"; + " pg_ls_dir(parent.path || parent.filename, true, false) AS fn,\n" + " pg_stat_file(parent.path || parent.filename || '/' || fn, true)\n" + " AS this\n" + " WHERE parent.isdir = 't' AND\n" + " this.size IS NOT NULL)\n" + "SELECT path || filename AS file_path, size, isdir, islink,\n" + " path = 'pg_tblspc/' AS istblspc,\n" + " CASE WHEN islink THEN pg_readlink(path || filename)\n" + " ELSE NULL END AS link_target\n" + "FROM files\n"; res = PQexec(conn, sql); if (PQresultStatus(res) != PGRES_TUPLES_OK) @@ -171,7 +169,7 @@ libpqProcessFileList(void) PQresultErrorMessage(res)); /* sanity check the result set */ - if (PQnfields(res) != 4) + if (PQnfields(res) != 6) pg_fatal("unexpected result set while fetching file list\n"); /* Read result to local variables */ @@ -180,10 +178,14 @@ libpqProcessFileList(void) char *path = PQgetvalue(res, i, 0); int filesize = atoi(PQgetvalue(res, i, 1)); bool isdir = (strcmp(PQgetvalue(res, i, 2), "t") == 0); - char *link_target = PQgetvalue(res, i, 3); + bool islink = (strcmp(PQgetvalue(res, i, 3), "t") == 0); + bool istblspc = (strcmp(PQgetvalue(res, i, 4), "t") == 0); + char *link_target = PQgetvalue(res, i, 5); file_type_t type; - if (link_target[0]) + if (istblspc) + type = FILE_TYPE_TABLESPACE; + else if (islink) type = FILE_TYPE_SYMLINK; else if (isdir) type = FILE_TYPE_DIRECTORY; @@ -259,8 +261,7 @@ receiveFileChunks(const char *sql) } if (PQgetisnull(res, 0, 0) || - PQgetisnull(res, 0, 1) || - PQgetisnull(res, 0, 2)) + PQgetisnull(res, 0, 1)) { pg_fatal("unexpected null values in result while fetching remote files\n"); } @@ -278,6 +279,20 @@ receiveFileChunks(const char *sql) memcpy(filename, PQgetvalue(res, 0, 0), filenamelen); filename[filenamelen] = '\0'; + /* + * It may be possible that a file has been deleted on remote side after + * creating the file map. In this case simply ignore it and move on. + */ + if (PQgetisnull(res, 0, 2)) + { + pg_log(PG_DEBUG, + "received NULL chunk for file \"%s\", file has been deleted\n", + filename); + pg_free(filename); + PQclear(res); + continue; + } + chunk = PQgetvalue(res, 0, 2); pg_log(PG_DEBUG, "received chunk for file \"%s\", offset %d, size %d\n", @@ -445,7 +460,7 @@ libpq_executeFileMap(filemap_t *map) */ sql = "SELECT path, begin, \n" - " pg_read_binary_file(path, begin, len) AS chunk\n" + " pg_read_binary_file(path, begin, len, true) AS chunk\n" "FROM fetchchunks\n"; receiveFileChunks(sql); -- 2.4.4
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers