Thanks for the review! I agree with all the code cleanups suggested and have made then in the attached patch, to save the committer some time.
Also in this patch, I changed sgml para to <para> Changing table-level options requires superuser privileges, for security reasons: only a superuser should be able to determine which file is read or which program is run. In principle non-superusers could be allowed to change the other options, but that's not supported at present. </para> "Determine" is an odd word in this context. I understand it to mean "set/change", but I can see where a less familiar reader would take the meaning to be "has permission to see the value already set". Either way, it now mentions program as an option in addition to filename. On Mon, Sep 12, 2016 at 1:59 AM, Amit Langote <langote_amit...@lab.ntt.co.jp > wrote: > On 2016/09/11 8:04, Corey Huinker wrote: > > V2 of this patch: > > > > Changes: > > * rebased to most recent master > > * removed non-tap test that assumed the existence of Unix sed program > > * added non-tap test that assumes the existence of perl > > * switched from filename/program to filename/is_program to more closely > > follow patterns in copy.c > > * slight wording change in C comments > > This version looks mostly good to me. Except some whitespace noise in > some hunks: > > @@ -139,7 +142,9 @@ static bool fileIsForeignScanParallelSafe(PlannerInfo > *root, RelOptInfo *rel, > */ > static bool is_valid_option(const char *option, Oid context); > static void fileGetOptions(Oid foreigntableid, > - char **filename, List **other_options); > + char **filename, > + bool *is_program, > > Space after "is_program," > > @@ -196,16 +201,17 @@ file_fdw_validator(PG_FUNCTION_ARGS) > > /* > * Only superusers are allowed to set options of a file_fdw foreign > table. > - * This is because the filename is one of those options, and we don't > want > - * non-superusers to be able to determine which file gets read. > + * The reason for this is to prevent non-superusers from changing the > > Space after "the" > > - if (stat(filename, &stat_buf) == 0) > + if ((! is_program) && (stat(filename, &stat_buf) == 0))) > > Space between ! and is_program. > > > - if (strcmp(def->defname, "filename") == 0) > + if ((strcmp(def->defname, "filename") == 0) || > (strcmp(def->defname, "program") == 0)) > > I think the usual style would be to split the if statement into two lines > as follows to keep within 80 characters per line [1]: > > + if ((strcmp(def->defname, "filename") == 0) || > + (strcmp(def->defname, "program") == 0)) > > And likewise for: > > - &fdw_private->filename, &fdw_private->options); > + &fdw_private->filename, &fdw_private->is_program, > &fdw_private->options); > > By the way, doesn't the following paragraph in file-fdw.sgml need an > update? > > <para> > Changing table-level options requires superuser privileges, for security > reasons: only a superuser should be able to determine which file is read. > In principle non-superusers could be allowed to change the other options, > but that's not supported at present. > </para> > > > I would like to mark this now as "Ready for Committer". > > Thanks, > Amit > > [1] https://www.postgresql.org/docs/devel/static/source-format.html > > >
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index b471991..7f534b1 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -59,6 +59,7 @@ struct FileFdwOption static const struct FileFdwOption valid_options[] = { /* File options */ {"filename", ForeignTableRelationId}, + {"program", ForeignTableRelationId}, /* Format options */ /* oids option is not supported */ @@ -85,10 +86,11 @@ static const struct FileFdwOption valid_options[] = { */ typedef struct FileFdwPlanState { - char *filename; /* file to read */ - List *options; /* merged COPY options, excluding filename */ - BlockNumber pages; /* estimate of file's physical size */ - double ntuples; /* estimate of number of rows in file */ + char *filename; /* file/program to read */ + bool is_program; /* true if filename represents an OS command */ + List *options; /* merged COPY options, excluding filename and program */ + BlockNumber pages; /* estimate of file or program output's physical size */ + double ntuples; /* estimate of number of rows in file/program output */ } FileFdwPlanState; /* @@ -96,9 +98,10 @@ typedef struct FileFdwPlanState */ typedef struct FileFdwExecutionState { - char *filename; /* file to read */ - List *options; /* merged COPY options, excluding filename */ - CopyState cstate; /* state of reading file */ + char *filename; /* file/program to read */ + bool is_program; /* true if filename represents an OS command */ + List *options; /* merged COPY options, excluding filename and is_program */ + CopyState cstate; /* state of reading file or program */ } FileFdwExecutionState; /* @@ -139,7 +142,9 @@ static bool fileIsForeignScanParallelSafe(PlannerInfo *root, RelOptInfo *rel, */ static bool is_valid_option(const char *option, Oid context); static void fileGetOptions(Oid foreigntableid, - char **filename, List **other_options); + char **filename, + bool *is_program, + List **other_options); static List *get_file_fdw_attribute_options(Oid relid); static bool check_selective_binary_conversion(RelOptInfo *baserel, Oid foreigntableid, @@ -196,16 +201,17 @@ file_fdw_validator(PG_FUNCTION_ARGS) /* * Only superusers are allowed to set options of a file_fdw foreign table. - * This is because the filename is one of those options, and we don't want - * non-superusers to be able to determine which file gets read. + * The reason for this is to prevent non-superusers from changing the + * definition to access an arbitrary file not visible to that user + * or to run programs not accessible to that user. * * Putting this sort of permissions check in a validator is a bit of a * crock, but there doesn't seem to be any other place that can enforce * the check more cleanly. * - * Note that the valid_options[] array disallows setting filename at any - * options level other than foreign table --- otherwise there'd still be a - * security hole. + * Note that the valid_options[] array disallows setting filename and + * program at any options level other than foreign table --- otherwise + * there'd still be a security hole. */ if (catalog == ForeignTableRelationId && !superuser()) ereport(ERROR, @@ -247,11 +253,12 @@ file_fdw_validator(PG_FUNCTION_ARGS) } /* - * Separate out filename and column-specific options, since + * Separate out filename, program, and column-specific options, since * ProcessCopyOptions won't accept them. */ - if (strcmp(def->defname, "filename") == 0) + if ((strcmp(def->defname, "filename") == 0) || + (strcmp(def->defname, "program") == 0)) { if (filename) ereport(ERROR, @@ -296,12 +303,13 @@ file_fdw_validator(PG_FUNCTION_ARGS) ProcessCopyOptions(NULL, NULL, true, other_options); /* - * Filename option is required for file_fdw foreign tables. + * Either filename or program option is required for file_fdw foreign + * tables. */ if (catalog == ForeignTableRelationId && filename == NULL) ereport(ERROR, (errcode(ERRCODE_FDW_DYNAMIC_PARAMETER_VALUE_NEEDED), - errmsg("filename is required for file_fdw foreign tables"))); + errmsg("either filename or program is required for file_fdw foreign tables"))); PG_RETURN_VOID(); } @@ -326,12 +334,13 @@ is_valid_option(const char *option, Oid context) /* * Fetch the options for a file_fdw foreign table. * - * We have to separate out "filename" from the other options because - * it must not appear in the options list passed to the core COPY code. + * We have to separate out "filename" and "program" from the other options + * because it must not appear in the options list passed to the core COPY + * code. */ static void fileGetOptions(Oid foreigntableid, - char **filename, List **other_options) + char **filename, bool *is_program, List **other_options) { ForeignTable *table; ForeignServer *server; @@ -359,9 +368,10 @@ fileGetOptions(Oid foreigntableid, options = list_concat(options, get_file_fdw_attribute_options(foreigntableid)); /* - * Separate out the filename. + * Separate out the filename and program. */ *filename = NULL; + *is_program = false; prev = NULL; foreach(lc, options) { @@ -373,6 +383,13 @@ fileGetOptions(Oid foreigntableid, options = list_delete_cell(options, lc, prev); break; } + else if (strcmp(def->defname, "program") == 0) + { + *filename = defGetString(def); + *is_program = true; + options = list_delete_cell(options, lc, prev); + break; + } prev = lc; } @@ -381,7 +398,7 @@ fileGetOptions(Oid foreigntableid, * options, but check again, just in case. */ if (*filename == NULL) - elog(ERROR, "filename is required for file_fdw foreign tables"); + elog(ERROR, "either filename or program is required for file_fdw foreign tables"); *other_options = options; } @@ -475,12 +492,15 @@ fileGetForeignRelSize(PlannerInfo *root, FileFdwPlanState *fdw_private; /* - * Fetch options. We only need filename at this point, but we might as - * well get everything and not need to re-fetch it later in planning. + * Fetch options. We only need filename (or program) at this point, but + * we might as well get everything and not need to re-fetch it later in + * planning. */ fdw_private = (FileFdwPlanState *) palloc(sizeof(FileFdwPlanState)); fileGetOptions(foreigntableid, - &fdw_private->filename, &fdw_private->options); + &fdw_private->filename, + &fdw_private->is_program, + &fdw_private->options); baserel->fdw_private = (void *) fdw_private; /* Estimate relation size */ @@ -583,20 +603,27 @@ static void fileExplainForeignScan(ForeignScanState *node, ExplainState *es) { char *filename; + bool is_program; List *options; - /* Fetch options --- we only need filename at this point */ + /* Fetch options --- we only need filename and is_program at this point */ fileGetOptions(RelationGetRelid(node->ss.ss_currentRelation), - &filename, &options); + &filename, &is_program, &options); - ExplainPropertyText("Foreign File", filename, es); + if(filename) + { + if (is_program) + ExplainPropertyText("Foreign Program", filename, es); + else + ExplainPropertyText("Foreign File", filename, es); + } /* Suppress file size if we're not showing cost details */ if (es->costs) { struct stat stat_buf; - if (stat(filename, &stat_buf) == 0) + if ((!is_program) && (stat(filename, &stat_buf) == 0)) ExplainPropertyLong("Foreign File Size", (long) stat_buf.st_size, es); } @@ -611,6 +638,7 @@ fileBeginForeignScan(ForeignScanState *node, int eflags) { ForeignScan *plan = (ForeignScan *) node->ss.ps.plan; char *filename; + bool is_program; List *options; CopyState cstate; FileFdwExecutionState *festate; @@ -623,7 +651,7 @@ fileBeginForeignScan(ForeignScanState *node, int eflags) /* Fetch options of foreign table */ fileGetOptions(RelationGetRelid(node->ss.ss_currentRelation), - &filename, &options); + &filename, &is_program, &options); /* Add any options from the plan (currently only convert_selectively) */ options = list_concat(options, plan->fdw_private); @@ -633,11 +661,11 @@ fileBeginForeignScan(ForeignScanState *node, int eflags) * as to match the expected ScanTupleSlot signature. */ cstate = BeginCopyFrom(NULL, - node->ss.ss_currentRelation, - filename, - false, - NIL, - options); + node->ss.ss_currentRelation, + filename, + is_program, + NIL, + options); /* * Save state in node->fdw_state. We must save enough information to call @@ -645,6 +673,7 @@ fileBeginForeignScan(ForeignScanState *node, int eflags) */ festate = (FileFdwExecutionState *) palloc(sizeof(FileFdwExecutionState)); festate->filename = filename; + festate->is_program = is_program; festate->options = options; festate->cstate = cstate; @@ -709,7 +738,7 @@ fileReScanForeignScan(ForeignScanState *node) festate->cstate = BeginCopyFrom(NULL, node->ss.ss_currentRelation, festate->filename, - false, + festate->is_program, NIL, festate->options); } @@ -738,11 +767,19 @@ fileAnalyzeForeignTable(Relation relation, BlockNumber *totalpages) { char *filename; + bool is_program; List *options; struct stat stat_buf; /* Fetch options of foreign table */ - fileGetOptions(RelationGetRelid(relation), &filename, &options); + fileGetOptions(RelationGetRelid(relation), &filename, &is_program, &options); + + /* + * If this is a program instead of a file, just return false to skip + * analyzing the table. + */ + if (is_program) + return false; /* * Get size of the file. (XXX if we fail here, would it be better to just @@ -916,9 +953,10 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel, /* * Get size of the file. It might not be there at plan time, though, in - * which case we have to use a default estimate. + * which case we have to use a default estimate. We also have to fall + * back to the default if using a program as the input. */ - if (stat(fdw_private->filename, &stat_buf) < 0) + if (fdw_private->is_program || stat(fdw_private->filename, &stat_buf) < 0) stat_buf.st_size = 10 * BLCKSZ; /* @@ -1036,6 +1074,7 @@ file_acquire_sample_rows(Relation onerel, int elevel, bool *nulls; bool found; char *filename; + bool is_program; List *options; CopyState cstate; ErrorContextCallback errcallback; @@ -1050,12 +1089,12 @@ file_acquire_sample_rows(Relation onerel, int elevel, nulls = (bool *) palloc(tupDesc->natts * sizeof(bool)); /* Fetch options of foreign table */ - fileGetOptions(RelationGetRelid(onerel), &filename, &options); + fileGetOptions(RelationGetRelid(onerel), &filename, &is_program, &options); /* * Create CopyState from FDW options. */ - cstate = BeginCopyFrom(NULL, onerel, filename, false, NIL, options); + cstate = BeginCopyFrom(NULL, onerel, filename, is_program, NIL, options); /* * Use per-tuple memory context to prevent leak of memory used to read diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source index 685561f..59a983d 100644 --- a/contrib/file_fdw/input/file_fdw.source +++ b/contrib/file_fdw/input/file_fdw.source @@ -185,6 +185,17 @@ SET ROLE regress_file_fdw_user; ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text'); SET ROLE regress_file_fdw_superuser; +-- PROGRAM test +CREATE FOREIGN TABLE program_test( + a text, + b integer, + c date +) SERVER file_server +OPTIONS(program 'perl -e ''print "abc\t123\t1994-09-10\n"'''); + +SELECT * FROM program_test; +DROP FOREIGN TABLE program_test; + -- cleanup RESET ROLE; DROP EXTENSION file_fdw CASCADE; diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source index 6fa5440..7dae388 100644 --- a/contrib/file_fdw/output/file_fdw.source +++ b/contrib/file_fdw/output/file_fdw.source @@ -76,7 +76,7 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null ' '); -- ERROR ERROR: COPY null representation cannot use newline or carriage return CREATE FOREIGN TABLE tbl () SERVER file_server; -- ERROR -ERROR: filename is required for file_fdw foreign tables +ERROR: either filename or program is required for file_fdw foreign tables CREATE FOREIGN TABLE agg_text ( a int2 CHECK (a >= 0), b float4 @@ -132,7 +132,7 @@ ERROR: invalid option "force_not_null" HINT: There are no valid options in this context. CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_not_null '*'); -- ERROR ERROR: invalid option "force_not_null" -HINT: Valid options in this context are: filename, format, header, delimiter, quote, escape, null, encoding +HINT: Valid options in this context are: filename, program, format, header, delimiter, quote, escape, null, encoding -- force_null is not allowed to be specified at any foreign object level: ALTER FOREIGN DATA WRAPPER file_fdw OPTIONS (ADD force_null '*'); -- ERROR ERROR: invalid option "force_null" @@ -145,7 +145,7 @@ ERROR: invalid option "force_null" HINT: There are no valid options in this context. CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_null '*'); -- ERROR ERROR: invalid option "force_null" -HINT: Valid options in this context are: filename, format, header, delimiter, quote, escape, null, encoding +HINT: Valid options in this context are: filename, program, format, header, delimiter, quote, escape, null, encoding -- basic query tests SELECT * FROM agg_text WHERE b > 10.0 ORDER BY a; a | b @@ -341,6 +341,20 @@ SET ROLE regress_file_fdw_user; ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text'); ERROR: only superuser can change options of a file_fdw foreign table SET ROLE regress_file_fdw_superuser; +-- PROGRAM test +CREATE FOREIGN TABLE program_test( + a text, + b integer, + c date +) SERVER file_server +OPTIONS(program 'perl -e ''print "abc\t123\t1994-09-10\n"'''); +SELECT * FROM program_test; + a | b | c +-----+-----+------------ + abc | 123 | 09-10-1994 +(1 row) + +DROP FOREIGN TABLE program_test; -- cleanup RESET ROLE; DROP EXTENSION file_fdw CASCADE; diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml index d3b39aa..4937be8 100644 --- a/doc/src/sgml/file-fdw.sgml +++ b/doc/src/sgml/file-fdw.sgml @@ -27,7 +27,26 @@ <listitem> <para> - Specifies the file to be read. Required. Must be an absolute path name. + Specifies the file to be read. Must be an absolute path name. + Either <literal>filename</literal> or <literal>program</literal> must be + specified. They are mutually exclusive. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>program</literal></term> + + <listitem> + <para> + Specifies the command to executed. + Note that the command is invoked by the shell, so if you need to pass any + arguments to shell command that come from an untrusted source, you must + be careful to strip or escape any special characters that might have a + special meaning for the shell. For security reasons, it is best to use a + fixed command string, or at least avoid passing any user input in it. + Either <literal>program</literal> or <literal>filename</literal> must be + specified. They are mutually exclusive. </para> </listitem> </varlistentry> @@ -171,9 +190,9 @@ <para> Changing table-level options requires superuser privileges, for security - reasons: only a superuser should be able to determine which file is read. - In principle non-superusers could be allowed to change the other options, - but that's not supported at present. + reasons: only a superuser should be able to determine which file is read + or which program is run. In principle non-superusers could be allowed to + change the other options, but that's not supported at present. </para> <para>
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers