On 22.02.2013 12:43, Etsuro Fujita wrote:
1. "Broken pipe" is not handled in case of psql "\copy" command;
     Issue are as follows:
         Following are verified on SuSE-Linux 10.2.
         1) psql is exiting when "\COPY xxx TO" command is issued
and
command/script is not found
                  When popen is called in write mode it is creating
valid
file descriptor and when it tries to write to file "Broken pipe"
error
is>  coming which is not handled.
                         psql# \copy pgbench_accounts TO PROGRAM
'../compress.sh pgbench_accounts4.txt'
         2) When "\copy" command is in progress then
program/command
is
killed/"crashed due to any problem"
            psql is exiting.

This is a headache.  I have no idea how to solve this.

I think we can keep it for committer to take a call on this issue.

Agreed.

Fortunately this one is easy. We just need to ignore SIGPIPE, by calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the same in psql when the query output is piped to a pager, in PageOutput.

5. Following in copy.sgml can be changed to make more meaningful as
the first line looks little adhoc.
+<para>
+      The command that input comes from or that output goes to.
+      The command for COPY FROM, which input comes from, must write
+ its
output
+      to standard output.  The command for COPY TO, which output
goes
+ to,
must
+      read its input from standard input.
+</para>

I've struggled to make the document more meaningful.

To be honest, I am not sure whether introducing pre, post processor
terminology is right or not,
But again I shall let committer decide about this point.

Agreed.

I changed the terminology to use terms like "the command specified with PROGRAM", instead of talking about pre- and post-processsors.

I have one small another doubt that in function parse_slash_copy, you
avoided expand tilde
for program case, which I am not sure is the right thing or not.

Sorry, I'm not sure that, too.  I'd like to leave this for committers.

That seems correct. The shell will do tilde expansion with popen(), we don't need to do it ourselves.

There's one oddity in the psql \copy syntax. This is actually present in earlier versions too, but I think we should fix it now that we extend the syntax:

-- Writes to standard output. There's no way to write to a file called "stdout".
 \copy foo to 'stdout'

I think we should change the interpretation of the above so that it writes to a file called "stdout". It's possible that there's an application out there that relies on that to write to stdout, but it's not hard to fix the application. A small note in the release notes might be in order.

Also, I think we should require the command to be quoted in \copy. Ie. let's forbid this:

\copy pgbench_accounts to program /bin/cat>/dev/null

and require that it's written as:

\copy pgbench_accounts to program '/bin/cat>/dev/null'

We don't require quoting for filenames, which I find a bit weird too, but it seems particularly confusing for a shell command.

Attached is a new version of this patch that I almost committed, but one thing caught my eye at the last minute: The error reporting is not very user-friendly. If the program fails after reading/writing some rows, the reason is printed to the log, but not the user:

postgres=# copy foo to program '/tmp/midway-crash';
ERROR:  could not execute command "/tmp/midway-crash"

the log has more details:

LOG:  child process exited with exit code 10
STATEMENT:  copy foo to program '/tmp/midway-crash';
ERROR:  could not execute command "/tmp/midway-crash"
STATEMENT:  copy foo to program '/tmp/midway-crash';

I think we should arrange for the "child process exited with exit code 10" to be printed as errdetail to the client. Similarly, with psql \copy command, the "child process exited with exit code 10" command shouldn't be printed straight to stderr, but should go through psql_error.

I'll try to refactor pclose_check tomorrow to do that. Meanwhile, please take a look at the attached if you have the time. I rewrote much of the docs changes, and some comments.

- Heikki
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index d644a46..d1cca1e 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -588,6 +588,7 @@ fileBeginForeignScan(ForeignScanState *node, int eflags)
 	 */
 	cstate = BeginCopyFrom(node->ss.ss_currentRelation,
 						   filename,
+						   false,
 						   NIL,
 						   options);
 
@@ -660,6 +661,7 @@ fileReScanForeignScan(ForeignScanState *node)
 
 	festate->cstate = BeginCopyFrom(node->ss.ss_currentRelation,
 									festate->filename,
+									false,
 									NIL,
 									festate->options);
 }
@@ -993,7 +995,7 @@ file_acquire_sample_rows(Relation onerel, int elevel,
 	/*
 	 * Create CopyState from FDW options.
 	 */
-	cstate = BeginCopyFrom(onerel, filename, NIL, options);
+	cstate = BeginCopyFrom(onerel, filename, false, NIL, options);
 
 	/*
 	 * Use per-tuple memory context to prevent leak of memory used to read
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index 0e7b322..576fd65 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -3514,6 +3514,13 @@
     <entry>reserved</entry>
    </row>
    <row>
+    <entry><token>PROGRAM</token></entry>
+    <entry>non-reserved</entry>
+    <entry></entry>
+    <entry></entry>
+    <entry></entry>
+   </row>
+   <row>
     <entry><token>PUBLIC</token></entry>
     <entry></entry>
     <entry>non-reserved</entry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 2137c67..0b4287a 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -23,11 +23,11 @@ PostgreSQL documentation
  <refsynopsisdiv>
 <synopsis>
 COPY <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
-    FROM { '<replaceable class="parameter">filename</replaceable>' | STDIN }
+    FROM { '<replaceable class="parameter">filename</replaceable>' | PROGRAM '<replaceable class="parameter">command</replaceable>' | STDIN }
     [ [ WITH ] ( <replaceable class="parameter">option</replaceable> [, ...] ) ]
 
 COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] | ( <replaceable class="parameter">query</replaceable> ) }
-    TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
+    TO { '<replaceable class="parameter">filename</replaceable>' | PROGRAM '<replaceable class="parameter">command</replaceable>' | STDOUT }
     [ [ WITH ] ( <replaceable class="parameter">option</replaceable> [, ...] ) ]
 
 <phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
@@ -72,6 +72,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    <productname>PostgreSQL</productname> server to directly read from
    or write to a file. The file must be accessible to the server and
    the name must be specified from the viewpoint of the server. When
+   <literal>PROGRAM</literal> is specified, the server executes the
+   given command, and reads from its standard input, or writes to its
+   standard output. The command must be specified from the viewpoint of the
+   server, and be executable by the <literal>postgres</> user. When
    <literal>STDIN</literal> or <literal>STDOUT</literal> is
    specified, data is transmitted via the connection between the
    client and the server.
@@ -126,6 +130,25 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    </varlistentry>
 
    <varlistentry>
+    <term><literal>PROGRAM</literal></term>
+    <listitem>
+     <para>
+      A command to execute. In <command>COPY FROM</command>, the input is
+      read from standard output of the command, and in <command>COPY TO</>,
+      the output is written to the standard input of the command.
+     </para>
+     <para>
+      Note that the command is invoked by the shell, so if you need to pass
+      any arguments to shell command that come from an untrusted source, you
+      must be careful to strip or escape any special characters that might
+      have a special meaning for the shell. For security reasons, it is best
+      to use a fixed command string, or at least avoid passing any user input
+      in it.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>STDIN</literal></term>
     <listitem>
      <para>
@@ -367,9 +390,13 @@ COPY <replaceable class="parameter">count</replaceable>
     they must reside on or be accessible to the database server machine,
     not the client. They must be accessible to and readable or writable
     by the <productname>PostgreSQL</productname> user (the user ID the
-    server runs as), not the client. <command>COPY</command> naming a
-    file is only allowed to database superusers, since it allows reading
-    or writing any file that the server has privileges to access.
+    server runs as), not the client. Similarly,
+    the command specified with <literal>PROGRAM</literal> is executed directly
+    by the server, not by the client application, must be executable by the
+    <productname>PostgreSQL</productname> user.
+    <command>COPY</command> naming a file or command is only allowed to
+    database superusers, since it allows reading or writing any file that the
+    server has privileges to access.
    </para>
 
    <para>
@@ -394,6 +421,12 @@ COPY <replaceable class="parameter">count</replaceable>
    </para>
 
    <para>
+    Executing a command with <literal>PROGRAM</literal> might be
+    restricted by operating systems that implement access control for their
+    resources, such as the SELinux.
+   </para>
+
+   <para>
     <command>COPY FROM</command> will invoke any triggers and check
     constraints on the destination table. However, it will not invoke rules.
    </para>
@@ -842,6 +875,14 @@ COPY (SELECT * FROM country WHERE country_name LIKE 'A%') TO '/usr1/proj/bray/sq
   </para>
 
   <para>
+   To copy into a compressed file, you can pipe the output through an external
+   compression program:
+<programlisting>
+COPY country TO PROGRAM 'gzip > /usr1/proj/bray/sql/country_data.gz';
+</programlisting>
+  </para>
+
+  <para>
    Here is a sample of data suitable for copying into a table from
    <literal>STDIN</literal>:
 <programlisting>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 465d3a1..ca64082 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -830,7 +830,7 @@ testdb=&gt;
       <varlistentry id="APP-PSQL-meta-commands-copy">
         <term><literal>\copy { <replaceable class="parameter">table</replaceable> [ ( <replaceable class="parameter">column_list</replaceable> ) ] | ( <replaceable class="parameter">query</replaceable> ) }
         { <literal>from</literal> | <literal>to</literal> }
-        { <replaceable class="parameter">filename</replaceable> | stdin | stdout | pstdin | pstdout }
+        { <replaceable class="parameter">filename</replaceable> | program <replaceable class="parameter">command</replaceable> | stdin | stdout | pstdin | pstdout }
         [ [ with ] ( <replaceable class="parameter">option</replaceable> [, ...] ) ]</literal></term>
 
         <listitem>
@@ -847,6 +847,18 @@ testdb=&gt;
         </para>
 
         <para>
+        When <literal>program</> is specified,
+        <replaceable class="parameter">command</replaceable> is
+        executed by <application>psql</application>,
+        not the server, and the data from or to
+        <replaceable class="parameter">command</replaceable> is
+        routed between the server and the client.
+        This means that the execution privileges are those of
+        the local user, not the server, and no SQL superuser
+        privileges are required.
+        </para>
+
+        <para>
         The syntax of the command is similar to that of the
         <acronym>SQL</acronym> <xref linkend="sql-copy">
         command, and
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 523c1e0..0818581 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -108,6 +108,7 @@ typedef struct CopyStateData
 	QueryDesc  *queryDesc;		/* executable query to copy from */
 	List	   *attnumlist;		/* integer list of attnums to copy */
 	char	   *filename;		/* filename, or NULL for STDIN/STDOUT */
+	bool		is_program;		/* is 'filename' a program to popen? */
 	bool		binary;			/* binary format? */
 	bool		oids;			/* include OIDs? */
 	bool		freeze;			/* freeze rows on loading? */
@@ -278,7 +279,8 @@ static CopyState BeginCopy(bool is_from, Relation rel, Node *raw_query,
 		  const char *queryString, List *attnamelist, List *options);
 static void EndCopy(CopyState cstate);
 static CopyState BeginCopyTo(Relation rel, Node *query, const char *queryString,
-			const char *filename, List *attnamelist, List *options);
+			const char *filename, bool is_program, List *attnamelist,
+			List *options);
 static void EndCopyTo(CopyState cstate);
 static uint64 DoCopyTo(CopyState cstate);
 static uint64 CopyTo(CopyState cstate);
@@ -752,13 +754,22 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
 	Relation	rel;
 	Oid         relid;
 
-	/* Disallow file COPY except to superusers. */
+	/* Disallow COPY to/from file or program except to superusers. */
 	if (!pipe && !superuser())
-		ereport(ERROR,
-				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-				 errmsg("must be superuser to COPY to or from a file"),
-				 errhint("Anyone can COPY to stdout or from stdin. "
-						 "psql's \\copy command also works for anyone.")));
+	{
+		if (stmt->is_program)
+			ereport(ERROR,
+					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+			 errmsg("must be superuser to COPY to or from an external program"),
+					 errhint("Anyone can COPY to stdout or from stdin. "
+							 "psql's \\copy command also works for anyone.")));
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+					 errmsg("must be superuser to COPY to or from a file"),
+					 errhint("Anyone can COPY to stdout or from stdin. "
+							 "psql's \\copy command also works for anyone.")));
+	}
 
 	if (stmt->relation)
 	{
@@ -812,14 +823,15 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
 		if (XactReadOnly && !rel->rd_islocaltemp)
 			PreventCommandIfReadOnly("COPY FROM");
 
-		cstate = BeginCopyFrom(rel, stmt->filename,
+		cstate = BeginCopyFrom(rel, stmt->filename, stmt->is_program,
 							   stmt->attlist, stmt->options);
 		*processed = CopyFrom(cstate);	/* copy from file to database */
 		EndCopyFrom(cstate);
 	}
 	else
 	{
-		cstate = BeginCopyTo(rel, stmt->query, queryString, stmt->filename,
+		cstate = BeginCopyTo(rel, stmt->query, queryString,
+							 stmt->filename, stmt->is_program,
 							 stmt->attlist, stmt->options);
 		*processed = DoCopyTo(cstate);	/* copy from database to file */
 		EndCopyTo(cstate);
@@ -1395,11 +1407,21 @@ BeginCopy(bool is_from,
 static void
 EndCopy(CopyState cstate)
 {
-	if (cstate->filename != NULL && FreeFile(cstate->copy_file))
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not close file \"%s\": %m",
-						cstate->filename)));
+	if (cstate->is_program)
+	{
+		if (ClosePipeStream(cstate->copy_file) == -1)
+			ereport(ERROR,
+					(errmsg("could not execute command \"%s\"",
+							cstate->filename)));
+	}
+	else
+	{
+		if (cstate->filename != NULL && FreeFile(cstate->copy_file))
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not close file \"%s\": %m",
+							cstate->filename)));
+	}
 
 	MemoryContextDelete(cstate->copycontext);
 	pfree(cstate);
@@ -1413,6 +1435,7 @@ BeginCopyTo(Relation rel,
 			Node *query,
 			const char *queryString,
 			const char *filename,
+			bool  is_program,
 			List *attnamelist,
 			List *options)
 {
@@ -1451,39 +1474,52 @@ BeginCopyTo(Relation rel,
 
 	if (pipe)
 	{
+		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
 	}
 	else
 	{
-		mode_t		oumask;		/* Pre-existing umask value */
-		struct stat st;
+		cstate->filename = pstrdup(filename);
+		cstate->is_program = is_program;
 
-		/*
-		 * Prevent write to relative path ... too easy to shoot oneself in the
-		 * foot by overwriting a database file ...
-		 */
-		if (!is_absolute_path(filename))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_NAME),
-					 errmsg("relative path not allowed for COPY to file")));
+		if (is_program)
+		{
+			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
+			if (cstate->copy_file == NULL)
+				ereport(ERROR,
+						(errmsg("could not execute command \"%s\": %m",
+								cstate->filename)));
+		}
+		else
+		{
+			mode_t		oumask;		/* Pre-existing umask value */
+			struct stat st;
 
-		cstate->filename = pstrdup(filename);
-		oumask = umask(S_IWGRP | S_IWOTH);
-		cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_W);
-		umask(oumask);
+			/*
+			 * Prevent write to relative path ... too easy to shoot oneself in
+			 * the foot by overwriting a database file ...
+			 */
+			if (!is_absolute_path(filename))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_NAME),
+						 errmsg("relative path not allowed for COPY to file")));
 
-		if (cstate->copy_file == NULL)
-			ereport(ERROR,
-					(errcode_for_file_access(),
-					 errmsg("could not open file \"%s\" for writing: %m",
-							cstate->filename)));
+			oumask = umask(S_IWGRP | S_IWOTH);
+			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_W);
+			umask(oumask);
+			if (cstate->copy_file == NULL)
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not open file \"%s\" for writing: %m",
+								cstate->filename)));
 
-		fstat(fileno(cstate->copy_file), &st);
-		if (S_ISDIR(st.st_mode))
-			ereport(ERROR,
-					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-					 errmsg("\"%s\" is a directory", cstate->filename)));
+			fstat(fileno(cstate->copy_file), &st);
+			if (S_ISDIR(st.st_mode))
+				ereport(ERROR,
+						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+						 errmsg("\"%s\" is a directory", cstate->filename)));
+		}
 	}
 
 	MemoryContextSwitchTo(oldcontext);
@@ -2317,6 +2353,7 @@ CopyFromInsertBatch(CopyState cstate, EState *estate, CommandId mycid,
 CopyState
 BeginCopyFrom(Relation rel,
 			  const char *filename,
+			  bool	is_program,
 			  List *attnamelist,
 			  List *options)
 {
@@ -2413,9 +2450,11 @@ BeginCopyFrom(Relation rel,
 	cstate->defexprs = defexprs;
 	cstate->volatile_defexprs = volatile_defexprs;
 	cstate->num_defaults = num_defaults;
+	cstate->is_program = is_program;
 
 	if (pipe)
 	{
+		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
 		else
@@ -2423,22 +2462,33 @@ BeginCopyFrom(Relation rel,
 	}
 	else
 	{
-		struct stat st;
-
 		cstate->filename = pstrdup(filename);
-		cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 
-		if (cstate->copy_file == NULL)
-			ereport(ERROR,
-					(errcode_for_file_access(),
-					 errmsg("could not open file \"%s\" for reading: %m",
-							cstate->filename)));
+		if (cstate->is_program)
+		{
+			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
+			if (cstate->copy_file == NULL)
+				ereport(ERROR,
+						(errmsg("could not execute command \"%s\": %m",
+								cstate->filename)));
+		}
+		else
+		{
+			struct stat st;
 
-		fstat(fileno(cstate->copy_file), &st);
-		if (S_ISDIR(st.st_mode))
-			ereport(ERROR,
-					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-					 errmsg("\"%s\" is a directory", cstate->filename)));
+			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
+			if (cstate->copy_file == NULL)
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not open file \"%s\" for reading: %m",
+								cstate->filename)));
+
+			fstat(fileno(cstate->copy_file), &st);
+			if (S_ISDIR(st.st_mode))
+				ereport(ERROR,
+						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+						 errmsg("\"%s\" is a directory", cstate->filename)));
+		}
 	}
 
 	if (!cstate->binary)
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 2da08d1..23ec88d 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2703,6 +2703,7 @@ _copyCopyStmt(const CopyStmt *from)
 	COPY_NODE_FIELD(query);
 	COPY_NODE_FIELD(attlist);
 	COPY_SCALAR_FIELD(is_from);
+	COPY_SCALAR_FIELD(is_program);
 	COPY_STRING_FIELD(filename);
 	COPY_NODE_FIELD(options);
 
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 9e313c8..99c034a 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1090,6 +1090,7 @@ _equalCopyStmt(const CopyStmt *a, const CopyStmt *b)
 	COMPARE_NODE_FIELD(query);
 	COMPARE_NODE_FIELD(attlist);
 	COMPARE_SCALAR_FIELD(is_from);
+	COMPARE_SCALAR_FIELD(is_program);
 	COMPARE_STRING_FIELD(filename);
 	COMPARE_NODE_FIELD(options);
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b998431..d3009b6 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -381,7 +381,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <boolean> opt_freeze opt_default opt_recheck
 %type <defelt>	opt_binary opt_oids copy_delimiter
 
-%type <boolean> copy_from
+%type <boolean> copy_from opt_program
 
 %type <ival>	opt_column event cursor_options opt_hold opt_set_data
 %type <objtype>	reindex_type drop_type comment_type security_label_type
@@ -568,7 +568,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 	PARSER PARTIAL PARTITION PASSING PASSWORD PLACING PLANS POSITION
 	PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
-	PRIOR PRIVILEGES PROCEDURAL PROCEDURE
+	PRIOR PRIVILEGES PROCEDURAL PROCEDURE PROGRAM
 
 	QUOTE
 
@@ -2309,7 +2309,10 @@ ClosePortalStmt:
  *
  *		QUERY :
  *				COPY relname [(columnList)] FROM/TO file [WITH] [(options)]
- *				COPY ( SELECT ... ) TO file [WITH] [(options)]
+ *				COPY ( SELECT ... ) TO file	[WITH] [(options)]
+ *
+ *				where 'file' can be one of:
+ *				{ PROGRAM 'command' | STDIN | STDOUT | 'filename' }
  *
  *				In the preferred syntax the options are comma-separated
  *				and use generic identifiers instead of keywords.  The pre-9.0
@@ -2324,14 +2327,21 @@ ClosePortalStmt:
  *****************************************************************************/
 
 CopyStmt:	COPY opt_binary qualified_name opt_column_list opt_oids
-			copy_from copy_file_name copy_delimiter opt_with copy_options
+			copy_from opt_program copy_file_name copy_delimiter opt_with copy_options
 				{
 					CopyStmt *n = makeNode(CopyStmt);
 					n->relation = $3;
 					n->query = NULL;
 					n->attlist = $4;
 					n->is_from = $6;
-					n->filename = $7;
+					n->is_program = $7;
+					n->filename = $8;
+
+					if (n->is_program && n->filename == NULL)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("STDIN/STDOUT not allowed with PROGRAM"),
+								 parser_errposition(@8)));
 
 					n->options = NIL;
 					/* Concatenate user-supplied flags */
@@ -2339,21 +2349,29 @@ CopyStmt:	COPY opt_binary qualified_name opt_column_list opt_oids
 						n->options = lappend(n->options, $2);
 					if ($5)
 						n->options = lappend(n->options, $5);
-					if ($8)
-						n->options = lappend(n->options, $8);
-					if ($10)
-						n->options = list_concat(n->options, $10);
+					if ($9)
+						n->options = lappend(n->options, $9);
+					if ($11)
+						n->options = list_concat(n->options, $11);
 					$$ = (Node *)n;
 				}
-			| COPY select_with_parens TO copy_file_name opt_with copy_options
+			| COPY select_with_parens TO opt_program copy_file_name opt_with copy_options
 				{
 					CopyStmt *n = makeNode(CopyStmt);
 					n->relation = NULL;
 					n->query = $2;
 					n->attlist = NIL;
 					n->is_from = false;
-					n->filename = $4;
-					n->options = $6;
+					n->is_program = $4;
+					n->filename = $5;
+					n->options = $7;
+
+					if (n->is_program && n->filename == NULL)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("STDIN/STDOUT not allowed with PROGRAM"),
+								 parser_errposition(@5)));
+
 					$$ = (Node *)n;
 				}
 		;
@@ -2363,6 +2381,11 @@ copy_from:
 			| TO									{ $$ = FALSE; }
 		;
 
+opt_program:
+			PROGRAM									{ $$ = TRUE; }
+			| /* EMPTY */							{ $$ = FALSE; }
+		;
+
 /*
  * copy_file_name NULL indicates stdio is used. Whether stdin or stdout is
  * used depends on the direction. (It really doesn't make sense to copy from
@@ -12666,6 +12689,7 @@ unreserved_keyword:
 			| PRIVILEGES
 			| PROCEDURAL
 			| PROCEDURE
+			| PROGRAM
 			| QUOTE
 			| RANGE
 			| READ
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index ba1b84e..452690d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -39,13 +39,13 @@
  * for a long time, like relation files. It is the caller's responsibility
  * to close them, there is no automatic mechanism in fd.c for that.
  *
- * AllocateFile, AllocateDir and OpenTransientFile are wrappers around
- * fopen(3), opendir(3), and open(2), respectively. They behave like the
- * corresponding native functions, except that the handle is registered with
- * the current subtransaction, and will be automatically closed at abort.
- * These are intended for short operations like reading a configuration file,
- * and there is a fixed limit on the number of files that can be opened using
- * these functions at any one time.
+ * AllocateFile, AllocateDir, OpenPipeStream and OpenTransientFile are
+ * wrappers around fopen(3), opendir(3), popen(3) and open(2), respectively.
+ * They behave like the corresponding native functions, except that the handle
+ * is registered with the current subtransaction, and will be automatically
+ * closed at abort. These are intended for short operations like reading a
+ * configuration file, and there is a fixed limit on the number of files that
+ * can be opened using these functions at any one time.
  *
  * Finally, BasicOpenFile is just a thin wrapper around open() that can
  * release file descriptors in use by the virtual file descriptors if
@@ -202,6 +202,7 @@ static uint64 temporary_files_size = 0;
 typedef enum
 {
 	AllocateDescFile,
+	AllocateDescPipe,
 	AllocateDescDir,
 	AllocateDescRawFD
 } AllocateDescKind;
@@ -1586,6 +1587,61 @@ OpenTransientFile(FileName fileName, int fileFlags, int fileMode)
 }
 
 /*
+ * Routines that want to initiate a pipe stream should use OpenPipeStream
+ * rather than plain popen().  This lets fd.c deal with freeing FDs if
+ * necessary.  When done, call ClosePipeStream rather than pclose.
+ */
+FILE *
+OpenPipeStream(const char *command, const char *mode)
+{
+	FILE	   *file;
+
+	DO_DB(elog(LOG, "OpenPipeStream: Allocated %d (%s)",
+			   numAllocatedDescs, command));
+
+	/*
+	 * The test against MAX_ALLOCATED_DESCS prevents us from overflowing
+	 * allocatedFiles[]; the test against max_safe_fds prevents AllocateFile
+	 * from hogging every one of the available FDs, which'd lead to infinite
+	 * looping.
+	 */
+	if (numAllocatedDescs >= MAX_ALLOCATED_DESCS ||
+		numAllocatedDescs >= max_safe_fds - 1)
+		elog(ERROR, "exceeded MAX_ALLOCATED_DESCS while trying to execute command \"%s\"",
+			 command);
+
+TryAgain:
+	errno = 0;
+	fflush(stdout);
+	fflush(stderr);
+	if ((file = popen(command, mode)) != NULL)
+	{
+		AllocateDesc *desc = &allocatedDescs[numAllocatedDescs];
+
+		desc->kind = AllocateDescPipe;
+		desc->desc.file = file;
+		desc->create_subid = GetCurrentSubTransactionId();
+		numAllocatedDescs++;
+		return desc->desc.file;
+	}
+
+	if (errno == EMFILE || errno == ENFILE)
+	{
+		int			save_errno = errno;
+
+		ereport(LOG,
+				(errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+				 errmsg("out of file descriptors: %m; release and retry")));
+		errno = 0;
+		if (ReleaseLruFile())
+			goto TryAgain;
+		errno = save_errno;
+	}
+
+	return NULL;
+}
+
+/*
  * Free an AllocateDesc of any type.
  *
  * The argument *must* point into the allocatedDescs[] array.
@@ -1601,6 +1657,9 @@ FreeDesc(AllocateDesc *desc)
 		case AllocateDescFile:
 			result = fclose(desc->desc.file);
 			break;
+		case AllocateDescPipe:
+			result = pclose_check(desc->desc.file);
+			break;
 		case AllocateDescDir:
 			result = closedir(desc->desc.dir);
 			break;
@@ -1815,6 +1874,31 @@ FreeDir(DIR *dir)
 
 
 /*
+ * Close a pipe stream returned by OpenPipeStream.
+ */
+int
+ClosePipeStream(FILE *file)
+{
+	int			i;
+
+	DO_DB(elog(LOG, "ClosePipeStream: Allocated %d", numAllocatedDescs));
+
+	/* Remove file from list of allocated files, if it's present */
+	for (i = numAllocatedDescs; --i >= 0;)
+	{
+		AllocateDesc *desc = &allocatedDescs[i];
+
+		if (desc->kind == AllocateDescPipe && desc->desc.file == file)
+			return FreeDesc(desc);
+	}
+
+	/* Only get here if someone passes us a file not in allocatedDescs */
+	elog(WARNING, "file passed to ClosePipeStream was not obtained from OpenPipeStream");
+
+	return pclose_check(file);
+}
+
+/*
  * closeAllVfds
  *
  * Force all VFDs into the physically-closed state, so that the fewest
diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c
index a31d789..7ad7bdf 100644
--- a/src/bin/psql/copy.c
+++ b/src/bin/psql/copy.c
@@ -35,6 +35,9 @@
  *	\copy tablename [(columnlist)] from|to filename [options]
  *	\copy ( select stmt ) to filename [options]
  *
+ * where 'filename' can be one of the following:
+ *  '<file path>' | PROGRAM '<command>' | stdin | stdout | pstdout | pstdout
+ *
  * An undocumented fact is that you can still write BINARY before the
  * tablename; this is a hangover from the pre-7.3 syntax.  The options
  * syntax varies across backend versions, but we avoid all that mess
@@ -43,6 +46,7 @@
  * table name can be double-quoted and can have a schema part.
  * column names can be double-quoted.
  * filename can be single-quoted like SQL literals.
+ * command can be single-quoted like SQL literals.
  *
  * returns a malloc'ed structure with the options, or NULL on parsing error
  */
@@ -52,6 +56,7 @@ struct copy_options
 	char	   *before_tofrom;	/* COPY string before TO/FROM */
 	char	   *after_tofrom;	/* COPY string after TO/FROM filename */
 	char	   *file;			/* NULL = stdin/stdout */
+	bool		program;		/* is 'file' a program to popen? */
 	bool		psql_inout;		/* true = use psql stdin/stdout */
 	bool		from;			/* true = FROM, false = TO */
 };
@@ -191,15 +196,37 @@ parse_slash_copy(const char *args)
 	else
 		goto error;
 
+	/* { 'filename' | PROGRAM 'command' | STDIN | STDOUT | PSTDIN | PSTDOUT } */
 	token = strtokx(NULL, whitespace, NULL, "'",
-					0, false, true, pset.encoding);
+					0, false, false, pset.encoding);
 	if (!token)
 		goto error;
 
-	if (pg_strcasecmp(token, "stdin") == 0 ||
-		pg_strcasecmp(token, "stdout") == 0)
+	if (pg_strcasecmp(token, "program") == 0)
+	{
+		int toklen;
+
+		token = strtokx(NULL, whitespace, NULL, "'",
+						0, false, false, pset.encoding);
+		if (!token)
+			goto error;
+
+		/*
+		 * The shell command must be quoted. This isn't fool-proof, but catches
+		 * most quoting errors.
+		 */
+		toklen = strlen(token);
+		if (token[0] != '\'' || toklen < 2 || token[toklen - 1] != '\'')
+			goto error;
+
+		strip_quotes(token, '\'', 0, pset.encoding);
+
+		result->program = true;
+		result->file = pg_strdup(token);
+	}
+	else if (pg_strcasecmp(token, "stdin") == 0 ||
+			 pg_strcasecmp(token, "stdout") == 0)
 	{
-		result->psql_inout = false;
 		result->file = NULL;
 	}
 	else if (pg_strcasecmp(token, "pstdin") == 0 ||
@@ -210,7 +237,8 @@ parse_slash_copy(const char *args)
 	}
 	else
 	{
-		result->psql_inout = false;
+		/* filename can be optionally quoted */
+		strip_quotes(token, '\'', 0, pset.encoding);
 		result->file = pg_strdup(token);
 		expand_tilde(&result->file);
 	}
@@ -235,9 +263,9 @@ error:
 
 
 /*
- * Execute a \copy command (frontend copy). We have to open a file, then
- * submit a COPY query to the backend and either feed it data from the
- * file or route its response into the file.
+ * Execute a \copy command (frontend copy). We have to open a file (or execute
+ * a command), then submit a COPY query to the backend and either feed it data
+ * from the file or route its response into the file.
  */
 bool
 do_copy(const char *args)
@@ -257,7 +285,7 @@ do_copy(const char *args)
 		return false;
 
 	/* prepare to read or write the target file */
-	if (options->file)
+	if (options->file && !options->program)
 		canonicalize_path(options->file);
 
 	if (options->from)
@@ -265,7 +293,17 @@ do_copy(const char *args)
 		override_file = &pset.cur_cmd_source;
 
 		if (options->file)
-			copystream = fopen(options->file, PG_BINARY_R);
+		{
+			if (options->program)
+			{
+				fflush(stdout);
+				fflush(stderr);
+				errno = 0;
+				copystream = popen(options->file, PG_BINARY_R);
+			}
+			else
+				copystream = fopen(options->file, PG_BINARY_R);
+		}
 		else if (!options->psql_inout)
 			copystream = pset.cur_cmd_source;
 		else
@@ -276,7 +314,20 @@ do_copy(const char *args)
 		override_file = &pset.queryFout;
 
 		if (options->file)
-			copystream = fopen(options->file, PG_BINARY_W);
+		{
+			if (options->program)
+			{
+				fflush(stdout);
+				fflush(stderr);
+				errno = 0;
+#ifndef WIN32
+				pqsignal(SIGPIPE, SIG_IGN);
+#endif
+				copystream = popen(options->file, PG_BINARY_W);
+			}
+			else
+				copystream = fopen(options->file, PG_BINARY_W);
+		}
 		else if (!options->psql_inout)
 			copystream = pset.queryFout;
 		else
@@ -285,21 +336,28 @@ do_copy(const char *args)
 
 	if (!copystream)
 	{
-		psql_error("%s: %s\n",
-				   options->file, strerror(errno));
+		if (options->program)
+			psql_error("could not execute command \"%s\": %s\n",
+					   options->file, strerror(errno));
+		else
+			psql_error("%s: %s\n",
+					   options->file, strerror(errno));
 		free_copy_options(options);
 		return false;
 	}
 
-	/* make sure the specified file is not a directory */
-	fstat(fileno(copystream), &st);
-	if (S_ISDIR(st.st_mode))
+	if (!options->program)
 	{
-		fclose(copystream);
-		psql_error("%s: cannot copy from/to a directory\n",
-				   options->file);
-		free_copy_options(options);
-		return false;
+		/* make sure the specified file is not a directory */
+		fstat(fileno(copystream), &st);
+		if (S_ISDIR(st.st_mode))
+		{
+			fclose(copystream);
+			psql_error("%s: cannot copy from/to a directory\n",
+					   options->file);
+			free_copy_options(options);
+			return false;
+		}
 	}
 
 	/* build the command we will send to the backend */
@@ -322,10 +380,25 @@ do_copy(const char *args)
 
 	if (options->file != NULL)
 	{
-		if (fclose(copystream) != 0)
+		if (options->program)
 		{
-			psql_error("%s: %s\n", options->file, strerror(errno));
-			success = false;
+			if (pclose_check(copystream) == -1)
+			{
+				psql_error("could not execute command \"%s\"\n",
+						   options->file);
+				success = false;
+			}
+#ifndef WIN32
+			pqsignal(SIGPIPE, SIG_DFL);
+#endif
+		}
+		else
+		{
+			if (fclose(copystream) != 0)
+			{
+				psql_error("%s: %s\n", options->file, strerror(errno));
+				success = false;
+			}
 		}
 	}
 	free_copy_options(options);
diff --git a/src/bin/psql/stringutils.c b/src/bin/psql/stringutils.c
index 450240d..99968a1 100644
--- a/src/bin/psql/stringutils.c
+++ b/src/bin/psql/stringutils.c
@@ -13,9 +13,6 @@
 #include "stringutils.h"
 
 
-static void strip_quotes(char *source, char quote, char escape, int encoding);
-
-
 /*
  * Replacement for strtok() (a.k.a. poor man's flex)
  *
@@ -239,7 +236,7 @@ strtokx(const char *s,
  *
  * Note that the source string is overwritten in-place.
  */
-static void
+void
 strip_quotes(char *source, char quote, char escape, int encoding)
 {
 	char	   *src;
diff --git a/src/bin/psql/stringutils.h b/src/bin/psql/stringutils.h
index b991376..bb2a194 100644
--- a/src/bin/psql/stringutils.h
+++ b/src/bin/psql/stringutils.h
@@ -19,6 +19,8 @@ extern char *strtokx(const char *s,
 		bool del_quotes,
 		int encoding);
 
+extern void strip_quotes(char *source, char quote, char escape, int encoding);
+
 extern char *quote_if_needed(const char *source, const char *entails_quote,
 				char quote, char escape, int encoding);
 
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 725c277..5860e4c 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -26,7 +26,7 @@ extern Oid DoCopy(const CopyStmt *stmt, const char *queryString,
 
 extern void ProcessCopyOptions(CopyState cstate, bool is_from, List *options);
 extern CopyState BeginCopyFrom(Relation rel, const char *filename,
-			  List *attnamelist, List *options);
+			  bool is_program, List *attnamelist, List *options);
 extern void EndCopyFrom(CopyState cstate);
 extern bool NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			 Datum *values, bool *nulls, Oid *tupleOid);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index d8678e5..d54990d 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1407,6 +1407,7 @@ typedef struct CopyStmt
 	List	   *attlist;		/* List of column names (as Strings), or NIL
 								 * for all columns */
 	bool		is_from;		/* TO or FROM */
+	bool		is_program;		/* is 'filename' a program to popen? */
 	char	   *filename;		/* filename, or NULL for STDIN/STDOUT */
 	List	   *options;		/* List of DefElem nodes */
 } CopyStmt;
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 03aa761..6f67a65 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -292,6 +292,7 @@ PG_KEYWORD("prior", PRIOR, UNRESERVED_KEYWORD)
 PG_KEYWORD("privileges", PRIVILEGES, UNRESERVED_KEYWORD)
 PG_KEYWORD("procedural", PROCEDURAL, UNRESERVED_KEYWORD)
 PG_KEYWORD("procedure", PROCEDURE, UNRESERVED_KEYWORD)
+PG_KEYWORD("program", PROGRAM, UNRESERVED_KEYWORD)
 PG_KEYWORD("quote", QUOTE, UNRESERVED_KEYWORD)
 PG_KEYWORD("range", RANGE, UNRESERVED_KEYWORD)
 PG_KEYWORD("read", READ, UNRESERVED_KEYWORD)
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index bd36c9d..90b4933 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -80,6 +80,10 @@ extern char *FilePathName(File file);
 extern FILE *AllocateFile(const char *name, const char *mode);
 extern int	FreeFile(FILE *file);
 
+/* Operations that allow use of pipe streams (popen/pclose) */
+extern FILE *OpenPipeStream(const char *command, const char *mode);
+extern int	ClosePipeStream(FILE *file);
+
 /* Operations to allow use of the <dirent.h> library routines */
 extern DIR *AllocateDir(const char *dirname);
 extern struct dirent *ReadDir(DIR *dir, const char *dirname);
diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
index d8147cf..85ec391 100644
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -192,7 +192,7 @@ ECPG: where_or_current_clauseWHERECURRENT_POFcursor_name block
 		char *cursor_marker = $4[0] == ':' ? mm_strdup("$0") : $4;
 		$$ = cat_str(2,mm_strdup("where current of"), cursor_marker);
 	}
-ECPG: CopyStmtCOPYopt_binaryqualified_nameopt_column_listopt_oidscopy_fromcopy_file_namecopy_delimiteropt_withcopy_options addon
+ECPG: CopyStmtCOPYopt_binaryqualified_nameopt_column_listopt_oidscopy_fromopt_programcopy_file_namecopy_delimiteropt_withcopy_options addon
 			if (strcmp($6, "from") == 0 &&
 			   (strcmp($7, "stdin") == 0 || strcmp($7, "stdout") == 0))
 				mmerror(PARSE_ERROR, ET_WARNING, "COPY FROM STDIN is not implemented");
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to