Hi,

here is a cleaned-up and finished version of the previous implementation of the RAW_TEXT/RAW_BINARY formats for the COPY statement.

The RAW_TEXT format means unescaped data in the correct encoding: input and output go through the type's input/output functions. The RAW_BINARY format means the content produced or consumed by the type's send/receive functions. Both directions (input and output) now work well.

Some examples of expected usage:

    copy (select xmlelement(name foo, 'hello')) to stdout (format raw_binary, encoding 'latin2');

    create table avatars(id serial, picture bytea);
    \copy avatars(picture) from ~/images/foo.jpg (format raw_binary);
    select lastval();

    create table doc(id serial, txt text);
    \copy doc(txt) from ~/files/aaa.txt (format raw_text, encoding 'latin2');
    select lastval();

Regards

Pavel
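As a reading aid for the protocol part of the patch, here is a minimal sketch (not code from the patch) of how the two CopyState flags appear to map onto the codes sent in CopyInResponse/CopyOutResponse. The names `CopyFlags`, `copy_overall_mode` and `copy_column_format` are hypothetical and exist only for illustration; the patch computes the same values inline in SendCopyBegin() and ReceiveCopyBegin().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two CopyStateData flags consulted by the
 * patch.  FORMAT raw_text sets only "raw"; FORMAT raw_binary sets both.
 */
typedef struct CopyFlags
{
    bool binary;    /* set by FORMAT binary and FORMAT raw_binary */
    bool raw;       /* set by FORMAT raw_text and FORMAT raw_binary */
} CopyFlags;

/* Overall copy format byte: 0 = text, 1 = binary, 2 = raw (either kind). */
static int
copy_overall_mode(CopyFlags f)
{
    if (f.raw)
        return 2;
    return f.binary ? 1 : 0;
}

/* Per-column format code: 0 = text, 1 = binary, 2 = raw_text, 3 = raw_binary. */
static int16_t
copy_column_format(CopyFlags f)
{
    if (f.raw)
        return f.binary ? 3 : 2;
    return f.binary ? 1 : 0;
}
```

Under this reading, a client can tell raw_text from raw_binary only by the per-column code, because the overall byte is 2 for both.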
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
new file mode 100644
index 3829a14..7b9ed73
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*************** int PQfformat(const PGresult *res,
*** 3226,3233 ****
        <para>
         Format code zero indicates textual data representation, while format
!        code one indicates binary representation. (Other codes are reserved
!        for future definition.)
        </para>
       </listitem>
      </varlistentry>
--- 3226,3234 ----
        <para>
         Format code zero indicates textual data representation, while format
!        code one indicates binary representation. Format code two indicates
!        raw_text representation and format code three indicates raw_binary
!        representation (Other codes are reserved for future definition.)
        </para>
       </listitem>
      </varlistentry>
*************** typedef struct
*** 3557,3562 ****
--- 3558,3583 ----
     </para>
     <variablelist>
+     <varlistentry id="libpq-pqcopyformat">
+      <term>
+       <function>PQcopyFormat</function>
+       <indexterm>
+        <primary>PQcopyFormat</primary>
+       </indexterm>
+      </term>
+ 
+      <listitem>
+       <para>
+        Format code zero indicates textual data representation, format one
+        indicates binary representation, format two indicates raw
+        representation.
+ <synopsis>
+ int PQcopyFormat(PGresult *res);
+ </synopsis>
+       </para>
+      </listitem>
+     </varlistentry>
+ 
      <varlistentry id="libpq-pqcmdstatus">
       <term>
        <function>PQcmdStatus</function>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
new file mode 100644
index 522128e..e783b30
*** a/doc/src/sgml/protocol.sgml
--- b/doc/src/sgml/protocol.sgml
*************** CopyInResponse (B)
*** 3239,3244 ****
--- 3239,3245 ----
                  characters, etc).
                  1 indicates the overall copy format is binary (similar
                  to DataRow format).
+                 2 indicates the overall copy format is raw.
                  See <xref linkend="sql-copy">
                  for more information.
  </para>
*************** CopyInResponse (B)
*** 3262,3269 ****
  <listitem>
  <para>
                  The format codes to be used for each column.
!                 Each must presently be zero (text) or one (binary).
!                 All must be zero if the overall copy format is textual.
  </para>
  </listitem>
  </varlistentry>
--- 3263,3271 ----
  <listitem>
  <para>
                  The format codes to be used for each column.
!                 Each must be zero (text), one (binary), two (raw_text)
!                 or three (raw_binary). All must be zero if the overall
!                 copy format is textual.
  </para>
  </listitem>
  </varlistentry>
*************** CopyOutResponse (B)
*** 3313,3319 ****
                  is textual (rows separated by newlines, columns
                  separated by separator characters, etc). 1 indicates
                  the overall copy format is binary (similar to DataRow
!                 format). See <xref linkend="sql-copy"> for more information.
  </para>
  </listitem>
  </varlistentry>
--- 3315,3322 ----
                  is textual (rows separated by newlines, columns
                  separated by separator characters, etc). 1 indicates
                  the overall copy format is binary (similar to DataRow
!                 format). 2 indicates raw_text or raw_binary format.
!                 See <xref linkend="sql-copy"> for more information.
  </para>
  </listitem>
  </varlistentry>
*************** CopyOutResponse (B)
*** 3335,3342 ****
  <listitem>
  <para>
                  The format codes to be used for each column.
!                 Each must presently be zero (text) or one (binary).
!                 All must be zero if the overall copy format is textual.
  </para>
  </listitem>
  </varlistentry>
--- 3338,3346 ----
  <listitem>
  <para>
                  The format codes to be used for each column.
!                 Each must be zero (text), one (binary), two (raw_text)
!                 or three (raw_binary). All must be zero if the overall
!                 copy format is textual.
  </para>
  </listitem>
  </varlistentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
new file mode 100644
index 07e2f45..4e339e4
*** a/doc/src/sgml/ref/copy.sgml
--- b/doc/src/sgml/ref/copy.sgml
*************** COPY { <replaceable class="parameter">ta
*** 197,203 ****
        Selects the data format to be read or written:
        <literal>text</>,
        <literal>csv</> (Comma Separated Values),
!       or <literal>binary</>.
        The default is <literal>text</>.
       </para>
      </listitem>
--- 197,205 ----
        Selects the data format to be read or written:
        <literal>text</>,
        <literal>csv</> (Comma Separated Values),
!       <literal>binary</>,
!       <literal>raw_text</>
!       or <literal>raw_binary</>.
        The default is <literal>text</>.
       </para>
      </listitem>
*************** OIDs to be shown as null if that ever pr
*** 888,893 ****
--- 890,933 ----
      </para>
     </refsect3>
    </refsect2>
+ 
+   <refsect2>
+    <title>Raw_text/raw_binary Format</title>
+ 
+    <para>
+     The <literal>raw_text</literal> format option causes all data to be
+     stored/read as one text value. This format doesn't use any metadata
+     - only raw data are exported or imported.
+    </para>
+ 
+    <para>
+     The <literal>raw_binary</literal> format option causes all data to be
+     stored/read as binary format rather than as text. It shares format
+     for data with <literal>binary</literal> format. This format doesn't
+     use any metadata - only row data in network byte order are exported
+     or imported.
+    </para>
+ 
+    <para>
+     Because this format doesn't support any delimiter, only one value
+     can be exported or imported. NULL values are not allowed.
+    </para>
+    <para>
+     The <literal>raw_binary</literal> format can be used for export or import
+     bytea values.
+ <programlisting>
+ COPY images(data) FROM '/usr1/proj/img/01.jpg' (FORMAT raw_binary);
+ </programlisting>
+     It can be used successfully for export XML in different encoding
+     or import valid XML document with any supported encoding:
+ <screen><![CDATA[
+ SET client_encoding TO latin2;
+ 
+ COPY (SELECT xmlelement(NAME data, 'Hello')) TO stdout (FORMAT raw_binary);
+ <?xml version="1.0" encoding="LATIN2"?><data>Hello</data>
+ ]]></screen>
+    </para>
+   </refsect2>
   </refsect1>
   <refsect1>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
new file mode 100644
index 3201476..7600829
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*************** typedef struct CopyStateData
*** 110,115 ****
--- 110,116 ----
      char *filename; /* filename, or NULL for STDIN/STDOUT */
      bool is_program; /* is 'filename' a program to popen? */
      bool binary; /* binary format? */
+     bool raw; /* raw mode? */
      bool oids; /* include OIDs? */
      bool freeze; /* freeze rows on loading? */
      bool csv_mode; /* Comma Separated Value format? */
*************** SendCopyBegin(CopyState cstate)
*** 342,353 ****
          /* new way */
          StringInfoData buf;
          int natts = list_length(cstate->attnumlist);
!         int16 format = (cstate->binary ? 1 : 0);
          int i;
          pq_beginmessage(&buf, 'H');
!         pq_sendbyte(&buf, format); /* overall format */
          pq_sendint(&buf, natts, 2);
          for (i = 0; i < natts; i++)
              pq_sendint(&buf, format, 2); /* per-column formats */
          pq_endmessage(&buf);
--- 343,369 ----
          /* new way */
          StringInfoData buf;
          int natts = list_length(cstate->attnumlist);
!         int16 format;
!         int mode;
          int i;
          pq_beginmessage(&buf, 'H');
! 
!         if (cstate->raw)
!             mode = 2;
!         else if (cstate->binary)
!             mode = 1;
!         else
!             mode = 0;
! 
!         pq_sendbyte(&buf, mode); /* overall mode */
          pq_sendint(&buf, natts, 2);
+ 
+         if (!cstate->raw)
+             format = cstate->binary ? 1 : 0;
+         else
+             format = cstate->binary ? 3 : 2;
+ 
          for (i = 0; i < natts; i++)
              pq_sendint(&buf, format, 2); /* per-column formats */
          pq_endmessage(&buf);
*************** SendCopyBegin(CopyState cstate)
*** 356,365 ****
      else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
      {
          /* old way */
!         if (cstate->binary)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('H');
          /* grottiness needed for old COPY OUT protocol */
          pq_startcopyout();
--- 372,381 ----
      else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
      {
          /* old way */
!         if (cstate->binary || cstate->raw)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY or COPY RAW_TEXT/RAW_BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('H');
          /* grottiness needed for old COPY OUT protocol */
          pq_startcopyout();
*************** SendCopyBegin(CopyState cstate)
*** 368,377 ****
      else
      {
          /* very old way */
!         if (cstate->binary)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('B');
          /* grottiness needed for old COPY OUT protocol */
          pq_startcopyout();
--- 384,393 ----
      else
      {
          /* very old way */
!         if (cstate->binary || cstate->raw)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY or COPY RAW_TEXT/RAW_BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('B');
          /* grottiness needed for old COPY OUT protocol */
          pq_startcopyout();
*************** ReceiveCopyBegin(CopyState cstate)
*** 387,398 ****
          /* new way */
          StringInfoData buf;
          int natts = list_length(cstate->attnumlist);
!         int16 format = (cstate->binary ? 1 : 0);
          int i;
          pq_beginmessage(&buf, 'G');
!         pq_sendbyte(&buf, format); /* overall format */
          pq_sendint(&buf, natts, 2);
          for (i = 0; i < natts; i++)
              pq_sendint(&buf, format, 2); /* per-column formats */
          pq_endmessage(&buf);
--- 403,429 ----
          /* new way */
          StringInfoData buf;
          int natts = list_length(cstate->attnumlist);
!         int16 format;
!         int mode;
          int i;
          pq_beginmessage(&buf, 'G');
! 
!         if (cstate->raw)
!             mode = 2;
!         else if (cstate->binary)
!             mode = 1;
!         else
!             mode = 0;
! 
!         pq_sendbyte(&buf, mode); /* overall format */
          pq_sendint(&buf, natts, 2);
+ 
+         if (!cstate->raw)
+             format = cstate->binary ? 1 : 0;
+         else
+             format = cstate->binary ? 3 : 2;
+ 
          for (i = 0; i < natts; i++)
              pq_sendint(&buf, format, 2); /* per-column formats */
          pq_endmessage(&buf);
*************** ReceiveCopyBegin(CopyState cstate)
*** 402,411 ****
      else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
      {
          /* old way */
!         if (cstate->binary)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('G');
          /* any error in old protocol will make us lose sync */
          pq_startmsgread();
--- 433,442 ----
      else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
      {
          /* old way */
!         if (cstate->binary || cstate->raw)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY or COPY RAW_TEXT/RAW_BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('G');
          /* any error in old protocol will make us lose sync */
          pq_startmsgread();
*************** ReceiveCopyBegin(CopyState cstate)
*** 414,423 ****
      else
      {
          /* very old way */
!         if (cstate->binary)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('D');
          /* any error in old protocol will make us lose sync */
          pq_startmsgread();
--- 445,454 ----
      else
      {
          /* very old way */
!         if (cstate->binary || cstate->raw)
              ereport(ERROR,
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("COPY BINARY or COPY RAW_TEXT/RAW_BINARY is not supported to stdout or from stdin")));
          pq_putemptymessage('D');
          /* any error in old protocol will make us lose sync */
          pq_startmsgread();
*************** CopySendEndOfRow(CopyState cstate)
*** 482,488 ****
      switch (cstate->copy_dest)
      {
          case COPY_FILE:
!             if (!cstate->binary)
              {
                  /* Default line termination depends on platform */
  #ifndef WIN32
--- 513,519 ----
      switch (cstate->copy_dest)
      {
          case COPY_FILE:
!             if (!cstate->binary && !cstate->raw)
              {
                  /* Default line termination depends on platform */
  #ifndef WIN32
*************** CopySendEndOfRow(CopyState cstate)
*** 526,531 ****
--- 557,565 ----
              }
              break;
          case COPY_OLD_FE:
+             /* This old protocol doesn't allow RAW_TEXT/RAW_BINARY */
+             Assert(!cstate->raw);
+ 
              /* The FE/BE protocol uses \n as newline for all platforms */
              if (!cstate->binary)
                  CopySendChar(cstate, '\n');
*************** CopySendEndOfRow(CopyState cstate)
*** 540,546 ****
              break;
          case COPY_NEW_FE:
              /* The FE/BE protocol uses \n as newline for all platforms */
!             if (!cstate->binary)
                  CopySendChar(cstate, '\n');
              /* Dump the accumulated row as one CopyData message */
--- 574,580 ----
              break;
          case COPY_NEW_FE:
              /* The FE/BE protocol uses \n as newline for all platforms */
!             if (!cstate->binary && !cstate->raw)
                  CopySendChar(cstate, '\n');
              /* Dump the accumulated row as one CopyData message */
*************** CopyLoadRawBuf(CopyState cstate)
*** 766,771 ****
--- 800,837 ----
      return (inbytes > 0);
  }
+ /*
+  * CopyLoadallRawBuf loads all content into raw_buf.
+  *
+  * This routine is used in raw_text/raw_binary mode. If original RAW_BUF_SIZE is not
+  * enough, then the buffer is enlarged.
+  */
+ static void
+ CopyLoadallRawBuf(CopyState cstate)
+ {
+     int nbytes = 0;
+     int inbytes;
+     Size raw_buf_size = RAW_BUF_SIZE;
+ 
+     do
+     {
+         /* hold enough space for one data packet */
+         if ((raw_buf_size - nbytes - 1) < 8 * 1024)
+         {
+             raw_buf_size += RAW_BUF_SIZE;
+             cstate->raw_buf = repalloc(cstate->raw_buf, raw_buf_size);
+         }
+ 
+         inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes, 1, raw_buf_size - nbytes - 1);
+         nbytes += inbytes;
+     }
+     while (inbytes > 0);
+ 
+     cstate->raw_buf[nbytes] = '\0';
+     cstate->raw_buf_index = 0;
+     cstate->raw_buf_len = nbytes;
+ }
+ 
  /*
   * DoCopy executes the SQL COPY statement
*************** ProcessCopyOptions(CopyState cstate,
*** 1013,1018 ****
--- 1079,1091 ----
                  cstate->csv_mode = true;
              else if (strcmp(fmt, "binary") == 0)
                  cstate->binary = true;
+             else if (strcmp(fmt, "raw_text") == 0)
+                 cstate->raw = true;
+             else if (strcmp(fmt, "raw_binary") == 0)
+             {
+                 cstate->binary = true;
+                 cstate->raw = true;
+             }
              else
                  ereport(ERROR,
                          (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
*************** ProcessCopyOptions(CopyState cstate,
*** 1162,1177 ****
       * Check for incompatible options (must do these two before inserting
       * defaults)
       */
!     if (cstate->binary && cstate->delim)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
                   errmsg("cannot specify DELIMITER in BINARY mode")));
!     if (cstate->binary && cstate->null_print)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
                   errmsg("cannot specify NULL in BINARY mode")));
      /* Set defaults for omitted options */
      if (!cstate->delim)
          cstate->delim = cstate->csv_mode ? "," : "\t";
--- 1235,1255 ----
       * Check for incompatible options (must do these two before inserting
       * defaults)
       */
!     if ((cstate->binary || cstate->raw) && cstate->delim)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
                   errmsg("cannot specify DELIMITER in BINARY mode")));
!     if ((cstate->binary || cstate->raw) && cstate->null_print)
          ereport(ERROR,
                  (errcode(ERRCODE_SYNTAX_ERROR),
                   errmsg("cannot specify NULL in BINARY mode")));
+     if (cstate->raw && cstate->oids)
+         ereport(ERROR,
+                 (errcode(ERRCODE_SYNTAX_ERROR),
+                  errmsg("cannot specify OIDS in RAW_TEXT/RAW_BINARY mode")));
+ 
      /* Set defaults for omitted options */
      if (!cstate->delim)
          cstate->delim = cstate->csv_mode ? "," : "\t";
*************** BeginCopy(bool is_from,
*** 1608,1613 ****
--- 1686,1697 ----
          }
      }
+     /* No more columns are allowed in RAW mode */
+     if (cstate->raw && list_length(cstate->attnumlist) > 1)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("Single column result/target is required in RAW_TEXT/RAW_BINARY mode")));
+ 
      /* Use client encoding when ENCODING option is not specified. */
      if (cstate->file_encoding < 0)
          cstate->file_encoding = pg_get_client_encoding();
*************** CopyTo(CopyState cstate)
*** 1899,1905 ****
                                                 ALLOCSET_DEFAULT_INITSIZE,
                                                 ALLOCSET_DEFAULT_MAXSIZE);
!     if (cstate->binary)
      {
          /* Generate header for a binary copy */
          int32 tmp;
--- 1983,1989 ----
                                                 ALLOCSET_DEFAULT_INITSIZE,
                                                 ALLOCSET_DEFAULT_MAXSIZE);
!     if (cstate->binary && !cstate->raw)
      {
          /* Generate header for a binary copy */
          int32 tmp;
*************** CopyTo(CopyState cstate)
*** 1931,1936 ****
--- 2015,2023 ----
          {
              bool hdr_delim = false;
+             /* raw_text/raw_binary mode is not allowed here */
+             Assert(!cstate->raw);
+ 
              foreach(cur, cstate->attnumlist)
              {
                  int attnum = lfirst_int(cur);
*************** CopyTo(CopyState cstate)
*** 1967,1972 ****
--- 2054,2063 ----
          {
              CHECK_FOR_INTERRUPTS();
+             /* stop quickly in raw_text/raw_binary when more rows is detected */
+             if (cstate->raw && processed > 0)
+                 break;
+ 
              /* Deconstruct the tuple ... faster than repeated heap_getattr */
              heap_deform_tuple(tuple, tupDesc, values, nulls);
*************** CopyTo(CopyState cstate)
*** 1983,1993 ****
      else
      {
          /* run the plan --- the dest receiver will send tuples */
!         ExecutorRun(cstate->queryDesc, ForwardScanDirection, 0L);
          processed = ((DR_copy *) cstate->queryDesc->dest)->processed;
      }
!     if (cstate->binary)
      {
          /* Generate trailer for a binary copy */
          CopySendInt16(cstate, -1);
--- 2074,2098 ----
      else
      {
          /* run the plan --- the dest receiver will send tuples */
!         ExecutorRun(cstate->queryDesc, ForwardScanDirection, cstate->raw ? 2L : 0L);
          processed = ((DR_copy *) cstate->queryDesc->dest)->processed;
      }
!     /* raw_text/raw_binary requires exactly one row */
!     if (cstate->raw)
!     {
!         if (processed > 1)
!             ereport(ERROR,
!                     (errcode(ERRCODE_TOO_MANY_ROWS),
!                      errmsg("single row result is required by RAW_TEXT/RAW_BINARY mode")));
! 
!         if (processed == 0)
!             ereport(ERROR,
!                     (errcode(ERRCODE_NO_DATA_FOUND),
!                      errmsg("single row result is required by RAW_TEXT/RAW_BINARY mode")));
!     }
! 
!     if (cstate->binary && !cstate->raw)
      {
          /* Generate trailer for a binary copy */
          CopySendInt16(cstate, -1);
*************** CopyOneRowTo(CopyState cstate, Oid tuple
*** 2015,2042 ****
      MemoryContextReset(cstate->rowcontext);
      oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
!     if (cstate->binary)
      {
!         /* Binary per-tuple header */
!         CopySendInt16(cstate, list_length(cstate->attnumlist));
!         /* Send OID if wanted --- note attnumlist doesn't include it */
!         if (cstate->oids)
          {
!             /* Hack --- assume Oid is same size as int32 */
!             CopySendInt32(cstate, sizeof(int32));
!             CopySendInt32(cstate, tupleOid);
          }
!     }
!     else
!     {
!         /* Text format has no per-tuple header, but send OID if wanted */
!         /* Assume digits don't need any quoting or encoding conversion */
!         if (cstate->oids)
          {
!             string = DatumGetCString(DirectFunctionCall1(oidout,
!                                          ObjectIdGetDatum(tupleOid)));
!             CopySendString(cstate, string);
!             need_delim = true;
          }
      }
--- 2120,2150 ----
      MemoryContextReset(cstate->rowcontext);
      oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
!     if (!cstate->raw)
      {
!         if (cstate->binary)
          {
!             /* Binary per-tuple header */
!             CopySendInt16(cstate, list_length(cstate->attnumlist));
!             /* Send OID if wanted --- note attnumlist doesn't include it */
!             if (cstate->oids)
!             {
!                 /* Hack --- assume Oid is same size as int32 */
!                 CopySendInt32(cstate, sizeof(int32));
!                 CopySendInt32(cstate, tupleOid);
!             }
          }
!         else
          {
!             /* Text format has no per-tuple header, but send OID if wanted */
!             /* Assume digits don't need any quoting or encoding conversion */
!             if (cstate->oids)
!             {
!                 string = DatumGetCString(DirectFunctionCall1(oidout,
!                                              ObjectIdGetDatum(tupleOid)));
!                 CopySendString(cstate, string);
!                 need_delim = true;
!             }
          }
      }
*************** CopyOneRowTo(CopyState cstate, Oid tuple
*** 2055,2060 ****
--- 2163,2173 ----
          if (isnull)
          {
+             if (cstate->raw)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+                          errmsg("NULL value is not allowed in RAW_TEXT/RAW_BINARY mode")));
+ 
              if (!cstate->binary)
                  CopySendString(cstate, cstate->null_print_client);
              else
*************** CopyOneRowTo(CopyState cstate, Oid tuple
*** 2062,2068 ****
          }
          else
          {
!             if (!cstate->binary)
              {
                  string = OutputFunctionCall(&out_functions[attnum - 1],
                                              value);
--- 2175,2246 ----
          }
          else
          {
!             if (cstate->raw)
!             {
!                 const void *content;
!                 int size;
! 
!                 if (!cstate->binary)
!                 {
!                     string = OutputFunctionCall(&out_functions[attnum - 1],
!                                                 value);
! 
!                     /* We would to transcode, but without escaping */
!                     if (cstate->need_transcoding)
!                         content = pg_server_to_any(string, strlen(string), cstate->file_encoding);
!                     else
!                         content = string;
! 
!                     size = strlen((const char *) content);
!                 }
!                 else
!                 {
!                     bytea *outputbytes;
! 
!                     /*
!                      * Some binary output functions depends can depends on client encoding.
!                      * The binary output of xml is good example. Set client_encoding
!                      * temporaly before out function execution.
!                      */
!                     if (cstate->need_transcoding)
!                     {
!                         int old_server_encoding = pg_get_client_encoding();
!                         volatile bool reset_encoding = false;
! 
!                         PG_TRY();
!                         {
!                             /* We don't expect an error, because encoding was checked before */
!                             if (PrepareClientEncoding(cstate->file_encoding) < 0)
!                                 elog(ERROR, "PrepareClientEncoding(%d) failed", cstate->file_encoding);
! 
!                             SetClientEncoding(cstate->file_encoding);
!                             reset_encoding = true;
! 
!                             outputbytes = SendFunctionCall(&out_functions[attnum - 1],
!                                                            value);
!                             SetClientEncoding(old_server_encoding);
!                         }
!                         PG_CATCH();
!                         {
!                             if (reset_encoding)
!                                 SetClientEncoding(old_server_encoding);
!                             PG_RE_THROW();
!                         }
!                         PG_END_TRY();
!                     }
!                     else
!                     {
!                         outputbytes = SendFunctionCall(&out_functions[attnum - 1],
!                                                        value);
!                     }
!                     content = VARDATA(outputbytes);
!                     size = VARSIZE(outputbytes) - VARHDRSZ;
!                 }
! 
!                 /* Send only content in RAW_TEXT/RAW_BINARY mode */
!                 CopySendData(cstate, content, size);
!             }
!             else if (!cstate->binary)
              {
                  string = OutputFunctionCall(&out_functions[attnum - 1],
                                              value);
*************** BeginCopyFrom(Relation rel,
*** 2811,2875 ****
          }
      }
!     if (!cstate->binary)
!     {
!         /* must rely on user to tell us... */
!         cstate->file_has_oids = cstate->oids;
!     }
!     else
      {
!         /* Read and verify binary header */
!         char readSig[11];
!         int32 tmp;
! 
!         /* Signature */
!         if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
!             memcmp(readSig, BinarySignature, 11) != 0)
!             ereport(ERROR,
!                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                      errmsg("COPY file signature not recognized")));
!         /* Flags field */
!         if (!CopyGetInt32(cstate, &tmp))
!             ereport(ERROR,
!                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                      errmsg("invalid COPY file header (missing flags)")));
!         cstate->file_has_oids = (tmp & (1 << 16)) != 0;
!         tmp &= ~(1 << 16);
!         if ((tmp >> 16) != 0)
!             ereport(ERROR,
!                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                      errmsg("unrecognized critical flags in COPY file header")));
!         /* Header extension length */
!         if (!CopyGetInt32(cstate, &tmp) ||
!             tmp < 0)
!             ereport(ERROR,
!                     (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                      errmsg("invalid COPY file header (missing length)")));
!         /* Skip extension header, if present */
!         while (tmp-- > 0)
          {
!             if (CopyGetData(cstate, readSig, 1, 1) != 1)
                  ereport(ERROR,
                          (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                          errmsg("invalid COPY file header (wrong length)")));
          }
-     }
!     if (cstate->file_has_oids && cstate->binary)
!     {
!         getTypeBinaryInputInfo(OIDOID,
!                                &in_func_oid, &cstate->oid_typioparam);
!         fmgr_info(in_func_oid, &cstate->oid_in_function);
!     }
!     /* create workspace for CopyReadAttributes results */
!     if (!cstate->binary)
!     {
!         AttrNumber attr_count = list_length(cstate->attnumlist);
!         int nfields = cstate->file_has_oids ? (attr_count + 1) : attr_count;
!         cstate->max_fields = nfields;
!         cstate->raw_fields = (char **) palloc(nfields * sizeof(char *));
      }
      MemoryContextSwitchTo(oldcontext);
--- 2989,3057 ----
          }
      }
!     /* The raw mode hasn't any header information */
!     if (!cstate->raw)
      {
!         if (!cstate->binary)
          {
!             /* must rely on user to tell us... */
!             cstate->file_has_oids = cstate->oids;
!         }
!         else
!         {
!             /* Read and verify binary header */
!             char readSig[11];
!             int32 tmp;
! 
!             /* Signature */
!             if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
!                 memcmp(readSig, BinarySignature, 11) != 0)
                  ereport(ERROR,
                          (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                          errmsg("COPY file signature not recognized")));
!             /* Flags field */
!             if (!CopyGetInt32(cstate, &tmp))
!                 ereport(ERROR,
!                         (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                          errmsg("invalid COPY file header (missing flags)")));
!             cstate->file_has_oids = (tmp & (1 << 16)) != 0;
!             tmp &= ~(1 << 16);
!             if ((tmp >> 16) != 0)
!                 ereport(ERROR,
!                         (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                          errmsg("unrecognized critical flags in COPY file header")));
!             /* Header extension length */
!             if (!CopyGetInt32(cstate, &tmp) ||
!                 tmp < 0)
!                 ereport(ERROR,
!                         (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                          errmsg("invalid COPY file header (missing length)")));
!             /* Skip extension header, if present */
!             while (tmp-- > 0)
!             {
!                 if (CopyGetData(cstate, readSig, 1, 1) != 1)
!                     ereport(ERROR,
!                             (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
!                              errmsg("invalid COPY file header (wrong length)")));
!             }
          }
!         if (cstate->file_has_oids && cstate->binary)
!         {
!             getTypeBinaryInputInfo(OIDOID,
!                                    &in_func_oid, &cstate->oid_typioparam);
!             fmgr_info(in_func_oid, &cstate->oid_in_function);
!         }
!         /* create workspace for CopyReadAttributes results */
!         if (!cstate->binary)
!         {
!             AttrNumber attr_count = list_length(cstate->attnumlist);
!             int nfields = cstate->file_has_oids ? (attr_count + 1) : attr_count;
!             cstate->max_fields = nfields;
!             cstate->raw_fields = (char **) palloc(nfields * sizeof(char *));
!         }
      }
      MemoryContextSwitchTo(oldcontext);
*************** NextCopyFrom(CopyState cstate, ExprConte
*** 2968,2974 ****
      MemSet(values, 0, num_phys_attrs * sizeof(Datum));
      MemSet(nulls, true, num_phys_attrs * sizeof(bool));
!     if (!cstate->binary)
      {
          char **field_strings;
          ListCell *cur;
--- 3150,3203 ----
      MemSet(values, 0, num_phys_attrs * sizeof(Datum));
      MemSet(nulls, true, num_phys_attrs * sizeof(bool));
!     if (cstate->raw)
!     {
!         int m = linitial_int(cstate->attnumlist) - 1;
! 
!         /* All content was read in first cycle */
!         if (++cstate->cur_lineno > 1)
!             return false;
! 
!         CopyLoadallRawBuf(cstate);
! 
!         cstate->cur_attname = NameStr(attr[m]->attname);
! 
!         if (!cstate->binary)
!         {
!             char *cvt;
! 
!             cvt = pg_any_to_server(cstate->raw_buf,
!                                    cstate->raw_buf_len,
!                                    cstate->file_encoding);
! 
!             values[m] = InputFunctionCall(&in_functions[m],
!                                           cvt,
!                                           typioparams[m],
!                                           attr[m]->atttypmod);
!         }
!         else
!         {
!             cstate->attribute_buf.data = cstate->raw_buf;
!             cstate->attribute_buf.len = cstate->raw_buf_len;
!             cstate->attribute_buf.cursor = 0;
!             cstate->raw_buf = NULL;
! 
!             /* Call the column type's binary input converter */
!             values[m] = ReceiveFunctionCall(&in_functions[m], &cstate->attribute_buf,
!                                             typioparams[m], attr[m]->atttypmod);
! 
!             /* Trouble if it didn't eat the whole buffer */
!             if (cstate->attribute_buf.cursor != cstate->attribute_buf.len)
!                 ereport(ERROR,
!                         (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
!                          errmsg("incorrect binary data format")));
!         }
! 
!         nulls[m] = false;
! 
!         cstate->cur_attname = NULL;
!     }
!     else if (!cstate->binary)
      {
          char **field_strings;
          ListCell *cur;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index cb8a06d..4dab119
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*************** psql_completion(const char *text, int st
*** 1969,1976 ****
      /* Handle COPY [BINARY] <sth> FROM|TO filename */
      else if (Matches4("COPY|\\copy", MatchAny, "FROM|TO", MatchAny) ||
               Matches5("COPY", "BINARY", MatchAny, "FROM|TO", MatchAny))
!         COMPLETE_WITH_LIST6("BINARY", "OIDS", "DELIMITER", "NULL", "CSV",
!                             "ENCODING");
      /* Handle COPY [BINARY] <sth> FROM|TO filename CSV */
      else if (Matches5("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "CSV") ||
--- 1969,1976 ----
      /* Handle COPY [BINARY] <sth> FROM|TO filename */
      else if (Matches4("COPY|\\copy", MatchAny, "FROM|TO", MatchAny) ||
               Matches5("COPY", "BINARY", MatchAny, "FROM|TO", MatchAny))
!         COMPLETE_WITH_LIST8("BINARY", "RAW_TEXT", "RAW_BINARY", "OIDS",
!                             "DELIMITER", "NULL", "CSV", "ENCODING");
      /* Handle COPY [BINARY] <sth> FROM|TO filename CSV */
      else if (Matches5("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "CSV") ||
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
new file mode 100644
index 21dd772..a2754f1
*** a/src/interfaces/libpq/exports.txt
--- b/src/interfaces/libpq/exports.txt
*************** PQsslAttributeNames 168
*** 171,173 ****
--- 171,174 ----
  PQsslAttribute 169
  PQsetErrorContextVisibility 170
  PQresultVerboseErrorMessage 171
+ PQcopyFormat 172
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
new file mode 100644
index 2621767..09967e9
*** a/src/interfaces/libpq/fe-exec.c
--- b/src/interfaces/libpq/fe-exec.c
*************** PQmakeEmptyPGresult(PGconn *conn, ExecSt
*** 155,160 ****
--- 155,161 ----
      result->resultStatus = status;
      result->cmdStatus[0] = '\0';
      result->binary = 0;
+     result->raw = 0;
      result->events = NULL;
      result->nEvents = 0;
      result->errMsg = NULL;
*************** PQsetResultAttrs(PGresult *res, int numA
*** 256,263 ****
          if (!res->attDescs[i].name)
              return FALSE;
!         if (res->attDescs[i].format == 0)
              res->binary = 0;
      }
      return TRUE;
--- 257,266 ----
          if (!res->attDescs[i].name)
              return FALSE;
!         if (res->attDescs[i].format == 0 || res->attDescs[i].format == 2)
              res->binary = 0;
+         if (res->attDescs[i].format == 2 || res->attDescs[i].format == 3)
+             res->raw = 1;
      }
      return TRUE;
*************** PQcmdStatus(PGresult *res)
*** 2932,2937 ****
--- 2935,2955 ----
  }
  /*
+  * PQcopyFormat
+  *
+  * Returns a info about copy mode:
+  * -1 signalize a error, 0 = text mode, 1 = binary mode, 2 = raw mode
+  */
+ int
+ PQcopyFormat(const PGresult *res)
+ {
+     if (res->raw)
+         return 2;
+     else
+         return res->binary;
+ }
+ 
+ /*
   * PQoidStatus -
   * if the last command was an INSERT, return the oid string
   * if not, return ""
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
new file mode 100644
index 0b8c62f..1783844
*** a/src/interfaces/libpq/fe-protocol3.c
--- b/src/interfaces/libpq/fe-protocol3.c
*************** getCopyStart(PGconn *conn, ExecStatusTyp
*** 1486,1491 ****
--- 1486,1495 ----
           */
          format = (int) ((int16) format);
          result->attDescs[i].format = format;
+ 
+         /* when any field uses raw format, then COPY RAW_* was used */
+         if (format == 2 || format == 3)
+             result->raw = true;
      }
      /* Success! */
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
new file mode 100644
index 9ca0756..7984666
*** a/src/interfaces/libpq/libpq-fe.h
--- b/src/interfaces/libpq/libpq-fe.h
*************** extern Oid PQftype(const PGresult *res,
*** 479,484 ****
--- 479,485 ----
  extern int PQfsize(const PGresult *res, int field_num);
  extern int PQfmod(const PGresult *res, int field_num);
  extern char *PQcmdStatus(PGresult *res);
+ extern int PQcopyFormat(const PGresult *res);
  extern char *PQoidStatus(const PGresult *res); /* old and ugly */
  extern Oid PQoidValue(const PGresult *res); /* new and improved */
  extern char *PQcmdTuples(PGresult *res);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
new file mode 100644
index 1183323..8fc4b04
*** a/src/interfaces/libpq/libpq-int.h
--- b/src/interfaces/libpq/libpq-int.h
*************** struct pg_result
*** 180,185 ****
--- 180,186 ----
      char cmdStatus[CMDSTATUS_LEN]; /* cmd status from the query */
      int binary; /* binary tuple values if binary == 1,
                   * otherwise text */
+     int raw; /* raw mode for COPY */
      /*
       * These fields are copied from the originating PGconn, so that operations
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
new file mode 100644
index 5f6260a..3d36dd3
*** a/src/test/regress/expected/copy2.out
--- b/src/test/regress/expected/copy2.out
*************** DROP FUNCTION truncate_in_subxact();
*** 466,468 ****
--- 466,482 ----
  DROP TABLE x, y;
  DROP FUNCTION fn_x_before();
  DROP FUNCTION fn_x_after();
+ CREATE TABLE x(a bytea);
+ INSERT INTO x VALUES('\x41484f4a0a');
+ INSERT INTO x VALUES('\x41484f4a0a');
+ -- should to fail
+ COPY (SELECT a,a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary);
+ ERROR: Single column result/target is required in RAW_TEXT/RAW_BINARY mode
+ COPY (SELECT a FROM x) TO STDOUT (FORMAT raw_binary);
+ AHOJ
+ AHOJ
+ ERROR: single row result is required by RAW_TEXT/RAW_BINARY mode
+ -- should be ok
+ COPY (SELECT a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary);
+ AHOJ
+ DROP TABLE x;
diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
new file mode 100644
index cb13606..085ae36
*** a/src/test/regress/input/copy.source
--- b/src/test/regress/input/copy.source
*************** this is just a line full of junk that wo
*** 133,135 ****
--- 133,214 ----
  \.
  copy copytest3 to stdout csv header;
+ 
+ -- copy raw
+ CREATE TABLE x(a bytea);
+ INSERT INTO x VALUES('\x41484f4a0a');
+ SELECT length(a) FROM x;
+ 
+ INSERT INTO x VALUES('\x41484f4a0a');
+ 
+ -- should to fail
+ COPY (SELECT a,a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ COPY (SELECT a FROM x) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ 
+ -- should be ok
+ COPY (SELECT a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ TRUNCATE x;
+ COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ SELECT length(a) FROM x;
+ COPY x TO stdout (FORMAT raw_binary);
+ 
+ TRUNCATE x;
+ 
+ \COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary)
+ SELECT length(a) FROM x;
+ COPY x TO stdout (FORMAT raw_binary);
+ 
+ \COPY x TO '@abs_builddir@/results/raw2.data' (FORMAT raw_binary)
+ TRUNCATE x;
+ 
+ \COPY x FROM '@abs_builddir@/results/raw2.data' (FORMAT raw_binary)
+ SELECT length(a) FROM x;
+ COPY x TO stdout (FORMAT raw_binary);
+ 
+ -- test big file
+ TRUNCATE x;
+ -- use different mechanism for load to bytea
+ \lo_import @abs_builddir@/data/hash.data
+ \set lo_oid :LASTOID
+ INSERT INTO x VALUES(lo_get(:lo_oid));
+ \lo_unlink :lo_oid
+ 
+ COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary);
+ \COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary)
+ 
+ SELECT md5(a), length(a) FROM x;
+ 
+ TRUNCATE x;
+ COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary);
+ COPY x TO '@abs_builddir@/results/hash2.data' (FORMAT raw_binary);
+ \COPY x TO '@abs_builddir@/results/hash3.data' (FORMAT raw_binary)
+ 
+ -- read again
+ COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary);
+ \COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary)
+ -- cross
+ COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary);
+ \COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary)
+ 
+ SELECT md5(a), length(a) FROM x;
+ 
+ DROP TABLE x;
+ 
+ -- insert into multicolumn table
+ CREATE TABLE x(id serial, a bytea, b bytea);
+ 
+ -- should fail, too much columns
+ COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary);
+ 
+ -- should work
+ COPY x(a) FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary);
+ COPY x(b) FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary);
+ SELECT id, md5(a), md5(b) FROM x;
+ 
+ -- test raw_text
+ COPY (SELECT a FROM x WHERE id = 1) TO '@abs_builddir@/results/hash4.data' (FORMAT raw_text);
+ COPY x(a) FROM '@abs_builddir@/results/hash4.data' (FORMAT raw_text);
+ SELECT id, md5(a) FROM x WHERE id = lastval();
+ 
+ DROP TABLE x;
+ 
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
new file mode 100644
index b7e372d..e34bbab
*** a/src/test/regress/output/copy.source
--- b/src/test/regress/output/copy.source
*************** copy copytest3 to stdout csv header;
*** 95,97 ****
--- 95,208 ----
  c1,"col with , comma","col with "" quote"
  1,a,1
  2,b,2
+ -- copy raw
+ CREATE TABLE x(a bytea);
+ INSERT INTO x VALUES('\x41484f4a0a');
+ SELECT length(a) FROM x;
+  length
+ --------
+       5
+ (1 row)
+ 
+ INSERT INTO x VALUES('\x41484f4a0a');
+ -- should to fail
+ COPY (SELECT a,a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ ERROR: Single column result/target is required in RAW_TEXT/RAW_BINARY mode
+ COPY (SELECT a FROM x) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary);
+ ERROR: single row result is required by RAW_TEXT/RAW_BINARY mode
+ -- should be ok
+ COPY (SELECT a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data'
(FORMAT raw_binary); + TRUNCATE x; + COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + TRUNCATE x; + \COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + \COPY x TO '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + TRUNCATE x; + \COPY x FROM '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + -- test big file + TRUNCATE x; + -- use different mechanism for load to bytea + \lo_import @abs_builddir@/data/hash.data + \set lo_oid :LASTOID + INSERT INTO x VALUES(lo_get(:lo_oid)); + \lo_unlink :lo_oid + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary) + SELECT md5(a), length(a) FROM x; + md5 | length + ----------------------------------+-------- + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + (3 rows) + + TRUNCATE x; + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + COPY x TO '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x TO '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + -- read again + COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + -- cross + COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary) + SELECT md5(a), length(a) FROM x; + md5 | length + ----------------------------------+-------- + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + 
e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + (5 rows) + + DROP TABLE x; + -- insert into multicolumn table + CREATE TABLE x(id serial, a bytea, b bytea); + -- should fail, too much columns + COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + ERROR: Single column result/target is required in RAW_TEXT/RAW_BINARY mode + -- should work + COPY x(a) FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + COPY x(b) FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + SELECT id, md5(a), md5(b) FROM x; + id | md5 | md5 + ----+----------------------------------+---------------------------------- + 1 | e446fe6ea5a347e69670633412c7f8cb | + 2 | | e446fe6ea5a347e69670633412c7f8cb + (2 rows) + + -- test raw_text + COPY (SELECT a FROM x WHERE id = 1) TO '@abs_builddir@/results/hash4.data' (FORMAT raw_text); + COPY x(a) FROM '@abs_builddir@/results/hash4.data' (FORMAT raw_text); + SELECT id, md5(a) FROM x WHERE id = lastval(); + id | md5 + ----+---------------------------------- + 3 | e446fe6ea5a347e69670633412c7f8cb + (1 row) + + DROP TABLE x; diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql new file mode 100644 index 39a9deb..7e22ee4 *** a/src/test/regress/sql/copy2.sql --- b/src/test/regress/sql/copy2.sql *************** DROP FUNCTION truncate_in_subxact(); *** 333,335 **** --- 333,348 ---- DROP TABLE x, y; DROP FUNCTION fn_x_before(); DROP FUNCTION fn_x_after(); + + CREATE TABLE x(a bytea); + INSERT INTO x VALUES('\x41484f4a0a'); + INSERT INTO x VALUES('\x41484f4a0a'); + + -- should to fail + COPY (SELECT a,a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary); + COPY (SELECT a FROM x) TO STDOUT (FORMAT raw_binary); + + -- should be ok + COPY (SELECT a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary); + + DROP TABLE x;
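For readers checking the expected output by hand: the bytea fixture '\x41484f4a0a' used in the tests above is the five bytes "AHOJ" plus a newline, which is why length(a) is 5 and the raw_binary output prints as AHOJ. The following is only a minimal sketch in Python, not the server code. The names raw_copy_out and RawCopyError are hypothetical, and the handling of an empty result is an assumption; the error texts are copied from the expected output in the patch.

```python
# Fixture from the regression tests: INSERT INTO x VALUES('\x41484f4a0a')
FIXTURE_HEX = "41484f4a0a"


class RawCopyError(Exception):
    """Hypothetical error type for this sketch (not a PostgreSQL API)."""


def raw_copy_out(rows):
    """Toy model of the single-column / single-row rule for raw output.

    `rows` is a list of tuples of bytes, one tuple per result row.
    The sketch validates everything up front; it does not model what the
    server may already have sent to the client before raising an error.
    """
    if any(len(row) != 1 for row in rows):
        raise RawCopyError(
            "Single column result/target is required in RAW_TEXT/RAW_BINARY mode"
        )
    if len(rows) > 1:
        raise RawCopyError(
            "single row result is required by RAW_TEXT/RAW_BINARY mode"
        )
    # Assumption: an empty result yields no bytes; the patch excerpt does
    # not show this case.
    return rows[0][0] if rows else b""


if __name__ == "__main__":
    value = bytes.fromhex(FIXTURE_HEX)
    print(value, len(value))
    print(raw_copy_out([(value,)]))
```

The three calls the tests exercise map to raw_copy_out([(v, v)]) (two columns, rejected), raw_copy_out([(v,), (v,)]) (two rows, rejected) and raw_copy_out([(v,)]) (accepted, bytes returned unchanged).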