Hi

2016-03-29 20:59 GMT+02:00 Tom Lane <t...@sss.pgh.pa.us>:
> Pavel Stehule <pavel.steh...@gmail.com> writes:
> > I am writing few lines as summary:
> >
> > 1. invention RAW_TEXT and RAW_BINARY
> > 2. for RAW_BINARY: PQbinaryTuples() returns 1 and PQfformat() returns 1
> > 3.a for RAW_TEXT: PQbinaryTuples() returns 0 and PQfformat() returns 0, but
> > the client should to check PQcopyFormat() to not print "\n" on the end
> > 3.b for RAW_TEXT: PQbinaryTuples() returns 1 and PQfformat() returns 1, but
> > used output function, not necessary client modification
> > 4. PQcopyFormat() returns 0 for text, 1 for binary, 2 for RAW_TEXT, 3 for
> > RAW_BINARY
> > 5. create tests for ecpg
>
> 3.b certainly seems completely wrong. PQfformat==1 would imply binary
> data.
>
> I suggest that PQcopyFormat should be understood as defining the format
> of the copy data encapsulation, not the individual fields. So it would go
> like 0 = traditional text format, 1 = traditional binary format, 2 = raw
> (no encapsulation). You'd need to also look at PQfformat to distinguish
> raw text from raw binary. But if we do it as you suggest above, we've
> locked ourselves into only ever having two field format codes, which
> is something the existing design is specifically intended to allow
> expansion in.
>

I wrote a first version of the raw_text and raw_binary modes. I tried to pass RAW_TEXT data the same way as the text format, but for RAW_TEXT that is not practical. Text passing is designed for single-line data; multiline data has to be escaped, which is exactly what we don't want in RAW mode. I would have to skip the escaping, and the resulting code is not nice.

So I propose a different scheme: RAW_TEXT uses text values (it goes through the input/output functions) and enforces encoding conversion from/to the client encoding, but the data is passed to the client in binary mode, so I don't need to read the content line by line. PQbinaryTuples() returns 1 for both RAW_TEXT and RAW_BINARY; in these cases the data is passed as one binary value. PQfformat() returns 2 for RAW_TEXT and 3 for RAW_BINARY.

Any objections to this design?
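To make the proposed codes concrete, here is a minimal sketch of how a client could classify the per-column format code under this scheme. It is only an illustration of the proposal: codes 2 and 3 are not part of any released protocol, the helper names (`copy_field_is_raw`, `copy_field_uses_text_io`, `copy_field_passed_as_binary`, `copy_field_format_name`) are hypothetical, and nothing here calls libpq.

```c
#include <stdbool.h>

/*
 * Per-column format codes as proposed in this mail.
 * 0 and 1 are the existing text/binary codes; 2 and 3 are only proposed.
 * All names below are hypothetical and exist just for this sketch.
 */
enum
{
    COPY_FIELD_TEXT = 0,
    COPY_FIELD_BINARY = 1,
    COPY_FIELD_RAW_TEXT = 2,
    COPY_FIELD_RAW_BINARY = 3
};

/* true when the single value arrives without any COPY encapsulation */
static bool
copy_field_is_raw(int format)
{
    return format == COPY_FIELD_RAW_TEXT || format == COPY_FIELD_RAW_BINARY;
}

/*
 * true when the value is produced by the type's text input/output
 * functions, so client encoding conversion applies to it
 */
static bool
copy_field_uses_text_io(int format)
{
    return format == COPY_FIELD_TEXT || format == COPY_FIELD_RAW_TEXT;
}

/*
 * true when the data is handed over as an opaque byte string instead of
 * being read line by line (under the proposal: everything except plain text)
 */
static bool
copy_field_passed_as_binary(int format)
{
    return format == COPY_FIELD_BINARY || copy_field_is_raw(format);
}

/* human-readable name, e.g. for debug output */
static const char *
copy_field_format_name(int format)
{
    switch (format)
    {
        case COPY_FIELD_TEXT:
            return "text";
        case COPY_FIELD_BINARY:
            return "binary";
        case COPY_FIELD_RAW_TEXT:
            return "raw_text";
        case COPY_FIELD_RAW_BINARY:
            return "raw_binary";
        default:
            return "unknown";
    }
}
```

The point of keeping "raw" and "text I/O" as separate predicates is that RAW_TEXT is the only code for which both are true: no line-by-line reading, but still subject to encoding conversion.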
Regards

Pavel

> regards, tom lane
>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml new file mode 100644 index 07e2f45..68fbfd8 *** a/doc/src/sgml/ref/copy.sgml --- b/doc/src/sgml/ref/copy.sgml *************** COPY { <replaceable class="parameter">ta *** 197,203 **** Selects the data format to be read or written: <literal>text</>, <literal>csv</> (Comma Separated Values), ! or <literal>binary</>. The default is <literal>text</>. </para> </listitem> --- 197,203 ---- Selects the data format to be read or written: <literal>text</>, <literal>csv</> (Comma Separated Values), ! <literal>binary</> or <literal>raw</literal>. The default is <literal>text</>. </para> </listitem> *************** OIDs to be shown as null if that ever pr *** 888,893 **** --- 888,925 ---- </para> </refsect3> </refsect2> + + <refsect2> + <title>Raw Format</title> + + <para> + The <literal>raw</literal> format option causes all data to be + stored/read as binary format rather than as text. It shares format + for data with <literal>binary</literal> format. This format doesn't + use any metadata - only row data in network byte order are exported + or imported. + </para> + + <para> + Because this format doesn't support any delimiter, only one value + can be exported or imported. NULL values are not allowed. + </para> + <para> + The <literal>raw</literal> format can be used for export or import + bytea values. 
+ <programlisting> + COPY images(data) FROM '/usr1/proj/img/01.jpg' (FORMAT raw); + </programlisting> + It can also be used to export XML in a different encoding + or to import a valid XML document in any supported encoding: + <screen><![CDATA[ + SET client_encoding TO latin2; + + COPY (SELECT xmlelement(NAME data, 'Hello')) TO stdout (FORMAT raw); + <?xml version="1.0" encoding="LATIN2"?><data>Hello</data> + ]]></screen> + </para> + </refsect2> </refsect1> <refsect1> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c new file mode 100644 index 3201476..1de36b6 *** a/src/backend/commands/copy.c --- b/src/backend/commands/copy.c *************** typedef enum EolType *** 89,94 **** --- 89,99 ---- * it's faster to make useless comparisons to trailing bytes than it is to * invoke pg_encoding_mblen() to skip over them. encoding_embeds_ascii is TRUE * when we have to do it the hard way. + * + * COPY supports four modes: text, binary, raw_text and raw_binary. The text + * format is plain text multiline format with specified delimiter. The binary + * format holds metadata (numbers, sizes) and data. The raw formats hold data + * only, and only a single non-NULL value can be processed. */ typedef struct CopyStateData { *************** typedef struct CopyStateData *** 110,115 **** --- 115,121 ---- char *filename; /* filename, or NULL for STDIN/STDOUT */ bool is_program; /* is 'filename' a program to popen? */ bool binary; /* binary format? */ + bool raw; /* raw format (no encapsulation)? */ bool oids; /* include OIDs? */ bool freeze; /* freeze rows on loading? */ bool csv_mode; /* Comma Separated Value format? 
*/ *************** typedef struct CopyStateData *** 199,204 **** --- 205,213 ---- char *raw_buf; int raw_buf_index; /* next byte to process */ int raw_buf_len; /* total # of bytes stored */ + + /* field for RAW mode */ + bool row_processed; /* true, when first row was processed */ } CopyStateData; /* DestReceiver for COPY (query) TO */ *************** SendCopyBegin(CopyState cstate) *** 342,353 **** /* new way */ StringInfoData buf; int natts = list_length(cstate->attnumlist); ! int16 format = (cstate->binary ? 1 : 0); int i; pq_beginmessage(&buf, 'H'); ! pq_sendbyte(&buf, format); /* overall format */ pq_sendint(&buf, natts, 2); for (i = 0; i < natts; i++) pq_sendint(&buf, format, 2); /* per-column formats */ pq_endmessage(&buf); --- 351,368 ---- /* new way */ StringInfoData buf; int natts = list_length(cstate->attnumlist); ! int16 format; int i; pq_beginmessage(&buf, 'H'); ! pq_sendbyte(&buf, cstate->binary ? 1 : 0); /* overall format */ pq_sendint(&buf, natts, 2); + + if (!cstate->raw) + format = cstate->binary ? 1 : 0; + else + format = cstate->binary ? 3 : 2; + for (i = 0; i < natts; i++) pq_sendint(&buf, format, 2); /* per-column formats */ pq_endmessage(&buf); *************** SendCopyBegin(CopyState cstate) *** 356,365 **** else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) { /* old way */ ! if (cstate->binary) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg("COPY BINARY is not supported to stdout or from stdin"))); pq_putemptymessage('H'); /* grottiness needed for old COPY OUT protocol */ pq_startcopyout(); --- 371,380 ---- else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) { /* old way */ ! if (cstate->binary || cstate->raw) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! 
errmsg("COPY BINARY or COPY RAW is not supported to stdout or from stdin"))); pq_putemptymessage('H'); /* grottiness needed for old COPY OUT protocol */ pq_startcopyout(); *************** SendCopyBegin(CopyState cstate) *** 368,377 **** else { /* very old way */ ! if (cstate->binary) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg("COPY BINARY is not supported to stdout or from stdin"))); pq_putemptymessage('B'); /* grottiness needed for old COPY OUT protocol */ pq_startcopyout(); --- 383,392 ---- else { /* very old way */ ! if (cstate->binary || cstate->raw) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg("COPY BINARY or COPY RAW is not supported to stdout or from stdin"))); pq_putemptymessage('B'); /* grottiness needed for old COPY OUT protocol */ pq_startcopyout(); *************** ReceiveCopyBegin(CopyState cstate) *** 387,398 **** /* new way */ StringInfoData buf; int natts = list_length(cstate->attnumlist); ! int16 format = (cstate->binary ? 1 : 0); int i; pq_beginmessage(&buf, 'G'); ! pq_sendbyte(&buf, format); /* overall format */ pq_sendint(&buf, natts, 2); for (i = 0; i < natts; i++) pq_sendint(&buf, format, 2); /* per-column formats */ pq_endmessage(&buf); --- 402,419 ---- /* new way */ StringInfoData buf; int natts = list_length(cstate->attnumlist); ! int16 format; int i; pq_beginmessage(&buf, 'G'); ! pq_sendbyte(&buf, cstate->binary ? 1 : 0); /* overall format */ pq_sendint(&buf, natts, 2); + + if (!cstate->raw) + format = cstate->binary ? 1 : 0; + else + format = cstate->binary ? 3 : 2; + for (i = 0; i < natts; i++) pq_sendint(&buf, format, 2); /* per-column formats */ pq_endmessage(&buf); *************** ReceiveCopyBegin(CopyState cstate) *** 402,408 **** else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) { /* old way */ ! 
if (cstate->binary) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("COPY BINARY is not supported to stdout or from stdin"))); --- 423,429 ---- else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) { /* old way */ ! if (cstate->binary || cstate->raw) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("COPY BINARY is not supported to stdout or from stdin"))); *************** ReceiveCopyBegin(CopyState cstate) *** 414,420 **** else { /* very old way */ ! if (cstate->binary) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("COPY BINARY is not supported to stdout or from stdin"))); --- 435,441 ---- else { /* very old way */ ! if (cstate->binary || cstate->raw) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("COPY BINARY is not supported to stdout or from stdin"))); *************** CopySendEndOfRow(CopyState cstate) *** 482,488 **** switch (cstate->copy_dest) { case COPY_FILE: ! if (!cstate->binary) { /* Default line termination depends on platform */ #ifndef WIN32 --- 503,509 ---- switch (cstate->copy_dest) { case COPY_FILE: ! if (!cstate->binary && !cstate->raw) { /* Default line termination depends on platform */ #ifndef WIN32 *************** CopySendEndOfRow(CopyState cstate) *** 527,533 **** break; case COPY_OLD_FE: /* The FE/BE protocol uses \n as newline for all platforms */ ! if (!cstate->binary) CopySendChar(cstate, '\n'); if (pq_putbytes(fe_msgbuf->data, fe_msgbuf->len)) --- 548,554 ---- break; case COPY_OLD_FE: /* The FE/BE protocol uses \n as newline for all platforms */ ! if (!cstate->binary && !cstate->raw) CopySendChar(cstate, '\n'); if (pq_putbytes(fe_msgbuf->data, fe_msgbuf->len)) *************** CopySendEndOfRow(CopyState cstate) *** 540,546 **** break; case COPY_NEW_FE: /* The FE/BE protocol uses \n as newline for all platforms */ ! 
if (!cstate->binary) CopySendChar(cstate, '\n'); /* Dump the accumulated row as one CopyData message */ --- 561,567 ---- break; case COPY_NEW_FE: /* The FE/BE protocol uses \n as newline for all platforms */ ! if (!cstate->binary && !cstate->raw) CopySendChar(cstate, '\n'); /* Dump the accumulated row as one CopyData message */ *************** CopyGetData(CopyState cstate, void *data *** 597,602 **** --- 618,624 ---- bytesread = minread; break; case COPY_NEW_FE: + while (maxread > 0 && bytesread < minread && !cstate->fe_eof) { int avail; *************** CopyGetData(CopyState cstate, void *data *** 619,624 **** --- 641,647 ---- (errcode(ERRCODE_CONNECTION_FAILURE), errmsg("unexpected EOF on client connection with an open transaction"))); RESUME_CANCEL_INTERRUPTS(); + switch (mtype) { case 'd': /* CopyData */ *************** CopyLoadRawBuf(CopyState cstate) *** 766,771 **** --- 789,825 ---- return (inbytes > 0); } + /* + * CopyLoadallRawBuf load all file into raw_buf. + * + * It is used for reading content in raw mode. If original RAW_BUF_SIZE is not + * enough, the buffer is enlarged. 
+ */ + static void + CopyLoadallRawBuf(CopyState cstate) + { + int nbytes = 0; + int inbytes; + Size raw_buf_size = RAW_BUF_SIZE; + + do + { + /* hold enough space for one data packet */ + if ((raw_buf_size - nbytes - 1) < 8 * 1024) + { + raw_buf_size += RAW_BUF_SIZE; + cstate->raw_buf = repalloc(cstate->raw_buf, raw_buf_size); + } + + inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes, 1, raw_buf_size - nbytes - 1); + nbytes += inbytes; + } + while (inbytes > 0); + + cstate->raw_buf[nbytes] = '\0'; + cstate->raw_buf_index = 0; + cstate->raw_buf_len = nbytes; + } /* * DoCopy executes the SQL COPY statement *************** ProcessCopyOptions(CopyState cstate, *** 1013,1018 **** --- 1067,1079 ---- cstate->csv_mode = true; else if (strcmp(fmt, "binary") == 0) cstate->binary = true; + else if (strcmp(fmt, "raw_text") == 0) + cstate->raw = true; + else if (strcmp(fmt, "raw_binary") == 0) + { + cstate->binary = true; + cstate->raw = true; + } else ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), *************** ProcessCopyOptions(CopyState cstate, *** 1162,1176 **** * Check for incompatible options (must do these two before inserting * defaults) */ ! if (cstate->binary && cstate->delim) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("cannot specify DELIMITER in BINARY mode"))); ! if (cstate->binary && cstate->null_print) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("cannot specify NULL in BINARY mode"))); /* Set defaults for omitted options */ if (!cstate->delim) --- 1223,1242 ---- * Check for incompatible options (must do these two before inserting * defaults) */ ! if ((cstate->binary || cstate->raw) && cstate->delim) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("cannot specify DELIMITER in BINARY or RAW mode"))); ! if ((cstate->binary || cstate->raw) && cstate->null_print) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("cannot specify NULL in BINARY or RAW mode"))); ! ! if (cstate->raw && cstate->oids) ! 
ereport(ERROR, ! (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("cannot specify OIDS in RAW mode"))); /* Set defaults for omitted options */ if (!cstate->delim) *************** BeginCopy(bool is_from, *** 1608,1613 **** --- 1674,1693 ---- } } + /* + * Initialize the field "row_processed" for one row output in RAW mode, + * and ensure only one output column. + */ + if (cstate->raw) + { + cstate->row_processed = false; + + if (num_phys_attrs > 1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("only single column result is allowed in RAW mode"))); + } + /* Use client encoding when ENCODING option is not specified. */ if (cstate->file_encoding < 0) cstate->file_encoding = pg_get_client_encoding(); *************** CopyTo(CopyState cstate) *** 1899,1905 **** ALLOCSET_DEFAULT_INITSIZE, ALLOCSET_DEFAULT_MAXSIZE); ! if (cstate->binary) { /* Generate header for a binary copy */ int32 tmp; --- 1979,1985 ---- ALLOCSET_DEFAULT_INITSIZE, ALLOCSET_DEFAULT_MAXSIZE); ! if (!cstate->raw && cstate->binary) { /* Generate header for a binary copy */ int32 tmp; *************** CopyTo(CopyState cstate) *** 1927,1933 **** cstate->file_encoding); /* if a header has been requested send the line */ ! if (cstate->header_line) { bool hdr_delim = false; --- 2007,2013 ---- cstate->file_encoding); /* if a header has been requested send the line */ ! if (!cstate->raw && cstate->header_line) { bool hdr_delim = false; *************** CopyTo(CopyState cstate) *** 1983,1993 **** else { /* run the plan --- the dest receiver will send tuples */ ! ExecutorRun(cstate->queryDesc, ForwardScanDirection, 0L); processed = ((DR_copy *) cstate->queryDesc->dest)->processed; } ! if (cstate->binary) { /* Generate trailer for a binary copy */ CopySendInt16(cstate, -1); --- 2063,2073 ---- else { /* run the plan --- the dest receiver will send tuples */ ! ExecutorRun(cstate->queryDesc, ForwardScanDirection, cstate->raw ? 2L : 0L); processed = ((DR_copy *) cstate->queryDesc->dest)->processed; } ! 
if (!cstate->raw && cstate->binary) { /* Generate trailer for a binary copy */ CopySendInt16(cstate, -1); *************** CopyOneRowTo(CopyState cstate, Oid tuple *** 2015,2021 **** MemoryContextReset(cstate->rowcontext); oldcontext = MemoryContextSwitchTo(cstate->rowcontext); ! if (cstate->binary) { /* Binary per-tuple header */ CopySendInt16(cstate, list_length(cstate->attnumlist)); --- 2095,2109 ---- MemoryContextReset(cstate->rowcontext); oldcontext = MemoryContextSwitchTo(cstate->rowcontext); ! if (cstate->raw) ! { ! if (cstate->row_processed) ! ereport(ERROR, ! (errcode(ERRCODE_TOO_MANY_ROWS), ! errmsg("only single row result is allowed in RAW mode"))); ! cstate->row_processed = true; ! } ! else if (cstate->binary) { /* Binary per-tuple header */ CopySendInt16(cstate, list_length(cstate->attnumlist)); *************** CopyOneRowTo(CopyState cstate, Oid tuple *** 2046,2052 **** Datum value = values[attnum - 1]; bool isnull = nulls[attnum - 1]; ! if (!cstate->binary) { if (need_delim) CopySendChar(cstate, cstate->delim[0]); --- 2134,2140 ---- Datum value = values[attnum - 1]; bool isnull = nulls[attnum - 1]; ! if (!(cstate->binary || cstate->raw)) { if (need_delim) CopySendChar(cstate, cstate->delim[0]); *************** CopyOneRowTo(CopyState cstate, Oid tuple *** 2055,2068 **** if (isnull) { ! if (!cstate->binary) CopySendString(cstate, cstate->null_print_client); else CopySendInt32(cstate, -1); } else { ! if (!cstate->binary) { string = OutputFunctionCall(&out_functions[attnum - 1], value); --- 2143,2174 ---- if (isnull) { ! if (cstate->raw) ! ereport(ERROR, ! (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), ! errmsg("cannot copy NULL value in RAW mode"))); ! else if (!cstate->binary) CopySendString(cstate, cstate->null_print_client); else CopySendInt32(cstate, -1); } else { ! if (cstate->binary) ! { ! bytea *outputbytes; ! ! outputbytes = SendFunctionCall(&out_functions[attnum - 1], ! value); ! ! /* send the size only in binary mode */ ! if (!cstate->raw) ! 
CopySendInt32(cstate, VARSIZE(outputbytes) - VARHDRSZ); ! ! CopySendData(cstate, VARDATA(outputbytes), ! VARSIZE(outputbytes) - VARHDRSZ); ! } ! else { string = OutputFunctionCall(&out_functions[attnum - 1], value); *************** CopyOneRowTo(CopyState cstate, Oid tuple *** 2073,2088 **** else CopyAttributeOutText(cstate, string); } - else - { - bytea *outputbytes; - - outputbytes = SendFunctionCall(&out_functions[attnum - 1], - value); - CopySendInt32(cstate, VARSIZE(outputbytes) - VARHDRSZ); - CopySendData(cstate, VARDATA(outputbytes), - VARSIZE(outputbytes) - VARHDRSZ); - } } } --- 2179,2184 ---- *************** BeginCopyFrom(Relation rel, *** 2811,2859 **** } } ! if (!cstate->binary) ! { ! /* must rely on user to tell us... */ ! cstate->file_has_oids = cstate->oids; ! } ! else { ! /* Read and verify binary header */ ! char readSig[11]; ! int32 tmp; ! ! /* Signature */ ! if (CopyGetData(cstate, readSig, 11, 11) != 11 || ! memcmp(readSig, BinarySignature, 11) != 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("COPY file signature not recognized"))); ! /* Flags field */ ! if (!CopyGetInt32(cstate, &tmp)) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (missing flags)"))); ! cstate->file_has_oids = (tmp & (1 << 16)) != 0; ! tmp &= ~(1 << 16); ! if ((tmp >> 16) != 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("unrecognized critical flags in COPY file header"))); ! /* Header extension length */ ! if (!CopyGetInt32(cstate, &tmp) || ! tmp < 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (missing length)"))); ! /* Skip extension header, if present */ ! while (tmp-- > 0) { ! if (CopyGetData(cstate, readSig, 1, 1) != 1) ereport(ERROR, (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (wrong length)"))); } } if (cstate->file_has_oids && cstate->binary) { --- 2907,2956 ---- } } ! if (cstate->binary) { ! 
if (!cstate->raw) { ! /* Read and verify binary header */ ! char readSig[11]; ! int32 tmp; ! ! /* Signature */ ! if (CopyGetData(cstate, readSig, 11, 11) != 11 || ! memcmp(readSig, BinarySignature, 11) != 0) ereport(ERROR, (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("COPY file signature not recognized"))); ! /* Flags field */ ! if (!CopyGetInt32(cstate, &tmp)) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (missing flags)"))); ! cstate->file_has_oids = (tmp & (1 << 16)) != 0; ! tmp &= ~(1 << 16); ! if ((tmp >> 16) != 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("unrecognized critical flags in COPY file header"))); ! /* Header extension length */ ! if (!CopyGetInt32(cstate, &tmp) || ! tmp < 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (missing length)"))); ! /* Skip extension header, if present */ ! while (tmp-- > 0) ! { ! if (CopyGetData(cstate, readSig, 1, 1) != 1) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid COPY file header (wrong length)"))); ! } } } + else + cstate->file_has_oids = cstate->oids; + if (cstate->file_has_oids && cstate->binary) { *************** NextCopyFromRawFields(CopyState cstate, *** 2918,2928 **** if (done && cstate->line_buf.len == 0) return false; ! /* Parse the line into de-escaped field values */ ! if (cstate->csv_mode) ! fldct = CopyReadAttributesCSV(cstate); else ! fldct = CopyReadAttributesText(cstate); *fields = cstate->raw_fields; *nfields = fldct; --- 3015,3051 ---- if (done && cstate->line_buf.len == 0) return false; ! /* try to read all content in raw mode */ ! if (cstate->raw) ! { ! StringInfoData lines; ! ! initStringInfo(&lines); ! ! do ! { ! if (lines.len > 0) ! appendStringInfoChar(&lines, '\n'); ! ! appendBinaryStringInfo(&lines, cstate->line_buf.data, cstate->line_buf.len); ! ! cstate->cur_lineno++; ! done = CopyReadLine(cstate); ! 
} while (!(done && cstate->line_buf.len == 0)); ! ! appendStringInfoChar(&lines, '\0'); ! ! cstate->raw_fields[0] = &lines.data; ! fldct = 1; ! } else ! { ! /* Parse the line into de-escaped field values */ ! if (cstate->csv_mode) ! fldct = CopyReadAttributesCSV(cstate); ! else ! fldct = CopyReadAttributesText(cstate); ! } *fields = cstate->raw_fields; *nfields = fldct; *************** NextCopyFrom(CopyState cstate, ExprConte *** 2968,2975 **** MemSet(values, 0, num_phys_attrs * sizeof(Datum)); MemSet(nulls, true, num_phys_attrs * sizeof(bool)); ! if (!cstate->binary) { char **field_strings; ListCell *cur; int fldct; --- 3091,3210 ---- MemSet(values, 0, num_phys_attrs * sizeof(Datum)); MemSet(nulls, true, num_phys_attrs * sizeof(bool)); ! if (cstate->binary && !cstate->raw) ! { ! int16 fld_count; ! ListCell *cur; ! ! cstate->cur_lineno++; ! ! if (!CopyGetInt16(cstate, &fld_count)) ! { ! /* EOF detected (end of file, or protocol-level EOF) */ ! return false; ! } ! ! if (fld_count == -1) ! { ! /* ! * Received EOF marker. In a V3-protocol copy, wait for the ! * protocol-level EOF, and complain if it doesn't come ! * immediately. This ensures that we correctly handle CopyFail, ! * if client chooses to send that now. ! * ! * Note that we MUST NOT try to read more data in an old-protocol ! * copy, since there is no protocol-level EOF marker then. We ! * could go either way for copy from file, but choose to throw ! * error if there's data after the EOF marker, for consistency ! * with the new-protocol case. ! */ ! char dummy; ! ! if (cstate->copy_dest != COPY_OLD_FE && ! CopyGetData(cstate, &dummy, 1, 1) > 0) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("received copy data after EOF marker"))); ! return false; ! } ! ! if (fld_count != attr_count) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("row field count is %d, expected %d", ! (int) fld_count, attr_count))); ! ! if (file_has_oids) ! { ! Oid loaded_oid; ! ! 
cstate->cur_attname = "oid"; ! loaded_oid = ! DatumGetObjectId(CopyReadBinaryAttribute(cstate, ! 0, ! &cstate->oid_in_function, ! cstate->oid_typioparam, ! -1, ! &isnull)); ! if (isnull || loaded_oid == InvalidOid) ! ereport(ERROR, ! (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), ! errmsg("invalid OID in COPY data"))); ! cstate->cur_attname = NULL; ! if (cstate->oids && tupleOid != NULL) ! *tupleOid = loaded_oid; ! } ! ! i = 0; ! foreach(cur, cstate->attnumlist) ! { ! int attnum = lfirst_int(cur); ! int m = attnum - 1; ! ! cstate->cur_attname = NameStr(attr[m]->attname); ! i++; ! values[m] = CopyReadBinaryAttribute(cstate, ! i, ! &in_functions[m], ! typioparams[m], ! attr[m]->atttypmod, ! &nulls[m]); ! cstate->cur_attname = NULL; ! } ! } ! else if (cstate->binary && cstate->raw) ! { ! if (cstate->row_processed) ! return false; ! ! CopyLoadallRawBuf(cstate); ! cstate->cur_attname = NameStr(attr[0]->attname); ! ! if (cstate->attribute_buf.data != NULL) ! pfree(cstate->attribute_buf.data); ! ! cstate->attribute_buf.data = cstate->raw_buf; ! cstate->attribute_buf.len = cstate->raw_buf_len; ! cstate->attribute_buf.cursor = 0; ! ! cstate->raw_buf = NULL; ! ! /* Call the column type's binary input converter */ ! values[0] = ReceiveFunctionCall(&in_functions[0], &cstate->attribute_buf, ! typioparams[0], attr[0]->atttypmod); ! nulls[0] = false; ! ! /* Trouble if it didn't eat the whole buffer */ ! if (cstate->attribute_buf.cursor != cstate->attribute_buf.len) ! ereport(ERROR, ! (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION), ! errmsg("incorrect binary data format"))); ! ! cstate->row_processed = true; ! } ! 
else { + /* text */ char **field_strings; ListCell *cur; int fldct; *************** NextCopyFrom(CopyState cstate, ExprConte *** 3074,3161 **** Assert(fieldno == nfields); } - else - { - /* binary */ - int16 fld_count; - ListCell *cur; - - cstate->cur_lineno++; - - if (!CopyGetInt16(cstate, &fld_count)) - { - /* EOF detected (end of file, or protocol-level EOF) */ - return false; - } - - if (fld_count == -1) - { - /* - * Received EOF marker. In a V3-protocol copy, wait for the - * protocol-level EOF, and complain if it doesn't come - * immediately. This ensures that we correctly handle CopyFail, - * if client chooses to send that now. - * - * Note that we MUST NOT try to read more data in an old-protocol - * copy, since there is no protocol-level EOF marker then. We - * could go either way for copy from file, but choose to throw - * error if there's data after the EOF marker, for consistency - * with the new-protocol case. - */ - char dummy; - - if (cstate->copy_dest != COPY_OLD_FE && - CopyGetData(cstate, &dummy, 1, 1) > 0) - ereport(ERROR, - (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), - errmsg("received copy data after EOF marker"))); - return false; - } - - if (fld_count != attr_count) - ereport(ERROR, - (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), - errmsg("row field count is %d, expected %d", - (int) fld_count, attr_count))); - - if (file_has_oids) - { - Oid loaded_oid; - - cstate->cur_attname = "oid"; - loaded_oid = - DatumGetObjectId(CopyReadBinaryAttribute(cstate, - 0, - &cstate->oid_in_function, - cstate->oid_typioparam, - -1, - &isnull)); - if (isnull || loaded_oid == InvalidOid) - ereport(ERROR, - (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), - errmsg("invalid OID in COPY data"))); - cstate->cur_attname = NULL; - if (cstate->oids && tupleOid != NULL) - *tupleOid = loaded_oid; - } - - i = 0; - foreach(cur, cstate->attnumlist) - { - int attnum = lfirst_int(cur); - int m = attnum - 1; - - cstate->cur_attname = NameStr(attr[m]->attname); - i++; - values[m] = 
CopyReadBinaryAttribute(cstate, - i, - &in_functions[m], - typioparams[m], - attr[m]->atttypmod, - &nulls[m]); - cstate->cur_attname = NULL; - } - } /* * Now compute and insert any defaults available for the columns not --- 3309,3314 ---- *************** CopyAttributeOutText(CopyState cstate, c *** 4143,4148 **** --- 4296,4312 ---- ptr = string; /* + * Do not any escaping when raw mode is used. In this mode only one field + * is passed - so escaping is useless. We would to work with raw data, i.e. + * no escaping. + */ + if (cstate->raw) + { + CopySendString(cstate, ptr); + return; + } + + /* * We have to grovel through the string searching for control characters * and instances of the delimiter character. In most cases, though, these * are infrequent. To avoid overhead from calling CopySendData once per *************** CopyAttributeOutText(CopyState cstate, c *** 4156,4162 **** * it's worth making two copies of it to get the IS_HIGHBIT_SET() test out * of the normal safe-encoding path. */ ! if (cstate->encoding_embeds_ascii) { start = ptr; while ((c = *ptr) != '\0') --- 4320,4326 ---- * it's worth making two copies of it to get the IS_HIGHBIT_SET() test out * of the normal safe-encoding path. */ ! else if (cstate->encoding_embeds_ascii) { start = ptr; while ((c = *ptr) != '\0') diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c new file mode 100644 index 892058e..777a375 *** a/src/bin/psql/common.c --- b/src/bin/psql/common.c *************** ProcessResult(PGresult **results) *** 871,876 **** --- 871,877 ---- { if (!copystream) copystream = pset.cur_cmd_source; + success = handleCopyIn(pset.db, copystream, PQbinaryTuples(*results), diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile new file mode 100644 index 1b292d2..83b30b0 *** a/src/interfaces/libpq/Makefile --- b/src/interfaces/libpq/Makefile *************** include $(top_builddir)/src/Makefile.glo *** 17,23 **** # shared library parameters NAME= pq SO_MAJOR_VERSION= 5 ! 
SO_MINOR_VERSION= 9 override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port ifneq ($(PORTNAME), win32) --- 17,23 ---- # shared library parameters NAME= pq SO_MAJOR_VERSION= 5 ! SO_MINOR_VERSION= 10 override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port ifneq ($(PORTNAME), win32) diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c new file mode 100644 index 41937c0..096ed6b *** a/src/interfaces/libpq/fe-exec.c --- b/src/interfaces/libpq/fe-exec.c *************** PQmakeEmptyPGresult(PGconn *conn, ExecSt *** 155,160 **** --- 155,161 ---- result->resultStatus = status; result->cmdStatus[0] = '\0'; result->binary = 0; + result->raw = 0; result->events = NULL; result->nEvents = 0; result->errMsg = NULL; *************** PQsetResultAttrs(PGresult *res, int numA *** 245,250 **** --- 246,252 ---- /* deep-copy the attribute names, and determine format */ res->binary = 1; + res->raw = 0; for (i = 0; i < res->numAttributes; i++) { if (res->attDescs[i].name) *************** PQsetResultAttrs(PGresult *res, int numA *** 255,262 **** if (!res->attDescs[i].name) return FALSE; ! if (res->attDescs[i].format == 0) res->binary = 0; } return TRUE; --- 257,266 ---- if (!res->attDescs[i].name) return FALSE; ! if (res->attDescs[i].format == 0 || res->attDescs[i].format == 2) res->binary = 0; + if (res->attDescs[i].format == 2 || res->attDescs[i].format == 3) + res->raw = 1; } return TRUE; *************** PQcopyResult(const PGresult *src, int fl *** 372,377 **** --- 376,382 ---- return dest; } + /* * Copy an array of PGEvents (with no extra space for more). * Does not duplicate the event instance data, sets this to NULL. 
*************** PQbinaryTuples(const PGresult *res) *** 2634,2639 **** --- 2639,2645 ---- { if (!res) return 0; + return res->binary; } *************** PQfmod(const PGresult *res, int field_nu *** 2884,2889 **** --- 2890,2910 ---- return 0; } + /* + * PQcopyFormat + * + * Returns information about the copy mode: + * -1 signals an error, 0 = text mode, 1 = binary mode, 2 = raw mode + */ + int + PQcopyFormat(const PGresult *res) + { + if (res->raw) + return 2; + else + return res->binary; + } + char * PQcmdStatus(PGresult *res) { diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c new file mode 100644 index 43898a4..3934b1d *** a/src/interfaces/libpq/fe-protocol3.c --- b/src/interfaces/libpq/fe-protocol3.c *************** getCopyStart(PGconn *conn, ExecStatusTyp *** 1397,1402 **** --- 1397,1403 ---- if (pqGetc(&conn->copy_is_binary, conn)) goto failure; + result->binary = conn->copy_is_binary; /* the next two bytes are the number of fields */ if (pqGetInt(&(result->numAttributes), 2, conn)) *************** getCopyStart(PGconn *conn, ExecStatusTyp *** 1426,1431 **** --- 1427,1436 ---- */ format = (int) ((int16) format); result->attDescs[i].format = format; + + /* if any field uses a raw format, then COPY RAW was used */ + if (format == 2 || format == 3) + result->raw = true; } /* Success! 
*/ diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h new file mode 100644 index 6bf34b3..9f7903a *** a/src/interfaces/libpq/libpq-fe.h --- b/src/interfaces/libpq/libpq-fe.h *************** extern int PQfformat(const PGresult *res *** 475,480 **** --- 475,481 ---- extern Oid PQftype(const PGresult *res, int field_num); extern int PQfsize(const PGresult *res, int field_num); extern int PQfmod(const PGresult *res, int field_num); + extern int PQcopyFormat(const PGresult *res); extern char *PQcmdStatus(PGresult *res); extern char *PQoidStatus(const PGresult *res); /* old and ugly */ extern Oid PQoidValue(const PGresult *res); /* new and improved */ diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h new file mode 100644 index 6c9bbf7..6b4d923 *** a/src/interfaces/libpq/libpq-int.h --- b/src/interfaces/libpq/libpq-int.h *************** struct pg_result *** 180,185 **** --- 180,186 ---- char cmdStatus[CMDSTATUS_LEN]; /* cmd status from the query */ int binary; /* binary tuple values if binary == 1, * otherwise text */ + int raw; /* only values */ /* * These fields are copied from the originating PGconn, so that operations diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out new file mode 100644 index 5f6260a..e31c4f2 *** a/src/test/regress/expected/copy2.out --- b/src/test/regress/expected/copy2.out *************** DROP FUNCTION truncate_in_subxact(); *** 466,468 **** --- 466,481 ---- DROP TABLE x, y; DROP FUNCTION fn_x_before(); DROP FUNCTION fn_x_after(); + CREATE TABLE x(a bytea); + INSERT INTO x VALUES('\x41484f4a0a'); + INSERT INTO x VALUES('\x41484f4a0a'); + -- should to fail + COPY (SELECT a,a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary); + ERROR: only single column result is allowed in RAW mode + COPY (SELECT a FROM x) TO STDOUT (FORMAT raw_binary); + AHOJ + ERROR: only single row result is allowed in RAW mode + -- should be ok + COPY (SELECT a FROM x LIMIT 1) TO STDOUT 
(FORMAT raw_binary); + AHOJ + DROP TABLE x; diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source new file mode 100644 index cb13606..e25f996 *** a/src/test/regress/input/copy.source --- b/src/test/regress/input/copy.source *************** this is just a line full of junk that wo *** 133,135 **** --- 133,195 ---- \. copy copytest3 to stdout csv header; + + -- copy raw + CREATE TABLE x(a bytea); + INSERT INTO x VALUES('\x41484f4a0a'); + SELECT length(a) FROM x; + + INSERT INTO x VALUES('\x41484f4a0a'); + + -- should to fail + COPY (SELECT a,a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + COPY (SELECT a FROM x) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + + -- should be ok + COPY (SELECT a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + TRUNCATE x; + COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + SELECT length(a) FROM x; + COPY x TO stdout (FORMAT raw_binary); + + TRUNCATE x; + + \COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + COPY x TO stdout (FORMAT raw_binary); + + \COPY x TO '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + TRUNCATE x; + + \COPY x FROM '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + COPY x TO stdout (FORMAT raw_binary); + + -- test big file + TRUNCATE x; + -- use different mechanism for load to bytea + \lo_import @abs_builddir@/data/hash.data + \set lo_oid :LASTOID + INSERT INTO x VALUES(lo_get(:lo_oid)); + \lo_unlink :lo_oid + + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary) + + SELECT md5(a), length(a) FROM x; + + TRUNCATE x; + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + COPY x TO '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x TO '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + + -- read again + 
COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + -- cross + COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary) + + SELECT md5(a), length(a) FROM x; + + DROP TABLE x; diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source new file mode 100644 index b7e372d..6c82993 *** a/src/test/regress/output/copy.source --- b/src/test/regress/output/copy.source *************** copy copytest3 to stdout csv header; *** 95,97 **** --- 95,183 ---- c1,"col with , comma","col with "" quote" 1,a,1 2,b,2 + -- copy raw + CREATE TABLE x(a bytea); + INSERT INTO x VALUES('\x41484f4a0a'); + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + INSERT INTO x VALUES('\x41484f4a0a'); + -- should to fail + COPY (SELECT a,a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + ERROR: only single column result is allowed in RAW mode + COPY (SELECT a FROM x) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + ERROR: only single row result is allowed in RAW mode + -- should be ok + COPY (SELECT a FROM x LIMIT 1) TO '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + TRUNCATE x; + COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary); + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + TRUNCATE x; + \COPY x FROM '@abs_builddir@/results/raw.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + \COPY x TO '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + TRUNCATE x; + \COPY x FROM '@abs_builddir@/results/raw2.data' (FORMAT raw_binary) + SELECT length(a) FROM x; + length + -------- + 5 + (1 row) + + COPY x TO stdout (FORMAT raw_binary); + AHOJ + -- test big file + TRUNCATE x; + -- use 
different mechanism for load to bytea + \lo_import @abs_builddir@/data/hash.data + \set lo_oid :LASTOID + INSERT INTO x VALUES(lo_get(:lo_oid)); + \lo_unlink :lo_oid + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary) + SELECT md5(a), length(a) FROM x; + md5 | length + ----------------------------------+-------- + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + (3 rows) + + TRUNCATE x; + COPY x FROM '@abs_builddir@/data/hash.data' (FORMAT raw_binary); + COPY x TO '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x TO '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + -- read again + COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary) + -- cross + COPY x FROM '@abs_builddir@/results/hash3.data' (FORMAT raw_binary); + \COPY x FROM '@abs_builddir@/results/hash2.data' (FORMAT raw_binary) + SELECT md5(a), length(a) FROM x; + md5 | length + ----------------------------------+-------- + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + e446fe6ea5a347e69670633412c7f8cb | 153749 + (5 rows) + + DROP TABLE x; diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql new file mode 100644 index 39a9deb..7e22ee4 *** a/src/test/regress/sql/copy2.sql --- b/src/test/regress/sql/copy2.sql *************** DROP FUNCTION truncate_in_subxact(); *** 333,335 **** --- 333,348 ---- DROP TABLE x, y; DROP FUNCTION fn_x_before(); DROP FUNCTION fn_x_after(); + + CREATE TABLE x(a bytea); + INSERT INTO x VALUES('\x41484f4a0a'); + INSERT INTO x VALUES('\x41484f4a0a'); + + -- should to fail + COPY (SELECT a,a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary); + COPY (SELECT a FROM x) TO 
STDOUT (FORMAT raw_binary); + + -- should be ok + COPY (SELECT a FROM x LIMIT 1) TO STDOUT (FORMAT raw_binary); + + DROP TABLE x;
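
For anyone following the PQcopyFormat() part of the patch without a build tree at hand, here is a minimal standalone sketch of how the binary and raw flags are derived from the per-field format codes (0 = text, 1 = binary, 2 = raw text, 3 = raw binary) and how the copy format is then reported. It does not link against libpq: SketchResult, sketch_set_formats() and sketch_copy_format() are hypothetical names used only for illustration. The NULL check returning -1 is an assumption taken from the function comment in the patch; the posted PQcopyFormat() dereferences res without that check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two PGresult fields the patch relies on. */
typedef struct SketchResult
{
	int			binary;		/* 1 if no field uses a text format (0 or 2) */
	int			raw;		/* 1 if any field uses a raw format (2 or 3) */
} SketchResult;

/*
 * Derive the flags from per-field format codes, following the loop the
 * patch modifies in PQsetResultAttrs().
 */
static void
sketch_set_formats(SketchResult *res, const int *formats, int nfields)
{
	int			i;

	res->binary = 1;
	res->raw = 0;
	for (i = 0; i < nfields; i++)
	{
		if (formats[i] == 0 || formats[i] == 2)
			res->binary = 0;
		if (formats[i] == 2 || formats[i] == 3)
			res->raw = 1;
	}
}

/*
 * Report the copy format: 0 = text, 1 = binary, 2 = raw.
 * Assumption: a NULL result yields -1, as the patch comment describes.
 */
static int
sketch_copy_format(const SketchResult *res)
{
	if (res == NULL)
		return -1;
	if (res->raw)
		return 2;
	return res->binary;
}
```

With this derivation a raw result always reports 2, and the caller still has to look at the binary flag (or the per-field format code) to tell raw text from raw binary.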