On Tue, Jan 19, 2010 at 12:54 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote: >> I think what you're saying is that you agree with Tom's position that >> the new escaping function should not add the necessary surrounding >> quotes, instead leaving that to the user. Is that correct? > > yes
Updated patch attached. I still think this is a bizarre API. ...Robert
*** a/doc/src/sgml/libpq.sgml --- b/doc/src/sgml/libpq.sgml *************** *** 2923,3042 **** typedef struct { </sect2> <sect2 id="libpq-exec-escape-string"> ! <title>Escaping Strings for Inclusion in SQL Commands</title> <indexterm zone="libpq-exec-escape-string"> - <primary>PQescapeStringConn</primary> - </indexterm> - <indexterm zone="libpq-exec-escape-string"> - <primary>PQescapeString</primary> - </indexterm> - <indexterm zone="libpq-exec-escape-string"> <primary>escaping strings</primary> <secondary>in libpq</secondary> </indexterm> ! <para> ! <function>PQescapeStringConn</function> escapes a string for use within an SQL ! command. This is useful when inserting data values as literal constants ! in SQL commands. Certain characters (such as quotes and backslashes) must ! be escaped to prevent them from being interpreted specially by the SQL parser. ! <function>PQescapeStringConn</> performs this operation. ! </para> ! <tip> ! <para> ! It is especially important to do proper escaping when handling strings that ! were received from an untrustworthy source. Otherwise there is a security ! risk: you are vulnerable to <quote>SQL injection</> attacks wherein unwanted ! SQL commands are fed to your database. ! </para> ! </tip> ! <para> ! Note that it is not necessary nor correct to do escaping when a data ! value is passed as a separate parameter in <function>PQexecParams</> or ! its sibling routines. ! ! <synopsis> ! size_t PQescapeStringConn (PGconn *conn, ! char *to, const char *from, size_t length, ! int *error); ! </synopsis> ! </para> ! <para> ! <function>PQescapeStringConn</> writes an escaped version of the ! <parameter>from</> string to the <parameter>to</> buffer, escaping ! special characters so that they cannot cause any harm, and adding a ! terminating zero byte. The single quotes that must surround ! <productname>PostgreSQL</> string literals are not included in the ! result string; they should be provided in the SQL command that the ! result is inserted into. The parameter <parameter>from</> points to ! the first character of the string that is to be escaped, and the ! <parameter>length</> parameter gives the number of bytes in this ! string. A terminating zero byte is not required, and should not be ! counted in <parameter>length</>. (If a terminating zero byte is found ! before <parameter>length</> bytes are processed, ! <function>PQescapeStringConn</> stops at the zero; the behavior is ! thus rather like <function>strncpy</>.) <parameter>to</> shall point ! to a buffer that is able to hold at least one more byte than twice ! the value of <parameter>length</>, otherwise the behavior is undefined. ! Behavior is likewise undefined if the <parameter>to</> and ! <parameter>from</> strings overlap. ! </para> ! <para> ! If the <parameter>error</> parameter is not NULL, then ! <literal>*error</> is set to zero on success, nonzero on error. ! Presently the only possible error conditions involve invalid multibyte ! encoding in the source string. The output string is still generated ! on error, but it can be expected that the server will reject it as ! malformed. On error, a suitable message is stored in the ! <parameter>conn</> object, whether or not <parameter>error</> is NULL. ! </para> ! <para> ! <function>PQescapeStringConn</> returns the number of bytes written ! to <parameter>to</>, not including the terminating zero byte. ! </para> ! <para> ! <synopsis> ! size_t PQescapeString (char *to, const char *from, size_t length); ! </synopsis> ! </para> ! <para> ! <function>PQescapeString</> is an older, deprecated version of ! <function>PQescapeStringConn</>; the difference is that it does ! not take <parameter>conn</> or <parameter>error</> parameters. ! Because of this, it cannot adjust its behavior depending on the ! connection properties (such as character encoding) and therefore ! <emphasis>it might give the wrong results</>. Also, it has no way ! to report error conditions. ! </para> ! <para> ! <function>PQescapeString</> can be used safely in single-threaded ! client programs that work with only one <productname>PostgreSQL</> ! connection at a time (in this case it can find out what it needs to ! know <quote>behind the scenes</>). In other contexts it is a security ! hazard and should be avoided in favor of ! <function>PQescapeStringConn</>. ! </para> ! </sect2> ! <sect2 id="libpq-exec-escape-bytea"> ! <title>Escaping Binary Strings for Inclusion in SQL Commands</title> ! <indexterm zone="libpq-exec-escape-bytea"> ! <primary>bytea</primary> ! <secondary sortas="libpq">in libpq</secondary> ! </indexterm> - <variablelist> <varlistentry> <term> <function>PQescapeByteaConn</function> --- 2923,3117 ---- </sect2> <sect2 id="libpq-exec-escape-string"> ! <title>Escaping Strings and Identifiers for Inclusion in SQL Commands</title> <indexterm zone="libpq-exec-escape-string"> <primary>escaping strings</primary> <secondary>in libpq</secondary> </indexterm> ! <variablelist> ! <varlistentry> ! <term> ! <function>PQescapeStringConn</function> ! <indexterm> ! <primary>PQescapeStringConn</primary> ! </indexterm> ! </term> ! <listitem> ! <para> ! <function>PQescapeStringConn</function> escapes a string for use within an SQL ! command. This is useful when inserting data values as literal constants ! in SQL commands. Certain characters (such as quotes and backslashes) must ! be escaped to prevent them from being interpreted specially by the SQL parser. ! <function>PQescapeStringConn</> performs this operation. ! </para> ! <tip> ! <para> ! It is especially important to do proper escaping when handling strings that ! were received from an untrustworthy source. Otherwise there is a security ! risk: you are vulnerable to <quote>SQL injection</> attacks wherein unwanted ! SQL commands are fed to your database. ! </para> ! </tip> ! <para> ! Note that it is not necessary nor correct to do escaping when a data ! value is passed as a separate parameter in <function>PQexecParams</> or ! its sibling routines. ! <synopsis> ! size_t PQescapeStringConn (PGconn *conn, ! char *to, const char *from, size_t length, ! int *error); ! </synopsis> ! </para> ! <para> ! <function>PQescapeStringConn</> writes an escaped version of the ! <parameter>from</> string to the <parameter>to</> buffer, escaping ! special characters so that they cannot cause any harm, and adding a ! terminating zero byte. The single quotes that must surround ! <productname>PostgreSQL</> string literals are not included in the ! result string; they should be provided in the SQL command that the ! result is inserted into. The parameter <parameter>from</> points to ! the first character of the string that is to be escaped, and the ! <parameter>length</> parameter gives the number of bytes in this ! string. A terminating zero byte is not required, and should not be ! counted in <parameter>length</>. (If a terminating zero byte is found ! before <parameter>length</> bytes are processed, ! <function>PQescapeStringConn</> stops at the zero; the behavior is ! thus rather like <function>strncpy</>.) <parameter>to</> shall point ! to a buffer that is able to hold at least one more byte than twice ! the value of <parameter>length</>, otherwise the behavior is undefined. ! Behavior is likewise undefined if the <parameter>to</> and ! <parameter>from</> strings overlap. ! </para> ! <para> ! If the <parameter>error</> parameter is not NULL, then ! <literal>*error</> is set to zero on success, nonzero on error. ! Presently the only possible error conditions involve invalid multibyte ! encoding in the source string. The output string is still generated ! on error, but it can be expected that the server will reject it as ! malformed. On error, a suitable message is stored in the ! <parameter>conn</> object, whether or not <parameter>error</> is NULL. ! </para> ! <para> ! <function>PQescapeStringConn</> returns the number of bytes written ! to <parameter>to</>, not including the terminating zero byte. ! </para> ! </listitem> ! </varlistentry> ! <varlistentry> ! <term> ! <function>PQescapeString</function> ! <indexterm> ! <primary>PQescapeString</primary> ! </indexterm> ! </term> ! ! <listitem> ! <para> ! <synopsis> ! size_t PQescapeString (char *to, const char *from, size_t length); ! </synopsis> ! </para> + <para> + <function>PQescapeString</> is an older, deprecated version of + <function>PQescapeStringConn</>; the difference is that it does + not take <parameter>conn</> or <parameter>error</> parameters. + Because of this, it cannot adjust its behavior depending on the + connection properties (such as character encoding) and therefore + <emphasis>it might give the wrong results</>. Also, it has no way + to report error conditions. + </para> ! <para> ! <function>PQescapeString</> can be used safely in single-threaded ! client programs that work with only one <productname>PostgreSQL</> ! connection at a time (in this case it can find out what it needs to ! know <quote>behind the scenes</>). In other contexts it is a security ! hazard and should be avoided in favor of ! <function>PQescapeStringConn</>. ! </para> ! </listitem> ! </varlistentry> ! <varlistentry> ! <term> ! <function>PQescapeIdentifierConn</function> ! <indexterm> ! <primary>PQescapeIdentifierConn</primary> ! </indexterm> ! </term> ! ! <listitem> ! <para> ! <function>PQescapeIdentifierConn</> is intended for use in escaping ! strings for use as SQL identifiers, such as table or column names. ! It writes an escaped version of the <parameter>from</> string to the ! <parameter>to</> buffer, escaping special characters so that they cannot ! cause any harm, and adding a terminating zero byte. Note that the ! escaping required for identifiers is different than what is required ! for string literals, so you must be careful to use the correct function. ! </para> ! ! <tip> ! <para> ! In order to use this function safely, you must surround its output ! with double quotation marks. If you do not include these double ! quotation marks, or if you do not use this function at all, your ! application may be vulnerable to <quote>SQL injection</> attacks, ! wherein unwanted SQL commands are fed to your database, if identifiers ! are recieved from an untrustworthy source. ! </para> ! </tip> ! ! <para> ! <synopsis> ! size_t PQescapeIdentifierConn (PGconn *conn, ! char *to, const char *from, size_t length, ! int *error); ! </synopsis> ! </para> ! ! <para> ! The parameter <parameter>from</> points to the first character of the ! string that is to be escaped, and the <parameter>length</> parameter ! gives the number of bytes in this string. A terminating zero byte is not ! required, and should not be counted in <parameter>length</>. (If a ! terminating zero byte is found before <parameter>length</> bytes are ! processed, <function>PQescapeIdentifierConn</> stops at the zero; the ! behavior is thus rather like <function>strncpy</>.) <parameter>to</> ! shall point to a buffer that is able to hold at least one more bytes ! than twice the value of <parameter>length</>, otherwise the behavior is ! undefined. Behavior is likewise undefined if the <parameter>to</> and ! <parameter>from</> strings overlap. ! </para> ! ! <para> ! If the <parameter>error</> parameter is not NULL, then ! <literal>*error</> is set to zero on success, nonzero on error. ! Presently the only possible error conditions involve invalid multibyte ! encoding in the source string. The output string is still generated ! on error, but it can be expected that the server will reject it as ! malformed. On error, a suitable message is stored in the ! <parameter>conn</> object, whether or not <parameter>error</> is NULL. ! </para> ! ! <para> ! <function>PQescapeIdentifierConn</> returns the number of bytes written ! to <parameter>to</>, not including the terminating zero byte. ! </para> ! </listitem> ! </varlistentry> <varlistentry> <term> <function>PQescapeByteaConn</function> *** a/src/interfaces/libpq/exports.txt --- b/src/interfaces/libpq/exports.txt *************** *** 153,155 **** PQresultSetInstanceData 150 --- 153,156 ---- PQfireResultCreateEvents 151 PQconninfoParse 152 PQinitOpenSSL 153 + PQescapeIdentifierConn 154 *** a/src/interfaces/libpq/fe-exec.c --- b/src/interfaces/libpq/fe-exec.c *************** *** 3058,3063 **** PQescapeString(char *to, const char *from, size_t length) --- 3058,3167 ---- static_std_strings); } + /* + * Escape an arbitrary string as an SQL identifier. + * + * To use this function safely, you MUST surround surround the output with + * double quotation marks. The only purpose of this function is to double any + * internal quotation marks. So, unlike the backend function + * quote_identifier(), this function can only be used for unconditional + * quoting. That's probably OK, because to quote only when it's actually + * needed, we'd need a list of the tokens that the server considers keywords. + * That depends on the server version, which isn't known here. + * + * This function will up to, but not more than, 2*length+1 bytes to the output + * buffer. A terminating NUL character is added to the output string, whether + * the input is NUL-terminated or not. + * + * Returns the actual length of the output (not counting the terminating NUL). + */ + size_t + PQescapeIdentifierConn(PGconn *conn, char *to, const char *from, + size_t length, int *error) + { + const char *source = from; + char *target = to; + size_t remaining = length; + + if (!conn) + { + /* force empty-string result */ + *to = '\0'; + if (error) + *error = 1; + return 0; + } + + if (error) + *error = 0; + + while (remaining > 0 && *source != '\0') + { + char c = *source; + int len; + int i; + + /* Fast path for plain ASCII */ + if (!IS_HIGHBIT_SET(c)) + { + /* Apply quoting if needed */ + if (c == '"') + *target++ = c; + /* Copy the character */ + *target++ = c; + source++; + remaining--; + continue; + } + + /* Slow path for possible multibyte characters */ + len = pg_encoding_mblen(conn->client_encoding, source); + + /* Copy the character */ + for (i = 0; i < len; i++) + { + if (remaining == 0 || *source == '\0') + break; + *target++ = *source++; + remaining--; + } + + /* + * If we hit premature end of string (ie, incomplete multibyte + * character), try to pad out to the correct length with spaces. We + * may not be able to pad completely, but we will always be able to + * insert at least one pad space (since we'd not have quoted a + * multibyte character). This should be enough to make a string that + * the server will error out on. + */ + if (i < len) + { + if (error) + *error = 1; + if (conn) + printfPQExpBuffer(&conn->errorMessage, + libpq_gettext("incomplete multibyte character\n")); + for (; i < len; i++) + { + /* + * The output buffer must be 2n+3 bytes, but the extra 3 are + * reserved for leading and trailing quotes and terminating; + * NUL, so we have room for exactly 2 output bytes per input + * character. + */ + if (((size_t) (target - to)) / 2 >= length) + break; + *target++ = ' '; + } + break; + } + } + + /* Write the terminating NUL character. */ + *target = '\0'; + + return target - to; + } /* HEX encoding support for bytea */ static const char hextbl[] = "0123456789abcdef"; *** a/src/interfaces/libpq/libpq-fe.h --- b/src/interfaces/libpq/libpq-fe.h *************** *** 471,476 **** extern int PQsetvalue(PGresult *res, int tup_num, int field_num, char *value, in --- 471,478 ---- extern size_t PQescapeStringConn(PGconn *conn, char *to, const char *from, size_t length, int *error); + extern size_t PQescapeIdentifierConn(PGconn *conn, char *to, const char *from, + size_t length, int *error); extern unsigned char *PQescapeByteaConn(PGconn *conn, const unsigned char *from, size_t from_length, size_t *to_length);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers