On 12 February 2011 14:48, Alex Hunsaker <bada...@gmail.com> wrote:
> On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan <and...@dunslane.net> wrote:
>> Force strings passed to and from plperl to be in UTF8 encoding.
>>
>> Strings are converted to UTF8 on the way into perl and to the
>> database encoding on the way back. This avoids a number of
>> observed anomalies, and ensures Perl a consistent view of the
>> world.
>
> So I noticed a problem while playing with this in my discussion with
> David Wheeler. pg_do_encoding_conversion() does nothing when the src
> encoding == the dest encoding. That means on a UTF-8 database we fail
> to make sure our strings are valid utf8.
>
> An easy way to see this is to embed a null in the middle of a string:
> => create or replace function zerob() returns text as $$ return
> "abcd\0efg"; $$ language plperl;
> => SELECT zerob();
> abcd
>
> Also, it seems bogus to do any encoding conversion when we are
> SQL_ASCII, and it's really trivial to fix.
>
> With the attached:
> - when we are on a utf8 database make sure to verify our output string
> in sv2cstr (we assume database strings coming in are already valid)
>
> - Do no string conversion when we are SQL_ASCII in or out
>
> - add plperl_helpers.h as a dep to plperl.o in our makefile
>
> - remove some redundant calls to pg_verify_mbstr()
>
> - as utf_e2u only has one caller, don't pstrdup(); instead have the
> caller check (saves some cycles and memory)
>
Is there a plan to commit this fix? I am still seeing this issue on the
PG 9.1 STABLE branch. Attached is a small patch that targets only the
specific issue in the described test case:

create or replace function zerob() returns text as $$ return "abcd\0efg"; $$ language plperl;
SELECT zerob();

The patch does the Perl data validation in the function utf_u2e() itself.

> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
diff --git a/src/pl/plperl/plperl_helpers.h b/src/pl/plperl/plperl_helpers.h
index 81c177b..3afe2f5 100644
--- a/src/pl/plperl/plperl_helpers.h
+++ b/src/pl/plperl/plperl_helpers.h
@@ -10,7 +10,10 @@ utf_u2e(const char *utf8_str, size_t len)
 	char	   *ret = (char *) pg_do_encoding_conversion((unsigned char *) utf8_str, len, PG_UTF8, GetDatabaseEncoding());
 
 	if (ret == utf8_str)
+	{
+		pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
 		ret = pstrdup(ret);
+	}
 	return ret;
 }
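
For anyone who wants to see the underlying problem outside the backend, here is a minimal standalone sketch in plain C. It is not PostgreSQL code: has_embedded_nul() and visible_length() are hypothetical helpers made up for illustration. It only shows why a length-counted buffer such as "abcd\0efg" comes out as "abcd" once it is treated as a NUL-terminated C string, which is the truncation zerob() exhibits. As I understand it, pg_verify_mbstr_len() with noError = false reports an embedded 0x00 byte as an invalid byte sequence, which is why calling it in utf_u2e() turns the silent truncation into an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): returns true if the first
 * len bytes of buf contain a NUL byte.
 */
static bool
has_embedded_nul(const char *buf, size_t len)
{
	assert(buf != NULL);
	return memchr(buf, '\0', len) != NULL;
}

/*
 * Hypothetical helper (not part of PostgreSQL): number of bytes that code
 * treating buf as a NUL-terminated C string would see, capped at len.
 */
static size_t
visible_length(const char *buf, size_t len)
{
	const char *nul;

	assert(buf != NULL);
	nul = memchr(buf, '\0', len);
	return nul ? (size_t) (nul - buf) : len;
}
```

Calling visible_length("abcd\0efg", 8) shows how many of the 8 bytes survive a trip through C-string handling, and has_embedded_nul() is the kind of check a caller would need before handing such a buffer to code that assumes NUL termination.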