Re: [HACKERS] Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.

Alex Hunsaker Tue, 04 Oct 2011 01:35:07 -0700

On Mon, Oct 3, 2011 at 23:35, Amit Khandekar
<amit.khande...@enterprisedb.com> wrote:


> In the GetDatabaseEncoding() != PG_UTF8 case, ret will not be equal to
> utf8_str, so pg_verify_mbstr_len() will not get called. That's the
> reason pg_verify_mbstr_len() is under the ( ret == utf8_str )
> condition.

Consider a latin1 database where utf8_str is a string of ASCII
characters. Then no conversion would take place and ret == utf8_str,
but the string would be verified by pg_do_encoding_conversion() and
then verified again by your added check :-).
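
To make the double check concrete, here is a standalone toy model. It is
not PostgreSQL code: every toy_* name is made up for illustration, the
"conversion" only copies bytes, and it simply assumes the behavior
described above (no verification when the encodings match; verification
plus an unchanged pointer for pure-ASCII input when they differ). It
counts how many times a string gets validated under each guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for encoding ids; not the real pg_enc values. */
typedef enum { TOY_UTF8, TOY_LATIN1 } toy_encoding;

static int	verify_calls = 0;	/* how often a string was validated */

/* Toy validator: counts the call and rejects embedded NUL bytes only. */
static bool
toy_verify(const char *s, size_t len)
{
	verify_calls++;
	return memchr(s, '\0', len) == NULL;
}

/*
 * Toy conversion from UTF8 to the database encoding.  Assumed behavior:
 * same encoding => no-op, nothing verified; different encoding => the
 * input is verified, and pure-ASCII input comes back as the same pointer.
 */
static const char *
toy_convert(const char *s, size_t len, toy_encoding db)
{
	static char scratch[64];
	size_t		i;

	if (db == TOY_UTF8)
		return s;

	toy_verify(s, len);
	for (i = 0; i < len; i++)
		if ((unsigned char) s[i] >= 0x80)
			break;
	if (i == len)
		return s;				/* all ASCII: nothing to convert */

	/* Placeholder for real transcoding: just copy into a new buffer. */
	if (len >= sizeof(scratch))
		len = sizeof(scratch) - 1;
	memcpy(scratch, s, len);
	scratch[len] = '\0';
	return scratch;
}

/* Guard on pointer equality, as in the earlier version of the patch. */
static int
checks_with_pointer_test(const char *s, size_t len, toy_encoding db)
{
	verify_calls = 0;
	const char *ret = toy_convert(s, len, db);

	if (ret == s)
		toy_verify(s, len);
	return verify_calls;
}

/* Guard on the database encoding, as in the attached patch. */
static int
checks_with_encoding_test(const char *s, size_t len, toy_encoding db)
{
	verify_calls = 0;
	(void) toy_convert(s, len, db);

	if (db == TOY_UTF8)
		toy_verify(s, len);
	return verify_calls;
}
```

In this model, checks_with_pointer_test("abcd", 4, TOY_LATIN1) counts
two validations, while checks_with_encoding_test() counts one for the
same input. For a TOY_UTF8 database both count exactly one.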

>> It might be worth adding a regression test also...
>
> I could not find any basic pl/perl tests in the regression
> serial_schedule. I am not sure if we want to add just this scenario
> without any basic tests for pl/perl ?

I went ahead and added one in the attached based upon your example.

Look ok to you?

BTW thanks for the patch!

[ side note ]
I still think we should not be doing any conversion in the SQL_ASCII
case, but this slimmed-down patch should be less controversial.

*** a/src/pl/plperl/GNUmakefile
--- b/src/pl/plperl/GNUmakefile
***************
*** 57,63 **** PSQLDIR = $(bindir)
  
  include $(top_srcdir)/src/Makefile.shlib
  
! plperl.o: perlchunks.h plperl_opmask.h
  
  plperl_opmask.h: plperl_opmask.pl
  	@if [ x"$(perl_privlibexp)" = x"" ]; then echo "configure switch --with-perl was not specified."; exit 1; fi
--- 57,63 ----
  
  include $(top_srcdir)/src/Makefile.shlib
  
! plperl.o: perlchunks.h plperl_opmask.h plperl_helpers.h
  
  plperl_opmask.h: plperl_opmask.pl
  	@if [ x"$(perl_privlibexp)" = x"" ]; then echo "configure switch --with-perl was not specified."; exit 1; fi
*** a/src/pl/plperl/expected/plperl.out
--- b/src/pl/plperl/expected/plperl.out
***************
*** 639,641 **** CONTEXT:  PL/Perl anonymous code block
--- 639,651 ----
  DO $do$ use warnings FATAL => qw(void) ; my @y; my $x = sort @y; 1; $do$ LANGUAGE plperl;
  ERROR:  Useless use of sort in scalar context at line 1.
  CONTEXT:  PL/Perl anonymous code block
+ --
+ -- Make sure strings are validated -- This code may fail in a non-UTF8 database
+ -- if it allows null bytes in strings.
+ --
+ CREATE OR REPLACE FUNCTION perl_zerob() RETURNS TEXT AS $$
+   return "abcd\0efg";
+ $$ LANGUAGE plperlu;
+ SELECT perl_zerob();
+ ERROR:  invalid byte sequence for encoding "UTF8": 0x00
+ CONTEXT:  PL/Perl function "perl_zerob"
*** a/src/pl/plperl/plperl_helpers.h
--- b/src/pl/plperl/plperl_helpers.h
***************
*** 9,14 **** utf_u2e(const char *utf8_str, size_t len)
--- 9,22 ----
  {
  	char	   *ret = (char *) pg_do_encoding_conversion((unsigned char *) utf8_str, len, PG_UTF8, GetDatabaseEncoding());
  
+ 	/*
+ 	 * when src encoding == dest encoding (PG_UTF8 ==
+ 	 * GetDatabaseEncoding()), pg_do_encoding_conversion() is a no-op and
+ 	 * does not verify the string, so we need to do it manually
+ 	 */
+ 	if(GetDatabaseEncoding() == PG_UTF8)
+ 		pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
+ 
  	if (ret == utf8_str)
  		ret = pstrdup(ret);
  	return ret;
*** a/src/pl/plperl/sql/plperl.sql
--- b/src/pl/plperl/sql/plperl.sql
***************
*** 415,417 **** DO $do$ use strict; my $name = "foo"; my $ref = $$name; $do$ LANGUAGE plperl;
--- 415,426 ----
  -- check that we can "use warnings" (in this case to turn a warn into an error)
  -- yields "ERROR:  Useless use of sort in scalar context."
  DO $do$ use warnings FATAL => qw(void) ; my @y; my $x = sort @y; 1; $do$ LANGUAGE plperl;
+ 
+ --
+ -- Make sure strings are validated -- This code may fail in a non-UTF8 database
+ -- if it allows null bytes in strings.
+ --
+ CREATE OR REPLACE FUNCTION perl_zerob() RETURNS TEXT AS $$
+   return "abcd\0efg";
+ $$ LANGUAGE plperlu;
+ SELECT perl_zerob();

