Re: [HACKERS] Bug in UTF8-Validation Code?

Mark Dilger Wed, 04 Apr 2007 10:03:11 -0700

Tatsuo Ishii wrote:

<SNIP>. I think we need to continue the design discussion, probably
targeting 8.4, not 8.3.

The discussion came about because Andrew - Supernews noticed that chr() returns invalid utf8, and we're trying to fix all the bugs with invalid utf8 in the system. Something needs to be done, even if we just check the result of the current chr() implementation and throw an error on invalid results. But do we want to make this minor change for 8.3 and then change it again for 8.4?

Here's an example of the current problem. It's an 8.2.3 database with utf8.en_US encoding:



mark=# create table testutf8 (t text);
CREATE TABLE

mark=# insert into testutf8 (t) (select chr(gs) from generate_series(0,255) as gs);
INSERT 0 256
mark=# \copy testutf8 to testutf8.data
mark=# truncate testutf8;
TRUNCATE TABLE
mark=# \copy testutf8 from testutf8.data
ERROR:  invalid byte sequence for encoding "UTF8": 0x80
HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT:  COPY testutf8, line 129


