The following bug has been logged online: Bug reference: 3697 Logged by: Marc Mamin Email address: [EMAIL PROTECTED] PostgreSQL version: 8.2.4 Operating system: SuSE Linux 9.1 (i586) Description: utf8 issue: can not reimport a table that was successfully exported. Details:
Hello, I'm not sure this is a bug; the problem might be related to a client encoding issue. My Database is on a Linux server which I connect to using putty from Windows. both server and client are set to UTF8: client_encoding | UTF8 backslash_quote | safe_encoding server_encoding | UTF8 I stumbled on this issue while trying to import a "malicious" user agent string... I didn't check if all characters are valid UTF8... My concern is about database recovery. I'm using pg_dump to regulary export my users, bu according to the example below, it seems that my dumps may be worthless ! May be you should consider not to publish this before a fix exist as this is a serious issue which could eventually be exploited to damage existing instances (for the case this is really a bug)... regards, Marc Mamin steps to repeat: CREATE TABLE utf8_test(s varchar); CREATE OR REPLACE FUNCTION f_utf8_test( st VARCHAR) RETURNS INT AS $$ DECLARE quotedline varchar = quote_literal($1); BEGIN INSERT INTO utf8_test ( s ) VALUES ( quotedline); RETURN 0; END; $$ LANGUAGE plpgsql; select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind \xE0\xF0\xF1\xF2\xE2\xE5\xED\xED\xFB\xE9 \xE2\xFB\xF1\xF8\9 \xE3\xEE\xF1\xF3\xE4 xE4\xE6 \xCD\xC1 \xD0\xC1")'); -- here the same statement, but with all backslashed duplicated for the case when the string was modified when posting this issue: select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind \\xE0\\xF0\\xF1\\xF2\\xE2\\xE5\\xED\\xED\\xFB\\xE9 \\xE2\\xFB\\xF1\\xF8\\9 \\xE3\\xEE\\xF1\\xF3\\xE4 xE4\\xE6 \\xCD\\xC1 \\xD0\\xC1")'); WARNING: nonstandard use of escape in a string literal LINE 1: select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind COPY utf8_test TO '/tmp/utf8_test.txt'; COPY 1 COPY utf8_test FROM '/tmp/utf8_test.txt'; ERROR: invalid byte sequence for encoding "UTF8": 0xd3ce HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". CONTEXT: COPY utf8_test, line 1 The same isuue occure when using pg_dump: pg_dump -i -v -p 5433 -Uisdb2 -tutf8_test > /tmp/utf8_dump pg_dump: server version: 8.2.4; pg_dump version: 8.2.1 pg_dump: proceeding despite version mismatch psql -f"/tmp/utf8_dump" => psql:/tmp/utf8_dump:40: ERROR: invalid byte sequence for encoding "UTF8": 0xd3ce HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". CONTEXT: COPY utf8_test, line 1 ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster