The following bug has been logged online:

Bug reference:      4622
Logged by:          Sergey Burladyan
Email address:      eshkin...@gmail.com
PostgreSQL version: 8.3.5
Operating system:   Debian testing
Description:        xpath only work in utf-8 server encoding
Details:

Hello, all!

I am testing the parsing of an XML string in an encoding other than UTF-8. The string loads correctly, but xpath(text, xml) cannot handle it:

s...@seb:~/tmp/pg$ echo $LANG
ru_RU.CP1251
s...@seb:~/tmp/pg$ /usr/lib/postgresql/8.3/bin/postgres -p 5433 -k s -s -D .
LOG: система была отключена: 2009-01-22 16:30:07 MSK
LOG: autovacuum launcher started
LOG: database system is ready to accept connections

s...@seb:~$ echo $LANG
ru_RU.CP1251
s...@seb:~$ psql -h localhost -p 5433
Welcome to psql 8.3.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

seb=# select * from (select xml('<русский>язык</русский>')) as x(v);
            v
-------------------------
 <русский>язык</русский>
(1 запись)

seb=# select xpath('/русский/text()', v::xml) from (select xml('<русский>язык</русский>')) as x(v);
ERROR: could not parse XML data
DETAIL: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF0 0xF3 0xF1 0xF1
<x><русский>язык</русский></x>
    ^

seb=# select name, setting from pg_settings where name like 'lc_%' or name like '%enco%';
      name       |   setting
-----------------+--------------
 client_encoding | WIN1251
 lc_collate      | ru_RU.CP1251
 lc_ctype        | ru_RU.CP1251
 lc_messages     | ru_RU.CP1251
 lc_monetary     | ru_RU.CP1251
 lc_numeric      | ru_RU.CP1251
 lc_time         | ru_RU.CP1251
 server_encoding | WIN1251
(8 rows)

With a UTF-8 server encoding it works correctly:

seb=> select xpath('/русский/text()', v::xml) from (select xml('<русский>язык</русский>')) as x(v);
 xpath
--------
 {язык}
(1 запись)

seb=> select name, setting from pg_settings where name like 'lc_%' or name like '%enco%';
      name       |   setting
-----------------+-------------
 client_encoding | UTF8
 lc_collate      | ru_RU.UTF-8
 lc_ctype        | ru_RU.UTF-8
 lc_messages     | ru_RU.UTF-8
 lc_monetary     | ru_RU.UTF-8
 lc_numeric      | ru_RU.UTF-8
 lc_time         | ru_RU.UTF-8
 server_encoding | UTF8
(8 rows)

I think something is wrong here: the string is parsed correctly by xml(text), but its result cannot be passed to the xpath function.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
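
For anyone tracing the parser error quoted in the report, here is a small sketch that is not part of the original submission. It assumes Python 3 and its standard "cp1251" codec (the same encoding PostgreSQL calls WIN1251), and the helper name try_decode is made up for illustration. It checks that the four bytes named in the error are the first four letters of the element name in CP1251 and that they are not valid UTF-8. That is consistent with the document reaching the XML parser without conversion to UTF-8, though it does not by itself show where in the server this happens.

```python
# Illustrative sketch only; nothing here comes from PostgreSQL or libxml2.

# The bytes quoted in the "Input is not proper UTF-8" error message.
reported = bytes([0xF0, 0xF3, 0xF1, 0xF1])

# The document from the report, encoded the way a WIN1251 server stores it.
document = "<русский>язык</русский>"
stored = document.encode("cp1251")

def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short marker if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<invalid {encoding}: {exc.reason}>"

# The reported bytes are exactly the bytes right after the opening '<'.
matches_document = stored[1:5] == reported

as_cp1251 = try_decode(reported, "cp1251")  # the letters "русс"
as_utf8 = try_decode(reported, "utf-8")     # fails: not a valid UTF-8 sequence

# Transcoding to UTF-8 first yields bytes that a UTF-8 parser can accept.
transcoded = stored.decode("cp1251").encode("utf-8")
round_trip = try_decode(transcoded, "utf-8")

print(matches_document)
print(as_cp1251)
print(as_utf8)
print(round_trip == document)
```

The last two lines of the sketch show the conversion step that appears to be missing: decoding from the server encoding and re-encoding as UTF-8 before handing the text to a parser that expects UTF-8.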