Regexp matching behaves differently with non-ASCII characters depending on the server encoding (bug.sql contains non-ASCII characters):

% initdb -E KOI8-R --locale ru_RU.KOI8-R
% psql postgres < bug.sql
 true
------
 t
(1 row)

 true | true
------+------
 t    | t
(1 row)

% initdb -E UTF8 --locale ru_RU.UTF-8
% psql postgres < bug.sql
 true
------
 f
(1 row)

 true | true
------+------
 f    | t
(1 row)

As far as I can see, this happens because regc_locale.c uses isalpha (and the other is* functions), tolower and toupper instead of the isw* and tow* functions. Is there any reason to use them? If not, I can modify regc_locale.c similarly to the locale part of tsearch2.
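
To illustrate the difference, here is a minimal sketch. It is not the code from regc_locale.c, and the helper names count_alpha_bytes and count_alpha_wide are made up for this example. The first helper classifies a string one byte at a time with isalpha, the second decodes it to wide characters with mbstowcs and classifies with iswalpha. Both depend on the current LC_CTYPE setting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the bytes of s that isalpha() accepts.
 * This is the per-byte approach; a multibyte character is seen as
 * several unrelated bytes. */
size_t count_alpha_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char) *s))
            n++;
    return n;
}

/* Hypothetical helper: decode s with the current LC_CTYPE and count
 * the characters that iswalpha() accepts. Returns (size_t) -1 if the
 * input is not valid in the current encoding or does not fit in the
 * fixed 64-element buffer. */
size_t count_alpha_wide(const char *s)
{
    wchar_t buf[64];
    size_t len, i, n = 0;

    assert(s != NULL);
    len = mbstowcs(buf, s, 64);
    if (len == (size_t) -1 || len >= 64)
        return (size_t) -1;
    for (i = 0; i < len; i++)
        if (iswalpha((wint_t) buf[i]))
            n++;
    return n;
}
```

In KOI8-R every Cyrillic letter is a single byte, so the per-byte test can work there. In UTF-8 the same letter is two bytes (0xD0 0xB4 for the first letter of the test word), and a per-byte test never sees the whole character. After a call such as setlocale(LC_CTYPE, "ru_RU.UTF-8"), assuming that locale is installed, I would expect count_alpha_wide to report one letter for that two-byte sequence.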



--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/
-- bug.sql (the string literals are KOI8-R bytes in the original file)
set client_encoding='KOI8';

SELECT 'д' ~* '[[:alpha:]]' as "true";
SELECT
        'Дорога' ~* 'дорога' as "true",
        'дорога' ~* 'дорога' as "true";