Hi !
make_greater_string() does not return a string when some UTF8 strings
set to str_const.
# Especially UTF8 strings which contains 'BF' in last byte.
Because make_greater_string() only try incrementing the last byte of
the string, and not try same test for upper bytes.
Therefore, some queries which contains "LIKE '<contains 'BF' in last byte>%'"
can not perform (Btree's) index-scan.
# Or may be nearly full-index-scan.
# See follwing example.
===============================================================================
'西' (Japanese Letter) : 0xE8A5BF
[client : UTF8 ⇔ server : EUC_JP]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Scan using test2_name on test2 (cost=0.00..8.28 rows=1 width=3) (actual
time=0.077..0.078 rows=1 loops=1)
Index Cond: ((name >= '西'::text) AND (name < '誠'::text)) <-- Index-scan is
chosen
Filter: (name ~~ '西%'::text)
Total runtime: 0.110 ms
(4 rows)
[client : UTF8 ⇔ server : UTF8]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..1693.01 rows=1 width=4) (actual
time=22.598..22.599 rows=1 loops=1)
Filter: (name ~~ '西%'::text) <-- Seq-scan is chosen !
Total runtime: 22.626 ms
(3 rows)
===============================================================================
Attached patch solve above problem.
Best regards,
--
NTT OSS Center
Tatsuhito Kasahara
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fc3c5b0..fdf58cf 100644
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** make_greater_string(const Const *str_con
*** 5542,5552 ****
*lastchar = savelastchar;
/*
! * Truncate off the last character, which might be more than 1
byte,
! * depending on the character encoding.
*/
if (datatype != BYTEAOID && pg_database_encoding_max_length() >
1)
! len = pg_mbcliplen(workstr, len, len - 1);
else
len -= 1;
--- 5542,5567 ----
*lastchar = savelastchar;
/*
! * Increment the previous character, or truncate off the last
character,
! * which might be more than 1 byte, depending on the character
encoding.
*/
if (datatype != BYTEAOID && pg_database_encoding_max_length() >
1)
! {
! int i;
! int cliplen = pg_mbcliplen(workstr, len,
len - 1);
!
! for (i = len - 1; i > cliplen; i--)
! {
! if ((unsigned char) workstr[i] < (unsigned
char) 255)
! {
! workstr[i]++;
! memset(workstr + i + 1, 1 /* or 0? */,
len - i);
! break;
! }
! }
! if (i <= cliplen)
! len = cliplen;
! }
else
len -= 1;
--
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs