Hi!

make_greater_string() fails to return a string when str_const holds certain
UTF8 strings.
# In particular, UTF8 strings whose last byte is 0xBF.

This is because make_greater_string() only tries incrementing the last byte of
the string, and never tries incrementing the higher-order (preceding) bytes.
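# A small standalone sketch of the problem (this is not PostgreSQL code;
# is_valid_utf8_3byte() and bump_last_byte() are hypothetical helpers written
# only for this illustration, and only 3-byte sequences are handled).
# Starting from 0xE8A5BF, every candidate from 0xE8A5C0 to 0xE8A5FF has a last
# byte outside the continuation range 0x80..0xBF, so none of them is valid.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if s[0..2] is a well-formed
 * 3-byte UTF-8 sequence.
 */
static bool
is_valid_utf8_3byte(const unsigned char *s)
{
	/* Lead byte of a 3-byte sequence is 0xE0..0xEF. */
	if (s[0] < 0xE0 || s[0] > 0xEF)
		return false;
	/* Both trailing bytes must be continuation bytes, 0x80..0xBF. */
	if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
		return false;
	/* Reject overlong forms and UTF-16 surrogates. */
	if (s[0] == 0xE0 && s[1] < 0xA0)
		return false;
	if (s[0] == 0xED && s[1] > 0x9F)
		return false;
	return true;
}

/*
 * Hypothetical helper: imitates "increment only the last byte".
 * Returns true and leaves the new value in s if a valid sequence is
 * found; otherwise restores the last byte and returns false.
 */
static bool
bump_last_byte(unsigned char *s)
{
	unsigned char saved = s[2];

	while (s[2] < 0xFF)
	{
		s[2]++;
		if (is_valid_utf8_3byte(s))
			return true;
	}
	s[2] = saved;
	return false;
}
```

# With s = {0xE8, 0xA5, 0xBF}, bump_last_byte(s) returns false, while
# {0xE8, 0xA6, 0x80} (second byte incremented) passes is_valid_utf8_3byte().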

As a result, queries that contain "LIKE '<string whose last byte is 0xBF>%'"
cannot use a (btree) index scan.
# Or they may end up with a nearly full index scan.

# See the following example.
===============================================================================
'西' (Japanese character) in UTF8 : 0xE8A5BF

[client : UTF8 ⇔ server : EUC_JP]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using test2_name on test2  (cost=0.00..8.28 rows=1 width=3) (actual time=0.077..0.078 rows=1 loops=1)
  Index Cond: ((name >= '西'::text) AND (name < '誠'::text))  <-- Index-scan is chosen
  Filter: (name ~~ '西%'::text)
 Total runtime: 0.110 ms
(4 rows)

[client : UTF8 ⇔ server : UTF8]
=# EXPLAIN ANALYZE SELECT * FROM test2 WHERE name LIKE '西%';
                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1693.01 rows=1 width=4) (actual time=22.598..22.599 rows=1 loops=1)
  Filter: (name ~~ '西%'::text)  <-- Seq-scan is chosen !
 Total runtime: 22.626 ms
(3 rows)
===============================================================================

The attached patch solves the above problem.
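
# The idea, as a rough UTF8-only sketch (this is NOT the attached patch, which
# is encoding-independent; increment_last_utf8_char() is a hypothetical name,
# and resetting the trailing bytes to 0x80 is my assumption here, chosen
# because it is the smallest continuation byte):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: try to make the last character of s larger by
 * incrementing one of its continuation bytes, scanning from the last
 * byte toward the lead byte.
 *
 * The last character occupies s[charstart .. len - 1] and is assumed
 * to be well-formed UTF-8.  The lead byte itself is never changed; if
 * no continuation byte can be incremented, the function returns false
 * and leaves s untouched, so the caller can fall back to truncating
 * the character.
 */
static bool
increment_last_utf8_char(unsigned char *s, size_t charstart, size_t len)
{
	size_t		i;

	if (len == 0 || charstart >= len)
		return false;

	for (i = len - 1; i > charstart; i--)
	{
		unsigned char max = 0xBF;	/* largest continuation byte */

		/* The second byte has a tighter limit after some lead bytes. */
		if (i == charstart + 1)
		{
			if (s[charstart] == 0xED)
				max = 0x9F;		/* stay below the surrogate range */
			else if (s[charstart] == 0xF4)
				max = 0x8F;		/* stay at or below U+10FFFF */
		}

		if (s[i] < max)
		{
			size_t		j;

			s[i]++;
			/* Reset the bytes after it to the smallest continuation byte. */
			for (j = i + 1; j < len; j++)
				s[j] = 0x80;
			return true;
		}
	}
	return false;
}
```

# For 0xE8A5BF this gives 0xE8A680: the last byte cannot be incremented, so the
# second byte goes from 0xA5 to 0xA6 and the last byte is reset to 0x80.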

Best regards,

-- 
NTT OSS Center
Tatsuhito Kasahara



diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fc3c5b0..fdf58cf 100644
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** make_greater_string(const Const *str_con
*** 5542,5552 ****
                *lastchar = savelastchar;
  
                /*
!                * Truncate off the last character, which might be more than 1 byte,
!                * depending on the character encoding.
                 */
                if (datatype != BYTEAOID && pg_database_encoding_max_length() > 1)
!                       len = pg_mbcliplen(workstr, len, len - 1);
                else
                        len -= 1;
  
--- 5542,5567 ----
                *lastchar = savelastchar;
  
                /*
!                * Increment the previous character, or truncate off the last character,
!                * which might be more than 1 byte, depending on the character encoding.
                 */
                if (datatype != BYTEAOID && pg_database_encoding_max_length() > 1)
!               {
!                       int             i;
!                       int             cliplen = pg_mbcliplen(workstr, len, len - 1);
! 
!                       for (i = len - 1; i > cliplen; i--)
!                       {
!                               if ((unsigned char) workstr[i] < (unsigned char) 255)
!                               {
!                                       workstr[i]++;
!                                       memset(workstr + i + 1, 1 /* or 0? */, len - i);
!                                       break;
!                               }
!                       }
!                       if (i <= cliplen)
!                               len = cliplen;
!               }
                else
                        len -= 1;
  
-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
