Tatsuhito Kasahara <kasahara.tatsuh...@oss.ntt.co.jp> writes:
> make_greater_string() does not return a string when some UTF8 strings
> set to str_const.
> # Especially UTF8 strings which contains 'BF' in last byte.

The patch you propose for this is really untenable: it will
re-introduce many corner cases that we got rid of years ago, for
example cases wherein pg_verifymbstr and pg_mbcliplen index off the
end of the string because they think the last character occupies more
bytes than are there. It's intentional that the existing code doesn't
mess with the first byte of a multibyte character (which is the one
that determines the character length, in all encodings of interest).

Another problem is that if the last character is several bytes long,
this coding would cause us to iterate through potentially many
millions of character values before giving up and truncating off the
last character. In a large number of cases that's just wasted time
because there is no chance of getting a larger string without
incrementing some character further to the left. So there's a
tradeoff that limits how many values we should consider for each
character position; choosing to consider at most 255 is a bit
arbitrary, but "all of them" isn't going to work.

I don't think that the set of cases that could be improved this way is
large enough to justify trying to find solutions to these problems.

			regards, tom lane
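
For illustration, the bounded strategy described in the reply can be
sketched roughly as below: increment only the final byte of the string
(so the first byte of a multibyte character is never touched), accept
the first candidate that is still validly encoded, and otherwise
truncate the last whole character and retry further to the left. This
is a minimal standalone sketch, not the PostgreSQL source. The names
utf8_char_len, utf8_valid and make_greater_utf8 are hypothetical
stand-ins (the real code relies on pg_verifymbstr and pg_mbcliplen),
the validity check is deliberately simplified, and the buffer is
assumed to be NUL-terminated UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: character length implied by a UTF-8 lead byte,
 * or 0 if the byte cannot start a character. */
static size_t utf8_char_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

/* Hypothetical helper: simplified validity check. It verifies lead bytes,
 * lengths and continuation bytes only; it does not reject overlong 3- and
 * 4-byte forms, surrogates, or values above U+10FFFF. */
static bool utf8_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        size_t n = utf8_char_len(s[i]);

        if (n == 0 || i + n > len)
            return false;       /* bad lead byte, or character runs off the end */
        for (size_t k = 1; k < n; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
        i += n;
    }
    return true;
}

/* Hypothetical sketch: modify buf (valid UTF-8, *len bytes, NUL-terminated)
 * in place into a string that sorts greater in byte order.
 * Returns false if no such string was found. */
static bool make_greater_utf8(unsigned char *buf, size_t *len)
{
    size_t n = *len;

    while (n > 0)
    {
        unsigned char *last = &buf[n - 1];
        unsigned char saved = *last;

        /* Only the final byte changes: at most 255 candidates per position. */
        while (*last < 0xFF)
        {
            (*last)++;
            if (utf8_valid(buf, n))
            {
                *len = n;
                return true;
            }
        }

        /* Restore the byte so the truncation below sees the original character. */
        *last = saved;

        /* Truncate the last whole character: step back over continuation bytes
         * until n is the index of that character's first byte. */
        n--;
        while (n > 0 && (buf[n] & 0xC0) == 0x80)
            n--;
        buf[n] = '\0';
    }

    *len = 0;
    return false;
}
```

Under these assumptions, the input "a\xC2\xBF" (whose last byte is BF)
cannot be advanced in its last position, so the sketch drops that
character and returns "b". The input "\xC2\xBF" on its own has nothing
further to the left to increment, so the function returns false, which
corresponds to the behaviour in the original report.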