On Thu, Jun 9, 2011 at 12:39 AM, Jeevan Chalke <jeevan.cha...@enterprisedb.com> wrote:
>> It's a problem, but without an efficient algorithm for Unicode case
>> folding, any fix we attempt to implement seems like it'll just be
>> moving the problem around.
>
> Agree.
>
> I read on other mail thread that str_tolower() is a wide-character-aware
> lower function but it is also a collation-aware and hence might change its
> behaviour wrt change in locale. However, Tom suggested that we need to have
> non-locale-dependent case folding algorithm.
>
> But still for same locale on same machine, where we can able to create a
> table, insert some data, we cannot retrieve it. Don't you think it is more
> serious and we need a quick solution here? As said earlier it may even lead
> to pg_dump failures. Given that str_tolower() functionality is locale
> dependent but still it will resolve this particular issue. Not sure, there
> might be a performance issue but at-least we are not giving an error.

Well, as I understand it, the problem here is that if someone goes and
changes the locale, then you might massively break the user's
application. For example, if the user says:

CREATE TABLE FOO (...);
SELECT * FROM FOO;

...that'll work, of course, because whatever you get when you downcase
FOO will be the same both times. But if the locale now changes, then
the next...

SELECT * FROM FOO;

...might fail, because the new downcasing of FOO might not match the
old one. You could argue that that's better than the current
situation, but it's not clear-cut.

But now that I re-think about it, I guess what I'm confused about is
this code here:

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        else if (IS_HIGHBIT_SET(ch) && isupper(ch))
            ch = tolower(ch);
        result[i] = (char) ch;

It seems to me that we're downcasing the first byte of each wide
character and ignoring the rest... which seems like it can't possibly
be a good idea in a multi-byte encoding. Perhaps we could keep that
approach for single-byte encodings and just pass through multi-byte
characters untouched?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
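
To make the last suggestion above concrete (keep the existing folding for single-byte encodings, pass multi-byte characters through untouched), here is a minimal sketch. It is not the PostgreSQL source: downcase_ident_sketch, high_bit_set, and the single_byte_encoding flag are hypothetical names invented for illustration, and how a caller would decide whether the database encoding is single-byte is left as an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the IS_HIGHBIT_SET test in the quoted code. */
static bool high_bit_set(unsigned char ch)
{
    return (ch & 0x80) != 0;
}

/*
 * Fold ident[0..len) into result, which must have room for len + 1 bytes.
 *
 * single_byte_encoding is an assumed, caller-supplied flag. When it is
 * false, any byte with the high bit set is treated as part of a multi-byte
 * character and copied unchanged, so only ASCII A-Z is ever folded.
 */
static void downcase_ident_sketch(const char *ident, size_t len,
                                  bool single_byte_encoding, char *result)
{
    assert(ident != NULL && result != NULL);

    for (size_t i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) ident[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';            /* locale-independent ASCII fold */
        else if (single_byte_encoding && high_bit_set(ch) && isupper(ch))
            ch = (unsigned char) tolower(ch);   /* locale-dependent, one byte per char */
        /* otherwise: copy the byte through untouched */

        result[i] = (char) ch;
    }
    result[len] = '\0';
}
```

With single_byte_encoding set to false, the UTF-8 bytes of "TÉ" (0x54 0xC3 0x89) come out as 0x74 0xC3 0x89: the ASCII letter is folded and the two bytes of É are left alone. The trade-off is that non-ASCII upper-case letters are never folded in a multi-byte encoding, but the result no longer depends on the locale for those encodings.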