On Mon, Jun 16, 2014 at 4:04 PM, Josh Berkus <j...@agliodbs.com> wrote:
> Question: How should we handle the issues with East Asian languages
> (i.e. Japanese, Chinese) and this Hint? Should we just avoid hinting
> for a selected list of languages which don't work well with levenshtein?
> If so, how do we get that list?

I think that how useful Levenshtein distance is for users based in East Asia generally, and how useful this patch is to those users, are two distinct questions. I have no idea how common it is for Japanese users to use only Roman characters in table and attribute names. Since they're very probably already writing application code that uses Roman characters (except in comments, user-visible strings and so on), it might make sense for them to do the same in the database. I would welcome further input on that question. I don't know what the trends are in the real world.

Also note that the patch scans the range table in the parse state to pick the most probable candidate among all Vars/columns that already appear there. The query would already have raised an error at an earlier point if, for example, a non-existent relation was referenced. We're only choosing from a minimal list of possibilities, and picking the one that is very probably what was intended. Even if Levenshtein distance works badly with Kanji (which is not obviously the case, at least to me), it might not matter here.

--
Peter Geoghegan
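
To illustrate the point above about choosing from a short candidate list, here is a minimal Python sketch. It is not the patch's code (the patch is C inside the parser). The names `levenshtein` and `suggest_column`, the sample column names, and the `max_distance` cutoff of 3 are all assumptions made for the example. Python compares strings by Unicode code point, so a Kanji character counts as a single edit, the same as a Roman letter.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, counted in Unicode code points."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest_column(misspelled, candidates, max_distance=3):
    """Return the closest candidate, or None if nothing is close enough.

    `candidates` stands in for the columns already visible in the query's
    range table; `max_distance` is a hypothetical cutoff for this sketch.
    """
    best, best_dist = None, max_distance + 1
    for name in candidates:
        dist = levenshtein(misspelled, name)
        if dist < best_dist:
            best, best_dist = name, dist
    return best


# Roman-character column names: one missing letter.
print(suggest_column("custmer_id", ["customer_id", "order_id", "created_at"]))

# Kanji column names: one missing character.
print(suggest_column("顧客番", ["顧客番号", "注文番号"]))
```

Because the candidate list is limited to columns of relations already named in the query, even a coarse distance measure tends to have only one plausible match to find.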