On 14/06/17 9:53, Tom Lane wrote:
> Michael Paquier <michael.paqu...@gmail.com> writes:
>> On Tue, Jun 17, 2014 at 9:30 AM, Ian Barwick <i...@2ndquadrant.com> wrote:
>>> From what I've seen in the wild in Japan, Roman/ASCII characters are
>>> widely used for object/attribute names, as generally it's much less
>>> hassle than switching between input methods, dealing with different
>>> encodings etc. The only place where I've seen Japanese characters widely
>>> used is in tutorials, examples etc. However that's only my personal
>>> observation for one particular non-Roman language.
>
>> And I agree to this remark, that's a PITA to manage database object
>> names with Japanese characters directly. I have ever seen some
>> applications using such ways to define objects though in the past, not
>> *that* many I concur..
>
> What exactly is the rationale for thinking that Levenshtein distance is
> useless in non-Roman alphabets? AFAIK it just counts insertions and
> deletions of characters, which seems like a concept rather independent
> of what those characters are.

With Japanese (which doesn't have an alphabet, but two syllabaries and a
large set of logographic characters), Levenshtein distance is pretty
useless in two situations: when examining similarities between words
which can be written in either syllabary (Michael's "ramen" example
earlier in the thread), and when catching "typos" caused by erroneous
conversion from phonetic input to characters, e.g. intending to input
"成長" (seichou, growth) but accidentally selecting "清聴" (seichou,
courteous attention).

However, in this particular use case, as long as it doesn't produce false
positives (I haven't looked at the patch), I don't think it would cause
any problems of the kind which would require actively excluding certain
languages/character sets; it just wouldn't be quite as useful.

Regards

Ian Barwick

--
 Ian Barwick                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
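
To illustrate the point about Japanese, here is a minimal Python sketch of
the textbook Levenshtein distance (insertions, deletions and also
substitutions, each costing 1) computed over Unicode code points. It is
not the code from the patch, which I haven't looked at; the function name
and the hiragana/katakana spellings of "ramen" are my own assumptions for
the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance over Unicode code points (illustration only)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        prev = curr
    return prev[-1]


# Same reading (seichou), but both characters differ.
print(levenshtein("成長", "清聴"))

# Assumed spellings of "ramen" in hiragana and katakana; only the
# long-vowel mark is shared between the two.
print(levenshtein("らーめん", "ラーメン"))

# A one-letter slip in Roman characters, for comparison.
print(levenshtein("ramen", "raman"))
```

The two Japanese pairs come out almost as far apart as their lengths
allow, although a reader would see them as the same sound or the same
word, while the Roman-character typo is only one edit away. The distance
is still well defined for such input, it just carries less information.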