Nikita Glukhov <n.glu...@postgrespro.ru> writes:
> I don't know if it is possible to check Unicode properties "ID_Start" and
> "ID_Continue" in Postgres, and what ZWNJ/ZWJ is. Now, identifier's starting
> character set is simply determined by the exclusion of all recognized special
> characters.

TBH, I think you should simply ignore any aspect of any of these standards
that is defined by reference to Unicode. We are not necessarily dealing
with a Unicode character set, so at best, references to things like ZWNJ
are unreachable no-ops in a lot of environments.

As a relevant example, modern SQL defines whitespace in terms of Unicode[1],
a fact that we have ignored from the start and will likely continue to
ignore. You could do a lot worse than to just treat as identifiers the
same strings that our SQL lexer would (modulo things like "$" that have
special status in the path language).

			regards, tom lane

[1] cf 4.2.4 "Character repertoires" in SQL:2011
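
For concreteness, here is a rough sketch of what "the same strings as our SQL lexer" amounts to. It mirrors the ident_start and ident_cont character classes in scan.l, which work on raw bytes in the server encoding and accept any high-bit-set byte. The function names and the allow_dollar flag are made up for illustration; excluding "$" by default is only an assumption about what the path language would want, not a description of any actual patch.

```python
def is_ident_start(b: int) -> bool:
    """Byte may start an identifier: ASCII letter, underscore, or high-bit byte."""
    return (
        ord("A") <= b <= ord("Z")
        or ord("a") <= b <= ord("z")
        or b == ord("_")
        or b >= 0x80  # any non-ASCII byte, whatever the server encoding is
    )


def is_ident_cont(b: int, allow_dollar: bool = False) -> bool:
    """Byte may continue an identifier: start bytes, digits, optionally '$'."""
    return (
        is_ident_start(b)
        or ord("0") <= b <= ord("9")
        # The SQL lexer accepts '$' here; assumed off for the path language.
        or (allow_dollar and b == ord("$"))
    )


def is_identifier(raw: bytes, allow_dollar: bool = False) -> bool:
    """True if the whole byte string is one unquoted identifier."""
    if not raw or not is_ident_start(raw[0]):
        return False
    return all(is_ident_cont(b, allow_dollar) for b in raw[1:])
```

Note that under this rule a ZWNJ in a UTF-8 database is accepted simply because all of its bytes have the high bit set; nothing consults Unicode properties, and the same code behaves sensibly in single-byte encodings.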