Hi,

Although this is a ten-year-old message, it was the one I found quickly when looking to see what the current state of play on this might be.
On 2013-09-20 14:22, Robert Haas wrote:
> Hmm. So under that design, a database could support up to a total of two character sets, the one that you get when you say 'foo' and the other one that you get when you say n'foo'. I guess we could do that, but it seems a bit limited. If we're going to go to the trouble of supporting multiple character sets, why not support an arbitrary number instead of just two?
Because that old thread came to an end without mentioning how the standard approaches that, it seemed worth adding, just to complete the record.

In the draft of the standard I'm looking at (which is also around a decade old), n'foo' is nothing but a handy shorthand for _csname'foo' (a syntax we do not accept), for some particular csname that was chosen when setting up the db. So really, the standard contemplates letting you have columns of arbitrary different charsets (CHAR(x) CHARACTER SET csname) and literals of arbitrary charsets (_csname'foo'). Then, as a bit of sugar, you get to pick which two of those charsets you'd like to have easy shorter ways of writing: 'foo' or n'foo', CHAR or NCHAR.

The grammar for csname is kind of funky. It can be nothing but <SQL language identifier>, which has the nice restricted form /[A-Za-z][A-Za-z0-9_]*/. But it can also be schema-qualified, with the schema of course being a full-fledged <identifier>. So yeah, to fully meet this part of the standard, the parser'd have to know that

  _U&"I am a schema nameZ0021" UESCAPE 'Z'/*hi!*/.LATIN1'foo'

is a string literal, expressing foo, in a character set named LATIN1, in some cutely-named schema. Never a dull moment.

Regards,
-Chap
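
P.S. For concreteness, a rough sketch of what that looks like spelled out in the standard's syntax (the _csname introducer being the part we don't accept); the choice of LATIN1 and UTF8 as the two "blessed" charsets below is purely hypothetical:

  -- a column whose charset is chosen explicitly (standard syntax)
  CREATE TABLE t (c CHAR(10) CHARACTER SET LATIN1);

  -- a literal tagged with an explicit charset; this is the introducer
  -- form we do not accept
  SELECT _LATIN1'foo';

  -- under the draft's reading, the plain and national forms are just
  -- sugar for whichever two charsets were picked when setting up the
  -- db; LATIN1 and UTF8 here are hypothetical choices
  SELECT 'foo';   -- then shorthand for _LATIN1'foo'
  SELECT n'foo';  -- then shorthand for _UTF8'foo'

  -- and csname may be schema-qualified, the schema being a full
  -- <identifier>, hence the example above
  SELECT _U&"I am a schema nameZ0021" UESCAPE 'Z'/*hi!*/.LATIN1'foo';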