Could you point me to where in the archives I can read more? I'm having a bit of trouble finding discussion on this. Thanks.
I didn't spend too much time looking, but here are a few threads that look like they touch on related issues:
http://archives.postgresql.org/pgsql-hackers/2003-11/msg01299.php
http://archives.postgresql.org/pgsql-hackers/2001-11/msg00610.php
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00515.php
So, as I understand it, the current plan is:
1. Each column will be tagged with a charset + encoding (as per the SQL standard).
2a. Individual string values will be tagged with charset + encoding. This incurs an overhead of 1-2 bytes per value.
or
2b. All string values will be stored in a single charset + encoding (e.g. Unicode + UTF-8). This will of course upset some people, e.g. Japanese users.
Is it 1+2a or 1+2b? Recent language implementations/VMs like Parrot and Ruby2 lean toward 2a, I think.
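To make the trade-off between 2a and 2b concrete, here is a rough Python sketch of the two storage layouts. It is only an illustration, not PostgreSQL's actual on-disk format: the one-byte encoding IDs, the ENCODING_IDS table, and the store_*/load_* helpers are all made up for the example.

```python
# Hypothetical encoding IDs for illustration only; not PostgreSQL's real format.
ENCODING_IDS = {"utf-8": 1, "euc_jp": 2, "latin-1": 3}
ENCODING_NAMES = {ident: name for name, ident in ENCODING_IDS.items()}


def store_tagged(text, encoding):
    """Option 2a: prefix each value with a 1-byte encoding tag."""
    return bytes([ENCODING_IDS[encoding]]) + text.encode(encoding)


def load_tagged(blob):
    """Read the tag byte, then decode the rest with the encoding it names."""
    return blob[1:].decode(ENCODING_NAMES[blob[0]])


def store_single(text):
    """Option 2b: every value is stored in one fixed encoding (UTF-8 here)."""
    return text.encode("utf-8")


def load_single(blob):
    """Decode a value stored under option 2b."""
    return blob.decode("utf-8")


sample = "日本語"
tagged = store_tagged(sample, "euc_jp")   # tag byte + EUC-JP bytes
single = store_single(sample)             # UTF-8 bytes, no tag

print(len(tagged), len(single))
print(load_tagged(tagged) == load_single(single) == sample)
```

For this sample the tagged EUC-JP value is 7 bytes (1 tag byte + 6) against 9 bytes for untagged UTF-8, while for plain ASCII text the tag in 2a is pure overhead of one byte per value.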
-- dave