Re: [HACKERS] String encoding during connection "handshake"

Trevor Talbot Wed, 28 Nov 2007 09:41:15 -0800

On 11/28/07, Martijn van Oosterhout <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 28, 2007 at 05:54:05PM +0200, [EMAIL PROTECTED] wrote:


> > Regarding the problem of "One True Encoding", the answer seems obvious to
> > me: use only one encoding per database cluster, either UTF-8 or UTF-16 or
> > another Unicode-aware scheme, whichever yields a statistically smaller
> > database for the languages employed by the users in their data. This
> > encoding should be a one time choice! De facto, this is already happening
> > now, because one cannot change collation rules after a cluster has been
> > created.

> Umm, each database in a cluster can have a different encoding, so there
> is no such thing as the "cluster's encoding". You can certainly argue
> that it should be a one time choice, but I doubt you'll get people to
> remove the possibilities we have now. In fact, if anything we'd probably
> go the other way and allow you to select the collation on a per
> database/table/column level (SQL compliance requires this).

To be clear, what sulfinu is really advocating is convergence on
Unicode, period, which is the direction most international projects
are moving when they can.  PostgreSQL's problem is that it (and AFAICT
POSIX) conflates encoding with locale, when the two are entirely
separate concepts.

I'm not entirely sure how that's supposed to solve the client
authentication issue though.  Demanding that clients present auth data
in UTF-8 is no different than demanding they present it in the
encoding it was entered in originally...
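
To make the authentication point concrete, here is a minimal sketch
(not PostgreSQL's actual wire protocol) of why the encoding matters at
all: the same password typed by the user becomes different bytes under
different encodings, so any hash computed over those bytes differs too.
The password, the helper name auth_digest, and the plain MD5-over-bytes
scheme are all assumptions made purely for illustration.

```python
import hashlib


def auth_digest(password: str, encoding: str) -> str:
    """Hash the password bytes as a client using `encoding` would send them.

    Hypothetical helper: a bare MD5 over the encoded bytes, standing in
    for whatever digest a real authentication scheme computes.
    """
    return hashlib.md5(password.encode(encoding)).hexdigest()


# Hypothetical password containing one non-ASCII character (a-umlaut).
password = "p\u00e4ssword"

utf8_bytes = password.encode("utf-8")
latin1_bytes = password.encode("latin-1")

# The byte sequences differ, so the digests differ as well.
print(utf8_bytes, latin1_bytes)
print(auth_digest(password, "utf-8"))
print(auth_digest(password, "latin-1"))

# A pure-ASCII password is unaffected, which is why the problem tends to
# stay hidden until someone uses a non-ASCII character.
print(auth_digest("password", "utf-8") == auth_digest("password", "latin-1"))
```

Whichever encoding is mandated, the client still has to know what to
convert from, which is the sense in which requiring UTF-8 just moves the
requirement rather than removing it.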
