In a context using normalization, wouldn't you typically want to store a
normalized-text type that could perhaps (depending on locale) take advantage of
simpler, more-efficient comparison functions? Whether you're doing
INSERT/UPDATE, or importing a flat text file, if you canonicalize characters
On Sep 24, 2009, at 8:59 AM, Andrew Dunstan wrote:
That might be nice, but I'd be wary of a geometric multiplication
of text types. We already have TEXT and CITEXT; what if we had your
NTEXT (normalized text) but I wanted it to also be case-insensitive?
Actually, I don't think it's necessar
David E. Wheeler wrote:
On Sep 24, 2009, at 6:24 AM, p...@thetdh.com wrote:
In a context using normalization, wouldn't you typically want to
store a normalized-text type that could perhaps (depending on locale)
take advantage of simpler, more-efficient comparison functions?
That might be n
On Sep 24, 2009, at 6:24 AM, p...@thetdh.com wrote:
In a context using normalization, wouldn't you typically want to
store a normalized-text type that could perhaps (depending on
locale) take advantage of simpler, more-efficient comparison
functions?
That might be nice, but I'd be wary of
On Sep 23, 2009, at 11:08 AM, David E. Wheeler wrote:
I looked around and found the Public Software Group's utf8proc
project, which even includes some PostgreSQL support (not, alas, for
normalization). It has an MIT-licensed C library that offers these
functions:
Sorry, forgot the link:
On Sep 23, 2009, at 11:08 AM, David E. Wheeler wrote:
I just had a discussion on IRC about unicode normalization in
PostgreSQL. Apparently there is not support for it, currently.
BTW, the only reference I found on the [to do list](http://wiki.postgresql.org/wiki/Todo
) was:
More sensible