"Daniel Verite" <dan...@manitou-mail.org> writes:
> Independently of these rules, all Unicode collations change frequently
> because each release of Unicode adds new characters. Any string
> that contains a code point that was previously unassigned is going
> to be sorted differently by all collations when that code point gets
> assigned to a character.
> Therefore the versions of all collations need to be bumped at every
> Unicode release. This is what ICU does.

I'm very skeptical of this process as being a reason to push users
to reindex everything in sight.  If U+NNNN was not a thing last year,
there's no reason to expect that it appears in anyone's existing data,
and therefore the fact that it sorts differently this year is a poor
excuse for sounding time-to-reindex alarm bells.
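
To illustrate the point, here is a minimal sketch of what a better-targeted
check might look like: instead of flagging every index, scan the existing
strings for code points that are unassigned in some baseline Unicode
version. It assumes Python's bundled Unicode database (see
unicodedata.unidata_version) stands in for that baseline. The function
names are hypothetical, and this is not how PostgreSQL or ICU decide
anything.

```python
import unicodedata

def has_unassigned(s: str) -> bool:
    """Return True if s contains a code point with general category Cn
    (unassigned or noncharacter) in this Python's Unicode database."""
    return any(unicodedata.category(ch) == "Cn" for ch in s)

def rows_with_unassigned(values):
    """Hypothetical helper: keep only the strings whose sort position
    could change once a currently unassigned code point is assigned."""
    return [v for v in values if has_unassigned(v)]

# U+0378 is used here as an example of a code point that is unassigned
# in current Unicode versions; ordinary text passes through untouched.
sample = ["abc", "x\u0378"]
flagged = rows_with_unassigned(sample)
```

If a scan like this finds nothing, the newly assigned characters alone give
no reason to reindex that data.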

I'm quite concerned that we are going to be training users to ignore
collation-change warnings.  They have got to be a lot better targeted
than this, or we're just wasting everyone's time, including ours.

                        regards, tom lane

