"Daniel Verite" <dan...@manitou-mail.org> writes:
> Independently of these rules, all Unicode collations change frequently
> because each release of Unicode adds new characters. Any string
> that contains a code point that was previously unassigned is going
> to be sorted differently by all collations when that code point gets
> assigned to a character.
> Therefore the versions of all collations need to be bumped at every
> Unicode release. This is what ICU does.

I'm very skeptical of this process as being a reason to push users
to reindex everything in sight. If U+NNNN was not a thing last year,
there's no reason to expect that it appears in anyone's existing data,
and therefore the fact that it sorts differently this year is a poor
excuse for sounding time-to-reindex alarm bells.

I'm quite concerned that we are going to be training users to ignore
collation-change warnings. They have got to be a lot better targeted
than this, or we're just wasting everyone's time, including ours.

regards, tom lane
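
To make the "better targeted" idea concrete, here is a minimal sketch (not anything PostgreSQL or ICU actually does) of checking whether stored strings contain any code point from a list of newly assigned ranges. The function names and the NEWLY_ASSIGNED range are hypothetical; the range is a private-use placeholder, not taken from any real Unicode release, and ranges are assumed to be inclusive and non-overlapping.

```python
from bisect import bisect_right


def build_lookup(ranges):
    """Sort inclusive (first, last) code point ranges for binary search."""
    ordered = sorted(ranges)
    starts = [first for first, _ in ordered]
    return starts, ordered


def uses_new_code_point(text, lookup):
    """Return True if any character of text falls inside one of the ranges."""
    starts, ordered = lookup
    for ch in text:
        cp = ord(ch)
        # Index of the last range whose start is <= cp, or -1 if none.
        i = bisect_right(starts, cp) - 1
        if i >= 0 and ordered[i][0] <= cp <= ordered[i][1]:
            return True
    return False


def affected_values(values, ranges):
    """Return only the values that contain a code point in the given ranges."""
    lookup = build_lookup(ranges)
    return [v for v in values if uses_new_code_point(v, lookup)]


# Placeholder only: a private-use range standing in for "code points that
# were unassigned in the old Unicode version and assigned in the new one".
NEWLY_ASSIGNED = [(0xE000, 0xE0FF)]

if __name__ == "__main__":
    sample = ["apple", "zebra", "caf\u00e9", "tag\ue001"]
    hits = affected_values(sample, NEWLY_ASSIGNED)
    print(len(hits), "of", len(sample), "values contain a newly assigned code point")
```

If a scan like this comes back empty for an indexed column, the argument above is that a version bump caused solely by new character assignments would not have changed the order of anything already stored there.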