On Mon, Sep 17, 2018 at 9:02 AM Stephen Frost <sfr...@snowman.net> wrote:
> * Thomas Munro (thomas.mu...@enterprisedb.com) wrote:
> > Once you get into downstream effects of changes (whether they are
> > recorded in the database or elsewhere), I think it's basically beyond
> > our event horizon. Why and when did the collation definition change
> > (bug fix in CLDR, decree by the Académie Française taking effect on 1
> > January 2019, ...)? We could all use bitemporal databases and
> > multi-version ICU, but at some point it all starts to look like an
> > episode of Dr Who. I think we should make a clear distinction between
> > things that invalidate the correct working of the database, and more
> > nebulous effects that we can't possibly track in general.
>
> I tend to agree in general, but I don't think it's beyond us to consider
> multi-version ICU and being able to perform online reindexing (such that
> a given system could be migrated from one collation to another over a
> time while the system is still online, instead of having to take a
> potentially long downtime hit to rebuild indexes after an upgrade, or
> having to rebuild the entire system using some kind of logical
> replication...).

It's a very interesting idea with a high nerd-sniping factor[1].
Practically speaking, I wonder if you can actually do that with typical
Linux distributions where the ICU data is in a shared library (eg
libicudata.so.57), and may also be dependent on the ICU code version
(?) -- do you run into problems linking to several of them at the same
time? Maybe you have to ship your own ICU collations in "data" form to
pull that off. But someone mentioned that distributions don't like you
to do that (likewise for tzinfo and other such things that no one wants
42 copies of on their system).

Actually, if I had infinite resources I'd really like to go and make
libc support multiple collation versions with a standard interface
(technically easy, bureaucratically hard); I don't really like leaving
libc behind. But I digress.

I'd like to propose the 3 more humble goals I mentioned a few messages
back as earlier steps. OS collation changes aren't really like Monty
Python's Spanish Inquisition: they usually hit you when you're doing
major operating system upgrades or setting up a streaming replica to a
different OS version IIUC. That is, they probably happen during
maintenance windows when REINDEX would hopefully be plausible, and
presumably critical systems get tested on the new OS version before
production is upgraded. It'd be kind to our users to make the problem
non-silent at that time so they can plan for it (and of course also
alert them if it happens when nobody expects it, because knowing you
have a problem is better than not knowing).

[1] https://xkcd.com/356/

-- 
Thomas Munro
http://www.enterprisedb.com