On Sun, Jun 30, 2019 at 11:45:47AM +0200, Julien Rouhaud wrote:
> Hi,
> With glibc 2.28 coming, all users will have to reindex almost every
> index after a glibc upgrade to guarantee the absence of corruption.
> Unfortunately, reindexdb is not ideal for that, as it processes
> everything using a single connection and isn't able to skip indexes
> that don't depend on a glibc collation.
> PFA a patchset to add parallelism to reindexdb (reusing the
> infrastructure in vacuumdb with some additions) and an option to
> discard indexes that don't depend on glibc (without any specific
> collation filtering or glibc version detection), with updated
> regression tests. Note that this should be applied on top of the
> existing reindexdb cleanup & refactoring patch
> (https://commitfest.postgresql.org/23/2115/).
> This was sponsored by VMware, and has been discussed internally with
> Kevin and Michael, in Cc.
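
For concreteness, the kind of index such a filter has to find can be
approximated from the catalogs. A minimal sketch, assuming PostgreSQL 10
or later (for pg_collation.collprovider) - note it misses indexes that
depend only on the database's default collation, since that dependency
isn't recorded in pg_depend:

    -- Indexes with a recorded dependency on a libc-provided collation
    -- ('c' = libc, 'i' = ICU, 'd' = database default).
    SELECT DISTINCT i.indexrelid::regclass AS index_name
    FROM pg_index i
    JOIN pg_depend d ON d.classid = 'pg_class'::regclass
                    AND d.objid = i.indexrelid
                    AND d.refclassid = 'pg_collation'::regclass
    JOIN pg_collation c ON c.oid = d.refobjid
    WHERE c.collprovider = 'c';
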
I wonder why this is necessary:
pg_log_error("cannot reindex glibc dependent objects and a subset of objects");
What's the reasoning behind that? It seems like a valid use case to me -
imagine you have a big database, but only a couple of tables are used by
the application regularly (the rest may be archive tables, for example).
Why not allow rebuilding the glibc-dependent indexes on the used tables
first, so that the database can be opened to users sooner?
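
FWIW that subset is easy to express on top of the sketch above, e.g. by
adding

      AND i.indrelid IN ('orders'::regclass, 'customers'::regclass)

to its WHERE clause ('orders' and 'customers' being made-up names for
the hot tables), which yields exactly the glibc-dependent indexes on
the tables the application needs first.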
BTW, now that we allow rebuilding only some of the indexes, it'd be great
to have a dry-run mode, where we'd just print which indexes would be
rebuilt without actually rebuilding them.
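
Until something like that exists, the list can at least be generated and
eyeballed by hand - the same sketch and assumptions as above, this time
emitting the commands instead of running them (in psql, ending the query
with \gexec instead of a semicolon would actually execute them):

    -- Print, but do not run, one REINDEX per glibc-dependent index.
    SELECT DISTINCT format('REINDEX INDEX %s;', i.indexrelid::regclass)
    FROM pg_index i
    JOIN pg_depend d ON d.classid = 'pg_class'::regclass
                    AND d.objid = i.indexrelid
                    AND d.refclassid = 'pg_collation'::regclass
    JOIN pg_collation c ON c.oid = d.refobjid
    WHERE c.collprovider = 'c';
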
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services