On Thu, Jul 25, 2019 at 12:03 PM Michael Paquier <mich...@paquier.xyz> wrote: > > On Thu, Jul 25, 2019 at 12:12:40PM +0300, Sergei Kornilov wrote: > > Thank you, v9 code and behavior looks good for me. Builds cleanly, > > works with older releases. I'll mark Ready to Commiter. > > The restriction with --jobs and --system is not documented, and that's > good to have from the start.
That's documented in the patch: + Note that this option is ignored with the <option>--index</option> + option to prevent deadlocks when processing multiple indexes from + the same relation, and incompatible with the <option>--system</option> + option. The restriction with --jobs and --concurrently is indeed not specifically documented in reindexdb.sgml, there's only a mention in reindex.sgml: <command>REINDEX SYSTEM</command> does not support <command>CONCURRENTLY</command> since system catalogs cannot be reindexed The behavior doesn't really change with this patch, though we could enhance the documentation. > This actually makes the parallel job > handling with --index inconsistent as we mask parallelism in this case > by enforcing the use of one connection. I think that we should > revisit the interactions with --index and --jobs actually, because, > assuming that users never read the documentation, they may actually be > surprised to see that something like --index idx1 .. --index idxN > --jobs=N does not lead to any improvements at all, until they find out > the reason why. The problem is that a user doing something like: reindexdb -j48 -i some_index -S s1 -S s2 -S s3.... will probably be disappointed to learn that he has to run a specific command for the index(es) that should be reindexed. Maybe we can issue a warning that parallelism isn't used when an index list is processed and user asked for multiple jobs? > Thinking deeper about it, there is also a point of gathering first all > the relations if one associates --schemas and --tables in the same > call of reindexdb and then pass down a list of decomposed relations > which are processed in parallel. The code as currently presented is > rather straight-forward, and I don't think that this is worth the > extra complication, but this was not mentioned until now on this > thread :) +1 > For the non-parallel case in reindex_one_database(), I would add an > Assert on REINDEX_DATABASE and REINDEX_SYSTEM with a comment to > mention that a NULL list of objects can just be passed down only in > those two cases when the single-object list with the database name is > built. Something like that? if (!parallel) { - if (user_list == NULL) + /* + * Database-wide and system catalogs processing should omit the list + * of objects to process. + */ + if (process_type == REINDEX_DATABASE || process_type == REINDEX_SYSTEM) { + Assert(user_list == NULL); + process_list = pg_malloc0(sizeof(SimpleStringList)); simple_string_list_append(process_list, PQdb(conn)); } There's another assert after the else-parallel that checks that a list is present, so there's no need to also check it here. I don't send a new patch since the --index wanted behavior is not clear yet.