On Wed, Sep 04, 2019 at 09:18:39AM -0400, Robert Haas wrote: > I think this is all going in the wrong direction. It's the > responsibility of the people who are calling the pgstat_progress_* > functions to only do so when it's appropriate. Having the > pgstat_progress_* functions try to untangle whether or not they ought > to ignore calls made to them is backwards.
Check. > It seems to me that the problem here can be summarized in this way: > there's a bunch of code reuse in the relevant paths here, and somebody > wasn't careful enough when adding progress reporting for one of the > new commands, and so now things are broken, because e.g. CLUSTER > progress reporting gets ended by a pgstat_progress_end_command() call > that was intended for some other utility command but is reached in the > CLUSTER case anyway. The solution to that problem in my book is to > figure out which commit broke it, and then the person who made that > commit either needs to fix it or revert it. I am not sure that it is right as well to say that the first committed patch is right and that the follow-up ones are wrong. CLUSTER progress was committed first (6f97457), followed a couple of days after by CREATE INDEX (ab0dfc9) and then REINDEX (03f9e5c). So let's have a look at them.. For CLUSTER, the progress starts and ends in cluster_rel(). CLUSTER uses its code paths at the beginning, but then things get more complicated, particularly with finish_heap_swap() which calls directly reindex_table(). 6f97457 includes one progress update at point which can be a problem per its shared nature in reindex_relation() with PROGRESS_CLUSTER_INDEX_REBUILD_COUNT. This last part is wrong IMO, why should cluster reporting take priority in this code path, enforcing anything else? For CREATE INDEX, the progress reporting starts and ends once in DefineIndex(). However, we have updates of progress within each index AM build routine, which could be taken by many code paths. Is it actually fine to give priority to CREATE INDEX in those cases? Those paths can as well be taken by REINDEX or CLUSTER (right?), so having a counter for CREATE INDEX looks logically wrong to me. The part where we wait for snapshots looks actually good from the perspective of REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY. For REINDEX, we have a problematic start progress call in reindex_index() which is for example called by reindex_relation for each relation's index for a non-concurrent case (also in ReindexRelationConcurrently()). I think that these are incorrect locations, and I would have placed them in ReindexIndex(), ReindexTable() and ReindexMultipleTables() so as we avoid anything low-level. This has added two calls to pgstat_progress_update_param() in reindex_index(), which is shared between all. Why would it be fine to give the priority to a CREATE INDEX marker here if CLUSTER can also cross this way? On top of those issues, I see some problems with the current state of affairs, and I am repeating myself: - It is possible that pgstat_progress_update_param() is called for a given command for a code path taken by a completely different command, and that we have a real risk of showing a status completely buggy as the progress phases share the same numbers. - We don't consider wisely end and start progress handling for cascading calls, leading to a risk where we start command A, but for shared code paths where we assume that only command B can run then the processing abruptly ends for command A. - Is it actually fine to report information about a command completely different than the one provided by the client? It is for example possible to call CLUSTER, but show up to the user progress report about PROGRESS_COMMAND_CREATE_INDEX (see reindex_index). This could actually make sense if we are able to handle cascading progress reports. These are, at least it seems to me, fundamental problems we need to ponder more about if we begin to include more progress reporting, and we don't have that now, and that worries me. > It's quite possible here that we need a bigger redesign to make adding > progress reporting for new command easier and less prone to bugs, but > I don't think it's at all clear what such a redesign should look like > or even that we definitely need one, and September is not the right > time to be redesigning features for the pending release. Definitely. -- Michael
signature.asc
Description: PGP signature