On Tue, May 24, 2022 at 2:12 PM Matthias van de Meent < boekewurm+postg...@gmail.com> wrote:
> On Tue, 24 May 2022 at 22:12, Zhihong Yu <z...@yugabyte.com> wrote: > > > > On Tue, May 24, 2022 at 12:37 PM Matthias van de Meent < > boekewurm+postg...@gmail.com> wrote: > >> > >> On Tue, 24 May 2022 at 19:13, Zhihong Yu <z...@yugabyte.com> wrote: > >> > > >> > Hi, > >> > Please see attached for enhancement to COPY command progress. > >> > > >> > The added status column would allow users to get the status of the > most recent COPY command. > >> > >> I fail to see the merit of retaining completed progress reporting > >> commands in their views after completion, other than making the > >> behaviour of the pg_stat_progress-views much more complicated and > >> adding overhead in places where we want the system to have as little > >> overhead as possible. > >> > >> Trying to get the status of a COPY command after it finished on a > >> different connection seems weird, as that other connection is likely > >> to have already disconnected / started another task. To be certain > >> that a backend can see the return status of the COPY command, you'd > >> have to be certain that the connection doesn't run any other > >> _progress-able commands in the following seconds / minutes, which > >> implies control over the connection, which means you already have > >> access to the resulting status of your COPY command. > >> > >> Regarding the patch: I really do not like that this leaks entries into > >> all _progress views: I get garbage data from e.g. the _create_index > >> and _copy views when VACUUM is running, etc, because you removed the > >> filter on cmdtype. > >> Also, the added fields in CopyToStateData / CopyFromStateData seem > >> useless when a pgstat_progress_update_param in the right place should > >> suffice. > >> > >> Kind regards, > >> > >> Matthias van de Meent > > > > Hi, > > For #2 above, can you let me know where the > pgstat_progress_update_param() call(s) should be added ? > > In my patch, pgstat_progress_update_param() is called from error > callback and EndCopy(). > > In the places that the patch currently sets cstate->status it could > instead directly call pgstat_progress_update_param(..., STATUS_VALUE). > I'm fairly certain that EndCopy is not called when the error callback > is called, so status reporting should not be overwritten when > unconditionally setting the status to OK in EndCopy. > > > For #1, if I use param18 (which is not used by other views), would that > be better ? > > No: > > /* > - * Report values for only those backends which are running the > given > - * command. > + * Report values for only those backends which are running or > have run. > */ > - if (!beentry || beentry->st_progress_command != cmdtype) > + if (!beentry || beentry->st_progress_command_target == InvalidOid) > continue; > > This change removes the filter that ensures that we only return the > backends which have a st_progress_command of the correct cmdtype (i.e. > for _progress_copy only those that have st_progress_command == > PROGRESS_COMMAND_COPY. Without that filter, you'll return all backends > that have (or have had) their progress fields set at any point. Which > means that the expectation of "the backends returned by > pg_stat_get_progress_info are those running the requested command" > will be incorrect - you'll violate the contract / documented behaviour > of the function: "Returns command progress information for the named > command.". > > The numerical index of the column thus doesn't matter, what matters is > that you want special behaviour for only the COPY progress reporting > that doesn't fit with the rest of the progress-reporting > infrastructure, and that the patch as-is breaks all progress reporting > views. > > Sidenote: The change is also invalid because the rows that we expect > to return for pg_stat_progress_basebackup always have > st_progress_command_target == InvalidOid, so the backends running > BASE_BACKUP would never be visible with the change as-is. COPY (SELECT > stuff) also would not show up, because that too reports a > command_target of InvalidOid. > > Either way, I don't think that this idea is worth pursuing: the > progress views are explicitly there to show the progress of currently > active backends, and not to show the last progress state of backends > that at some point ran a progress-reporting-enabled command. > > Kind regards, > > Matthias van de Meent > Hi, Thanks for the comment. w.r.t. `Returns command progress information for the named command`, how about introducing PROGRESS_COMMAND_COPY_DONE which signifies that PROGRESS_COMMAND_COPY was the previous command ? In pgstat_progress_end_command(): if (beentry->st_progress_command == PROGRESS_COMMAND_COPY) beentry->st_progress_command = PROGRESS_COMMAND_COPY_DONE; else beentry->st_progress_command = PROGRESS_COMMAND_INVALID; beentry->st_progress_command_target = InvalidOid; In pg_stat_get_progress_info(): if (!beentry || (beentry->st_progress_command != cmdtype && (cmdtype == PROGRESS_COMMAND_COPY && beentry->st_progress_command != PROGRESS_COMMAND_COPY_DONE))) continue; Cheers