On 2015/11/20 0:57, Jim Nasby wrote: > On 11/19/15 1:18 AM, Amit Langote wrote: >> 1) General purpose interface for (maintenance?) commands to report a set > > I'm surprised no one has picked up on using this for DML. Certainly anyone > works with ETL processes would love to be able to get some clue on the > status of a long running query...
Instrumenting query execution for progress info would be a complex beast though. Also, what kind of reporting interface it would require is also not clear, at least to me. Jan Urbanski's PGCon presentation[1] is a good source on the matter I discovered in this thread, thanks! But IMHO, for now, it would be worthwhile to focus our resources on the modest goal of implementing a reporting interface for utility commands. Sure it would be nice to investigate how much the requirements of the two overlap. > >> About pass 2, ie, lazy_index_vacuum() and >> lazy_vacuum_heap(), I don't see how we can do better than reporting its >> progress only after finishing all of it without any finer-grained >> instrumentation. They are essentially block-box as far as the proposed >> instrumentation approach is concerned. Being able to report progress per >> index seems good but as a whole, a user would have to wait arbitrarily >> long before numbers move forward. We might as well just report a bool >> saying we're about to enter a potentially time-consuming index vacuum >> round with possibly multiple indexes followed by lazy_vacuum_heap() >> processing. Additionally, we can report the incremented count of the >> vacuuming round (pass 2) once we are through it. > > Another option is to provide the means for the index scan routines to > report their progress. Maybe every index AM won't use it, but it'd > certainly be a lot better than staring at a long_running boolean. The boolean would be a workaround for sure. I'm also slightly tempted by the idea of instrumenting vacuum scans of individual index AM's bulkdelete methods. One precedent is how vacuum_delay_point() are sprinkled around in the code. Another problem to solve would be to figure out how to pass progress parameters around - via some struct or could they be globals just like VacuumCost* variables are... > >> Note that we can leave them out of >> percent_done of overall vacuum progress. Until we have a good solution for >> number (3) above, it seems to difficult to incorporate index pages into >> overall progress. > > IMHO we need to either put a big caution sign on any % estimate that it > could be wildly off, or just forgo it completely for now. I'll bet that if > we don't provide it some enterprising users will figure out the best way > to do this (similar to how the bloat estimate query has evolved over time). > > Even if we never get a % done indicator, just being able to see what > 'position' a command is at will be very valuable. Agreed. If we provide enough information in whatever view we choose to expose, that would be a good start. > >> As someone pointed out upthread, the final heap truncate phase can take >> arbitrarily long and is outside the scope of lazy_scan_heap() to >> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for >> the same. Note that, it would have to be reported from lazy_vacuum_rel(). > > ISTM this is similar to the problem of reporting index status, namely that > a progress reporting method needs to accept reports from multiple places > in the code. Yes. Thanks, Amit [1] http://www.pgcon.org/2013/schedule/events/576.en.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers