On Mon, May 2, 2022 at 6:26 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
>
> Hi Bharath,
>
> On Sat, Apr 30, 2022 at 11:08 AM Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
> >
> > Hi,
> >
> > At times, there can be many temp files (under pgsql_tmp) and temp
> > relation files (under removal which after crash may take longer during
> > which users have no clue about what's going on in the server before it
> > comes up online.
> >
> > Here's a proposal to use ereport_startup_progress to report the
> > progress of the file removal.
> >
> > Thoughts?
>
> The patch looks good to me.
>
> With this patch, the user would at least know which directory is being
> scanned and how much time has elapsed.
There's a problem with the patch: the timeout mechanism isn't used by the postmaster process. The postmaster doesn't call InitializeTimeouts() and doesn't register STARTUP_PROGRESS_TIMEOUT. I tried to make the postmaster do that (v2 patch attached), but make check fails. Now I'm wondering whether it's a good idea to let the postmaster use timeouts at all.

> It would be better to know how
> much work is remaining. I could not find a way to estimate the number
> of files in the directory so that we can extrapolate elapsed time and
> estimate the remaining time. Well, we could loop the output of
> opendir() twice, first to estimate and then for the actual work. This
> might actually work, if the time to delete all the files is very high
> compared to the time it takes to scan all the files/directories.
>
> Another possibility is to scan the sorted output of opendir() thus
> using the current file name to estimate remaining files in a very
> crude and inaccurate way. That doesn't look attractive either. I can't
> think of any better way to estimate the remaining time.

I think 'how much work/how many files remain to be processed' is a generic problem; it also applies to, for instance, snapshot and mapping files, old WAL file processing and so on. I don't think we can do much about it.

> But at least with this patch, a user knows which files have been
> deleted, guessing how far, in the directory structure, the process has
> reached. S/he can then take a look at the remaining contents of the
> directory to estimate how much it should wait.

I'm not sure we will be able to use the timeout mechanism within the postmaster. Another idea is to have a generic GUC, something like log_file_processing_traffic = {none, medium, high} (a similar idea is proposed for WAL file processing during replay/recovery at [1]), with the default being none. When set to medium, a log message gets emitted for every, say, 128 or 256 (just a random number) files processed. When set to high, a log message gets emitted for every file processed (too verbose).
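To make the GUC idea a bit more concrete, here is a rough standalone sketch of the logging decision. It is not PostgreSQL code and not part of the attached patch: the enum, the function names and the interval of 128 are all assumptions for illustration, and the loop only stands in for the real file removal.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical levels for the proposed log_file_processing_traffic GUC. */
typedef enum
{
	FILE_TRAFFIC_NONE,			/* default: no progress messages */
	FILE_TRAFFIC_MEDIUM,		/* one message per batch of files */
	FILE_TRAFFIC_HIGH			/* one message per file */
} FileTrafficLevel;

/* Assumed batch size for "medium"; 128 is an arbitrary choice. */
#define FILE_TRAFFIC_MEDIUM_INTERVAL 128

/* Decide whether a message is due after files_processed files. */
static bool
should_log_file(FileTrafficLevel level, unsigned long files_processed)
{
	switch (level)
	{
		case FILE_TRAFFIC_NONE:
			return false;
		case FILE_TRAFFIC_MEDIUM:
			return files_processed > 0 &&
				files_processed % FILE_TRAFFIC_MEDIUM_INTERVAL == 0;
		case FILE_TRAFFIC_HIGH:
			return true;
	}
	return false;
}

/*
 * Stand-in for a file processing loop: "processes" nfiles files and
 * returns how many progress messages were written to log.
 */
static unsigned long
process_files(FileTrafficLevel level, unsigned long nfiles, FILE *log)
{
	unsigned long emitted = 0;

	for (unsigned long i = 1; i <= nfiles; i++)
	{
		/* real code would remove or otherwise process the i-th file here */
		if (should_log_file(level, i))
		{
			fprintf(log, "processed %lu files so far\n", i);
			emitted++;
		}
	}
	return emitted;
}
```

With this, a call such as process_files(FILE_TRAFFIC_MEDIUM, 1000, stderr) would log once per 128 files, and nothing here depends on timeouts, so it would not need the timeout machinery in the postmaster.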
I think this generic GUC log_file_processing_traffic can be used in many other file processing areas.

Thoughts?

[1] https://www.postgresql.org/message-id/CALj2ACVnhbx4pLZepvdqOfeOekvZXJ2F%3DwJeConGzok%2B6kgCVA%40mail.gmail.com

Regards,
Bharath Rupireddy.
v2-0001-pgsql_tmp-ereport_startup_progress.patch
Description: Binary data