Hi, We often hear customers/users asking questions like - How much time does it take for postgres to recover if it crashes now? How much time does it take for a PITR to finish if it's started now with a specific recovery target? When will the recovery of a postgres server end? It will be nice if the postgres can "somehow" answer these questions. I know this is easier said than done. At a bare minimum, the postgres can scan the WAL from the last checkpoint till end of WAL to see how many WAL records need to be replayed and count in "some" IO costs, average redo/replay/apply times etc. and provide an estimate something like "recovery, if started at this moment, will take approximately X amount of time". To answer these questions, postgres needs to have information about the average replay time of WAL records which depends on the type of WAL record (replay of different WAL records take different amount of time at different times; for instance, replay of a WAL record with many FPIs or data blocks touched takes different time based on the shared buffers hit and misses, disk IO etc.). The postgres can capture and save average replay time of each WAL record type over a period of time and use it for estimates which is of course a costly thing for the postgres to do while it's actually recovering. Or we can feed in some average disk IO, replay costs, postgres can scan the WAL records and provide the estimates.
If postgres has a way to estimate recovery times, it can also trigger checkpoints based on it to keep the RTO/recovery times under limits. I know there are lots of unclear points for now but I would like to start a discussion and hear more thoughts from the community. Please feel free to provide your inputs. Regards, Bharath Rupireddy.