Re: Reduce the time required for a database recovery from archive.

Pavel Stehule Mon, 07 Sep 2020 22:47:09 -0700

On Tue, 8 Sep 2020 at 6:51, Dmitry Shulga <d.shu...@postgrespro.ru>
wrote:


> Hello hackers,
>
> Currently, database recovery from archive is performed sequentially,
> by reading archived WAL files and applying their records to the database.
>
> Archive files are processed one at a time, and this can create a
> performance bottleneck if archived WAL files are delivered slowly,
> because the database server has to wait for the next WAL segment to
> arrive before applying its records.
>
> To address this issue it is proposed to receive archived WAL files in
> parallel, so that when the next WAL segment file is required for
> processing of redo log records it is already available.
>
> The implementation runs several background processes (bgworkers), each of
> which runs the shell command specified by the restore_command parameter
> to deliver an archived WAL file. The number of parallel processes is
> limited by the new parameter max_restore_command_workers. If this parameter
> is 0, WAL files are delivered using the original algorithm, that is, one
> by one. If it is greater than 0, the database server starts bgworker
> processes, up to the limit specified by max_restore_command_workers, and
> passes to each process the name of a WAL file to deliver. Active processes
> prefetch the specified WAL files and store the received files in the
> directory pg_wal/pgsql_tmp. After a bgworker process finishes receiving a
> file, it marks itself as free and waits for a new request to receive the
> next WAL file. The main process performing database recovery still handles
> WAL files one by one, but instead of waiting for the next required WAL file
> to become available it checks for that file in the prefetch directory. If
> the file is present there, the main process starts processing it.
>
> The patch implementing the described approach is attached to this email.
> The patch contains a test in the file
> src/test/recovery/t/021_xlogrestore.pl.
> Although the test result depends on real execution time and could hardly
> be accepted for inclusion in the repository, it was added to show the
> positive effect of the new algorithm. In my environment, restoring from
> archive with parallel prefetching is twice as fast as in the original
> mode.
>

+1

It is an interesting feature.
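
To make the proposed flow concrete, here is a minimal sketch of the
prefetching idea as I read it from the description above. It is not taken
from the patch: it uses Python threads in place of bgworkers, and a plain
file copy in place of the shell command in restore_command. The names
restore_segment, recover and apply_segment are hypothetical, and the
max_workers argument only plays the role of max_restore_command_workers.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def restore_segment(archive_dir, name, dest_dir):
    """Stand-in for restore_command: copy one archived segment to dest_dir."""
    tmp_path = os.path.join(dest_dir, name + ".partial")
    with open(os.path.join(archive_dir, name), "rb") as src:
        with open(tmp_path, "wb") as dst:
            dst.write(src.read())
    final_path = os.path.join(dest_dir, name)
    # Rename only after the copy is complete, so a reader that looks for
    # the final name never sees a half-written file.
    os.replace(tmp_path, final_path)
    return final_path


def recover(archive_dir, segments, prefetch_dir, max_workers, apply_segment):
    """Apply segments strictly in order, fetching up to max_workers ahead."""
    if max_workers == 0:
        # Original behaviour: fetch one segment, apply it, fetch the next.
        for name in segments:
            path = restore_segment(archive_dir, name, prefetch_dir)
            apply_segment(path)
            os.remove(path)
        return

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {}          # segment name -> future of the running fetch
        next_to_request = 0   # index of the first segment not yet requested
        for name in segments:
            # Hand the next unrequested segments to free workers.
            while next_to_request < len(segments) and len(pending) < max_workers:
                seg = segments[next_to_request]
                pending[seg] = pool.submit(
                    restore_segment, archive_dir, seg, prefetch_dir)
                next_to_request += 1
            # Blocks only if this segment has not been delivered yet.
            path = pending.pop(name).result()
            apply_segment(path)
            os.remove(path)
```

Replay itself stays sequential in this sketch. Only delivery overlaps with
it, so any gain comes from hiding the latency of fetching later segments
behind the application of earlier ones.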

Regards

Pavel


> Regards,
> Dmitry.
>
>
