On Tue, Sep 8, 2020 at 6:51, Dmitry Shulga <d.shu...@postgrespro.ru> wrote:
> Hello hackers,
>
> Currently, database recovery from archive is performed sequentially,
> by reading archived WAL files and applying their records to the database.
>
> Overall archive file processing is done one by one, and this might
> create a performance bottleneck if archived WAL files are delivered slowly,
> because the database server has to wait for arrival of the next
> WAL segment before applying its records.
>
> To address this issue it is proposed to receive archived WAL files in
> parallel, so that when the next WAL segment file is required for
> processing of redo log records it is already available.
>
> Implementation of this approach assumes running several background
> processes (bgworkers), each of which runs a shell command specified by
> the parameter restore_command to deliver an archived WAL file. The number
> of running parallel processes is limited by the new parameter
> max_restore_command_workers. If this parameter has value 0, then WAL file
> delivery is performed using the original algorithm, that is, one by one.
> If this parameter has a value greater than 0, then the database server
> starts several bgworker processes, up to the limit specified by
> max_restore_command_workers, and passes to every process the name of a
> WAL file to deliver. Active processes start prefetching the specified
> WAL files and store received files in the directory pg_wal/pgsql_tmp.
> After a bgworker process finishes receiving a file, it marks itself as
> free and waits for a new request to receive the next WAL file. The main
> process performing database recovery still handles WAL files one by one,
> but instead of waiting for the next required WAL file to become
> available it checks for that file in the prefetch directory. If the
> file is present there, the main process starts processing it.
>
> The patch implementing the described approach is attached to this email.
> The patch contains a test in the file
> src/test/recovery/t/021_xlogrestore.pl
> Although the test result depends on real execution time and could hardly
> be approved for inclusion in the repository, it was added in order to
> show the positive effect of applying the new algorithm. In my
> environment, restoring from archive with parallel prefetching is twice
> as fast as in the original mode.
>

+1, it is an interesting feature

Regards

Pavel

> Regards,
> Dmitry.
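
For readers following the thread, here is a rough Python sketch of the prefetching idea described in the quoted proposal. It is not the patch's code: the real implementation uses bgworkers and restore_command inside the server, while this sketch uses a thread pool and a plain file copy as stand-ins. The names fetch_segment and replay_with_prefetch and the ".part" temporary suffix are hypothetical, chosen only for illustration. It also submits every segment up front, whereas the proposal hands out file names to workers as they become free.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_segment(archive_dir: Path, prefetch_dir: Path, name: str) -> Path:
    """Stand-in for restore_command: copy one archived segment locally."""
    partial = prefetch_dir / (name + ".part")  # hypothetical temp suffix
    shutil.copyfile(archive_dir / name, partial)
    final = prefetch_dir / name
    partial.rename(final)  # rename last so a reader never sees a partial file
    return final


def replay_with_prefetch(archive_dir, prefetch_dir, segments, max_workers):
    """Consume segments strictly in order while workers fetch ahead."""
    applied = []
    if max_workers == 0:
        # Mirrors the "value 0" case: fetch and apply one by one.
        for name in segments:
            path = fetch_segment(archive_dir, prefetch_dir, name)
            applied.append(path.read_bytes())  # placeholder for applying redo
            path.unlink()
        return applied

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Queue every segment; at most max_workers fetches run concurrently.
        futures = [
            pool.submit(fetch_segment, archive_dir, prefetch_dir, name)
            for name in segments
        ]
        for future in futures:
            path = future.result()  # waits only if this segment is not ready
            applied.append(path.read_bytes())  # placeholder for applying redo
            path.unlink()
    return applied
```

The consumer still processes segments in their original order; the only change is that fetching overlaps with applying, which is where the speedup would come from when delivery is slow.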