Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

Heikki Linnakangas Tue, 15 Jun 2021 05:04:28 -0700

On 15/06/2021 14:20, Dilip Kumar wrote:

Design Idea:
-----------------
First, create the target database directory along with the version
file and WAL-log this operation.  Create the "relation map file" in
the target database and copy the content from the source database. For
this, we can use some modified versions of the write_relmap_file() and
WAL-log the relmap create operation along with the file content.  Now,
read the relmap file to find the relfilenode for pg_class and then we
read pg_class block by block and decode the tuples. For reading the
pg_class blocks, we can use ReadBufferWithoutRelCache() so that we
don't need the relcache.  Nothing prevents us from checking visibility
for tuples in another database because CLOG is global to the cluster.
And nothing prevents us from deforming those tuples because the column
definitions for pg_class have to be the same in every database. Then
we can get the relfilenode of every file we need to copy, and prepare
a list of all such relfilenode.

I guess that would work, but you could also walk the database directorylike copydir() does. How you find the relations to copy is orthogonal towhether you WAL-log them or use checkpoints. And whether you use thebuffer cache is also orthogonal to the rest of the proposal; you couldissue FlushDatabaseBuffers() instead of a checkpoint.

Next, for each relfilenode in the
source database, create a respective relfilenode in the target
database (for all forks) using smgrcreate, which is already a
WAL-logged operation.  Now read the source relfilenode block by block
using ReadBufferWithoutRelCache() and copy the block to the target
relfilenode using smgrextend() and WAL-log them using log_newpage().
For the source database, we can not directly use the smgrread(),
because there could be some dirty buffers so we will have to read them
through the buffer manager interface, otherwise, we will have to flush
all the dirty buffers.

Yeah, WAL-logging the contents of the source database would certainly beless weird than the current system. As Julien also pointed out, thequestion is, are there people using on "CREATE DATABASE foo TEMPLATEbar" to copy a large source database, on the premise that it's fastbecause it skips WAL-logging?

In principle, we could have both mechanisms, and use the new WAL-loggedsystem if the database is small, and the old system with checkpoints ifit's large. But I don't like idea of having to maintain both.


- Heikki

Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

Reply via email to