On 6/15/21 3:31 PM, Andrew Dunstan wrote:

On 6/15/21 8:04 AM, Heikki Linnakangas wrote:

Yeah, WAL-logging the contents of the source database would certainly
be less weird than the current system. As Julien also pointed out, the
question is, are there people using on "CREATE DATABASE foo TEMPLATE
bar" to copy a large source database, on the premise that it's fast
because it skips WAL-logging?


I'm 100% certain there are. It's not even a niche case.



In principle, we could have both mechanisms, and use the new
WAL-logged system if the database is small, and the old system with
checkpoints if it's large. But I don't like idea of having to maintain
both.



Rather than use size, I'd be inclined to say use this if the source
database is marked as a template, and use the copydir approach for
anything that isn't.



I think we should be asking what is the benefit of that use case, and perhaps try addressing that without having to maintain two entirely different ways to do CREATE DATABASE. It's not like we're sure the current code is 100% reliable in various corner cases, I doubt having two separate approaches will improve the situation :-/

I can see three reasons why people want to skip the WAL logging:

1) it's faster, because there's no CPU and I/O for building the WAL

  I wonder if some optimization / batching could help with (1), as
  suggested by Andres elsewhere in this thread.

2) it saves the amount of WAL (could matter with large template databases and WAL archiving, etc.)

  We can't really do much about this - we need to log all the data. But
  the batching from (1) might help a bit too, I guess.

3) saves the amount of WAL that needs to be copied to standby, so that there's no increase of replication lag, etc. particularly when the network link has limited bandwidth

  I think this is a more general issue - some operations that may
  generate a lot of WAL, and we generally assume it's better to do
  that rather than hold exclusive locks for long time. But maybe we
  could have some throttling, to limit the amount of WAL per second,
  similarly to what we have to plain vacuum.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply via email to