Re: Make drop database safer

Tomas Vondra Wed, 06 Mar 2019 07:56:27 -0800



On 2/12/19 12:55 AM, Ashwin Agrawal wrote:


Thanks for the response and inputs.

On Sat, Feb 9, 2019 at 4:51 AM Andres Freund wrote:


    Hi,

    On 2019-02-08 16:36:13 -0800, Alexandra Wang wrote:
     > Current sequence of operations for drop database (dropdb())
     > 1. Start Transaction
     > 2. Make catalog changes
     > 3. Drop database buffers
     > 4. Forget database fsync requests
     > 5. Checkpoint
     > 6. Delete database directory
     > 7. Commit Transaction
     >
     > Problem
     > This sequence is unsafe on a couple of fronts. If drop database aborts
     > (the system can crash or shut down) right after the buffers are dropped
     > in step 3, or after step 4, the database will still exist and be fully
     > accessible but will lose the data from the dirty buffers. This seems
     > very bad.
     >
     > The operation can abort after step 5 as well, in which case the entries
     > remain in the catalog but the database is not accessible. This is bad
     > as well, but not as severe as the case above, where the database exists
     > but some stuff goes magically missing.
     >
     > Repro:
     > ```
     > CREATE DATABASE test;
     > \c test
     > CREATE TABLE t1(a int); CREATE TABLE t2(a int); CREATE TABLE t3(a int);
     > \c postgres
     > DROP DATABASE test; <<====== kill the session after DropDatabaseBuffers()
     > (make sure to issue checkpoint before killing the session)
     > ```
     >
     > Proposed ways to fix
     > 1. CommitTransactionCommand() right after step 2. This makes it fully
     > safe as the catalog will have the database dropped. Files may still
     > exist on disk in some cases, which is okay. This also makes it
     > consistent with the approach used in movedb().

    To me this seems bad. The current failure mode obviously isn't good, but
    the data obviously isn't valuable, and just losing track of an entire
    database worth of data seems worse.

So, based on that response, it seems that not losing track of the files associated with the database is a design choice we wish to achieve. Hence the catalog having an entry while the data directory is deleted is fine behavior to have and doesn't need to be solved.

What about adding an 'is dropped' flag to pg_database, setting it to true at the beginning of DROP DATABASE, and committing? And ensuring no one can connect to such a database, making DROP DATABASE the only allowed operation?

ISTM we could then continue doing the same thing we do today, without any extra checkpoints etc.
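
To make the idea concrete, here is a minimal sketch in Python of how such a flag could behave. It is a toy model, not PostgreSQL code: the `Catalog` class, the `is_dropped` field, and the `crash_before_cleanup` switch are all hypothetical names invented for illustration, and buffers, files, and WAL are not modelled at all.

```python
class Catalog:
    """Toy stand-in for pg_database; 'is_dropped' is the hypothetical flag."""

    def __init__(self):
        self.rows = {}  # database name -> {"is_dropped": bool}

    def create(self, name):
        self.rows[name] = {"is_dropped": False}

    def connect(self, name):
        row = self.rows.get(name)
        if row is None:
            raise LookupError(f"database {name!r} does not exist")
        if row["is_dropped"]:
            # Assumption: a flagged database refuses all connections.
            raise PermissionError(f"database {name!r} is being dropped")
        return f"connected to {name}"

    def drop(self, name, crash_before_cleanup=False):
        # Phase 1: set the flag and "commit" it before touching anything else.
        self.rows[name]["is_dropped"] = True
        if crash_before_cleanup:
            # Simulated abort: the flag survives, the cleanup never ran.
            return "crashed"
        # Phase 2: stands in for dropping buffers, deleting the directory
        # and removing the catalog row.
        del self.rows[name]
        return "dropped"


if __name__ == "__main__":
    catalog = Catalog()
    catalog.create("test")
    print(catalog.drop("test", crash_before_cleanup=True))
    try:
        catalog.connect("test")
    except PermissionError as exc:
        print(exc)
    print(catalog.drop("test"))  # re-running the drop finishes the job
```

In this model an interrupted drop leaves a database that nobody can connect to, so partially discarded data is never visible, and repeating DROP DATABASE is the only way forward.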

     > 2. An alternative way to make it safer is to perform the Checkpoint
     > (step 5) just before dropping database buffers, to avoid the unsafe
     > nature. Caveats of this solution are:
     > - Performs IO for data which in the success case will get deleted anyway
     > - Still doesn't cover the case where the catalog has the database entry
     > but files are removed from disk

    That seems like an unacceptable slowdown.

Given that dropping a database should be an infrequent operation, and the only additional IO cost is for the buffers of that database itself since a Checkpoint is performed in a later step anyway, is it really an unacceptable slowdown compared to the safety it brings?

That's probably true, although I do know quite a few systems that create and drop databases fairly often. And the implied explicit checkpoints are quite painful, so I'd vote not to make this worse.

FWIW I don't recall why exactly we need the checkpoints, except perhaps to ensure the file copies see the most recent data (in CREATE DATABASE) and to evict stuff for the to-be-dropped database from shared buffers. I wonder if we could do that without a checkpoint somehow ...

     > 3. One more fancier approach is to use the pending delete mechanism
     > used by relation drops to perform these non-catalog related activities
     > at commit. The pending delete structure can easily be given a boolean
     > to convey database directory dropping instead of a file. Given drop
     > database can't be performed inside a transaction, it need not be done
     > this way, but this makes it one consistent approach used to deal with
     > on-disk removal.

    ISTM we'd need to do something like this.

Given the above design choice to retain the link to the database files until they are actually deleted, I'm not seeing why the pending delete approach is any better than approach 1. This approach will allow us to track the database oid in the commit transaction xlog record, but any checkpoint after that still loses the reference to the database. That is the same as in approach 1, where a separate xlog record XLOG_DBASE_DROP is written just after committing the transaction.

When we proposed approach 3, we thought it was functionally the same as approach 1 and differed only in implementation. But your preference for this approach, and calling approach 1 bad, reads as if the pending deletes approach is functionally different. We would like to hear more about how.


Hmmm, I don't see how this is an improvement over option #1 either.

Considering the design choice we must meet, it seems approach 2, moving the Checkpoint from step 5 to before step 3, would give us the safety desired and retain the desired link to the database until we actually delete the files for it.


Ummm? That essentially means this order:

1. Start Transaction
2. Make catalog changes
5. Checkpoint
3. Drop database buffers
4. Forget database fsync requests
6. Delete database directory
7. Commit Transaction

I don't see how that actually fixes any of the issues? Can you explain?

Not to mention we might end up doing quite a bit of I/O to checkpoint buffers from the database that is going to disappear shortly ...
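
For reference, the two orderings can be compared with a small Python toy model that reports the state left behind when the drop aborts after a given number of steps. This is only a sketch under strong simplifying assumptions, not a description of what PostgreSQL actually does: the step names and the `state_after_crash` helper are made up for illustration, the database is assumed to start with dirty buffers, a checkpoint is assumed to flush them, an abort is assumed to roll back the catalog changes, and WAL replay, concurrent activity, and I/O cost are ignored.

```python
# Steps between "Start Transaction" and "Commit Transaction".
CURRENT_ORDER = ["catalog", "drop_buffers", "forget_fsync", "checkpoint", "delete_dir"]
PROPOSED_ORDER = ["catalog", "checkpoint", "drop_buffers", "forget_fsync", "delete_dir"]


def state_after_crash(order, completed):
    """State left behind if the drop aborts after `completed` steps of `order`."""
    dirty = True        # assumption: the database has unwritten dirty buffers
    lost = False        # were dirty buffers discarded without being written?
    dir_exists = True
    for step in order[:completed]:
        if step == "checkpoint":
            dirty = False               # dirty buffers reach disk
        elif step == "drop_buffers":
            if dirty:
                lost = True             # discarded before being written
            dirty = False
        elif step == "delete_dir":
            dir_exists = False
    # The transaction never committed, so the catalog entry is still there.
    if not dir_exists:
        return "catalog entry without files"
    if lost:
        return "accessible, dirty data lost"
    return "intact"


if __name__ == "__main__":
    for label, order in (("current", CURRENT_ORDER), ("proposed", PROPOSED_ORDER)):
        for completed in range(1, len(order) + 1):
            print(label, order[completed - 1], "->", state_after_crash(order, completed))
```

Under these assumptions the reordering only changes the outcome of an abort between dropping the buffers and deleting the directory; an abort after the directory is deleted still leaves a catalog entry without files in both orderings.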


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
