Are all unlogged tables in any case truncated after a server-crash?
Hi everyone, every few weeks I use Postgres ability, to import huge data sets very fast by means of "unlogged tables". The bulk load (consisting of plenty "copy"- & DML-Stmts) and the spatial index creation afterwards, takes about 5 hours on a proper server (pg12.7 & PostGIS-Extension). After that all unlogged tables remain completely unchanged (no DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get never "unclean" and should not be truncated after a server crash. BTW, if I set all unlogged tables to logged after bulk load, it takes additional 1.5 hours, mainly because of re-indexing, I suppose. I assume that a restart of the database after a server crash takes another 1.5 hours (reading from WAL) until the database is up and running. Therefore I am seeking a strategy, to not tagging those tables as "unclean" and not truncating all unlogged tables on server restart. Cheers and regards.
Re: Are all unlogged tables in any case truncated after a server-crash?
Hi David, thx for your comments and your advice on reading docs on "checkpoint". Of course consistency is most important to any DBMS, and if in doubt about that, truncate data rows and restore from WAL. But in this case, where data is never modified after bulk load, I thought there might be an undocumented feature or workaround, like ... - option to set the datafiles of those tables in read-only mode and record this in the metadata - on server-recovery spare these unlogged tables and indexes from truncating all data rows Its truly a "nice to have"-thing, but I have learned now, that there is not feature like that. Mart Am 11.11.2021 um 22:10 schrieb David G. Johnston: On Thu, Nov 11, 2021 at 11:39 AM wrote: After that all unlogged tables remain completely unchanged (no DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get never "unclean" and should not be truncated after a server crash. The server cannot make this assumption so it truncates unlogged relations upon an unclean shutdown/crash because it has no WAL with which to ensure a proper restoration. BTW, if I set all unlogged tables to logged after bulk load, it takes additional 1.5 hours, mainly because of re-indexing, I suppose. More likely it is writing the entire table, and all of its indexes, to WAL. I assume that a restart of the database after a server crash takes another 1.5 hours (reading from WAL) until the database is up and running. That would be incorrect. See "CHECKPOINT". Therefore I am seeking a strategy, to not tagging those tables as "unclean" and not truncating all unlogged tables on server restart. There is no middle ground that I am aware of. Either the contents of the table are in WAL ,or they are not. If not, they can be lost upon an unclean shutdown. For manually initiated shutdowns you do have the option to do so cleanly. This topic (unlogged optimizations) does draw quite a bit of attention every year but so far the problem of proving to the system that the physical file on disk is a truly accurate representation of the post-crash relation is yet unsolved. David J.
Re: Are all unlogged tables in any case truncated after a server-crash?
Am 12.11.2021 um 08:41 schrieb Laurenz Albe: On Thu, 2021-11-11 at 18:39 +, sch...@posteo.de wrote: every few weeks I use Postgres ability, to import huge data sets very fast by means of "unlogged tables". The bulk load (consisting of plenty "copy"- & DML-Stmts) and the spatial index creation afterwards, takes about 5 hours on a proper server (pg12.7 & PostGIS-Extension). After that all unlogged tables remain completely unchanged (no DML-/DDL-Statements). Hence all of my huge unlogged, "static" tables get never "unclean" and should not be truncated after a server crash. There is no way to achieve that. But you could keep the "huge data sets" around and load them again if your server happens to crash (which doesn't happen often, I hope). Thx Laurenz for yr reply! Yes, that's what we did after server crashes (~ 2/yr on different locations). But the system is at least 5 hours offline plus the time until the admin manually re-starts the bulk loads. On my system, I have 6 databases configured like this. For all I have to redo the bulk loads. I hoped there was a 'switch' on crash-recovery, to avoid truncating the datafiles of these unlogged tables, which are definitely in a perfect condition. Mart Yours, Laurenz Albe