Are all unlogged tables in any case truncated after a server-crash?

2021-11-11 Thread sch8el

Hi everyone,

Every few weeks I use Postgres's ability to import huge data sets very 
fast by means of "unlogged tables". The bulk load (consisting of many 
COPY and DML statements) and the subsequent spatial index creation take 
about 5 hours on a proper server (pg12.7 & PostGIS extension). After 
that, all unlogged tables remain completely unchanged (no DML or DDL 
statements). Hence all of my huge unlogged, "static" tables never 
become "unclean" and should not be truncated after a server crash.


BTW, if I set all unlogged tables to logged after the bulk load, it 
takes an additional 1.5 hours, mainly because of re-indexing, I suppose. 
I assume that restarting the database after a server crash would take 
another 1.5 hours (reading from WAL) until the database is up and running.


Therefore I am looking for a strategy to avoid tagging those tables as 
"unclean" and to avoid truncating all unlogged tables on server restart.



Cheers and regards.





Re: Are all unlogged tables in any case truncated after a server-crash?

2021-11-12 Thread sch8el

Hi David,

Thanks for your comments and your advice to read the docs on "checkpoint".

Of course consistency is most important to any DBMS, and if in doubt 
about that, truncate the data rows and restore from WAL.
But in this case, where the data is never modified after the bulk load, 
I thought there might be an undocumented feature or workaround, like ...
  - an option to set the datafiles of those tables to read-only mode and 
record this in the metadata
  - on server recovery, spare these unlogged tables and indexes from 
having all their data rows truncated


It's truly a "nice to have" thing, but I have learned now that there is 
no feature like that.


Mart


On 11.11.2021 at 22:10, David G. Johnston wrote:

On Thu, Nov 11, 2021 at 11:39 AM  wrote:

After that, all unlogged tables remain completely unchanged (no DML or
DDL statements). Hence all of my huge unlogged, "static" tables never
become "unclean" and should not be truncated after a server crash.


The server cannot make this assumption so it truncates unlogged 
relations upon an unclean shutdown/crash because it has no WAL with 
which to ensure a proper restoration.


BTW, if I set all unlogged tables to logged after the bulk load, it
takes an additional 1.5 hours, mainly because of re-indexing, I suppose.


More likely it is writing the entire table, and all of its indexes, to 
WAL.
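
For readers who want to try the conversion discussed here, the following is a minimal sketch of how the statements could be generated after a bulk load. `ALTER TABLE ... SET LOGGED` and the `pg_class.relpersistence` column are standard PostgreSQL; the helper names and the example table names are hypothetical and not from the thread, and the script only prints SQL rather than connecting to a server.

```python
# Sketch only: builds the SQL for switching unlogged tables to logged.
# Helper names and table names are hypothetical, not from the thread.

# Catalog query listing unlogged ordinary tables (relpersistence = 'u').
LIST_UNLOGGED_SQL = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 'u' AND c.relkind = 'r'
ORDER BY 1, 2;
"""


def quote_ident(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def set_logged_statements(tables):
    """Return one ALTER TABLE ... SET LOGGED statement per (schema, table)."""
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} SET LOGGED;"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Example input; in practice the pairs would come from LIST_UNLOGGED_SQL.
    for stmt in set_logged_statements([("public", "roads"), ("public", "parcels")]):
        print(stmt)
```

Each generated statement triggers the rewrite into WAL described above, so the extra time scales with the size of the table and its indexes.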


I assume that restarting the database after a server crash would take
another 1.5 hours (reading from WAL) until the database is up and running.


That would be incorrect.  See "CHECKPOINT".


Therefore I am looking for a strategy to avoid tagging those tables as
"unclean" and to avoid truncating all unlogged tables on server restart.


There is no middle ground that I am aware of.  Either the contents of 
the table are in WAL, or they are not.  If not, they can be lost upon 
an unclean shutdown.  For manually initiated shutdowns you do have the 
option to do so cleanly.
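
As an illustration of the clean-shutdown option, here is a small sketch that composes a `pg_ctl stop` command. The `smart`, `fast`, and `immediate` modes are real `pg_ctl` shutdown modes; the function names and the data directory path are placeholders, and the sketch only builds the command without running it.

```python
# Sketch only: composes a pg_ctl stop command; it does not execute it.
# The data directory path used below is a placeholder, not from the thread.

# "smart" and "fast" shut the server down cleanly. "immediate" aborts
# without a clean shutdown, so the next start runs crash recovery and
# unlogged tables are reset to empty.
CLEAN_MODES = {"smart", "fast"}
ALL_MODES = CLEAN_MODES | {"immediate"}


def stop_command(data_dir, mode="fast"):
    """Return the argument list for stopping the server in the given mode."""
    if mode not in ALL_MODES:
        raise ValueError(f"unknown shutdown mode: {mode}")
    return ["pg_ctl", "stop", "-D", data_dir, "-m", mode]


def keeps_unlogged_data(mode):
    """True if unlogged tables are expected to survive this shutdown mode."""
    return mode in CLEAN_MODES


if __name__ == "__main__":
    print(" ".join(stop_command("/path/to/data", "fast")))
```

This only helps with planned restarts; a real crash or power loss is still an unclean shutdown.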


This topic (unlogged optimizations) does draw quite a bit of attention 
every year, but so far the problem of proving to the system that the 
physical file on disk is a truly accurate representation of the 
post-crash relation remains unsolved.


David J.



Re: Are all unlogged tables in any case truncated after a server-crash?

2021-11-12 Thread sch8el




On 12.11.2021 at 08:41, Laurenz Albe wrote:

On Thu, 2021-11-11 at 18:39 +, sch...@posteo.de wrote:

Every few weeks I use Postgres's ability to import huge data sets very
fast by means of "unlogged tables". The bulk load (consisting of many
COPY and DML statements) and the subsequent spatial index creation take
about 5 hours on a proper server (pg12.7 & PostGIS extension). After
that, all unlogged tables remain completely unchanged (no DML or DDL
statements). Hence all of my huge unlogged, "static" tables never
become "unclean" and should not be truncated after a server crash.

There is no way to achieve that.

But you could keep the "huge data sets" around and load them again if
your server happens to crash (which doesn't happen often, I hope).

Thanks, Laurenz, for your reply! Yes, that's what we did after server 
crashes (about 2 per year at different locations).
But the system is offline for at least 5 hours, plus the time until the 
admin manually restarts the bulk loads. On my system, I have 6 databases 
configured like this. For all of them I have to redo the bulk loads.
I had hoped there was a 'switch' for crash recovery to avoid truncating 
the datafiles of these unlogged tables, which are definitely in perfect 
condition.


Mart


Yours,
Laurenz Albe
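
Since the thread ends with reloading as the only option, the wait for an admin to notice and restart the bulk loads could perhaps be shortened by checking after each restart which static tables came back empty. The following is a rough sketch of that idea, assuming the static tables are never legitimately empty; all function and table names are hypothetical, and the sketch only builds the check query and evaluates results supplied by the caller.

```python
# Sketch only: decides which static tables need a reload after a restart.
# Table names and the result mapping are hypothetical examples.


def emptiness_check_sql(qualified_table):
    """Return SQL that yields true when the table has at least one row.

    The caller is expected to pass a trusted, already-quoted table name.
    """
    return f"SELECT EXISTS (SELECT 1 FROM {qualified_table});"


def tables_needing_reload(has_rows):
    """Given {table_name: has_rows}, return the sorted names found empty."""
    return sorted(name for name, present in has_rows.items() if not present)


if __name__ == "__main__":
    print(emptiness_check_sql("public.roads"))
    # Example results, as if collected by running the check per table.
    print(tables_needing_reload({"public.roads": False, "public.parcels": True}))
```

A scheduled job could run such a check on each of the six databases and start the bulk load for every table it reports.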