Here's an attempt to summarize the remaining issues with this patch that I know about. I may have forgotten something, so please mention it if you notice something missing.
1. pg_dump needs an option to control whether unlogged tables are dumped. --no-unlogged-tables seems like the obvious choice, assuming we want the default to be to dump them, which seems like the safest option.

2. storage.sgml likely needs to be updated. We have a section on the free space map and one on the visibility map, so I suppose the logical thing to do is add a similar section on the initialization fork.

3. It's unnecessary to include unlogged relation buffers in non-shutdown checkpoints. I've recently realized that this is true independently of whether or not we want unlogged tables to survive a clean shutdown. Whether we can survive a clean shutdown is a function of whether we register dirty segments when buffers are written, which is independent of whether we choose to write such buffers as part of a checkpoint. And indeed, unless we're about to shut down, there's no reason to do so, because the whole point of checkpointing is to advance the redo pointer, and that's irrelevant for unlogged tables.

4. It's arguably unnecessary to register dirty segments for unlogged relations. Given #3, this now seems a little less important. If the unlogged relation is hot and fits in shared_buffers, then omitting it from the checkpoint process means we'll never write out those dirty buffers, so the fact that they'd cause fsyncs if we did write them doesn't matter. However, it's still not totally irrelevant, because a relation that fits in the OS buffer cache but not in shared buffers will probably generate fsyncs at every checkpoint. (And on the third hand, the OS may decide to write the dirty data anyway, especially if it's a largish percentage of RAM.) There are a few possible ways of dealing with this:

4A. The solution Andres proposed: iterate through all unlogged relations at shutdown time and fsync them all. Handling fsync failures could be complicated.

4B. Another idea I just thought of: register dirty segments as normal, but teach the background writer to accumulate them in a separate queue that is only flushed at shutdown, or when it reaches some maximum size, rather than at every checkpoint.

4C. Decree that this is an area for future enhancement and forget about it for now. I am leaning toward this option.

5. Make it work with GiST indexes. Per discussion on the other thread, the current proposal seems to be: (a) add a BM_FLUSH_XLOG bit; when clear, don't flush XLOG; this then allows pages to have fake LSNs; (b) add an XLogRecPtr structure in shared memory, protected by a spinlock; (c) use the structure described in (b) to generate fake LSNs every time an operation is performed on an unlogged GiST index. (A toy sketch of that counter idea is appended at the end of this mail.) I am not clear on how we make this work across shutdowns - it seems you'd need to save this structure somewhere during a clean shutdown (where?) and restore it on startup, unless we go back to truncating even on a clean shutdown.

6. Make it work with GIN indexes. I haven't looked at what's involved here yet.

Advice, comments, feedback appreciated... I'd like to put this one to bed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
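
P.S. To make point 5 a bit more concrete, here is a toy, standalone model of the fake-LSN counter described in (b)/(c). It is only a sketch: the names are made up for illustration, and in the real patch the counter would be an XLogRecPtr living in shared memory and protected by a backend spinlock rather than a pthread mutex. The point is just that every operation on an unlogged GiST index gets stamped with a strictly increasing value, so LSN ordering within the index still holds even though nothing is written to XLOG.

    /*
     * Toy, standalone model of the fake-LSN counter from point 5 --
     * NOT actual backend code.  A pthread mutex and a plain uint64_t
     * stand in for the spinlock-protected XLogRecPtr in shared memory.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <pthread.h>

    typedef struct
    {
        pthread_mutex_t lock;      /* stand-in for a backend spinlock */
        uint64_t        counter;   /* stand-in for the shared XLogRecPtr */
    } FakeLSNState;

    static FakeLSNState fakeLSN = { PTHREAD_MUTEX_INITIALIZER, 1 };

    /* Hand out the next fake LSN; the name is invented for this sketch. */
    static uint64_t
    GetFakeLSN(void)
    {
        uint64_t    result;

        pthread_mutex_lock(&fakeLSN.lock);
        result = fakeLSN.counter++;
        pthread_mutex_unlock(&fakeLSN.lock);
        return result;
    }

    int
    main(void)
    {
        /*
         * Each "operation" on an unlogged GiST index would stamp its page
         * with the next value returned here.
         */
        for (int i = 0; i < 3; i++)
            printf("fake LSN %llu\n", (unsigned long long) GetFakeLSN());
        return 0;
    }

The unresolved part is the one mentioned in point 5: where that counter value would be saved at a clean shutdown and restored at startup, if we don't truncate.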