Greg Stark wrote:
> I think double buffering solves the torn page problem but not the lack
> of WAL logging. Alvaro solved the WAL logging by deferring the WAL
> logs. But I'm not sure how confident we are that it's logging enough.

Right now, the patch WAL-logs HeapTupleHeader hint bits (infomask and
infomask2) and ItemId (line pointer) flags.  Page pd_flags are excluded
from the CRC; this is easy to do because they sit at a constant offset
in the page, so I'm just skipping those bytes in CRC_COMP().
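
As a rough illustration of skipping a fixed byte range while computing a
page CRC, here is a minimal sketch. It is not the patch's code: the
DEMO_* constants, the page size, the flags offset, and the function names
are all hypothetical, and a plain bitwise CRC-32 stands in for whatever
CRC_COMP() does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout constants, chosen only for this demo. */
#define DEMO_PAGE_SIZE 8192
#define DEMO_FLAGS_OFF 10   /* assumed offset of the flag bytes */
#define DEMO_FLAGS_LEN 2    /* assumed width of the flag bytes */

/* Feed len bytes into a running (reflected, bitwise) CRC-32. */
static uint32_t
demo_crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* CRC over the whole page except the flag bytes at a constant offset. */
static uint32_t
demo_page_crc(const unsigned char *page)
{
    uint32_t crc = 0xFFFFFFFFu;

    /* bytes before the flags */
    crc = demo_crc32_update(crc, page, DEMO_FLAGS_OFF);
    /* bytes after the flags; the flag bytes themselves are never fed in */
    crc = demo_crc32_update(crc,
                            page + DEMO_FLAGS_OFF + DEMO_FLAGS_LEN,
                            DEMO_PAGE_SIZE - DEMO_FLAGS_OFF - DEMO_FLAGS_LEN);
    return crc ^ 0xFFFFFFFFu;
}
```

With this shape, changing the flag bytes leaves the CRC untouched, while
a change anywhere else in the page alters it.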

So what's still missing or unsolved:
- btree hint bits are not WAL-logged yet.
- bgwriter calls XLogInsert during shutdown, to WAL-log the hint bits
  of unwritten pages.  This triggers the PANIC about concurrent WAL
  activity during checkpoint.  (The easy solution to this problem is
  just to remove the check; another idea is to flush the buffers before
  grabbing the final address to watch for at shutdown.)

> I'm beginning to think just excluding the hint bits would be simpler and
> safer. If we're double buffering then it might be possible to do that
> pretty cheaply. Copy the whole buffer with memcpy then loop through the
> line pointers unsetting the hint bits. Then do the CRC. Though that would
> prevent us from doing "zero-copy" CRC by doing it in the copy.

This can probably be made to work, and it solves the problem of
bgwriter calling XLogInsert during shutdown, since hint bits that are
excluded from the CRC no longer need to be WAL-logged.  I would create
new routines to clear hint bits in all involved modules
(heap_resethintbits, btree_%, item_%, page_%), and call them on a copy
of the page.

The downside to this idea is that we need to create a copy of the page
and call those routines when we read the page in, too.
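
To make the copy-then-clear idea concrete, here is a minimal sketch under
heavy assumptions. The DemoPage and DemoTuple structs, DEMO_HINT_MASK,
and every demo_* function are hypothetical stand-ins, not PostgreSQL's
real page layout or the proposed heap_resethintbits; only infomask is
treated as carrying hint bits, and a plain bitwise CRC-32 is used. The
same helper serves both the write side and the read side, which is where
the extra copy on read comes from.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_TUPLES 4
#define DEMO_HINT_MASK  0x0F00u  /* hypothetical: infomask bits that are hints */

typedef struct DemoTuple
{
    uint16_t infomask;   /* carries hint bits in this demo */
    uint16_t infomask2;  /* left alone here, for brevity */
    uint32_t payload;
} DemoTuple;

typedef struct DemoPage
{
    uint32_t  ntuples;
    DemoTuple tuples[DEMO_MAX_TUPLES];
} DemoPage;

/* Plain reflected bitwise CRC-32 over a buffer. */
static uint32_t
demo_crc32(const void *data, size_t len)
{
    const unsigned char *buf = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Stand-in for a per-module "reset hint bits" routine. */
static void
demo_reset_hint_bits(DemoPage *page)
{
    for (uint32_t i = 0; i < page->ntuples && i < DEMO_MAX_TUPLES; i++)
        page->tuples[i].infomask &= (uint16_t) ~DEMO_HINT_MASK;
}

/* Copy the page, clear hints on the copy, CRC the copy. */
static uint32_t
demo_crc_without_hints(const DemoPage *page)
{
    DemoPage copy;

    memcpy(&copy, page, sizeof(copy));
    demo_reset_hint_bits(&copy);
    return demo_crc32(&copy, sizeof(copy));
}

/* Read side: the same copy-and-clear pass is needed before comparing. */
static int
demo_page_verify(const DemoPage *page, uint32_t stored_crc)
{
    return demo_crc_without_hints(page) == stored_crc;
}
```

Because the hints are cleared only on the copy, the original page keeps
its hint bits, and setting a hint after the CRC was stored does not make
verification fail.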

Alvaro Herrera                      
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
