On 22.12.2010 16:52, Simon Riggs wrote:
> On Wed, 2010-12-22 at 16:22 +0200, Heikki Linnakangas wrote:
>> On 22.12.2010 15:59, Simon Riggs wrote:
>>> On Wed, 2010-12-22 at 15:30 +0200, Heikki Linnakangas wrote:
>>>> My gut feeling is that a reasonable compromise is to set hint bits like
>>>> we do today, but don't mark the page as dirty when only hint bits are
>>>> set. That way you get the benefit of hint bits for tuples that are
>>>> frequently accessed and stay in buffer cache. But you don't spend any
>>>> extra I/O to set them. I'd really like to see a worst-case scenario
>>>> benchmark of a patch that does that.
>>>
>>> That sounds great, but still prevents block checksums and that is a very
>>> valuable feature for robustness.
>>
>> It does? The problem with block checksums is that if you modify a page
>> and don't have a corresponding WAL record for it, like a hint bit
>> update, you can have a torn page so that the checksum doesn't match.
>> Refraining from dirtying the page when a hint bit is updated avoids the
>> problem. With that change, we only ever write pages to disk that have a
>> WAL record associated with them, with full-page images as necessary to
>> avoid torn pages.
>
> Which then leads to a block CRC not matching the block in memory.

What do you mean?

Do you envision that the CRC is calculated at every update, or only when a page is written out from the buffer cache? If the former, you could recalculate the CRC at a hint bit update too. If the latter, the hint bits are included in the page image that you checksum just like any other data.

> So what you suggest works only if we restrict CRC checking to blocks
> incoming to the buffer cache, but leaves us unable to do CRC checks on
> blocks once in the buffer cache. Since many blocks stay in cache almost
> constantly, we're left with the situation that the most heavily used
> parts of the database seldom get CRC checked.

There's plenty of stuff in memory that's not covered by an application-level CRC. That's what ECC RAM is for. Updating the CRC at every update to a page seems really expensive, but it's an orthogonal issue to hint bits.
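
To make the proposal concrete, here is a rough toy model in Python, not PostgreSQL code. It assumes the second option above, where the CRC is calculated only when a page is written out from the buffer cache. The names Buffer, Disk, flush and HINT_COMMITTED are made up for illustration, and WAL itself is not modeled; a "WAL-logged update" is simply any change that marks the page dirty.

```python
import zlib

PAGE_SIZE = 8192
HINT_COMMITTED = 0x01  # hypothetical hint bit flag, for illustration only


class Buffer:
    """Toy model of one cached page; not PostgreSQL's buffer manager."""

    def __init__(self):
        self.page = bytearray(PAGE_SIZE)
        self.dirty = False

    def wal_logged_update(self, offset, data):
        # A change that would have a WAL record: the page must be written out.
        self.page[offset:offset + len(data)] = data
        self.dirty = True

    def set_hint_bit(self, offset, flag):
        # Hint-bit-only change: update the in-memory page, leave dirty alone.
        self.page[offset] |= flag


class Disk:
    """Toy block store that keeps a CRC next to each page image."""

    def __init__(self):
        self.blocks = {}  # block number -> (page image, crc)
        self.writes = 0

    def write(self, blkno, page):
        # The CRC covers exactly the image being written, hint bits included.
        image = bytes(page)
        self.blocks[blkno] = (image, zlib.crc32(image))
        self.writes += 1

    def read(self, blkno):
        # Verify the CRC on the way into the buffer cache.
        image, crc = self.blocks[blkno]
        if zlib.crc32(image) != crc:
            raise IOError("checksum mismatch on block %d" % blkno)
        return bytearray(image)


def flush(disk, blkno, buf):
    # Only dirty pages cause I/O.
    if buf.dirty:
        disk.write(blkno, buf.page)
        buf.dirty = False


disk = Disk()
buf = Buffer()

buf.wal_logged_update(100, b"tuple")
flush(disk, 0, buf)                   # first write

buf.set_hint_bit(99, HINT_COMMITTED)
flush(disk, 0, buf)                   # no write: hint bits did not dirty the page
writes_after_hint = disk.writes

buf.wal_logged_update(200, b"other")
flush(disk, 0, buf)                   # second write carries the hint bit along
page_read_back = disk.read(0)         # CRC verifies against the written image
```

In this model the hint bit costs no extra I/O, and it reaches disk only as part of a page image that was going to be written anyway, so the stored CRC always matches the stored image. What it does not give you is any check on the copy sitting in the buffer cache.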

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com