On 2015-02-16 11:30:20 +0000, Syed, Rahila wrote:
> - * As a trivial form of data compression, the XLOG code is aware that
> - * PG data pages usually contain an unused "hole" in the middle, which
> - * contains only zero bytes.  If hole_length > 0 then we have removed
> - * such a "hole" from the stored data (and it's not counted in the
> - * XLOG record's CRC, either).  Hence, the amount of block data actually
> - * present is BLCKSZ - hole_length bytes.
> + * Block images are able to do several types of compression:
> + * - When wal_compression is off, as a trivial form of compression, the
> + * XLOG code is aware that PG data pages usually contain an unused "hole"
> + * in the middle, which contains only zero bytes.  If length < BLCKSZ
> + * then we have removed such a "hole" from the stored data (and it is
> + * not counted in the XLOG record's CRC, either).  Hence, the amount
> + * of block data actually present is "length" bytes.  The hole "offset"
> + * on page is defined using "hole_offset".
> + * - When wal_compression is on, block images are compressed using a
> + * compression algorithm without their hole to improve compression
> + * process of the page. "length" corresponds in this case to the length
> + * of the compressed block. "hole_offset" is the hole offset of the page,
> + * and the length of the uncompressed block is defined by "raw_length",
> + * whose data is included in the record only when compression is enabled
> + * and "with_hole" is set to true, see below.
> + *
> + * "is_compressed" is used to identify if a given block image is compressed
> + * or not. Maximum page size allowed on the system being 32k, the hole
> + * offset cannot be more than 15-bit long so the last free bit is used to
> + * store the compression state of block image. If the maximum page size
> + * allowed is increased to a value higher than that, we should consider
> + * increasing this structure size as well, but this would increase the
> + * length of block header in WAL records with alignment.
> + *
> + * "with_hole" is used to identify the presence of a hole in a block image.
> + * As the length of a block cannot be more than 15-bit long, the extra bit in
> + * the length field is used for this identification purpose. If the block image
> + * has no hole, it is ensured that the raw size of a compressed block image is
> + * equal to BLCKSZ, hence the contents of XLogRecordBlockImageCompressionInfo
> + * are not necessary.
>   */
>  typedef struct XLogRecordBlockImageHeader
>  {
> -    uint16    hole_offset;      /* number of bytes before "hole" */
> -    uint16    hole_length;      /* number of bytes in "hole" */
> +    uint16    length:15,        /* length of block data in record */
> +              with_hole:1;      /* status of hole in the block */
> +
> +    uint16    hole_offset:15,   /* number of bytes before "hole" */
> +              is_compressed:1;  /* compression status of image */
> +
> +    /* Followed by the data related to compression if block is compressed */
>  } XLogRecordBlockImageHeader;

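For reference, each of the two uint16 fields in the quoted header pairs a 15-bit value with a 1-bit flag. A minimal sketch of that packing with explicit masks instead of bitfields, so the bit positions are visible; the img_* names and both macros are hypothetical and not part of the patch, and the choice of the top bit for the flag is an assumption (bitfield layout is compiler-dependent):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; nothing here is from the patch. */
#define IMG_LOW15_MASK  0x7FFFu     /* low 15 bits: length or hole offset */
#define IMG_FLAG_BIT    0x8000u     /* top bit: with_hole or is_compressed */

/* Pack a 15-bit value and a 1-bit flag into a single 16-bit word. */
static uint16_t
img_pack(uint16_t value, bool flag)
{
    assert(value <= IMG_LOW15_MASK);    /* value must fit in 15 bits */
    return (uint16_t) (value | (flag ? IMG_FLAG_BIT : 0u));
}

/* Extract the 15-bit value (length or hole offset). */
static uint16_t
img_value(uint16_t packed)
{
    return (uint16_t) (packed & IMG_LOW15_MASK);
}

/* Extract the 1-bit flag (with_hole or is_compressed). */
static bool
img_flag(uint16_t packed)
{
    return (packed & IMG_FLAG_BIT) != 0;
}
```

Every reader of the header has to mask before using either field, which is the kind of overloading the reply below objects to.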
Yikes, this is ugly.

I think we should change the xlog format so that the block_id (which
currently is XLR_BLOCK_ID_DATA_SHORT/LONG or an actual block id) isn't
the block id but something like XLR_CHUNK_ID. That is used as-is for
XLR_CHUNK_ID_DATA_SHORT/LONG, but for backup blocks it can be set to
XLR_CHUNK_BKP_WITH_HOLE, XLR_CHUNK_BKP_COMPRESSED,
XLR_CHUNK_BKP_REFERENCE... The BKP blocks will then follow, storing the
block id following the chunk id.

Yes, that'll increase the amount of data for a backup block by 1 byte,
but I think that's worth it. I'm pretty sure we will be happy about the
added extensibility pretty soon.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
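To make the chunk-id proposal concrete, here is a rough sketch of how a writer might emit the leading bytes under such a scheme. The XLR_CHUNK_* names come from the mail above; the numeric values, the XlrChunkId enum, the treatment of the three BKP ids as mutually exclusive, and both helper functions are assumptions made only for illustration, not an existing PostgreSQL API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Names follow the mail; the numeric values are hypothetical. */
typedef enum XlrChunkId
{
    XLR_CHUNK_BKP_WITH_HOLE = 1,
    XLR_CHUNK_BKP_COMPRESSED = 2,
    XLR_CHUNK_BKP_REFERENCE = 3,
    XLR_CHUNK_ID_DATA_LONG = 254,
    XLR_CHUNK_ID_DATA_SHORT = 255
} XlrChunkId;

/* Hypothetical helper: does this chunk id introduce a backup block? */
static bool
chunk_is_bkp(XlrChunkId chunk_id)
{
    return chunk_id == XLR_CHUNK_BKP_WITH_HOLE ||
           chunk_id == XLR_CHUNK_BKP_COMPRESSED ||
           chunk_id == XLR_CHUNK_BKP_REFERENCE;
}

/*
 * Hypothetical writer: emit the chunk id and, for backup blocks only,
 * the block id right after it.  Returns the number of bytes written
 * (1 for data chunks, 2 for backup blocks).
 */
static size_t
write_chunk_header(uint8_t *buf, XlrChunkId chunk_id, uint8_t block_id)
{
    size_t      n = 0;

    assert(buf != NULL);
    buf[n++] = (uint8_t) chunk_id;
    if (chunk_is_bkp(chunk_id))
        buf[n++] = block_id;    /* the extra byte mentioned in the mail */
    return n;
}
```

With the chunk type carried in its own byte, the image header would no longer need to steal bits from length and hole_offset, and a new chunk type becomes a new enum value rather than another flag bit.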