On 2018-01-12 17:43:00 -0500, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2018-01-12 17:24:54 -0500, Tom Lane wrote:
> >> Andres Freund <and...@anarazel.de> writes:
> >>> Right. I wonder if it be reasonable to move that to a page's header
> >>> instead of individual records?  To avoid torn page issues we'd have to
> >>> reduce the page size to a sector size, but I'm not sure that's that bad?
>
> >> Giving up a dozen or two bytes out of every 512 sounds like quite an
> >> overhead.
>
> > It's not nothing, that's true. But if it avoids 8 bytes in every record,
> > that'd probably at least as much in most usecases.
>
> Fair point.  I don't have a very good handle on what "typical" WAL record
> sizes are, but we might be fine with that --- some quick counting on the
> fingers says we'd break even with an average record size of ~160 bytes,
> and be ahead below that.

This is far from a definitive answer, but here's some data:

pgbench -i -s 100 -q:

Type        N      (%)   Record size      (%)     FPI size      (%)   Combined size      (%)
----        -      ---   -----------      ---     --------      ---   -------------      ---
Total  308958             1077269060 [84.19%]    202269468 [15.81%]      1279538528 [100%]

So here records are really large, which makes sense, given it's largely
initialization of data. With wal_compression that'd probably look
different, but records would still commonly span multiple pages.

pgbench -M prepared -c 16 -j 16 -T 100

Type          N    (%)   Record size      (%)     FPI size      (%)   Combined size      (%)
----          -    ---   -----------      ---     --------      ---   -------------      ---
Total  14228881            947824170 [100.00%]        8192 [0.00%]        947832362 [100%]

Here we're at 66 bytes per record on average...

> We'd need to investigate the page-crossing overhead carefully though.

Agreed.

Greetings,

Andres Freund
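
For anyone wanting to redo the arithmetic, here is a rough Python sketch. It derives the average record size from the "Total" rows above (record bytes divided by record count) and estimates a break-even record size under a deliberately simplified model: a fixed per-page header cost spread over one 512-byte page, against a fixed saving per record. The function names and the 24-byte page overhead are assumptions for illustration only; the thread just says "a dozen or two bytes", and the model ignores page-crossing overhead entirely.

```python
def avg_record_size(total_record_bytes: int, n_records: int) -> float:
    """Average WAL record size in bytes, excluding full-page images."""
    return total_record_bytes / n_records


def break_even_record_size(per_record_saving: int,
                           per_page_overhead: int,
                           page_size: int = 512) -> float:
    """Record size at which a per-page header costs as much as it saves.

    Simplified model (an assumption, not PostgreSQL's actual layout):
    a page holds roughly page_size / s records of size s, so the total
    saving per page is per_record_saving * page_size / s. Setting that
    equal to per_page_overhead and solving for s gives the result.
    """
    return per_record_saving * page_size / per_page_overhead


# "Total" rows from the two pgbench runs quoted above.
init_avg = avg_record_size(1077269060, 308958)
run_avg = avg_record_size(947824170, 14228881)

# Assumed values: 8 bytes saved per record, 24 bytes of per-page header.
threshold = break_even_record_size(per_record_saving=8, per_page_overhead=24)

print(f"pgbench -i average record size:   {init_avg:.1f} bytes")
print(f"pgbench run average record size:  {run_avg:.1f} bytes")
print(f"break-even record size (assumed): {threshold:.1f} bytes")
```

Under these assumptions the break-even lands in the same ballpark as the ~160 bytes mentioned above. The OLTP-style run sits well below it, while the initialization run is far above it.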