On Wed, Jul 11, 2018 at 8:25 AM, Joshua D. Drake <j...@commandprompt.com> wrote:
> On 07/10/2018 01:15 PM, Jerry Jelinek wrote:
>>
>> Thanks to everyone who took the time to look at the patch and send me
>> feedback. I'm happy to work on improving the documentation of this new
>> tunable to clarify when it should be used and the implications. I'm trying
>> to understand more specifically what else needs to be done next. To
>> summarize, I think the following general concerns were brought up.
>>
>> For #6, there is no feasible way for us to recreate our workload on other
>> operating systems or filesystems. Can anyone expand on what performance
>> data is needed?
>
> I think a simple way to prove this would be to run BenchmarkSQL against
> PostgreSQL in a default configuration with pg_xlog/pg_wal on a filesystem
> that is COW (zfs), then run another test where pg_xlog/pg_wal is patched
> with your patch and new behavior, and then run the test again. BenchmarkSQL
> is a more thorough benchmarking tool than something like pg_bench and is
> very easy to set up.
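
Whatever benchmark gets used, the shape of the comparison is really just two
otherwise-identical runs on the same COW filesystem, flipping the patch's
behaviour in between and keeping everything else fixed. A rough sketch,
assuming the patch exposes this as the wal_recycle GUC I use below and that
it can be changed with a plain config reload (both assumptions on my part):

  # baseline: current recycle-and-rename behaviour
  psql -c "ALTER SYSTEM SET wal_recycle = on"
  pg_ctl -D "$PGDATA" reload
  # ... run the benchmark, record TPS ...

  # patched behaviour: create fresh WAL segments instead of recycling
  psql -c "ALTER SYSTEM SET wal_recycle = off"
  pg_ctl -D "$PGDATA" reload
  # ... run the same benchmark again on the same data directory ...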
I have a lowly but trusty HP Microserver running FreeBSD 11.2 with ZFS on
spinning rust. It occurred to me that such an anaemic machine might show
this effect easily, because its cold reads are as slow as a Lada full of
elephants going uphill. Let's see...

# os setup
sysctl vfs.zfs.arc_min=134217728
sysctl vfs.zfs.arc_max=134217728
zfs create zroot/data/test
zfs set mountpoint=/data/test zroot/data/test
zfs set compression=off zroot/data/test
zfs set recordsize=8192 zroot/data/test

# initdb into /data/test/pgdata, then set postgresql.conf up like this:
fsync=off
max_wal_size = 600MB
min_wal_size = 600MB

# small scale test, we're only interested in producing WAL, not db size
pgbench -i -s 100 postgres

# do this a few times first, to make sure we have lots of WAL segments
pgbench -M prepared -c 4 -j 4 -T 60 postgres

# now test...

With wal_recycle=on I reliably get around 1100TPS and vmstat -w 10 shows
numbers like this:

 procs    memory      page                      disks     faults        cpu
 r b w    avm   fre   flt  re  pi  po   fr   sr ad0 ad1   in     sy    cs us sy id
 3 0 3   1.2G  3.1G  4496   0   0   0   52   76 144 138  607  84107 29713 55 17 28
 4 0 3   1.2G  3.1G  2955   0   0   0   84   77 134 130  609  82942 34324 61 17 22
 4 0 3   1.2G  3.1G  2327   0   0   0    0   77 114 125  454  83157 29638 68 15 18
 5 0 3   1.2G  3.1G  1966   0   0   0   82   77  86  81  335  84480 25077 74 13 12
 3 0 3   1.2G  3.1G  1793   0   0   0  533   74  72  68  310 127890 31370 77 16  7
 4 0 3   1.2G  3.1G  1113   0   0   0  151   73  95  94  363 128302 29827 74 18  8

With wal_recycle=off I reliably get around 1600TPS and vmstat -w 10 shows
numbers like this:

 procs    memory      page                      disks     faults        cpu
 r b w    avm   fre   flt  re  pi  po   fr   sr ad0 ad1   in     sy    cs us sy id
 0 0 3   1.2G  3.1G   148   0   0   0  402   71  38  38  153  16668  5656 10  3 87
 5 0 3   1.2G  3.1G  4527   0   0   0   50   73  28  27  123 123986 23373 68 15 17
 5 0 3   1.2G  3.1G  3036   0   0   0  151   73  47  49  181 148014 29412 83 16  0
 4 0 3   1.2G  3.1G  2063   0   0   0  233   73  56  54  200 143018 28699 81 17  2
 4 0 3   1.2G  3.1G  1202   0   0   0   95   73  48  49  189 147276 29196 81 18  1
 4 0 3   1.2G  3.1G   732   0   0   0    0   73  56  55  207 146805 29265 82 17  1

I don't have time to investigate further for now and my knowledge of ZFS is
superficial, but the patch seems to have a clear beneficial effect, reducing
disk IOs and page faults on my little storage box. Obviously this isn't
representative of a proper server environment, or some other OS, but it's a
clue.

That surprised me... I was quietly hoping it was going to be 'oh, if you
turn off compression and use 8kb it doesn't happen because the pages line
up'. But nope.

-- 
Thomas Munro
http://www.enterprisedb.com
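
PS: If anyone repeating this wants to confirm that the setting really is
changing how segments are produced (recycled-and-renamed vs freshly
created), rather than something else moving, two cheap checks. The log
wording below comes from the existing checkpointer accounting; I'm assuming
the patch keeps that reporting intact:

  # in postgresql.conf, then reload; checkpoints then report segment handling
  log_checkpoints = on

  # the server log should show lines like
  #   checkpoint complete: ... 0 WAL file(s) added, 0 removed, 12 recycled ...
  # with recycling off you'd expect the added/removed counts to grow instead

  # alternatively, watch inode churn: recycling renames old segments to new
  # names (inodes stay put), while creating segments allocates fresh inodes
  ls -li /data/test/pgdata/pg_wal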