On Sat, April 24, 2010 00:39, Simon Riggs wrote:
> On Fri, 2010-04-23 at 11:32 -0400, Robert Haas wrote:
>> >
>> > 99% of transactions happen in similar times between primary and standby,
>> > everything dragged down by rare but severe spikes.
>> >
>> > We're looking for something that would delay something that normally
>> > takes <0.1ms into something that takes >100ms, yet does eventually
>> > return. That looks like a severe resource contention issue.
>>
>> Wow. Good detective work.
>
> While we haven't fully established the source of those problems, I am
> now happy that these test results don't present any reason to avoid
> committing the main patch tested by Erik (not the smaller additional one
> I sent). I expect to commit that on Sunday.
>

Yes, that (main) patch seems to have largely closed the gap between
primary and standby; here are some results from a lower scale (10):

  scale: 10
  clients: 10, 20, 40, 60, 90
  for each: 4x primary, 4x standby  (6565=primary, 6566=standby)

-----
scale: 10 clients: 10 tps = 27624.339871 pgbench -p 6565 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 27604.261750 pgbench -p 6565 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 28015.093466 pgbench -p 6565 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 28422.561280 pgbench -p 6565 -n -S -c 10 -T 900 -j 1

scale: 10 clients: 10 tps = 27254.806526 pgbench -p 6566 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 27686.470866 pgbench -p 6566 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 28078.904035 pgbench -p 6566 -n -S -c 10 -T 900 -j 1
scale: 10 clients: 10 tps = 27101.622337 pgbench -p 6566 -n -S -c 10 -T 900 -j 1
-----
scale: 10 clients: 20 tps = 23106.795587 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 23101.681155 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 22893.364004 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 23038.577109 pgbench -p 6565 -n -S -c 20 -T 900 -j 1

scale: 10 clients: 20 tps = 22903.578552 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 22970.691946 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 22999.473318 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 10 clients: 20 tps = 22884.854749 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
-----
scale: 10 clients: 40 tps = 23522.499429 pgbench -p 6565 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23611.319191 pgbench -p 6565 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23616.905302 pgbench -p 6565 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23572.213990 pgbench -p 6565 -n -S -c 40 -T 900 -j 1

scale: 10 clients: 40 tps = 23714.721220 pgbench -p 6566 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23711.781175 pgbench -p 6566 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23691.867023 pgbench -p 6566 -n -S -c 40 -T 900 -j 1
scale: 10 clients: 40 tps = 23691.699231 pgbench -p 6566 -n -S -c 40 -T 900 -j 1
-----
scale: 10 clients: 60 tps = 21987.497095 pgbench -p 6565 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 21950.344204 pgbench -p 6565 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 22006.461447 pgbench -p 6565 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 21824.071303 pgbench -p 6565 -n -S -c 60 -T 900 -j 1

scale: 10 clients: 60 tps = 22149.415231 pgbench -p 6566 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 22211.064402 pgbench -p 6566 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 22164.238081 pgbench -p 6566 -n -S -c 60 -T 900 -j 1
scale: 10 clients: 60 tps = 22174.585736 pgbench -p 6566 -n -S -c 60 -T 900 -j 1
-----
scale: 10 clients: 90 tps = 18751.213002 pgbench -p 6565 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18757.115811 pgbench -p 6565 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18692.942329 pgbench -p 6565 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18765.390154 pgbench -p 6565 -n -S -c 90 -T 900 -j 1

scale: 10 clients: 90 tps = 18929.462104 pgbench -p 6566 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18999.851184 pgbench -p 6566 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18972.321607 pgbench -p 6566 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18924.058827 pgbench -p 6566 -n -S -c 90 -T 900 -j 1

The higher scales still show that other standby slowness. It may be a
caching effect (as Mark Kirkwood suggested): the idea being that the
primary's data is pre-cached because of the initial create, while the
standby's data has to be read from disk the first time. Does that make
sense?

I will try to confirm this.
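
For anyone wanting to compare the two servers numerically rather than by eye, here is a minimal sketch that averages the tps per client count and port and reports the standby's relative difference. It is not part of the test setup described above: the function names `summarize` and `standby_gap_percent` are hypothetical, the regular expression simply assumes the record layout shown in the results (`clients: N tps = X pgbench -p PORT ...`), and the port mapping (6565=primary, 6566=standby) is taken from the post. The embedded sample holds only two of the 90-client records.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed record layout, taken from the results above:
#   scale: 10 clients: 90 tps = 18751.213002 pgbench -p 6565 -n -S -c 90 -T 900 -j 1
RECORD_RE = re.compile(
    r"clients:\s*(\d+)\s+tps\s*=\s*([\d.]+)\s+pgbench\s+-p\s+(\d+)"
)

# Two records copied from the 90-client group (first primary run, first standby run).
SAMPLE = """
scale: 10 clients: 90 tps = 18751.213002 pgbench -p 6565 -n -S -c 90 -T 900 -j 1
scale: 10 clients: 90 tps = 18929.462104 pgbench -p 6566 -n -S -c 90 -T 900 -j 1
"""


def summarize(text):
    """Return {clients: {port: mean_tps}} for every record found in text."""
    samples = defaultdict(lambda: defaultdict(list))
    for match in RECORD_RE.finditer(text):
        clients = int(match.group(1))
        tps = float(match.group(2))
        port = int(match.group(3))
        samples[clients][port].append(tps)
    return {
        clients: {port: mean(values) for port, values in ports.items()}
        for clients, ports in samples.items()
    }


def standby_gap_percent(summary, primary=6565, standby=6566):
    """Return {clients: percent difference of standby mean tps vs primary}.

    A positive value means the standby was faster on average.
    """
    return {
        clients: 100.0 * (ports[standby] - ports[primary]) / ports[primary]
        for clients, ports in summary.items()
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    for clients, gap in sorted(standby_gap_percent(summary).items()):
        print(f"clients={clients}: standby vs primary {gap:+.2f}%")
```

Feeding the full result listing into `summarize` instead of `SAMPLE` should give one mean per client count and port; since the pattern does not depend on line breaks, it also works on output that has been joined onto one line.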