On 3/7/2013 12:57 PM, Steven Hartland wrote:
>
> ----- Original Message ----- From: "Karl Denninger" <k...@denninger.net>
>> Where I am right now is this:
>>
>> 1. I *CANNOT* reproduce the spins on the test machine with Postgres
>> stopped in any way. Even with multiple ZFS send/recv copies going on
>> and the load average north of 20 (due to all the geli threads), the
>> system doesn't stall or produce any notable pauses in throughput. Nor
>> does the system RAM allocation get driven hard enough to force paging.
>> This is with NO tuning hacks in /boot/loader.conf. I/O performance is
>> both stable and solid.
>>
>> 2. WITH Postgres running as a connected hot spare (identical to the
>> production machine), allocating ~1.5G of shared, wired memory, running
>> the same synthetic workload as in (1) above I am getting SMALL versions of
>> the misbehavior. However, while system RAM allocation gets driven
>> pretty hard and reaches down toward 100MB in some instances it doesn't
>> get driven hard enough to allocate swap. The "burstiness" is very
>> evident in the iostat figures with spates getting into the single digit
>> MB/sec range from time to time but it's not enough to drive the system
>> to a full-on stall.
>>
>> There's pretty-clearly a bad interaction here between Postgres wiring
>> memory and the ARC, when the latter is left alone and allowed to do what
>> it wants. I'm continuing to work on replicating this on the test
>> machine... just not completely there yet.
>
> Another possibility to consider is how postgres uses the FS. For example
> does it request sync IO in ways not present in the system without it
> which is causing the FS and possibly underlying disk system to behave
> differently.
>

That's possible but not terribly likely in this particular instance. The reason is that I ran into this with the Postgres data store on a UFS volume BEFORE I converted it.
It's on the ZFS pool now (with recordsize=8k, as recommended for that filesystem), but when I first ran into this it was on a separate UFS filesystem, where it had resided for 2+ years without incident. So unless Postgres's filesystem use on a UFS volume would somehow give ZFS fits, it's unlikely to be involved.
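
For telling the "burstiness" from point 2 apart from a genuine stall, something like the following might help. It is only a minimal sketch, assuming the iostat throughput figures have already been collected into a plain list of MB/sec samples. The function name find_stalls, the 10 MB/sec threshold (taken from "single digit MB/sec") and the min_run of 3 samples are all hypothetical choices for illustration, not anything measured on these machines.

```python
from typing import List, Tuple


def find_stalls(mb_per_sec: List[float],
                threshold: float = 10.0,
                min_run: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) sample-index pairs, end exclusive, for every run
    of at least min_run consecutive samples below threshold MB/sec.

    threshold and min_run are illustrative assumptions, not tuned values.
    """
    runs = []
    start = None  # index where the current low-throughput run began
    for i, rate in enumerate(mb_per_sec):
        if rate < threshold:
            if start is None:
                start = i
        else:
            # Run ended; keep it only if it lasted long enough.
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Handle a run that is still open at the end of the samples.
    if start is not None and len(mb_per_sec) - start >= min_run:
        runs.append((start, len(mb_per_sec)))
    return runs


if __name__ == "__main__":
    samples = [120, 115, 8, 6, 4, 110, 9, 130, 3, 2, 1]
    print(find_stalls(samples))
```

A single low sample (a brief burst) is ignored; only sustained runs are reported, which is roughly the distinction between the small misbehavior seen on the test machine and a full-on stall.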
> One other option to test, just to rule it out, is what happens if you
> use the BSD scheduler instead of ULE?
>
> Regards
> Steve
>

I will test that, but first I have to get the test machine to reliably stall so I know I'm not chasing my tail.

--
-- Karl Denninger
/The Market Ticker ®/ <http://market-ticker.org>
Cuda Systems LLC