On 2022-Apr-29, at 13:41, Pete Wright <p...@nomadlogic.org> wrote:
> 
> On 4/29/22 11:38, Mark Millard wrote:
>> On 2022-Apr-29, at 11:08, Pete Wright <p...@nomadlogic.org> wrote:
>> 
>>> On 4/23/22 19:20, Pete Wright wrote:
>>>>> The developers handbook has a section debugging deadlocks that he
>>>>> referenced in a response to another report (on freebsd-hackers).
>>>>> 
>>>>> https://docs.freebsd.org/en/books/developers-handbook/kerneldebug/#kerneldebug-deadlocks
>>>> d'oh - thanks for the correction!
>>>> 
>>>> -pete
>>>> 
>>>> 
>>> hello, i just wanted to provide an update on this issue. so the good news
>>> is that by removing the file backed swap the deadlocks have indeed gone
>>> away! thanks for sorting me out on that front Mark!
>> Glad it helped.
> 
> d'oh - went out for lunch and workstation locked up. i *knew* i shouldn't
> have said anything lol.
Any interesting console messages ( or dmesg -a or /var/log/messages )?

>>> i still am seeing a memory leak with either firefox or chrome (maybe both
>>> where they create a voltron of memory leaks?). this morning firefox and
>>> chrome had been killed when i first logged in. fortunately the system has
>>> remained responsive for several hours which was not the case previously.
>>> 
>>> when looking at my metrics i see vm.domain.0.stats.inactive take a nose
>>> dive from around 9GB to 0 over the course of 1min. the timing seems to
>>> align with around the time when firefox crashed, and is preceded by a
>>> large spike in vm.domain.0.stats.active from ~1GB to 7GB 40mins before the
>>> apps crashed. after the binaries were killed memory metrics seem to have
>>> recovered (laundry size grew, and inactive size grew by several gigs for
>>> example).
>> Since the form of kill here is tied to sustained low free memory
>> ("failed to reclaim memory"), you might want to report the
>> vm.domain.0.stats.free_count figures from various time frames as
>> well:
>> 
>> vm.domain.0.stats.free_count: Free pages
>> 
>> (It seems you are converting pages to byte counts in your report,
>> the units I'm not really worried about so long as they are
>> obvious.)
>> 
>> There are also figures possibly tied to the handling of the kill
>> activity but some being more like thresholds than usage figures,
>> such as:
>> 
>> vm.domain.0.stats.free_severe: Severe free pages
>> vm.domain.0.stats.free_min: Minimum free pages
>> vm.domain.0.stats.free_reserved: Reserved free pages
>> vm.domain.0.stats.free_target: Target free pages
>> vm.domain.0.stats.inactive_target: Target inactive pages
> ok thanks Mark, based on this input and the fact i did manage to lock up my
> system, i'm going to get some metrics up on my website and share them
> publicly when i have time. i'll definitely take your input into account when
> sharing this info.
> 
>> 
>> Also, what value were you using for:
>> 
>> vm.pageout_oom_seq
> $ sysctl vm.pageout_oom_seq
> vm.pageout_oom_seq: 120
> $

Without knowing vm.domain.0.stats.free_count it is hard to tell, but
you might try, say, sysctl vm.pageout_oom_seq=12000 in hopes of getting
notably more time with the vm.domain.0.stats.free_count staying small.
That may give you more time to notice the low free RAM (if you are
checking periodically, rather than just waiting for failure to make it
obvious).

===
Mark Millard
marklmi at yahoo.com
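
For the kind of periodic checking Mark describes, a minimal sketch of a
watcher script is below. The one-minute interval, the log path, and the
reuse of the 12000 value are only illustrative choices (not anything
prescribed in the thread), and the sysctl write needs root:

#!/bin/sh
# Give the page-out OOM logic more patience before it kills anything
# (runtime setting; the same line in /etc/sysctl.conf persists it across
# reboots). The 12000 figure is just the example value from the thread.
sysctl vm.pageout_oom_seq=12000

# Once a minute, append a timestamped snapshot of the free-page count and
# the related threshold values; multiplying the page counts by hw.pagesize
# converts them to bytes.
while true; do
    date
    sysctl hw.pagesize \
        vm.domain.0.stats.free_count \
        vm.domain.0.stats.free_severe \
        vm.domain.0.stats.free_min \
        vm.domain.0.stats.free_reserved \
        vm.domain.0.stats.free_target \
        vm.domain.0.stats.inactive_target
    sleep 60
done >> /var/tmp/vm-free-log.txt

The resulting log should make it easy to spot stretches where free_count
sits down near free_min/free_severe, which is the sustained low free
memory condition the "failed to reclaim memory" kills are tied to.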