https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242427
Bug ID: 242427
Summary: pmap_remove() sometimes is very slow causing 10+ minutes long reboots
Product: Base System
Version: 11.3-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: b...@freebsd.org
Reporter: p...@lysator.liu.se

I've noticed that on our file servers it can take a _very_ long time to reboot. Most of that time is spent in the "shutdown" phase, after the "Uptime: " line is printed.

After a long debugging session I've pinpointed it to ZFS freeing the "zio" cache, which, after many levels of function calls, ends up in pmap_remove(), where some calls take approximately 1 second.

On a basically idle test server it can take up to 10-20 minutes for the server to "shut down" (or even more; the time seems to scale with the server uptime). We've seen production machines that seemed to be "hung" (for at least an hour or more), so we gave up and sent them a "hard reset" via IPMI.

Hardware: Dell PowerEdge R730xd with dual Intel Xeon E5-2620v4 CPUs (32 "cpus") and 256GB of RAM. No swap.

Software: FreeBSD 11.3-RELEASE-p5. ZFS on boot & data. ZFS ARC limited to 128GB. Approx 24000 ZFS filesystems (empty on this test server). Snapshots taken every hour.
An example of how long it can take. I'm only printing timing info for calls that take >=1s (the top 4) or >=2s (the rest):

  kmem_unback: pmap_remove(kernel_pmap, 18446741877714755584, 18446741877714767872) took 1 seconds
  kmem_free: kmem_unback(kmem_object, 18446741877714755584, 12288) took 1 seconds
  page_free: kmem_free(kmem_arena, 18446741877714755584, 12288) took 1 seconds
  keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds
  keg_drain: while-keg_free_slab-loop took 14 seconds [20021 loops, 14 slow calls]
  zone_drain_wait: zone_foreach_keg(zone, &keg_drain) took 14 seconds
  zone_dtor: zone_drain_wait(zone, M_WAITOK) took 14 seconds
  zone_free_item(zone=UMA Zones): zone->uz_dtor() took 14 seconds
  uma_zdestroy(zio_buf_12288) took 14 seconds
  kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] took 14 seconds
  kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds

Called from kern_shutdown() -> EVENTHANDLER_INVOKE(shutdown_post_sync) -> zfsshutdown()

I.e., ~14 out of 20021 calls to keg_free_slab() take 1 second instead of executing really quickly (in this case). But some kmem_caches are much bigger, causing delays of 300-600 seconds (or more).

(I use "time_second" for time measurements; I should probably use something with better granularity for the top 4 calls :-)

I added a sysctl() kern.shutdown.verbose that I can set to a number to make the shutdown more verbose (and added a lot of printf()s to get this info). With it I can now see the number of filesystems being unmounted, since that too can take a little while, though nothing close to the times above.

- Peter
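The remark about "time_second" having only one-second granularity can be illustrated with a small sketch. This is a userland analogue, not the instrumentation used in this report: it times a call with clock_gettime(CLOCK_MONOTONIC) at nanosecond resolution and counts the calls that exceed a threshold, similar in spirit to the "[20021 loops, 14 slow calls]" summary above. The names now_ns, slow_stats, slow_stats_record and timed_call are hypothetical. Inside the kernel a different clock source (for example one of the microuptime(9) family) would presumably be needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical per-loop statistics, loosely mirroring "N loops, M slow calls". */
struct slow_stats {
    unsigned long calls;      /* total calls recorded */
    unsigned long slow_calls; /* calls with elapsed >= threshold */
    int64_t total_ns;         /* sum of all elapsed times */
    int64_t max_ns;           /* longest single call */
};

/* Record one measured duration against a slow-call threshold. */
static void slow_stats_record(struct slow_stats *st, int64_t elapsed_ns,
    int64_t threshold_ns)
{
    assert(st != NULL);
    st->calls++;
    st->total_ns += elapsed_ns;
    if (elapsed_ns > st->max_ns)
        st->max_ns = elapsed_ns;
    if (elapsed_ns >= threshold_ns)
        st->slow_calls++;
}

/* Time a single call to fn(arg) and fold the result into the statistics. */
static void timed_call(struct slow_stats *st, int64_t threshold_ns,
    void (*fn)(void *), void *arg)
{
    int64_t start = now_ns();

    fn(arg);
    slow_stats_record(st, now_ns() - start, threshold_ns);
}
```

With sub-second resolution, a call that actually takes, say, 0.9 s or 1.4 s would no longer be reported as "1 seconds" or missed entirely, and max_ns would show how slow the worst call really was.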