Hi misc, I'm having frequent crashes on OpenBSD 4.2 (stable) on different machines with the following error:
panic: pmap_pinit: kernel_map out of virtual space!

Specifically, we have two carped firewalls (running pfsync) that showed the same error about 8 hours apart. First the backup crashed, and then the master. I was able to run "boot dump" on the first one that crashed (the backup box) and then recover the core files with savecore (bsd.0 and bsd.0.core). But now, when running the gdb commands, I get this after typing "target kvm bsd.0.core":

myhost:/var/crash# gdb
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-openbsd4.2".
(gdb) file bsd.0
Reading symbols from /u/data/crash/bsd.0...(no debugging symbols found)...done.
(gdb) target kvm bsd.0.core
Cannot access memory at address 0xffbe6afc
(gdb)

Why could this be? I'm stuck at this point. I can run vmstat and ps with the -N and -M options, but I don't think I'm getting anything very useful.
I did see this with vmstat -m, though:

Memory statistics by bucket size
 Size   In Use   Free   Requests  HighWater  Couldfree
   16     3918   1714  228317130       1280         71
   32      321    447   17128808        640          0
   64     1222   1018   13797629        320     295030
  128      405     43    4699894        160          0
  256      229     59   14697840         80      71663
  512      447     25    2447129         40       5629
 1024     1274     30     941406         20     419326
 2048       17     17    2263518         10    1768442
 4096       21      6    1920222          5          0
 8192       12      0         12          5          0
16384        2      0       4615          5          0
32768        4      0          8          5          0

Memory usage type by bucket size
 Size  Type(s)
   16  devbuf, pcb, routetbl, ifaddr, sysctl, vnodes, UFS mount, dirhash, in_multi, exec, xform_data, VM swap, UVM amap, UVM aobj, USB, USB device, temp
   32  devbuf, pcb, routetbl, ifaddr, sem, dirhash, proc, VFS cluster, in_multi, ether_multi, exec, pfkey data, xform_data, VM swap, UVM amap, USB, crypto data, temp
   64  devbuf, pcb, routetbl, ifaddr, vnodes, sem, dirhash, in_multi, pfkey data, UVM amap, USB, NDP, temp
  128  devbuf, routetbl, ifaddr, iov, vnodes, ttys, exec, pfkey data, tdb, UVM amap, USB, USB device, crypto data, NDP, temp
  256  devbuf, routetbl, ifaddr, sysctl, ioctlops, vnodes, shm, VM map, file desc, proc, NFS srvsock, NFS daemon, pfkey data, newblk, UVM amap, USB, USB device, temp
  512  devbuf, pcb, ifaddr, ioctlops, mount, UFS mount, shm, dirhash, file desc, ttys, exec, UVM amap, USB device, temp
 1024  devbuf, ioctlops, namecache, file desc, proc, ttys, exec, tdb, UVM amap, UVM aobj, crypto data, temp
 2048  devbuf, ifaddr, ioctlops, UFS mount, pagedep, VM swap, UVM amap, temp
 4096  devbuf, ioctlops, UFS mount, MSDOSFS mount, UVM amap, memdesc, temp
 8192  devbuf, NFS node, namecache, UFS quota, UFS mount, ISOFS mount, inodedep, crypto data
16384  devbuf, UFS mount, VM swap, temp
32768  devbuf, namecache, VM swap, UVM amap

Memory statistics by type                          Type  Kern
        Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
devbuf 2397 1445K 1445K 39322K 2458 0 0 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768
pcb 95 8K 9K 39322K 469128 0 0 16,32,64,512
routetbl 220 20K 28K 39322K 1619554 0 0 16,32,64,128,256
ifaddr 118 22K 22K 39322K 120 0 0 16,32,64,128,256,512,2048
sysctl 2 1K 1K 39322K 2 0 0 16,256
ioctlops 0 0K 4K 39322K 10088436 0 0 256,512,1024,2048,4096
iov 0 0K 1K 39322K 18 0 0 128
mount 4 2K 4K 39322K 120 0 0 512
NFS node 1 8K 8K 39322K 1 0 0 8192
vnodes 41 7K 80K 39322K 350224 0 0 16,64,128,256
namecache 3 41K 41K 39322K 3 0 0 1024,8192,32768
UFS quota 1 8K 8K 39322K 1 0 0 8192
UFS mount 17 33K 68K 39322K 481 0 0 16,512,2048,4096,8192,16384
shm 2 1K 1K 39322K 2 0 0 256,512
VM map 4 1K 1K 39322K 4 0 0 256
sem 2 1K 1K 39322K 2 0 0 32,64
dirhash 78 15K 16K 39322K 8979 0 0 16,32,64,512
file desc 3 2K 7K 39322K 773 0 0 256,512,1024
proc 12 3K 3K 39322K 12 0 0 32,256,1024
VFS cluster 0 0K 1K 39322K 53484 0 0 32
NFS srvsock 1 1K 1K 39322K 1 0 0 256
NFS daemon 1 1K 1K 39322K 1 0 0 256
in_multi 114 5K 5K 39322K 115 0 0 16,32,64
ether_multi 60 2K 2K 39322K 61 0 0 32
ISOFS mount 1 8K 8K 39322K 1 0 0 8192
MSDOSFS mount 1 4K 4K 39322K 1 0 0 4096
ttys 420 263K 263K 39322K 420 0 0 128,512,1024
exec 0 0K 2K 39322K 1800891 0 0 16,32,128,512,1024
pfkey data 1 1K 1K 39322K 39 0 0 32,64,128,256
tdb 5 3K 3K 39322K 5 0 0 128,1024
xform_data 4 1K 1K 39322K 664042 0 0 16,32
pagedep 1 2K 2K 39322K 1 0 0 2048
inodedep 1 8K 8K 39322K 1 0 0 8192
newblk 1 1K 1K 39322K 1 0 0 256
VM swap 7 39K 39K 39322K 7 0 0 16,32,2048,16384,32768
UVM amap 4065 127K 205K 39322K 264897527 0 0 16,32,64,128,256,512,1024,2048,4096,32768
UVM aobj 2 2K 2K 39322K 2 0 0 16,1024
USB 46 5K 5K 39322K 46 0 0 16,32,64,128,256
USB device 13 6K 6K 39322K 13 0 0 16,128,256,512
memdesc 1 4K 4K 39322K 1 0 0 4096
crypto data 12 18K 18K 39322K 12 0 0 32,128,1024,8192
NDP 21 2K 3K 39322K 25 0 0 64,128
temp 94 10K 28K 39322K 6261196 0 0 16,32,64,128,256,512,1024,2048,4096,16384

Memory Totals:  In Use    Free    Requests
                 2115K    225K   286218211

Memory resource pool statistics
Name Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
phpool 32 685 0 0 6 0 6 6 0 8 0
extentpl 20 229 0 196 1 0 1 1 0 8 0
pmappl 84 1219710 0 1219679 1 0 1 1 0 8 0
vmsppl 188 1219710 0 1219679 3 0 3 3 0 8 1
vmmpepl 88 191190366 0 191186089 114 0 114 114 0 179 20
vmmpekpl 88 4509023 0 4495911 286 0 286 286 0 8 0
aobjpl 52 1 0 0 1 0 1 1 0 8 0
amappl 44 86847249 0 86845907 21 0 21 21 0 45 6
anonpl 16 113809131 0 113806618 44 0 44 44 0 62 34
bufpl 124 1906610 0 1899767 218 4 214 214 0 8 0
mbpl 256 139948029 0 139947363 68 0 68 68 1 384 7
mclpl 2048 89771507 0 89770865 479 0 479 479 4 3072 29
sockpl 212 977092 0 977000 6 0 6 6 0 8 1
procpl 344 1219725 0 1219679 6 0 6 6 0 8 1
processpl 20 1219725 0 1219679 1 0 1 1 0 8 0
zombiepl 72 1219679 0 1219679 1 0 1 1 0 8 1
ucredpl 80 590685 0 590629 2 0 2 2 0 8 0
pgrppl 24 180756 0 180741 1 0 1 1 0 8 0
sessionpl 48 180754 0 180739 1 0 1 1 0 8 0
pcredpl 24 1219725 0 1219679 1 0 1 1 0 8 0
lockfpl 52 3023 0 3021 1 0 1 1 0 8 0
filepl 88 50017650 0 50017503 4 0 4 4 0 8 0
fdescpl 296 1219726 0 1219679 5 0 5 5 0 8 1
pipepl 72 2056576 0 2056553 1 0 1 1 0 8 0
kqueuepl 192 686 0 683 1 0 1 1 0 8 0
knotepl 64 2923 0 2874 2 0 2 2 0 8 1
sigapl 316 1219710 0 1219679 4 0 4 4 0 8 1
wqtasks 20 490193 0 490193 1 0 1 1 0 8 1
wdcspl 96 5130491 0 5130490 1 0 1 1 0 8 0
scxspl 132 3 0 3 1 0 1 1 0 8 1
namei 1024 68487004 0 68487004 2 0 2 2 0 8 2
vnodes 148 2621 0 0 98 0 98 98 0 8 0
nchpl 72 3315172 0 3313864 295 271 24 24 0 8 0
ffsino 184 5248219 0 5245603 291 172 119 119 0 8 0
dino1pl 128 5248218 0 5245602 195 110 85 85 0 8 0
dirhash 1024 10230 0 10146 25 0 25 25 0 128 4
pfrulepl 824 265 0 10 67 0 67 67 0 8 2
pfstatepl 204 10843516 4940 10843385 527 0 527 527 0 527 514
pfstatekeypl 108 10843516 0 5769657 138375 1243 137132 137132 0 8 0
pfpooladdrpl 68 26 0 0 1 0 1 1 0 8 0
pfrktable 1240 84 0 42 28 0 28 28 0 334 0
pfrkentry 156 479 0 0 19 0 19 19 0 7693 0
pfosfpen 108 1392 0 696 30 11 19 19 0 8 0
pfosfp 28 814 0 407 3 0 3 3 0 8 0
rtentpl 108 362 0 283 3 0 3 3 0 8 0
rttmrpl 32 1 0 1 1 0 1 1 0 8 1
tcpcbpl 400 73879 0 73876 1 0 1 1 0 8 0
tcpqepl 16 162 0 162 1 0 1 1 0 13 1
sackhlpl 20 2 0 2 1 0 1 1 0 162 1
synpl 184 73783 0 73783 1 0 1 1 0 8 1
plimitpl 152 107808 0 107796 1 0 1 1 0 8 0
inpcbpl 216 507973 0 507967 1 0 1 1 0 8 0
ipsec policy 212 16 0 0 1 0 1 1 0 8 0

In use 540926K, total allocated 559516K; utilization 96.7%

In particular, these two lines caught my eye:

Memory Totals:  In Use    Free    Requests
                 2115K    225K   286218211

and:

In use 540926K, total allocated 559516K; utilization 96.7%

which seem to leave little to spare. I also checked that a swap device is configured:

myhost:/var/crash# swapctl -l
Device       512-blocks     Used    Avail  Capacity  Priority
swap_device     1048320        0  1048320        0%         0

We have a suspect: a Python script that monitors several parameters every 5 minutes, calling vmstat, iostat, pfctl, etc. I find it strange, though, since this is the first time this has happened, on different hardware, and so close together in time. The Python version is python-2.4.4p4, and the script is run by root (as some statistics can only be retrieved by root). Both machines were booted at approximately the same time, a month ago, by the way. I don't find this theory very plausible, as the script has been running on other versions for a long time. The other thing I can think of is something related to carp or pfsync.

Any input on this will be much appreciated.

Thank you,
Martín.
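P.S. For anyone who wants to track this the same way, here is a minimal sketch (written in modern Python for clarity; it is not the actual monitoring script, and the function name is made up) of how the pool summary line of vmstat -m can be parsed to watch kernel memory utilization over time:

```python
import re

def parse_vmstat_totals(output):
    """Extract kernel pool memory usage from `vmstat -m` output.

    Looks for the pool summary line, e.g.:
        In use 540926K, total allocated 559516K; utilization 96.7%
    Returns (in_use_kb, allocated_kb, utilization_percent).
    """
    m = re.search(r"In use (\d+)K, total allocated (\d+)K", output)
    if m is None:
        raise ValueError("no pool summary line found in vmstat -m output")
    in_use, allocated = int(m.group(1)), int(m.group(2))
    return in_use, allocated, 100.0 * in_use / allocated

# Example using the summary line from the crashed firewall above:
sample = "In use 540926K, total allocated 559516K; utilization 96.7%"
in_use, allocated, pct = parse_vmstat_totals(sample)
print(in_use, allocated, round(pct, 1))  # -> 540926 559516 96.7
```

In a real monitor, the `sample` string would instead come from running vmstat -m (as root) and capturing its output, logging the percentage every 5 minutes.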