Re: kernel's make fails in ath module, stable9
See tinderbox log, my -wireless mail etc. I'm waiting for a fix too.

--
View this message in context: http://freebsd.1045724.n5.nabble.com/kernel-s-make-fails-in-ath-module-stable9-tp579p5777859.html
Sent from the freebsd-stable mailing list archive at Nabble.com.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: kernel's make fails in ath module, stable9
...should be fixed, as it is already reverted.
Re: kernel's make fails in ath module, stable9
On 2013-01-15 12:31, Jakub Lach wrote:
> ...should be fixed, as it is already reverted.

Yes, sorry about that breakage. It should be fixed as of r245449. The good news is that stable/9 now has clang 3.2 release. :-)
Re: IPv6 Tunnel Shared With Jails via epair Devices
On Tue, Jan 15, 2013 at 12:29 AM, Ben Morrow wrote:

> Quoth Shawn Webb :
> >
> > I've been working on sharing a 6in4 IPv6 tunnel (via a gif device) I have
> > with Hurricane Electric (tunnelbroker.net) to my jails via epair devices.
> > My setup is a bit unique in that the IPv6 tunnel is behind an OpenVPN
> > connection. I've had varying degrees of success. I might have a bug to
> > report, but I thought I'd post here to get input from people who know
> > better than I do about these kinds of things.
> >
> > I have a bridge device (we'll call it bridge0) with a /64 IPv6 address
> > (2001:470:8142:1::1). Each jail's epair[n]b device will get an IPv6 address
> > in that same prefix. For example, one of my jails is 2001:470:8142:1::3.
> > The default IPv6 gateway is the IPv6 address of bridge0.
> >
> > Giving one jail an IP address works fine. For each jail after that, the
> > IPv6 address stays in tentative mode. FreeBSD gets stuck trying to use DAD
> > to figure out if there's an address conflict. It never leaves tentative
> > mode. This is the bug I'm working out.
> >
> > Here's bridge0's config:
> >
> > # ifconfig bridge0
> > bridge0: flags=8843 metric 0 mtu 1500
> >     ether 02:fe:21:34:d3:00
> >     inet6 2001:470:8142:1::1 prefixlen 64
> >     nd6 options=21
> >     id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
> >     maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
> >     root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
> >     member: epair0a flags=143
> >         ifmaxaddr 0 port 19 priority 128 path cost 2000
> >     member: epair1a flags=143
> >         ifmaxaddr 0 port 21 priority 128 path cost 2000
> >     member: bge0 flags=143
> >         ifmaxaddr 0 port 5 priority 128 path cost 20
>
> Why have you added the physical interface to the bridge? AFAICT you
> don't need to: a bridge will bridge epairs just fine, and as you
> explained in that blog post you have to route rather than bridge into
> the tunnel, since the tunnel isn't an Ethernet device.

I did it so that I have an IPv4 address directly on the LAN for each of my jails.

> > Here's the relevant epair device for the jail whose IPv6 stack is working:
> >
> > # jexec "ClamAV_Dev" ifconfig epair1b
> > epair1b: flags=8843 metric 0 mtu 1500
> >     options=8
> >     ether 02:fb:c0:00:16:0b
> >     inet6 2001:470:8142:1::3 prefixlen 64
> >     inet6 fe80::fb:c0ff:fe00:160b%epair1b prefixlen 64 scopeid 0x2
> >     inet 10.7.1.172 netmask 0xfe00 broadcast 10.7.1.255
> >     nd6 options=21
> >     media: Ethernet 10Gbase-T (10Gbase-T )
> >     status: active
> >
> > Here's the relevant epair device for the jail whose IPv6 stack isn't
> > working:
> >
> > # jexec "Dev Template" ifconfig epair0b
> > epair0b: flags=8843 metric 0 mtu 1500
> >     options=8
> >     ether 02:80:03:00:14:0b
> >     inet6 2001:470:8142:1::5 prefixlen 64 tentative
> >     inet6 fe80::80:3ff:fe00:140b%epair0b prefixlen 64 tentative scopeid 0x2
> >     inet 10.7.1.92 netmask 0xfe00 broadcast 10.7.1.255
> >     nd6 options=29
>
> I suspect the addresses are only marked tentative because the interface
> has been marked IFDISABLED. This causes all current addresses to be
> marked tentative, because the kernel isn't allowed to send or receive
> IPv6 packets and so can't defend the addresses any more.
>
> Is it possible something in the jail's startup scripts is causing the
> interface to be marked IFDISABLED after the inet6 address has been
> assigned? Some of the functions in network.subr mark interfaces
> IFDISABLED automatically if they don't think they have IPv6 addresses.

I was thinking the same thing. One problem is that I can't remove the IFDISABLED flag. This is what happens when I try:

# jexec "Dev Template" ifconfig epair0b -ifdisabled
ifconfig: ioctl(SIOCGIFINFO_IN6): Invalid argument

> >     media: Ethernet 10Gbase-T (10Gbase-T )
> >     status: active
> >
> > I brought up the "Dev Template" jail after bringing up the ClamAV_Dev jail.
> > If there's any other output you'd like to see, let me know. If you're
> > confused about my setup, visit my blog post about the subject here:
> > http://0xfeedface.org/blog/lattera/2013-01-12/tunneled-ipv6-freebsd-jails
> >
> > I'm curious to know if I've got a legit bug or if it's something I'm doing
> > wrong. The one thing I haven't tried is setting up rtadvd on the bridge.
> > That'd be kind of interesting, since my physical NIC is a member on the
> > bridge. I'd rather not dish out IPv6 addresses for all devices on the
> > network (a network with lots of devices I don't own or control).
>
> As I said, I don't believe you need the physical interface on the
> bridge, unless you have to for IPv4 (and you can't route or proxyarp
> instead). However, before you can run rtadvd you will need to give the
> bridge its proper link-local address, which probably also means locking
> down its hardware address in rc.conf. Bridges don't get auto link-local
> addresses, for reasons I've never entirely understood, and RAs have to
> use ll addresses.
>
> You wil
Re: make release doesn't correctly include EXTLOCALDIR ?
On Jan 11, 2013, at 2:06 PM, Fleuriot Damien wrote:

> Hello list,
>
> I'm running 8.3-stable r245223 from a mere 2 days ago and am in the process
> of building a custom release for our internal use as preconfigured firewalls.
>
> "make release" works pretty fine except for a few quirks here and there.
>
> First of all, I have set EXTLOCALDIR so that the release contains my existing
> /usr/local/ , and thus the collection of installed ports.
>
> The problem here is that while /release/usr/local/ is correctly populated,
> the ISO images and ftp install directory have an empty usr/local/
> Extracting the ISO's base.?? files doesn't yield the /usr/local/ contents
> either.
>
> The second problem I encounter is with the kernel's build.
> Apparently "make release" doesn't pull MODULES_OVERRIDE from /etc/make.conf
> and decides to build every single module, as opposed to my own restricted
> list.
>
> I'm going to try with KERNEL_FLAGS=-DMODULES_OVERRIDE module1 module2 in
> /usr/src/release/Makefile
>
> Has anyone else ever experienced the same problem regarding the inclusion of
> /usr/local/ in their release ?

Reposting to -stable in the hope of getting feedback, having received none on -questions.

Has anyone experienced this before ? Is this intended behaviour ?

I fail to see the purpose of including /usr/local/ if it won't be packaged into the release images.
Re: IPv6 Tunnel Shared With Jails via epair Devices
Quoth Shawn Webb :
> On Tue, Jan 15, 2013 at 12:29 AM, Ben Morrow wrote:
> > Quoth Shawn Webb :
> > >
> > > # ifconfig bridge0
> > > bridge0: flags=8843 metric 0 mtu 1500
> > >     ether 02:fe:21:34:d3:00
> > >     inet6 2001:470:8142:1::1 prefixlen 64
> > >     nd6 options=21
> > >     id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
> > >     maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
> > >     root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
> > >     member: epair0a flags=143
> > >         ifmaxaddr 0 port 19 priority 128 path cost 2000
> > >     member: epair1a flags=143
> > >         ifmaxaddr 0 port 21 priority 128 path cost 2000
> > >     member: bge0 flags=143
> > >         ifmaxaddr 0 port 5 priority 128 path cost 20
> >
> > Why have you added the physical interface to the bridge? AFAICT you
> > don't need to: a bridge will bridge epairs just fine, and as you
> > explained in that blog post you have to route rather than bridge into
> > the tunnel, since the tunnel isn't an Ethernet device.
>
> I did it so that I have an IPv4 address directly on the LAN for each of my
> jails.

Hmm, OK.

> > > # jexec "Dev Template" ifconfig epair0b
> > > epair0b: flags=8843 metric 0 mtu 1500
> > >     options=8
> > >     ether 02:80:03:00:14:0b
> > >     inet6 2001:470:8142:1::5 prefixlen 64 tentative
> > >     inet6 fe80::80:3ff:fe00:140b%epair0b prefixlen 64 tentative scopeid 0x2
> > >     inet 10.7.1.92 netmask 0xfe00 broadcast 10.7.1.255
> > >     nd6 options=29
> >
> > I suspect the addresses are only marked tentative because the interface
> > has been marked IFDISABLED. This causes all current addresses to be
> > marked tentative, because the kernel isn't allowed to send or receive
> > IPv6 packets and so can't defend the addresses any more.
> >
> > Is it possible something in the jail's startup scripts is causing the
> > interface to be marked IFDISABLED after the inet6 address has been
> > assigned? Some of the functions in network.subr mark interfaces
> > IFDISABLED automatically if they don't think they have IPv6 addresses.
>
> I was thinking the same thing. One problem is that I can't remove the
> IFDISABLED flag. This is what happens when I try:
>
> # jexec "Dev Template" ifconfig epair0b -ifdisabled
> ifconfig: ioctl(SIOCGIFINFO_IN6): Invalid argument

    ifconfig epair0b inet6 -ifdisabled

I don't know why you get that error when you miss out the 'inet6'; it's not exactly very clear.

Ben
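The working and the stuck jail in this thread differ in their `nd6 options=` value (21 versus 29). As a rough sketch, the value can be decoded with a small POSIX sh function. The bit values below are assumed from the ND6_IFF_* definitions in sys/netinet6/nd6.h; they are not quoted anywhere in the thread, so check them against your own source tree. `decode_nd6` is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# Decode the hex value that ifconfig prints as "nd6 options=NN".
# ASSUMPTION: bit values follow the ND6_IFF_* flags in sys/netinet6/nd6.h.
decode_nd6() {
    val=$((0x$1))          # ifconfig prints the value in hex without "0x"
    out=""
    for pair in 1:PERFORMNUD 2:ACCEPT_RTADV 4:PREFER_SOURCE 8:IFDISABLED \
                16:DONT_SET_IFROUTE 32:AUTO_LINKLOCAL 64:NO_RADR; do
        bit=${pair%%:*}    # numeric bit value before the colon
        name=${pair#*:}    # flag name after the colon
        if [ $((val & bit)) -ne 0 ]; then
            out="${out}${out:+ }${name}"
        fi
    done
    echo "$out"
}

decode_nd6 21   # value shown for the working jail (epair1b)
decode_nd6 29   # value shown for the stuck jail (epair0b)
```

Under those assumptions the only difference between the two values is the IFDISABLED bit, which is consistent with Ben's diagnosis. The flag is then cleared with the address-family form given above, `ifconfig epair0b inet6 -ifdisabled`.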
Re: CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
Dear All,
Still experiencing the same hangs I reported earlier with 9.1. I've been running a kernel with WITNESS enabled to provide more information.

During an occurrence of the hang, running show alllocks gave

Process 25777 (sysctl) thread 0xfe014c5b2920 (102567)
exclusive sleep mutex Giant (Giant) r = 0 (0x811e34c0) locked @ /usr/src/sys/dev/usb/usb_transfer.c:3171
Process 25750 (sshd) thread 0xfe015a688000 (104313)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfe0204e0bb98) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 24922 (cnid_dbd) thread 0xfe0187ac4920 (103597)
shared lockmgr zfs (zfs) r = 0 (0xfe0973062488) locked @ /usr/src/sys/kern/vfs_syscalls.c:3591
Process 24117 (sshd) thread 0xfe07bd914490 (104195)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfe0204e0a8f0) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 1243 (java) thread 0xfe01ca85d000 (102704)
exclusive sleep mutex pmap (pmap) r = 0 (0xfe015aec1440) locked @ /usr/src/sys/amd64/amd64/pmap.c:4840
exclusive rw pmap pv global (pmap pv global) r = 0 (0x81409780) locked @ /usr/src/sys/amd64/amd64/pmap.c:4802
exclusive sleep mutex vm page (vm page) r = 0 (0x813f0a80) locked @ /usr/src/sys/vm/vm_object.c:1128
exclusive sleep mutex vm object (standard object) r = 0 (0xfe01458e43a0) locked @ /usr/src/sys/vm/vm_object.c:1076
shared sx vm map (user) (vm map (user)) r = 0 (0xfe015aec1388) locked @ /usr/src/sys/vm/vm_map.c:2045
Process 994 (nfsd) thread 0xfe015a0df000 (102426)
shared lockmgr zfs (zfs) r = 0 (0xfe0c3b505878) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 994 (nfsd) thread 0xfe015a0f8490 (102422)
exclusive lockmgr zfs (zfs) r = 0 (0xfe02db3b3e60) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 931 (syslogd) thread 0xfe015af18920 (102365)
shared lockmgr zfs (zfs) r = 0 (0xfe0141dd6680) locked @ /usr/src/sys/kern/vfs_syscalls.c:3591
Process 22 (syncer) thread 0xfe0125077000 (100279)
exclusive lockmgr syncer (syncer) r = 0 (0xfe015a2ff680) locked @ /usr/src/sys/kern/vfs_subr.c:1809

I don't have full "show lockedvnods" output because the output does not get captured by ddb after using "capture on", it doesn't fit on a single screen, and doesn't get piped into a "more" equivalent. What I did manage to get (copied by hand, typos possible) is:

0xfe0c3b5057e0: tag zfs, type VREG
    usecount 1, writecount 0, refcount 1 mountedhere 0
    flags (VI_ACTIVE)
    v_object 0xfe089bc1b828 ref 0 pages 0
    lock type zfs: SHARED (count 1)

0xfe02db3b3dc8: tag zfs, type VREG
    usecount 6, writecount 0, refcount 6 mountedhere 0
    flags (VI_ACTIVE)
    v_object 0xfe0b79583ae0 ref 0 pages 0
    lock type zfs: EXCL by thread 0xfe015a0f8490 (pid 994)
    with exclusive waiters pending

The output of show witness is at http://pastebin.com/eSRb3FEu

The output of alltrace is at http://pastebin.com/X1LruNrf (a number of threads are stuck in zio_wait, none I can find in zio_interrupt, and according to gstat and disks eventually going to sleep all disk IO seems to be stuck for good; I think Andriy explained earlier that these criteria might indicate this is a ZFS hang).

The output of show geom is at http://pastebin.com/6nwQbKr4

The output of vmstat -i is at http://pastebin.com/9LcZ7Mi0
Interrupts are occurring at a normal rate during the hang, as far as I can tell.

Any help would be greatly appreciated.
Thanks
Olivier

PS: my kernel was compiled from 9-STABLE from December, with CAM and ahci from 9.0 (in the hope it would fix the hangs I was experiencing in plain 9-STABLE; obviously the hangs are still occurring). The rest of my configuration is the same as posted earlier.

On Mon, Dec 24, 2012 at 9:42 PM, olivier wrote:

> Dear All
> It turns out that reverting to an older version of the mps driver did not
> fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all
> (they just took a bit longer to occur again, possibly just by chance). I
> followed steps along lines suggested by Andriy to collect more information
> when the problem occurs. Hopefully this will help figure out what's going
> on.
>
> As far as I can tell, what happens is that at some point IO operations to
> a bunch of drives that belong to different pools get stuck. For these
> drives, gstat shows no activity but 1 pending operation, as such:
>
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
> 1
Re: IPv6 Tunnel Shared With Jails via epair Devices
On Tue, Jan 15, 2013 at 2:54 PM, Ben Morrow wrote:

> Quoth Shawn Webb :
> > On Tue, Jan 15, 2013 at 12:29 AM, Ben Morrow wrote:
> > > Quoth Shawn Webb :
> > > >
> > > > # ifconfig bridge0
> > > > bridge0: flags=8843 metric 0 mtu 1500
> > > >     ether 02:fe:21:34:d3:00
> > > >     inet6 2001:470:8142:1::1 prefixlen 64
> > > >     nd6 options=21
> > > >     id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
> > > >     maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
> > > >     root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
> > > >     member: epair0a flags=143
> > > >         ifmaxaddr 0 port 19 priority 128 path cost 2000
> > > >     member: epair1a flags=143
> > > >         ifmaxaddr 0 port 21 priority 128 path cost 2000
> > > >     member: bge0 flags=143
> > > >         ifmaxaddr 0 port 5 priority 128 path cost 20
> > >
> > > Why have you added the physical interface to the bridge? AFAICT you
> > > don't need to: a bridge will bridge epairs just fine, and as you
> > > explained in that blog post you have to route rather than bridge into
> > > the tunnel, since the tunnel isn't an Ethernet device.
> >
> > I did it so that I have an IPv4 address directly on the LAN for each of my
> > jails.
>
> Hmm, OK.
>
> > > > # jexec "Dev Template" ifconfig epair0b
> > > > epair0b: flags=8843 metric 0 mtu 1500
> > > >     options=8
> > > >     ether 02:80:03:00:14:0b
> > > >     inet6 2001:470:8142:1::5 prefixlen 64 tentative
> > > >     inet6 fe80::80:3ff:fe00:140b%epair0b prefixlen 64 tentative scopeid 0x2
> > > >     inet 10.7.1.92 netmask 0xfe00 broadcast 10.7.1.255
> > > >     nd6 options=29
> > >
> > > I suspect the addresses are only marked tentative because the interface
> > > has been marked IFDISABLED. This causes all current addresses to be
> > > marked tentative, because the kernel isn't allowed to send or receive
> > > IPv6 packets and so can't defend the addresses any more.
> > >
> > > Is it possible something in the jail's startup scripts is causing the
> > > interface to be marked IFDISABLED after the inet6 address has been
> > > assigned? Some of the functions in network.subr mark interfaces
> > > IFDISABLED automatically if they don't think they have IPv6 addresses.
> >
> > I was thinking the same thing. One problem is that I can't remove the
> > IFDISABLED flag. This is what happens when I try:
> >
> > # jexec "Dev Template" ifconfig epair0b -ifdisabled
> > ifconfig: ioctl(SIOCGIFINFO_IN6): Invalid argument
>
>     ifconfig epair0b inet6 -ifdisabled
>
> I don't know why you get that error when you miss out the 'inet6'; it's
> not exactly very clear.

Ah. That works. I'll just have to add that to my scripts. Since the device won't come out of tentative mode without manually removing the ifdisabled flag, should I go ahead and file a PR? It'd be nice if I could at the very least set a timeout for DAD.

> Ben
Re: IPv6 Tunnel Shared With Jails via epair Devices
Somehow a typo ended up in the CC to freebsd-stable@freebsd.org. Last email below:

On Tue, Jan 15, 2013 at 5:53 PM, Shawn Webb wrote:

> On Tue, Jan 15, 2013 at 4:52 PM, Ben Morrow wrote:
>
>> Quoth Shawn Webb :
>> > On Tue, Jan 15, 2013 at 2:54 PM, Ben Morrow wrote:
>> > >
>> > >     ifconfig epair0b inet6 -ifdisabled
>> > >
>> > > I don't know why you get that error when you miss out the 'inet6'; it's
>> > > not exactly very clear.
>> >
>> > Ah. That works. I'll just have to add that to my scripts. Since the device
>> > won't come out of tentative mode without manually removing the ifdisabled
>> > flag, should I go ahead and file a PR? It'd be nice if I could at the very
>> > least set a timeout for DAD.
>>
>> DAD already has a timeout: it succeeds iff no packets indicating someone
>> else is using the address are received in a given time. The only reason
>> for an address remaining tentative indefinitely (without transitioning
>> to either valid or duplicated) is if IPv6 on that interface has been
>> disabled entirely by setting IFDISABLED. If DAD fails for the LL address
>> the interface is marked IFDISABLED but the LL address is marked
>> duplicated rather than tentative.
>
> I figured it out. In my jail initialization scripts, I'm running '/bin/sh
> /bin/rc' after doing initial network setup. The rc script puts the
> interface in IFDISABLED mode. So if I run the ifconfig command to remove
> the flag, I'm golden. I've committed and pushed the code that fixes the
> problem in my scripts. If you're curious, you can look at
> https://github.com/lattera/drupal-jailadmin/commit/cbf8509712c3dd237bbc020f49f63b51507b7be4
>
> Thanks for the help. I really appreciate it.
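Since an address that stays tentative is the visible symptom here, a script that clears IFDISABLED may want to confirm afterwards that nothing is still tentative. A minimal sketch, assuming the `inet6 ADDRESS prefixlen N [tentative]` line layout seen in the ifconfig output quoted in this thread; `tentative_addrs` is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# Print every inet6 address that ifconfig reports as "tentative".
# Reads ifconfig output on stdin; assumes the address is the second field.
tentative_addrs() {
    awk '$1 == "inet6" {
        for (i = 3; i <= NF; i++)
            if ($i == "tentative") { print $2; break }
    }'
}

# Sample lines taken from the epair0b output quoted in this thread.
printf '%s\n' \
    'inet6 2001:470:8142:1::5 prefixlen 64 tentative' \
    'inet6 fe80::80:3ff:fe00:140b%epair0b prefixlen 64 tentative scopeid 0x2' \
    'inet 10.7.1.92 netmask 0xfe00 broadcast 10.7.1.255' |
    tentative_addrs
```

On a live system the input would come from something like `jexec "Dev Template" ifconfig epair0b`, as used earlier in the thread; empty output would then mean no address is tentative.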
Re: IPv6 Tunnel Shared With Jails via epair Devices
At 5PM -0500 on 15/01/13 you (Shawn Webb) wrote:
>
> I figured it out. In my jail initialization scripts, I'm running '/bin/sh
> /bin/rc' after doing initial network setup. The rc script puts the
> interface in IFDISABLED mode. So if I run the ifconfig command to remove
> the flag, I'm golden.

Yes, that's what I thought. You should be able to avoid this by specifying either

    ifconfig_epair0b_ipv6="inet6 auto_linklocal"

or

    ipv6_activate_all_interfaces="YES"

in the jail's rc.conf. This is cleaner than running ifconfig explicitly outside the jail.

Ben
Re: CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
I don't know if this is relevant or not, but a deadlock was recently fixed in the VFS code:

http://svnweb.freebsd.org/base?view=revision&revision=244795

On Tue, Jan 15, 2013 at 12:55 PM, olivier wrote:
> Dear All,
> Still experiencing the same hangs I reported earlier with 9.1. I've been
> running a kernel with WITNESS enabled to provide more information.
>
> [...]
Re: CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
My understanding is that the locks (and pieces of kernel code) involved are different. Maybe someone more knowledgeable than I am can comment.
Thanks for the suggestion...
Olivier

On Tue, Jan 15, 2013 at 4:07 PM, Reed A. Cartwright wrote:
> I don't know if this is relevant or not, but a deadlock was recently
> fixed in the VFS code:
>
> http://svnweb.freebsd.org/base?view=revision&revision=244795
>
> On Tue, Jan 15, 2013 at 12:55 PM, olivier wrote:
> > Dear All,
> > Still experiencing the same hangs I reported earlier with 9.1. I've been
> > running a kernel with WITNESS enabled to provide more information.
> >
> > [...]