>Number:         172166
>Category:       kern
>Synopsis:       Deadlock in the networking code, possibly due to a bug in SCHED_ULE
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 29 17:30:01 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Eugene Grosbein
>Release:        FreeBSD 8.3-STABLE amd64
>Organization:   RDTC JSC
>Environment:
System: FreeBSD 8.3-STABLE/amd64, six-core Intel X5675 CPU (hyperthreading disabled).
>Description:
I run a pretty busy mpd-5.6 PPPoE server based on FreeBSD 8.3-STABLE/amd64.
It serves hundreds (and sometimes over a thousand) of simultaneous connections
with a high connect/disconnect rate, and it sends its logs to a remote syslog
collector over the network. It also makes heavy use of ipfw tables for
dummynet shaping: every newly connected client obtains an IP address, and mpd
adds this address to some of the ipfw tables. Upon disconnection, mpd removes
the address from the tables.

Today my server deadlocked for the second time in two months: all of its
network activity was blocked, even lagg's LACP frames. The kernel and userland
were otherwise fine, and I managed to log in using the IP KVM console. I
entered KDB, ran 'call doadump', obtained a crashdump and rebooted the box.

I dug into it a little. It seems syslogd(8) was preempted by the scheduler in
the middle of the ipfw_lookup_table()/rn_match() sequence while holding the
reader lock of the "IPFW static rules" rwlock, and never got back. Hence all
network activity broke, as ipfw needs the writer lock of "IPFW static rules".

Here is the backtrace of syslogd's kernel thread:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
KDB: enter: Break to debugger
Dumping 1135 out of 4079 MB:..2%..12%..22%..32%..41%..51%..61%..71%..81%..91%

Error while mapping shared library sections:
/boot/kernel/nfsclient.ko: No such file or directory.
Error while mapping shared library sections:
/boot/kernel/nfslock.ko: No such file or directory.
Error while mapping shared library sections:
/boot/kernel/nfs_common.ko: No such file or directory.
Error while mapping shared library sections:
/boot/kernel/krpc.ko: No such file or directory.
Reading symbols from /boot/kernel/ipmi.ko...done.
Loaded symbols for /boot/kernel/ipmi.ko
Error while reading shared library symbols:
/boot/kernel/nfsclient.ko: No such file or directory.
Error while reading shared library symbols:
/boot/kernel/nfslock.ko: No such file or directory.
Error while reading shared library symbols:
/boot/kernel/nfs_common.ko: No such file or directory.
Error while reading shared library symbols:
/boot/kernel/krpc.ko: No such file or directory.
#0  doadump () at /home/src/sys/kern/kern_shutdown.c:268
268             if (textdump_pending)
(kgdb) thread 134
[Switching to thread 134 (Thread 100201)]#0  sched_switch (
    td=0xffffff0004e00470, newtd=0xffffff0001b96470,
    flags=Variable "flags" is not available.
) at /home/src/sys/kern/sched_ule.c:1892
1892            cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xffffff0004e00470, newtd=0xffffff0001b96470,
    flags=Variable "flags" is not available.
) at /home/src/sys/kern/sched_ule.c:1892
#1  0xffffffff80305c96 in mi_switch (flags=1538, newtd=0x0)
    at /home/src/sys/kern/kern_synch.c:466
#2  0xffffffff803048f5 in critical_exit ()
    at /home/src/sys/kern/kern_switch.c:212
#3  0xffffffff802d5733 in intr_event_handle (ie=0xffffff0001b61100,
    frame=0xffffff81254122c0) at /home/src/sys/kern/kern_intr.c:1424
#4  0xffffffff804de4ff in intr_execute_handlers (isrc=0xffffff0001b82a00,
    frame=0xffffff81254122c0) at /home/src/sys/amd64/amd64/intr_machdep.c:260
#5  0xffffffff804e2287 in lapic_handle_intr (
    vector=Variable "vector" is not available.
) at /home/src/sys/x86/x86/local_apic.c:771
#6  0xffffffff804db2b5 in Xapic_isr1 () at apic_vector.S:86
#7  0xffffffff803cdab3 in rn_match (v_arg=0xffffff81254123d0,
    head=Variable "head" is not available.
) at /home/src/sys/net/radix.c:352
#8  0xffffffff8040c03b in ipfw_lookup_table (
    ch=Variable "ch" is not available.
) at /home/src/sys/netinet/ipfw/ip_fw_table.c:538
#9  0xffffffff80405e1f in ipfw_chk (args=0xffffff81254125a0)
    at /home/src/sys/netinet/ipfw/ip_fw2.c:1429
#10 0xffffffff80409d7a in ipfw_check_hook (
    arg=Variable "arg" is not available.
) at /home/src/sys/netinet/ipfw/ip_fw_pfil.c:137
#11 0xffffffff803cd46c in pfil_run_hooks (
    ph=Variable "ph" is not available.
) at /home/src/sys/net/pfil.c:82
#12 0xffffffff804127ca in ip_output (m=0xffffff00bdb5b000,
    opt=Variable "opt" is not available.
) at /home/src/sys/netinet/ip_output.c:511
#13 0xffffffff8042c115 in udp_send (
    so=Variable "so" is not available.
) at /home/src/sys/netinet/udp_usrreq.c:1249
#14 0xffffffff803674cb in sosend_dgram (so=0xffffff0004e79550,
    addr=0xffffff010c529060, uio=0xffffff8125412a00, top=0xffffff00bdb5b000,
    control=0x0, flags=0, td=0xffffff0004e00470)
    at /home/src/sys/kern/uipc_socket.c:1107
#15 0xffffffff8036b7e2 in kern_sendit (td=0xffffff0004e00470, s=7,
    mp=0xffffff8125412ad0, flags=0, control=0x0, segflg=UIO_USERSPACE)
    at /home/src/sys/kern/uipc_syscalls.c:785
#16 0xffffffff8036ba6c in sendit (td=0xffffff0004e00470, s=7,
    mp=0xffffff8125412ad0, flags=0) at /home/src/sys/kern/uipc_syscalls.c:717
#17 0xffffffff8036bb5d in sendto (
    td=Variable "td" is not available.
) at /home/src/sys/kern/uipc_syscalls.c:837
#18 0xffffffff804f3554 in amd64_syscall (td=0xffffff0004e00470, traced=0)
    at subr_syscall.c:114
#19 0xffffffff804daefc in Xfast_syscall ()
    at /home/src/sys/amd64/amd64/exception.S:387
#20 0x000000080082be3c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Backtraces of all kernel threads are available here:
http://www.grosbein.net/crash/20120929/misc.tar.xz

The kernel crashdump is also available at
http://www.grosbein.net/crash/20120929/

Here is the kernel config file:

cpu             HAMMER
ident           PPPOE

# To statically compile in device wiring instead of /boot/device.hints
#hints          "GENERIC.hints"         # Default places to look for devices.

# Use the following to compile in values accessible to the kernel
# through getenv() (or kenv(1) in userland). The format of the file
# is 'variable=value', see kenv(1)
#
# env           "GENERIC.env"

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
#options        SCTP                    # Stream Control Transmission Protocol
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
#options        NFSCLIENT               # Network Filesystem Client
#options        NFSSERVER               # Network Filesystem Server
#options        NFSLOCKD                # Network Lock Manager
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
#options        MSDOSFS                 # MSDOS Filesystem
#options        CD9660                  # ISO 9660 Filesystem
#options        PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
#options        GEOM_JOURNAL
options         COMPAT_43TTY            # BSD 4.3 TTY compat (sgtty)
options         COMPAT_FREEBSD32        # Compatible with i386 binaries
#options        COMPAT_FREEBSD4         # Compatible with FreeBSD4
#options        COMPAT_FREEBSD5         # Compatible with FreeBSD5
#options        COMPAT_FREEBSD6         # Compatible with FreeBSD6
#options        COMPAT_FREEBSD7         # Compatible with FreeBSD7
#options        SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         STACK                   # stack(9) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         P1003_1B_SEMAPHORES     # POSIX-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         PRINTF_BUFR_SIZE=512    # Prevent printf output being interspersed.
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
options         AUDIT                   # Security event auditing
options         MAC                     # TrustedBSD MAC Framework
#options        FLOWTABLE               # per-cpu routing cache
#options        KDTRACE_FRAME           # Ensure frames are compiled in
#options        KDTRACE_HOOKS           # Kernel DTrace hooks
options         INCLUDE_CONFIG_FILE     # Include this file in kernel

# Make an SMP-capable kernel by default
options         SMP                     # Symmetric MultiProcessor Kernel

# CPU frequency control
device          cpufreq

# Bus support.
device          acpi
device          pci

# Floppy drives
#device         fdc

# ATA and ATAPI devices
device          ata
device          atadisk                 # ATA disk drives
device          atapicd                 # ATAPI CDROM drives

# SCSI peripherals
device          scbus                   # SCSI bus (required for SCSI)
device          da                      # Direct Access (disks)
device          cd                      # CD
device          pass                    # Passthrough device (direct SCSI access)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse

device          kbdmux                  # keyboard multiplexer

device          vga                     # VGA video card driver

# syscons is the default console driver, resembling an SCO console
device          sc

# Serial (COM) ports
device          uart                    # Generic UART driver

# PCI Ethernet NICs.
device          em                      # Intel PRO/1000 Gigabit Ethernet Family
device          igb

# Pseudo devices.
device          loop                    # Network loopback
device          random                  # Entropy device
device          ether                   # Ethernet support
device          vlan                    # 802.1Q VLAN support
device          pty                     # BSD-style compatibility pseudo ttys
device          md                      # Memory "disks"
device          gif                     # IPv6 and IPv4 tunneling
device          faith                   # IPv6-to-IPv4 relaying (translation)
device          firmware                # firmware assist module
device          snp

device          bpf                     # Berkeley packet filter

# USB support
#options        USB_DEBUG               # enable debug msgs
#options        USB_VERBOSE
device          uhci                    # UHCI PCI->USB interface
device          ehci                    # EHCI PCI->USB interface (USB 2.0)
device          usb                     # USB Bus (required)
device          ukbd                    # Keyboard
device          umass                   # Disks/Mass storage - Requires scbus and da
device          ums                     # Mouse

device          ucom
# USB support for Prolific PL-2303 serial adapters
device          uplcom
# USB support for Silicon Laboratories CP2101/CP2102 based USB serial adapters
device          uslcom

#options        IPSEC
#device         crypto

options         NETGRAPH
options         NETGRAPH_ETHER
options         NETGRAPH_IFACE
options         NETGRAPH_MPPC_ENCRYPTION
options         NETGRAPH_PPP
options         NETGRAPH_PPPOE
options         NETGRAPH_SOCKET
options         NETGRAPH_TCPMSS
options         NETGRAPH_TEE
options         NETGRAPH_VJC

options         IPFIREWALL
options         IPFIREWALL_FORWARD
options         DUMMYNET

options         VFS_AIO

device          smbus
device          smb
device          ichsmb
device          iicbus
device          iicbb
device          ic
device          iic
device          iicsmb

device          coretemp
device          ichwd
device          nvram
device          lagg

options         KDB
options         KDB_TRACE
options         KDB_UNATTENDED
options         DDB
options         DDB_NUMSYM
#options        NETGRAPH_DEBUG
#options        INVARIANT_SUPPORT
#options        INVARIANTS
#options        DEBUG_MEMGUARD
#options        BREAK_TO_DEBUGGER
options         ALT_BREAK_TO_DEBUGGER

device          bridge
>How-To-Repeat:
Unknown. The problem occurs very seldom, but today was the second time.
>Fix:
Unknown to me.
>Release-Note:
>Audit-Trail:
>Unformatted:
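
To illustrate the stall suspected in the description, here is a minimal
userland sketch. It is not the kernel's rwlock(9) code and not a reproduction
of the bug. It is only a toy state machine, and it assumes the lock prefers
writers, so that a queued writer keeps new readers out. The class
RWLockModel, the function simulate_stall() and the thread labels
"ipfw-update" and "netisr" are hypothetical names invented for this
illustration; only "syslogd" comes from the backtrace.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class RWLockModel:
    """Toy model of a writer-preferring reader/writer lock (an assumption,
    not FreeBSD's rwlock(9) implementation)."""
    readers: Set[str] = field(default_factory=set)        # current read holders
    writer: Optional[str] = None                          # current write holder
    write_waiters: List[str] = field(default_factory=list)

    def try_rlock(self, thread: str) -> bool:
        # A new reader gets in only if nobody writes and no writer is queued.
        if self.writer is None and not self.write_waiters:
            self.readers.add(thread)
            return True
        return False

    def try_wlock(self, thread: str) -> bool:
        # A writer needs the lock completely free; otherwise it queues up.
        if self.writer is None and not self.readers:
            self.writer = thread
            if thread in self.write_waiters:
                self.write_waiters.remove(thread)
            return True
        if thread not in self.write_waiters:
            self.write_waiters.append(thread)
        return False

    def runlock(self, thread: str) -> None:
        self.readers.discard(thread)


def simulate_stall() -> Tuple[RWLockModel, List[Tuple[str, bool]]]:
    """Replay the order of events suspected in the report."""
    lock = RWLockModel()
    events = []
    # syslogd takes the read lock on its way through ipfw_chk() ...
    events.append(("syslogd rlock", lock.try_rlock("syslogd")))
    # ... and is preempted here; in the reported hang it never runs again.
    # A table update (hypothetical label) now asks for the write lock.
    events.append(("ipfw-update wlock", lock.try_wlock("ipfw-update")))
    # Every later packet needs the read lock and queues behind the writer.
    events.append(("netisr rlock", lock.try_rlock("netisr")))
    return lock, events


if __name__ == "__main__":
    _, events = simulate_stall()
    for name, ok in events:
        print(f"{name}: {'acquired' if ok else 'blocked'}")
```

In this model nothing makes progress until the preempted reader runs again
and drops its read lock, which would match what I observed if the assumption
about writer preference holds for the "IPFW static rules" lock.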