Kristof,

It’s from the 2nd situation. It is so weird. Last time there was ipsec code in 
the backtrace, which wasn’t used on the bridge+members.

This is from my own kernel config, but during testing with the GENERIC kernel I 
had similar backtraces at reboot.

I can’t do a lot right now, but I’m planning to:

- build kernel with -O0
- do the deletem of the epair manually
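
A minimal sketch of the manual removal, assuming the bridge is named bridge0 and the host-side end of the jail's epair is epair0a. Both names are placeholders, not taken from this system, and the `run`/`remove_epair` helpers are hypothetical:

```sh
#!/bin/sh
# Sketch only: bridge0 and epair0a are placeholder names (assumptions).
BRIDGE="${BRIDGE:-bridge0}"
EPAIR="${EPAIR:-epair0a}"

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_epair() {
    # Detach the epair from the bridge first (ifconfig(8) "deletem") ...
    run ifconfig "$BRIDGE" deletem "$EPAIR"
    # ... then destroy the interface.
    run ifconfig "$EPAIR" destroy
}
```

Setting DRY_RUN=1 before calling remove_epair only prints the two ifconfig commands, which makes it easy to check the interface names before running them as root.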

I’ll get back to you if I find something.

Peter

> On 23 Nov 2020, at 12:15, Kristof Provost <k...@freebsd.org> wrote:
> 
> Peter,
> 
> Is that backtrace from the first or the second situation you describe? What 
> kernel config are you using with that backtrace?
> 
> This backtrace does not appear to involve the bridge. Given that part of the 
> panic message is cut off it’s very hard to conclude anything at all from it.
> 
> Best regards,
> Kristof
> 
> On 23 Nov 2020, at 11:52, Peter Blok wrote:
> 
>> Kristof,
>> 
>> With commits 367705+367706 and if_bridge statically linked, it crashes while 
>> adding an epair of a jail.
>> 
>> With commits 367705+367706 and if_bridge dynamically loaded, there is a crash 
>> at reboot:
>> 
>> #0 0xffffffff8069ddc5 at kdb_backtrace+0x65
>> #1 0xffffffff80652c8b at vpanic+0x17b
>> #2 0xffffffff80652b03 at panic+0x43
>> #3 0xffffffff809c8951 at trap_fatal+0x391
>> #4 0xffffffff809c89af at trap_pfault+0x4f
>> #5 0xffffffff809c7ff6 at trap+0x286
>> #6 0xffffffff809a1ec8 at calltrap+0x8
>> #7 0xffffffff8079f7ed at ip_input+0x63d
>> #8 0xffffffff8077a07a at netisr_dispatch_src+0xca
>> #9 0xffffffff8075a6f8 at ether_demux+0x138
>> #10 0xffffffff8075b9bb at ether_nh_input+0x33b
>> #11 0xffffffff8077a07a at netisr_dispatch_src+0xca
>> #12 0xffffffff8075ab1b at ether_input+0x4b
>> #13 0xffffffff8077a80b at swi_net+0x12b
>> #14 0xffffffff8061e10c at ithread_loop+0x23c
>> #15 0xffffffff8061afbe at fork_exit+0x7e
>> #16 0xffffffff809a2efe at fork_trampoline+0xe
>> 
>> Peter
>> 
>>> On 21 Nov 2020, at 17:22, Peter Blok <pb...@bsd4all.org> wrote:
>>> 
>>> Kristof,
>>> 
>>> With a GENERIC kernel it does NOT happen. I do have a different iflib 
>>> related panic at reboot, but I’ll report that separately.
>>> 
>>> I brought the two config files closer together and found out that if I 
>>> remove if_bridge from the config file and have it loaded dynamically when 
>>> the bridge is created, the problem no longer happens and everything works 
>>> ok.
>>> 
>>> Peter
>>> 
>>>> On 20 Nov 2020, at 15:53, Kristof Provost <k...@freebsd.org> wrote:
>>>> 
>>>> I still can’t reproduce that panic.
>>>> 
>>>> Does it happen immediately after you start a vnet jail?
>>>> 
>>>> Does it also happen with a GENERIC kernel?
>>>> 
>>>> Regards,
>>>> Kristof
>>>> 
>>>> On 20 Nov 2020, at 14:53, Peter Blok wrote:
>>>> 
>>>>> The panic with IPsec code in the backtrace was already very strange. I 
>>>>> was using IPsec, but only on one interface totally separate from the 
>>>>> members of the bridge as well as the bridge itself. The jails were not 
>>>>> doing any IPsec either. Note that that panic was a while ago, after 
>>>>> the first bridge epochification was done on stable-12, which was later 
>>>>> backed out.
>>>>> 
>>>>> Today the system is no longer using IPsec, but it is still compiled in. I 
>>>>> can remove it if need be for a test.
>>>>> 
>>>>> 
>>>>> src.conf
>>>>> WITHOUT_KERBEROS=yes
>>>>> WITHOUT_GSSAPI=yes
>>>>> WITHOUT_SENDMAIL=true
>>>>> WITHOUT_MAILWRAPPER=true
>>>>> WITHOUT_DMAGENT=true
>>>>> WITHOUT_GAMES=true
>>>>> WITHOUT_IPFILTER=true
>>>>> WITHOUT_UNBOUND=true
>>>>> WITHOUT_PROFILE=true
>>>>> WITHOUT_ATM=true
>>>>> WITHOUT_BSNMP=true
>>>>> #WITHOUT_CROSS_COMPILER=true
>>>>> WITHOUT_DEBUG_FILES=true
>>>>> WITHOUT_DICT=true
>>>>> WITHOUT_FLOPPY=true
>>>>> WITHOUT_HTML=true
>>>>> WITHOUT_HYPERV=true
>>>>> WITHOUT_NDIS=true
>>>>> WITHOUT_NIS=true
>>>>> WITHOUT_PPP=true
>>>>> WITHOUT_TALK=true
>>>>> WITHOUT_TESTS=true
>>>>> WITHOUT_WIRELESS=true
>>>>> #WITHOUT_LIB32=true
>>>>> WITHOUT_LPR=true
>>>>> 
>>>>> make.conf
>>>>> KERNCONF=BHYVE
>>>>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>>>>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>>>>> OPTIONS_UNSET=DOCS NLS MANPAGES
>>>>> 
>>>>> BHYVE
>>>>> cpu               HAMMER
>>>>> ident             BHYVE
>>>>> 
>>>>> makeoptions       DEBUG=-g                # Build kernel with gdb(1) debug symbols
>>>>> makeoptions       WITH_CTF=1              # Run ctfconvert(1) for DTrace support
>>>>> 
>>>>> options           CAMDEBUG
>>>>> 
>>>>> options   SCHED_ULE               # ULE scheduler
>>>>> options   PREEMPTION              # Enable kernel thread preemption
>>>>> options   INET                    # InterNETworking
>>>>> options   INET6                   # IPv6 communications protocols
>>>>> options           IPSEC
>>>>> options   TCP_OFFLOAD             # TCP offload
>>>>> options           TCP_RFC7413             # TCP FASTOPEN
>>>>> options   SCTP                    # Stream Control Transmission Protocol
>>>>> options   FFS                     # Berkeley Fast Filesystem
>>>>> options   SOFTUPDATES             # Enable FFS soft updates support
>>>>> options   UFS_ACL                 # Support for access control lists
>>>>> options   UFS_DIRHASH             # Improve performance on big directories
>>>>> options   UFS_GJOURNAL            # Enable gjournal-based UFS journaling
>>>>> options   QUOTA                   # Enable disk quotas for UFS
>>>>> options           SUIDDIR
>>>>> options   NFSCL                   # Network Filesystem Client
>>>>> options   NFSD                    # Network Filesystem Server
>>>>> options   NFSLOCKD                # Network Lock Manager
>>>>> options   MSDOSFS                 # MSDOS Filesystem
>>>>> options   CD9660                  # ISO 9660 Filesystem
>>>>> options   FUSEFS
>>>>> options           NULLFS                  # NULL filesystem
>>>>> options           UNIONFS
>>>>> options           FDESCFS                 # File descriptor filesystem
>>>>> options   PROCFS                  # Process filesystem (requires PSEUDOFS)
>>>>> options   PSEUDOFS                # Pseudo-filesystem framework
>>>>> options   GEOM_PART_GPT           # GUID Partition Tables.
>>>>> options   GEOM_RAID               # Soft RAID functionality.
>>>>> options   GEOM_LABEL              # Provides labelization
>>>>> options   GEOM_ELI                # Disk encryption.
>>>>> options   COMPAT_FREEBSD32        # Compatible with i386 binaries
>>>>> options   COMPAT_FREEBSD4         # Compatible with FreeBSD4
>>>>> options   COMPAT_FREEBSD5         # Compatible with FreeBSD5
>>>>> options   COMPAT_FREEBSD6         # Compatible with FreeBSD6
>>>>> options   COMPAT_FREEBSD7         # Compatible with FreeBSD7
>>>>> options   COMPAT_FREEBSD9         # Compatible with FreeBSD9
>>>>> options   COMPAT_FREEBSD10        # Compatible with FreeBSD10
>>>>> options   COMPAT_FREEBSD11        # Compatible with FreeBSD11
>>>>> options   SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
>>>>> options   KTRACE                  # ktrace(1) support
>>>>> options   STACK                   # stack(9) support
>>>>> options   SYSVSHM                 # SYSV-style shared memory
>>>>> options   SYSVMSG                 # SYSV-style message queues
>>>>> options   SYSVSEM                 # SYSV-style semaphores
>>>>> options   _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>>>>> options   PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
>>>>> options   KBD_INSTALL_CDEV        # install a CDEV entry in /dev
>>>>> options   HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
>>>>> options   AUDIT                   # Security event auditing
>>>>> options   CAPABILITY_MODE         # Capsicum capability mode
>>>>> options   CAPABILITIES            # Capsicum capabilities
>>>>> options   MAC                     # TrustedBSD MAC Framework
>>>>> options   MAC_PORTACL
>>>>> options   MAC_NTPD
>>>>> options   KDTRACE_FRAME           # Ensure frames are compiled in
>>>>> options   KDTRACE_HOOKS           # Kernel DTrace hooks
>>>>> options   DDB_CTF                 # Kernel ELF linker loads CTF data
>>>>> options   INCLUDE_CONFIG_FILE     # Include this file in kernel
>>>>> 
>>>>> # Debugging support.  Always need this:
>>>>> options   KDB                     # Enable kernel debugger support.
>>>>> options   KDB_TRACE               # Print a stack trace for a panic.
>>>>> options   KDB_UNATTENDED
>>>>> 
>>>>> # Make an SMP-capable kernel by default
>>>>> options   SMP                     # Symmetric MultiProcessor Kernel
>>>>> options   EARLY_AP_STARTUP
>>>>> 
>>>>> # CPU frequency control
>>>>> device            cpufreq
>>>>> device            cpuctl
>>>>> device            coretemp
>>>>> 
>>>>> # Bus support.
>>>>> device            acpi
>>>>> options   ACPI_DMAR
>>>>> device            pci
>>>>> options           PCI_IOV                 # PCI SR-IOV support
>>>>> 
>>>>> device            iicbus
>>>>> device            iicbb
>>>>> 
>>>>> device            iic
>>>>> device            ic
>>>>> device            iicsmb
>>>>> 
>>>>> device            ichsmb
>>>>> device            smbus
>>>>> device            smb
>>>>> 
>>>>> #device           jedec_dimm
>>>>> 
>>>>> # ATA controllers
>>>>> device            ahci                    # AHCI-compatible SATA controllers
>>>>> device            mvs                     # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>>>>> 
>>>>> # SCSI Controllers
>>>>> device            mps                     # LSI-Logic MPT-Fusion 2
>>>>> 
>>>>> # ATA/SCSI peripherals
>>>>> device            scbus                   # SCSI bus (required for ATA/SCSI)
>>>>> device            da                      # Direct Access (disks)
>>>>> device            cd                      # CD
>>>>> device            pass                    # Passthrough device (direct ATA/SCSI access)
>>>>> device            ses                     # Enclosure Services (SES and SAF-TE)
>>>>> device            sg
>>>>> 
>>>>> device            cfiscsi
>>>>> device            ctl                     # CAM Target Layer
>>>>> device            iscsi
>>>>> 
>>>>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>>>>> device            atkbdc                  # AT keyboard controller
>>>>> device            atkbd                   # AT keyboard
>>>>> device            psm                     # PS/2 mouse
>>>>> 
>>>>> device            kbdmux                  # keyboard multiplexer
>>>>> 
>>>>> # vt is the new video console driver
>>>>> device            vt
>>>>> device            vt_vga
>>>>> device            vt_efifb
>>>>> 
>>>>> # Serial (COM) ports
>>>>> device            uart                    # Generic UART driver
>>>>> 
>>>>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>>>>> device            iflib
>>>>> device            em                      # Intel PRO/1000 Gigabit Ethernet Family
>>>>> device            ix                      # Intel PRO/10GbE PCIE PF Ethernet
>>>>> 
>>>>> # Network stack virtualization.
>>>>> options           VIMAGE
>>>>> 
>>>>> # Pseudo devices.
>>>>> device            crypto
>>>>> device            cryptodev
>>>>> device            loop                    # Network loopback
>>>>> device            random                  # Entropy device
>>>>> device            padlock_rng             # VIA Padlock RNG
>>>>> device            rdrand_rng              # Intel Bull Mountain RNG
>>>>> device            ipmi
>>>>> device            smbios
>>>>> device            vpd
>>>>> device            aesni                   # AES-NI OpenCrypto module
>>>>> device            ether                   # Ethernet support
>>>>> device            lagg
>>>>> device            vlan                    # 802.1Q VLAN support
>>>>> device            tuntap                  # Packet tunnel.
>>>>> device            md                      # Memory "disks"
>>>>> device            gif                     # IPv6 and IPv4 tunneling
>>>>> device            firmware                # firmware assist module
>>>>> 
>>>>> device            pf
>>>>> #device           pflog
>>>>> #device           pfsync
>>>>> 
>>>>> # The `bpf' device enables the Berkeley Packet Filter.
>>>>> # Be aware of the administrative consequences of enabling this!
>>>>> # Note that 'bpf' is required for DHCP.
>>>>> device            bpf                     # Berkeley packet filter
>>>>> 
>>>>> # The `epair' device implements a virtual back-to-back connected Ethernet
>>>>> # like interface pair.
>>>>> device            epair
>>>>> 
>>>>> # USB support
>>>>> options   USB_DEBUG               # enable debug msgs
>>>>> device            uhci                    # UHCI PCI->USB interface
>>>>> device            ohci                    # OHCI PCI->USB interface
>>>>> device            ehci                    # EHCI PCI->USB interface (USB 2.0)
>>>>> device            xhci                    # XHCI PCI->USB interface (USB 3.0)
>>>>> device            usb                     # USB Bus (required)
>>>>> device            uhid
>>>>> device            ukbd                    # Keyboard
>>>>> device            umass                   # Disks/Mass storage - Requires scbus and da
>>>>> device            ums
>>>>> 
>>>>> device            filemon
>>>>> 
>>>>> device            if_bridge
>>>>> 
>>>>>> On 20 Nov 2020, at 12:53, Kristof Provost <k...@freebsd.org> wrote:
>>>>>> 
>>>>>> Can you share your kernel config file (and src.conf / make.conf if they 
>>>>>> exist)?
>>>>>> 
>>>>>> This second panic is in the IPSec code. My current thinking is that your 
>>>>>> kernel config is triggering a bug that’s manifesting in multiple places, 
>>>>>> but not actually caused by those places.
>>>>>> 
>>>>>> I’d like to be able to reproduce it so we can debug it.
>>>>>> 
>>>>>> Best regards,
>>>>>> Kristof
>>>>>> 
>>>>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>>>>> Hi Kristof,
>>>>>>> 
>>>>>>> This is 12-stable. With the previous bridge epochification that was 
>>>>>>> backed out my config had a panic too.
>>>>>>> 
>>>>>>> I don’t have any local modifications. I did a clean rebuild after 
>>>>>>> removing /usr/obj/usr
>>>>>>> 
>>>>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and 
>>>>>>> nmdm.ko as modules. Everything else is statically linked. I have 
>>>>>>> removed all drivers not needed for the hardware at hand.
>>>>>>> 
>>>>>>> My bridge is between two vlans from the same trunk and the jail epair 
>>>>>>> devices as well as the bhyve tap devices.
>>>>>>> 
>>>>>>> The panic happens when the jails are starting.
>>>>>>> 
>>>>>>> I can try to narrow it down over the weekend and make the crash dump 
>>>>>>> available for analysis.
>>>>>>> 
>>>>>>> Previously I had the following crash with 363492
>>>>>>> 
>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>> 
>>>>>>> 
>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>> cpuid = 2; apic id = 02
>>>>>>> fault virtual address   = 0xffffffff00000410
>>>>>>> fault code              = supervisor read data, page not present
>>>>>>> instruction pointer     = 0x20:0xffffffff80692326
>>>>>>> stack pointer           = 0x28:0xfffffe00c06097b0
>>>>>>> frame pointer           = 0x28:0xfffffe00c06097f0
>>>>>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>>>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>> processor eflags        = resume, IOPL = 0
>>>>>>> current process         = 2030 (ifconfig)
>>>>>>> trap number             = 12
>>>>>>> panic: page fault
>>>>>>> cpuid = 2
>>>>>>> time = 1595683412
>>>>>>> KDB: stack backtrace:
>>>>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <k...@freebsd.org> wrote:
>>>>>>>> 
>>>>>>>> On 20 Nov 2020, at 11:18, peter.b...@bsd4all.org wrote:
>>>>>>>>> I’m afraid the last epoch fix for the bridge does not solve the problem 
>>>>>>>>> (or perhaps creates a new one).
>>>>>>>>> 
>>>>>>>> We’re talking about the stable/12 branch, right?
>>>>>>>> 
>>>>>>>>> This seems to happen when the jail epair is added to the bridge.
>>>>>>>>> 
>>>>>>>> There must be something more to it than that. I’ve run the bridge 
>>>>>>>> tests on stable/12 without issue, and this is a problem we didn’t see 
>>>>>>>> when the bridge epochification initially went into stable/12.
>>>>>>>> 
>>>>>>>> Do you have a custom kernel config? Other patches? What exact commands 
>>>>>>>> do you run to trigger the panic?
>>>>>>>> 
>>>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>>>> cpuid = 6; apic id = 06
>>>>>>>>> fault virtual address = 0xc10
>>>>>>>>> fault code            = supervisor read data, page not present
>>>>>>>>> instruction pointer   = 0x20:0xffffffff80695e76
>>>>>>>>> stack pointer         = 0x28:0xfffffe00bf14e6e0
>>>>>>>>> frame pointer         = 0x28:0xfffffe00bf14e720
>>>>>>>>> code segment          = base 0x0, limit 0xfffff, type 0x1b
>>>>>>>>>                       = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>>>> processor eflags      = resume, IOPL = 0
>>>>>>>>> current process               = 1686 (jail)
>>>>>>>>> trap number           = 12
>>>>>>>>> panic: page fault
>>>>>>>>> cpuid = 6
>>>>>>>>> time = 1605811310
>>>>>>>>> KDB: stack backtrace:
>>>>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>>>>> 
>>>>>>>> This panic is rather odd. This isn’t even the bridge code. This is 
>>>>>>>> during initial creation of the vnet. I don’t really see how this could 
>>>>>>>> even trigger panics.
>>>>>>>> That panic looks as if something corrupted the net_epoch_preempt, by 
>>>>>>>> overwriting the epoch->e_epoch. The bridge patches only access this 
>>>>>>>> variable through the well-established functions and macros. I see no 
>>>>>>>> obvious way that they could corrupt it.
>>>>>>>> 
>>>>>>>> Best regards,
>>>>>>>> Kristof
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> freebsd-stable@freebsd.org mailing list
>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>>>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>> 
