> On 19 May 2024, at 22:05, Anthony J. Bentley <bent...@openbsd.org> wrote:
>
> Vitaliy Makkoveev writes:
>>> On 17 May 2024, at 12:06, Stuart Henderson <s...@spacehopper.org> wrote:
>>>
>>> There are problems with wg(4) that people with some workloads have been
>>> seeing after upgrading past 7.3, though looking at this thread from when
>>> it last came up https://marc.info/?t=170940892700001&r=1&w=2 I'm not
>>> sure if we'd be expecting to see trouble on non-MP…
>>>
>>
>> We do. The problem is not MP related.
>>
>> Anthony, does the diff [1] help?
>>
>> 1. https://marc.info/?l=openbsd-bugs&m=170980835807159&w=2
>
> Crashes continue to occur with the same frequency after patching.
>
This could be a vio(4) bug. Please try this diff [1].

1. https://marc.info/?l=openbsd-tech&m=171588941332420&w=2
> Here are three more crashes from running with the patch. I've seen
> identical traces with and without the patch but these were not in
> my last email.
>
> kernel: page fault trap, code=0
> Stopped at schedclock+0x8a: movzbl 0x344(%rax),%r13d
> ddb> show panic
> the kernel did not panic
> ddb> trace
> schedclock(ffff8000fffeaa68) at schedclock+0x8a
> statclock(ffffffff82529bf8,ffff80001ca32a20,0) at statclock+0x129
> clockintr_dispatch(ffff80001ca32a20) at clockintr_dispatch+0x30d
> clockintr(ffff80001ca32a20) at clockintr+0x59
> intr_handler(ffff80001ca32a20,ffff8000000e6000) at intr_handler+0x3c
> Xintr_legacy0_untramp() at Xintr_legacy0_untramp+0x1a3
> memset() at memset+0x5c
> end trace frame: 0x0, count: -7
> ddb> ps
> PID TID PPID UID S FLAGS WAIT COMMAND
>
>
> panic: pr_find_pagehead: mbufpl: incorrect page
> Stopped at db_enter+0x14: popq %rbp
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> db_enter() at db_enter+0x14
> panic(ffffffff82161d70) at panic+0xb5
> pool_do_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_do_put+0x27a
> pool_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_put+0x53
> m_free(fffffd8028dbf600) at m_free+0xa6
> m_freem(fffffd8028dbf600) at m_freem+0x38
> vio_txeof(ffff800000064118) at vio_txeof+0x12d
> vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
> virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
> virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
> intr_handler(ffff80001ca7e7f0,ffff800000073e00) at intr_handler+0x3c
> Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
> memset() at memset+0x5c
> wg_encap_worker(ffff8000007ed000) at wg_encap_worker+0x79
> end trace frame: 0xffff80001ca7e9f0, count: 0
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb> trace
> db_enter() at db_enter+0x14
> panic(ffffffff82161d70) at panic+0xb5
> pool_do_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_do_put+0x27a
> pool_put(ffffffff8260b3c0,fffffd8028dbf600) at pool_put+0x53
> m_free(fffffd8028dbf600) at m_free+0xa6
> m_freem(fffffd8028dbf600) at m_freem+0x38
> vio_txeof(ffff800000064118) at vio_txeof+0x12d
> vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
> virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
> virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
> intr_handler(ffff80001ca7e7f0,ffff800000073e00) at intr_handler+0x3c
> Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
> memset() at memset+0x5c
> wg_encap_worker(ffff8000007ed000) at wg_encap_worker+0x79
> taskq_thread(ffff80000088ac00) at taskq_thread+0xf0
> end trace frame: 0x0, count: -15
> ddb> show panic
> *cpu0: pr_find_pagehead: mbufpl: incorrect page
> ddb> ps
> PID TID PPID UID S FLAGS WAIT COMMAND
> 56587 470184 85475 0 3 0x18000083 dtread btrace
> 58952 222967 0 89 3 0x19100092 kqread relayd
> 83190 101464 0 89 3 0x19100092 kqread relayd
> ddb> show registers
> rdi 0x4
> rsi 0x14
> rbp 0xffff80001ca7e4a0
> rbx 0xfffffd8028dbf600
> rdx 0x3fd
> rcx 0x4800000000000111
> rax 0x30
> r8 0x101010101010101
> r9 0
> r10 0x582c2a7821cc399f
> r11 0xf4834d1e02cdca10
> r12 0xfffffd8028dbf600
> r13 0xffff800000024800
> r14 0
> r15 0xffffffff82161d70 pp_r600_decoded_lanes+0xc8aa
> rip 0xffffffff81fa1d44 db_enter+0x14
> cs 0x8
> rflags 0x282
> rsp 0xffff80001ca7e4a0
> ss 0x10
> db_enter+0x14: popq %rbp
>
>
> panic: pr_find_pagehead: mbufpl: incorrect page
> Stopped at db_enter+0x14: popq %rbp
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *225925 73351 0 0x14000 0x200 0 wg_crypt
> db_enter() at db_enter+0x14
> panic(ffffffff82161d70) at panic+0xb5
> pool_do_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_do_put+0x27a
> pool_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_put+0x53
> m_free(fffffd8035fd9400) at m_free+0xa6
> m_freem(fffffd8035fd9400) at m_freem+0x38
> vio_txeof(ffff800000064118) at vio_txeof+0x12d
> vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
> virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
> virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
> intr_handler(ffff80001c922500,ffff800000073e00) at intr_handler+0x3c
> Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
> memset() at memset+0x5c
> wg_encap_worker(ffff8000007ef000) at wg_encap_worker+0x79
> end trace frame: 0xffff80001c922700, count: 0
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb> show panic
> *cpu0: pr_find_pagehead: mbufpl: incorrect page
> ddb> trace
> db_enter() at db_enter+0x14
> panic(ffffffff82161d70) at panic+0xb5
> pool_do_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_do_put+0x27a
> pool_put(ffffffff8260b3c0,fffffd8035fd9400) at pool_put+0x53
> m_free(fffffd8035fd9400) at m_free+0xa6
> m_freem(fffffd8035fd9400) at m_freem+0x38
> vio_txeof(ffff800000064118) at vio_txeof+0x12d
> vio_tx_intr(ffff800000064118) at vio_tx_intr+0x31
> virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102
> virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65
> intr_handler(ffff80001c922500,ffff800000073e00) at intr_handler+0x3c
> Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3
> memset() at memset+0x5c
> wg_encap_worker(ffff8000007ef000) at wg_encap_worker+0x79
> taskq_thread(ffff800000889080) at taskq_thread+0xf0
> end trace frame: 0x0, count: -15
> ddb> ps
> PID TID PPID UID S FLAGS WAIT COMMAND
> 51969 144614 37729 0 2 0x18000003 btrace
> 40841 474945 76353 1000 3 0x810008b sigsusp ksh
> 76353 455143 78366 1000 3 0x18000098 kqread sshd-session
> 78366 500790 60748 0 3 0x18000092 kqread sshd-session
> 1661 483333 93900 89 3 0x19100092 kqread relayd
> 20971 454162 93900 89 3 0x19100092 kqread relayd
> 66174 90602 93900 89 3 0x19100092 kqread relayd
> 48738 445549 93900 89 3 0x19100092 kqread relayd
> 88711 54303 93900 89 3 0x19100092 kqread relayd
> 33085 157864 93900 89 2 0x19100012 relayd
> 36613 263398 93900 89 3 0x19100092 relayd
> 93900 61929 1 0 3 0x18000080 kqread relayd
> 58569 410836 1 0 3 0x8100083 ttyin ksh
> 30102 428727 1 0 3 0x18100098 kqread cron
> *73351 225925 0 0 7 0x14200 wg_crypt
> 25707 237828 0 0 3 0x14200 bored wg_handshake
> 75251 422241 0 0 3 0x14200 bored wg_handshake
> 89402 219146 1 110 3 0x18100090 kqread sndiod
> 1652 116066 1 99 3 0x19100090 kqread sndiod
> 41636 131173 47944 95 3 0x19100092 kqread smtpd
> 56159 435661 47944 103 3 0x19100092 kqread smtpd
> 30864 263446 47944 95 3 0x18100092 kqread smtpd
> 64861 75991 47944 95 3 0x19100092 kqread smtpd
> 74399 157341 47944 95 3 0x19100092 kqread smtpd
> 47944 325461 1 0 3 0x18100080 kqread smtpd
> 60748 251840 1 0 3 0x18000088 kqread sshd
> 93282 26115 1 0 3 0x18100080 kqread ntpd
> 12262 492605 81276 83 3 0x18100092 kqread ntpd
> 81276 343918 1 83 2 0x19100492 ntpd
> 24416 419389 95291 74 3 0x19100092 bpf pflogd
> 95291 58348 1 0 3 0x18000080 sbwait pflogd
> 99456 71886 56811 73 3 0x19100090 kqread syslogd
> 57202 274926 82913 77 3 0x18100092 kqread dhcpleased
> 93609 415070 82913 77 3 0x18100092 kqread dhcpleased
> 82913 38615 1 0 3 0x18000080 kqread dhcpleased
> 39413 85502 22242 115 3 0x18100092 kqread slaacd
> 84235 356871 22242 115 3 0x18100092 kqread slaacd
> 22242 283359 1 0 3 0x18100080 kqread slaacd
> 53776 372278 0 0 3 0x14200 bored smr
> 16202 188026 0 0 3 0x14200 pgzero zerothread
> 40368 204141 0 0 3 0x14200 aiodoned aiodoned
> 18183 419428 0 0 3 0x14200 syncer update
> 79669 281449 0 0 3 0x14200 cleaner cleaner
> 80971 55573 0 0 3 0x14200 reaper reaper
> 88433 220842 0 0 3 0x14200 pgdaemon pagedaemon
> 34834 242944 0 0 3 0x14200 bored softnet3
> 28119 493362 0 0 3 0x14200 bored softnet2
> 41877 463150 0 0 3 0x14200 bored softnet1
> 16167 354819 0 0 3 0x14200 bored softnet0
> 93717 296304 0 0 3 0x14200 bored systqmp
> 45065 39416 0 0 3 0x14200 bored systq
> 46106 21722 0 0 3 0x40014200 tmoslp softclock
> 25869 146461 0 0 3 0x40014200 idle0
> 1 357659 0 0 3 0x8000082 wait init
> 0 0 -1 0 3 0x10200 scheduler swapper
> ddb> show registers
> rdi 0x4
> rsi 0x14
> rbp 0xffff80001c9221b0
> rbx 0xfffffd8035fd9400
> rdx 0x3fd
> rcx 0x4800000000000111
> rax 0x30
> r8 0x101010101010101
> r9 0
> r10 0x8dd14be7a93050dc
> r11 0xe3e5f94705a0c9e7
> r12 0xfffffd8035fd9400
> r13 0xffff800000024800
> r14 0
> r15 0xffffffff82161d70 pp_r600_decoded_lanes+0xc8aa
> rip 0xffffffff81fa1d44 db_enter+0x14
> cs 0x8
> rflags 0x286
> rsp 0xffff80001c9221b0
> ss 0x10
> db_enter+0x14: popq %rbp
>