> On 2 Jan 2024, at 20:16, Alexander Bluhm <alexander.bl...@gmx.net> wrote:
> 
> On Tue, Jan 02, 2024 at 03:45:10PM +0000, Stuart Henderson wrote:
>> panic: rw_enter: netlock locking against myself
>> Stopped at   db_enter+0x14:  popq    %rbp
>>    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>> *388963  11506    755         0x2          0    0  snmpbulkwalk
>> 320140  61046      0     0x14000      0x200    1  reaper
>> db_enter() at db_enter+0x14
>> panic(ffffffff820ebaaa) at panic+0xc3
>> rw_enter(ffffffff824bd2d0,2) at rw_enter+0x252
>> hv_kvp(ffffffff8248d738) at hv_kvp+0x705
>> hv_event_intr(ffff80000008d000) at hv_event_intr+0x1ab
>> hv_intr() at hv_intr+0x1a
>> Xresume_hyperv_upcall() at Xresume_hyperv_upcall+0x27
>> Xspllower() at Xspllower+0x1d
>> task_add(ffff800000036200,ffff8000001533c0) at task_add+0x83
>> if_enqueue_ifq(ffff800000153060,fffffd80f6642900) at if_enqueue_ifq+0x6f
>> ether_output(ffff800000153060,fffffd80f6642900,fffffd80c8ef5c78,fffffd811e6414d8)
>>  at ether_output+0x82
>> if_output_tso(ffff800000153060,ffff800025395d08,fffffd80c8ef5c78,fffffd811e6414d8,5dc)
>>  at if_output_tso+0xe1
>> ip_output(fffffd80f6642900,0,fffffd80c8ef5c68,0,0,fffffd80c8ef5bf0,1da98542b9d8f733)
>>  at ip_output+0x817
>> udp_output(fffffd80c8ef5bf0,fffffd80b468b400,fffffd80ac452c00,0) at 
>> udp_output+0x3be
>> end trace frame: 0xffff800025395ee0, count: 0
>> https://www.openbsd.org/ddb.html describes the minimum info required in bug
>> reports.  Insufficient info makes it difficult to find and fix bugs.
>> ddb{0}> Stopped at   x86_ipi_db+0x16:        leave
>> x86_ipi_db(ffff80002250dff0) at x86_ipi_db+0x16
>> x86_ipi_handler() at x86_ipi_handler+0x80
>> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x27
>> _kernel_lock() at _kernel_lock+0xc2
>> reaper(ffff80002473c008) at reaper+0x133
>> end trace frame: 0x0, count: 10
> 
> It is caused by this commit.
> 
> ----------------------------
> revision 1.20
> date: 2023/09/23 13:01:12;  author: mvs;  state: Exp;  lines: +5 -9;  
> commitid:
> Gj5WdkySyube1RkO;
> Use shared netlock to protect if_list and ifa_list walkthrough and ifnet
> data access within kvp_get_ip_info().
> 
> ok bluhm
> ----------------------------
> 
> Net lock must not be acquired in interrupt context.  It is wrong
> that hv tries to perform network operations from interrupt context.
> Actually it does not do much networking, it is just reads some
> interface addresses.  The kernel lock that was used before is also
> questionable.  Maybe some writers do not hold it anymore.  There
> was much conversion from kernel to net lock.
> 
> bluhm

This is not interrupt context. tsleep_nsec() points that. The
problem lays in Xresume_hyperv_upcall() which can run this
from interrupt context.

hv_wait(struct hv_softc *sc, int (*cond)(struct hv_softc *, struct hv_msg *),
    struct hv_msg *msg, void *wchan, const char *wmsg)
{
        int s;

        KASSERT(cold ? msg->msg_flags & MSGF_NOSLEEP : 1);

        while (!cond(sc, msg)) {
                if (msg->msg_flags & MSGF_NOSLEEP) {
                        delay(1000);
                        s = splnet();
                        hv_intr();
                        splx(s);
                } else {
                        tsleep_nsec(wchan, PRIBIO, wmsg ? wmsg : "hvwait",
                            USEC_TO_NSEC(1000));
                }
        }
}

Reply via email to