On 8.2.2023. 8:53, Alexandr Nedvedicky wrote:
> Hello,
> 
> On Tue, Feb 07, 2023 at 09:12:38PM +0100, Hrvoje Popovski wrote:
> </snip>
>>
>> Hi,
>>
>> this panic is with plain snapshot and I didn't do anything. I will leave
>> box in ddb if something else is needed.
>>
>     It does not look like there is more data to gather in ddb.
>     Maybe I'm too quick in my judgment. This is the relevant part
>     of the pfsync_bulk_update() function:
> 2456         int i = 0;
>               /* `i` seems to be kept in %r12 */
> 2457
> 2458         NET_LOCK();
> 2459         sc = pfsyncif;
> 2460         if (sc == NULL)
> 2461                 goto out;
> 2462
> 2463         rw_enter_read(&pf_state_list.pfs_rwl);
> 2464         st = sc->sc_bulk_next;
>               /* `st` is kept in %r15 */
> 2465         sc->sc_bulk_next = NULL;
> 2466
> 2467         for (;;) {
> 2468                 if (st->sync_state == PFSYNC_S_NONE &&
> 2469                     st->timeout < PFTM_MAX &&
> 2470                     st->pfsync_time <= sc->sc_ureq_received) {
> 2471                         pfsync_update_state_req(st);
> 2472                         i++;
> 2473                 }
> 
> 
> 
> 
>> ddb{0}> dmesg
>> OpenBSD 7.2-current (GENERIC.MP) #1021: Sun Feb  5 09:52:50 MST 2023
>>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>>
>>
>> r620-2# uvm_fault(0xffffffff824fb2f8, 0x14e, 0, 1) -> e
>> kernel: page fault trap, code=0
>> Stopped at      pfsync_bulk_update+0x60:        cmpb    $0xff,0x14e(%r15)
>>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>> *109809  58944      0     0x14000 0x40000200    0K softclock
>> pfsync_bulk_update(0) at pfsync_bulk_update+0x60
>     we seem to be dying at line 2468 due to a NULL pointer dereference
> 
>> softclock_thread(ffff8000fffff050) at softclock_thread+0x13b
>> end trace frame: 0x0, count: 13
>> https://www.openbsd.org/ddb.html describes the minimum info required in
>> bug reports.  Insufficient info makes it difficult to find and fix bugs.
>> ddb{0}>
>>
> </snip>
> 
>> r11               0xfbec2dfc846efdb5
>> r12                                0
>> r13               0xffffffff82503f80    timeout_proc
>> r14               0xffff8000009d8000
>> r15                                0
>> rip               0xffffffff8101aea0    pfsync_bulk_update+0x60
>     r12 (`i`) is 0, which suggests the loop is most likely in its first
>     iteration.
>     r15 (`st`) is 0 ... so it looks like a trivial bug: we try to send
>     a bulk update but there is nothing to send. This makes me wonder if the
>     diff below makes your test box more stable.
> 
> 
> can you give the diff below a try?
> 
> thanks a lot for your help
> 
> regards
> sashan

Hi,

with this diff I can't trigger the panic as before. I've been trying the
whole day and I should have seen a panic or two by now, but there isn't any ...

Thank you...
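
For readers of the archive: the register dump fits the analysis quoted above.
%r15 (`st`) is 0 and the faulting instruction reads 0x14e(%r15), which matches
the fault address 0x14e reported by uvm_fault(). The diff itself is not
reproduced in this message, so what follows is only a minimal sketch of the
kind of guard that avoids such a fault, not the actual change. The struct
layouts and the name bulk_update_sketch() are hypothetical stand-ins, not the
real pf/pfsync definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real pf/pfsync structures. */
struct state {
	int		 sync_state;	/* 0 plays the role of PFSYNC_S_NONE */
	struct state	*next;
};

struct softc {
	struct state	*bulk_next;	/* saved cursor; NULL if nothing queued */
};

/*
 * Walk the list from the saved cursor and count the states that would
 * be sent. Returns 0 without dereferencing anything if the cursor is NULL.
 */
int
bulk_update_sketch(struct softc *sc)
{
	struct state *st;
	int i = 0;

	if (sc == NULL)
		return 0;

	st = sc->bulk_next;
	sc->bulk_next = NULL;

	/* The guard: without it, st->sync_state below faults when st is NULL. */
	if (st == NULL)
		return 0;

	for (;;) {
		if (st->sync_state == 0)
			i++;
		st = st->next;
		if (st == NULL)
			break;
	}
	return i;
}
```

With an empty cursor the function returns before the loop, which is the case
the panic hit on the first iteration.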

