https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=284057

--- Comment #3 from Andriy Gapon <a...@freebsd.org> ---
From the data in comment #2 it appears that the faulting access to address
0xfffffe0212a00008 could have been an access through txcd = 0xfffffe02129f8000
just beyond the end of the array.
Given that vxcr_ndesc = 2048 and that the size of a descriptor
(vmxnet3_txcompdesc) is 16 bytes, the array occupies 32K bytes or 0x8000, so
it ends at 0xfffffe0212a00000.  The additional 8 bytes are the offset of the
gen and type bit-fields in the descriptor.

So, it appears that vxcr_next could have been equal to vxcr_ndesc "for a
moment".
In the crash dump it's zero, but the only explanation I can see for the
0xfffffe0212a00008 access is that vxcr_next had overflowed.

I suspect that the overflow could have resulted from a concurrent (or, perhaps,
recursive) execution of vmxnet3_isc_txd_credits_update.

Looking at thread 100073, I can see that it is possible for an RX thread to end
up in TXQ-related routines.
It seems that there is an execution path to vmxnet3_isc_txd_credits_update:
ifmp_ring_enqueue -> ifmp_ring_check_drainage -> r->can_drain() ==
iflib_txq_can_drain -> isc_txd_credits_update.
As far as I can see, that execution path does not take any locks.

Although thread 100073 (if_io_tqg_7) is for a different set of queues, it's
possible that if_io_tqg_1 could have made the same calls "a bit earlier".

If such concurrent execution is indeed possible then one thread could mess up
vxcr_next for the other thread.

