On 19/11/2021 20:19, Andriy Gapon wrote:
Here is some data to demonstrate the issue:
$1 = (iflib_rxq_t) 0xfffffe00ea9f6200
(kgdb) p $1->ifr_frags[0]
$2 = {irf_flid = 0 '\000', irf_idx = 1799, irf_len = 118}
(kgdb) p $1->ifr_frags[1]
$3 = {irf_flid = 1 '\001', irf_idx = 674, irf_len = 0}
(kgdb) p $1->ifr_frags[2]
$4 = {irf_flid = 1 '\001', irf_idx = 675, irf_len = 0}
... elements 3..62 follow the same pattern ...
(kgdb) p $1->ifr_frags[63]
$6 = {irf_flid = 1 '\001', irf_idx = 736, irf_len = 0}
and then...
(kgdb) p $1->ifr_frags[64]
$7 = {irf_flid = 1 '\001', irf_idx = 737, irf_len = 0}
(kgdb) p $1->ifr_frags[65]
$8 = {irf_flid = 1 '\001', irf_idx = 738, irf_len = 0}
... the pattern continues ...
(kgdb) p $1->ifr_frags[70]
$10 = {irf_flid = 1 '\001', irf_idx = 743, irf_len = 0}
It seems like a start-of-packet completion descriptor referenced a descriptor in
command ring zero (and apparently it did not have the end-of-packet bit set). It
was then followed by another 70 zero-length completions referencing ring one, up
to the end-of-packet.
So, in total, 71 fragments were recorded.
Or it's possible that those zero-length fragments were left over from the
penultimate pkt_get call and ifr_frags[0] was obtained after that...
I think that this was indeed the case, and I was able to find the corresponding
descriptors in the completion ring.
Please see https://people.freebsd.org/~avg/vmxnet3-fragment-overrun.txt
$54 is the SOP, it has qid of 6.
It is followed by many fragments with qid 14 (there are 8 queues / queue sets)
and zero length.
But not all of them are zero length, some have length of 4096, e.g. $77, $86,
etc.
$124 is the last fragment; it has eop = 1 and error = 1.
So, there are 71 fragments in total.
So, it is clear that VMware produced 71 segments for a single packet before
giving up on it.
I wonder why it did that.
Perhaps it's a bug, perhaps it does not count zero-length segments against the
limit, maybe something else.
In any case, it happens.
Finally, the packet looks interesting: udp = 0, tcp = 0, ipcsum_ok = 0, ipv6 =
0, ipv4 = 0. I wonder what kind of packet it could be -- being rather large
and not an IP packet.
I am not sure how that could happen.
I am thinking about adding a sanity check for the number of fragments.
I am not sure yet what options there are for handling the overflow besides
panicking.
Also, some data from the vmxnet3's side of things:
(kgdb) p $15.vmx_rxq[6]
$18 = {vxrxq_sc = 0xfffff80002d9b800, vxrxq_id = 6, vxrxq_intr_idx = 6,
vxrxq_irq = {ii_res = 0xfffff80002f23e00, ii_rid = 7, ii_tag =
0xfffff80002f23d80}, vxrxq_cmd_ring = {{vxrxr_rxd = 0xfffffe00ead3c000,
vxrxr_ndesc = 2048,
vxrxr_gen = 0, vxrxr_paddr = 57917440, vxrxr_desc_skips = 1114,
vxrxr_refill_start = 1799}, {vxrxr_rxd = 0xfffffe00ead44000, vxrxr_ndesc = 2048,
vxrxr_gen = 0, vxrxr_paddr = 57950208, vxrxr_desc_skips = 121,
vxrxr_refill_start = 743}}, vxrxq_comp_ring = {vxcr_u = {txcd =
0xfffffe00ead2c000, rxcd = 0xfffffe00ead2c000}, vxcr_next = 0, vxcr_ndesc =
4096, vxcr_gen = 1, vxcr_paddr = 57851904, vxcr_zero_length = 1044,
vxcr_pkt_errors = 128}, vxrxq_rs = 0xfffff80002d78e00, vxrxq_sysctl =
0xfffff80004308080, vxrxq_name = "vmx0-rx6\000\000\000\000\000\000\000"}
The vxrxr_refill_start values are consistent with what is seen in ifr_frags[].
vxcr_zero_length and vxcr_pkt_errors are both non-zero, so maybe something got
the driver into a confused state, or the emulated hardware itself became
confused.
--
Andriy Gapon