On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote: > On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote: ... >> >> Before enabling anything by default, we should first optimize the 1 slot >> case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17% >> perf regression for 64 bytes case: >> - 2 descs per packet: 11.6Mpps >> - 1 desc per packet: 9.6Mpps >> >> This is due to the virtio header clearing in virtqueue_enqueue_xmit(). >> Removing it, we get better results than with 2 descs (1.20Mpps). >> Since the Virtio PMD doesn't support offloads, I wonder whether we can >> just drop the memset? > > What will happen? Will the header be uninitialized? Yes.. I didn't look closely at the spec, but just looked at DPDK's and Linux vhost implementations. IIUC, the header is just skipped in the two implementations. > > The spec says: > The driver can send a completely checksummed packet. In this case, flags > will be zero, and gso_type > will be VIRTIO_NET_HDR_GSO_NONE. > > and > The driver MUST set num_buffers to zero. > If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set flags to > zero and SHOULD supply a fully > checksummed packet to the device. > > and > If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options have been > negotiated, the driver MUST > set gso_type to VIRTIO_NET_HDR_GSO_NONE. > > so doing this unconditionally would be a spec violation, but if you see > value in this, we can add a feature bit. Right it would be a spec violation, so it should be done conditionally. If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER? It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set. If negotiated, we wouldn't need to prepend a header.
From the micro-benchmarks results, we can expect +10% compared to indirect descriptors, and + 5% compared to using 2 descs in the virtqueue. Also, it should have the same benefits as indirect descriptors for 0% pkt loss (as we can fill 2x more packets in the virtqueue). What do you think? Thanks, Maxime