Hi Thomas, 

Thanks for your comments; my responses are below.

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, October 24, 2014 5:28 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
> 
> Hi Changchun,
> 
> 2014-10-24 16:38, Ouyang Changchun:
> > For zero copy, it need check whether RX descriptor num meets the least
> > requirement when using vector PMD Rx function, and give user more
> > hints if it fails to meet the least requirement.
> [...]
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -131,6 +131,10 @@
> >  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE: 128. */
> >  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE: 64.  */
> >
> > +#ifdef RTE_IXGBE_INC_VECTOR
> > +#define VPMD_RX_BURST         32
> > +#endif
> > +
> >  /* Get first 4 bytes in mbuf headroom. */
> >  #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
> >             + sizeof(struct rte_mbuf)))
> > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
> >             return -1;
> >     }
> >
> > +#ifdef RTE_IXGBE_INC_VECTOR
> > +   if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
> > +           RTE_LOG(INFO, VHOST_PORT,
> > +                   "The RX desc num: %d is too small for PMD to work\n"
> > +                   "properly, please enlarge it to bigger than %d if\n"
> > +                   "possible by the option: '--rx-desc-num <number>'\n"
> > +                   "One alternative is disabling RTE_IXGBE_INC_VECTOR\n"
> > +                   "in config file and rebuild the libraries.\n",
> > +                   num_rx_descriptor, VPMD_RX_BURST);
> > +           return -1;
> > +   }
> > +#endif
> > +
> >     return 0;
> >  }
> 
> I feel there is a design problem here.
> An application shouldn't have to care about the underlying driver.
> 

For most other applications, the descriptor numbers are set large enough (1024 or so),
so there is no need to check the descriptor number at the early stage of running.

But vhost zero copy (note that vhost one copy also uses 1024 descriptors) has a default
descriptor number of 32.
Why 32?
Because the vhost zero copy implementation (working as the backend) needs to support both
DPDK-based applications that use the PMD virtio-net driver and Linux legacy virtio-net
based applications.
In the Linux legacy virtio-net case, QEMU on one side hard-codes the total virtio
descriptor count to 256, while on the other side legacy virtio uses half of them for
virtio headers, so only the other half, i.e. 128 descriptors, is available as real buffers.
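
To put that budget in numbers, here is a purely illustrative recap of the arithmetic above
(the macro names are mine, not from the patch or QEMU):

/* Legacy virtio-net descriptor budget, as described above. */
#define QEMU_VRING_SIZE   256                       /* total, hard-coded in QEMU      */
#define VRING_HDR_DESC    (QEMU_VRING_SIZE / 2)     /* consumed by virtio headers     */
#define VRING_DATA_DESC   (QEMU_VRING_SIZE - VRING_HDR_DESC)   /* 128 real buffers    */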

In PMD mode, every HW descriptor needs its DMA address filled in at the RX initialization
stage, otherwise the RX process will likely hit exceptions.
Based on that, we have to use the really limited virtio buffers to fill the DMA addresses
of all HW descriptors; in other words, in the zero copy case the available virtio
descriptor count determines the total mbuf count and HW descriptor count.
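
As a minimal sketch of that constraint (function and parameter names are illustrative,
not from the vhost example; it only shows that the zero-copy mempool carved out of the
guest's virtio buffers is what backs the HW RX ring at setup time):

#include <rte_ethdev.h>
#include <rte_mempool.h>

/*
 * Illustrative only: one zero-copy RX queue.  "vpool" is assumed to be a
 * mempool built from the guest's usable virtio data buffers.  RX init
 * fills one DMA address per HW descriptor from this pool, so nb_rx_desc
 * cannot exceed what the pool can actually supply.
 */
static int
setup_zcp_rx_queue(uint8_t port_id, uint16_t queue_id, uint16_t nb_rx_desc,
        const struct rte_eth_rxconf *rx_conf, struct rte_mempool *vpool)
{
        return rte_eth_rx_queue_setup(port_id, queue_id, nb_rx_desc,
                        rte_eth_dev_socket_id(port_id), rx_conf, vpool);
}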

Tuning shows that 32 is the suitable value for vhost zero copy to work properly in the
legacy Linux virtio case.
Another factor that pushes the value down to 32 is that the mempool uses a ring to hold
the mbufs, which costs one entry to flag the ring head/tail, and there are some other
overheads such as the temporary mbufs (RX_BURST of them) used during RX.
Note that the descriptor number needs to be a power of 2.
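
A small sanity check capturing just those constraints (illustrative; the names and the
exact accounting are my assumptions, and the final value of 32 also comes from the tuning
mentioned above, not from this check alone):

#include <stdbool.h>
#include <stdint.h>

#define VRING_DATA_DESC   128   /* usable legacy-virtio buffers, see above        */
#define MAX_RX_BURST       32   /* temporary mbufs the RX path keeps around       */

/* Does a candidate RX descriptor count fit the zero-copy buffer budget? */
static bool
zcp_desc_num_fits(uint32_t nb_rx_desc)
{
        /* the descriptor number should be a power of 2 */
        bool is_pow2 = nb_rx_desc != 0 && (nb_rx_desc & (nb_rx_desc - 1)) == 0;

        /* one ring entry is lost to the mempool ring head/tail flag */
        return is_pow2 && nb_rx_desc + MAX_RX_BURST + 1 <= VRING_DATA_DESC;
}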

Why does the change occur at this moment?
Recently the default RX function was switched to the vector RX function, whereas
previously it used the non-vector (scalar) RX. The vector RX function needs more than 32
descriptors to work properly, while scalar RX doesn't have this limitation.
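
Roughly speaking, the vector path consumes and refills descriptors in fixed-size bursts,
so a ring of exactly one burst leaves no slack; the condition the patch checks is just
this (restated here for clarity, with VPMD_RX_BURST taken from the patch):

#define VPMD_RX_BURST 32   /* burst size of the vector RX path */

/* Vector RX needs strictly more descriptors than one burst;
 * scalar RX has no such minimum. */
static inline int
desc_num_ok_for_vec_rx(uint32_t nb_rx_desc)
{
        return nb_rx_desc > VPMD_RX_BURST;
}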

As the RX function is changeable (you can use vector or non-vector mode) and the
descriptor number can also be changed, the vhost app checks here whether they match so
that everything can work normally, and gives some hints if they don't match.

Hope the above makes it a bit clearer. :-)
Thanks again,
Best regards,
Changchun
