Re: [ovs-dev] [PATCH v5 1/1] netdev-dpdk: add dpdk vhost ports

Pravin Shelar Mon, 22 Dec 2014 10:02:47 -0800

On Sun, Dec 21, 2014 at 9:31 AM, Traynor, Kevin <kevin.tray...@intel.com> wrote:
> Hi,
>
> I'd like to get some feedback about using RCU on the virtio_dev structure as 
> per the comment below. Comments inline.
>
> Thanks,
> Kevin.
>
>> -----Original Message-----
>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Pravin Shelar
>> Sent: Tuesday, October 21, 2014 12:00 AM
>> To: Tahhan, Maryam
>> Cc: dev@openvswitch.org
>> Subject: Re: [ovs-dev] [PATCH v5 1/1] netdev-dpdk: add dpdk vhost ports
>>
>> On Mon, Sep 29, 2014 at 10:10 AM, maryam.tahhan <maryam.tah...@intel.com> 
>> wrote:
>> > This patch implements the vhost-net offload API.  It adds support for
>> > a new port type to userspace datapath called dpdkvhost. This allows KVM
>> > (QEMU) to offload the servicing of virtio-net devices to it's associated
>> > dpdkvhost port. Instructions for use are in INSTALL.DPDK.
>> >
>> > This has been tested on Intel multi-core platforms and with clients that
>> > have virtio-net interfaces.
>> >
>> > ver 5:
>> >   - rebased against latest master
>> > ver 4:
>> >   - added eventfd_link.h and eventfd_link.c to EXTRA_DIST in 
>> > utilities/automake.mk
>> >   - rebased with master to work with DPDK 1.7
>> > ver 3:
>> >   - rebased with master
>> > ver 2:
>> >   - rebased with master
>> >
>> > Signed-off-by: maryam.tahhan <maryam.tah...@intel.com>
>> > ---
>> Thanks for the patch, I have following comments.
>> - NON_PMD_THREAD_TX_QUEUE is not used in the patch.
>> - why do we need to limit MAX_BASENAME_SZ to 12
>> - dev_basename[] should be configurable at run time or it should be
>> vswitchd parameter.
>> - any reason for liming MAX_PKT_BURST to 32?
>> - netdev_dpdk->type should be a enum type with DPDK and VHOST members.
>> I am not sure if you really need type member in netdev_dpdk. you can
>> just define separate device ops for DPDK and VHOST of you was to do
>> special processing for each device.
>> - there is no check on gpa_to_vva() return value.
>> - in function virtio_dev_rx() variables can be defined in local block
>> rather than in function block. this simplifies code reading.
>> - in function virtio_dev_rx() virtio_hdr is always zero, so it can be
>> static variable rather than on stack.
>> - in function virtio_dev_rx() virtio_hdr should be copied before packet data.
>> - in function virtio_dev_rx() can we prefetch next vq->desc before
>> coping packet data?
>> - ovs-mutex is bit heavy weight, can you use rte-spin-lock
>> - can you reverse name of virtio_dev_rx(), virtio_dev_tx(), since this
>> is ovs-netdev code we could use name same as ovs context.
>> - virtio_dev_tx() is always called from PMD thread, so no need to the check.
>> - there is no synchronization for netdev_dpdk->rx_count and tx_count.
>> - destroy_device() can use RCU based mechanism rather than polling packet 
>> count.
>> - Currently new_device() assigns post in linear fashion. How does
>> handle multiple bridge case where vhost port might belong to available
>> port on different bridge?
>> - virtio_net_device_ops ops should be set before registering cuse device.
>> - destroy function sets virtio_dev to NULL, so no need to check for
>> remove flag, ofcourse you need to RCUfy it first.
> (Additional comment later added): device destroy can set p->virtio_dev to 
> NULL and call
> ovsrcu_postpone(). All read/write code path can check for NULL before
> using virtio_dev. That should be enough, I do not see any need to
> check tx_count or rx_count.
>
>
> I'm seeing two issues with using RCU for the virtio_dev...
>
> 1. The memory of the virtio_dev is not allocated in netdev-dpdk, but is 
> passed in as part of the new_device() callback from the DPDK vhost library. 
> As soon as the destroy_device() fn returns, I don't think we can have any 
> guarantees about the state of this memory. So for another thread to continue 
> to use it, or for a postpone callback to modify it after the destroy_device() 
> fn has returned would be unsafe. In any of the other examples I see of RCU 
> set/postpone, it looks like the memory has been allocated locally so we can 
> guarantee the integrity of it until all threads have quiesced, at which time 
> we can free it in the postpone callback.
>
> 2. The cuse thread loop is in the fuse library and only occasionally calls 
> the OVS code for new_device()/destroy_device(), so we can't make it 
> periodically quiesce - we can only call ovsrcu_quiesce_start() at the very 
> start of the thread. When I put a call to ovsrcu_postpone() in 
> destroy_device(), I see that the rcu thread is continually waiting for the 
> cuse thread to quiesce and the postpone callback does not get called. I'm 
> thinking that if the ovsrcu_postpone() fn is used in destroy_device() then 
> the cuse thread would have to have a periodic call to quiesce which I don't 
> think is possible?
>
> I think we may have no option but to block in the destroy_device() fn until 
> we know that virtio_dev is not being used anymore by other threads. We can 
> stop threads initiating more operations on it by checking for NULL or REMOVE 
> flag and we can check that in-progress operations are finished by the tx/rx 
> counters.
>
> Let me know if I've misunderstood and there is a way to use the RCUs here.
>


Can you use ovsrcu_synchronize() in destroy_device() to prevent the race?

>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v5 1/1] netdev-dpdk: add dpdk vhost ports

Reply via email to