On Sun, Dec 21, 2014 at 9:31 AM, Traynor, Kevin <kevin.tray...@intel.com> wrote: > Hi, > > I'd like to get some feedback about using RCU on the virtio_dev structure as > per the comment below. Comments inline. > > Thanks, > Kevin. > >> -----Original Message----- >> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Pravin Shelar >> Sent: Tuesday, October 21, 2014 12:00 AM >> To: Tahhan, Maryam >> Cc: dev@openvswitch.org >> Subject: Re: [ovs-dev] [PATCH v5 1/1] netdev-dpdk: add dpdk vhost ports >> >> On Mon, Sep 29, 2014 at 10:10 AM, maryam.tahhan <maryam.tah...@intel.com> >> wrote: >> > This patch implements the vhost-net offload API. It adds support for >> > a new port type to userspace datapath called dpdkvhost. This allows KVM >> > (QEMU) to offload the servicing of virtio-net devices to it's associated >> > dpdkvhost port. Instructions for use are in INSTALL.DPDK. >> > >> > This has been tested on Intel multi-core platforms and with clients that >> > have virtio-net interfaces. >> > >> > ver 5: >> > - rebased against latest master >> > ver 4: >> > - added eventfd_link.h and eventfd_link.c to EXTRA_DIST in >> > utilities/automake.mk >> > - rebased with master to work with DPDK 1.7 >> > ver 3: >> > - rebased with master >> > ver 2: >> > - rebased with master >> > >> > Signed-off-by: maryam.tahhan <maryam.tah...@intel.com> >> > --- >> Thanks for the patch, I have following comments. >> - NON_PMD_THREAD_TX_QUEUE is not used in the patch. >> - why do we need to limit MAX_BASENAME_SZ to 12 >> - dev_basename[] should be configurable at run time or it should be >> vswitchd parameter. >> - any reason for liming MAX_PKT_BURST to 32? >> - netdev_dpdk->type should be a enum type with DPDK and VHOST members. >> I am not sure if you really need type member in netdev_dpdk. you can >> just define separate device ops for DPDK and VHOST of you was to do >> special processing for each device. >> - there is no check on gpa_to_vva() return value. >> - in function virtio_dev_rx() variables can be defined in local block >> rather than in function block. this simplifies code reading. >> - in function virtio_dev_rx() virtio_hdr is always zero, so it can be >> static variable rather than on stack. >> - in function virtio_dev_rx() virtio_hdr should be copied before packet data. >> - in function virtio_dev_rx() can we prefetch next vq->desc before >> coping packet data? >> - ovs-mutex is bit heavy weight, can you use rte-spin-lock >> - can you reverse name of virtio_dev_rx(), virtio_dev_tx(), since this >> is ovs-netdev code we could use name same as ovs context. >> - virtio_dev_tx() is always called from PMD thread, so no need to the check. >> - there is no synchronization for netdev_dpdk->rx_count and tx_count. >> - destroy_device() can use RCU based mechanism rather than polling packet >> count. >> - Currently new_device() assigns post in linear fashion. How does >> handle multiple bridge case where vhost port might belong to available >> port on different bridge? >> - virtio_net_device_ops ops should be set before registering cuse device. >> - destroy function sets virtio_dev to NULL, so no need to check for >> remove flag, ofcourse you need to RCUfy it first. > (Additional comment later added): device destroy can set p->virtio_dev to > NULL and call > ovsrcu_postpone(). All read/write code path can check for NULL before > using virtio_dev. That should be enough, I do not see any need to > check tx_count or rx_count. > > > I'm seeing two issues with using RCU for the virtio_dev... > > 1. The memory of the virtio_dev is not allocated in netdev-dpdk, but is > passed in as part of the new_device() callback from the DPDK vhost library. > As soon as the destroy_device() fn returns, I don't think we can have any > guarantees about the state of this memory. So for another thread to continue > to use it, or for a postpone callback to modify it after the destroy_device() > fn has returned would be unsafe. In any of the other examples I see of RCU > set/postpone, it looks like the memory has been allocated locally so we can > guarantee the integrity of it until all threads have quiesced, at which time > we can free it in the postpone callback. > > 2. The cuse thread loop is in the fuse library and only occasionally calls > the OVS code for new_device()/destroy_device(), so we can't make it > periodically quiesce - we can only call ovsrcu_quiesce_start() at the very > start of the thread. When I put a call to ovsrcu_postpone() in > destroy_device(), I see that the rcu thread is continually waiting for the > cuse thread to quiesce and the postpone callback does not get called. I'm > thinking that if the ovsrcu_postpone() fn is used in destroy_device() then > the cuse thread would have to have a periodic call to quiesce which I don't > think is possible? > > I think we may have no option but to block in the destroy_device() fn until > we know that virtio_dev is not being used anymore by other threads. We can > stop threads initiating more operations on it by checking for NULL or REMOVE > flag and we can check that in-progress operations are finished by the tx/rx > counters. > > Let me know if I've misunderstood and there is a way to use the RCUs here. >
Can you use ovsrcu_synchronize() in destroy_device() to prevent the race? >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev