Hi Konstantin, > -----Original Message----- > From: Ananyev, Konstantin > Sent: Monday, June 13, 2016 7:17 AM > To: Lu, Wenzhuo; Tao, Zhe; dev at dpdk.org > Cc: Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang, > Helin > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode > > Hi Wenzhuo, > > > > > Hi Konstantin, > > > > > > > -----Original Message----- > > > From: Ananyev, Konstantin > > > Sent: Wednesday, June 8, 2016 5:20 PM > > > To: Lu, Wenzhuo; Tao, Zhe; dev at dpdk.org > > > Cc: Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; > > > Zhang, Helin > > > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode > > > > > > > > > > > > > > > > > Hi Konstantin, > > > > > > > > > > > > > -----Original Message----- > > > > > From: Ananyev, Konstantin > > > > > Sent: Tuesday, June 7, 2016 5:59 PM > > > > > To: Tao, Zhe; dev at dpdk.org > > > > > Cc: Lu, Wenzhuo; Richardson, Bruce; Chen, Jing D; Liang, > > > > > Cunming; Wu, Jingjing; Zhang, Helin > > > > > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock > > > > > mode > > > > > > > > > > > > > > > Hi Zhe & Wenzhuo, > > > > > > > > > > Please find my comments below. > > > > > BTW, for clarification - is that patch for 16.11? > > > > > I believe it's too late to introduce such significant change in 16.07. > > > > > Thanks > > > > > Konstantin > > > > Thanks for the comments. > > > > Honestly, our purpose is 16.07. Realizing the big impact, we use > > > > NEXT_ABI to comment our change. So, I think although we want to > > > > merge it in > > > 16.07 this change will become effective after we remove NEXT_ABI in > 16.11. > > > > > > I don't think it is achievable. > > > First I think your code is not in proper shape yet, right now. > > > Second, as you said, it is a significant change and I would like to > > > hear opinions from the rest of the community. > > Agree it should have risk. I mean our target is 16.07. But surely if it can > > be > achieved depends on the feedback from the community. > > > > > > > > > > > > > > > > > > > > Define lock mode for RX/TX queue. Because when resetting the > > > > > > device we want the resetting thread to get the lock of the > > > > > > RX/TX queue to make sure the RX/TX is stopped. > > > > > > > > > > > > Using next ABI macro for this ABI change as it has too much > > > > > > impact. 7 APIs and 1 global variable are impacted. > > > > > > > > > > > > Signed-off-by: Wenzhuo Lu <wenzhuo.lu at intel.com> > > > > > > Signed-off-by: Zhe Tao <zhe.tao at intel.com> > > > > > > --- > > > > > > lib/librte_ether/rte_ethdev.h | 62 > > > > > > +++++++++++++++++++++++++++++++++++++++++++ > > > > > > 1 file changed, 62 insertions(+) > > > > > > > > > > > > diff --git a/lib/librte_ether/rte_ethdev.h > > > > > > b/lib/librte_ether/rte_ethdev.h index 74e895f..4efb5e9 100644 > > > > > > --- a/lib/librte_ether/rte_ethdev.h > > > > > > +++ b/lib/librte_ether/rte_ethdev.h > > > > > > @@ -354,7 +354,12 @@ struct rte_eth_rxmode { > > > > > > jumbo_frame : 1, /**< Jumbo Frame Receipt > enable. */ > > > > > > hw_strip_crc : 1, /**< Enable CRC stripping by > hardware. */ > > > > > > enable_scatter : 1, /**< Enable scatter packets rx > handler */ > > > > > > +#ifndef RTE_NEXT_ABI > > > > > > enable_lro : 1; /**< Enable LRO */ > > > > > > +#else > > > > > > + enable_lro : 1, /**< Enable LRO */ > > > > > > + lock_mode : 1; /**< Using lock path */ > > > > > > +#endif > > > > > > }; > > > > > > > > > > > > /** > > > > > > @@ -634,11 +639,68 @@ struct rte_eth_txmode { > > > > > > /**< If set, reject sending out tagged pkts */ > > > > > > hw_vlan_reject_untagged : 1, > > > > > > /**< If set, reject sending out untagged pkts */ > > > > > > +#ifndef RTE_NEXT_ABI > > > > > > hw_vlan_insert_pvid : 1; > > > > > > /**< If set, enable port based VLAN insertion */ > > > > > > +#else > > > > > > + hw_vlan_insert_pvid : 1, > > > > > > + /**< If set, enable port based VLAN insertion */ > > > > > > + lock_mode : 1; > > > > > > + /**< If set, using lock path */ #endif > > > > > > }; > > > > > > > > > > > > /** > > > > > > + * The macros for the RX/TX lock mode functions */ #ifdef > > > > > > +RTE_NEXT_ABI #define RX_LOCK_FUNCTION(dev, func) \ > > > > > > + (dev->data->dev_conf.rxmode.lock_mode ? \ > > > > > > + func ## _lock : func) > > > > > > + > > > > > > +#define TX_LOCK_FUNCTION(dev, func) \ > > > > > > + (dev->data->dev_conf.txmode.lock_mode ? \ > > > > > > + func ## _lock : func) > > > > > > +#else > > > > > > +#define RX_LOCK_FUNCTION(dev, func) func > > > > > > + > > > > > > +#define TX_LOCK_FUNCTION(dev, func) func #endif > > > > > > + > > > > > > +/* Add the lock RX/TX function for VF reset */ #define > > > > > > +GENERATE_RX_LOCK(func, nic) \ uint16_t func ## _lock(void > > > > > > +*rx_queue, \ > > > > > > + struct rte_mbuf **rx_pkts, \ > > > > > > + uint16_t nb_pkts) \ > > > > > > +{ \ > > > > > > + struct nic ## _rx_queue *rxq = rx_queue; \ > > > > > > + uint16_t nb_rx = 0; \ > > > > > > + \ > > > > > > + if (rte_spinlock_trylock(&rxq->rx_lock)) { \ > > > > > > + nb_rx = func(rx_queue, rx_pkts, nb_pkts); \ > > > > > > + rte_spinlock_unlock(&rxq->rx_lock); \ > > > > > > + } \ > > > > > > + \ > > > > > > + return nb_rx; \ > > > > > > +} > > > > > > + > > > > > > +#define GENERATE_TX_LOCK(func, nic) \ uint16_t func ## > > > > > > +_lock(void *tx_queue, \ > > > > > > + struct rte_mbuf **tx_pkts, \ > > > > > > + uint16_t nb_pkts) \ > > > > > > +{ \ > > > > > > + struct nic ## _tx_queue *txq = tx_queue; \ > > > > > > + uint16_t nb_tx = 0; \ > > > > > > + \ > > > > > > + if (rte_spinlock_trylock(&txq->tx_lock)) { \ > > > > > > + nb_tx = func(tx_queue, tx_pkts, nb_pkts); \ > > > > > > + rte_spinlock_unlock(&txq->tx_lock); \ > > > > > > + } \ > > > > > > + \ > > > > > > + return nb_tx; \ > > > > > > +} > > > > > > > > > > 1. As I said in off-line dicussiion, I think this locking could > > > > > (and I think better be) impelented completely on rte_ethdev layer. > > > > > So actual PMD code will be unaffected. > > > > > Again that avoids us to introduce _lock version of every RX/Tx > > > > > function in each PMD. > > > > One purpose of implementing the lock in PMD layer is to avoid ABI > > > > change. But we introduce the field lock_mode in struct > > > > rte_eth_rx/txmode. So seems it's not a good reason now :) The > > > > other purpose is we want to add a lock for every queue. But in rte > > > > layer the queue is void *, so we add the lock in the specific > > > > structures of the NICs. But as > > > you mentioned below, we can add the lock as > > > dev->data->rx_queue_state it the struct rte_eth_dev_data. > > > > So, I prefer to add the lock in rte layer now. > > > > > > OK. > > > > > > > > > > > > > > > > > 2. Again, as discussed offline, I think it is better to have an > > > > > explicit > > > > > rte_eth_(rx|tx)_burst_lock(sync?) API, instead of add new fileds > > > > > into RX/TX config strcutures. > > > > > Would help to avoid any confusion, I think. > > > > We want the users to choose the rx/tx path without lock if > > > > they're sensitive to the performance and can handle the reset > > > > event in their APP. After > > > introducing new fields of config struct, users can change the config > > > to choose the different path. > > > > > > I understand what you are doing. > > > > > > > If we introduce new API, it may be harder for the use to use it. I > > > > mean when users want to use lock mode, they may need to replace > > > > all the > > > rte_eth_rx/tx_burst by rte_eth_rx/tx_burst_lock. > > > > > > Yes, my opinion if users would like to use locking API they need to > > > call it explicitly. > > > > > > > > > >So if we add the lock in rte layer, I still prefer adding lock_mode > > > >in the configuration, and the rte_eth_rx/tx_burst is changed like > > > >this, rte_eth_rx/tx_burst { > > > > + if lock_mode > > > > + try_lock > > > > ...... > > > > + if lock_mode > > > > + release_lock > > > > } > > > > > > My preference is to keep existing rx/tx_burst() functions unaffected > > > by that patch. > > > At least for now. > > > I suppose that will minimise the risks and help users to avoid > > > confusion what API > > > (locking/non-locking) is in use. > > OK. Let me add new APIs. > > > > > > > > > > > > > > > > > > > > > > > 3. I thought the plan was to introduce a locking in all > > > > > appropriate control path functions (dev_start/dev_stop etc.) > > > > > Without that locking version of RX/TX seems a bit useless. > > > > > Yes, I understand that you do use locking inside dev_reset, but > > > > > I suppose the plan was to have a generic solution, no? > > > > > Again, interrupt fire when user invokes dev_start/stop or so, so > > > > > we still need some synchronisation between them. > > > > > > > > > > To be more specific, I thought about something like that: > > > > > > > > > > static inline uint16_t > > > > > rte_eth_rx_burst_lock(uint8_t port_id, uint16_t queue_id, > > > > > struct rte_mbuf **rx_pkts, const uint16_t nb_pkts) { > > > > > struct rte_eth_dev *dev = &rte_eth_devices[port_id]; > > > > > > > > > > #ifdef RTE_LIBRTE_ETHDEV_DEBUG > > > > > RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0); > > > > > RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0); > > > > > > > > > > if (queue_id >= dev->data->nb_rx_queues) { > > > > > RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", > queue_id); > > > > > return 0; > > > > > } > > > > > #endif > > > > > > > > > > + if > > > > > + (rte_spinlock_trylock(&dev->data->rx_queue_state[rx_queue_id]. > > > > > + lock) > > > == 0) > > > > > + return 0; > > > > > + else if (dev->data->rx_queue_state[rx_queue_id] == > > > > > RTE_ETH_QUEUE_STATE_STOPPED)) { > > > > > + rte_spinlock_unlock(&dev->data- > >rx_queue_state[rx_queue_id].unlock); > > > > > + return 0; > > > > > + > > > > > > > > > > nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], > > > > > rx_pkts, nb_pkts); > > > > > > > > > > + rte_spinlock_unlock(&dev->data- > >rx_queue_state[rx_queue_id].un > > > > > + lock > > > > > + ); > > > > > > > > > > .... > > > > > > > > > > return nb_rx; > > > > > } > > > > > > > > > > And inside queue_start: > > > > > > > > > > int > > > > > rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id) > { > > > > > struct rte_eth_dev *dev; > > > > > > > > > > RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); > > > > > > > > > > dev = &rte_eth_devices[port_id]; > > > > > if (rx_queue_id >= dev->data->nb_rx_queues) { > > > > > RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", > > > rx_queue_id); > > > > > return -EINVAL; > > > > > } > > > > > > > > > > RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start, > > > > > -ENOTSUP); > > > > > > > > > > > > > > > rte_spinlock_lock(&dev->data->rx_queue_state[rx_queue_id].lock) > > > > I think you add the lock here to stop the rx/tx. > > > > But to my opinion, we should lock the rx/tx much earlier before > > > > starting the queue. For example, when stop the port, the resource > > > > of the > > > queues may be released. > > > > > > I didn't get you here... > > > Before releasing the queue resources, queue_stop() has to be executed, > right? > > Sorry, I saw your example with rte_eth_dev_rx_queue_start, I didn't > > know you also want to change rte_eth_dev_rx_queue_stop too. > > Agree this should work it we call queue_start/stop when reset the > > port. But we will not call them. I find the queue_stop/start are per- queue > functions and not supported by all NICs. > > But right now you do reset only for ixgbe/i40e. Not only for ixgbe/i40e. You forget igb, which doesn't support queue_start/stop :)
> For these devices we defiantly do support queue start/stop. > And again, it is not only about reset op. > If we want to add rx locked (synced), I think it should be in sync with all > control API that changes queue state. > As I said before it is a lot of work and a lot of hassle... > So probably the easiest (and might be safiest) way just leave things as there > are right now: > we allow user to setup a callback on VF reset, and it is user responsibility > to > make sure no RX/TX is active while reset operation is performed. > Pretty much what Olivier and Stephen suggested, as I understand. Agree. It's not a good way to add lock for just one feature. It could be tricky if we want to extend the lock to other features. A whole picture is needed. We've sent another patch set to let the user setup a callback on VF reset. Depend on that, user can use existing rte APIs to reset the VF port. But how about your opinion if we add a specific rte_reset API? It may be easier for the user. > Konstantin > > > Our solution now is stop the whole port and restart the whole port. We > will not stop/restart queue by queue. > > > > > > > > >The rx/tx cannot be executed. So I prefer to get the lock before > > > >stopping the > > > ports. > > > > > > Might be I wasn't clear enough here. > > > What I think we need to have: > > > -To stop/start/rx/tx the queue (or do any other action that might > > > change the queue internal structure) > > > you have to grab the lock. > > > After queue is stopped it's state has to be changed to > > > QUEUE_STATE_STOPPED (whti queue lock grabbed), > > > so rx/tx_locked wouldn't proceed with that queue. > > > - dev_stop() - has to stop all its queues first, i.e. it needs to > > > call queue_stop() for all of them. > > > So after dev_stop() had finished - all device queues have to be in > > > QUEUE_STATE_STOPPED Same about dev_start() - after it does all other > > > things - it will call queue_start() for all it's queues. > > > that will bring them into QUEUE_STARTED. > > > After that rx/tx_locked can use them again. > > > > > > >Maybe better to keep the spinlock in the dev_reset. > > > > > > Might be not :) > > > > > > > > > > > > > > > > > if (dev->data->rx_queue_state[rx_queue_id] != > > > > > RTE_ETH_QUEUE_STATE_STOPPED) { > > > > > RTE_PMD_DEBUG_TRACE("Queue %" PRIu16" of device > > > > > with port_id=%" PRIu8 > > > > > " already started\n", > > > > > rx_queue_id, port_id); > > > > > ret = -EINVAL 0; > > > > > } else > > > > > ret = dev->dev_ops->rx_queue_start(dev, rx_queue_id); > > > > > > > > > > > > > > > rte_spinlock_unlock(&dev->data- > >rx_queue_state[rx_queue_id].unlo > > > > > ck); > > > > > > > > > > return ret; > > > > > } > > > > > > > > > > Then again, we don't need to do explicit locking inside dev_reset(). > > > > > Does it make sense to you guys? > > > > Please see the answer above. > > > > > > > > > > > > > > > > > > > > + > > > > > > +/** > > > > > > * A structure used to configure an RX ring of an Ethernet port. > > > > > > */ > > > > > > struct rte_eth_rxconf { > > > > > > -- > > > > > > 2.1.4