On 2/18/2019 3:25 PM, Ferruh Yigit wrote:
> On 2/11/2019 10:25 AM, Parthasarathy, JananeeX M wrote:
>> Hi
>>
>>> -----Original Message-----
>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Chas Williams
>>> Sent: Saturday, February 09, 2019 6:47 PM
>>> To: Hyong Youb Kim <hyon...@cisco.com>; Yigit, Ferruh
>>> <ferruh.yi...@intel.com>; Doherty, Declan <declan.dohe...@intel.com>; Chas
>>> Williams <ch...@att.com>
>>> Cc: dev@dpdk.org; sta...@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH 2/2] net/bonding: avoid the next active slave
>>> going out of bound
>>>
>>>
>>>
>>> On 1/10/19 5:22 AM, Hyong Youb Kim wrote:
>>>> For bonding modes like broadcast that use bond_ethdev_rx_burst(), it
>>>> is fairly easy to produce a crash simply by bringing a slave port's
>>>> link down. When slave links go down, the driver on one thread reduces
>>>> active_slave_count via the LSC callback and deactivate_slave(). At the
>>>> same time, bond_ethdev_rx_burst() running on a forwarding thread may
>>>> increment active_slave (next active slave) beyond active_slave_count.
>>>> Here is a typical sequence of events.
>>>>
>>>> At time 0:
>>>>   active_slave_count = 3
>>>>   active_slave = 2
>>>>
>>>> At time 1:
>>>>   A slave link goes down.
>>>>   Thread 0 (main) reduces active_slave_count to 2.
>>>>
>>>> At time 2:
>>>>   Thread 1 (forwarding) executes bond_ethdev_rx_burst().
>>>>   - Reads active_slave_count = 2.
>>>>   - Increments active_slave at the end to 3.
>>>>
>>>> From this point on, every time bond_ethdev_rx_burst() runs,
>>>> active_slave increments by one, eventually going well out of bounds of
>>>> the active_slaves array and causing a crash.
>>>>
>>>> Make the rx burst function first check that active_slave is within
>>>> bounds. If not, reset it to 0 to avoid out-of-range array access.
>>>>
>>>> Fixes: e1110e977648 ("net/bonding: fix Rx slave fairness")
>>>> Cc: sta...@dpdk.org
>>>>
>>>> Signed-off-by: Hyong Youb Kim <hyon...@cisco.com>
>>>
>>> Acked-by: Chas Williams <ch...@att.com>
>>>
>>>> ---
>>>>  drivers/net/bonding/rte_eth_bond_pmd.c | 14 ++++++++++++++
>>>>  1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> index daf2440cd..bc2405e54 100644
>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> @@ -68,6 +68,15 @@ bond_ethdev_rx_burst(void *queue, struct rte_mbuf
>>> **bufs, uint16_t nb_pkts)
>>>>  	internals = bd_rx_q->dev_private;
>>>>  	slave_count = internals->active_slave_count;
>>>>  	active_slave = internals->active_slave;
>>>> +	/*
>>>> +	 * Reset the active slave index, in case active_slave goes out
>>>> +	 * of bound. It can happen when slave links go down, and
>>>> +	 * another thread (LSC callback) shrinks the slave count.
>>>> +	 */
>>>> +	if (active_slave >= slave_count) {
>>>> +		internals->active_slave = 0;
>>>> +		active_slave = 0;
>>>> +	}
>>
>> Instead of introducing new conditions again at the top of the functions, it
>> would be better to check greater-than-or-equal (>=) instead of equal-to (==)
>> in the condition below:
>> if (++internals->active_slave == slave_count)
>> 	internals->active_slave = 0;
>>
>> Thereby we can reduce the multiple if conditions and still ensure
>> internals->active_slave always points to a valid index.
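As a concrete sketch of that suggestion, the existing end-of-burst wrap in
bond_ethdev_rx_burst() quoted above would become the following. This is only
an illustration of the review comment, not a committed change:

	/* Fold the bound check into the existing wrap-around at the end of
	 * the burst function: if the LSC callback has shrunk
	 * active_slave_count since the last burst, the >= test brings the
	 * stale index back to slot 0 without an extra check at the top.
	 */
	if (++internals->active_slave >= slave_count)
		internals->active_slave = 0;

This keeps a single check per burst instead of one per Rx function entry.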
>>
>>>> 	for (i = 0; i < slave_count && nb_pkts; i++) {
>>>> 		uint16_t num_rx_slave;
>>>> @@ -273,6 +282,11 @@ bond_ethdev_rx_burst_8023ad_fast_queue(void
>>> *queue, struct rte_mbuf **bufs,
>>>> 	active_slave = internals->active_slave;
>>>> 	memcpy(slaves, internals->active_slaves,
>>>> 		sizeof(internals->active_slaves[0]) * slave_count);
>>>> +	/* active_slave may go out of bound. See bond_ethdev_rx_burst() */
>>>> +	if (active_slave >= slave_count) {
>>>> +		internals->active_slave = 0;
>>>> +		active_slave = 0;
>>>> +	}
>>
>> The same comment as above applies here.
>>>>
>>>> 	for (i = 0; i < slave_count && nb_pkts; i++) {
>>>> 		uint16_t num_rx_slave;
>>>>
>>
>> It would also be better to check internals->active_slave during
>> deactivate_slave() in rte_eth_bond_api.c.
>> Since the slave count is decremented during deactivation, resetting
>> internals->active_slave there appropriately would be better.
>>
>> Regards
>> M.P.Jananee
>
>
> I don't see this comment on patchwork, can you double check whether your
> comment hit the mailing list?
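For the deactivate_slave() suggestion, a rough sketch of the idea, assuming
the clamp is placed after active_slave_count has been decremented inside
deactivate_slave() in rte_eth_bond_api.c (illustration of the review comment
only, not code from the patch):

	/* After removing a slave from the active list, keep the Rx starting
	 * index inside the new, smaller range so the burst functions never
	 * begin from a stale slot.
	 */
	if (internals->active_slave >= internals->active_slave_count)
		internals->active_slave = 0;

On its own such a reset would likely still race with a forwarding thread that
has already read active_slave, so it would complement rather than replace the
bound handling in the Rx path.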
For the record, this patch was superseded by: https://patches.dpdk.org/patch/50346/