> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> Sent: Monday, March 9, 2020 1:28 PM
> To: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> Cc: dev@dpdk.org; Stefan Sundkvist <stefan.sundkv...@ericsson.com>;
> ola.liljed...@arm.com
> Subject: [EXT] Re: [PATCH 5/8] event/dsw: avoid migration waves in large
> systems
>
> On 2020-03-09 08:17, Jerin Jacob Kollanukkaran wrote:
> >> -----Original Message-----
> >> From: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >> Sent: Monday, March 9, 2020 12:21 PM
> >> To: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> >> Cc: dev@dpdk.org; stefan.sundkv...@ericsson.com;
> >> ola.liljed...@arm.com; Mattias Rönnblom
> >> <mattias.ronnb...@ericsson.com>
> >> Subject: [PATCH 5/8] event/dsw: avoid migration waves in large
> >> systems
> >>
> >> ----------------------------------------------------------------------
> >> DSW limits the rate of migrations on a per-port basis. Hence, as
> >> the number of cores grows, so does the total migration capacity.
> >>
> >> In high core-count systems, this allows for a situation where flows
> >> are migrated to a lightly loaded port which recently already received
> >> a number of new flows (from other ports). The processing load
> >> generated by these new flows may not yet be reflected in the lightly
> >> loaded port's load estimate. The result is that the previously lightly
> >> loaded port is now overloaded.
> >>
> >> This patch adds a rough estimate of the size of the inbound
> >> migrations to a particular port, which can be factored into the
> >> migration logic, avoiding the above problem.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >> ---
> >> @@ -491,6 +502,9 @@ dsw_select_emigration_target(struct dsw_evdev *dsw,
> >>  	target_qfs[*targets_len] = *candidate_qf;
> >>  	(*targets_len)++;
> >>
> >> +	rte_atomic32_add(&dsw->ports[candidate_port_id].immigration_load,
> >> +			 candidate_flow_load);
> > These are the full barriers in arm64 and PowerPC.
> > Request to change the C11 mem model[1] with Load and acquire semantics
> > For better performance enhancement on non x86 machines.
> >
> > drivers/event/opdl is already moved to C11 mem model.
> >
> > [1]
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__gcc.gnu.org_onlinedocs_gcc_-5F005f-5F005fatomic-2DBuiltins.html&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=1DGob4H4rxz6H8uITozGOCa0s5f4wCNtTa4UUKvcsvI&m=WWfYIvEKR8a_FuTltGFBbtERAKU1akjXuokLpv2zSz0&s=bEjlLRgN4LriVpVzwYcdgcTV39OI_MZYOG0QDhjmezw&e=
> >
> The performance impacts would be small, since this is in the slow path, with
> something like a handful of memory barrier per core per ms.

OK. If it is in the slow path, then yes, there is no point in changing it.

How about the following other uses in the DSW driver? Are they in the fast path or the slow path?

drivers/event/dsw/dsw_event.c:	new_total_on_loan = rte_atomic32_add_return(&dsw->credits_on_loan,
drivers/event/dsw/dsw_event.c:	rte_atomic32_sub(&dsw->credits_on_loan, acquired_credits);
drivers/event/dsw/dsw_event.c:	rte_atomic32_sub(&dsw->credits_on_loan, return_credits);

>
> Arguably, it could be done for consistency reasons, but then you should change
> all DSW atomics.
>
> >> +
> >>  	return true;
>
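
For reference, a minimal sketch of what the C11-style counterparts of the calls discussed above could look like, using the GCC `__atomic` builtins from [1]. The struct and function names (`sketch_port`, `sketch_dev`, `sketch_*`) are hypothetical stand-ins, not the real DSW types, and the choice of `__ATOMIC_RELAXED` is an assumption: whether relaxed ordering is sufficient depends on how the driver consumes these counters.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-port state; not the real struct dsw_port. */
struct sketch_port {
	int32_t immigration_load;
};

/* Hypothetical stand-in for device-wide credit state; not struct dsw_evdev. */
struct sketch_dev {
	int32_t credits_on_loan;
};

/* Counterpart of rte_atomic32_add(): atomic add, no implied full barrier.
 * ASSUMPTION: relaxed ordering is enough for a rough load estimate. */
static inline void
sketch_add_immigration_load(struct sketch_port *port, int32_t flow_load)
{
	__atomic_fetch_add(&port->immigration_load, flow_load,
			   __ATOMIC_RELAXED);
}

/* Counterpart of rte_atomic32_add_return(): returns the new total. */
static inline int32_t
sketch_borrow_credits(struct sketch_dev *dev, int32_t credits)
{
	return __atomic_add_fetch(&dev->credits_on_loan, credits,
				  __ATOMIC_RELAXED);
}

/* Counterpart of rte_atomic32_sub(): hands credits back to the pool. */
static inline void
sketch_return_credits(struct sketch_dev *dev, int32_t credits)
{
	__atomic_fetch_sub(&dev->credits_on_loan, credits,
			   __ATOMIC_RELAXED);
}
```

If any of these counters were used to publish other data between cores, a stronger order such as `__ATOMIC_ACQ_REL` would be needed instead; the sketch only covers the plain counter case.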