W dniu 09.10.2020 o 14:13, David Hunt pisze: > > On 8/10/2020 10:07 PM, Lukasz Wojciechowski wrote: >> W dniu 08.10.2020 o 16:26, David Hunt pisze: >>> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote: >>>> The burst version of distributor implementation was missing proper >>>> handling of worker shutdown. A worker processing packets received >>>> from distributor can call rte_distributor_return_pkt() function >>>> informing distributor that it want no more packets. Further calls to >>>> rte_distributor_request_pkt() or rte_distributor_get_pkt() however >>>> should inform distributor that new packets are requested again. >>>> >>>> Lack of the proper implementation has caused that even after worker >>>> informed about returning last packets, new packets were still sent >>>> from distributor causing deadlocks as no one could get them on worker >>>> side. >>>> >>>> This patch adds handling shutdown of the worker in following way: >>>> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag >>>> was formerly unused in burst implementation and now it is used >>>> for marking valid packets in retptr64 replacing invalid use >>>> of RTE_DISTRIB_RETURN_BUF flag. >>>> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake >>>> in retptr64 to indicate that worker has shutdown. >>>> 3) Worker that shuts down blocks also bufptr for itself with >>>> RTE_DISTRIB_RETURN_BUF flag allowing distributor to retrieve any >>>> in flight packets. >>>> 4) When distributor receives information about shutdown of a worker, >>>> it: marks worker as not active; retrieves any in flight and backlog >>>> packets and process them to different workers; unlocks bufptr64 >>>> by clearing RTE_DISTRIB_RETURN_BUF flag and allowing use in >>>> the future if worker requests any new packages. >>>> 5) Do not allow to: send or add to backlog any packets for not >>>> active workers. Such workers are also ignored if matched. >>>> 6) Adjust calls to handle_returns() and tags matching procedure >>>> to react for possible activation deactivation of workers. >>>> >>>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library") >>>> Cc: david.h...@intel.com >>>> Cc: sta...@dpdk.org >>>> >>>> Signed-off-by: Lukasz Wojciechowski <l.wojciec...@partner.samsung.com> >>>> --- >>> >>> Hi Lukasz, >> Hi David, >> Many thanks for your review. >>> I spent the most amount of time going through this particular >>> patch, and it looks good to me (even the bit where >>> rte_distributor_process is called recursively) :) >> That's the same trick that was used in the legacy single version. :) >>> I'll try and get some time to run through some more testing, but for >>> now: >>> >>> Acked-by: David Hunt <david.h...@intel.com> >> Thanks and if you'll run the test, please take a look at the >> performance. I think it has dropped because of these additional >> synchronizations and actions on activation/deactivation. >> >> However the quality has increased much. With v5 version , I ran tests >> over 100000 times and didn't get a single failure! >> >> Let me know about your results. >> > > Going back through the patch set and running performance on each one, > I see a 10% drop in performance at patch 2 in the series, which adds > an extra handle_returns() call in the busy loop. Which avoids possible > deadlock. > > I played around with that patch for a while, only calling > handle_returns() every x times aroudn the loop, but the performance > was worse again, probably because of the extra branch I added. > > However, it's more important to have stable performance than so it's > still a good idea to have that fix applied, IMO. I agree > > Maybe we can get back some lost performance in future optimisation > patches. That would be really nice. If i have some time, I would like to try some ideas I came with during work in the series. > > Thanks, > Dave. > > > > -- Lukasz Wojciechowski Principal Software Engineer
Samsung R&D Institute Poland Samsung Electronics Office +48 22 377 88 25 l.wojciec...@partner.samsung.com