> -----Original Message-----
> From: Wiles, Keith
> Sent: Wednesday, August 23, 2017 4:05 PM
> To: Carrillo, Erik G <erik.g.carri...@intel.com>
> Cc: rsanf...@akamai.com; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
>
> > On Aug 23, 2017, at 2:28 PM, Carrillo, Erik G <erik.g.carri...@intel.com> wrote:
> >
> >> -----Original Message-----
> >> From: Wiles, Keith
> >> Sent: Wednesday, August 23, 2017 11:50 AM
> >> To: Carrillo, Erik G <erik.g.carri...@intel.com>
> >> Cc: rsanf...@akamai.com; dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
> >>
> >>> On Aug 23, 2017, at 11:19 AM, Carrillo, Erik G <erik.g.carri...@intel.com> wrote:
> >>>
> >>>> -----Original Message-----
> >>>> From: Wiles, Keith
> >>>> Sent: Wednesday, August 23, 2017 10:02 AM
> >>>> To: Carrillo, Erik G <erik.g.carri...@intel.com>
> >>>> Cc: rsanf...@akamai.com; dev@dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
> >>>>
> >>>>> On Aug 23, 2017, at 9:47 AM, Gabriel Carrillo <erik.g.carri...@intel.com> wrote:
> >>>>>
> >>>>> In the current implementation of the DPDK timer library, timers can be created and set to be handled by a target lcore by adding them to a skiplist that corresponds to that lcore. However, if an application enables multiple lcores, and each of these lcores repeatedly attempts to install timers on the same target lcore, overall application throughput will be reduced as all lcores contend to acquire the lock guarding the single skiplist of pending timers.
> >>>>>
> >>>>> This patchset addresses this scenario by adding an array of skiplists to each lcore's priv_timer struct, such that when lcore i installs a timer on lcore k, the timer will be added to the ith skiplist for lcore k. If lcore j installs a timer on lcore k simultaneously, lcores i and j can both proceed, since they will be acquiring different locks for different lists.
> >>>>>
> >>>>> When lcore k processes its pending timers, it will traverse each skiplist in its array and acquire a skiplist's lock while a run list is broken out; meanwhile, all other lists can continue to be modified. Then, all run lists for lcore k are collected and traversed together so timers are executed in their global order.
> >>>>
> >>>> What is the performance and/or latency added to the timeout now?
> >>>>
> >>>> I worry about the case when just about all of the cores are enabled, which could be as high as 128 or more now.
> >>>
> >>> There is a case in the timer_perf_autotest that runs rte_timer_manage with zero timers that can give a sense of the added latency. When run with one lcore, it completes in around 25 cycles. When run with 43 lcores (the highest I have access to at the moment), rte_timer_manage completes in around 155 cycles. So it looks like each added lcore adds around 3 cycles of overhead for checking empty lists in my testing.
> >>
> >> Does this mean we have only 25 cycles on the current design, or is the 25 cycles for the new design?
> >>
> > Both - when run with one lcore, the new design becomes equivalent to the original one. I tested the current design to confirm.
>
> Good, thanks
>
> >> If for the new design, then what is the old design cost compared to the new cost?
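For reference, a minimal sketch of the per-installer pending-list idea described in the cover letter above; the type, field, and function names below are simplified illustrations, not the actual patch code:

/*
 * Illustrative sketch only -- simplified names, not the actual patch code.
 * Instead of a single pending skiplist per target lcore guarded by one lock,
 * each target lcore keeps one list per installing lcore, so concurrent
 * installers take different locks.
 */
#include <rte_config.h>
#include <rte_lcore.h>
#include <rte_spinlock.h>
#include <rte_timer.h>

struct priv_timer_sketch {
    struct {
        rte_spinlock_t lock;           /* contended only by installer i and owner k */
        struct rte_timer pending_head; /* skiplist of timers installed by lcore i */
    } pending_lists[RTE_MAX_LCORE];    /* indexed by the installing lcore's id */
};

/* Installer side: lcore i arming a timer on target lcore k touches only
 * slot i, so it does not contend with another installer j != i. */
static void
install_sketch(struct priv_timer_sketch *owner_k, struct rte_timer *tim)
{
    unsigned int i = rte_lcore_id();

    rte_spinlock_lock(&owner_k->pending_lists[i].lock);
    /* skiplist insertion of tim into pending_lists[i] omitted */
    (void)tim;
    rte_spinlock_unlock(&owner_k->pending_lists[i].lock);
}

/* Owner side (conceptually what rte_timer_manage() does on lcore k): hold
 * each slot's lock only long enough to break its expired timers out into a
 * run list; slots not currently being split can still be appended to. */
static void
manage_sketch(struct priv_timer_sketch *me)
{
    unsigned int i;

    for (i = 0; i < RTE_MAX_LCORE; i++) {
        rte_spinlock_lock(&me->pending_lists[i].lock);
        /* unlink expired timers from slot i into a local run list (omitted) */
        rte_spinlock_unlock(&me->pending_lists[i].lock);
    }
    /* merge the run lists and invoke callbacks in global expiry order (omitted) */
}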
> >> I also think we need the call to a timer function in the calculation, just to make sure we have at least one timer in the list and we account for any shortcuts in the code for no timers active.
> >>
> > Looking at the numbers for non-empty lists in timer_perf_autotest, the overhead appears to fall away. Here are some representative runs for timer_perf_autotest:
> >
> > 43 lcores enabled, installing 1M timers on an lcore and processing them with the current design:
> >
> > <...snipped...>
> > Appending 1000000 timers
> > Time for 1000000 timers: 424066294 (193ms), Time per timer: 424 (0us)
> > Time for 1000000 callbacks: 73124504 (33ms), Time per callback: 73 (0us)
> > Resetting 1000000 timers
> > Time for 1000000 timers: 1406756396 (641ms), Time per timer: 1406 (1us)
> > <...snipped...>
> >
> > 43 lcores enabled, installing 1M timers on an lcore and processing them with the proposed design:
> >
> > <...snipped...>
> > Appending 1000000 timers
> > Time for 1000000 timers: 382912762 (174ms), Time per timer: 382 (0us)
> > Time for 1000000 callbacks: 79194418 (36ms), Time per callback: 79 (0us)
> > Resetting 1000000 timers
> > Time for 1000000 timers: 1427189116 (650ms), Time per timer: 1427 (1us)
> > <...snipped...>
>
> It looks OK then. The main concern I had was the timers in Pktgen and someone reporting increased jitter, latency, or reduced performance. I guess I will just have to wait and see.
>
> > The above are not averages, so the numbers don't really indicate which is faster, but they show that the overhead of the proposed design should not be appreciable.
> >
> >>>> One option is to have the lcore j that wants to install a timer on lcore k pass a message via a ring to lcore k to add that timer. We could even add that logic into setting a timer on a different lcore than the caller in the current API. The ring would be a multi-producer and single consumer, so we still have the lock. What am I missing here?
> >>>
> >>> I did try this approach: initially I had a multi-producer single-consumer ring that would hold requests to add or delete a timer from lcore k's skiplist, but it didn't really give an appreciable increase in my test application throughput. In profiling this solution, the hotspot had moved from acquiring the skiplist's spinlock to the rte_atomic32_cmpset that the multi-producer ring code uses to manipulate the head pointer.
> >>>
> >>> Then, I tried multiple single-producer single-consumer rings per target lcore. This removed the ring hotspot, but the performance didn't increase as much as it did with the proposed solution. These solutions also add overhead to rte_timer_manage, as it would have to process the rings and then process the skiplists.
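For context, a rough sketch of the ring-based alternative discussed above: one request ring per target lcore, multi-producer enqueue from installers, single-consumer dequeue inside the owner's timer-manage pass. The message struct and function names here are hypothetical and only illustrate the approach that was tried, not code from the patchset or the library:

#include <stdint.h>
#include <stdio.h>

#include <rte_config.h>
#include <rte_lcore.h>
#include <rte_ring.h>
#include <rte_timer.h>

/* Hypothetical request message -- names are illustrative only. */
struct timer_msg {
    struct rte_timer *tim;
    uint64_t expire_ticks;
    enum { MSG_ADD, MSG_DEL } op;
};

/* One request ring per target lcore: multi-producer enqueue (any installer),
 * single-consumer dequeue (only the target lcore). */
static struct rte_ring *timer_req_ring[RTE_MAX_LCORE];

static int
setup_ring(unsigned int target_lcore)
{
    char name[32];

    snprintf(name, sizeof(name), "tim_req_%u", target_lcore);
    timer_req_ring[target_lcore] = rte_ring_create(name, 1024,
            rte_socket_id(), RING_F_SC_DEQ);
    return timer_req_ring[target_lcore] == NULL ? -1 : 0;
}

/* Installer side: any lcore posts a request; the multi-producer head update
 * (rte_atomic32_cmpset) is where the contention moved in the experiment
 * described above. */
static int
post_add(unsigned int target_lcore, struct timer_msg *msg)
{
    msg->op = MSG_ADD;
    return rte_ring_enqueue(timer_req_ring[target_lcore], msg);
}

/* Target side: drained at the top of each timer-manage pass, before the
 * local pending skiplist is processed. */
static void
drain_requests(unsigned int my_lcore)
{
    void *obj;

    while (rte_ring_dequeue(timer_req_ring[my_lcore], &obj) == 0) {
        struct timer_msg *msg = obj;
        /* apply msg->op to the local pending skiplist (omitted) */
        (void)msg;
    }
}

Whether the producers share one multi-producer ring (as above) or each get their own single-producer ring mainly moves the cost around, which matches the profiling results described above.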
> >>>
> >>> One other thing to note is that a solution that uses such messages changes the usage model for the timer. One interesting example is:
> >>> - lcore i enqueues a message to install a timer on lcore k
> >>> - lcore k runs rte_timer_manage, processes its messages and adds the timer to its list
> >>> - lcore i then enqueues a message to stop the same timer, now owned by lcore k
> >>> - lcore k does not run rte_timer_manage again
> >>> - lcore i wants to free the timer, but it might not be safe
> >>
> >> This case seems like a mistake to me, as lcore k should continue to call rte_timer_manage() to process any new timers from other lcores, not just in the case where the list becomes empty and lcore k does not add timers to its own list.
> >>
> >>> Even though lcore i has successfully enqueued the request to stop the timer (and delete it from lcore k's pending list), it hasn't actually been deleted from the list yet, so freeing it could corrupt the list. This case exists in the existing timer stress tests.
> >>>
> >>> Another interesting scenario is:
> >>> - lcore i resets a timer to install it on lcore k
> >>> - lcore j resets the same timer to install it on lcore k
> >>> - then, lcore k runs timer_manage
> >>
> >> This one also seems like a mistake; more than one lcore setting the same timer seems like a problem and should not be done. An lcore should own a timer and no other lcore should be able to change that timer. If multiple lcores need a timer, then they should not share the same timer structure.
> >>
> > Both of the above cases exist in the timer library stress tests, so a solution would presumably need to address them or it would be less flexible. The original design passed these tests, as does the proposed one.
>
> I get this twitch when one lcore is adding timers to another lcore, as I come from a realtime OS background, but I guess if no one else cares or finds a problem I will have to live with it. Having a test for something does not make it a good test or a reasonable reason to continue a design issue. We can make any test work, but whether it is right is the real question, and we will just have to wait and see I guess.
>
> >>> Lcore j's message obviates lcore i's message, and it would be wasted work for lcore k to process it, so we should mark it to be skipped over. Handling all the edge cases was more complex than the proposed solution.
> >>
> >> Hmmm, to me it seems simple here as long as the lcores follow the same rules, and sharing a timer structure is very risky and avoidable IMO.
> >>
> >> Once you have lcores adding timers to another lcore, then all accesses to that skiplist must be serialized or you get unpredictable results. This should also fix most of the edge cases you are talking about.
> >>
> >> Also, it seems to me the case with an lcore adding timers to another lcore's timer list is a specific use case and could be handled by a different set of APIs for that specific use case. Then we do not need to change the current design and all of the overhead is placed on the new APIs/design. IMO we are turning the current timer design into a global timer design, as it really is a per-lcore design today, and I believe that is a mistake.
> >>
> > Well, the original API explicitly supports installing a timer to be executed on a different lcore, and there are no API changes in the patchset.
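For reference, the existing API already names the target lcore explicitly when arming a timer; a minimal usage sketch (the callback and helper names are illustrative):

#include <rte_cycles.h>
#include <rte_lcore.h>
#include <rte_timer.h>

static struct rte_timer tim;

/* Runs on the lcore that owns the timer, i.e. the tim_lcore passed below. */
static void
timer_cb(struct rte_timer *t, void *arg)
{
    (void)t;
    (void)arg;
}

/* Called from any installing lcore; target_lcore is the lcore whose
 * rte_timer_manage() loop will eventually run the callback. */
static void
install_on(unsigned int target_lcore)
{
    rte_timer_init(&tim);
    /* one-shot timer, roughly one second from now, owned by target_lcore */
    rte_timer_reset(&tim, rte_get_timer_hz(), SINGLE,
            target_lcore, timer_cb, NULL);
}

Since the patchset makes no API changes, a call like the one above is unaffected; only the internal pending list the timer lands on changes.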
> > Also, the proposed design keeps the per-lcore design intact; it only takes what used to be one large skiplist that held timers for all installing lcores and separates it into N skiplists that correspond 1:1 to an installing lcore. When an lcore processes timers on its lists, it will still only be managing timers it owns, and no others.
>
> Having an API to explicitly support some feature is not a reason to keep something, but I think you have reduced my twitching some :-) so I will let it go.
>
> Thanks for the information.
You're welcome, and thank you for the feedback.

Regards,
Gabriel

> >>>>>
> >>>>> Gabriel Carrillo (3):
> >>>>>   timer: add per-installer pending lists for each lcore
> >>>>>   timer: handle timers installed from non-EAL threads
> >>>>>   doc: update timer lib docs
> >>>>>
> >>>>>  doc/guides/prog_guide/timer_lib.rst |  19 ++-
> >>>>>  lib/librte_timer/rte_timer.c        | 329 +++++++++++++++++++++++----------
> >>>>>  lib/librte_timer/rte_timer.h        |   9 +-
> >>>>>  3 files changed, 231 insertions(+), 126 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.6.4
> >>>>
> >>>> Regards,
> >>>> Keith
> >>
> >> Regards,
> >> Keith
>
> Regards,
> Keith