> On Aug 23, 2017, at 2:28 PM, Carrillo, Erik G <erik.g.carri...@intel.com> 
> wrote:
> 
>> 
>> -----Original Message-----
>> From: Wiles, Keith
>> Sent: Wednesday, August 23, 2017 11:50 AM
>> To: Carrillo, Erik G <erik.g.carri...@intel.com>
>> Cc: rsanf...@akamai.com; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
>> 
>> 
>>> On Aug 23, 2017, at 11:19 AM, Carrillo, Erik G <erik.g.carri...@intel.com>
>> wrote:
>>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Wiles, Keith
>>>> Sent: Wednesday, August 23, 2017 10:02 AM
>>>> To: Carrillo, Erik G <erik.g.carri...@intel.com>
>>>> Cc: rsanf...@akamai.com; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements
>>>> ***
>>>> 
>>>> 
>>>>> On Aug 23, 2017, at 9:47 AM, Gabriel Carrillo
>>>>> <erik.g.carri...@intel.com>
>>>> wrote:
>>>>> 
>>>>> In the current implementation of the DPDK timer library, timers can
>>>>> be created and set to be handled by a target lcore by adding them to a
>>>>> skiplist that corresponds to that lcore.  However, if an application
>>>>> enables multiple lcores, and each of these lcores repeatedly
>>>>> attempts to install timers on the same target lcore, overall
>>>>> application throughput will be reduced as all lcores contend to
>>>>> acquire the lock guarding the single skiplist of pending timers.
>>>>> 
>>>>> This patchset addresses this scenario by adding an array of
>>>>> skiplists to each lcore's priv_timer struct, such that when lcore i
>>>>> installs a timer on lcore k, the timer will be added to the ith
>>>>> skiplist for lcore k.  If lcore j installs a timer on lcore k
>>>>> simultaneously, lcores i and j can both proceed since they will be
>>>>> acquiring different locks for different lists.
>>>>> 
>>>>> When lcore k processes its pending timers, it will traverse each
>>>>> skiplist in its array and acquire a skiplist's lock while a run list
>>>>> is broken out; meanwhile, all other lists can continue to be modified.
>>>>> Then, all run lists for lcore k are collected and traversed together
>>>>> so timers are executed in their global order.
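
For illustration, here is a minimal sketch of the per-installer pending-list
layout described above; the type and function names are invented for the
example and are not taken from the actual patch:

/* Illustrative only: one pending list per installing lcore, each with its
 * own lock, hanging off the owning lcore's private timer state. */
#include <rte_lcore.h>
#include <rte_spinlock.h>
#include <rte_timer.h>

struct demo_pending_list {
    rte_spinlock_t lock;      /* guards only this installer's list */
    struct rte_timer head;    /* dummy head of this installer's skiplist */
};

struct demo_priv_timer {
    /* list[i] holds the timers that lcore i installed on this lcore */
    struct demo_pending_list list[RTE_MAX_LCORE];
};

/* Installer side: lcore i adding a timer to lcore k locks only list[i],
 * so a simultaneous installer j never contends with it. */
static void
demo_install(struct demo_priv_timer *pt, unsigned int installer_lcore,
             struct rte_timer *tim)
{
    struct demo_pending_list *pl = &pt->list[installer_lcore];

    rte_spinlock_lock(&pl->lock);
    (void)tim;    /* placeholder for the skiplist insert, ordered by tim->expire */
    rte_spinlock_unlock(&pl->lock);
}

The owning lcore then walks list[0..RTE_MAX_LCORE-1], taking each list's lock
only long enough to break out that installer's expired timers into a run list.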
>>>> 
>>>> What is the performance and/or latency added to the timeout now?
>>>> 
>>>> I worry about the case when just about all of the cores are enabled,
>>>> which could be as high as 128 or more now.
>>> 
>>> There is a case in the timer_perf_autotest that runs rte_timer_manage
>>> with zero timers that can give a sense of the added latency.  When run with
>>> one lcore, it completes in around 25 cycles.  When run with 43 lcores (the
>>> highest I have access to at the moment), rte_timer_manage completes in
>>> around 155 cycles.  So it looks like each added lcore adds around 3 cycles
>>> of overhead for checking empty lists in my testing.
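
For reference, a rough sketch (not the autotest itself) of how the empty-list
cost of rte_timer_manage() can be sampled with the TSC; it assumes EAL setup
and rte_timer_subsystem_init() have already run on the calling lcore:

#include <stdio.h>
#include <inttypes.h>
#include <rte_cycles.h>
#include <rte_timer.h>

/* Rough sketch, not the autotest: average TSC cycles spent in
 * rte_timer_manage() when this lcore has no pending timers. */
static void
demo_measure_empty_manage(void)
{
    const unsigned int iters = 1000000;
    uint64_t start, end;
    unsigned int i;

    start = rte_rdtsc();
    for (i = 0; i < iters; i++)
        rte_timer_manage();    /* nothing pending on this lcore */
    end = rte_rdtsc();

    printf("cycles per empty rte_timer_manage(): %" PRIu64 "\n",
           (end - start) / iters);
}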
>> 
>> Does this mean we have only 25 cycles on the current design or is the 25
>> cycles for the new design?
>> 
> 
> Both - when run with one lcore, the new design becomes equivalent to the 
> original one.  I tested the current design to confirm.

Good thanks

> 
>> If that is for the new design, then what is the old design's cost compared to
>> the new cost?
>> 
>> I also think we need the call to a timer function in the calculation, just to
>> make sure we have at least one timer in the list and we account for any
>> shortcuts in the code for no timers active.
>> 
> 
> Looking at the numbers for non-empty lists in timer_perf_autotest, the 
> overhead appears to fall away.  Here are some representative runs for 
> timer_perf_autotest:
> 
> 43 lcores enabled, installing 1M timers on an lcore and processing them with 
> current design:
> 
> <...snipped...>
> Appending 1000000 timers
> Time for 1000000 timers: 424066294 (193ms), Time per timer: 424 (0us)
> Time for 1000000 callbacks: 73124504 (33ms), Time per callback: 73 (0us)
> Resetting 1000000 timers
> Time for 1000000 timers: 1406756396 (641ms), Time per timer: 1406 (1us)
> <...snipped...>
> 
> 43 lcores enabled, installing 1M timers on an lcore and processing them with 
> proposed design:
> 
> <...snipped...>
> Appending 1000000 timers
> Time for 1000000 timers: 382912762 (174ms), Time per timer: 382 (0us)
> Time for 1000000 callbacks: 79194418 (36ms), Time per callback: 79 (0us)
> Resetting 1000000 timers
> Time for 1000000 timers: 1427189116 (650ms), Time per timer: 1427 (1us)
> <...snipped...>

It looks OK then. The main concern I had was the timers in Pktgen and someone
telling me that the jitter, latency, or performance got worse. I guess I will
just have to wait and see.

> 
> The above are not averages, so the numbers don't really indicate which is 
> faster, but they show that the overhead of the proposed design should not be 
> appreciable.
> 
>>> 
>>>> 
>>>> One option is to have lcore j, which wants to install a timer on
>>>> lcore k, pass a message via a ring to lcore k to add that timer. We
>>>> could even add that logic into setting a timer on a different lcore
>>>> than the caller in the current API. The ring would be multi-producer and
>>>> single consumer, so we still have a lock.
>>>> What am I missing here?
>>>> 
>>> 
>>> I did try this approach: initially I had a multi-producer single-consumer
>>> ring that would hold requests to add or delete a timer from lcore k's
>>> skiplist, but it didn't really give an appreciable increase in my test
>>> application throughput. In profiling this solution, the hotspot had moved
>>> from acquiring the skiplist's spinlock to the rte_atomic32_cmpset that the
>>> multi-producer ring code uses to manipulate the head pointer.
>>> 
>>> Then, I tried multiple single-producer single-consumer rings per target
>>> lcore.  This removed the ring hotspot, but the performance didn't increase
>>> as much as with the proposed solution. These solutions also add overhead to
>>> rte_timer_manage, as it would have to process the rings and then process
>>> the skiplists.
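
For comparison, an illustrative sketch of that per-target SP/SC ring variant;
the names and the ring setup below are assumptions, not the code that was
actually tried:

#include <rte_lcore.h>
#include <rte_ring.h>
#include <rte_timer.h>

/* Illustrative only: ring[k][i] carries install requests from installer
 * lcore i to target lcore k.  Each ring is assumed to be created elsewhere
 * with rte_ring_create(..., RING_F_SP_ENQ | RING_F_SC_DEQ). */
static struct rte_ring *demo_req_ring[RTE_MAX_LCORE][RTE_MAX_LCORE];

/* Hypothetical helper: add 'tim' to this lcore's local pending skiplist. */
static void
demo_add_local(struct rte_timer *tim)
{
    (void)tim;    /* placeholder for the local skiplist insert */
}

/* Installer side: only this lcore ever enqueues on its own ring (SP). */
static int
demo_request_install(unsigned int target_lcore, struct rte_timer *tim)
{
    return rte_ring_enqueue(demo_req_ring[target_lcore][rte_lcore_id()], tim);
}

/* Target side: drain every installer's ring before walking the skiplist,
 * which is the extra work rte_timer_manage would have to absorb. */
static void
demo_drain_requests(unsigned int my_lcore)
{
    void *obj;
    unsigned int i;

    for (i = 0; i < RTE_MAX_LCORE; i++) {
        struct rte_ring *r = demo_req_ring[my_lcore][i];

        while (r != NULL && rte_ring_dequeue(r, &obj) == 0)
            demo_add_local(obj);
    }
}

With RING_F_SP_ENQ | RING_F_SC_DEQ, neither side takes the multi-producer
compare-and-set path, which is where the hotspot moved in the MP ring version.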
>>> 
>>> One other thing to note is that a solution that uses such messages changes
>>> the use models for the timer.  One interesting example is:
>>> - lcore i enqueues a message to install a timer on lcore k
>>> - lcore k runs rte_timer_manage, processes its messages and adds the
>>> timer to its list
>>> - lcore i then enqueues a message to stop the same timer, now owned by
>>> lcore k
>>> - lcore k does not run rte_timer_manage again
>>> - lcore i wants to free the timer, but it might not be safe
>> 
>> This case seems like a mistake to me, as lcore k should continue to call
>> rte_timer_manage() to process any new timers from other lcores, not stop
>> just because its list becomes empty and lcore k does not add timers to its
>> own list.
>> 
>>> 
>>> Even though lcore i has successfully enqueued the request to stop the
>>> timer (and delete it from lcore k's pending list), it hasn't actually been
>>> deleted from the list yet, so freeing it could corrupt the list.  This case
>>> exists in the existing timer stress tests.
>>> 
>>> Another interesting scenario is:
>>> - lcore i resets a timer to install it on lcore k
>>> - lcore j resets the same timer to install it on lcore k
>>> - then, lcore k runs rte_timer_manage
>> 
>> This one also seems like a mistake; more than one lcore setting the same
>> timer seems like a problem and should not be done. An lcore should own a
>> timer and no other lcore should be able to change that timer. If multiple
>> lcores need a timer then they should not share the same timer structure.
>> 
> 
> Both of the above cases exist in the timer library stress tests, so a 
> solution would presumably need to address them or it would be less flexible.  
> The original design passed these tests, as does the proposed one.

I get this twitch when one lcore is adding timers to another lcore, as I come
from a real-time OS background, but I guess if no one else cares or finds a
problem I will have to live with it. Having a test for something does not make
it a good test or a reasonable reason to continue a design issue. We can make
any test work, but whether it is right is the real question, and we will just
have to wait and see, I guess.

> 
>>> 
>>> Lcore j's message obviates lcore i's message, and it would be wasted work
>>> for lcore k to process it, so we should mark it to be skipped over.
>>> Handling all the edge cases was more complex than the solution proposed.
>> 
>> Hmmm, to me it seems simple here as long as the lcores follow the same
>> rules and sharing a timer structure is very risky and avoidable IMO.
>> 
>> Once you have lcores adding timers to another lcore then all accesses to
>> that skip list must be serialized or you get unpredictable results. This
>> should also fix most of the edge cases you are talking about.
>> 
>> Also it seems to me the case with an lcore adding timers to another lcore's
>> timer list is a specific use case and could be handled by a different set of
>> APIs for that specific use case. Then we do not need to change the current
>> design and all of the overhead is placed on the new APIs/design. IMO we are
>> turning the current timer design into a global timer design as it really is
>> a per-lcore design today and I believe that is a mistake.
>> 
> 
> Well, the original API explicitly supports installing a timer to be executed
> on a different lcore, and there are no API changes in the patchset.  Also,
> the proposed design keeps the per-lcore design intact; it only takes what
> used to be one large skiplist that held timers for all installing lcores and
> separates it into N skiplists that correspond 1:1 to the installing lcores.
> When an lcore processes timers on its lists it will still only be managing
> timers it owns, and no others.


Having an API to explicitly support some feature is not a reason to keep
something, but I think you have reduced my twitching some :-) so I will let it
go.

Thanks for the information.

>  
> 
>>> 
>>>>> 
>>>>> Gabriel Carrillo (3):
>>>>> timer: add per-installer pending lists for each lcore
>>>>> timer: handle timers installed from non-EAL threads
>>>>> doc: update timer lib docs
>>>>> 
>>>>> doc/guides/prog_guide/timer_lib.rst |  19 ++-
>>>>> lib/librte_timer/rte_timer.c        | 329 +++++++++++++++++++++++-------------
>>>>> lib/librte_timer/rte_timer.h        |   9 +-
>>>>> 3 files changed, 231 insertions(+), 126 deletions(-)
>>>>> 
>>>>> --
>>>>> 2.6.4
>>>>> 
>>>> 
>>>> Regards,
>>>> Keith
>> 
>> Regards,
>> Keith

Regards,
Keith
