Hi all,

Another approach that I'd like to put out for consideration is as follows:

Let's say we introduce one flag per lcore - multi_pendlists. This flag
indicates whether that lcore supports multiple pending lists (one per
source lcore) or not, and by default it is set to false.

At rte_timer_subsystem_init() time, each lcore will be configured to use
a single pending list (rather than multiple). A new API,
rte_timer_subsystem_set_multi_pendlists(unsigned lcore_id), can be called
to enable multi_pendlists for a particular lcore. It should be called
after rte_timer_subsystem_init(), and before any timers are started for
that lcore.

When timers are started for a particular lcore, that lcore's
multi_pendlists flag will be inspected to determine whether they should
go into a single list or into one of several lists.

When an lcore processes its timers with rte_timer_manage(), it will look
at its multi_pendlists flag. If the flag is false, it will process only a
single list, which should bring the overhead back down to nearly what it
was originally. If the flag is true, it will break out the run lists from
the multiple pending lists in sequence and process them, as in the
current patch.
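A rough sketch of what I have in mind is below. The names, fields, and
helpers (struct skiplist, pendlist_for(), timers_already_started()) are
illustrative only, not a final implementation:

    #include <errno.h>
    #include <stdbool.h>

    #include <rte_config.h>   /* RTE_MAX_LCORE */

    struct skiplist;          /* stand-in for the existing pending-list type */

    struct priv_timer {
            bool multi_pendlists;                    /* false by default */
            struct skiplist *pending[RTE_MAX_LCORE]; /* only [0] used when false */
            /* ... existing fields ... */
    };

    static struct priv_timer priv_timer[RTE_MAX_LCORE];

    /* hypothetical check that no timers have been started for an lcore */
    static bool timers_already_started(unsigned int lcore_id);

    /* Proposed API: enable multiple pending lists for one target lcore.
     * Call after rte_timer_subsystem_init() and before any timers are
     * started for that lcore. */
    int
    rte_timer_subsystem_set_multi_pendlists(unsigned int lcore_id)
    {
            if (timers_already_started(lcore_id))
                    return -EBUSY;
            priv_timer[lcore_id].multi_pendlists = true;
            return 0;
    }

    /* On timer start: pick the list based on the target lcore's flag. */
    static struct skiplist *
    pendlist_for(unsigned int tgt_lcore, unsigned int installer_lcore)
    {
            if (!priv_timer[tgt_lcore].multi_pendlists)
                    return priv_timer[tgt_lcore].pending[0];        /* single list */
            return priv_timer[tgt_lcore].pending[installer_lcore];  /* per-installer */
    }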
Thoughts or comments?

Thanks,
Gabriel

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, August 29, 2017 5:57 AM
> To: Carrillo, Erik G <erik.g.carri...@intel.com>; rsanf...@akamai.com
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 1/3] timer: add per-installer pending
> lists for each lcore
>
> Hi Gabriel,
>
> >
> > Instead of each priv_timer struct containing a single skiplist, this
> > commit adds a skiplist for each enabled lcore to priv_timer. In the
> > case that multiple lcores repeatedly install timers on the same target
> > lcore, this change reduces lock contention for the target lcore's
> > skiplists and increases performance.
>
> I am not an rte_timer expert, but there is one thing that worries me:
> it seems that the complexity of timer_manage() has increased quite a bit
> with this patch: it now has to check/process up to RTE_MAX_LCORE
> skiplists instead of one, and it also has to properly sort up to
> RTE_MAX_LCORE lists of retrieved (ready-to-run) timers.
> Wouldn't all that affect its running time?
>
> I understand your intention to reduce lock contention, but I suppose at
> least it could be done in a configurable way. Let's say we allow the
> user to specify the dimension of pending_lists[] at the init phase or
> so. Then a timer from lcore_id=N will end up in
> pending_lists[N % RTE_DIM(pending_lists)].
>
> Another thought - it might be better to divide the pending timers list
> not by client (lcore) id, but by expiration time - some analog of a
> timer wheel or so. That, I think, might greatly decrease the probability
> that timer_manage() and timer_add() will try to access the same list.
> On the other hand, timer_manage() would still have to consume the skip
> lists one by one. Though I suppose that's quite a radical change from
> what we have right now.
> Konstantin
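P.S. For concreteness, my reading of the configurable-dimension idea
above is roughly the following (again just a sketch; the names
num_pendlists, rte_timer_subsystem_init_ext(), and pendlist_index() are
hypothetical):

    #include <errno.h>

    #include <rte_config.h>   /* RTE_MAX_LCORE */

    static unsigned int num_pendlists = 1;  /* default: one list, as today */

    /* hypothetical init-time knob for the number of pending lists */
    int
    rte_timer_subsystem_init_ext(unsigned int n_pendlists)
    {
            if (n_pendlists == 0 || n_pendlists > RTE_MAX_LCORE)
                    return -EINVAL;
            num_pendlists = n_pendlists;
            /* ... existing rte_timer_subsystem_init() work ... */
            return 0;
    }

    /* a timer installed from lcore N lands in list N % num_pendlists */
    static inline unsigned int
    pendlist_index(unsigned int installer_lcore_id)
    {
            return installer_lcore_id % num_pendlists;
    }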