> Given what you said above, I agree, at least in the current implementation.
> It still seems like there's a simpler solution that doesn't require all the
> comparative gymnastics.

Yes, there is a simpler solution, but it involves recursive locking.
DPDK recursive spinlocks are not an option here, so the only option is a POSIX
recursive mutex, which I think is an even worse option than these gymnastics.
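
For reference, a minimal sketch of what the POSIX recursive mutex option would look like. The names alarm_list_mtx, alarm_list_mtx_init and nested_lock_demo are hypothetical and not part of EAL; the sketch only shows that the same thread can re-take the lock, which is what a callback calling rte_eal_alarm_cancel would need if the list lock stayed held across the callback.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for alarm_list_lk if it were a POSIX mutex. */
static pthread_mutex_t alarm_list_mtx;

/* Initialise the mutex as recursive; returns 0 on success. */
static int alarm_list_mtx_init(void)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret != 0)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (ret == 0)
		ret = pthread_mutex_init(&alarm_list_mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/*
 * Simulates the callback thread holding the list lock while the callback
 * itself asks for the same lock again. Returns 0 if the nested lock succeeds.
 */
static int nested_lock_demo(void)
{
	int ret;

	ret = pthread_mutex_lock(&alarm_list_mtx);   /* taken by callback thread */
	if (ret != 0)
		return ret;
	ret = pthread_mutex_lock(&alarm_list_mtx);   /* re-taken inside callback */
	if (ret == 0)
		pthread_mutex_unlock(&alarm_list_mtx);
	pthread_mutex_unlock(&alarm_list_mtx);
	return ret;
}
```

Calling alarm_list_mtx_init() once and then nested_lock_demo() should return 0 in both cases. With a plain (non-recursive) lock the second acquisition would block or fail, which is the situation the current code avoids by unlocking before the callback.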

> 
> What if, instead of testing if you're the callback thread, we turn the
> executing field of alarm_entry into a bitfield, where bit 0 represents the
> former "executing" state, and bit 1 is defined as a "cancelled" bit.  Then
> rte_eal_alarm_cancel becomes a search that, when an alarm is found, simply ORs
> the cancelled bit into the executing bit field.  When the callback thread
> runs, it skips executing any alarm that is marked as cancelled, but frees all
> alarm entries that are executed or cancelled.  That gives us a single point at
> which frees of alarm entries happen.  Something like the patch below
> (completely untested)?
> 
> It also seems like the alarm API as a whole could use some improvement.  The
> way it's written right now, there's no way to refer to a specific alarm (i.e.
> cancellation relies on the specification of a function and data pointer, which
> may refer to multiple timers).  Shouldn't rte_eal_alarm_set return an opaque
> handle to a unique timer instance that can be stored by a caller and used to
> specifically cancel that timer?  That's how both the BSD and Linux timer
> subsystems model timers.
> 

The goal was not to break user applications that use this library.

> 
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_alarm.c b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> index 480f0cb..73b6dc5 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_alarm.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c
> @@ -64,6 +64,9 @@
>  #define MS_PER_S 1000
>  #define US_PER_S (US_PER_MS * MS_PER_S)
> 
> +#define ALARM_EXECUTING (1 << 0)
> +#define ALARM_CANCELLED (1 << 1)
> +
>  struct alarm_entry {
>       LIST_ENTRY(alarm_entry) next;
>       struct timeval time;
> @@ -107,12 +110,14 @@ eal_alarm_callback(struct rte_intr_handle *hdl __rte_unused,
>                       gettimeofday(&now, NULL) == 0 &&
>                       (ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec == now.tv_sec &&
>                                               ap->time.tv_usec <= now.tv_usec))){
> -             ap->executing = 1;
> -             rte_spinlock_unlock(&alarm_list_lk);

Removing the unlock here introduces a deadlock.

> +             ap->executing |= ALARM_EXECUTING;
> +             if (likely(!(ap->executing & ALARM_CANCELLED))) {
> +                     rte_spinlock_unlock(&alarm_list_lk);
> 
> -             ap->cb_fn(ap->cb_arg);
> +                     ap->cb_fn(ap->cb_arg);
> 
> -             rte_spinlock_lock(&alarm_list_lk);
> +                     rte_spinlock_lock(&alarm_list_lk);
> +             }
>               LIST_REMOVE(ap, next);
>               rte_free(ap);
>       }
> @@ -209,10 +214,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
>       rte_spinlock_lock(&alarm_list_lk);
>       /* remove any matches at the start of the list */
>       while ((ap = LIST_FIRST(&alarm_list)) != NULL &&
> -                     cb_fn == ap->cb_fn && ap->executing == 0 &&
> +                     cb_fn == ap->cb_fn &&
>                       (cb_arg == (void *)-1 || cb_arg == ap->cb_arg)) {
> -             LIST_REMOVE(ap, next);
> -             rte_free(ap);
> +             ap->executing |= ALARM_CANCELLED;
>               count++;
>       }
>       ap_prev = ap;
> @@ -220,10 +224,9 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
>       /* now go through list, removing entries not at start */
>       LIST_FOREACH(ap, &alarm_list, next) {
>               /* this won't be true first time through */
> -             if (cb_fn == ap->cb_fn &&  ap->executing == 0 &&
> +             if (cb_fn == ap->cb_fn &&
>                               (cb_arg == (void *)-1 || cb_arg == ap->cb_arg)) {
> -                     LIST_REMOVE(ap,next);
> -                     rte_free(ap);
> +                     ap->executing |= ALARM_CANCELLED;
>                       count++;
>                       ap = ap_prev;
>               }
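
To make the proposed scheme easier to discuss, here is a minimal standalone sketch of the "mark as cancelled, free in one place" idea from the patch above. It is only an illustration under several assumptions: it uses plain malloc/free and sys/queue.h instead of rte_malloc, it has no locking and no timing at all, and alarm_set, alarm_cancel, alarm_run_all and count_cb are hypothetical names, not EAL functions. Unlike the patch, the cancel path skips entries that are already marked, so they are not counted twice.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

#define ALARM_EXECUTING (1 << 0)
#define ALARM_CANCELLED (1 << 1)

typedef void (*alarm_cb)(void *);

struct alarm_entry {
	LIST_ENTRY(alarm_entry) next;
	alarm_cb cb_fn;
	void *cb_arg;
	unsigned int state;	/* ALARM_EXECUTING | ALARM_CANCELLED bits */
};

static LIST_HEAD(alarm_list_head, alarm_entry) alarm_list =
	LIST_HEAD_INITIALIZER(alarm_list);

/* Example callback used for demonstration: increments the int it is given. */
static void count_cb(void *arg)
{
	++*(int *)arg;
}

/* Queue an alarm (no expiry time in this sketch). Returns 0 on success. */
static int alarm_set(alarm_cb cb_fn, void *cb_arg)
{
	struct alarm_entry *ap = calloc(1, sizeof(*ap));

	if (ap == NULL)
		return -1;
	ap->cb_fn = cb_fn;
	ap->cb_arg = cb_arg;
	LIST_INSERT_HEAD(&alarm_list, ap, next);
	return 0;
}

/* Only marks matching entries; it never removes or frees them. */
static int alarm_cancel(alarm_cb cb_fn, void *cb_arg)
{
	struct alarm_entry *ap;
	int count = 0;

	LIST_FOREACH(ap, &alarm_list, next) {
		if (cb_fn == ap->cb_fn &&
		    (cb_arg == (void *)-1 || cb_arg == ap->cb_arg) &&
		    !(ap->state & ALARM_CANCELLED)) {
			ap->state |= ALARM_CANCELLED;
			count++;
		}
	}
	return count;
}

/* The single place where entries are freed. Returns callbacks executed. */
static int alarm_run_all(void)
{
	struct alarm_entry *ap;
	int ran = 0;

	while ((ap = LIST_FIRST(&alarm_list)) != NULL) {
		assert(ap->cb_fn != NULL);
		ap->state |= ALARM_EXECUTING;
		if (!(ap->state & ALARM_CANCELLED)) {
			/* real code would release the list lock around this call */
			ap->cb_fn(ap->cb_arg);
			ran++;
		}
		LIST_REMOVE(ap, next);
		free(ap);
	}
	return ran;
}
```

With two alarms set on count_cb and one of them cancelled, alarm_run_all() runs only the remaining callback and leaves the list empty. The sketch does not address the locking question raised above, since it is single-threaded.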

Pawel
