Thread cancellation is a pain point. Emulating it properly is nearly impossible 
(without hooking into various OS calls which are supposed to be "cancellation 
points"). I do like the cancellation token idea, but I'm not sure how existing 
clients, who rely on existing phtread_cancel() semantic, will react to that - 
it will require a code re-write. 

How about we defer fixing this to another follow-up change? 

-----Original Message-----
From: Dmitry Kozlyuk <dmitry.kozl...@gmail.com> 
Sent: Thursday, April 29, 2021 1:44 PM
To: Narcisa Ana Maria Vasile <navas...@linux.microsoft.com>
Cc: dev@dpdk.org; thomas <tho...@monjalon.net>; Khoa To <k...@microsoft.com>; 
Narcisa Ana Maria Vasile <narcisa.vas...@microsoft.com>; Dmitry Malloy 
<dmit...@microsoft.com>; Tyler Retzlaff <roret...@microsoft.com>; 
tal...@nvidia.com; Omar Cardona <ocard...@microsoft.com>; 
bruce.richard...@intel.com; david.march...@redhat.com; Kadam, Pallavi 
<pallavi.ka...@intel.com>
Subject: [EXTERNAL] Re: [PATCH v6 06/10] eal: add thread lifetime management

2021-04-02 18:39 (UTC-0700), Narcisa Ana Maria Vasile:
[...]
> diff --git a/lib/librte_eal/windows/rte_thread.c 
> b/lib/librte_eal/windows/rte_thread.c
> index f61103bbc..86bbd7bc2 100644
> --- a/lib/librte_eal/windows/rte_thread.c
> +++ b/lib/librte_eal/windows/rte_thread.c
> @@ -329,6 +329,131 @@ rte_thread_attr_set_priority(rte_thread_attr_t 
> *thread_attr,
>       return 0;
>  }
>  
> +int
> +rte_thread_create(rte_thread_t *thread_id,
> +               const rte_thread_attr_t *thread_attr,
> +               void *(*thread_func)(void *), void *args) {
> +     int ret = 0;
> +     HANDLE thread_handle = NULL;
> +     GROUP_AFFINITY thread_affinity;
> +
> +     thread_handle = CreateThread(NULL, 0,
> +                             (LPTHREAD_START_ROUTINE)(ULONG_PTR)thread_func,

thread_func returns void* (8 bytes), LPTHREAD_START_ROUTING returns DWORD (4 
byte), is this cast safe?
It seems to be on x86, can't say for ARM64.

> +                             args, 0, thread_id);
> +     if (thread_handle == NULL) {
> +             ret = rte_thread_translate_win32_error();
> +             RTE_LOG_WIN32_ERR("CreateThread()");
> +             goto cleanup;
> +     }

After this point, in case of errors the thread remains running, while the 
function reports failure. It's better to create the thread suspended, configure 
its attributes (destroying the thread on failure), then resume (start) the 
thread.

> +
> +     if (thread_attr != NULL) {
> +             if (CPU_COUNT(&thread_attr->cpuset) > 0) {
> +                     ret = 
> rte_convert_cpuset_to_affinity(&thread_attr->cpuset, &thread_affinity);
> +                     if (ret != 0) {
> +                             RTE_LOG(DEBUG, EAL, "Unable to convert cpuset 
> to thread affinity\n");
> +                             goto cleanup;
> +                     }
> +
> +                     if (!SetThreadGroupAffinity(thread_handle, 
> &thread_affinity, NULL)) {
> +                             ret = rte_thread_translate_win32_error();
> +                             RTE_LOG_WIN32_ERR("SetThreadGroupAffinity()");
> +                             goto cleanup;
> +                     }
> +             }
> +             ret = rte_thread_set_priority(*thread_id, 
> thread_attr->priority);
> +             if (ret != 0) {
> +                     RTE_LOG(DEBUG, EAL, "Unable to set thread priority\n");
> +                     goto cleanup;
> +             }
> +     }
> +
> +     return 0;
> +
> +cleanup:
> +     if (thread_handle != NULL) {
> +             CloseHandle(thread_handle);
> +             thread_handle = NULL;
> +     }
> +     return ret;
> +}

[...]
> +
> +int
> +rte_thread_cancel(rte_thread_t thread_id) {
> +     int ret = 0;
> +     HANDLE thread_handle = NULL;
> +
> +     thread_handle = OpenThread(THREAD_TERMINATE, FALSE, thread_id);
> +     if (thread_handle == NULL) {
> +             ret = rte_thread_translate_win32_error();
> +             RTE_LOG_WIN32_ERR("OpenThread()");
> +             goto cleanup;
> +     }
> +
> +     /*
> +      * TODO: Behavior is different between POSIX and Windows threads.
> +      * POSIX threads wait for a cancellation point.
> +      * Current Windows emulation kills thread at any point.
> +      */
> +     ret = TerminateThread(thread_handle, 0);

This is not acceptable.
There is a handful of pthread_cancel() usages in DPDK.
The threads they cancel take spinlocks and mutexes, so interrupt at arbitrary 
point may cause a deadlock.

User            Cancellation points     Notes
----            -------------------     -----
net/ipn3ke      libfdt functions?
net/kni         usleep                  Linux-only
raw/ifpga       open/read/write/close
vdpa/ifc        read
vdpa/mlx5       usleep, nanosleep
lib/eventdev    rte_epoll_wait          no pthread CP

Possible solutions:

1. Make specific DPDK functions cancellation points for this API and ensure 
DPDK users of pthread_cancel() call it.  This way is almost non-invasive, but 
it's more work in EAL.

2. Divert from pthread style: add cancellation token concept, so that DPDK 
users can check themselves if they're cancelled. This is invasive, but very 
explicit and flexible.

> +     if (ret != 0) {
> +             ret = rte_thread_translate_win32_error();
> +             RTE_LOG_WIN32_ERR("TerminateThread()");
> +             goto cleanup;
> +     }
> +
> +cleanup:
> +     if (thread_handle != NULL) {
> +             CloseHandle(thread_handle);
> +             thread_handle = NULL;

This line is not needed.

> +     }
> +     return ret;
> +}
> +
>  int
>  rte_thread_key_create(rte_thread_key *key,
>               __rte_unused void (*destructor)(void *))


Reply via email to