From: Jon Maloy <jon.ma...@ericsson.com>
Date: Mon, 15 Jan 2018 17:56:28 +0100

> We have identified a race condition during reception of socket
> events and messages in the topology server.
> 
> - The function tipc_close_conn() is releasing the corresponding
>   struct tipc_subscriber instance without considering that there
>   may still be items in the receive work queue. When those are
>   scheduled, in the function tipc_receive_from_work(), they are
>   using the subscriber pointer stored in struct tipc_conn, without
>   first checking if this is valid or not. This will sometimes
>   lead to crashes, as the next call of tipc_conn_recvmsg() will
>   access the now deleted item.
>   We fix this by making the usage of this pointer conditional on
>   whether the connection is active or not. I.e., we check the condition
>   test_bit(CF_CONNECTED) before making the call tipc_conn_recvmsg().
> 
> - Since the two functions may be running on different cores, the
>   condition test described above is not enough. tipc_close_conn()
>   may come in between and delete the subscriber item after the condition
>   test is done, but before tipc_conn_recvmsg() is finished. This
>   happens less frequently than the problem described above, but leads
>   to the same symptoms.
> 
>   We fix this by using the existing sk_callback_lock for mutual
>   exclusion in the two functions. In addition, we have to move
>   a call to tipc_conn_terminate() outside the mentioned lock to
>   avoid deadlock.
> 
> Acked-by: Ying Xue <ying....@windriver.com>
> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>

Applied, thanks Jon.
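
For readers following the thread, the two fixes described in the quoted
commit message can be sketched in user space roughly as follows. This is
only an illustrative model, not the kernel patch: struct conn,
struct subscriber, conn_init(), conn_receive(), conn_close(),
conn_terminate() and subscriber_recvmsg() are hypothetical names, a
pthread mutex stands in for sk_callback_lock, and a bool stands in for
the CF_CONNECTED bit. It also assumes that the terminate helper ends up
taking the same lock, which is one plausible reason the call has to
happen after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; not the kernel's definitions. */
struct subscriber {
    int received;                 /* messages handled so far */
};

struct conn {
    pthread_mutex_t lock;         /* plays the role of sk_callback_lock */
    bool connected;               /* plays the role of the CF_CONNECTED bit */
    struct subscriber *sub;       /* freed when the connection closes */
};

static int conn_init(struct conn *c)
{
    c->sub = calloc(1, sizeof(*c->sub));
    if (!c->sub)
        return -1;
    if (pthread_mutex_init(&c->lock, NULL) != 0) {
        free(c->sub);
        c->sub = NULL;
        return -1;
    }
    c->connected = true;
    return 0;
}

/* Stand-in message handler: negative messages count as errors. */
static int subscriber_recvmsg(struct subscriber *s, int msg)
{
    if (msg < 0)
        return -1;
    s->received++;
    return 0;
}

/* Close path: the flag is cleared and the subscriber freed under the lock,
 * so a receiver can never see connected == true with a freed pointer. */
static void conn_close(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->connected) {
        c->connected = false;
        free(c->sub);
        c->sub = NULL;
    }
    pthread_mutex_unlock(&c->lock);
}

/* Assumed to take the lock itself (here via conn_close()). */
static void conn_terminate(struct conn *c)
{
    conn_close(c);
}

/* Receive path: the flag test and the use of the pointer share one
 * critical section; termination is deferred until the lock is released. */
static int conn_receive(struct conn *c, int msg)
{
    int ret = -1;                 /* treated as failure if not connected */

    pthread_mutex_lock(&c->lock);
    if (c->connected) {
        assert(c->sub != NULL);
        ret = subscriber_recvmsg(c->sub, msg);
    }
    pthread_mutex_unlock(&c->lock);

    if (ret < 0)
        conn_terminate(c);        /* outside the lock: it locks again */
    return ret;
}
```

In this model, checking the flag alone would not be enough: without the
shared lock, conn_close() could free the subscriber between the test and
the call to subscriber_recvmsg(). Calling conn_terminate() while still
holding the mutex would try to take the same lock a second time, which
is why the sketch defers it until after the unlock.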
