From: Jon Maloy <jon.ma...@ericsson.com>
Date: Mon, 15 Jan 2018 17:56:28 +0100

> We have identified a race condition during reception of socket
> events and messages in the topology server.
>
> - The function tipc_close_conn() is releasing the corresponding
>   struct tipc_subscriber instance without considering that there
>   may still be items in the receive work queue. When those are
>   scheduled, in the function tipc_receive_from_work(), they are
>   using the subscriber pointer stored in struct tipc_conn, without
>   first checking if this is valid or not. This will sometimes
>   lead to crashes, as the next call of tipc_conn_recvmsg() will
>   access the now deleted item.
>   We fix this by making the usage of this pointer conditional on
>   whether the connection is active or not. I.e., we check the condition
>   test_bit(CF_CONNECTED) before making the call tipc_conn_recvmsg().
>
> - Since the two functions may be running on different cores, the
>   condition test described above is not enough. tipc_close_conn()
>   may come in between and delete the subscriber item after the condition
>   test is done, but before tipc_conn_recv_msg() is finished. This
>   happens less frequently than the problem described above, but leads
>   to the same symptoms.
>
>   We fix this by using the existing sk_callback_lock for mutual
>   exclusion in the two functions. In addition, we have to move
>   a call to tipc_conn_terminate() outside the mentioned lock to
>   avoid deadlock.
>
> Acked-by: Ying Xue <ying....@windriver.com>
> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>

Applied, thanks Jon.
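
For readers following along without the diff, the locking pattern described in the quoted text can be sketched in userspace C roughly as below. This is not the kernel code from the patch. The names struct conn, struct subscriber, conn_init(), conn_close() and conn_receive_work() are hypothetical stand-ins, a pthread mutex plays the role of sk_callback_lock, and a plain bool plays the role of the CF_CONNECTED bit. The sketch assumes a non-recursive lock, which is why the close on error happens only after the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tipc_subscriber. */
struct subscriber {
    int msgs_handled;
};

/* Hypothetical stand-in for struct tipc_conn. */
struct conn {
    pthread_mutex_t lock;   /* plays the role of sk_callback_lock */
    bool connected;         /* plays the role of the CF_CONNECTED bit */
    struct subscriber *sub; /* released when the connection is closed */
};

static void conn_init(struct conn *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->sub = calloc(1, sizeof(*c->sub));
    assert(c->sub != NULL);
    c->connected = true;
}

/* Close path: clear the flag and free the subscriber under the lock,
 * so a concurrent receiver can never see a half-torn-down connection. */
static void conn_close(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->connected) {
        c->connected = false;
        free(c->sub);
        c->sub = NULL;
    }
    pthread_mutex_unlock(&c->lock);
}

/* Receive path: test the flag and use the subscriber pointer while
 * holding the same lock. Returns 1 if the message was handled, 0 if the
 * connection was already closed, -1 if handling failed and the
 * connection was closed as a result. */
static int conn_receive_work(struct conn *c, bool msg_ok)
{
    int ret = 0;

    pthread_mutex_lock(&c->lock);
    if (c->connected) {
        if (msg_ok) {
            c->sub->msgs_handled++;
            ret = 1;
        } else {
            ret = -1;
        }
    }
    pthread_mutex_unlock(&c->lock);

    /* Terminate only after dropping the lock: conn_close() takes the
     * same non-recursive lock, so calling it earlier would deadlock. */
    if (ret < 0)
        conn_close(c);
    return ret;
}
```

A work item that runs after the close simply sees connected == false and returns 0 without touching the freed subscriber, which is the behaviour the first bullet asks for. Holding the lock across both the test and the use is what addresses the second bullet.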