> From: Sunil Muthuswamy <sunil...@microsoft.com>
> Sent: Tuesday, May 14, 2019 5:56 PM
> ...
> +static bool hvs_close_lock_held(struct vsock_sock *vsk)
>  {
> ...
> +     schedule_delayed_work(&vsk->close_work, HVS_CLOSE_TIMEOUT);

Reviewed-by: Dexuan Cui <de...@microsoft.com>

The patch looks good to me. Thank you, Sunil!

Next, we're going to remove the "channel->rescind" check in 
vmbus_hvsock_device_unregister() -- when doing that, IMO we need to
fix a potential race revealed by the schedule_delayed_work() in this
patch:

When hvs_close_timeout() finishes, the "sk" struct has been freed, but
vmbus_onoffer_rescind() -> channel->chn_rescind_callback(), i.e. 
hvs_close_connection(), may be still running and referencing the "chan"
and "sk" structs (), which should no longer be referenced when 
hvs_close_timeout() finishes, i.e. "get_per_channel_state(chan)" is no
longer safe. The problem is: currently there is no sync mechanism
between vmbus_onoffer_rescind() and hvs_close_timeout().

The race is a real issue only after we remove the "channel->rescind"
in vmbus_hvsock_device_unregister().

I guess we need to introduce a new single-threaded workqueue in the
vmbus driver, and offload both vmbus_onoffer_rescind() and 
hvs_close_timeout() onto the new workqueue.

Thanks,
-- Dexuan

Reply via email to