On Mon, Nov 30, 2015 at 6:32 PM, Lorenzo Colitti <lore...@google.com> wrote:
> Here is an updated version of the SOCK_DESTROY patch
> incorporating some of the feedback received.
>
> There were two substantial concerns expressed on the approach
> taken in this patch. The first was that it allows applications
> to cause the Linux TCP stack to behave improperly. I believe
> this is addressed as follows:
>
> 1. This new patchset sends a RST in addition to clearing state.
>    This is compliant behaviour: it is the ABORT operation
>    specified in RFC 793 [1]. Any app today can do this by
>    enabling SO_LINGER with a timeout of 0 and calling close.
> 2. Multiple other operating systems implement this behaviour:
>    - FreeBSD has had this since 5.4 in 2005 [2]. It is available
>      to privileged userspace and there is a tool to use it [3].
>    - The FreeBSD commit description states that the idea came
>      from OpenBSD.
>    - iOS has been administratively closing app sockets since
>      iOS 4 [see 4, which states that a socket "might get
>      reclaimed by the kernel" and after that will return EBADF].
>
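For readers who have not used the SO_LINGER route mentioned in
point 1, here is a minimal userspace sketch of it. It is an
illustration only, not part of the patch: the helper names
abort_connection and demo_abort are made up for this example, and
it assumes a Linux-style "struct linger" of two ints.

```python
import socket
import struct


def abort_connection(sock: socket.socket) -> None:
    """Close sock with SO_LINGER on and a 0 s timeout (abortive close)."""
    # struct linger { int l_onoff; int l_linger; }: linger on, timeout 0.
    # With this set, close() discards unsent data and resets the connection
    # instead of going through the normal FIN handshake.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo_abort() -> str:
    """Abort a loopback connection and report what the peer observes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    try:
        abort_connection(client)
        peer.settimeout(5.0)  # do not hang forever if no reset arrives
        try:
            data = peer.recv(1)
        except ConnectionResetError:
            return "reset"
        return "eof" if data == b"" else "data"
    finally:
        peer.close()
        listener.close()
```

On Linux the peer's recv() should fail with ECONNRESET here, so
demo_abort() should return "reset" rather than "eof". The
difference from SOCK_DESTROY is who initiates the abort: here it
is the application that owns the socket.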
> The second concern was that userspace should not be in the
> business of making reachability determinations for TCP sockets;
> that job belongs to the kernel. But userspace makes reachability
> determinations all the time. Most relevant to this patchset:
> "-j REJECT --reject-with tcp-reset" has exactly the same
> effect as SOCK_DESTROY, except that it takes effect only when
> the app writes or the kernel sends a keepalive, not while the
> app is blocked on a read.
>
> Also, there are real use cases where the kernel does not have
> enough information to know that a connection is now inoperable.
> The kernel can tell when a packet cannot be routed, but in
> general it cannot tell that a TCP connection is dead in the
> water because it is now routed to a network where its source
> address is no longer valid [5][6].
>
> Other concerns have been addressed in this version, as follows:
>
> 1. tcp_diag_destroy now does a proper RFC 793 ABORT, i.e., sends
>    a RST to the peer. This is consistent with BSD's tcpdrop, and
>    is more correct in general, even though in most use cases
>    SOCK_DESTROY will only be called when sending a RST is no
>    longer possible (e.g., the network has disconnected).
> 2. Blocking socket operations are interrupted with ECONNABORTED
>    instead of ETIMEDOUT. This addresses Tom's point that
>    ETIMEDOUT is vague and an explicit notification is needed.
>    ECONNABORTED was chosen because it is consistent with BSD.
> 3. SOCK_DESTROY is placed behind an INET_DIAG_DESTROY
>    configuration option, which is off by default.
>
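To illustrate point 2 from the application's side, a rough sketch
of how a reader might tell the explicit ECONNABORTED notification
apart from a plain ETIMEDOUT. The function name read_or_classify
and its return labels are hypothetical, chosen only for this
example; nothing here comes from the patch itself.

```python
import errno


def read_or_classify(sock, nbytes=4096):
    """Return ("data", bytes), ("eof", b""), or (reason, b"") on error."""
    try:
        data = sock.recv(nbytes)
    except OSError as exc:
        if exc.errno == errno.ECONNABORTED:
            # Explicit notification: the connection was aborted locally,
            # so the caller can reconnect right away.
            return ("aborted", b"")
        if exc.errno == errno.ETIMEDOUT:
            # Vague: the connection timed out for some unspecified reason.
            return ("timed_out", b"")
        raise  # any other error is left to the caller
    return ("data", data) if data else ("eof", b"")
```

An app blocked in recv() when its socket is destroyed would take
the "aborted" branch and could reconnect immediately instead of
guessing at what a timeout meant.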
Lorenzo,

This is awesome! The only thing I would suggest is to make
sock_destroy a proto_op so that it can be called from within the
kernel. This should be preferred to externally calling tcp_done
(hopefully we can unexport that symbol then).

Tom

> [1] http://tools.ietf.org/html/rfc793#page-50
> [2] http://svnweb.freebsd.org/base?view=revision&revision=141381
> [3] https://www.freebsd.org/cgi/man.cgi?query=tcpdrop&sektion=8&manpath=FreeBSD+5.4-RELEASE
> [4] https://developer.apple.com/library/ios/technotes/tn2277/_index.html#//apple_ref/doc/uid/DTS40010841-CH1-SUBSECTION3
> [5] http://www.spinics.net/lists/netdev/msg352775.html
> [6] http://www.spinics.net/lists/netdev/msg352952.html
>