This patch breaks CRIU tests:

===================== Run zdtm/transition/socket-tcp6 in h =====================
Start test
./socket-tcp6 --pidfile=socket-tcp6.pid --outfile=socket-tcp6.out
start time for zdtm/transition/socket-tcp6: 0.90
Run criu dump
Run criu restore
=[log]=> dump/zdtm/transition/socket-tcp6/39/1/restore.log
------------------------ grep Error ------------------------
(00.132120)     39:     restore rcvlowat 1 for socket
(00.132130)     39:     restore mark 0 for socket
(00.132149)     39:             Create fd for 6
(00.132157)     39: Schedule 3 socket for repair off
(00.132183)     39: Error (criu/sockets.c:379): Can't set 1:15 (len 4): 
Structure needs cleaning
(00.132192)     39: Error (criu/files.c:1243): Unable to open fd=6 id=0x8
(00.132241) Error (criu/cr-restore.c:2391): Failed to wait inprogress tasks
(00.132286) Error (criu/cr-restore.c:2568): Restoring FAILED.
------------------------ ERROR OVER ------------------------
############ Test zdtm/transition/socket-tcp6 FAIL at CRIU restore #############

https://travis-ci.org/avagin/linux/jobs/388989833

We use these options to restore tcp sockets. On the first stage, CRIU creates
all sockets with SO_REUSEADDR and SO_REUSEPORT, than it restores established
and listening sockets. After that criu restores values of SO_REUSEADDR and
SO_REUSEPORT options.

On Sun, Jun 03, 2018 at 10:47:05AM -0700, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <m...@google.com>
> 
> It is not safe to do so because such sockets are already in the
> hash tables and changing these options can result in invalidating
> the tb->fastreuse(port) caching.
> 
> This can have later far reaching consequences wrt. bind conflict checks
> which rely on these caches (for optimization purposes).
> 
> Not to mention that you can currently end up with two identical
> non-reuseport listening sockets bound to the same local ip:port
> by clearing reuseport on them after they've already both been bound.
> 
> There is unfortunately no EISBOUND error or anything similar,
> and EISCONN seems to be misleading for a bound-but-not-connected
> socket, so use EUCLEAN 'Structure needs cleaning' which AFAICT
> is the closest you can get to meaning 'socket in bad state'.
> (although perhaps EINVAL wouldn't be a bad choice either?)
> 
> This does unfortunately run the risk of breaking buggy
> userspace programs...
> 
> Signed-off-by: Maciej Żenczykowski <m...@google.com>
> Cc: Eric Dumazet <eduma...@google.com>
> 
> Change-Id: I77c2b3429b2fdf42671eee0fa7a8ba721c94963b
> Reviewed-by: Eric Dumazet <eduma...@google.com>
> ---
>  net/core/sock.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 435a0ba85e52..feca4c98f8a0 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -728,9 +728,22 @@ int sock_setsockopt(struct socket *sock, int level, int 
> optname,
>                       sock_valbool_flag(sk, SOCK_DBG, valbool);
>               break;
>       case SO_REUSEADDR:
> -             sk->sk_reuse = (valbool ? SK_CAN_REUSE : SK_NO_REUSE);
> +             val = (valbool ? SK_CAN_REUSE : SK_NO_REUSE);
> +             if ((sk->sk_family == PF_INET || sk->sk_family == PF_INET6) &&
> +                 inet_sk(sk)->inet_num &&
> +                 (sk->sk_reuse != val)) {
> +                     ret = (sk->sk_state == TCP_ESTABLISHED) ? -EISCONN : 
> -EUCLEAN;
> +                     break;
> +             }
> +             sk->sk_reuse = val;
>               break;
>       case SO_REUSEPORT:
> +             if ((sk->sk_family == PF_INET || sk->sk_family == PF_INET6) &&
> +                 inet_sk(sk)->inet_num &&
> +                 (sk->sk_reuseport != valbool)) {
> +                     ret = (sk->sk_state == TCP_ESTABLISHED) ? -EISCONN : 
> -EUCLEAN;
> +                     break;
> +             }
>               sk->sk_reuseport = valbool;
>               break;
>       case SO_TYPE:

Reply via email to