From:   Eric Dumazet <eric.duma...@gmail.com>
Date:   Sun, 5 Jul 2020 10:08:08 -0700
> On 7/4/20 8:28 AM, Kuniyuki Iwashima wrote:
> > Commit 0c3d79bce48034018e840468ac5a642894a521a3 ("tcp: reduce SYN-ACK
> > retrans for TCP_DEFER_ACCEPT") introduces syn_ack_recalc() which decides
> > if a minisock is held and a SYN+ACK is retransmitted or not.
> > 
> > If rskq_defer_accept is not zero in syn_ack_recalc(), max_retries always
> > has the same value because max_retries is overwritten by rskq_defer_accept
> > in reqsk_timer_handler().
> > 
> > This commit adds two changes:
> > - remove max_retries from the arguments of syn_ack_recalc() and use
> >    rskq_defer_accept instead.
> > - rename thresh to max_retries for readability.
> > 
> 
> Honestly this looks unnecessary code churn to me.
> 
> This will make future backports more error prone.
> 
> Real question is : why do you want this change in the first place ?

The current code checks whether rskq_defer_accept is non-zero twice, once
in reqsk_timer_handler() and once in syn_ack_recalc(). The former check is
redundant.

Also, max_retries can have two meanings in reqsk_timer_handler() depending
on TCP_DEFER_ACCEPT:
  - the number of retries to resend SYN+ACK (unused)
  - the number of retries to drop bare ACK

On the other hand, max_retries in syn_ack_recalc() has only the latter
meaning, and it is confusing because rskq_defer_accept holds the same
(original) value and both values are used.
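
To illustrate the two meanings, here is a simplified user-space model of
the setup done in reqsk_timer_handler(). It is only a sketch, not the
kernel code: struct retry_params and setup_retry_params() are names made up
for this mail, and the adjustment of thresh for young requests is left out.

```c
#include <stdint.h>

/* Hypothetical container for the values reqsk_timer_handler() derives. */
struct retry_params {
	int thresh;           /* limit for resending SYN+ACK */
	int max_retries;      /* meaning depends on TCP_DEFER_ACCEPT */
	uint8_t defer_accept; /* copy of rskq_defer_accept */
};

/* Simplified model of the setup done before calling syn_ack_recalc(). */
static struct retry_params setup_retry_params(int syn_retries,
					      int sysctl_synack_retries,
					      uint8_t rskq_defer_accept)
{
	struct retry_params p;

	p.max_retries = syn_retries ? syn_retries : sysctl_synack_retries;
	p.thresh = p.max_retries;
	p.defer_accept = rskq_defer_accept;

	/* First non-zero check: max_retries changes its meaning here. */
	if (p.defer_accept)
		p.max_retries = p.defer_accept;

	return p;
}
```

With TCP_DEFER_ACCEPT unset, max_retries equals thresh and carries no
extra information. With it set, max_retries is just a copy of
rskq_defer_accept.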

As far as I can see, in the original code, the non-zero check was
reasonable because it was done once and max_retries was evaluated within
the same function (tcp_synack_timer()).


$ git blame net/ipv4/tcp_timer.c 1944972d3bb651474a5021c9da8d0166ae19f1eb
...
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 464) static void tcp_synack_timer(struct sock *sk)
...
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 468)    int max_retries = tp->syn_retries ? : sysctl_tcp_synack_retries;
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 469)    int thresh = max_retries;
...
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 505)    if (tp->defer_accept)
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 506)            max_retries = tp->defer_accept;
...
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 515)                            if ((req->retrans < thresh ||
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 516)                                 (req->acked && req->retrans < max_retries))
^1da177e4c3f4 (Linus Torvalds 2005-04-16 15:20:36 -0700 517)                                && !req->class->rtx_syn_ack(sk, req, NULL)) {


The current code already looks a bit churned and error-prone.

This may be because the name max_retries is ambiguous.

rskq_defer_accept is assigned to max_retries, but the result is not always
the "max". The code checks thresh first and then max_retries, so depending
on the values, max_retries may act as the maximum or may be smaller than
thresh. Moreover, there are three kinds of "retries" in this context: timer
expirations (num_timeout), resending SYN+ACK (thresh), and dropping bare
ACKs (max_retries and rskq_defer_accept).
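
The evaluation order can be sketched with a small user-space model of the
condition quoted in the blame output above. This is only an illustration:
may_retransmit() and effective_limit() are names made up for this mail, and
the rtx_syn_ack() call from the original condition is left out.

```c
#include <stdbool.h>

/* Model of the first part of the condition in tcp_synack_timer(). */
static bool may_retransmit(int retrans, bool acked, int thresh,
			   int max_retries)
{
	/* thresh is checked first; max_retries only matters once acked. */
	return retrans < thresh || (acked && retrans < max_retries);
}

/* Smallest retrans value for which may_retransmit() becomes false. */
static int effective_limit(bool acked, int thresh, int max_retries)
{
	if (acked && max_retries > thresh)
		return max_retries;
	return thresh;
}
```

In this model, max_retries acts as the real limit only when the request has
been acked and max_retries is larger than thresh. Otherwise thresh decides,
even though the variable is called "max".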

In the original code, this was fine because rskq_defer_accept was not used
twice.

Commit 0c3d79bce480 introduced syn_ack_recalc() and delegated the retry
decision to that function.

I think it is better to
  - remove the redundant check of rskq_defer_accept
  - pass only the necessary arguments to syn_ack_recalc()
  - use a more understandable name instead of max_retries in the two
    functions.

For example, max_resends and rskq_defer_accept, or max_syn_ack_retries and
rskq_defer_accept. (I am not sure which name would be the most
understandable for everyone.)
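
To make the proposal concrete, here is a user-space sketch that compares a
model of the current helper with a model of the proposed one. It is based
on my reading of syn_ack_recalc() and is not the patch itself: struct
req_model, recalc_current(), recalc_proposed() and models_agree() are names
made up for this mail, and max_syn_ack_retries is just one of the candidate
names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the request_sock fields used here. */
struct req_model {
	int num_timeout;
	bool acked;
};

/* Model of the current helper: max_retries is passed in separately. */
static void recalc_current(const struct req_model *req, int thresh,
			   int max_retries, uint8_t rskq_defer_accept,
			   bool *expire, bool *resend)
{
	if (!rskq_defer_accept) {
		*expire = req->num_timeout >= thresh;
		*resend = true;
		return;
	}
	*expire = req->num_timeout >= thresh &&
		  (!req->acked || req->num_timeout >= max_retries);
	*resend = !req->acked ||
		  req->num_timeout >= rskq_defer_accept - 1;
}

/* Model of the proposal: no max_retries argument, thresh renamed. */
static void recalc_proposed(const struct req_model *req,
			    int max_syn_ack_retries,
			    uint8_t rskq_defer_accept,
			    bool *expire, bool *resend)
{
	if (!rskq_defer_accept) {
		*expire = req->num_timeout >= max_syn_ack_retries;
		*resend = true;
		return;
	}
	*expire = req->num_timeout >= max_syn_ack_retries &&
		  (!req->acked || req->num_timeout >= rskq_defer_accept);
	*resend = !req->acked ||
		  req->num_timeout >= rskq_defer_accept - 1;
}

/* Returns true if both models agree for every input in a small range. */
static bool models_agree(void)
{
	for (int thresh = 0; thresh <= 8; thresh++) {
		for (int defer = 0; defer <= 8; defer++) {
			for (int t = 0; t <= 10; t++) {
				for (int acked = 0; acked <= 1; acked++) {
					struct req_model req = { t, acked != 0 };
					/* What the caller does today. */
					int max_retries = defer ? defer : thresh;
					bool e1, r1, e2, r2;

					recalc_current(&req, thresh, max_retries,
						       (uint8_t)defer, &e1, &r1);
					recalc_proposed(&req, thresh,
							(uint8_t)defer, &e2, &r2);
					if (e1 != e2 || r1 != r2)
						return false;
				}
			}
		}
	}
	return true;
}
```

If models_agree() returns true, dropping the max_retries argument does not
change the expire/resend decision within the tested range, as long as the
caller keeps assigning rskq_defer_accept to max_retries the way it does
today.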

So, I would like to respin the patch, renaming max_retries to a more
suitable name.

What do you think about this?

Sincerely,
Kuniyuki
