Re: [net] net/tls: missing received data after fast remote close

Vadim Fedorenko Wed, 18 Nov 2020 12:52:54 -0800

On 18.11.2020 16:23, Jakub Kicinski wrote:

On Wed, 18 Nov 2020 02:47:24 +0000 Vadim Fedorenko wrote:

This behavior differs from plain TCP socket and
leads to special treating in user-space. Patch unpauses parser
directly if we have unparsed data in tcp receive queue.

Sure, but why is the parser paused? Does it pause itself on FIN?

No, it doesn't start even once. The trace looks like:


tcp_recvmsg is called
tcp_recvmsg returns 1 (Change Cipher Spec record data)
tls_setsockopt is called
tls_setsockopt returns
tls_recvmsg is called
tls_recvmsg returns 0
__strp_recv is called
stack
           __strp_recv+1
           tcp_read_sock+169
           strp_read_sock+104
           strp_work+68
           process_one_work+436
           worker_thread+80
           kthread+276
           ret_from_fork+34tls_read_size called

So it looks like strp_work was scheduled after tls_recvmsg and
nothing triggered parser because all the data was received before
tls_setsockopt ended the configuration process.

Um. That makes me think we need to flush_work() on the strparser after
we configure rx tls, no? Or __unpause at the right time instead of
dealing with the async nature of strp_check_rcv()?

I'm not sure that flush_work() will do right way in this case. Because:
1. The work is idle after tls_sw_strparser_arm()

Not sure what you mean by idle, it queues the work via strp_check_rcv().

2. The strparser needs socket lock to do it's work - that could be a
deadlock because setsockopt_conf already holds socket lock. I'm not
sure that it's a good idea to unlock socket just to wait the strparser.

Ack, I meant in do_tls_setsockopt() after the lock is released.

The async nature of parser is OK for classic HTTPS server/client case
because it's very good to have parsed record before actual call to recvmsg
or splice_read is done. The code inside the loop in tls_wait_data is slow
path - maybe just move the check and the __unpause in this slow path?

Yeah, looking closer this problem can arise after we start as well :/

How about we move the __unparse code into the loop tho? Seems like this
could help with latency. Right now AFAICT we get a TCP socket ready
notification, which wakes the waiter for no good reason and makes
strparser queue its work, the work then will call tls_queue() ->
data_ready waking the waiting thread second time this time with the
record actually parsed.

Did I get that right? Should the waiter not cancel the work on the
first wake up and just kick of the parsing itself?

I was thinking of the same solution too, but simple check of emptyness of
socket's receive queue is not working in case when we have partial record
in queue - __unpause will return without changing ctx->skb and still having
positive value in socket queue and we will have blocked loop until new data
is received or strp_abort_strp is fired because of timeout. If you could
suggest proper condition to break the loop it would be great.

Or probably I misunderstood what loop did you mean exactly?

Re: [net] net/tls: missing received data after fast remote close

Reply via email to