> On 18 Mar 2022, at 12:18, otr...@employees.org wrote:
> 
> Klement,
> 
>>>>>>> Following up on this thread. 
>>>>>>> The changes in 34877 led to some undesired behaviour in the "real 
>>>>>>> world(tm)".
>>>>>>> In the close pattern below it left sessions in established state, and 
>>>>>>> with a relatively low cps
>>>>>>> would consume the whole session table.
>>>>>>> 
>>>>>>> The change here https://gerrit.fd.io/r/c/vpp/+/35692 nat: tweak rfc7857 
>>>>>>> tcp connection tracking 
>>>>>>> proposes to move the needle somewhat more towards protecting the 
>>>>>>> session table.
>>>>>>> Views? Miklos, Klement?
>>>>>>> 
>>>>>>> The RFC7857 state machine introduced in 56c492a is a trade-off.
>>>>>>> It tries to retain sessions as much as possible and also offers
>>>>>>> some protection against spurious RST by re-establishing sessions if data
>>>>>>> is received after the RST. From experience in the wild, this algorithm
>>>>>>> is a little too liberal, as it leaves too many spurious established
>>>>>>> sessions in the session table.
>>>>>>> 
>>>>>>> E.g. an observed pattern is:
>>>>>>> client      server
>>>>>>>          <- FIN, ACK
>>>>>>> ACK      ->
>>>>>>> ACK      ->
>>>>>>> RST, ACK ->
>>>>>> 
>>>>>> So why not just add a new state change where RST+half-closed moves to 
>>>>>> TRANS instead of throwing everything away?
>>>>> 
>>>>> What do you mean by "throwing everything away"?
>>>>> Reset the state flags? Now it goes to transitory, and it will stay in 
>>>>> transitory as long as packets are flowing.
>>>> 
>>>> Assuming the guys writing the RFC gave it thorough thought and that the
>>>> current state tracking was mostly done with the RFC in mind, changing it
>>>> dramatically feels like it might miss corner cases which we are
>>>> currently not aware of. It feels like an "instead of doing one tweak,
>>>> let’s rewrite the whole thing" approach.
>>> 
>>> I wouldn't make too many assumptions about guys writing RFCs, given that 
>>> I'm one of them. ;-)
>> 
>> Oh! When you put it like this … ;-)
>> 
>>> There is a history here, and initially NATs were viewed as breaking the 
>>> Internet architecture, and if NATs should be specified at all, the 
>>> overriding concern was to make them as transparent to applications as 
>>> possible.
>>> Given the centralisation of the Internet and the level of packet 
>>> mangling/middleboxes we now have, combined with the run-out of IPv4 
>>> addresses, applications have been forced to adapt. I don't think you can 
>>> expect long-lived TCP sessions to survive at all anymore.
>> 
>> Wouldn’t it be then easier to just have transitory timeout on for all 
>> sessions all the time? Yes, you would have to turn on (tcp) keepalives for 
>> your (ssh) sessions … And also Miklos might be a bit unhappy, but you would 
>> get a very very simple solution ….
> 
> I suppose by doing the 3-way handshake you have proven to me (the NAT) that 
> you are intending to communicate.
> And by doing that, I promise to be a little kinder to you than I do for a UDP 
> session.
> Would the world break if all sessions got a 2 minute timeout? Probably not.
> Most sessions are very short.
> Addresses, as we have learnt with IPv6, are ephemeral. You need a session
> layer, and to run something like mosh, if you want long-lived ssh-like sessions.
> 
> The current proposal was trying to find a compromise here.

Right. Btw. the RFC says you SHOULD honour the timers, but it doesn’t say you
MUST honour them. Based on the above talk about not expecting long-lived
sessions to be supported anyway, maybe even a very simple "one LRU rules them
all" approach would do: whenever you need a new session, you simply reuse the
session which saw traffic least recently. This way, under pressure, new sessions
would terminate possibly lively old-ish sessions, but you never have to track
anything, and if not under pressure, even broken scenarios could work if the
clients are able to cope with them.
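
The LRU idea can be sketched in a few lines of Python. This is only an illustration of the eviction rule, not VPP code; the class and method names are hypothetical, and a real data plane would use a fixed pool rather than a dictionary.

```python
from collections import OrderedDict


class LruSessionTable:
    """Fixed-size session table that recycles the least recently used entry.

    Hypothetical illustration only; not the VPP NAT implementation.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self._sessions = OrderedDict()

    def touch(self, key):
        """Record traffic on `key`; return the key evicted to make room, if any."""
        evicted = None
        if key in self._sessions:
            # Known session: just mark it as the most recently used.
            self._sessions.move_to_end(key)
        else:
            if len(self._sessions) >= self.capacity:
                # Table full: reuse the slot of the session idle the longest.
                evicted, _ = self._sessions.popitem(last=False)
            self._sessions[key] = True
        return evicted

    def __contains__(self, key):
        return key in self._sessions

    def __len__(self):
        return len(self._sessions)
```

No timers and no TCP state are tracked here; the only cost is that a busy table may evict a session that is still alive.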

Anyhow, if a deployment is actively running out of space, I’d say something is
wrong with the config or setup, or the deployment is simply incorrectly scaled ...

> 
>>> The main concern about RST was to recover from a 3rd party sending RSTs 
>>> into the session.
>>> 
>>> 
>>>>>>> With the current state machine this would leave the session in 
>>>>>>> established state.
>>>>>>> 
>>>>>>> These proposed changes do:
>>>>>>>  - require 3-way handshake to establish session.
>>>>>> 
>>>>>> How does this help? Would you also need to track sequence numbers as was 
>>>>>> done before?
>>>>> 
>>>>> It helps in the case where someone spoofs a SYN; the RFC state machine
>>>>> would then leave the spurious session in established.
>>>>> The proposed state machine will leave it in transitory (the client to 
>>>>> server ACK would never be seen).
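
To make the two behaviours discussed here concrete (a session only counts as established after the full 3-way handshake, and a FIN or RST drops it back to transitory, where it stays), here is a minimal Python sketch. It is not the VPP code and not the change in 35692; the class, field, and flag names are hypothetical, and it assumes the client is on the inside.

```python
from enum import Enum


class State(Enum):
    TRANSITORY = "transitory"
    ESTABLISHED = "established"


class TcpSession:
    """Simplified per-session TCP tracking (hypothetical, for illustration)."""

    def __init__(self):
        self.state = State.TRANSITORY
        self.syn_in2out = False  # SYN seen from the inside client
        self.syn_out2in = False  # SYN (or SYN+ACK) seen from the outside server
        self.closing = False     # a FIN or RST has been seen in either direction

    def packet(self, from_inside, flags):
        """Process one packet; `flags` is a set such as {"SYN", "ACK"}."""
        if "FIN" in flags or "RST" in flags:
            # Any close indication moves the session to transitory.
            self.closing = True
            self.state = State.TRANSITORY
            return self.state
        if self.closing:
            # Once closing, further packets keep the session in transitory.
            return self.state
        if "SYN" in flags:
            if from_inside:
                self.syn_in2out = True
            else:
                self.syn_out2in = True
            return self.state
        if "ACK" in flags and from_inside and self.syn_in2out and self.syn_out2in:
            # Third leg of the handshake: only now is the session established.
            self.state = State.ESTABLISHED
        return self.state
```

With this sketch, a spoofed SYN answered by a SYN+ACK never leaves transitory, and the FIN,ACK / ACK / ACK / RST,ACK pattern quoted earlier ends in transitory rather than established.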
>>>> 
>>>> Ah, so you are assuming a legitimate client is connecting to a nefarious 
>>>> server, which cannot produce its own SYN (or ACK) packet, but has the
>>>> capability to spoof a SYN packet, yes?
>>>> Or is it a nefarious client which is unable to produce a SYN packet, but 
>>>> capable of spoofing a SYN packet?
>>> 
>>> Neither, I think. I'm concerned about a nefarious 3rd party trying to attack
>>> the session table. Yes, depending somewhat on how the NAT is configured, the
>>> attacker has to be on the inside. Relying on the 3-way handshake also
>>> ensures the NAT state is better synchronised with the client and server
>>> state than just using the 2-way. Do you see this causing issues?
>> 
>> I don’t see any value added besides code being more complex.
>> If I have inside access I can drain the session table with scapy (which is a 
>> very slow way of doing things) easily even without keeping any local state 
>> and it doesn’t matter if you track 2way or 3way ….
>> (haven’t we had this discussion a couple of times already? feels a bit like 
>> beating a dead horse. NAT just sucks - malicious actor on the inside can 
>> simply make life miserable for all others UNLESS you implement a limit per 
>> inside host).
> 
> You could do a limit per inside host. Now IPv4 is somewhat more beneficial 
> here, but the inside host might have 10/8 to play with still...
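
A per-inside-host limit of the kind mentioned here could look roughly like the following Python sketch. The names are hypothetical and this is not an existing VPP API; it counts sessions per inside source address and refuses new ones above a threshold.

```python
from collections import Counter


class PerHostLimiter:
    """Caps the number of concurrent sessions per inside address (illustrative)."""

    def __init__(self, max_sessions_per_host):
        self.max_sessions_per_host = max_sessions_per_host
        self._count = Counter()

    def try_open(self, inside_addr):
        """Return True and account for the session if the host is under its cap."""
        if self._count[inside_addr] >= self.max_sessions_per_host:
            return False
        self._count[inside_addr] += 1
        return True

    def close(self, inside_addr):
        """Release one session slot previously granted to `inside_addr`."""
        if self._count[inside_addr] > 0:
            self._count[inside_addr] -= 1
            if self._count[inside_addr] == 0:
                # Drop idle hosts so the counter table does not grow unbounded.
                del self._count[inside_addr]
```

The caveat raised above still applies: a counter keyed on the source address does little against a host that can send from many addresses, for example anywhere in 10/8.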
> 
> I do have a proposal written up for an IPv4 plan B that could have been done
> instead of IPv6 and that offers stateless NATs... I was intending to wait until
> April 1st to publish. ;-)
> 
> Best regards,
> Ole
> 
