On 6/24/20 3:34 AM, Oliver Herms wrote:
> On 24.06.20 00:06, Michal Kubecek wrote:
>> On Tue, Jun 23, 2020 at 01:30:29AM +0200, Oliver Herms wrote:
>>>
>>> I'm encountering the issues due to cache entries that are created by
>>> tnl_update_pmtu. However, I'm going to address that issue in another thread
>>> and patch.
>>>
>>> As entries in the cache can be caused on many ways this should be fixed on
>>> the GC
>>> level.
>>
>> Actually, not so many as starting with (IIRC) 4.2, IPv6 routing cache is
>> only used for exceptions (like PMTU), not for regular lookup results.
>>
>
> Hi Michal,
>
> Right. That is the intention. But reality is, that when sending IPv6 with an
> MPLS encap route into a SIT/FOU tunnel, a cache entry is being created for
> every single
> destination on the tunnel. Now that IS a bug by itself and I'll shortly
> submit a
> patch that should fix that issue.
Thanks !
>
> However, when a tunnel uses PMTU, and a tunnel source received an ICMP packet
> too big
> for the tunnel destination, that triggers creation of IPv6 route cache entries
> (and for IPv4 entries in the corresponding data structure) for every
> destination for which
> packets are sent through the tunnel.
>
> Both these attributes,
> 1. the presence or absence of, maybe spoofed, ICMP packet too big messages
> for the tunnel
> 2. the number of flows through a tunnel (attackers could just create more
> flows)
> are not fully under control by an operator.
>
> Thus the assumption that if only max_size would be big enough, it would solve
> the problem,
> it not correct.
Our intention is to get rid of the IPv6 garbage collection, so we need to make
sure
we do not rely on it ;)
>
> Regarding your argument making the limit "softer":
> There is only 2 functions that use/lock fib6_gc_lock:
> 1. fib6_net_init
> 2. fib6_run_gc
>
> fib6_net_init is only called when the network namespace is initialized.
>
> fib6_run_gc clears all entries that are subject to garbage collection.
> There is no gain in doing that N times (with N = amount of CPUs) and
> spinlocking all CPUs.
> Cleaning once is enough. A GC run is so short, that by the time the GC run is
> finished,
> there's most probably no new entry in the cache that is ready to be removed.
> And even if there is. The next call to ip6_dst_gc will happen when a new entry
> is added to the cache. Thus I can't see how my patch makes time limit softer.
>
This are really minor implementation details, we need to find the root cause.