[vpp-dev] session_lookup: holes in fib_index_to_table_index?

2020-04-23 Thread Andreas Schultz
Hi,

While trying to troubleshoot a session problem, I've come
across session_table_get_or_alloc:
https://gerrit.fd.io/r/gitweb?p=vpp.git;a=blob;f=src/vnet/session/session_lookup.c;h=4de6fdbe547499fcd256e387370cc5bf0c810206;hb=refs/heads/master#l158

To me it looks like this function could leave holes of invalid/wrong
table_indexes in the fib_index_to_table_index.

Index 0 is initialized explicitly (fib_index = 0), but the other indexes
are only initialized with a valid session table index when the first
session in that fib_index is opened.
When the first session happens to be in e.g. fib_index 10, all indexes from
1 up to including 9 are filled with zero (by vec_validate). The problem is
that for those fib indexes the session table index is now 0.
Fortunately, that session table index is valid, because it points to the
session table for fib_index 0. However, it also means that sessions from
the various fibs leak into each other.

Or do I miss something?

Regards
Andreas

-- 

Andreas Schultz

--
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16145): https://lists.fd.io/g/vpp-dev/message/16145
Mute This Topic: https://lists.fd.io/mt/73213863/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] session_lookup: holes in fib_index_to_table_index?

2020-04-23 Thread Andreas Schultz
ok, I have to retract my initial assertion that this is not a problem.
Indeed for IPv4 the breakage is not visible, but IPv6 in anything other
than fib_index zero is not usable.
The uninitialized entries in the IPv6 fib_index_to_table_index point to a
*IPv4* session table. Of course that does not work and will crash.

The fact that fib_index is passed a u8 into session_table_get_or_alloc does
not help either. A invalid fib_index of ~0 is converted to 255 and
the fib_index_to_table_index for IPv6 happily extendeds to 255 entries.

I'll try to come up with a fix, unless someone beats me to it.

Regards,
Andreas

Am Do., 23. Apr. 2020 um 09:07 Uhr schrieb Andreas Schultz via lists.fd.io
:

> Hi,
>
> While trying to troubleshoot a session problem, I've come
> across session_table_get_or_alloc:
>
> https://gerrit.fd.io/r/gitweb?p=vpp.git;a=blob;f=src/vnet/session/session_lookup.c;h=4de6fdbe547499fcd256e387370cc5bf0c810206;hb=refs/heads/master#l158
>
> To me it looks like this function could leave holes of invalid/wrong
> table_indexes in the fib_index_to_table_index.
>
> Index 0 is initialized explicitly (fib_index = 0), but the other indexes
> are only initialized with a valid session table index when the first
> session in that fib_index is opened.
> When the first session happens to be in e.g. fib_index 10, all indexes
> from 1 up to including 9 are filled with zero (by vec_validate). The
> problem is that for those fib indexes the session table index is now 0.
> Fortunately, that session table index is valid, because it points to the
> session table for fib_index 0. However, it also means that sessions from
> the various fibs leak into each other.
>
> Or do I miss something?
>
> Regards
> Andreas
>
> --
>
> Andreas Schultz
>
> --
> 
>


-- 

Andreas Schultz
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16146): https://lists.fd.io/g/vpp-dev/message/16146
Mute This Topic: https://lists.fd.io/mt/73213863/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] VPP nat ipfix logging problem, need to use thread-specific vlib_main_t?

2020-04-23 Thread Elias Rudberg
Hello,

There was a merge conflict for my previous fix for this. Now I made a
new one, it's essentially the same thing, just avoiding the merge
conflict: https://gerrit.fd.io/r/c/vpp/+/26659

Please have a look at that one and merge if it seems ok. Based on our
experience from the past few weeks it seems good, we have seen no more
ipfix loggning crashes after implementing this fix.

Best regards,
Elias


On Sun, 2020-04-05 at 12:08 +, Dave Barach via lists.fd.io wrote:
> If you have the thread index handy, that's OK. Otherwise, use
> vlib_get_main() which grabs the thread index from thread local
> storage. 
> 
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Elias
> Rudberg
> Sent: Sunday, April 5, 2020 4:58 AM
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] VPP nat ipfix logging problem, need to use thread-
> specific vlib_main_t?
> 
> Hello VPP experts,
> 
> We have been using VPP for NAT44 for a while and it has been working
> fine, but a few days ago when we tried turing on nat ipfix logging,
> vpp crashed. It turned out that the problem went away if we used only
> a single thread, so it seemed related to how threading was handled in
> the ipfix logging code. The crash happened in different ways on
> different runs but often seemed related to the snat_ipfix_send()
> function in plugins/nat/nat_ipfix_logging.c.
> 
> Having looked at the code in nat_ipfix_logging.c I have the following
> theory about what goes wrong (I might have misunderstood something,
> if so please correct me):
> 
> In the the snat_ipfix_send() function, a vlib_main_t data structure
> is used, a pointer to it is fetched in the following way:
> 
>vlib_main_t *vm = frm->vlib_main;
> 
> So the frm->vlib_main pointer comes from "frm" which has been set to
> flow_report_main which is a global data structure from vnet/ipfix-
> export/flow_report.c that as far as I can tell only exists once in
> memory (not once per thread). This means that different threads
> calling the snat_ipfix_send() function are using the same vlib_main_t
> data structure. That is not how it should be, I think, instead each
> thread should be using its own thread-specific vlib_main_t data
> structure.
> 
> A suggestion for how to fix this is to replace the line
> 
>vlib_main_t *vm = frm->vlib_main;
> 
> with the following line
> 
>vlib_main_t *vm = vlib_mains[thread_index];
> 
> in all places where worker threads are using such a vlib_main_t
> pointer. Using vlib_mains[thread_index] means that we are picking the
> thread-specific vlib_main_t data structure for the current thread,
> instead of all threads using the same vlib_main_t. I pushed such a
> change to gerrit, here: https://gerrit.fd.io/r/c/vpp/+/26359
> 
> That fix seems to solve the issue in my tests, vpp does not crash
> anymore after the change. Please have a look at it and let me know if
> this seems reasonable or if I have misunderstood something.
> 
> Best regards,
> Elias
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16147): https://lists.fd.io/g/vpp-dev/message/16147
Mute This Topic: https://lists.fd.io/mt/72786912/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Regarding lookup in vppinfra/hash.c

2020-04-23 Thread Dave Barach via lists.fd.io
Ack. If all else fails, suggest a quick peek at the assembly code.

From: vpp-dev@lists.fd.io  On Behalf Of Peng Xia
Sent: Thursday, April 23, 2020 12:15 AM
To: Peng Xia 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Regarding lookup in vppinfra/hash.c

Sorry, I was confused with the later lines. Seems they have the same effect.

Peng Xia via lists.fd.io 
mailto:gmail@lists.fd.io>> 
于2020年4月21日周二 下午9:50写道:
Hi experts, in vppinfra/hash.c function lookup()
line number 568:
clib_memcpy_fast (old_value, p->direct.value,
should it be clib_memcpy_fast (old_value, &p->direct.value,
?

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16148): https://lists.fd.io/g/vpp-dev/message/16148
Mute This Topic: https://lists.fd.io/mt/73212642/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] session_lookup: holes in fib_index_to_table_index?

2020-04-23 Thread Florin Coras
Hi Andreas, 

I just merged your patch. Thanks for the fix!

Regards,
Florin

> On Apr 23, 2020, at 12:53 AM, Andreas Schultz 
>  wrote:
> 
> ok, I have to retract my initial assertion that this is not a problem. Indeed 
> for IPv4 the breakage is not visible, but IPv6 in anything other than 
> fib_index zero is not usable.
> The uninitialized entries in the IPv6 fib_index_to_table_index point to a 
> *IPv4* session table. Of course that does not work and will crash.
> 
> The fact that fib_index is passed a u8 into session_table_get_or_alloc does 
> not help either. A invalid fib_index of ~0 is converted to 255 and the 
> fib_index_to_table_index for IPv6 happily extendeds to 255 entries.
> 
> I'll try to come up with a fix, unless someone beats me to it.
> 
> Regards,
> Andreas
> 
> Am Do., 23. Apr. 2020 um 09:07 Uhr schrieb Andreas Schultz via lists.fd.io 
>   >:
> Hi,
> 
> While trying to troubleshoot a session problem, I've come across 
> session_table_get_or_alloc:
> https://gerrit.fd.io/r/gitweb?p=vpp.git;a=blob;f=src/vnet/session/session_lookup.c;h=4de6fdbe547499fcd256e387370cc5bf0c810206;hb=refs/heads/master#l158
>  
> 
> 
> To me it looks like this function could leave holes of invalid/wrong 
> table_indexes in the fib_index_to_table_index.
> 
> Index 0 is initialized explicitly (fib_index = 0), but the other indexes are 
> only initialized with a valid session table index when the first session in 
> that fib_index is opened.
> When the first session happens to be in e.g. fib_index 10, all indexes from 1 
> up to including 9 are filled with zero (by vec_validate). The problem is that 
> for those fib indexes the session table index is now 0.
> Fortunately, that session table index is valid, because it points to the 
> session table for fib_index 0. However, it also means that sessions from the 
> various fibs leak into each other.
> 
> Or do I miss something?
> 
> Regards
> Andreas
> 
> -- 
> Andreas Schultz
> 
> -- 
> 
> 
> 
> 
> -- 
> Andreas Schultz
> 

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16149): https://lists.fd.io/g/vpp-dev/message/16149
Mute This Topic: https://lists.fd.io/mt/73213863/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-