If you have the thread index handy, that's OK. Otherwise, use vlib_get_main() 
which grabs the thread index from thread local storage. 

-----Original Message-----
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Elias Rudberg
Sent: Sunday, April 5, 2020 4:58 AM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] VPP nat ipfix logging problem, need to use thread-specific 
vlib_main_t?

Hello VPP experts,

We have been using VPP for NAT44 for a while and it has been working fine, but 
a few days ago when we tried turing on nat ipfix logging, vpp crashed. It 
turned out that the problem went away if we used only a single thread, so it 
seemed related to how threading was handled in the ipfix logging code. The 
crash happened in different ways on different runs but often seemed related to 
the snat_ipfix_send() function in plugins/nat/nat_ipfix_logging.c.

Having looked at the code in nat_ipfix_logging.c I have the following theory 
about what goes wrong (I might have misunderstood something, if so please 
correct me):

In the the snat_ipfix_send() function, a vlib_main_t data structure is used, a 
pointer to it is fetched in the following way:

   vlib_main_t *vm = frm->vlib_main;

So the frm->vlib_main pointer comes from "frm" which has been set to 
flow_report_main which is a global data structure from vnet/ipfix- 
export/flow_report.c that as far as I can tell only exists once in memory (not 
once per thread). This means that different threads calling the 
snat_ipfix_send() function are using the same vlib_main_t data structure. That 
is not how it should be, I think, instead each thread should be using its own 
thread-specific vlib_main_t data structure.

A suggestion for how to fix this is to replace the line

   vlib_main_t *vm = frm->vlib_main;

with the following line

   vlib_main_t *vm = vlib_mains[thread_index];

in all places where worker threads are using such a vlib_main_t pointer. Using 
vlib_mains[thread_index] means that we are picking the thread-specific 
vlib_main_t data structure for the current thread, instead of all threads using 
the same vlib_main_t. I pushed such a change to gerrit, here: 
https://gerrit.fd.io/r/c/vpp/+/26359

That fix seems to solve the issue in my tests, vpp does not crash anymore after 
the change. Please have a look at it and let me know if this seems reasonable 
or if I have misunderstood something.

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#15991): https://lists.fd.io/g/vpp-dev/message/15991
Mute This Topic: https://lists.fd.io/mt/72786912/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to