> I have been looking at a problem reported by Sandesh
> where packet capture does not work if rx/tx burst is done in secondary
> process.
>
> The root cause is that existing rx/tx callback model just doesn't work
> unless the process doing the rx/tx burst calls is the same one that
> registered the callbacks.
>
> An example sequence would be:
> 1. dumpcap (or pdump) as secondary tells pdump in primary to register
>    callback
> 2. secondary process calls rx_burst.
> 3. rx_burst sees the callback but it has pointer pdump_rx which is not
>    necessarily at same location in primary and secondary process.
> 4. indirect function call in secondary to bad location likely causes
>    crash.
As far as I remember, RX/TX callbacks were never intended to work across
multiple processes.
Right now RX/TX callbacks are private to the process; a different process
simply should not see or execute them.
I.e. the callback list is part of 'struct rte_eth_dev' itself, not of
rte_eth_dev.data, which is shared between processes.
It is normal to end up with different lists of callbacks in different
processes for the same port/queue.
So, unless I am missing something, I don't see how we can end up with 3) and 4)
from above.
From my understanding, a secondary process will never see or call the
primary's callbacks.
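To illustrate the layout described above, here is a minimal, self-contained
sketch. It is a simplified model, not the real ethdev code: the struct and
function names (dev_local, dev_shared_data, add_rx_cb, rx_burst) are stand-ins
I made up, and the two "processes" are simulated by two structs in one program
that point at the same shared data.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 4

/* Simplified stand-in for a burst callback; not the real DPDK typedef. */
typedef uint16_t (*rx_cb_fn)(uint16_t port, uint16_t queue, uint16_t nb_pkts);

struct rx_cb {
	struct rx_cb *next;
	rx_cb_fn fn;
};

/* Models the shared part (rte_eth_dev.data): one copy for all processes. */
struct dev_shared_data {
	uint16_t port_id;
};

/* Models the per-process device struct, which owns the callback lists. */
struct dev_local {
	struct dev_shared_data *data;
	struct rx_cb *post_rx_cbs[MAX_QUEUES];
};

static int cb_hits;

/* Example callback: counts invocations and passes the packets through. */
static uint16_t count_cb(uint16_t port, uint16_t queue, uint16_t nb_pkts)
{
	(void)port;
	(void)queue;
	cb_hits++;
	return nb_pkts;
}

/* Registration only touches the caller's own (local) list. */
static void add_rx_cb(struct dev_local *dev, uint16_t queue, struct rx_cb *cb)
{
	cb->next = dev->post_rx_cbs[queue];
	dev->post_rx_cbs[queue] = cb;
}

/* The burst path walks the local list, never another process's list. */
static uint16_t rx_burst(struct dev_local *dev, uint16_t queue, uint16_t nb_pkts)
{
	struct rx_cb *cb;

	for (cb = dev->post_rx_cbs[queue]; cb != NULL; cb = cb->next)
		nb_pkts = cb->fn(dev->data->port_id, queue, nb_pkts);
	return nb_pkts;
}

/* Returns how many times the callback ran across both simulated processes. */
int demo_private_cb_lists(void)
{
	struct dev_shared_data shared = { .port_id = 0 };
	struct dev_local primary = { .data = &shared };
	struct dev_local secondary = { .data = &shared };
	struct rx_cb cb = { .next = NULL, .fn = count_cb };

	cb_hits = 0;
	add_rx_cb(&primary, 0, &cb);
	rx_burst(&primary, 0, 8);    /* primary's list has the callback */
	rx_burst(&secondary, 0, 8);  /* secondary's list is empty: no call */
	return cb_hits;
}
```

In this model the callback registered through 'primary' is never reached from
'secondary', even though both share the same data object.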
About pdump itself: it has been a while since I last looked at it, but as far
as I remember, to make it work the server process has to call
rte_pdump_init(), which in turn registers the PDUMP_MP handler.
I suppose that for a secondary process to act as a 'pdump server' it needs to
call rte_pdump_init() itself, though I am not sure such an option is supported
right now.
>
> Some possible workarounds.
> 1. Keep callback list per-process: messy, but won't crash. Capture
>    won't work without other changes. In this primary would register
>    callback, but secondaries would not use them in rx/tx burst.
>
> 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to
>    have a capture flag. (i.e. don't use indirection). Likely ABI problems.
>    Basically, ignore the rx/tx callback mechanism. This is my
>    preferred solution.
It is not only the capture flag; it is also a question of what to do with the
captured packets (copy them? If so, where to? Examine? Drop? Something else?).
It is probably not the best choice to add all of that to the ethdev API.
> 3. Some fix up mechanism (in EAL mp support?) to have each process fixup
> its callback mechanism.
Probably the easiest way to fix that is to pass extra information to
rte_pdump_enable() that would allow it to distinguish in which process
(local or remote) we want to enable the pdump functionality. Then it could
act accordingly.
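A rough sketch of what I mean, purely as an illustration. Everything here is
hypothetical: the enum pdump_target, the function pdump_enable_on() and the two
stub helpers do not exist in the pdump library; they only show the dispatch on
a local/remote selector.

```c
#include <stdint.h>

/* Hypothetical selector; nothing like it exists in the pdump API today. */
enum pdump_target {
	PDUMP_TARGET_LOCAL,   /* install callbacks in the calling process */
	PDUMP_TARGET_REMOTE,  /* ask the peer process to install them */
};

static int local_registrations;
static int remote_requests;

/* Stub: a real version would add the RX/TX callbacks in this process. */
static int register_local_callbacks(uint16_t port, uint16_t queue)
{
	(void)port;
	(void)queue;
	local_registrations++;
	return 0;
}

/* Stub: a real version would send a multi-process request to the peer. */
static int send_enable_request(uint16_t port, uint16_t queue)
{
	(void)port;
	(void)queue;
	remote_requests++;
	return 0;
}

/* Hypothetical variant of the enable call that takes the target process. */
int pdump_enable_on(uint16_t port, uint16_t queue, enum pdump_target target)
{
	switch (target) {
	case PDUMP_TARGET_LOCAL:
		return register_local_callbacks(port, queue);
	case PDUMP_TARGET_REMOTE:
		return send_enable_request(port, queue);
	}
	return -1; /* unknown target */
}
```

The point is only that the caller states where the callbacks should live, so
they are always registered by the process that will execute them.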
>
> 4. Do something in pdump_init to register the callback in same process
>    context (probably need callbacks to be per-process). Would mean
>    callback is always on independent of capture being enabled.
>
> 5. Get rid of indirect function call pointer, and replace it by index
>    into a static table of callback functions. Every process would have
>    same code (in this case pdump_rx) but at different address. Requires
>    all callbacks to be statically defined at build time.
Doesn't look like a good approach - it will break many things.
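For reference, my reading of option 5 is roughly the sketch below. The names
(cb_table, enum cb_id, shared_queue_state, run_cb) are made up for the example
and are not taken from ethdev. Shared state holds only an index, and each
process resolves it through its own copy of the table.

```c
#include <stdint.h>

typedef uint16_t (*burst_cb_fn)(uint16_t nb_pkts);

static unsigned int captured;

/* Built-in callbacks: same code in every process, possibly other address. */
static uint16_t cb_none(uint16_t nb_pkts)
{
	return nb_pkts;
}

static uint16_t cb_capture(uint16_t nb_pkts)
{
	captured += nb_pkts;
	return nb_pkts;
}

enum cb_id { CB_NONE, CB_CAPTURE, CB_MAX };

/* The table is fixed at build time; entries cannot be added at run time. */
static const burst_cb_fn cb_table[CB_MAX] = {
	[CB_NONE] = cb_none,
	[CB_CAPTURE] = cb_capture,
};

/* Only the index would live in shared memory, never a function pointer. */
struct shared_queue_state {
	uint8_t cb_index;
};

uint16_t run_cb(const struct shared_queue_state *q, uint16_t nb_pkts)
{
	if (q->cb_index >= CB_MAX)
		return nb_pkts; /* unknown index: pass packets through */
	return cb_table[q->cb_index](nb_pkts);
}
```

One obvious cost in this model: an application can no longer register its own
callback function at run time, since anything not in the table has no index.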
> The existing rx/tx callback is not safe if rx/tx burst is called from
> different process than where callback is registered.