[AMD Official Use Only - AMD Internal Distribution Only]

Thank you for sharing the details in this thread,


Snipped
>
>
> The crashes are on 22.11, 23.03, 24.11, it is on all dpdk stable versions
> and 25.07 as well.
> Please first close primary testpmd before secondary testpmd application and
> try to close secondary or execute any of the following commands,

Here is my problem: these were working at the start of 2023, so something has
definitely changed.

>
> "show device info all
> show port stats all
> show port xstats all
> set fwd rxonly
> set fwd txonly
> start
> etc"

Did you start rx/tx on the device in the primary or in the secondary
application?

>
> We all agree that these crashes exist. First we tried to prevent the crashes
> at PMD level, but it was not possible to add checks in each PMD. Then we
> tried to add safety checks in the ethdev layer, but that was not suitable:
> once the primary closes, all references to device information (pointers)
> would lead to crashes.
>
> Then we agreed on secondary process monitoring for primary process exiting,
> and it is now resolved at application level, i.e. in testpmd.

This is where I am not clear. You have attempted the fix in `testpmd`.
Please help me understand the cause and impact:

1. Is it only occurring in testpmd?
2. Do other applications (examples) also face this issue?

I still remember solutions designed for network customers in which the primary
allocates and manages memory while the secondary does the rx and tx bursts.
We ended up creating a service thread in the secondary to check the liveness
of the primary.
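
For illustration only, here is a minimal sketch of that kind of liveness check
in plain POSIX C. It is not the code we shipped and not the testpmd change. It
assumes the primary holds a write lock on some file for its whole lifetime
(DPDK's rte_eal_primary_proc_alive() tests the EAL config file in a similar
spirit). The names primary_alive, monitor_primary and struct monitor_ctx are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if another process holds a lock on `path`,
 * 0 if nobody does, -1 on error. Assumption: the primary keeps a write lock
 * on this file while it is running.
 */
static int primary_alive(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Ask the kernel whether a whole-file write lock would conflict. */
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,
    };
    int rc = fcntl(fd, F_GETLK, &fl);
    close(fd);
    if (rc < 0)
        return -1;

    /* F_UNLCK means no other process holds a conflicting lock. */
    return fl.l_type != F_UNLCK;
}

/* Hypothetical state shared between the monitor thread and the datapath. */
struct monitor_ctx {
    const char *lock_path;     /* file the primary is assumed to lock */
    unsigned int interval_ms;  /* polling period */
    atomic_int primary_gone;   /* set to 1 once the primary is not seen */
    atomic_int stop;           /* set to 1 to stop the monitor */
};

/* Thread body: poll until the primary disappears or a stop is requested. */
static void *monitor_primary(void *arg)
{
    struct monitor_ctx *ctx = arg;

    while (!atomic_load(&ctx->stop)) {
        /* Anything other than "alive" (gone or error) is treated as gone. */
        if (primary_alive(ctx->lock_path) != 1) {
            atomic_store(&ctx->primary_gone, 1);
            break;
        }
        struct timespec ts = {
            .tv_sec = ctx->interval_ms / 1000,
            .tv_nsec = (long)(ctx->interval_ms % 1000) * 1000000L,
        };
        nanosleep(&ts, NULL);
    }
    return NULL;
}
```

The secondary would run monitor_primary() via pthread_create() and have its
rx/tx loop check primary_gone before touching anything owned by the primary.
Polling leaves a window of up to one interval, so this limits the damage
rather than removing the race.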

>
> Now, this solution is working perfectly. We can add eal_cleanup for graceful
> exit.
