> > > > On Tue, 6 Sep 2022 16:05:11 +0800 Yiding Zhou
> > > > <yidingx.z...@intel.com> wrote:
> > > >
> > > > > The pcap file will be synchronized to the disk when stopping the
> > > > > device.
> > > > > It takes a long time if the file is large that would cause the
> > > > > 'detach sync request' timeout when the device is closed under
> > > > > multi-process scenario.
> > > > >
> > > > > This commit fixes the issue by using alarm handler to release dumper.
> > > > >
> > > > > Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> > > > > Cc: sta...@dpdk.org
> > > > >
> > > > > Signed-off-by: Yiding Zhou <yidingx.z...@intel.com>
> > > >
> > > >
> > > > I think you need to redesign the handshake if this the case.
> > > > Forcing 30 second delay at the end of all uses of pcap is not
> > > > acceptable.
> > >
> > > @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> >
> > Hi, Ferruh
> > Sorry for the late reply.
> > I did not receive your email on Oct 6, I got your comments from patchwork.
> >
> > "Can you please provide more details on multi-process communication
> > and call trace, to help us think about a solution to address this
> > issue in a more generic way (not just for pcap but for any case device
> > close takes more than multi-process timeout)?"
> >
> > I try to explain this issue with a sequence diagram, hope it can be
> > displayed correctly in the mail.
> >
> > thread                      intr thread                 intr thread                 thread
> > of secondary                of secondary                of primary                  of primary
> >     |                           |                           |                           |
> >     |                           |                           |                           |
> > rte_eal_hotplug_remove
> > rte_dev_remove
> > eal_dev_hotplug_request_to_primary
> > rte_mp_request_sync ------------------------------------------------------------------->|
> >                                                                                         |
> >                                                                             handle_secondary_request
> >                                                             |<--------------------------|
> >                                                             |
> >                                                __handle_secondary_request
> >                                                eal_dev_hotplug_request_to_secondary
> >     |<---------------------------------------- rte_mp_request_sync
> >     |
> > handle_primary_request -------->|
> >                                 |
> >                     __handle_primary_request
> >                     local_dev_remove (this will take long time)
> >                     rte_mp_reply -------------------------->|
> >                                                             |
> >                                                local_dev_remove
> >     |<---------------------------------------- rte_mp_reply
> >
> > The marked 'local_dev_remove()' in the secondary process will perform a
> > pcap file synchronization operation.
> > When the pcap file is too large, it will take a lot of time (according to
> > my test 100G takes 20+ seconds).
> > This caused the processing of hot_plug message to time out.
>
> Part of the problem maybe a hidden file sync in some library.
> Normally, closing a file should be fast even with lots of outstanding data.
> The actual write done by OS will continue from file cache.
>
> I wonder if doing some kind of fadvise call might help see
> POSIX_FADV_SEQUENTIAL or POSIX_FADV_DONTNEED

Thanks for your advice, I will try it and give you feedback.
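
A minimal sketch of the kind of fadvise usage suggested above, for
discussion only. It is not the net/pcap code: the driver writes through
the libpcap dumper, while this sketch assumes a plain POSIX file
descriptor on Linux. The helper names dump_advise_sequential() and
dump_write_and_drop() are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200809L /* exposes posix_fadvise() and fdatasync() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: hint that the dump file is accessed sequentially.
 * offset 0 with len 0 covers the whole file.
 * posix_fadvise() returns 0 on success or an error number (it does not set errno).
 */
static int dump_advise_sequential(int fd)
{
	return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}

/*
 * Hypothetical helper: write one chunk at the current file offset, flush it,
 * then ask the kernel to drop the cached pages for that chunk.
 * Returns the number of bytes written (may be a short write), or -1 on error.
 */
static ssize_t dump_write_and_drop(int fd, const void *buf, size_t len)
{
	off_t start;
	ssize_t n;

	assert(buf != NULL);

	start = lseek(fd, 0, SEEK_CUR);
	if (start < 0)
		return -1;

	n = write(fd, buf, len);
	if (n <= 0)
		return n;

	/* POSIX_FADV_DONTNEED leaves dirty pages alone, so flush data first. */
	if (fdatasync(fd) != 0)
		return -1;

	/* Advisory only: a failure here is ignored on purpose. */
	(void)posix_fadvise(fd, start, (off_t)n, POSIX_FADV_DONTNEED);

	return n;
}
```

The idea would be to pay the flush cost in small pieces while capturing,
so that little dirty data is left when the device is closed. Whether this
actually shortens the close path in the pcap PMD is untested here.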