> >>>>> On Tue, 6 Sep 2022 16:05:11 +0800 Yiding Zhou
> >>>>> <yidingx.z...@intel.com> wrote:
> >>>>>
> >>>>>> The pcap file will be synchronized to the disk when stopping the
> >>>>>> device.
> >>>>>> It takes a long time if the file is large that would cause the
> >>>>>> 'detach sync request' timeout when the device is closed under
> >>>>>> multi-process scenario.
> >>>>>>
> >>>>>> This commit fixes the issue by using alarm handler to release dumper.
> >>>>>>
> >>>>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> >>>>>> Cc: sta...@dpdk.org
> >>>>>>
> >>>>>> Signed-off-by: Yiding Zhou <yidingx.z...@intel.com>
> >>>>>
> >>>>>
> >>>>> I think you need to redesign the handshake if this the case.
> >>>>> Forcing 30 second delay at the end of all uses of pcap is not
> >>>>> acceptable.
> >>>>
> >>>> @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> >>>
> >>> Hi, Ferruh
> >>> Sorry for the late reply.
> >>> I did not receive your email on Oct 6, I got your comments from patchwork.
> >>>
> >>> "Can you please provide more details on multi-process communication
> >>> and call trace, to help us think about a solution to address this
> >>> issue in a more generic way (not just for pcap but for any case
> >>> device close takes more than multi-process timeout)?"
> >>>
> >>> I try to explain this issue with a sequence diagram, hope it can be
> >>> displayed correctly in the mail.
> >>>
> >>> thread                     intr thread                intr thread             thread
> >>> of secondary               of secondary               of primary              of primary
> >>>     |                          |                          |                      |
> >>>     |                          |                          |                      |
> >>> rte_eal_hotplug_remove
> >>> rte_dev_remove
> >>> eal_dev_hotplug_request_to_primary
> >>> rte_mp_request_sync ------------------------------------------------------------>|
> >>>                                                                                  |
> >>>                                                                      handle_secondary_request
> >>>                                                           |<---------------------|
> >>>                                                           |
> >>>                                              __handle_secondary_request
> >>>                                              eal_dev_hotplug_request_to_secondary
> >>>     |<-------------------------------------- rte_mp_request_sync
> >>>     |
> >>> handle_primary_request ------->|
> >>>                                |
> >>>                     __handle_primary_request
> >>>                     local_dev_remove(this will take long time)
> >>>                     rte_mp_reply ------------------------>|
> >>>                                                           |
> >>>                                                   local_dev_remove
> >>>     |<------------------------------------------- rte_mp_reply
> >>>
> >>> The marked 'local_dev_remove()' in the secondary process will perform a
> >>> pcap file synchronization operation.
> >>> When the pcap file is too large, it will take a lot of time (according
> >>> to my test 100G takes 20+ seconds).
> >>> This caused the processing of hot_plug message to time out.
> >>
> >> Hi Yiding,
> >>
> >> Thanks for the information,
> >>
> >> Right now all MP operations timeout is hardcoded in the code and it
> >> is 5 seconds.
> >> Do you think does it work to have an API to set custom timeout,
> >> something like `rte_mp_timeout_set()`, and call this from pdump?
> >>
> >> This gives a generic solution for similar cases, not just for pcap.
> >> But my concern is if this is too much multi-process related internal
> >> detail to update, @Anatoly may comment on this.
> >
> > Hi, Ferruh
> > For pdump case only, I think the timeout is affected by pcap's size and
> > other system components, such as the type of FS, system memory size.
> > It may be difficult to predict the specific time value for setting.
>
> It doesn't have to be specific.
>
> Point here is to have a multi process API to set timeout, instead of put a
> hardcoded timeout in pcap PMD.
OK, I understood.
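
To check my understanding, here is a minimal sketch of the idea in C. The names `mp_timeout_set()` and `mp_timeout_get()` are hypothetical and are not existing DPDK API; the 5 second default is the value mentioned in this thread. The sketch only shows a process-wide timeout that a caller could raise before a slow device close, and it assumes the value is set from a single thread (no locking).

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Default taken from the discussion: MP requests time out after 5 seconds. */
#define MP_TIMEOUT_DEFAULT_S 5

/* Process-wide timeout used for multi-process requests (hypothetical). */
static struct timespec mp_timeout = {
	.tv_sec = MP_TIMEOUT_DEFAULT_S,
	.tv_nsec = 0,
};

/*
 * Hypothetical setter: replace the hardcoded timeout with a caller-supplied
 * one. Returns 0 on success, -EINVAL if the value is missing, negative,
 * malformed, or zero.
 */
int
mp_timeout_set(const struct timespec *ts)
{
	if (ts == NULL)
		return -EINVAL;
	if (ts->tv_sec < 0 || ts->tv_nsec < 0 || ts->tv_nsec >= 1000000000L)
		return -EINVAL;
	if (ts->tv_sec == 0 && ts->tv_nsec == 0)
		return -EINVAL;

	mp_timeout = *ts;
	return 0;
}

/* Hypothetical getter: the request path would read this instead of a #define. */
struct timespec
mp_timeout_get(void)
{
	return mp_timeout;
}
```

With something like this, a caller that expects a slow close could raise the timeout (for example to 30 seconds, an arbitrary illustrative value) before detaching the device, and the value would not need to be an exact prediction of the sync time, only an upper bound.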