From: Jeff Guo > On 11/8/2018 3:28 PM, Matan Azrad wrote: > > > > From: Ananyev, Konstantin > >>> -----Original Message----- > >>> From: Guo, Jia > >>> Sent: Wednesday, November 7, 2018 7:30 AM > >>> To: Matan Azrad <ma...@mellanox.com>; Ananyev, Konstantin > >>> <konstantin.anan...@intel.com>; Burakov, Anatoly > >>> <anatoly.bura...@intel.com>; Thomas Monjalon > >> <tho...@monjalon.net>; > >>> Iremonger, Bernard <bernard.iremon...@intel.com>; Wu, Jingjing > >>> <jingjing...@intel.com>; Lu, Wenzhuo <wenzhuo...@intel.com> > >>> Cc: Yigit, Ferruh <ferruh.yi...@intel.com>; dev@dpdk.org; Zhang, > >>> Helin <helin.zh...@intel.com>; He, Shaopeng > <shaopeng...@intel.com> > >>> Subject: Re: [PATCH 3/3] app/testpmd: fix callback issue for > >>> hot-unplug > >>> > >>> matan > >>> > >>> On 11/6/2018 2:36 PM, Matan Azrad wrote: > >>>> Hi Jeff > >>>> > >>>> From: Jeff Guo <jia....@intel.com> > >>>>> Before detach device when device be hot-unplugged, the failure > >>>>> process in user space and kernel space both need to be finished, > >>>>> such as eal interrupt callback need to be inactive before the > >>>>> callback be unregistered when device is being cleaned. This patch > >>>>> add rte alarm for device detaching, with that it could finish > >>>>> interrupt callback soon and give time to let the failure process > >>>>> done > >> before detaching. > >>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure handler") > >>>>> Signed-off-by: Jeff Guo <jia....@intel.com> > >>>>> --- > >>>>> app/test-pmd/testpmd.c | 13 ++++++++++++- > >>>>> 1 file changed, 12 insertions(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > >>>>> 9c0edca..9c673cf 100644 > >>>>> --- a/app/test-pmd/testpmd.c > >>>>> +++ b/app/test-pmd/testpmd.c > >>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char > >>>>> *device_name, enum rte_dev_event_type type, > >>>>> device_name); > >>>>> return; > >>>>> } > >>>>> - rmv_event_callback((void *)(intptr_t)port_id); > >>>>> + /* > >>>>> + * Before detach device, the hot-unplug failure > process in > >>>>> + * user space and kernel space both need to be > finished, > >>>>> + * such as eal interrupt callback need to be inactive > before > >>>>> + * the callback be unregistered when device is being > cleaned. > >>>>> + * So finished interrupt callback soon here and give > time to > >>>>> + * let the work done before detaching. > >>>>> + */ > >>>>> + if (rte_eal_alarm_set(100000, > >>>>> + rmv_event_callback, (void > >>>>> *)(intptr_t)port_id)) > >>>>> + RTE_LOG(ERR, EAL, > >>>>> + "Could not set up deferred device > >>>> It looks me strange to use callback and alarm to remove a device: > >>>> Why not to use callback and that is it? > >>>> > >>>> I think that it's better to let the EAL to detach the device after > >>>> all the > >> callbacks were done and not to do it by the user callback. > >>>> So the application\callback owners just need to clean its resources > >>>> with understanding that after the callback the device(and the > >>>> callback > >>> itself) will be detached by the EAL. > >>> > >>> > >>> Firstly, at the currently framework and solution, such as callback > >>> for RTE_ETH_EVENT_INTR_RMV, still need to use the deferred device > >> removal, > >>> we tend to give the control of detaching device to the application, > >>> and the whole process is located on the user's callback. Notify app > >>> to detach device by callback but make it deferred, i think it is fine. > > But the device must be detached in remove event, why not to do it in EAL? > > > I think it because of before detached the device, application should be stop > the forwarding at first, then stop the device, then close > > the device, finally call eal unplug API to detach device. If eal directly > detach > device at the first step, there will be mountain user space error need to > handle, so that is one reason why need to provider the remove notification > to app, and let app to process it.
This is why the EAL need to detach the device only after all the user callbacks were done. > > > >> It is also unclear to me my we need an alarm here. > >> First (probably wrong) impression we just try to hide some > >> synchronization Problem by introducing delay. > > Looks like, the issue is that the callback function memory will be removed > from the function itself (by the detach call), no? > > > Answer here for both Konstantin and Matan. > > Yes, i think matan is right, the interrupt callback will be destroy in the app > callback itself, the sequence is that as below > > hot-unplug interrupt -> interrupt callback -> app callback(return to finish > interrupt callback, delay detaching) -> detach device(unregister interrupt > callback) > > > >> Konstantin > >> > >>> Secondly, the vfio is different with igb_uio for hot-unplug, it > >>> register/unregister hotplug interrupt callback for each device, so > >>> need to makeĀ the callback done before unregister the callback. > >>> > >>> So I think it should be considerate as an workaround here, before we > >>> find a better way. > >>> > >>> > >>>>> removal\n"); > >>>>> break; > >>>>> case RTE_DEV_EVENT_ADD: > >>>>> RTE_LOG(ERR, EAL, "The device: %s has been > added!\n", > >>>>> -- > >>>>> 2.7.4