On 11/12/2018 9:35 AM, Thomas Monjalon wrote:
11/11/2018 08:31, Matan Azrad:
From: Jeff Guo
On 11/9/2018 1:24 PM, Matan Azrad wrote:
   From: Jeff Guo
On 11/8/2018 5:35 PM, Matan Azrad wrote:
From: Jeff Guo
On 11/8/2018 3:28 PM, Matan Azrad wrote:
From: Ananyev, Konstantin
From: Guo, Jia
On 11/6/2018 2:36 PM, Matan Azrad wrote:
Hi Jeff

      From: Jeff Guo <jia....@intel.com>
Before detach device when device be hot-unplugged, the failure
process in user space and kernel space both need to be
finished, such as eal interrupt callback need to be inactive
before the callback be unregistered when device is being
cleaned. This patch add rte alarm for device detaching, with
that it could finish interrupt callback soon and give time to
let the failure process done
before detaching.
Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure
handler")
Signed-off-by: Jeff Guo <jia....@intel.com>
---
      app/test-pmd/testpmd.c | 13 ++++++++++++-
      1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9c0edca..9c673cf 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char
*device_name, enum rte_dev_event_type type,
                                device_name);
                        return;
                }
-               rmv_event_callback((void *)(intptr_t)port_id);
+               /*
+                * Before detach device, the hot-unplug failure
process in
+                * user space and kernel space both need to be
finished,
+                * such as eal interrupt callback need to be inactive
before
+                * the callback be unregistered when device is being
cleaned.
+                * So finished interrupt callback soon here and give
time to
+                * let the work done before detaching.
+                */
+               if (rte_eal_alarm_set(100000,
+                               rmv_event_callback, (void
*)(intptr_t)port_id))
+                       RTE_LOG(ERR, EAL,
+                               "Could not set up deferred device
It looks me strange to use callback and alarm to remove a device:
Why not to use callback and that is it?

I think that it's better to let the EAL to detach the device
after all the
callbacks were done and not to do it by the user callback.
So the application\callback owners just need to clean its
resources with understanding that after the callback the
device(and the callback
itself) will be detached by the EAL.


Firstly, at the currently framework and solution, such as
callback for RTE_ETH_EVENT_INTR_RMV, still need to use the
deferred device
removal,
we tend to give the control of detaching device to the
application, and the whole process is located on the user's
callback. Notify app to detach device by callback but make it
deferred,
i think it is fine.
But the device must be detached in remove event, why not to do it
in
EAL?
I think it because of before detached the device, application
should be stop the forwarding at first, then stop the device, then
close

the device, finally call eal unplug API to detach device. If eal
directly detach device at the first step, there will be mountain
user space error need to handle, so that is one reason why need to
provider the remove notification to app, and let app to process it.
This is why the EAL need to detach the device only after all the
user
callbacks were done.


If i correctly got your meaning, you suppose to let eal to mandatory
detach device but not app, app just need to stop/close port, right?
Yes, the app should stop,close,clean its own resources of the removed
device, Then, EAL to detach the device.

If so, it will need to modify rmv_event_callback by not invoke the
detaching func and add some detaching logic to hotplug framework in eal.

rmv_event_callback is using by other hotplug mechanism (ETHDEV RMV
event), so you need to use another one\ add parameter to it.
And yes, you need to detach the device from EAL, should be simple.

I think rmv_event_callback is original use for other hotplug event (ETHDEV
RMV event), but it still use the common hotplug mechanism(app callback and
app detach),

so i think it will still need to face this callback issue and you could check 
that
eth_event_callback also use the rte alarm to detach device.
The ETHDEV layer cannot know if more than one ethdev port is associated to the 
rte_device,
So, it is not correct to detach a rte_device by the ETHDEV layer. Hence, the 
application should decide in the ETHDEV event case.
Yes, we must not detach a device (rte_dev_remove) if we are not sure all ports 
are closed.

But, in the EAL event case this is different as I explained before.

Moreover, I think that the ethdev RMV event should be deprecated one day, after 
the EAL hot-plug mechanism will be stable.
Yes, we may replace the ethdev RMV event by the EAL event RTE_DEV_EVENT_REMOVE.

so my suggestion is that, you maybe propose a good idea but let we keep on
current mechanism until we come to a final good solution agreement, before
that, just let it functional.
I still suggest to fix the issue by an EAL detaching.
I don't understand the issue, but yes we can call rte_dev_remove in EAL,
after the callback return.
Ideally, the callback should be able to return the decision of cleaning
the device or not.

I suggest several steps (a roadmap for HW unplug):



I mainly agree on that we need to modify currently hotplug framework for HW hot-unplug. But i think we might be think about that.

1) simple delete detach_port_device from rmv_event_callback will affect both currently ethdev event and eal event hot-unplug mechanism.

2) If we remove call to rmv_event_callback in eth_dev_event_callback() , we also need to remove call to rmv_event_callback in eth_event_callback(),  since it also use deferred device removal to waiting the interrupt callback be unregistered in mlx5_dev_interrupt_handler_uninstall.

so i think that basically the multiple port free and the hotplug framework integrate should be an RFC at the begin of 19.02, but should not affect currently mechanism. And i suggest the several steps might be better as.

        - in 18.11, rename eth_dev_event_callback to dev_event_callback
        - in 18.11, rename rmv_event_callback() to rmv_port_callback(), keep 
current function calling when detach one port both for ethdev event and eal 
event. (we could document it for the limitation.)

        - in 19.02, remove call to detach_port_device() from 
rmv_event_callback()
        - in 19.02, call rte_dev_remove() in EAL after RTE_DEV_EVENT_REMOVE 
callback
        - in 19.02, remove call to rmv_event_callback() in 
eth_dev_event_callback()
        - in 19.02, convert all PMDs to free port resources on 
rte_eth_dev_close()
        - in 19.05, automatically call rte_dev_remove() from PMD when closing 
last port
        - in 19.08, we may try to unbind the kernel driver in rte_dev_remove()


Reply via email to