More details below about the plan for 20.02.

13/02/2020 13:37, Thomas Monjalon:
> Hi,
> 
> This discussion becomes confusing so I do a summary below.
> I think we can do several fixes in 20.02.
> 
> 12/02/2020 14:49, Ferruh Yigit:
> > On 2/3/2020 5:10 PM, Matan Azrad wrote:
> 
> [stripping long discussion in favor of a summary below]
> 
> > > Even if the PMD clear the device pointer, the testpmd still may release 
> > > wrong rte_device.
> > 
> > Yes it may, although that is less likely to occur, it requires a new device 
> > hot
> > added between close() and detach of the other device.
> > 
> > Would you be agree to say there are two problems:
> > 
> > 1) When testpmd close a port, a new attached port can re-use it over writing
> > some fields, relying the data structures of the closed port is not safe.
> > 
> > 2) PMD not cleaning ethdev->device pointer in the .remove() may cause 
> > issues in
> > double detach of a port.
> > 
> > 
> > For (1) I suggest fixing it in the attach path, don't re-use an eth_dev 
> > port id
> > unless it is completely freed, may need to add new state for it. Does it 
> > make sense?
> 
> Yes we could add a CLOSED state which is set on ethdev close.
> When the rte_device is freed, the PMD could set attached ports as UNUSED.
> But given some ethdev ports can be open and closed dynamically,
> I am not sure it is a good solution to keep them in CLOSED state and ask
> PMD to remember them.
> 
> An alternative workaround could be to allocate port_id by incrementing
> a saved biggest id. So the race condition would be very unlikely.
> The drawbacks are having big port_id numbers and changing the id
> allocation algorithm (which is not documented anyway).
> 
> The proposals above for port_id allocation or states rework cannot be
> done in 20.02. Let's discuss and work on it in a separated thread.
> 
> > For (2) PMDs want to get hotplug support needs to fix it.
> 
> Yes PMDs should clear rte_eth_devices[port_id].device in .remove().

I am sending a patch adding
        memset(eth_dev, 0, sizeof(struct rte_eth_dev));
in rte_eth_dev_release_port().
But this patch cannot be merged after 20.02-rc1. It will wait for 20.05.

> We must also protect from user calling detach on a closed port
> by adding a check in cmd_operate_detach_port_parsed(),
> before calling detach_port_device().

I am sending a patch adding RTE_ETH_VALID_PORTID_OR_RET()
in cmd_operate_detach_port_parsed().
It should fix the issue observed by Matan with double detach.
It will be a double protection if keeping the check
port_id_is_invalid() in detach_port_device().

> The hotplug rmv_port_callback() must be able to call detach after close.
> There are three possible fixes:
>       - revert the port_id_is_invalid() check in detach_port_device()
>       - call rte_dev_remove(rte_device) directly
>       - call a new function with rte_device (detach_port_device() can use it)

I am sending a patch implementing the third alternative
as it is both keeping the detach behaviour and fixing the race condition
(i.e. protect from new port re-using the port_id between close and detach).

> About the function detach_port_device() itself, yes this function is
> strange to say the least. It was a convenience for detaching a rte_device
> from a port_id.
> The cleanup of siblings with RTE_ETH_FOREACH_DEV_OF(sibling, dev),
> should probably be removed. I've added it as a temporary solution
> before all PMDs are properly fixed:
>       rte_eth_devices[sibling].device = NULL;

I propose sending such patch in 20.05 in order to merge the memset above
first, and have time to get agreement from all PMD maintainers.

> For info, there is a function detach_device() used by the command
>       "device detach <identifier>"



Reply via email to