failsafe: replace local sub-device with shared data

Gaëtan Rivet Thu, 07 Mar 2019 01:48:18 -0800

On Thu, Mar 07, 2019 at 08:43:01AM +0000, Raslan Darawsheh wrote:
> Hi,
> 
> > -----Original Message-----
> > From: Thomas Monjalon <[email protected]>
> > Sent: Wednesday, March 6, 2019 8:02 PM
> > To: Gaëtan Rivet <[email protected]>; Raslan Darawsheh
> > <[email protected]>
> > Cc: [email protected]; [email protected]
> > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/failsafe: replace local 
> > sub-device
> > with shared data
> > 
> > 06/03/2019 11:46, Gaëtan Rivet:
> > > On Tue, Mar 05, 2019 at 06:58:04PM +0100, Thomas Monjalon wrote:
> > > > 05/03/2019 18:38, Gaëtan Rivet:
> > > > > What happens when a primary process closes a device before a
> > secondary?
> > > > > Is the secondary unable to stop / close its own then? Isn't there
> > > > > some missing uninit?
> > > >
> > > > Is the secondary process supposed to do any closing?
> > > > The device management should be done only by the primary process.
> > > >
> > > > Note: anyway all this hotplug related code should be dropped from
> > > > failsafe to be replaced by EAL hotplug management.
> > > >
> > >
> > > I don't know, I've never used secondary process.
> > > However, cursory reading the code of rte_eth_dev_close(), I don't see
> > > a guard against calling it from a secondary process?
> > 
> > Yes indeed, there is no guard.
> > That's something not clear in DPDK, previously we were attaching some
> > vdevs in secondary only.
> > 
> > > Reading code like
> > >
> > >    rte_free(dev->data->rx_queues);
> > >    dev->data->rx_queues = NULL;
> > >
> > > within makes me think the issue has been seen at least once, where
> > > shared data is freed multiple times, so I guessed some secondary
> > > processes were calling it. Maybe they are not meant to, but what
> > > prevents them from being badly written?
> > >
> > > Also, given rte_dev_remove IPC call to transfer the order to the
> > > primary, it seems that at least secondary processes are expected to
> > > call
> > > rte_dev_remove() at some point? So are they only authorized to call
> > > rte_dev_remove() (to manage hotplug), but not rte_eth_dev_close()? Is
> > > there a specific documentation detailing the design of secondary
> > > process and the related responsibilities in the lifetime of a device?
> > > How are they synching their rte_eth_devices list if they are not
> > > calling rte_eth_dev_close(), ever?
> > 
> > All these calls should be done in primary.
> > The IPC mechanism calls the attach/detach in secondary at EAL level.
> > The PMDs does the bridge between EAL device and ethdev port status.
> > But you are right, there can be a sync issue if closing an ethdev port and 
> > not
> > removing the EAL device.
> > This is a generic question about deciding whether we want all ethdev ports 
> > to
> > be synced in multi-process or not.
> > 
> > In failsafe context, we are closing the EAL device and change the state of 
> > the
> > sub-device accordingly. So I think there is no issue.
> > 
> > > > > This seems dangerous to me. Why not instead allocating a
> > > > > per-process slab of memory that would hold the relevant references
> > > > > and outlive the shared data (a per-process rte_eth_dev private 
> > > > > data...).
> > > >
> > > > Which data do you think should be allocated per process?
> > > >
> > > >
> > >
> > > [-------- SHARED SPACE --------------] [-- PER-PROCESS --------]
> > > +--------------------------------------------------------------+
> > > | +------------------+                +- rte_eth_devices[n] -+ |
> > > | |rte_eth_dev_data  |<---------------+ data                 | | PRIMARY
> > > | |                  |   +dev_priv-+  |                      | |
> > > | |      dev_private +-->|         |  |                      | |
> > > | |              ... |   |         |  |                      | |
> > > | |          port_id |   |         |  |                      | |
> > > | |                  |   |         |  |                      | |
> > > | |                  |   |         |  |                      | |
> > > | |                  |   |         |  +----------------------+ |
> > > | |                  |   |         |  +- rte_eth_devices[n] -+ |
> > > | |                  |   |         |  |                      | | SECONDARY
> > > | |                  |   |         |  |                      | |
> > > | |                  |   |         |  |                      | |
> > > | |                  |   |         |  |                      | |
> > > | |                  |   +---------+  |                      | |
> > > | |                  |<---------------+ data                 | |
> > > | +------------------+                +----------------------+ |
> > > +--------------------------------------------------------------+
> > >
> > > Here port_id is used within fail-safe to get back to rte_eth_devices[n].
> > > This disappears once a device is closed, as all shared space is zeroed.
> > >
> > > This means that sometimes ETH(sdev) and PORT_ID(sdev) is correct, and
> > > at some point it is not anymore, once a sub-device has been closed.
> > > This seems dangerous.
> > 
> > The state of the sub-device is changed.
> > I don't see any issue.
> > 
> > > I was thinking initially that allocating a place where each sdev would
> > > store their rte_eth_devices / port_id back-reference could alleviate
> > > the issue, meaning that the fail-safe would not zero it on
> > > sdev_close(), and it would remain valid for the lifetime of a
> > > sub-device, so even when a sub-device is in DEV_PROBED state.
> > >
> > > But now that I think about it, it could probably be simpler: instead
> > > of using (ETH(sdev)->data->port_id) for the port_id of an sdev
> > > (meaning that it is dependent on the lifetime of the sdev, instead of
> > > the lifetime of the failsafe), the port-id itself should be stored in
> > > the sub_device structure. This structure will be available for the
> > > lifetime of the failsafe, and the port_id is correct accross all 
> > > processes.
> > >
> > > So PORT_ID(sdev) would be defined to something like (sdev->port_id),
> > > and
> > > ETH(sdev) would be (&rte_eth_devices[PORT_ID(sdev)]). It would remain
> > > correct even once the primary has closed the sub-device.
> > >
> > > What do you think? Do you agree that the current state is dangerous,
> > > and do you think the solution would alleviate the issue? Maybe the
> > > concern is unfounded and the only issue is within fs_dev_remove().
> > 
> > Yes it is only seen in fs_dev_remove().
> > I discussed about your proposal with Raslan, and we agree we could change
> > from sub_device.data to sub_device.port_id, it may be more future-proof.
> > 
> > I have only one doubt: look at the macro in this patch:
> > 
> > #define ETH(sdev) \
> >     ((sdev)->data == NULL ? NULL : &rte_eth_devices[(sdev)->data-
> > >port_id])
> > 
> > The NULL check cannot be done with a port id.
> > I think it was needed to manage one case. Raslan?
> 
> That's right since we need it for fs_tx_unsafe, to add a protection for 
> plugged out devices during TX.


Ok, thanks for your insights Thomas and Raslan. Sorry about the rambling
above I needed to write down the stuff to think about it.

You can use RTE_MAX_ETHPORTS as a sentinel value for port_id, this way
the value is kept unsigned and there are several checks against this
specific value otherwise.

so ETH(sdev) could be

        (PORT_ID(sdev) >= RTE_MAX_ETHPORTS ? NULL : 
&rte_eth_devices[PORT_ID(sdev)])

-- 
Gaëtan Rivet
6WIND

Re: [dpdk-dev] [PATCH v2 3/4] net/failsafe: replace local sub-device with shared data

Reply via email to