Hi Ferruh, Andrew,
> -----Original Message----- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Tuesday, October 23, 2018 3:29 PM > To: Lu, Wenzhuo <wenzhuo...@intel.com> > Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yi...@intel.com>; Andrew > Rybchenko <arybche...@solarflare.com> > Subject: Re: [dpdk-dev] [PATCH v2] ethdev: fix device info getting > > 23/10/2018 03:25, Lu, Wenzhuo: > > Hi Thomas, Ferruh, Andrew, > > > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > 22/10/2018 14:01, Ferruh Yigit: > > > > On 8/23/2018 9:58 AM, Andrew Rybchenko wrote: > > > > > On 22.08.2018 19:55, Ferruh Yigit wrote: > > > > >> On 8/14/2018 1:57 AM, Lu, Wenzhuo wrote: > > > > >>> Hi Andrew, > > > > >>> From: Andrew Rybchenko [mailto:arybche...@solarflare.com] > > > > >>>> On 13.08.2018 05:50, Lu, Wenzhuo wrote: > > > > >>>>> From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > > >>>>>> 16/07/2018 03:58, Lu, Wenzhuo: > > > > >>>>>>> Hi Andrew, > > > > >>>>>>> From: Lu, Wenzhuo > > > > >>>>>>>> From: Andrew Rybchenko > [mailto:arybche...@solarflare.com] > > > > >>>>>>>>> > > > > >>>>>>>>> Hi, Wenzhuo, > > > > >>>>>>>>> > > > > >>>>>>>>> I'm sorry, but I have more even harder questions than > > > > >>>>>>>>> the previous > > > > >>>> one. > > > > >>>>>>>>> This questions are rather generic and mainly to ethdev > > > maintainers. > > > > >>>>>>>>> > > > > >>>>>>>>> On 13.07.2018 05:42, Wenzhuo Lu wrote: > > > > >>>>>>>>>> The device information cannot be gotten correctly > > > > >>>>>>>>>> before the configuration is set. Because on some NICs > > > > >>>>>>>>>> the information has dependence on the configuration. > > > > >>>>>>>>> Thinking about it I have the following question. Is it > > > > >>>>>>>>> valid behaviour of the dev_info if it changes after > configuration? > > > > >>>>>>>>> I always thought that the primary goal of the dev_info > > > > >>>>>>>>> is to provide information to app about device > > > > >>>>>>>>> capabilities to allow app configure device and queues > > > > >>>>>>>>> correctly. Now we see the case when dev_info changes on > > > > >>>>>>>>> configure. May be it is acceptable, but it is really > > > > >>>>>>>>> suspicious. If we accept it, it should > > > be documented. > > > > >>>>>>>>> May be dev_info should be split into parts: part which > > > > >>>>>>>>> is persistent and part which may depend on device > configuration. > > > > >>>>>>>> As I remember, the similar discussion has happened :) > > > > >>>>>>>> I've raised the similar suggestion like this. But we > > > > >>>>>>>> don’t make it > > > happen. > > > > >>>>>>>> The reason is, you see, this is the rte layer's behavior. > > > > >>>>>>>> So the user doesn't have to know it. From APP's PoV, it > > > > >>>>>>>> inputs the configuration, it calls this API > "rte_eth_dev_configure". > > > > >>>>>>>> It doesn't know the configuration is copied before > > > > >>>>>>>> getting the > > > info or not. > > > > >>>>>>>> So, to my opinion, we can still keep the behavior. We > > > > >>>>>>>> only need to split it into parts when we do see the case > > > > >>>>>>>> that cannot > > > make it. > > > > >>>>>>> Maybe I talked too much about the patch. Think about it again. > > > > >>>>>>> Your comments is about how to use the APIs, > > > > >>>>>>> rte_eth_dev_info_get, > > > > >>>>>> rte_eth_dev_configure. To my opinion, rte_eth_dev_info_get > > > > >>>>>> is just to get the info. It can be called anywhere, before > > > > >>>>>> configuration or after. It's reasonable the info changes > > > > >>>>>> with the > > > configuration changing. > > > > >>>>>>> But we do have something missing, like, > > > > >>>>>>> rte_eth_dev_capability_get which > > > > >>>>>> should be stable. APP can use this API to get the necessary > > > > >>>>>> info before configuration. > > > > >>>>>>> A question, maybe a little divergent thinking, that APP > > > > >>>>>>> should have some > > > > >>>>>> intelligence to handle the capability automatically. So > > > > >>>>>> getting the capability is not so good and effective, looks > > > > >>>>>> like we still need the human > > > > >>>> involvement. > > > > >>>>>> Maybe that the reason currently we suppose APP know the > > > > >>>>>> capability from the paper copies, examples... > > > > >>>>>> > > > > >>>>>> I am not sure to understand all the sentences. > > > > >>>>>> But I agree that we should take a decision about the > > > > >>>>>> stability of these > > > > >>>> infos. > > > > >>>>>> Either infos cannot change after probing, or we must > > > > >>>>>> document that the app must request infos regularly (when?). > > > > >>>>> Sorry, I missed this mail. > > > > >>>>> > > > > >>>>> I have the concern that different NICs have different behavior. > > > > >>>>> One info > > > > >>>> can be stable on a NIC but dynamic on another. Considering > > > > >>>> this, we may better not splitting the rte_eth_dev_info_get to 2 > APIs. > > > > >>>> And comparing with handling this in rte layer, maybe we can > > > > >>>> let every > > > NIC has its own decision. > > > > >>>>> I have an idea. Maybe we can add a parameter for potential > > > > >>>>> dynamic fields. Like, Changing uint16_t nb_rx_queues; to > > > > >>>>> struct nb_rx_queues { uint16_t value; bool stable; } > > > > >>>> May be it is just very bad example, but as I understand > > > > >>>> nb_rx_queues is mainly required to configure the device properly. > > > > >>>> Or should app configure, get new value, reconfigure again, > > > > >>>> get new value and so on and stop when previous is equal to the > new one. > > > Yes, I dramatise and it sounds really bad. > > > > >>>> In any case it would over-complicate interface and no single > > > > >>>> app will do it correctly. > > > > >>> I think you're talking about max_rx_queues. APP can get that > > > > >>> info > > > before configuration. Then configure rx queue number which is not > > > larger than it. That's enough. > > > > >>> nb_rx_queues should be the number which is configured by APP > > > > >>> and > > > how many queues are actually used. To my opinion, it's mainly used > > > by the GUI to show the value to human being. > > > > >>> > > > > >>> BTW, max_rx_queues could be an good example that shows that > > > some parameters are stable on some NICs but not on other NICs. > > > > >>> Take Intel NICs for example (I don’t familiar with others.), > > > > >>> normally > > > max_rx_queues is stable on PF. But on VF, as the max number is > > > decided by PF, it could be dynamic. When VF starts, it can get an default > value from PF. > > > If it not enough, it can request a larger one from PF. If the number > > > works, VF can get a new number. > > > > >> "struct rte_eth_dev_info" is a little overloaded, it has: > > > > >> - static info, like *device > > > > >> - device limitations, max_*, *_lim > > > > >> - device capabilities, *_capa > > > > >> - suggested configurations, default_*conf > > > > >> - device configuration, nb_[r/t]x_queues > > > > >> - other, switch_info > > > > >> > > > > >> There is a concern that some values are dynamic, but this is > > > > >> not new, for example nb_rx/tx_queues can be changed by > > > > >> rte_eth_dev_rx/tx_queue_config() API and rte_eth_dev_info() > > > > >> output > > > will be changed. > > > > > > > > > > The example looks different to me. It is explicit changes > > > > > directly requested by the application. So, it is not a surprise that > > > > > it > changes. > > > > > > > > > >> For this patch suggested configuration changes based on some > > > > >> other config values looks ok as concept. > > > > >> So I think we can say after every configuration related API dev > > > > >> info can be changed. > > > > > > > > > > I think that saying that any configuration changes may result in > > > > > any changes in dev_info is hardly helpful. I'd suggest to be more > specific. > > > > > Yes, it is harder and will have bugs, but at least it is helpful. > > > > > > > > Hi Andrew, Wenzhuo, > > > > > > > > Back to this patch, which fixes an actual defect, > > > > > > > > What do you think about: > > > > 1- Keep existing patch but extend it as, save the original "dev->data" > > > > and revert it back to this original data on all error path. > > > > 2- Update rte_eth_dev_info() API document and say default > > > > configuration can be changed based on other config fields. So this > > > > reduces the scope of things can change in dev_info. > > > > > > I think we are doing too much juggling with data in ethdev layer. > > > All these things should be the responsibility of the PMD. > > > My radical proposal would be to remove rte_eth_dev_info and > > > integrate all the data into rte_eth_dev_data. > > > > > Sorry for missing this discussion. It's a good discussion about how to > optimize the rte_eth. > > But I have to say that above discussion can reach a huge reconsitution of > the rte_eth and impact every PMD. Is that fair? > > This patch is only try to revert a bad commit as we already find bug. As I > remember, at the beginning, Andrew said the discussion may not about the > patch but generic. So could we just tell if this patch itself OK at first? > Thanks. > > I have no strong opinion about your patch. > Sorry, I am more interested by a possible future big clean-up. > About a temporary fix, I am OK if Ferruh and Andrew are OK. About this patch itself, is it OK? If so, would you like ack it? Thanks.