Hi Hemant,

Back on this topic, please see some comments below.

On 11/07/2016 01:30 PM, Hemant Agrawal wrote:
> Hi Olivier,
>       
>> -----Original Message-----
>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>> Sent: Friday, October 14, 2016 5:41 PM
>>> On 9/22/2016 6:42 PM, Hemant Agrawal wrote:
>>>> Hi Olivier
>>>>
>>>> On 9/19/2016 7:27 PM, Olivier Matz wrote:
>>>>> Hi Hemant,
>>>>>
>>>>> On 09/16/2016 06:46 PM, Hemant Agrawal wrote:
>>>>>> In the rte_pktmbuf_pool_create, if the default external mempool is
>>>>>> not available, the implementation can default to "ring_mp_mc",
>>>>>> which is an software implementation.
>>>>>>
>>>>>> Signed-off-by: Hemant Agrawal <hemant.agrawal at nxp.com>
>>>>>> ---
>>>>>> Changes in V3:
>>>>>> * adding warning message to say that falling back to default sw
>>>>>> pool
>>>>>> ---
>>>>>>  lib/librte_mbuf/rte_mbuf.c | 8 ++++++++
>>>>>>  1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.c
>>>>>> b/lib/librte_mbuf/rte_mbuf.c index 4846b89..8ab0eb1 100644
>>>>>> --- a/lib/librte_mbuf/rte_mbuf.c
>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.c
>>>>>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name,
>>>>>> unsigned n,
>>>>>>
>>>>>>      rte_errno = rte_mempool_set_ops_byname(mp,
>>>>>>              RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL);
>>>>>> +
>>>>>> +    /* on error, try falling back to the software based default
>>>>>> pool */
>>>>>> +    if (rte_errno == -EOPNOTSUPP) {
>>>>>> +        RTE_LOG(WARNING, MBUF, "Default HW Mempool not supported. "
>>>>>> +            "falling back to sw mempool \"ring_mp_mc\"");
>>>>>> +        rte_errno = rte_mempool_set_ops_byname(mp, "ring_mp_mc",
>>>>>> NULL);
>>>>>> +    }
>>>>>> +
>>>>>>      if (rte_errno != 0) {
>>>>>>          RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
>>>>>>          return NULL;
>>>>>>
>>>>>
>>>>> Without adding a new method ".supported()", the first call to
>>>>> rte_mempool_populate() could return the same error ENOTSUP. In this
>>>>> case, it is still possible to fallback.
>>>>>
>>>> It will be bit late.
>>>>
>>>> On failure, than we have to set the default ops and do a goto before
>>>> rte_pktmbuf_pool_init(mp, &mbp_priv);
>>
>> I still think we can do the job without adding the .supported() method.
>> The following code is just an (untested) example:
>>
>> struct rte_mempool *
>> rte_pktmbuf_pool_create(const char *name, unsigned n,
>>     unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
>>     int socket_id)
>> {
>>     struct rte_mempool *mp;
>>     struct rte_pktmbuf_pool_private mbp_priv;
>>     unsigned elt_size;
>>     int ret;
>>     const char *ops[] = {
>>         RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL,
>>     };
>>     const char **op;
>>
>>     if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
>>         RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
>>             priv_size);
>>         rte_errno = EINVAL;
>>         return NULL;
>>     }
>>     elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>>         (unsigned)data_room_size;
>>     mbp_priv.mbuf_data_room_size = data_room_size;
>>     mbp_priv.mbuf_priv_size = priv_size;
>>
>>     for (op = &ops[0]; *op != NULL; op++) {
>>         mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
>>             sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
>>         if (mp == NULL)
>>             return NULL;
>>
>>         ret = rte_mempool_set_ops_byname(mp, *op, NULL);
>>         if (ret != 0) {
>>             RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
>>             rte_mempool_free(mp);
>>             if (ret == -ENOTSUP)
>>                 continue;
>>             rte_errno = -ret;
>>             return NULL;
>>         }
>>         rte_pktmbuf_pool_init(mp, &mbp_priv);
>>
>>         ret = rte_mempool_populate_default(mp);
>>         if (ret < 0) {
>>             rte_mempool_free(mp);
>>             if (ret == -ENOTSUP)
>>                 continue;
>>             rte_errno = -ret;
>>             return NULL;
>>         }
>>     }
>>
>>     rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);
>>
>>     return mp;
>> }
>>
>>
> [Hemant]  This look fine to me. Please submit a patch for the same. 
> 
>>>>> I've just submitted an RFC, which I think is quite linked:
>>>>> http://dpdk.org/ml/archives/dev/2016-September/046974.html
>>>>> Assuming a new parameter "mempool_ops" is added to
>>>>> rte_pktmbuf_pool_create(), would it make sense to fallback to
>>>>> "ring_mp_mc"? What about just returning ENOTSUP? The application
>>>>> could do the job and decide which sw fallback to use.
>>>>
>>>> We ran into this issue when trying to run the standard DPDK examples
>>>> (l3fwd) in VM. Do you think, is it practical to add fallback handling
>>>> in each of the DPDK examples?
>>
>> OK. What is still unclear for me, is how the software is aware of the 
>> different
>> hardware-assisted handlers. Moreover, we could imagine more software
>> handlers, which could be used depending on the use case.
>>
>> I think this choice has to be made by the user or the application:
>>
>> - the application may want to use a specific (sw or hw) handler: in
>>   this case, it want to be notified if it fails, instead of having
>>   a quiet fallback to ring_mp_mc
>> - if several handlers are available, the application may want to
>>   try them in a specific order
>> - maybe some handlers will have some limitations with some
>>   configurations or driver? The application could decide to use
>>   a different handler according the configuration.
>>
>> So that's why I think this is an application decision.
>>
> [Hemant]  We should simplify it:  if the application has supplied the 
> handler, it is application's responsibility to take care of failure. Only if 
> the application want to use the default handler, the implementation can 
> fallback.  The fallback handler can again be configurable. 

Honestly, I'm not very convinced that having a quiet fallback is
the proper solution: the application won't be notified that the
fallback occured.

I understand your patch solves your use-case, but my fear is that
we just integrate this minimal patch and never do the harder work
of solving all the use-cases. I think we need to give the tools to
the applications to control what occurs because we also need to solve
these issues:
- you have a hardware handler, but you want to use the software
  handler
- you have several software handlers, and you (as an administrator) or
  the application knows which one is the best to use in your case.

An alternative (which I don't like either) to solve your issue without
modifying the mbuf code is to have your own handler fallbacking to
ring_mp_mc.


>>>> Typically when someone is writing a application on host, he need not
>>>> worry non-availability of the hw offloaded mempool. He may also want
>>>> to run the same binary in virtual machine. In VM, it is not
>>>> guaranteed that hw offloaded mempools will be available.
>>
>> Running the same binary is of course a need. But if your VM does not provide
>> the same virtualized hardware than the host, I think the command line option
>> makes sense.
>>
>> AFAIU, on the host, you can use a hw mempool handler or a sw one, and on the
>> guest, only a sw one. So you need an option to select the behavior you want 
>> on
>> the host, without recompiling.
>>
>> I understand that modifying all the applications is not a good option 
>> either. I'm
>> thinking a a EAL parameter that would allow to configure a library (mbuf in 
>> this
>> case). Something like kernel boot options.
>> Example: testpmd -l 2,4 --opt mbuf.handler="ring_mp_mc" -- [app args]
>>
>> I don't know if this is feasible, this has to be discussed first, but what 
>> do you
>> think on the principle? The value could also be an ordered list if we want a
>> fallback, and this option could be overriden by the application before 
>> creating
>> the mbuf pool. There was some discussions about a kind of dpdk global
>> configuration some time ago.
>>
> [Hemant] I agree that command line option provide a better control in this 
> case. 
> On the flipside, We need to be careful that we do not end up having too many 
> command line options. 

Yes, we should avoid having too many command line arguments.
I think we should go in the direction of having a sort of key/value
database in EAL. It could be set by:
- a generic command line argument
- a config file (maybe later)
- the application through an API

Then, it would be used by:
- dpdk libraries
- the application (maybe)

This has been discussed some times, the latest was probably:
http://dpdk.org/ml/archives/dev/2016-June/040079.html

I think it would be a good tool for dpdk configurability, and it would
help to remove some compile-time options and maybe some eal options.
Unfortunately, I don't have the time to work on this at the moment.

Regards,
Olivier

Reply via email to