Mon, Sep 25, 2017 at 02:45:08AM CEST, linyunsh...@huawei.com wrote: >Hi, Jiri > >On 2017/9/24 19:37, Jiri Pirko wrote: >> Sat, Sep 23, 2017 at 02:47:20AM CEST, linyunsh...@huawei.com wrote: >>> Hi, Jiri >>> >>> On 2017/9/23 0:03, Jiri Pirko wrote: >>>> Fri, Sep 22, 2017 at 04:11:51PM CEST, linyunsh...@huawei.com wrote: >>>>> Hi, Jiri >>>>> >>>>>>> - if (!tc) { >>>>>>> + if (if_running) { >>>>>>> + (void)hns3_nic_net_stop(netdev); >>>>>>> + msleep(100); >>>>>>> + } >>>>>>> + >>>>>>> + ret = (kinfo->dcb_ops && kinfo->dcb_ops->>setup_tc) ? >>>>>>> + kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : ->EOPNOTSUPP; >>>>> >>>>>> This is most odd. Why do you call dcb_ops from >ndo_setup_tc callback? >>>>>> Why are you mixing this together? prio->tc mapping >can be done >>>>>> directly in dcbnl >>>>> >>>>> Here is what we do in dcb_ops->setup_tc: >>>>> Firstly, if current tc num is different from the tc num >>>>> that user provide, then we setup the queues for each >>>>> tc. >>>>> >>>>> Secondly, we tell hardware the pri to tc mapping that >>>>> the stack is using. In rx direction, our hardware need >>>>> that mapping to put different packet into different tc' >>>>> queues according to the priority of the packet, then >>>>> rss decides which specific queue in the tc should the >>>>> packet goto. >>>>> >>>>> By mixing, I suppose you meant why we need the >>>>> pri to tc infomation? >>>> >>>> by mixing, I mean what I wrote. You are calling dcb_ops callback from >>>> ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC >>>> subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for >>>> all? >>> >>> When using lldptool, dcbnl is involved. >>> >>> But when using tc qdisc, dcbbl is not involved, below is the a few key >>> call graph in the kernel when tc qdisc cmd is executed. >>> >>> cmd: >>> tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 >>> hw 1 >>> >>> call graph: >>> rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init -> >>> hns3_nic_setup_tc >>> >>> When hns3_nic_setup_tc is called, we need to know how many tc num and >>> prio_tc mapping from the tc_mqprio_qopt which is provided in the paramter >>> in the ndo_setup_tc function, and dcb_ops is the our hardware specific >>> method to setup the tc related parameter to the hardware, so this is why >>> we call dcb_ops callback in ndo_setup_tc callback. >>> >>> I hope this will answer your question, thanks for your time. >> >> Okay. I understand that you have a usecase for mqprio mapping offload >> without lldptool being involved. Ok. I believe it is wrong to call dcb_ops >> from tc callback. You should have a generic layer inside the driver and >> call it from both dcb_ops and tc callbacks. > >Actually, dcb_ops is our generic layer inside the driver. >Below is high level architecture: > > [ tc qdisc ] [ lldpad ] > | | > | | > | | > [ hns3_enet ] [ hns3_dcbnl ] > \ / > \ / > \ / > [ hclge_dcb ] > / \ > / \ > / \ > [ hclgc_main ] [ hclge_tm ] > >hns3_enet.c implements the ndo_setup_tc callback. >hns3_dcbnl.c implements the dcbnl_rtnl_ops for stack's DCBNL system. >hclge_dcb implements the dcb_ops. >So we already have a generic layer that tc and dcbnl all call from. > >> >> Also, what happens If I run lldptool concurrently with mqprio? Who wins >> and is going to configure the mapping? > >Both lldptool and tc qdisc cmd use rtnl interface provided by stack, so >they are both protected by rtnl_lock, so we do not have to do the locking >in the driver.
I was not asking about locking, which is obvious, I was asking about the behaviour. Like for example: If I use tc to configure some mapping, later on I run lldptool and change the mapping. Does the tc dump show the updated values or the original ones? > >The locking is in rtnetlink_rcv_msg: > > rtnl_lock(); > handlers = rtnl_dereference(rtnl_msg_handlers[family]); > if (handlers) { > doit = READ_ONCE(handlers[type].doit); > if (doit) > err = doit(skb, nlh, extack); > } > rtnl_unlock(); > >Thanks. > >> >> >>> >>>> >>>> >>>> >>>>> I hope I did not misunderstand your question, thanks >>>>> for your time reviewing. >>>> >>>> . >>>> >>> >> >> . >> >