[dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number
> -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Tuesday, January 6, 2015 7:27 PM > To: Ouyang, Changchun; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number > > > On 01/06/15 03:54, Ouyang, Changchun wrote: > > > >> -Original Message- > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > >> Sent: Monday, January 5, 2015 6:07 PM > >> To: Ouyang, Changchun; dev at dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number > >> > >> > >> On 01/05/15 04:59, Ouyang, Changchun wrote: > -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Sunday, January 4, 2015 4:39 PM > To: Ouyang, Changchun; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number > > > On 01/04/15 09:18, Ouyang Changchun wrote: > > Get the available Rx and Tx queue number when receiving > IXGBE_VF_GET_QUEUES message from VF. > > Signed-off-by: Changchun Ouyang > > --- > > lib/librte_pmd_ixgbe/ixgbe_pf.c | 35 > ++- > > 1 file changed, 34 insertions(+), 1 deletion(-) > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c > > b/lib/librte_pmd_ixgbe/ixgbe_pf.c index 495aff5..cbb0145 100644 > > --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c > > +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c > > @@ -53,6 +53,8 @@ > > #include "ixgbe_ethdev.h" > > > > #define IXGBE_MAX_VFTA (128) > > +#define IXGBE_VF_MSG_SIZE_DEFAULT 1 #define > > +IXGBE_VF_GET_QUEUE_MSG_SIZE 5 > > > > static inline uint16_t > > dev_num_vf(struct rte_eth_dev *eth_dev) @@ -491,9 +493,36 > @@ > > ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t vf, > > uint32_t > *msgbuf) > > } > > > > static int > > +ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, > > +uint32_t > > +*msgbuf) { > > + struct ixgbe_vf_info *vfinfo = > > + *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data- > > dev_private); > > + uint32_t default_q = vf * > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool; > > + > > + /* Verify if 
the PF supports the mbox APIs version or not */ > > + switch (vfinfo[vf].api_version) { > > + case ixgbe_mbox_api_20: > > + case ixgbe_mbox_api_11: > > + break; > > + default: > > + return -1; > > + } > > + > > + /* Notify VF of Rx and Tx queue number */ > > + msgbuf[IXGBE_VF_RX_QUEUES] = > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool; > > + msgbuf[IXGBE_VF_TX_QUEUES] = > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool; > > + > > + /* Notify VF of default queue */ > > + msgbuf[IXGBE_VF_DEF_QUEUE] = default_q; > What about IXGBE_VF_TRANS_VLAN field? > >>> This field is used for vlan strip or dcb case, which the vf rss don't > >>> need it. > >> But VFs do support VLAN stripping and u don't add it to just RSS. If > >> VFs do not support VLAN stripping in the DPDK yet they should and > >> then we will need this field. > > If I don't miss your point, you also agree it is not related to vf rss > > itself, right? > > Right. > > > As for Vlan stripping, it need another patch to support it. > > Well, at least put some fat comment in bold there that some the fields in the > command is not filled and why. ;) OK, I will put more comments to explain it in v5. 
> > > > + > > + return 0; > > +} > > + > > +static int > > ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf) > > { > > uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE; > > + uint16_t msg_size = IXGBE_VF_MSG_SIZE_DEFAULT; > > uint32_t msgbuf[IXGBE_VFMAILBOX_SIZE]; > > int32_t retval; > > struct ixgbe_hw *hw = > > IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > @@ -537,6 +566,10 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev > >> *dev, > uint16_t vf) > > case IXGBE_VF_API_NEGOTIATE: > > retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf); > > break; > > + case IXGBE_VF_GET_QUEUES: > > + retval = ixgbe_get_vf_queues(dev, vf, msgbuf); > > + msg_size = IXGBE_VF_GET_QUEUE_MSG_SIZE; > Although the msg_size semantics and motivation is clear, if u want > to do > >> then > do it all the way - add it to all other cases too not just to > IXGBE_VF_GET_QUEUES. > For instance, why do u write all 16 DWORDS for API negotiation > (only 2 are > required) and only here u decided to get "greedy"? ;) > > My point is: either drop it completely or fix all other places as well. > >>> This is because the actual message siz
[dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode
> -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Tuesday, January 6, 2015 8:53 PM > To: Ouyang, Changchun; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode > > > On 01/06/15 04:01, Ouyang, Changchun wrote: > > > >> -Original Message- > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > >> Sent: Monday, January 5, 2015 6:12 PM > >> To: Ouyang, Changchun; dev at dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode > >> > >> > >> On 01/05/15 04:38, Ouyang, Changchun wrote: > -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Sunday, January 4, 2015 5:47 PM > To: Ouyang, Changchun; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS > mode > > > On 01/04/15 11:01, Ouyang, Changchun wrote: > >> -Original Message- > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > >> Sent: Sunday, January 4, 2015 4:50 PM > >> To: Ouyang, Changchun; dev at dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS > >> mode > >> > >> > >> On 01/04/15 09:18, Ouyang Changchun wrote: > >>> Set VMDq RSS mode if it has VF(VF number is more than 1) and has > >>> RSS > >> information. 
> >>> Signed-off-by: Changchun Ouyang > >>> --- > >>> app/test-pmd/testpmd.c | 10 ++ > >>> 1 file changed, 10 insertions(+) > >>> > >>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c > >>> index 8c69756..6230f8b 100644 > >>> --- a/app/test-pmd/testpmd.c > >>> +++ b/app/test-pmd/testpmd.c > >>> @@ -1708,6 +1708,16 @@ init_port_config(void) > >>> port->dev_conf.rxmode.mq_mode = > >> ETH_MQ_RX_NONE; > >>> } > >>> > >>> + if (port->dev_info.max_vfs != 0) { > >>> + if (port- > >>> dev_conf.rx_adv_conf.rss_conf.rss_hf != 0) > >>> + port->dev_conf.rxmode.mq_mode = > >>> + ETH_MQ_RX_VMDQ_RSS; > >>> + else { > >>> + port->dev_conf.rxmode.mq_mode = > >> ETH_MQ_RX_NONE; > >>> + port->dev_conf.txmode.mq_mode = > >> ETH_MQ_TX_NONE; > >> > >> And what about the txmode.mq_mode when RSS is available > (the :if" > clause)? > > I think we can keep its original value for txmode.mq_mode, so > > don't > change its value. How do you think of it? > > I agree that not changing a Tx mq_mode in both cases would be better. > >>> In the else clause, set txmode.mq_mode as ETH_MQ_TX_NONE > explicitly > >> to > >>> make sure it is neither ETH_MQ_TX_DCB, ETH_MQ_TX_VMDQ_DCB, nor > >> ETH_MQ_TX_VMDQ_ONLY. > >> > >> It's not obvious to me why u should do that since AFAIK any of these > >> modes requires RX_RSS. Do I miss anything? > > No, I don't think so, in the else clause, it doesn't need rx_rss, and > > no way to do it, because the case is there is no rss configuration > information(note: in the else clause, dev_conf.rx_adv_conf.rss_conf.rss_hf > == 0). > > > > So ETH_MQ_RX_NONE for rx_mode, and ETH_MQ_TX_NONE for tx_mode. > > Of course, however, in general, one may ask, why u configure TX MQ mode > in "else" clause an don't do it in the "if" one. Possibly the "if" case in TX > MQ > context has been handled elsewhere but this is what makes this code > confusing: to make it the most readable u'd rather configure the same > feature set in both "if" and "else". 
> For instance:
>
> if (bla-bla) {
>     tx_mode = X1;
>     rx_mode = X2;
> } else {
>     tx_mode = Y1;
>     rx_mode = Y2;
> }
>
> Look at the non-SR-IOV clause right above the "if-else" block u've added.
> Why don't they configure tx_mode there? Is it a bug in their code?

That also makes sense. I will add tx_mode = ETH_MQ_TX_NONE, since there is no RSS for the Tx mode; RSS applies only to the Rx mode.

> By the way, u forgot to fix the remark below
>
> /* In SR-IOV mode, RSS mode is not available */
>
> which is located a few lines above the code u've added. ;)

Sorry, I missed these few lines before. I will remove them in v5.

Thanks
Changchun
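For reference, the symmetric if/else configuration discussed in this thread can be sketched on its own. This is only a minimal sketch, not the testpmd code: the enums, the port_mq_conf struct and set_sriov_mq_mode() are hypothetical stand-ins for the rte_eth_conf fields and ETH_MQ_* values named above, and it assumes Tx is set to "none" in both branches, as agreed in the reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ETH_MQ_RX_* / ETH_MQ_TX_* values. */
enum rx_mq_mode { RX_NONE = 0, RX_RSS, RX_VMDQ_RSS };
enum tx_mq_mode { TX_NONE = 0, TX_DCB };

/* Hypothetical, reduced view of the per-port configuration. */
struct port_mq_conf {
	uint16_t max_vfs;        /* 0 means SR-IOV is not active */
	uint64_t rss_hf;         /* 0 means no RSS hash information */
	enum rx_mq_mode rx_mode;
	enum tx_mq_mode tx_mode;
};

/* Configure the same feature set (Rx and Tx mode) in both branches. */
static void
set_sriov_mq_mode(struct port_mq_conf *conf)
{
	assert(conf != NULL);

	if (conf->max_vfs == 0)
		return; /* the non-SR-IOV case is handled elsewhere */

	if (conf->rss_hf != 0) {
		conf->rx_mode = RX_VMDQ_RSS;
		conf->tx_mode = TX_NONE;
	} else {
		conf->rx_mode = RX_NONE;
		conf->tx_mode = TX_NONE;
	}
}
```

Setting both fields in both branches means a reader does not have to look elsewhere to learn what the Tx mode is when RSS information is present.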
[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Saturday, December 13, 2014 12:33 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> csum forwarding engine
>
> Hello,
>
> On 12/12/2014 04:48 AM, Liu, Jijiang wrote:
> > The 'hw/sw' option is used to set/clear the flag of enabling TX tunneling
> > packet checksum hardware offload in testpmd application.
>
> This is not clear at all.
> In your command, there is (hw|sw|none).
> Are you talking about inner or outer?
> Is this command useful for any kind of packet?
> How does it combine with "tx_checksum set outer-ip (hw|sw)"?
>

I have rethought the TX checksum commands in this patch set, and I agree with you that they should be changed so that their meaning is clear.

There are 3 commands in the patch set, as follows:

1. tx_checksum set tunnel (hw|sw|none) (port-id)

I now also think that command 1 may confuse users: they probably won't understand why the 'hw' and 'sw' options are needed or when to use them. So I will replace it with the 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' command.

2. tx_checksum set outer-ip (hw|sw) (port-id)
3. tx_checksum set (ip|udp|tcp|sctp) (hw|sw) (port-id)

Command 2 will be merged into command 3; the new command is 'tx_checksum set (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port-id)'.

Most of the cases in http://dpdk.org/ml/archives/dev/2014-December/009213.html will be covered by these two commands.

The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is used to set/clear the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag. When that testpmd flag is set, the PKT_TX_UDP_TUNNEL_PKT offload flag is set as well, which tells the driver/HW to treat the transmitted packet as a tunneling packet.

When 'on' is set, this meets Method B.1 and Method C.
When 'off' is set, TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not set either, and the HW treats the transmitted packet as a non-tunneling packet. This meets Method B.2.

As for case A, I think it is not mandatory to cover it in the csum fwd engine for tunneling packets.

Is the above description clear to you?

> Regards,
> Olivier
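The relationship between the per-port testpmd flag and the per-packet offload flag described in this reply could be sketched as follows. This is an illustration under stated assumptions, not the testpmd implementation: the SKETCH_* bit values are made up for the example (the real TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM and PKT_TX_UDP_TUNNEL_PKT definitions live in testpmd and the mbuf headers), and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; they are NOT the real definitions. */
#define SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM ((uint16_t)0x0001)
#define SKETCH_PKT_TX_UDP_TUNNEL_PKT           ((uint64_t)0x0001)

/*
 * Effect of 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' on the
 * per-port flags: 'on' sets the tunnel checksum flag, 'off' clears it.
 */
static uint16_t
set_hw_tunnel_mode(uint16_t port_flags, int on)
{
	assert(on == 0 || on == 1);

	if (on)
		return (uint16_t)(port_flags |
				SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM);
	return (uint16_t)(port_flags &
			~SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM);
}

/*
 * Per-packet offload flags derived from the per-port flags: the packet is
 * marked as a tunneling packet only when the port flag is set.
 */
static uint64_t
tunnel_ol_flags(uint16_t port_flags)
{
	if (port_flags & SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM)
		return SKETCH_PKT_TX_UDP_TUNNEL_PKT;
	return 0;
}
```

With 'off', tunnel_ol_flags() returns 0, which corresponds to the HW treating the packet as a non-tunneling packet.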
[dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode
> -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Wednesday, January 7, 2015 3:56 AM > To: Ouyang, Changchun; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode > > > On 01/06/15 03:56, Ouyang, Changchun wrote: > >> -Original Message- > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > >> Sent: Monday, January 5, 2015 6:10 PM > >> To: Ouyang, Changchun;dev at dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode > >> > >> > >> On 01/05/15 03:00, Ouyang, Changchun wrote: > -Original Message- > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > Sent: Sunday, January 4, 2015 5:46 PM > To: Ouyang, Changchun;dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode > > > On 01/04/15 10:58, Ouyang, Changchun wrote: > >> -Original Message- > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com] > >> Sent: Sunday, January 4, 2015 4:45 PM > >> To: Ouyang, Changchun;dev at dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS > mode > >> > >> > >> On 01/04/15 09:18, Ouyang Changchun wrote: > >>> Check mq mode for VMDq RSS, handle it correctly instead of > >>> returning an error; Also remove the limitation of per pool queue > >>> number has max value of 1, because the per pool queue number > >> could > >>> be 2 or 4 if it is VMDq RSS mode; > >>> > >>> The number of rxq specified in config will determine the mq mode > >>> for > >> VMDq RSS. 
> >>> Signed-off-by: Changchun Ouyang > >>> --- > >>> lib/librte_ether/rte_ethdev.c | 39 > >> ++- > >>> 1 file changed, 34 insertions(+), 5 deletions(-) > >>> > >>> diff --git a/lib/librte_ether/rte_ethdev.c > >>> b/lib/librte_ether/rte_ethdev.c index 95f2ceb..59ff325 100644 > >>> --- a/lib/librte_ether/rte_ethdev.c > >>> +++ b/lib/librte_ether/rte_ethdev.c > >>> @@ -510,8 +510,7 @@ rte_eth_dev_check_mq_mode(uint8_t > >> port_id, > >>> uint16_t nb_rx_q, uint16_t nb_tx_q, > >>> > >>> if (RTE_ETH_DEV_SRIOV(dev).active != 0) { > >>> /* check multi-queue mode */ > >>> - if ((dev_conf->rxmode.mq_mode == > >> ETH_MQ_RX_RSS) || > >>> - (dev_conf->rxmode.mq_mode == > >> ETH_MQ_RX_DCB) || > >>> + if ((dev_conf->rxmode.mq_mode == > >> ETH_MQ_RX_DCB) || > >>> (dev_conf->rxmode.mq_mode == > >> ETH_MQ_RX_DCB_RSS) > >> || > >>> (dev_conf->txmode.mq_mode == > >> ETH_MQ_TX_DCB)) { > >>> /* SRIOV only works in VMDq enable mode > >> */ @@ - > >> 525,7 +524,6 @@ > >>> rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, > >> uint16_t nb_tx_q, > >>> } > >>> > >>> switch (dev_conf->rxmode.mq_mode) { > >>> - case ETH_MQ_RX_VMDQ_RSS: > >>> case ETH_MQ_RX_VMDQ_DCB: > >>> case ETH_MQ_RX_VMDQ_DCB_RSS: > >>> /* DCB/RSS VMDQ in SRIOV mode, not > >> implement > >> yet */ @@ -534,6 > >>> +532,39 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, > uint16_t > >> nb_rx_q, uint16_t nb_tx_q, > >>> "unsupported VMDQ > >> mq_mode > >> rx %u\n", > >>> port_id, dev_conf- > >>> rxmode.mq_mode); > >>> return (-EINVAL); > >>> + case ETH_MQ_RX_RSS: > >>> + PMD_DEBUG_TRACE("ethdev port_id=%" > >> PRIu8 > >>> + " SRIOV active, " > >>> + "Rx mq mode is changed > >> from:" > >>> + "mq_mode %u into VMDQ > >> mq_mode %u\n", > >>> + port_id, > >>> + dev_conf- > >>> rxmode.mq_mode, > >>> + dev->data- > >>> dev_conf.rxmode.mq_mode); > >>> + case ETH_MQ_RX_VMDQ_RSS: > >>> + dev->data->dev_conf.rxmode.mq_mode = > >> ETH_MQ_RX_VMDQ_RSS; > >>> + if (nb_rx_q < > >> RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) { > Missed 
that before: shouldn't it be "<=" here? > >>> Agree with you, need <= here, I will fix it in v5 > >>> > >>> + switch (nb_rx_q) { > >>> + case 1: > >>> + case 2: > >>> + > >>RTE_ETH_DEV_SRIOV(dev).active = > >>> + ETH_64_POOLS; > >>> + break; > >>> +
[dpdk-dev] [PATCH v2] bond: vlan flags misinterpreted in xmit_slave_hash function
Tested-by: Jiajia, SunX

- Tested Commit: 6fb3161060fc894295a27f9304c56ef34492799d
- OS: Fedora20 3.11.10-301.fc20.x86_64
- GCC: gcc version 4.8.3
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb]
- Target: x86_64-native-linuxapp-gcc and i686-native-linuxapp-gcc
- Total 44 cases, 44 passed, 0 failed

TOPO:
* Connections between ports of tester/ixia and DUT
  - TESTER(Or IXIA)---DUT
  - portA--port0
  - portB--port1
  - portC--port2
  - portD--port3

Test Setup#1 for Functional test

Tester has 4 ports(portA--portD), and DUT has 4 ports(port0--port3), then connect portA to port0, portB to port1, portC to port2, portD to port3.

- Case: Basic bonding--Create bonded devices and slaves
  Description: Use Setup#1.
    Create bonded device and add some ports as slaves of bonded device,
    then remove slaves or add slaves or change the bonding primary slave
    or change bonding mode and so on.
  Expected test result: Verify the basic functions are normal.

- Case: Basic bonding--MAC Address Test
  Description: Use Setup#1.
    Create bonded device and add some ports as slaves of bonded device,
    check the changes of the bonded device and slave MAC.
  Expected test result: Verify the behavior of bonded device and slave according to the mode.

- Case: Basic bonding--Device Promiscuous Mode Test
  Description: Use Setup#1.
    Create bonded device and add some ports as slaves of bonded device,
    set promiscuous mode on or off, then send packets to the bonded device
    or slaves.
  Expected test result: Verify the RX/TX status of bonded device and slaves according to the mode.

- Case: Mode 0(Round Robin) TX/RX test
  Description: Use Setup#1.
    Create bonded device with mode 0 and add 3 ports as slaves of bonded device,
    forward packets between bonded device and unbonded device, start to forward,
    and send packets to unbonded device or slaves.
  Expected test result: Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 0(Round Robin) Bring one slave link down
  Description: Use Setup#1.
    Create bonded device with mode 0 and add 3 ports as slaves of bonded device,
    forward packets between bonded device and unbonded device, start to forward,
    bring the link on either port 0, 1 or 2 down,
    and send packets to unbonded device or slaves.
  Expected test result: Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 0(Round Robin) Bring all slave links down
  Description: Use Setup#1.
    Create bonded device with mode 0 and add 3 ports as slaves of bonded device,
    forward packets between bonded device and unbonded device, start to forward,
    bring the links down on all bonded ports,
    and send packets to unbonded device or slaves.
  Expected test result: Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 1(Active Backup) TX/RX Test
  Description: Use Setup#1.
    Create bonded device with mode 1 and add 3 ports as slaves of bonded device,
    forward packets between bonded device and unbonded device, start to forward,
    and send packets to unbonded device or slaves.
  Expected test result: Verify the RX/TX status of bonded device and slaves in mode 1.

- Case: Mode 1(Active Backup) Change active slave, RX/TX test
  Description: Use Setup#1.
    Continuing from previous test case.
    Change the active slave port from port0 to port1.
    Verify that the bonded device's MAC has changed to slave1's MAC.
      testpmd> set bonding primary 1 4
    Repeat the transmission and reception (TX/RX) test and verify that data is now
    transmitted and received through the new active slave and no longer through port0.
  Expected test result: Verify the RX/TX status of bonded device and slaves in mode 1.

- Case: Mode 1(Active Backup) Link up/down active eth dev
  Description: Use Setup#1.
    Bring link between port A and port0 down.
    If tester is ixia, can use IxExplorer to set the "Simulate Cable Disconnect" at the port property.
    Verify that the active slave has been changed from port0.
    Repeat the transmission and reception test and verify that data is now
    transmitted and received through the new active slave and no longer through port0.
    Bring port0 to link down a
[dpdk-dev] [PATCH v5 0/6] Enable VF RSS for Niantic
This patch set enables VF RSS for Niantic, which allows each VF to have at most 4 queues.
The actual queue number per VF depends on the total number of pools, which is
determined by the max number of VFs at the PF initialization stage and the number
of queues specified in config:
1) If the max number of VFs is in the range from 1 to 32, and the number of rxq is 4
   ('--rxq 4' in testpmd), then there are 32 pools in total (ETH_32_POOLS), and each VF
   has 4 queues;

2) If the max number of VFs is in the range from 33 to 64, and the number of rxq is 2
   ('--rxq 2' in testpmd), then there are 64 pools in total (ETH_64_POOLS), and each VF
   has 2 queues;

On the host, to enable VF RSS functionality, the rx mq mode should be set to
ETH_MQ_RX_VMDQ_RSS or ETH_MQ_RX_RSS, and SRIOV mode should be activated
(max_vfs >= 1). The VF RSS information, such as the hash function, RSS key and
RSS key length, also needs to be configured.

The limitation for Niantic VF RSS is that the hash and key are shared among the PF
and all VFs, and the RETA table with 128 entries is also shared among the PF and all
VFs. So it is not possible to provide a method to query the hash and RETA content
per VF on the guest; if needed, query them on the host (PF) for the shared RETA
information.

changes in v5:
  - Fix minor issue and some comments;

changes in v4:
  - Extract a function to remove embedded switch-case statement;
  - Check whether RX queue number is a valid one, otherwise return error;
  - Update the description a bit;

changes in v3:
  - More cleanup;

changes in v2:
  - Update the description;
  - Use receiving queue number('--rxq ') specified in config to determine the number
    of pool and the number of queue per VF;

changes in v1:
  - Config VF RSS;

Changchun Ouyang (6):
  ixgbe: Code cleanup
  ixgbe: Negotiate VF API version
  ixgbe: Get VF queue number
  ether: Check VMDq RSS mode
  ixgbe: Config VF RSS
  testpmd: Set Rx VMDq RSS mode

 app/test-pmd/testpmd.c              |  15 +++-
 lib/librte_ether/rte_ethdev.c       |  50 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   1 +
 lib/librte_pmd_ixgbe/ixgbe_pf.c     |  80 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 138
 5 files changed, 248 insertions(+), 36 deletions(-)

-- 
1.8.4.2
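The pool/queue rule from the cover letter can be written down as a small standalone helper. This is only a sketch under stated assumptions: vf_rss_layout and vf_rss_layout_for() are hypothetical names that do not appear in the patch set, the mapping of 1 or 2 rxq to 64 pools and 4 rxq to 32 pools follows the description above, and the check that the number of VFs must fit into the number of pools is an assumption of the sketch rather than something the patches are known to enforce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: how the queues are split for VF RSS. */
struct vf_rss_layout {
	uint16_t nb_pools;      /* 64 or 32, as in ETH_64_POOLS / ETH_32_POOLS */
	uint16_t nb_q_per_pool; /* number of queues each VF gets */
};

/*
 * Derive the layout from the max number of VFs and the rxq number given
 * in config. Returns 0 on success, -1 for an unsupported combination.
 */
static int
vf_rss_layout_for(uint16_t max_vfs, uint16_t nb_rx_q,
		struct vf_rss_layout *out)
{
	uint16_t pools;

	assert(out != NULL);

	switch (nb_rx_q) {
	case 1:
	case 2:
		pools = 64;
		break;
	case 4:
		pools = 32;
		break;
	default:
		return -1; /* only 1, 2 or 4 queues per VF are described */
	}

	/* Assumption of this sketch: every VF needs a pool of its own. */
	if (max_vfs == 0 || max_vfs > pools)
		return -1;

	out->nb_pools = pools;
	out->nb_q_per_pool = nb_rx_q;
	return 0;
}
```

For example, 16 VFs with '--rxq 4' yield 32 pools with 4 queues each, while 40 VFs with '--rxq 4' are rejected because 40 VFs do not fit into 32 pools.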
[dpdk-dev] [PATCH v5 1/6] ixgbe: Code cleanup
Put global register configuring out of loop for queue; also fix typo and indent;

Signed-off-by: Changchun Ouyang
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 5c36bff..f69abda 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3548,9 +3548,9 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 				IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
 			}
 			srrctl = ((dev->data->dev_conf.rxmode.split_hdr_size <<
-				   IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
-				  IXGBE_SRRCTL_BSIZEHDR_MASK);
-			srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
+				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
+				IXGBE_SRRCTL_BSIZEHDR_MASK);
+			srrctl |= IXGBE_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
 		} else
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
@@ -3985,7 +3985,7 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
 	struct igb_rx_queue *rxq;
 	struct rte_pktmbuf_pool_private *mbp_priv;
 	uint64_t bus_addr;
-	uint32_t srrctl;
+	uint32_t srrctl, psrtype = 0;
 	uint16_t buf_size;
 	uint16_t i;
 	int ret;
@@ -4039,20 +4039,10 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
 		 * Configure Header Split
 		 */
 		if (dev->data->dev_conf.rxmode.header_split) {
-
-			/* Must setup the PSRTYPE register */
-			uint32_t psrtype;
-			psrtype = IXGBE_PSRTYPE_TCPHDR |
-				IXGBE_PSRTYPE_UDPHDR |
-				IXGBE_PSRTYPE_IPV4HDR |
-				IXGBE_PSRTYPE_IPV6HDR;
-
-			IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE(i), psrtype);
-
 			srrctl = ((dev->data->dev_conf.rxmode.split_hdr_size <<
-				   IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
-				  IXGBE_SRRCTL_BSIZEHDR_MASK);
-			srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
+				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
+				IXGBE_SRRCTL_BSIZEHDR_MASK);
+			srrctl |= IXGBE_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
 		} else
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
@@ -4095,6 +4085,17 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
 		}
 	}
 
+#ifdef RTE_HEADER_SPLIT_ENABLE
+	if (dev->data->dev_conf.rxmode.header_split)
+		/* Must setup the PSRTYPE register */
+		psrtype = IXGBE_PSRTYPE_TCPHDR |
+			IXGBE_PSRTYPE_UDPHDR |
+			IXGBE_PSRTYPE_IPV4HDR |
+			IXGBE_PSRTYPE_IPV6HDR;
+#endif
+
+	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
+
 	if (dev->data->dev_conf.rxmode.enable_scatter) {
 		if (!dev->data->scattered_rx)
 			PMD_INIT_LOG(DEBUG, "forcing scatter mode");
-- 
1.8.4.2
[dpdk-dev] [PATCH v5 2/6] ixgbe: Negotiate VF API version
Negotiate API version with VF when receiving the IXGBE_VF_API_NEGOTIATE message.

Signed-off-by: Changchun Ouyang
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |  1 +
 lib/librte_pmd_ixgbe/ixgbe_pf.c     | 25 +
 2 files changed, 26 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
index ca99170..730098d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
@@ -159,6 +159,7 @@ struct ixgbe_vf_info {
 	uint16_t tx_rate[IXGBE_MAX_QUEUE_NUM_PER_VF];
 	uint16_t vlan_count;
 	uint8_t spoofchk_enabled;
+	uint8_t api_version;
 };
 
 /*
diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index 51da1fd..495aff5 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -469,6 +469,28 @@ ixgbe_set_vf_lpe(struct rte_eth_dev *dev, __rte_unused uint32_t vf, uint32_t *ms
 }
 
 static int
+ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
+{
+	uint32_t api_version = msgbuf[1];
+	struct ixgbe_vf_info *vfinfo =
+		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+
+	switch (api_version) {
+	case ixgbe_mbox_api_10:
+	case ixgbe_mbox_api_11:
+		vfinfo[vf].api_version = (uint8_t)api_version;
+		return 0;
+	default:
+		break;
+	}
+
+	RTE_LOG(ERR, PMD, "Negotiate invalid api version %u from VF %d\n",
+		api_version, vf);
+
+	return -1;
+}
+
+static int
 ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 {
 	uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE;
@@ -512,6 +534,9 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 	case IXGBE_VF_SET_VLAN:
 		retval = ixgbe_vf_set_vlan(dev, vf, msgbuf);
 		break;
+	case IXGBE_VF_API_NEGOTIATE:
+		retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf);
+		break;
 	default:
 		PMD_DRV_LOG(DEBUG, "Unhandled Msg %8.8x", (unsigned)msgbuf[0]);
 		retval = IXGBE_ERR_MBX;
-- 
1.8.4.2
[dpdk-dev] [PATCH v5 3/6] ixgbe: Get VF queue number
Get the available Rx and Tx queue number when receiving IXGBE_VF_GET_QUEUES message from VF.

Signed-off-by: Changchun Ouyang

changes in v5
  - Add some 'FIX ME' comments for IXGBE_VF_TRANS_VLAN.

---
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index 495aff5..dbda9b5 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -53,6 +53,8 @@
 #include "ixgbe_ethdev.h"
 
 #define IXGBE_MAX_VFTA     (128)
+#define IXGBE_VF_MSG_SIZE_DEFAULT 1
+#define IXGBE_VF_GET_QUEUE_MSG_SIZE 5
 
 static inline uint16_t
 dev_num_vf(struct rte_eth_dev *eth_dev)
@@ -491,9 +493,41 @@ ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
 }
 
 static int
+ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
+{
+	struct ixgbe_vf_info *vfinfo =
+		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+	uint32_t default_q = vf * RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+
+	/* Verify if the PF supports the mbox APIs version or not */
+	switch (vfinfo[vf].api_version) {
+	case ixgbe_mbox_api_20:
+	case ixgbe_mbox_api_11:
+		break;
+	default:
+		return -1;
+	}
+
+	/* Notify VF of Rx and Tx queue number */
+	msgbuf[IXGBE_VF_RX_QUEUES] = RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+	msgbuf[IXGBE_VF_TX_QUEUES] = RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+
+	/* Notify VF of default queue */
+	msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;
+
+	/*
+	 * FIX ME if it needs fill msgbuf[IXGBE_VF_TRANS_VLAN]
+	 * for VLAN strip or VMDQ_DCB or VMDQ_DCB_RSS
+	 */
+
+	return 0;
+}
+
+static int
 ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 {
 	uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE;
+	uint16_t msg_size = IXGBE_VF_MSG_SIZE_DEFAULT;
 	uint32_t msgbuf[IXGBE_VFMAILBOX_SIZE];
 	int32_t retval;
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -537,6 +571,10 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 	case IXGBE_VF_API_NEGOTIATE:
 		retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf);
 		break;
+	case IXGBE_VF_GET_QUEUES:
+		retval = ixgbe_get_vf_queues(dev, vf, msgbuf);
+		msg_size = IXGBE_VF_GET_QUEUE_MSG_SIZE;
+		break;
 	default:
 		PMD_DRV_LOG(DEBUG, "Unhandled Msg %8.8x", (unsigned)msgbuf[0]);
 		retval = IXGBE_ERR_MBX;
@@ -551,7 +589,7 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 
 	msgbuf[0] |= IXGBE_VT_MSGTYPE_CTS;
 
-	ixgbe_write_mbx(hw, msgbuf, 1, vf);
+	ixgbe_write_mbx(hw, msgbuf, msg_size, vf);
 
 	return retval;
 }
-- 
1.8.4.2
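The reply that the PF builds for IXGBE_VF_GET_QUEUES can be illustrated outside the driver. The following is a minimal sketch, not the driver code: fill_get_queues_reply() is a hypothetical helper, and the SKETCH_* word indices are assumed to mirror the IXGBE_VF_TX_QUEUES, IXGBE_VF_RX_QUEUES, IXGBE_VF_TRANS_VLAN and IXGBE_VF_DEF_QUEUE offsets of the ixgbe mailbox header; please check the real header before relying on them. The message size of 5 words is taken from IXGBE_VF_GET_QUEUE_MSG_SIZE in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed word offsets inside the mailbox message (word 0 is the header). */
enum {
	SKETCH_VF_TX_QUEUES  = 1,
	SKETCH_VF_RX_QUEUES  = 2,
	SKETCH_VF_TRANS_VLAN = 3, /* left untouched, as in the patch */
	SKETCH_VF_DEF_QUEUE  = 4
};

/* Words in the reply: header plus the four fields above. */
#define SKETCH_VF_GET_QUEUE_MSG_SIZE 5

/*
 * Fill the queue information for VF number 'vf' and return the number of
 * 32-bit words that have to be written back to the mailbox.
 */
static uint16_t
fill_get_queues_reply(uint32_t *msgbuf, uint32_t vf, uint32_t nb_q_per_pool)
{
	assert(msgbuf != NULL);

	/* Each VF gets the same number of Rx and Tx queues. */
	msgbuf[SKETCH_VF_RX_QUEUES] = nb_q_per_pool;
	msgbuf[SKETCH_VF_TX_QUEUES] = nb_q_per_pool;

	/* The default queue is the first queue of the pool of this VF. */
	msgbuf[SKETCH_VF_DEF_QUEUE] = vf * nb_q_per_pool;

	return SKETCH_VF_GET_QUEUE_MSG_SIZE;
}
```

Returning the size from the helper keeps the message length next to the code that decides which words are valid, which is one possible way to address the msg_size comment from the v4 review.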
[dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode
Check mq mode for VMDq RSS, handle it correctly instead of returning an error;
Also remove the limitation of per pool queue number has max value of 1, because
the per pool queue number could be 2 or 4 if it is VMDq RSS mode;

The number of rxq specified in config will determine the mq mode for VMDq RSS.

Signed-off-by: Changchun Ouyang

changes in v5:
  - Fix '<' issue, it should be '<=' to test rxq number;
  - Extract a function to remove the embedded switch-case statement.

---
 lib/librte_ether/rte_ethdev.c | 50 ++-
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 95f2ceb..8363e26 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -503,6 +503,31 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 }
 
 static int
+rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	switch (nb_rx_q) {
+	case 1:
+	case 2:
+		RTE_ETH_DEV_SRIOV(dev).active =
+			ETH_64_POOLS;
+		break;
+	case 4:
+		RTE_ETH_DEV_SRIOV(dev).active =
+			ETH_32_POOLS;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
+	RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
+		dev->pci_dev->max_vfs * nb_rx_q;
+
+	return 0;
+}
+
+static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
 {
@@ -510,8 +535,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 
 	if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
 		/* check multi-queue mode */
-		if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) ||
-		    (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
+		if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
 		    (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS) ||
 		    (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
 			/* SRIOV only works in VMDq enable mode */
@@ -525,7 +549,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		}
 
 		switch (dev_conf->rxmode.mq_mode) {
-		case ETH_MQ_RX_VMDQ_RSS:
 		case ETH_MQ_RX_VMDQ_DCB:
 		case ETH_MQ_RX_VMDQ_DCB_RSS:
 			/* DCB/RSS VMDQ in SRIOV mode, not implement yet */
@@ -534,6 +557,25 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 					"unsupported VMDQ mq_mode rx %u\n",
 					port_id, dev_conf->rxmode.mq_mode);
 			return (-EINVAL);
+		case ETH_MQ_RX_RSS:
+			PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
+					" SRIOV active, "
+					"Rx mq mode is changed from:"
+					"mq_mode %u into VMDQ mq_mode %u\n",
+					port_id,
+					dev_conf->rxmode.mq_mode,
+					dev->data->dev_conf.rxmode.mq_mode);
+		case ETH_MQ_RX_VMDQ_RSS:
+			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+			if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
+				if (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
+					PMD_DEBUG_TRACE("ethdev port_id=%d"
+						" SRIOV active, invalid queue"
+						" number for VMDQ RSS\n",
+						port_id);
+					return -EINVAL;
+				}
+			break;
 		default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
 			/* if nothing mq mode configure, use default scheme */
 			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_ONLY;
@@ -553,8 +595,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		default: /* ETH_MQ_TX_VMDQ_ONLY or ETH_MQ_TX_NONE */
 			/* if nothing mq mode configure, use default scheme */
 			dev->data->dev_conf.txmode.mq_mode = ETH_MQ_TX_VMDQ_ONLY;
-			if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
-				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
 			break;
 		}
 
-- 
1.8.4.2
[dpdk-dev] [PATCH v5 5/6] ixgbe: Config VF RSS
It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable VF RSS. The psrtype will determine how many queues the received packets will distribute to, and the value of psrtype should depends on both facet: max VF rxq number which has been negotiated with PF, and the number of rxq specified in config on guest. Signed-off-by: Changchun Ouyang Changes in v4: - the number of rxq from config should be power of 2 and should not bigger than max VF rxq number(negotiated between guest and host). --- lib/librte_pmd_ixgbe/ixgbe_pf.c | 15 ++ lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 103 +- 2 files changed, 106 insertions(+), 12 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c index dbda9b5..93f6e43 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev) IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw->mac.num_rar_entries), 0); IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw->mac.num_rar_entries), 0); + /* +* VF RSS can support at most 4 queues for each VF, even if +* 8 queues are available for each VF, it need refine to 4 +* queues here due to this limitation, otherwise no queue +* will receive any packet even RSS is enabled. 
+*/ + if (eth_dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) { + if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) { + RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_32_POOLS; + RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4; + RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx = + dev_num_vf(eth_dev) * 4; + } + } + /* set VMDq map to default PF pool */ hw->mac.ops.set_vmdq(hw, 0, RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx); diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c index f69abda..e83a9ab 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c @@ -3327,6 +3327,68 @@ ixgbe_alloc_rx_queue_mbufs(struct igb_rx_queue *rxq) } static int +ixgbe_config_vf_rss(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw; + uint32_t mrqc; + + ixgbe_rss_configure(dev); + + hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + /* MRQC: enable VF RSS */ + mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC); + mrqc &= ~IXGBE_MRQC_MRQE_MASK; + switch (RTE_ETH_DEV_SRIOV(dev).active) { + case ETH_64_POOLS: + mrqc |= IXGBE_MRQC_VMDQRSS64EN; + break; + + case ETH_32_POOLS: + case ETH_16_POOLS: + mrqc |= IXGBE_MRQC_VMDQRSS32EN; + break; + + default: + PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode"); + return -EINVAL; + } + + IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc); + + return 0; +} + +static int +ixgbe_config_vf_default(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw = + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + + switch (RTE_ETH_DEV_SRIOV(dev).active) { + case ETH_64_POOLS: + IXGBE_WRITE_REG(hw, IXGBE_MRQC, + IXGBE_MRQC_VMDQEN); + break; + + case ETH_32_POOLS: + IXGBE_WRITE_REG(hw, IXGBE_MRQC, + IXGBE_MRQC_VMDQRT4TCEN); + break; + + case ETH_16_POOLS: + IXGBE_WRITE_REG(hw, IXGBE_MRQC, + IXGBE_MRQC_VMDQRT8TCEN); + break; + default: + PMD_INIT_LOG(ERR, + "invalid pool number in IOV mode"); + break; + } + return 0; +} + +static int ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev) { struct ixgbe_hw *hw = @@ -3358,24 +3420,25 @@ 
ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev) default: ixgbe_rss_disable(dev); } } else { - switch (RTE_ETH_DEV_SRIOV(dev).active) { /* * SRIOV active scheme -* FIXME if support DCB/RSS together with VMDq & SRIOV +* Support RSS together with VMDq & SRIOV */ - case ETH_64_POOLS: - IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQEN); - break; - - case ETH_32_POOLS: - IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT4TCEN); + switch (dev->data->dev_conf.rxmode.mq_mode) { + case ETH_MQ_RX_RSS: + case ETH_MQ_RX_VMDQ_RSS: + ixgbe_config_vf_rss(dev); break; - case ETH_16_POOLS: - IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT8TCEN); -
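The constraint stated in the v4 changelog (the configured rxq count must be a power of 2 and must not exceed the max VF rxq number negotiated with the PF) can be sketched as a stand-alone check. This is only an illustration: the function name and its plain integer parameters are hypothetical and are not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: check the number of Rx
 * queues requested for a VF against the maximum negotiated with the PF.
 * Returns 0 when the count is usable for VF RSS, -EINVAL otherwise.
 */
static int
vf_rss_check_rxq_num(uint16_t nb_rx_q, uint16_t max_vf_rx_q)
{
	/* zero queues cannot receive anything */
	if (nb_rx_q == 0)
		return -EINVAL;
	/* a power of 2 has exactly one bit set */
	if ((nb_rx_q & (nb_rx_q - 1)) != 0)
		return -EINVAL;
	/* must not exceed what the PF granted */
	if (nb_rx_q > max_vf_rx_q)
		return -EINVAL;
	return 0;
}
```

With a negotiated maximum of 4, the counts 1, 2 and 4 pass, while 3 and 8 are rejected.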
[dpdk-dev] [PATCH v5 6/6] testpmd: Set Rx VMDq RSS mode
Set VMDq RSS mode if the port has VFs (VF number is more than 1) and has RSS
information.

Signed-off-by: Changchun Ouyang

Changes in v5:
- Assign txmode.mq_mode with ETH_MQ_TX_NONE explicitly;
- Remove one line of wrong comment.
---
 app/test-pmd/testpmd.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 8c69756..64fd4ee 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1700,7 +1700,6 @@ init_port_config(void)
 			port->dev_conf.rx_adv_conf.rss_conf.rss_hf = 0;
 		}
 
-		/* In SR-IOV mode, RSS mode is not available */
 		if (port->dcb_flag == 0 && port->dev_info.max_vfs == 0) {
 			if( port->dev_conf.rx_adv_conf.rss_conf.rss_hf != 0)
 				port->dev_conf.rxmode.mq_mode = ETH_MQ_RX_RSS;
@@ -1708,6 +1707,20 @@ init_port_config(void)
 				port->dev_conf.rxmode.mq_mode = ETH_MQ_RX_NONE;
 		}
 
+		if (port->dev_info.max_vfs != 0) {
+			if (port->dev_conf.rx_adv_conf.rss_conf.rss_hf != 0) {
+				port->dev_conf.rxmode.mq_mode =
+					ETH_MQ_RX_VMDQ_RSS;
+				port->dev_conf.txmode.mq_mode =
+					ETH_MQ_TX_NONE;
+			} else {
+				port->dev_conf.rxmode.mq_mode =
+					ETH_MQ_RX_NONE;
+				port->dev_conf.txmode.mq_mode =
+					ETH_MQ_TX_NONE;
+			}
+		}
+
 		port->rx_conf.rx_thresh = rx_thresh;
 		port->rx_conf.rx_free_thresh = rx_free_thresh;
 		port->rx_conf.rx_drop_en = rx_drop_en;
--
1.8.4.2
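The Rx mode selection in this patch can be read as a small decision function. The sketch below mirrors the two `if` blocks of the hunk under stated assumptions: the enum is a local stand-in for the ethdev `mq_mode` values and the function name is hypothetical.

```c
#include <stdint.h>

/* Local stand-ins for the ethdev mq_mode values; for illustration only. */
enum rx_mq_mode { RX_NONE, RX_RSS, RX_VMDQ_RSS };

/*
 * Returns the Rx mode for a port, or `current` when neither branch of
 * the patch applies (DCB enabled and no VFs).
 */
static enum rx_mq_mode
select_rx_mq_mode(int dcb_flag, uint16_t max_vfs, uint64_t rss_hf,
		  enum rx_mq_mode current)
{
	enum rx_mq_mode mode = current;

	/* no DCB and no VFs: plain RSS when hash functions are requested */
	if (dcb_flag == 0 && max_vfs == 0)
		mode = (rss_hf != 0) ? RX_RSS : RX_NONE;

	/* VFs present: VMDq RSS when hash functions are requested */
	if (max_vfs != 0)
		mode = (rss_hf != 0) ? RX_VMDQ_RSS : RX_NONE;

	return mode;
}
```

Written this way, it is easy to see that the VF branch does not look at `dcb_flag` at all, which matches the hunk as posted.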
[dpdk-dev] [PATCH RFC v2 03/12] lib/librte_vhost: move event_copy logic from virtio-net.c to vhost-net-cdev.c
> +		file = *(const struct vhost_vring_file *)in_buf;
> +		LOG_DEBUG(VHOST_CONFIG,
> +			"idx:%d fd:%d\n", file.index, file.fd);
> +		fd = eventfd_copy(file.fd, ctx.pid);
> +		if (fd < 0) {
> +			fuse_reply_ioctl(req, -1, NULL, 0);
> +			result = -1;
> +			break;
> +		}
> +		file.fd = fd;
> +		if (cmd == VHOST_SET_VRING_KICK)
> +			VHOST_IOCTL_R(struct vhost_vring_file, file, ops->set_vring_kick);
> +		else
> +			VHOST_IOCTL_R(struct vhost_vring_file, file, ops->set_vring_call);

The variable file doesn't keep the new fd: inside VHOST_IOCTL_R it is assigned
again with the value from in_buf. I will fix this bug in the next version of
the patch.

> +	}
> 	break;
>
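The bug described above can be reproduced in miniature. The following is a sketch under assumptions: the struct is a reduced stand-in for struct vhost_vring_file, the recording callback stands in for ops->set_vring_kick, and the second copy from in_buf models what the reply says VHOST_IOCTL_R does. None of these helper names come from the patch.

```c
#include <string.h>

/* Reduced stand-in for struct vhost_vring_file; field names follow the quoted code. */
struct vring_file {
	unsigned int index;
	int fd;
};

/* Stand-in for ops->set_vring_kick: records the fd it was handed. */
static int last_fd_seen;

static void
record_fd(const struct vring_file *f)
{
	last_fd_seen = f->fd;
}

/*
 * Shape of the reported bug (assumed): the argument is loaded from
 * in_buf a second time, which discards the translated fd.
 */
static void
handle_reloading(const void *in_buf, int translated_fd)
{
	struct vring_file file;

	memcpy(&file, in_buf, sizeof(file));
	file.fd = translated_fd;             /* local update ... */
	memcpy(&file, in_buf, sizeof(file)); /* ... lost by the second load */
	record_fd(&file);
}

/* One possible fix: load once, patch the fd, then call the handler. */
static void
handle_load_once(const void *in_buf, int translated_fd)
{
	struct vring_file file;

	memcpy(&file, in_buf, sizeof(file));
	file.fd = translated_fd;
	record_fd(&file);
}
```

With a request carrying fd 7 and a translated fd of 42, the first variant hands 7 to the callback and the second hands 42.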
[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
Hi Frank, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang > Sent: Wednesday, January 07, 2015 2:04 AM > To: 'Olivier MATZ' > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum > forwarding engine > > Hi Olivier, > > > -Original Message- > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com] > > Sent: Saturday, December 13, 2014 12:33 AM > > To: Liu, Jijiang > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and > > csum forwarding engine > > > > Hello, > > > > On 12/12/2014 04:48 AM, Liu, Jijiang wrote: > > > The 'hw/sw' option is used to set/clear the flag of enabling TX > > > tunneling packet > > checksum hardware offload in testpmd application. > > > > This is not clear at all. > > In your command, there is (hw|sw|none). > > Are you talking about inner or outer? > > Is this command useful for any kind of packet? > > How does it combine with "tx_checksum set outer-ip (hw|sw)"? > > > > I rethink these TX checksum commands in this patch set and agree with you > that we should make some changes for having clear > meaning for them. > > There are 3 commands in patch set as follows, > 1. tx_checksum set tunnel (hw|sw|none) (port-id) > > Now I also think the command 1 may confuse user, they probably don't > understand why we need 'hw' or 'sw' option and when to > use the two option, > so I will replace the command with 'tx_checksum set hw-tunnel-mode (on|off) > (port-id)' command. I am a bit confused here, could you explain what would be a behaviour for 'on' and 'off'? Konstantin > > 2. tx_checksum set outer-ip (hw|sw) (port-id) > 3. tx_checksum set (ip|udp|tcp|sctp) (hw|sw) (port-id) > > The command 2 will be merged into command 3, the new command is ' tx_checksum > set (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port- > id)'. 
> > These most of the cases in > http://dpdk.org/ml/archives/dev/2014-December/009213.html will be covered by > using the two > commands > > The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is used to > set/clear TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM > flag. > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the testpmd > flag is set, which tell driver/HW treat that transmit > packet as a tunneling packet. > > When 'on' is set, which is able to meet Method B.1 and method C. > > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed to set, > so the PKT_TX_UDP_TUNNEL_PKT offload flag is > not needed to set, then HW treat that transmit packet as a non-tunneling > packet. It is able to meet Method B.2. > > As to case A, I think it is not mandatory to cover it in csum fwd engine for > tunneling packet. > > Is the above description clear for you? > > > Regards, > > Olivier
[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
Hi Konstantin, > -Original Message- > From: Ananyev, Konstantin > Sent: Wednesday, January 7, 2015 5:59 PM > To: Liu, Jijiang; 'Olivier MATZ' > Cc: dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and > csum forwarding engine > > Hi Frank, > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang > > Sent: Wednesday, January 07, 2015 2:04 AM > > To: 'Olivier MATZ' > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and > > csum forwarding engine > > > > Hi Olivier, > > > > > -Original Message- > > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com] > > > Sent: Saturday, December 13, 2014 12:33 AM > > > To: Liu, Jijiang > > > Cc: dev at dpdk.org > > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command > > > and csum forwarding engine > > > > > > Hello, > > > > > > On 12/12/2014 04:48 AM, Liu, Jijiang wrote: > > > > The 'hw/sw' option is used to set/clear the flag of enabling TX > > > > tunneling packet > > > checksum hardware offload in testpmd application. > > > > > > This is not clear at all. > > > In your command, there is (hw|sw|none). > > > Are you talking about inner or outer? > > > Is this command useful for any kind of packet? > > > How does it combine with "tx_checksum set outer-ip (hw|sw)"? > > > > > > > I rethink these TX checksum commands in this patch set and agree with > > you that we should make some changes for having clear meaning for them. > > > > There are 3 commands in patch set as follows, 1. tx_checksum set > > tunnel (hw|sw|none) (port-id) > > > > Now I also think the command 1 may confuse user, they probably don't > > understand why we need 'hw' or 'sw' option and when to use the two > > option, so I will replace the command with 'tx_checksum set hw-tunnel-mode > (on|off) (port-id)' command. > > I am a bit confused here, could you explain what would be a behaviour for > 'on' and > 'off'? 
> Konstantin I have explained the behaviour for 'on' and'off' below, The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is used to set/clear TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag. Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the testpmd flag is set, which means to tell HW treat that transmit packet as a tunneling packet to do checksum offload When 'on' is set, which is able to meet Method B.1 and method C. When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set, then HW treat that transmit packet as a non-tunneling packet. It is able to meet Method B.2. Is the explanation not clear? > > > > > 2. tx_checksum set outer-ip (hw|sw) (port-id) 3. tx_checksum set > > (ip|udp|tcp|sctp) (hw|sw) (port-id) > > > > The command 2 will be merged into command 3, the new command is ' > > tx_checksum set (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port- id)'. > > > > These most of the cases in > > http://dpdk.org/ml/archives/dev/2014-December/009213.html will be > > covered by using the two commands > > > > The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is > > used to set/clear TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag. > > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the > > testpmd flag is set, which tell driver/HW treat that transmit packet as a > tunneling packet. > > > > When 'on' is set, which is able to meet Method B.1 and method C. > > > > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed > > to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set, > > then > HW treat that transmit packet as a non-tunneling packet. It is able to meet > Method B.2. > > > > As to case A, I think it is not mandatory to cover it in csum fwd engine for > tunneling packet. > > > > Is the above description clear for you? > > > > > Regards, > > > Olivier
[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
> -Original Message- > From: Liu, Jijiang > Sent: Wednesday, January 07, 2015 11:39 AM > To: Ananyev, Konstantin; 'Olivier MATZ' > Cc: dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum > forwarding engine > > Hi Konstantin, > > > -Original Message- > > From: Ananyev, Konstantin > > Sent: Wednesday, January 7, 2015 5:59 PM > > To: Liu, Jijiang; 'Olivier MATZ' > > Cc: dev at dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and > > csum forwarding engine > > > > Hi Frank, > > > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang > > > Sent: Wednesday, January 07, 2015 2:04 AM > > > To: 'Olivier MATZ' > > > Cc: dev at dpdk.org > > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and > > > csum forwarding engine > > > > > > Hi Olivier, > > > > > > > -Original Message- > > > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com] > > > > Sent: Saturday, December 13, 2014 12:33 AM > > > > To: Liu, Jijiang > > > > Cc: dev at dpdk.org > > > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command > > > > and csum forwarding engine > > > > > > > > Hello, > > > > > > > > On 12/12/2014 04:48 AM, Liu, Jijiang wrote: > > > > > The 'hw/sw' option is used to set/clear the flag of enabling TX > > > > > tunneling packet > > > > checksum hardware offload in testpmd application. > > > > > > > > This is not clear at all. > > > > In your command, there is (hw|sw|none). > > > > Are you talking about inner or outer? > > > > Is this command useful for any kind of packet? > > > > How does it combine with "tx_checksum set outer-ip (hw|sw)"? > > > > > > > > > > I rethink these TX checksum commands in this patch set and agree with > > > you that we should make some changes for having clear meaning for them. > > > > > > There are 3 commands in patch set as follows, 1. 
tx_checksum set > > > tunnel (hw|sw|none) (port-id) > > > > > > Now I also think the command 1 may confuse user, they probably don't > > > understand why we need 'hw' or 'sw' option and when to use the two > > > option, so I will replace the command with 'tx_checksum set hw-tunnel-mode > > (on|off) (port-id)' command. > > > > I am a bit confused here, could you explain what would be a behaviour for > > 'on' and > > 'off'? > > Konstantin > > I have explained the behaviour for 'on' and'off' below, > > The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is > used to set/clear TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag. > > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the > testpmd flag is set, which means to tell HW treat that transmit packet as a > tunneling packet to do checksum offload > When 'on' is set, which is able to meet Method B.1 and method C. > > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed > to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set, then > HW treat that transmit packet as a non-tunneling > packet. It is able to meet Method B.2. > > Is the explanation not clear? Ok, and how I can set method A (testpmd treat all packets as non-tunnelling and never look beyond outer L4 header) then? Konstantin > > > > > > > > > 2. tx_checksum set outer-ip (hw|sw) (port-id) 3. tx_checksum set > > > (ip|udp|tcp|sctp) (hw|sw) (port-id) > > > > > > The command 2 will be merged into command 3, the new command is ' > > > tx_checksum set (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port- id)'. > > > > > > These most of the cases in > > > http://dpdk.org/ml/archives/dev/2014-December/009213.html will be > > > covered by using the two commands > > > > > > The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is > > > used to set/clear TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag. 
> > > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the > > > testpmd flag is set, which tell driver/HW treat that transmit packet as a > > tunneling packet. > > > > > > When 'on' is set, which is able to meet Method B.1 and method C. > > > > > > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed > > > to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set, > > > then > > HW treat that transmit packet as a non-tunneling packet. It is able to meet > > Method B.2. > > > > > > As to case A, I think it is not mandatory to cover it in csum fwd engine > > > for > > tunneling packet. > > > > > > Is the above description clear for you? > > > > > > > Regards, > > > > Olivier
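The on/off behaviour described in this thread can be sketched as two small functions: the command sets or clears a testpmd flag, and the forwarding engine sets the per-packet tunnel flag only when that testpmd flag is set. The bit values and function names below are illustrative assumptions; the real definitions live in testpmd and rte_mbuf.h.

```c
#include <stdint.h>

/* Illustrative bit values only, not the real definitions. */
#define SKETCH_TESTPMD_TUNNEL_CKSUM  (1u << 0)
#define SKETCH_PKT_TX_UDP_TUNNEL_PKT (1ull << 0)

/* 'tx_checksum set hw-tunnel-mode (on|off)': set or clear the testpmd flag. */
static uint16_t
set_hw_tunnel_mode(uint16_t testpmd_ol_flags, int on)
{
	if (on)
		return testpmd_ol_flags | SKETCH_TESTPMD_TUNNEL_CKSUM;
	return testpmd_ol_flags & (uint16_t)~SKETCH_TESTPMD_TUNNEL_CKSUM;
}

/* Per packet: ask the driver for tunnel handling only when the flag is set. */
static uint64_t
tx_ol_flags_for(uint16_t testpmd_ol_flags, uint64_t ol_flags)
{
	if (testpmd_ol_flags & SKETCH_TESTPMD_TUNNEL_CKSUM)
		ol_flags |= SKETCH_PKT_TX_UDP_TUNNEL_PKT;
	return ol_flags;
}
```

This covers only the 'on' and 'off' cases discussed above; it does not answer how method A would be selected.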
[dpdk-dev] [PATCH RFC v2 00/12] lib/librte_vhost: vhost-user support
On 12/18/2014 1:43 AM, Xie, Huawei wrote: > >> -Original Message- >> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp] >> Sent: Sunday, December 14, 2014 10:26 PM >> To: Xie, Huawei; dev at dpdk.org >> Cc: haifeng.lin at intel.com >> Subject: Re: [PATCH RFC v2 00/12] lib/librte_vhost: vhost-user support >> >> Hi Xie, >> >> I've got warnings from checkpatch.pl. >> Mostly 'over 80 characters' warnings. >> (But I know these are come from original vhost-example code sometimes.) >> >> So far, your patches are RFC, so I haven't check these strictly. > Thanks. > I try to, but you know sometimes 'over 80 characters' is unavoidable. Why unavoidable? I'm very curious :) >> Thanks, >> Tetsuya >> >> (2014/12/11 6:37), Huawei Xie wrote: >>> This patchset refines vhost library to support both vhost-cuse and >>> vhost-user. >>> >>> >>> Huawei Xie (12): >>> create vhost_cuse directory and move vhost-net-cdev.c to vhost_cuse >> directory >>> rename vhost-net-cdev.h as vhost-net.h >>> move eventfd_copy logic out from virtio-net.c to vhost-net-cdev.c >>> exact copy of host_memory_map from virtio-net.c to new file >>> virtio-net-cdev.c >>> host_memory_map refine: map partial memory of target process into current >> process >>> cuse_set_memory_table is the VHOST_SET_MEMORY_TABLE message >> handler for cuse >>> fd management for vhost user >>> vhost-user support >>> minor fix >>> vhost-user memory region map/unmap >>> kick/callfd fix >>> cleanup when vhost user connection is closed >>> >>> lib/librte_vhost/Makefile | 5 +- >>> lib/librte_vhost/rte_virtio_net.h | 2 + >>> lib/librte_vhost/vhost-net-cdev.c | 389 -- >>> lib/librte_vhost/vhost-net-cdev.h | 113 --- >>> lib/librte_vhost/vhost-net.h | 117 +++ >>> lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 452 >> ++ >>> lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 349 >>> lib/librte_vhost/vhost_cuse/virtio-net-cdev.h | 45 +++ >>> lib/librte_vhost/vhost_rxtx.c | 2 +- >>> lib/librte_vhost/vhost_user/fd_man.c | 205 >>> 
lib/librte_vhost/vhost_user/fd_man.h | 64 >>> lib/librte_vhost/vhost_user/vhost-net-user.c | 423 >> >>> lib/librte_vhost/vhost_user/vhost-net-user.h | 107 ++ >>> lib/librte_vhost/vhost_user/virtio-net-user.c | 313 ++ >>> lib/librte_vhost/vhost_user/virtio-net-user.h | 49 +++ >>> lib/librte_vhost/virtio-net.c | 394 ++ >>> lib/librte_vhost/virtio-net.h | 43 +++ >>> 17 files changed, 2199 insertions(+), 873 deletions(-) >>> delete mode 100644 lib/librte_vhost/vhost-net-cdev.c >>> delete mode 100644 lib/librte_vhost/vhost-net-cdev.h >>> create mode 100644 lib/librte_vhost/vhost-net.h >>> create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c >>> create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c >>> create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h >>> create mode 100644 lib/librte_vhost/vhost_user/fd_man.c >>> create mode 100644 lib/librte_vhost/vhost_user/fd_man.h >>> create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c >>> create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h >>> create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c >>> create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h >>> create mode 100644 lib/librte_vhost/virtio-net.h >>> >
[dpdk-dev] [PATCH 0/2] remove limit on devargs parameters length
Here is a little patchset that removes the limit on the devargs parameters
length.
Previously, arguments specified by user would be stored in a static buffer,
while there is no particular reason why we should have such a constraint,
afaik.

--
David Marchand

David Marchand (2):
  devargs: indent and cleanup
  devargs: remove limit on parameters length

 lib/librte_eal/common/eal_common_devargs.c  | 51 ---
 lib/librte_eal/common/include/rte_devargs.h |  4 +--
 2 files changed, 32 insertions(+), 23 deletions(-)

--
1.7.10.4
[dpdk-dev] [PATCH 1/2] devargs: indent and cleanup
Prepare for next commit. Fix some indent issues, refactor error code. Signed-off-by: David Marchand --- lib/librte_eal/common/eal_common_devargs.c | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c index 4c7d11a..8c9b31a 100644 --- a/lib/librte_eal/common/eal_common_devargs.c +++ b/lib/librte_eal/common/eal_common_devargs.c @@ -48,7 +48,7 @@ struct rte_devargs_list devargs_list = int rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) { - struct rte_devargs *devargs; + struct rte_devargs *devargs = NULL; char buf[RTE_DEVARGS_LEN]; char *sep; int ret; @@ -57,14 +57,14 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) if (ret < 0 || ret >= (int)sizeof(buf)) { RTE_LOG(ERR, EAL, "user device args too large: <%s>\n", devargs_str); - return -1; + goto fail; } /* use malloc instead of rte_malloc as it's called early at init */ devargs = malloc(sizeof(*devargs)); if (devargs == NULL) { RTE_LOG(ERR, EAL, "cannot allocate devargs\n"); - return -1; + goto fail; } memset(devargs, 0, sizeof(*devargs)); devargs->type = devtype; @@ -81,28 +81,29 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) case RTE_DEVTYPE_BLACKLISTED_PCI: /* try to parse pci identifier */ if (eal_parse_pci_BDF(buf, &devargs->pci.addr) != 0 && - eal_parse_pci_DomBDF(buf, &devargs->pci.addr) != 0) { - RTE_LOG(ERR, EAL, - "invalid PCI identifier <%s>\n", buf); - free(devargs); - return -1; + eal_parse_pci_DomBDF(buf, &devargs->pci.addr) != 0) { + RTE_LOG(ERR, EAL, "invalid PCI identifier <%s>\n", buf); + goto fail; } break; case RTE_DEVTYPE_VIRTUAL: /* save driver name */ ret = snprintf(devargs->virtual.drv_name, - sizeof(devargs->virtual.drv_name), "%s", buf); + sizeof(devargs->virtual.drv_name), "%s", buf); if (ret < 0 || ret >= (int)sizeof(devargs->virtual.drv_name)) { - RTE_LOG(ERR, EAL, - "driver name too large: <%s>\n", buf); - 
free(devargs); - return -1; + RTE_LOG(ERR, EAL, "driver name too large: <%s>\n", buf); + goto fail; } break; } TAILQ_INSERT_TAIL(&devargs_list, devargs, next); return 0; + +fail: + if (devargs) + free(devargs); + return -1; } /* count the number of devices of a specified type */ -- 1.7.10.4
[dpdk-dev] [PATCH 2/2] devargs: remove limit on parameters length
As far as I know, there is no reason why we should have a limit on the length of parameters that can be given for a device. Remove this limit by using dynamic allocations. Signed-off-by: David Marchand --- lib/librte_eal/common/eal_common_devargs.c | 26 +- lib/librte_eal/common/include/rte_devargs.h |4 ++-- 2 files changed, 19 insertions(+), 11 deletions(-) diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c index 8c9b31a..3aace08 100644 --- a/lib/librte_eal/common/eal_common_devargs.c +++ b/lib/librte_eal/common/eal_common_devargs.c @@ -49,17 +49,10 @@ int rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) { struct rte_devargs *devargs = NULL; - char buf[RTE_DEVARGS_LEN]; + char *buf = NULL; char *sep; int ret; - ret = snprintf(buf, sizeof(buf), "%s", devargs_str); - if (ret < 0 || ret >= (int)sizeof(buf)) { - RTE_LOG(ERR, EAL, "user device args too large: <%s>\n", - devargs_str); - goto fail; - } - /* use malloc instead of rte_malloc as it's called early at init */ devargs = malloc(sizeof(*devargs)); if (devargs == NULL) { @@ -69,11 +62,21 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) memset(devargs, 0, sizeof(*devargs)); devargs->type = devtype; + buf = strdup(devargs_str); + if (buf == NULL) { + RTE_LOG(ERR, EAL, "cannot allocate temp memory for devargs\n"); + goto fail; + } + /* set the first ',' to '\0' to split name and arguments */ sep = strchr(buf, ','); if (sep != NULL) { sep[0] = '\0'; - snprintf(devargs->args, sizeof(devargs->args), "%s", sep + 1); + devargs->args = strdup(sep + 1); + if (devargs->args == NULL) { + RTE_LOG(ERR, EAL, "cannot allocate for devargs args\n"); + goto fail; + } } switch (devargs->type) { @@ -97,10 +100,15 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str) break; } + free(buf); TAILQ_INSERT_TAIL(&devargs_list, devargs, next); return 0; fail: + if (devargs->args) + free(devargs->args); + if (buf) + free(buf); 
if (devargs) free(devargs); return -1; diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h index 9f9c98f..996e180 100644 --- a/lib/librte_eal/common/include/rte_devargs.h +++ b/lib/librte_eal/common/include/rte_devargs.h @@ -88,8 +88,8 @@ struct rte_devargs { char drv_name[32]; } virtual; }; -#define RTE_DEVARGS_LEN 256 - char args[RTE_DEVARGS_LEN]; /**< Arguments string as given by user. */ + /** Arguments string as given by user. */ + char *args; }; /** user device double-linked queue type definition */ -- 1.7.10.4
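The split of a "name,args" string with dynamic allocation can be sketched outside of the EAL. This is a minimal illustration, assuming a reduced struct and hypothetical function names; it uses a small copy helper in place of strdup to stay within ISO C. Note that the fail path in the hunk as posted appears to read devargs->args before checking devargs for NULL, which would be reached when the first malloc fails; the sketch avoids that by returning early.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of struct rte_devargs for illustration. */
struct devargs_sketch {
	char *name; /* text before the first ',' */
	char *args; /* text after the first ',', or NULL when absent */
};

/* Heap copy of a string; returns NULL on allocation failure. */
static char *
dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

static void
devargs_sketch_free(struct devargs_sketch *d)
{
	if (d == NULL)
		return;
	free(d->name);
	free(d->args);
	free(d);
}

static struct devargs_sketch *
devargs_sketch_parse(const char *str)
{
	struct devargs_sketch *d;
	char *sep;

	d = calloc(1, sizeof(*d));
	if (d == NULL)
		return NULL; /* nothing to clean up yet */

	d->name = dup_str(str);
	if (d->name == NULL)
		goto fail;

	/* set the first ',' to '\0' to split name and arguments */
	sep = strchr(d->name, ',');
	if (sep != NULL) {
		*sep = '\0';
		d->args = dup_str(sep + 1);
		if (d->args == NULL)
			goto fail;
	}
	return d;

fail:
	devargs_sketch_free(d);
	return NULL;
}
```

Parsing "eth_pcap0,rx_pcap=in.pcap" yields the name "eth_pcap0" and the args "rx_pcap=in.pcap", with no fixed upper bound on the args length.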
[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine
On 12/10/2014 9:04 AM, Jijiang Liu wrote:
> In the current codes, the "tx_checksum set (ip|udp|tcp|sctp|vxlan) (hw|sw)
> (port-id)" command is not easy to understand and extend, so the patch set
> enhances the tx_checksum command and reworks csum forwarding engine due to
> the change of tx_checksum command.
> The main changes of the tx_checksum command are listed below,
>
> 1> add "tx_checksum set tunnel (hw|sw|none) (port-id)" command
>
> The command is used to set/clear tunnel flag that is used to tell the NIC
> that the packet is a tunneling packet and application want hardware TX
> checksum offload for outer layer, or inner layer, or both.
>
> The 'none' option means that user treat tunneling packet as ordinary packet
> when using the csum forward engine.
> for example, let say we have a tunnel packet:
> eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in.
> one of several scenarios:
>
> 1) User requests HW offload for ipv4_hdr_out checksum, and doesn't care is
> it a tunnelled packet or not. So he sets:
>
> tx_checksum set tunnel none 0
>
> tx_checksum set ip hw 0

Hi Jijiang,

I have one question. Many commands need a port-id field, as here, so why not
put port-id right after the command name? Like below:

tx_checksum (port-id) set tunnel (hw|sw|none)

Then, if users do not care whether it is a tunneling packet, they can just
ignore the fields after port-id:

tx_checksum (port-id)

In code the command may be simpler to parse, and it is better for the user.
What I mean is that we can put the required parameters right after the
command and put the optional parameters (or those that can be optional) at
the end of the command line:

(Command) (required parameter) (optional parameters)

Thus, it would be a better user experience.
But it is just a personal idea.

Thanks,
Michael

>
> So for such case we should set tx_tunnel to 'none'.
> > 2> add "tx_checksum set outer-ip (hw|sw) (port-id)" command > > The command is used to set/clear outer IP flag that is used to tell the NIC > that application want hardware offload for outer layer. > > 3> remove the 'vxlan' option from the "tx_checksum set > (ip|udp|tcp|sctp|vxlan) (hw|sw) (port-id)" command > > The command is used to set IP, UDP, TCP and SCTP TX checksum flag. In the > case of tunneling packet, the IP, UDP, TCP and SCTP flags always concern > inner layer. > > Moreover, replace the TESTPMD_TX_OFFLOAD_VXLAN_CKSUM flag with > TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag and add the > TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM and TESTPMD_TX_OFFLOAD_NON_TUNNEL_CKSUM > flag in test-pmd application. > > v2 change: > redefine the 'none' behaviour for "tx_checksum set tunnel (hw|sw|none) > (port-id)" command. > v3 change: > typo correction in cmdline help > > Jijiang Liu (3): > add outer IP offload capability in librte_ether. > add outer IP checksum capability in i40e PMD > testpmd command lines of the tx_checksum and csum forwarding rework > > app/test-pmd/cmdline.c| 201 > +++-- > app/test-pmd/csumonly.c | 35 --- > app/test-pmd/testpmd.h|6 +- > lib/librte_ether/rte_ethdev.h |1 + > lib/librte_pmd_i40e/i40e_ethdev.c |3 +- > 5 files changed, 218 insertions(+), 28 deletions(-) >
[dpdk-dev] [PATCH] librte_reorder: New reorder library with unit tests and app
From: Reshma Pattan

1) New library to provide reordering of out-of-order mbufs based on the
   sequence number of the mbuf. The library uses a reorder buffer structure
   which in turn uses two circular buffers called the ready and order
   buffers.
   * rte_reorder_create API creates an instance of a reorder buffer.
   * rte_reorder_init API initializes a given reorder buffer instance.
   * rte_reorder_reset API resets a given reorder buffer instance.
   * rte_reorder_insert API inserts the mbuf into the order circular buffer.
   * rte_reorder_fill_overflow moves mbufs from the order buffer to the
     ready buffer to accommodate early packets in the order buffer.
   * rte_reorder_drain API provides a draining facility to fetch reordered
     mbufs out of the order and ready buffers.

2) New unit test cases added.

3) New application added to verify the performance of the library.

Signed-off-by: Reshma Pattan
Signed-off-by: Richardson Bruce
---
 app/test/Makefile                              |   2 +
 app/test/test_reorder.c                        | 452 ++
 config/common_bsdapp                           |   5 +
 config/common_linuxapp                         |   5 +
 examples/packet_ordering/Makefile              |  50 ++
 examples/packet_ordering/main.c                | 637 +
 lib/Makefile                                   |   1 +
 lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
 lib/librte_mbuf/rte_mbuf.h                     |   3 +
 lib/librte_reorder/Makefile                    |  50 ++
 lib/librte_reorder/rte_reorder.c               | 464 ++
 lib/librte_reorder/rte_reorder.h               | 184 +++
 mk/rte.app.mk                                  |   4 +
 13 files changed, 1859 insertions(+)
 create mode 100644 app/test/test_reorder.c
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 4311f96..24b27d7 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c
 
+SRCS-$(CONFIG_RTE_LIBRTE_REORDER) +=
test_reorder.c + SRCS-y += test_devargs.c SRCS-y += virtual_pmd.c SRCS-y += packet_burst_generator.c diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c new file mode 100644 index 000..6a673e2 --- /dev/null +++ b/app/test/test_reorder.c @@ -0,0 +1,452 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "test.h" +#include "stdio.h" + +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "test.h" + +#define BURST 32 +#define REORDER_BUFFER_SIZE 16384 +#define NUM_MBUFS (2*REORDER_BUFFER_SIZE) +#define REORDER_BUFFER_SIZE_INVALID 2049 +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) + +struct reorder_unittest_params { + struct rte_mempool *p; + struct rte_reorder_buffer *b; +}; + +static struct reorder_unittest_par
[dpdk-dev] [PATCH] librte_reorder: New reorder library with unit tests and app
Self Nacked. Sending multiple sub patches instead of this big patch.

> -Original Message-
> From: Pattan, Reshma
> Sent: Wednesday, January 7, 2015 3:28 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma
> Subject: [PATCH] librte_reorder: New reorder library with unit tests and app
[dpdk-dev] [PATCH 1/3] librte_reorder: New reorder library
From: Reshma Pattan

1) New library to reorder out-of-order mbufs based on the sequence number of each mbuf. The library uses a reorder buffer structure, which in turn uses two circular buffers called the ready and order buffers.
* rte_reorder_create API creates an instance of a reorder buffer.
* rte_reorder_init API initializes a given reorder buffer instance.
* rte_reorder_reset API resets a given reorder buffer instance.
* rte_reorder_insert API inserts an mbuf into the order circular buffer.
* rte_reorder_fill_overflow moves mbufs from the order buffer to the ready buffer to accommodate early packets in the order buffer.
* rte_reorder_drain API drains reordered mbufs from the order and ready buffers.

Signed-off-by: Reshma Pattan
Signed-off-by: Richardson Bruce
---
 config/common_bsdapp | 5 +
 config/common_linuxapp | 5 +
 lib/Makefile | 1 +
 lib/librte_eal/common/include/rte_tailq_elem.h | 2 +
 lib/librte_mbuf/rte_mbuf.h | 3 +
 lib/librte_reorder/Makefile | 50 +++
 lib/librte_reorder/rte_reorder.c | 464 +
 lib/librte_reorder/rte_reorder.h | 184 ++
 8 files changed, 714 insertions(+)
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 9177db1..e3e0e94 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -334,6 +334,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 
 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2f9643b..b5ec730 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -342,6 +342,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 
 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index 0ffc982..5919d32 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -65,6 +65,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += librte_distributor
 DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
+DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_tailq_elem.h b/lib/librte_eal/common/include/rte_tailq_elem.h
index f74fc7c..3013869 100644
--- a/lib/librte_eal/common/include/rte_tailq_elem.h
+++ b/lib/librte_eal/common/include/rte_tailq_elem.h
@@ -84,6 +84,8 @@ rte_tailq_elem(RTE_TAILQ_ACL, "RTE_ACL")
 
 rte_tailq_elem(RTE_TAILQ_DISTRIBUTOR, "RTE_DISTRIBUTOR")
 
+rte_tailq_elem(RTE_TAILQ_REORDER, "RTE_REORDER")
+
 rte_tailq_end(RTE_TAILQ_NUM)
 
 #undef rte_tailq_elem
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 16059c6..ed27eb8 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -262,6 +262,9 @@ struct rte_mbuf {
 		uint32_t usr; /**< User defined tags. See @rte_distributor_process */
 	} hash; /**< hash information */
 
+	/* sequence number - field used in distributor and reorder library */
+	uint32_t seqn;
+
 	/* second cache line - fields only used in slow path or on TX */
 	MARKER cacheline1 __rte_cache_aligned;
 
diff --git a/lib/librte_reorder/Makefile b/lib/librte_reorder/Makefile
new file mode 100644
index 000..12b916f
--- /dev/null
+++ b/lib/librte_reorder/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRE
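The windowed insert/drain idea in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from rte_reorder.c: it keeps a single order window (the ready buffer and the fill-overflow step are left out), and struct pkt, struct reorder_win, win_insert, win_drain and WIN_SIZE are hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WIN_SIZE 8u                 /* hypothetical window size, power of two */
#define WIN_MASK (WIN_SIZE - 1u)

struct pkt {                        /* stand-in for struct rte_mbuf */
	uint32_t seqn;
};

struct reorder_win {
	struct pkt *entries[WIN_SIZE];  /* seqn min_seqn + i lives at (head + i) & WIN_MASK */
	uint32_t head;                  /* index of the slot that belongs to min_seqn */
	uint32_t min_seqn;              /* next sequence number to be released */
};

/* Park a packet in its slot. Returns 0 on success, -1 if outside the window. */
static int
win_insert(struct reorder_win *w, struct pkt *p)
{
	uint32_t offset;

	assert(w != NULL && p != NULL);
	offset = p->seqn - w->min_seqn; /* unsigned subtraction handles wrap */
	if (offset >= WIN_SIZE)
		return -1;
	w->entries[(w->head + offset) & WIN_MASK] = p;
	return 0;
}

/* Release packets in sequence order until the first gap. Returns the count. */
static unsigned int
win_drain(struct reorder_win *w, struct pkt **out, unsigned int max)
{
	unsigned int n = 0;

	while (n < max && w->entries[w->head] != NULL) {
		out[n++] = w->entries[w->head];
		w->entries[w->head] = NULL;
		w->head = (w->head + 1) & WIN_MASK;
		w->min_seqn++;
	}
	return n;
}
```

A caller would zero-initialize a struct reorder_win, insert packets as they arrive, and call win_drain() periodically; packets behind a gap stay parked until the missing sequence number shows up.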
[dpdk-dev] [PATCH 2/3] librte_reorder: New unit test cases added
From: Reshma Pattan

Signed-off-by: Reshma Pattan
---
 app/test/Makefile | 2 +
 app/test/test_reorder.c | 452
 mk/rte.app.mk | 4 +
 3 files changed, 458 insertions(+)
 create mode 100644 app/test/test_reorder.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 4311f96..24b27d7 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c
 
+SRCS-$(CONFIG_RTE_LIBRTE_REORDER) += test_reorder.c
+
 SRCS-y += test_devargs.c
 SRCS-y += virtual_pmd.c
 SRCS-y += packet_burst_generator.c
diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c
new file mode 100644
index 000..6a673e2
--- /dev/null
+++ b/app/test/test_reorder.c
@@ -0,0 +1,452 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "test.h"
+#include "stdio.h"
+
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "test.h"
+
+#define BURST 32
+#define REORDER_BUFFER_SIZE 16384
+#define NUM_MBUFS (2*REORDER_BUFFER_SIZE)
+#define REORDER_BUFFER_SIZE_INVALID 2049
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+struct reorder_unittest_params {
+	struct rte_mempool *p;
+	struct rte_reorder_buffer *b;
+};
+
+static struct reorder_unittest_params default_params = {
+	.p = NULL,
+	.b = NULL
+};
+
+static struct reorder_unittest_params *test_params = &default_params;
+
+static int
+test_reorder_create_inval_name(void)
+{
+	struct rte_reorder_buffer *b = NULL;
+	char *name = NULL;
+
+	b = rte_reorder_create(name, rte_socket_id(), REORDER_BUFFER_SIZE);
+	TEST_ASSERT_EQUAL(b, NULL, "No error on create() with invalid name param.");
+	TEST_ASSERT_EQUAL(rte_errno, EINVAL,
+			"No error on create() with invalid name param.");
+	return 0;
+}
+
+static int
+test_reorder_create_inval_size(void)
+{
+	struct rte_reorder_buffer *b = NULL;
+
+	b = rte_reorder_create("PKT", rte_socket_id(), REORDER_BUFFER_SIZE_INVALID);
+	TEST_ASSERT_EQUAL(b, NULL,
+			"No error on create() with invalid buffer size param.");
+	TEST_ASSERT_EQUAL(rte_errno, EINVAL,
+			"No error on create() with invalid buffer size param.");
+	return 0;
+}
+
+static int
+test_reorder_init_null_buffer(void)
+{
+	struct rte_reorder_buffer *b = NULL;
+	/*
+	 * The minimum memory area size that should be passed to library is,
+	 * sizeof(struct rte_reorder_buffer) + (2 * size * sizeof(struct rte_mbuf *));
+	 * Otherwise error will be thrown
+	 */
+	unsigned int mzsize = 262336;
+	b = rte_reorder_init(b, mzsize, "PKT1", REORDER_BUFFER_SIZE);
+	TEST_ASSERT_EQUAL(b, NULL, "No error on init with NULL buffer.");
+	TEST_ASSERT_EQUAL(rte_errno, EINVAL, "No error on init with NULL buffer.");
+	return 0;
+}
+
+static int
+test_reorder_init_inval_mzsize(void)
+{
+	struct rte_reorder_buffer *b = NULL;
+	unsigned int mzsize = 100;
+	b = rte_malloc(NULL,
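The mzsize value of 262336 used in test_reorder_init_null_buffer is consistent with the formula in its comment if pointers are 8 bytes wide: 2 * 16384 * 8 = 262144, which leaves 192 bytes for the buffer header. The helper below only restates that arithmetic; the 192-byte header size is inferred from the subtraction rather than taken from the patch, and reorder_min_memsize is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Minimum memory area per the test comment:
 *   header + 2 * size * sizeof(pointer)
 * header_bytes stands in for sizeof(struct rte_reorder_buffer).
 */
static size_t
reorder_min_memsize(size_t header_bytes, size_t size, size_t ptr_bytes)
{
	assert(ptr_bytes > 0);
	return header_bytes + 2 * size * ptr_bytes;
}
```

With header_bytes = 192 (assumed), size = 16384 and ptr_bytes = 8, the helper yields the 262336 used in the test.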
[dpdk-dev] [PATCH 3/3] librte_reorder: New sample app for reorder library
From: Reshma Pattan

* The sample application consists of RX, worker and TX threads.
* The RX thread marks the seqn field of mbufs upon receiving them from the driver. Marked mbufs are enqueued in a multi-consumer ring.
* Worker threads dequeue mbufs from the multi-consumer ring and perform an XOR on the input port value of each mbuf. Processed mbufs are enqueued to another ring for TX.
* The TX thread dequeues the mbufs from that ring and hands them over to the reorder library for reordering before sending them out.

Signed-off-by: Reshma Pattan
---
 examples/packet_ordering/Makefile | 50 +++
 examples/packet_ordering/main.c | 637 ++
 2 files changed, 687 insertions(+)
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c

diff --git a/examples/packet_ordering/Makefile b/examples/packet_ordering/Makefile
new file mode 100644
index 000..44bd2e1
--- /dev/null
+++ b/examples/packet_ordering/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overriden by command line or environment
+RTE_TARGET ?= x86_64-ivshmem-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = packet_ordering
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
new file mode 100644
index 000..8b65275
--- /dev/null
+++ b/examples/packet_ordering/main.c
@@ -0,0 +1,637 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF
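The RX and worker stages described in the commit message above might look roughly like this when stripped of rings, threads and the DPDK API. It is a sketch under assumptions: struct pkt stands in for struct rte_mbuf, the function names are hypothetical, and XOR-ing the port with 1 (pairing ports 0 and 1, 2 and 3) is a guess, since the message does not say which value is XOR-ed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {            /* stand-in for struct rte_mbuf */
	uint32_t seqn;      /* sequence number stamped by the RX stage */
	uint8_t port;       /* input port, rewritten by the worker stage */
};

/* RX stage: stamp every packet of a burst with the next sequence number. */
static void
rx_mark_seqn(struct pkt *burst, size_t n, uint32_t *next_seqn)
{
	assert(burst != NULL && next_seqn != NULL);
	for (size_t i = 0; i < n; i++)
		burst[i].seqn = (*next_seqn)++;
}

/* Worker stage: XOR the input port (assumed XOR value: 1). */
static void
worker_xor_port(struct pkt *p)
{
	p->port ^= 1;
}
```

Because several workers run in parallel, packets can reach the TX stage in a different order than they were stamped, which is what the reorder library is meant to undo before transmission.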
[dpdk-dev] [PATCH 1/3] librte_reorder: New reorder library
On Wed, Jan 07, 2015 at 04:39:11PM +, Reshma Pattan wrote:
> From: Reshma Pattan
>
> 1) New library to reorder out-of-order mbufs based on the sequence
> number of each mbuf. The library uses a reorder buffer structure,
> which in turn uses two circular buffers called the ready and order
> buffers.
> * rte_reorder_create API creates an instance of a reorder buffer.
> * rte_reorder_init API initializes a given reorder buffer instance.
> * rte_reorder_reset API resets a given reorder buffer instance.
> * rte_reorder_insert API inserts an mbuf into the order circular buffer.
> * rte_reorder_fill_overflow moves mbufs from the order buffer to the
>   ready buffer to accommodate early packets in the order buffer.
> * rte_reorder_drain API drains reordered mbufs from the order and
>   ready buffers.
>
> Signed-off-by: Reshma Pattan
> Signed-off-by: Richardson Bruce
> ---
>  config/common_bsdapp | 5 +
>  config/common_linuxapp | 5 +
>  lib/Makefile | 1 +
>  lib/librte_eal/common/include/rte_tailq_elem.h | 2 +
>  lib/librte_mbuf/rte_mbuf.h | 3 +
>  lib/librte_reorder/Makefile | 50 +++
>  lib/librte_reorder/rte_reorder.c | 464 +
>  lib/librte_reorder/rte_reorder.h | 184 ++
>  8 files changed, 714 insertions(+)
>  create mode 100644 lib/librte_reorder/Makefile
>  create mode 100644 lib/librte_reorder/rte_reorder.c
>  create mode 100644 lib/librte_reorder/rte_reorder.h
> +
> +int
> +rte_reorder_insert(struct rte_reorder_buffer *b, struct rte_mbuf *mbuf)
> +{
> +	uint32_t offset, position;
> +	struct cir_buffer *order_buf = &b->order_buf;
> +
> +	/*
> +	 * calculate the offset from the head pointer we need to go.
> +	 * The subtraction takes care of the sequence number wrapping.
> +	 * For example (using 16-bit for brevity):
> +	 *	min_seqn  = 0xFFFD
> +	 *	mbuf_seqn = 0x0010
> +	 *	offset    = 0x0010 - 0xFFFD = 0x13
> +	 */
> +	offset = mbuf->seqn - b->min_seqn;
> +
> +	/*
> +	 * action to take depends on offset.
> +	 * offset < buffer->size: the mbuf fits within the current window of
> +	 *    sequence numbers we can reorder. EXPECTED CASE.
> +	 * offset > buffer->size: the mbuf is outside the current window. There
> +	 *    are a number of cases to consider:
> +	 *    1. The packet sequence is just outside the window, then we need
> +	 *       to see about shifting the head pointer and taking any ready
> +	 *       to return packets out of the ring. If there was a delayed
> +	 *       or dropped packet preventing drains from shifting the window
> +	 *       this case will skip over the dropped packet instead, and any
> +	 *       packets dequeued here will be returned on the next drain call.
> +	 *    2. The packet sequence number is vastly outside our window, taken
> +	 *       here as having offset greater than twice the buffer size. In
> +	 *       this case, the packet is probably an old or late packet that
> +	 *       was previously skipped, so just enqueue the packet for
> +	 *       immediate return on the next drain call, or else return error.
> +	 */
> +	if (offset < b->order_buf.size) {
> +		position = (order_buf->head + offset) & order_buf->mask;
> +		order_buf->entries[position] = mbuf;
> +	} else if (offset < 2 * b->order_buf.size) {
> +		if (rte_reorder_fill_overflow(b, offset - order_buf->size) <
> +				offset - order_buf->size) {
> +			/* Put in handling for enqueue straight to output */
> +			rte_errno = ENOSPC;
> +			return -1;
> +		}
> +		offset = mbuf->seqn - b->min_seqn;
> +		position = (order_buf->head + offset) & order_buf->mask;
> +		order_buf->entries[position] = mbuf;
> +	} else {
> +		/* Put in handling for enqueue straight to output */
> +		rte_errno = ERANGE;
> +		return -1;
> +	}

How does this work if you get two packets with the same sequence number? That situation seems like it would happen frequently with your example app, and from my reading of the above, you just wind up overwriting the same pointer in the entries array here, which leads to silent packet loss.
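Two points from this exchange can be checked in isolation: the wrap-safe offset computation from the patch comment, and the overwrite the reviewer describes. The sketch below reproduces the 16-bit example and shows one possible guard that refuses to overwrite an occupied slot. The guard is an illustration of the concern only, not something the patch does, and seqn_offset16 and slot_store are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance from min_seqn to seqn modulo 2^16, as in the patch comment. */
static uint16_t
seqn_offset16(uint16_t seqn, uint16_t min_seqn)
{
	return (uint16_t)(seqn - min_seqn);
}

/*
 * Hypothetical guarded store: an occupied slot means a packet with the
 * same sequence number is already parked, so report it instead of
 * silently replacing the pointer.
 */
static int
slot_store(void **entries, uint32_t mask, uint32_t head, uint32_t offset,
		void *pkt)
{
	uint32_t pos;

	assert(entries != NULL && pkt != NULL);
	pos = (head + offset) & mask;
	if (entries[pos] != NULL)
		return -1; /* duplicate sequence number; caller decides what to do */
	entries[pos] = pkt;
	return 0;
}
```

With min_seqn = 0xFFFD and seqn = 0x0010, seqn_offset16() gives 0x13, matching the comment. Storing two packets at the same offset makes the second slot_store() call fail rather than drop the first packet unnoticed.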
[dpdk-dev] [PATCH 2/2] devargs: remove limit on parameters length
On Wed, 7 Jan 2015 14:03:29 +0100 David Marchand wrote:

> +	buf = strdup(devargs_str);
> +	if (buf == NULL) {
> +		RTE_LOG(ERR, EAL, "cannot allocate temp memory for devargs\n");
> +		goto fail;
> +	}
> +

If the string is only used within the same function, you might consider using strdupa(), which avoids having to worry about freeing it in error paths.
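A rough sketch of that suggestion follows, assuming glibc, where strdupa() is a GNU extension exposed by <string.h> when _GNU_SOURCE is defined. The copy lives on the stack and disappears when the function returns, so no free() is needed on any path. The function count_fields is a hypothetical example, not code from the devargs patch. One caveat to weigh: because the copy is made with alloca(), an unbounded input length can exhaust the stack, which matters for a patch whose goal is to remove the length limit.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* strdupa() is a GNU extension */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the comma-separated fields of a devargs-style string. */
static size_t
count_fields(const char *devargs_str)
{
	char *buf, *tok, *save = NULL;
	size_t n = 0;

	assert(devargs_str != NULL);
	buf = strdupa(devargs_str); /* stack copy, released on return */
	for (tok = strtok_r(buf, ",", &save); tok != NULL;
			tok = strtok_r(NULL, ",", &save))
		n++;
	return n; /* no free() and no cleanup label needed */
}
```

Any early return added later to such a function stays leak-free, which is the point of the suggestion; the trade-off is the stack usage noted above.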