[dpdk-dev] How to know corresponding device from port number

2013-11-26 Thread Tetsuya.Mukawa
Hi,

I have a question about how to find the device that corresponds to a given
port number.
For example, if I have 4 Ethernet devices and 2 Ring PMDs, I get 6
ports during initialization.
In that case, how can I tell which port corresponds to the last Ring PMD?

Regards,
Tetsuya Mukawa


[dpdk-dev] How to know corresponding device from port number

2013-11-26 Thread Richardson, Bruce
> 
> Hi,
> 
> I have a question about how to know corresponding device from port
> number.
> For example, if I have 4 Ethernet devices and 2 Ring PMDs, I will get 6 ports
> during initialization.
> In the case, how can I know which port corresponds last Ring PMD?

[BR] Firstly, to distinguish the ring PMDs from the Ethernet device PMDs, you can
use the information in the rte_eth_dev structure. For each device x (0 <= x
<= 5), if you check rte_eth_devices[x], the ring PMDs will have a NULL driver
pointer, and the PCI address given in the pci_dev structure will be all zeros.
As for distinguishing two different ring ethdevs from each other, I'm not aware
of any way to do this; they will just have different eth_dev indexes.
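
A minimal sketch of the check described above. The structures are simplified, hypothetical stand-ins (names prefixed with sketch_), not the real rte_eth_dev or PCI definitions; only the idea of testing for a NULL driver pointer and an all-zero PCI address comes from the message. Picking the highest matching index as the "last" ring PMD additionally assumes that ports are numbered in creation order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the DPDK structures. */
struct sketch_pci_addr {
    uint16_t domain;
    uint8_t bus;
    uint8_t devid;
    uint8_t function;
};

struct sketch_eth_dev {
    const void *driver;              /* NULL for a ring PMD, per the message */
    struct sketch_pci_addr pci_addr; /* all zeros for a ring PMD */
};

/* True if the device looks like a ring PMD: no driver, zero PCI address. */
static bool sketch_is_ring_port(const struct sketch_eth_dev *dev)
{
    const struct sketch_pci_addr *a = &dev->pci_addr;

    return dev->driver == NULL &&
           a->domain == 0 && a->bus == 0 &&
           a->devid == 0 && a->function == 0;
}

/* Return the highest port index that looks like a ring PMD, or -1 if none. */
static int sketch_last_ring_port(const struct sketch_eth_dev *devs, int nb_ports)
{
    int i;

    for (i = nb_ports - 1; i >= 0; i--)
        if (sketch_is_ring_port(&devs[i]))
            return i;
    return -1;
}
```

In a real application the loop would run over rte_eth_devices[] with the actual field names of the DPDK release in use.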


[dpdk-dev] [PATCH] compilation fixes for ICC

2013-11-26 Thread Richardson, Bruce
Compilation fixes for ICC

ICC requires an initializer be given for the static variables, so
adding one in cases where one wasn't previously given.
---
 lib/librte_pmd_e1000/igb_rxtx.c   |    6 ++++--
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |    6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index 90c3227..68716b0 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -1134,7 +1134,8 @@ igb_reset_tx_queue_stat(struct igb_tx_queue *txq)
 static void
 igb_reset_tx_queue(struct igb_tx_queue *txq, struct rte_eth_dev *dev)
 {
-   static const union e1000_adv_tx_desc zeroed_desc;
+   static const union e1000_adv_tx_desc zeroed_desc = { .read = {
+   .buffer_addr = 0}};
struct igb_tx_entry *txe = txq->sw_ring;
uint16_t i, prev;
struct e1000_hw *hw;
@@ -1296,7 +1297,8 @@ eth_igb_rx_queue_release(void *rxq)
 static void
 igb_reset_rx_queue(struct igb_rx_queue *rxq)
 {
-   static const union e1000_adv_rx_desc zeroed_desc;
+   static const union e1000_adv_rx_desc zeroed_desc = { .read = {
+   .pkt_addr = 0}};
unsigned i;

/* Zero out HW ring memory */
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index ae9eda8..6eda8bc 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -1799,7 +1799,8 @@ ixgbe_dev_tx_queue_release(void *txq)
 static void
 ixgbe_reset_tx_queue(struct igb_tx_queue *txq)
 {
-   static const union ixgbe_adv_tx_desc zeroed_desc;
+   static const union ixgbe_adv_tx_desc zeroed_desc = { .read = {
+   .buffer_addr = 0}};
struct igb_tx_entry *txe = txq->sw_ring;
uint16_t prev, i;

@@ -2094,7 +2095,8 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct igb_rx_queue *rxq)
 static void
 ixgbe_reset_rx_queue(struct igb_rx_queue *rxq)
 {
-   static const union ixgbe_adv_rx_desc zeroed_desc;
+   static const union ixgbe_adv_rx_desc zeroed_desc = { .read = {
+   .pkt_addr = 0}};
unsigned i;
uint16_t len;

--
1.7.7.6
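
The pattern used in this patch, reduced to a stand-alone sketch. The union below is a hypothetical stand-in for the descriptor types touched by the patch, not their real layout. In standard C a static object without an initializer is zero-initialized anyway, so the explicit initializer exists only to satisfy ICC; the members of .read that are not named are implicitly set to zero.

```c
#include <stdint.h>

/* Hypothetical descriptor layout, loosely modeled on the unions in the patch. */
union sketch_tx_desc {
    struct {
        uint64_t buffer_addr;
        uint32_t cmd_type_len;
        uint32_t olinfo_status;
    } read;
    struct {
        uint64_t rsvd;
        uint32_t nxtseq_seed;
        uint32_t status;
    } wb;
};

/* Copy an all-zero template over every descriptor in the ring. */
static void sketch_reset_ring(union sketch_tx_desc *ring, unsigned n)
{
    /* Explicit initializer, as in the patch; unnamed .read fields are zero. */
    static const union sketch_tx_desc zeroed_desc = { .read = {
            .buffer_addr = 0 } };
    unsigned i;

    for (i = 0; i < n; i++)
        ring[i] = zeroed_desc;
}
```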



[dpdk-dev] Using dpdk in KVM guest with sr-iov pass-thru

2013-11-26 Thread Jaeyong Yoo
Hello,



I'm having trouble using the l2fwd example in a KVM guest with SR-IOV. Here
is the detailed description:

Symptoms:

If I run the l2fwd DPDK app, it does not receive any packets. Even worse, the
passed-through device in the KVM guest does not receive any interrupts and,
moreover, the PF on the host side does not receive any packets either. If I
destroy the KVM guest, the PF starts receiving packets (which is very weird,
right?).



Env:

-   SR-IOV card: Intel Corporation I350 Gigabit Network Connection (rev 01)
-   DPDK version: dpdk-1.5.1r1
-   KVM installed with the packages in Ubuntu server 64-bit
    -   Kernel version: 3.11.0-12-generic
-   KVM guest (Ubuntu server 64-bit; the same as the host)
-   CPU: i7-3770
-   IOMMU enabled



Do you have any similar issues or any comments or pointers?



Thanks,

Jaeyong



[dpdk-dev] Question: Can't make pcap and refcnt to match

2013-11-26 Thread Mats Liljegren
I have had stability problems when using pcap in my little
application. My application is a simple benchmark application that
tries to see how much data I can send and receive.

It has one lcore per NIC, where each lcore handles transmit and
receive. On the hardware, I make a loopback between two NICs, so the
NICs are in practice paired. I currently use 4 NICs and therefore 4
lcores. Port 0 sends to port 1 and vice versa. Port 2 sends to port 3
and vice versa. One pair uses the DPDK hardware driver against a dual
i350 NIC. The other pair uses pcap against two of the four
on-board NICs.

When enabling everything with "DEBUG" in its name in the .config
file, I get the following error:

PMD: rte_eth_dev_config_restore: port 1: MAC address array not supported
PMD: rte_eth_promiscuous_disable: Function not supported
PMD: rte_eth_allmulticast_disable: Function not supported
Speed: 1 Mbps, full duplex
Port 1 up and running.
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PANIC in rte_mbuf_sanity_check():
bad ref cnt
PANIC in rte_mbuf_sanity_check():
bad ref cnt
PMD: e1000_release_phy_82575(): e1000_release_phy_82575
PMD: e1000_release_swfw_sync_82575(): e1000_release_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: eth_igb_rx_queue_setup(): sw_ring=0x7fff776eefc0
hw_ring=0x7fff76830480 dma_addr=0x464630480

PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: To improve 1G driver performance, consider setting the TX WTHRESH
value to 4, 8, or 16.
PMD: eth_igb_tx_queue_setup(): sw_ring=0x7fff776ece40
hw_ring=0x7fff76840500 dma_addr=0x464640500

PMD: eth_igb_start(): >>
PMD: e1000_read_phy_reg_82580(): e1000_read_phy_reg_82580
PMD: e1000_acquire_phy_82575(): e1000_acquire_phy_82575
PMD: e1000_acquire_swfw_sync_82575(): e1000_acquire_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_get_cfg_done_82575(): e1000_get_cfg_done_82575
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_read_phy_reg_mdic(): e1000_read_phy_reg_mdic
9: [/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x772a89cd]]
8: [/lib/x86_64-linux-gnu/libpthread.so.0(+0x7f6e) [0x7757df6e]]
7: [/home/mlil/dpdk-demo/build/enea-demo(eal_thread_loop+0x1b9) [0x492669]]
6: [/home/mlil/dpdk-demo/build/enea-demo() [0x4150bc]]
5: [/home/mlil/dpdk-demo/build/enea-demo() [0x414d0b]]
4: [/home/mlil/dpdk-demo/build/enea-demo() [0x4116ef]]
3: [/home/mlil/dpdk-demo/build/enea-demo(rte_mbuf_sanity_check+0xa7) [0x484707]]
2: [/home/mlil/dpdk-demo/build/enea-demo(__rte_panic+0xc1) [0x40f788]]
1: [/home/mlil/dpdk-demo/build/enea-demo(rte_dump_stack+0x18) [0x493f68]]
PMD: e1000_release_phy_82575(): e1000_release_phy_82575
PMD: e1000_release_swfw_sync_82575(): e1000_release_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic

I checked the source code for pcap, and in the file rte_eth_pcap.c,
function eth_pcap_rx(), I made the following observation:

It pre-allocates a number of mbufs (64 to be exact). It then fills
these mbufs with data and returns them. The pre-allocation seems to
be done only once, and the mbufs are then re-used.

This confuses me. How does this work when more than 64 packets are
requested? I see no safety checks for this.

Isn't the application supposed to call rte_pktmbuf_free() on the returned
mbufs? If so, the pre-allocated mbufs will have been freed as far as
I can see and therefore cannot be re-used.

What am I missing here?

Regards
Mats


[dpdk-dev] Question: Can't make pcap and refcnt to match

2013-11-26 Thread Richardson, Bruce
Hi Mats,

Yes, you are right: there is an issue in the pcap driver in that it does not
allocate mbufs correctly. We are working on a fix.

Regards,
/Bruce
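
The failure mode discussed in this thread can be reproduced with a small stand-alone model. This is only a sketch: sketch_buf is a hypothetical reference-counted buffer, not struct rte_mbuf, and the "refcnt must be 1" test merely stands in for the kind of check that rte_mbuf_sanity_check() performs. The receive function hands out the same cached buffers on every call, as eth_pcap_rx() is described as doing, so once the application has freed them the next burst returns buffers whose reference count is already zero.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE 4   /* stands in for the 64 pre-allocated mbufs */

/* Hypothetical minimal buffer with a reference count; not struct rte_mbuf. */
struct sketch_buf {
    uint16_t refcnt;
};

static struct sketch_buf sketch_pool[SKETCH_CACHE];

/* Application side: releasing a buffer drops its reference count. */
static void sketch_free(struct sketch_buf *b)
{
    b->refcnt--;
}

/* Buggy RX: hands out the same cached buffers on every call. */
static unsigned sketch_rx_cached(struct sketch_buf **bufs, unsigned n)
{
    static int initialised;
    unsigned i;

    if (!initialised) {
        for (i = 0; i < SKETCH_CACHE; i++)
            sketch_pool[i].refcnt = 1;
        initialised = 1;
    }
    for (i = 0; i < n; i++)
        bufs[i] = &sketch_pool[i % SKETCH_CACHE];
    return n;
}

/* Count returned buffers that fail a "refcnt must be 1" sanity check. */
static unsigned sketch_count_bad(struct sketch_buf **bufs, unsigned n)
{
    unsigned i, bad = 0;

    for (i = 0; i < n; i++)
        if (bufs[i]->refcnt != 1)
            bad++;
    return bad;
}
```

The first burst passes the check; after the application frees those buffers, every buffer in the second burst fails it. A burst larger than the cache would also return the same buffer more than once.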



[dpdk-dev] Question: Can't make pcap and refcnt to match

2013-11-26 Thread Robert Sanford
Hi Bruce,

We also found buffer overflow problems with the pcap driver: 1) the frame may
be longer than the mbuf; 2) caplen may be less than the original packet length.

I've been meaning to submit a change, but I'm not familiar with the process.

Here is a diff of the relevant code in rte_eth_pcap.c:



  if (unlikely(mbuf == NULL))
  break;
- rte_memcpy(mbuf->pkt.data, packet, header.len);
- mbuf->pkt.data_len = (uint16_t)header.len;
- mbuf->pkt.pkt_len = mbuf->pkt.data_len;
+
+ /*
+ * Fix buffer overflow problems.
+ * 1. Frame may be longer than mbuf.
+ * 2. Capture length (caplen) may be less than original packet length.
+ */
+ uint16_t len = (uint16_t)header.caplen;
+ uint16_t tailroom = rte_pktmbuf_tailroom(mbuf);
+ if (len > tailroom)
+ len = tailroom;
+
+ /*
+ RTE_LOG(INFO, PMD, "eth_pcap_rx: i=%u caplen=%u framelen=%u tail=%u len=%u\n",
+ i, header.caplen, header.len, tailroom, len);
+ */
+
+ rte_memcpy(mbuf->pkt.data, packet, len);
+ mbuf->pkt.data_len = len;
+ mbuf->pkt.pkt_len = len;
+
  bufs[i] = mbuf;
  num_rx++;
  }



Regards,
Robert
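
The length clamp in the diff above can be isolated as a small helper. This is a sketch with hypothetical names (sketch_copy_len, sketch_copy_packet) and plain memcpy in place of rte_memcpy. One deliberate difference from the diff: the comparison is done on the 32-bit capture length before narrowing to 16 bits, so a caplen above 65535 is clamped rather than truncated by the cast.

```c
#include <stdint.h>
#include <string.h>

/* Number of bytes that may safely be copied into a buffer with 'tailroom'
 * bytes of space, given a capture length of 'caplen' bytes. */
static uint16_t sketch_copy_len(uint32_t caplen, uint16_t tailroom)
{
    return (caplen > tailroom) ? tailroom : (uint16_t)caplen;
}

/* Copy at most 'room' bytes of a captured packet; returns bytes copied. */
static uint16_t sketch_copy_packet(uint8_t *dst, uint16_t room,
                                   const uint8_t *pkt, uint32_t caplen)
{
    uint16_t len = sketch_copy_len(caplen, room);

    memcpy(dst, pkt, len);
    return len;
}
```

The returned length is what would then be stored in both data_len and pkt_len, as the diff does.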





[dpdk-dev] Question: Can't make pcap and refcnt to match

2013-11-26 Thread Thomas Monjalon
Hello,

26/11/2013 16:42, Robert Sanford :
> I've been meaning to submit a change, but I'm not familiar with the
> process.

The process is to send your patch with git (format-patch + send-email).
You have to set a short title and a longer commit log explaining what the
problem was and how you fixed it.
The commit log must have a Signed-off-by line (see "Developer's Certificate of
Origin" in https://www.kernel.org/doc/Documentation/SubmittingPatches).

> + /*
> + * Fix buffer overflow problems.
> + * 1. Frame may be longer than mbuf.
> + * 2. Capture length (caplen) may be less than original packet length.
> + */

This should be in the commit log.
Keep only comments needed to understand the code.

> + /*
> + RTE_LOG(INFO, PMD, "eth_pcap_rx: i=%u caplen=%u framelen=%u tail=%u len=%u\n",
> + i, header.caplen, header.len, tailroom, len);
> + */

Why is it commented out?
If it's important, it is an INFO log.
If it's useful when debugging, set it to DEBUG.
If it's a temporary debug, remove it.

By the way, thank you for your patch.
-- 
Thomas


[dpdk-dev] [PATCH] pmd_pcap: fixed incorrect mbuf allocation

2013-11-26 Thread Richardson, Bruce
The mbufs returned by the pcap pmd RX function were constantly
reused, instead of being allocated on demand. This has been fixed.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c |   37 ++++++++++++++++++++++++++-----------
 1 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c b/lib/librte_pmd_pcap/rte_eth_pcap.c
index 19d19b3..8a98471 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -118,32 +118,47 @@ eth_pcap_rx(void *queue,
struct pcap_pkthdr header;
const u_char *packet;
struct rte_mbuf *mbuf;
-   static struct rte_mbuf *mbufs[RTE_ETH_PCAP_MBUFS] = { 0 };
struct pcap_rx_queue *pcap_q = queue;
+   struct rte_pktmbuf_pool_private *mbp_priv;
uint16_t num_rx = 0;
+   uint16_t buf_size;

if (unlikely(pcap_q->pcap == NULL || nb_pkts == 0))
return 0;

-   if(unlikely(!mbufs[0]))
-   for (i = 0; i < RTE_ETH_PCAP_MBUFS; i++)
-   mbufs[i] = rte_pktmbuf_alloc(pcap_q->mb_pool);
-
/* Reads the given number of packets from the pcap file one by one
 * and copies the packet data into a newly allocated mbuf to return.
 */
for (i = 0; i < nb_pkts; i++) {
-   mbuf = mbufs[i % RTE_ETH_PCAP_MBUFS];
+   /* Get the next PCAP packet */
packet = pcap_next(pcap_q->pcap, &header);
if (unlikely(packet == NULL))
break;
+   else
+   mbuf = rte_pktmbuf_alloc(pcap_q->mb_pool);
if (unlikely(mbuf == NULL))
break;
-   rte_memcpy(mbuf->pkt.data, packet, header.len);
-   mbuf->pkt.data_len = (uint16_t)header.len;
-   mbuf->pkt.pkt_len = mbuf->pkt.data_len;
-   bufs[i] = mbuf;
-   num_rx++;
+
+   /* Now get the space available for data in the mbuf */
+   mbp_priv = (struct rte_pktmbuf_pool_private *)
+   ((char *)pcap_q->mb_pool + sizeof(struct rte_mempool));
+   buf_size = (uint16_t) (mbp_priv->mbuf_data_room_size -
+   RTE_PKTMBUF_HEADROOM);
+
+   if (header.len <= buf_size) {
+   /* pcap packet will fit in the mbuf, go ahead and copy */
+   rte_memcpy(mbuf->pkt.data, packet, header.len);
+   mbuf->pkt.data_len = (uint16_t)header.len;
+   mbuf->pkt.pkt_len = mbuf->pkt.data_len;
+   bufs[i] = mbuf;
+   num_rx++;
+   } else {
+   /* pcap packet will not fit in the mbuf, so drop packet */
+   RTE_LOG(ERR, PMD,
+   "PCAP packet %d bytes will not fit in mbuf (%d bytes)\n",
+   header.len, buf_size);
+   rte_pktmbuf_free(mbuf);
+   }
}
pcap_q->rx_pkts += num_rx;
return num_rx;
--
1.7.7.6


[dpdk-dev] l2fwd program reported 100Mbps on a 10Gbps physical port using virtio or e1000 port in CentOS guest OS using DPDK 1.3.1r2

2013-11-26 Thread James Yu
I have a Ubuntu 12.04.3 LTS (Linux 3.2.0-53-generic) KVM host. The guest OS
is a CentOS 32bit (CentOS 6.2, Linux 2.6.32-220.el6.i686). There are two
10G ports on the KVM host with the following kvm.
root at openstack1:~# kvm --version
QEMU emulator version 1.2.0 (qemu-kvm-1.2.0), Copyright (c) 2003-2008
Fabrice Bellard
root at openstack1:~# libvirtd --version
libvirtd (libvirt) 0.9.8


The DPDK l2fwd runs inside a CentOS 6.2 OS (2.6.32-220.el6.i686 32bit
Linux) on a RHEL 6.1 KVM host (2.6.32-131.0.15.e16.x86_64).

THE PROBLEM:
Using DPDK 1.3.1r2, I got the following message to indicate the port speed
of a virtio port that l2fwd listened to for received packets. It reports as
100Mbps only although the physical port is a 10G port (Intel x520 PN
49Y7960, http://www.redbooks.ibm.com/abstracts/tips0893.html?Open). Is this
expected on the DPDK side ?


Checking link status done
Port 0 Link Up - speed 100 Mbps - full-duplex

Port statistics 
Statistics for port 0 --
Packets sent:0
Packets received:0
Packets dropped: 0
Aggregate statistics ===
Total packets sent:  0
Total packets received:  0
Total packets dropped:   0


The KVM information:
[root at rh188 ~]# libvirtd --version
libvirtd (libvirt) 0.10.2
[root at rh188 ~]# /usr/libexec/qemu-kvm --version
QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c)
2003-2008 Fabrice Bellard

--
On the KVM host, physical port eth6 is 10Gbps
[root at rh188 ~]# ethtool eth6
Settings for eth6:
Supported ports: [ FIBRE ]
Supported link modes:   1baseT/Full
Supports auto-negotiation: No
Advertised link modes:  1baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 1Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x0007 (7)
Link detected: yes

On KVM host, virtual interface used by the KVM guest to receive packets,
10Mbps
[root at rh188 ~]# ethtool eth6-client6
Settings for eth6-client6:
Supported ports: [ ]
Supported link modes:
Supports auto-negotiation: No
Advertised link modes:  Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
MDI-X: Unknown
Current message level: 0xffa1 (-95)
Link detected: yes

eth6 and eth6-client6 are virtually connected to the br6 bridge
[root at rh188 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
br6             8000.90e2ba341e54       no              eth6
                                                        eth6-client6

The l2fwd command running on the guest VM
100Mbps reported on l2fwd for a one port setup (receiving only)
/root/dpdk/dpdk-1.3.1r2/examples/l2fwd/build/l2fwd -c 3 -n 1 -b 000:00:03.0
-b 000:00:07.0 -b 000:00:0a.0 -b 000:00:09.0 -- -q 1 -p 1

the same (100Mbps) reported on l2fwd for a two ports setup (looping back
received traffic to the other port) too
/root/dpdk/dpdk-1.3.1r2/examples/l2fwd/build/l2fwd -c 3 -n 1 -b 000:00:03.0
-b 000:00:07.0 -b 000:00:0a.0 -- -q 2 -p 3

Thanks

James


[dpdk-dev] Regarding VM live migration with SRIOV

2013-11-26 Thread Stephen Hemminger
On Wed, 27 Nov 2013 10:09:09 +0530
Prashant Upadhyaya  wrote:

> Hi,
> 
> Let me be more specific.
> Does DPDK support hot plugin/plugout of PCI devices ?
> What typically needs to be done if this is to be achieved inside an 
> application.
> 
> Typically, the NIC PF or VF appears to the DPDK application as a PCI device 
> which is probed at startup.
> Now what happens if I insert a new VF dynamically and want to use it inside 
> the DPDK application (while it is already running), how should this typically 
> be done ? [hotplugin]
> And what happens if the DPDK application is in control of a PCI device and 
> that PCI device is suddenly removed ? How can the application detect this and 
> stop doing data transfer on this and sort of unload it ? [hotplugout]
> 
> If the above can be coded inside the DPDK app, then we can think of live VM 
> migration with SRIOV -- just hotplugin and plugout the VF's.
> 
> Regards
> -Prashant
> 

The current implementation does not look like it supports hotplug.
All devices are discovered during rte_eal_pci_probe.



[dpdk-dev] Increasing number of txd and rxd from 256 to 1024 for virtio-net-pmd-1.1

2013-11-26 Thread James Yu
Running one-directional traffic from a Spirent traffic generator to l2fwd
running inside a guest OS on a RHEL 6.2 KVM host, I encountered a performance
issue and need to increase the number of rxd and txd from 256 to 1024.
There were not enough freeslots for packets to be transmitted in this routine:
  virtio_send_packet(){
  
if (tq->freeslots < nseg + 1) {
return -1;
}
  
  }

How can I solve the performance issue? The options I see are:
1. Increase the number of rxd and txd from 256 to 1024.
This should prevent packets from being dropped because the ring lacks
freeslots. But l2fwd fails to run and indicates that the number must
be equal to 256.
2. Increase MAX_PKT_BURST.
But this is not ideal, since it will increase the delay while
improving the throughput.
3. Some other mechanism that you know can improve it?
Is there any other approach to have enough freeslots to store the
packets before passing them down to PCI?


Thanks

James


These are the performance numbers I measured from the l2fwd printout for the
receiving part. I added code inside l2fwd to do the tx part.

vhost-net is enabled on KVM host, # of cache buffer 4096, Ubuntu 12.04.3
LTS (3.2.0-53-generic); kvm 1.2.0, libvirtd: 0.9.8
64 Bytes/pkt from Spirent @ 223k pps, running test for 10 seconds.

DPDK 1.3 + virtio + 256 txd/rxd + nice -19 priority (l2fwd, guest kvm
process)
bash command: nice -n -19
/root/dpdk/dpdk-1.3.1r2/examples/l2fwd/build/l2fwd -c 3 -n 1 -b 000:00:03.0
-b 000:00:07.0 -b 000:00:0a.0 -b 000:00:09.0 -d
/root/dpdk/virtio-net-pmd-1.1/librte_pmd_virtio.so -- -q 1 -p 1

Spirent -> l2fwd (receiving 10G) (RX on KVM guest)
MAX_PKT_BURST   10 seconds (<1% loss), packets per second
---
32              74k pps
64              80k pps
128             126k pps
256             133k pps

l2fwd -> Spirent (10G port) (transmitting) (using one-directional one port
(port 0) setup)
MAX_PKT_BURST   < 1% packet loss
32              88k pps


**
The same test run on e1000 ports


DPDK 1.3 + e1000 + 1024 txd/rxd + nice -19 priority (l2fwd, guest kvm
process)
bash command: nice -n -19
/root/dpdk/dpdk-1.3.1r2/examples/l2fwd/build/l2fwd -c 3 -n 1 -b 000:00:03.0
-b 000:00:07.0 -b 000:00:0a.0 -b 000:00:09.0 -- -q 1 -p 1

Spirent -> l2fwd (RECEIVING 10G)
MAX_PKT_BURST   <= 1% packet loss
32              110k pps

l2fwd -> Spirent (10G port) (TRANSMITTING) (using one-directional one port
(port 0) setup)
MAX_PKT_BURST   pkts transmitted on l2fwd
32              171k pps (0% dropped)
240             203k pps (6% dropped, 130k pps received on
                eth6 (assumed on Spirent)) **
**: not enough freeslots in tx ring
==> This indicates the effect of a small txd/rxd count (256): when more
traffic is generated, the packets cannot be sent due to a lack of freeslots
in the tx ring. I guess this is the symptom that occurs in virtio_net.
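
The effect of the freeslots test on a burst can be modeled in a few lines. This is a sketch under stated assumptions: sketch_txq is a hypothetical structure in which only the freeslots field mirrors the quoted routine, each packet is taken to consume nseg + 1 slots as the quoted condition suggests, and the host is assumed to return no slots while the burst is being queued.

```c
#include <stdint.h>

/* Hypothetical TX queue bookkeeping; only 'freeslots' mirrors the snippet. */
struct sketch_txq {
    uint16_t freeslots;
};

/* Same test as in the quoted routine: refuse the packet when fewer than
 * nseg + 1 slots are free, otherwise consume them. */
static int sketch_send_packet(struct sketch_txq *tq, uint16_t nseg)
{
    if (tq->freeslots < nseg + 1)
        return -1;
    tq->freeslots = (uint16_t)(tq->freeslots - (nseg + 1));
    return 0;
}

/* Queue a burst of single-segment packets; return how many were accepted. */
static unsigned sketch_send_burst(struct sketch_txq *tq, unsigned burst)
{
    unsigned sent = 0;

    while (sent < burst && sketch_send_packet(tq, 1) == 0)
        sent++;
    return sent;
}
```

Under these assumptions a ring with 256 free slots accepts only 128 packets of a 240-packet burst, while a ring with 1024 free slots accepts all 240.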


[dpdk-dev] Regarding VM live migration with SRIOV

2013-11-26 Thread Stephen Hemminger
On Wed, 27 Nov 2013 11:39:28 +0530
Prashant Upadhyaya  wrote:

> Hi Stephen,
> 
> The rte_eal_pci_probe is typically called at the startup.
> 
> Now let's say a DPDK application is running with a PCI device (doing tx and 
> rx) and I remove that PCI device underneath (hot plugout)
> So how does the application now know that the device is gone ?
> 
> Is it that rte_eal_pci_probe should be called periodically from, let's say, 
> the slow control path of the DPDK application ?
> 
> Regards
> -Prashant
> 

Like I said, the current code doesn't do hotplug.
If you wanted to add it, you would have to refactor the PCI management layer.




[dpdk-dev] Increasing number of txd and rxd from 256 to 1024 for virtio-net-pmd-1.1

2013-11-26 Thread Stephen Hemminger
On Tue, 26 Nov 2013 21:15:02 -0800
James Yu  wrote:

> Running one directional traffic from Spirent traffic generator to l2fwd
> running inside a guest OS on a RHEL 6.2 KVM host, I encountered performance
> issue and need to increase the number of rxd and txd from 256 to 1024.

The number of slots with virtio is a parameter negotiated with the host,
so it won't work unless the host (KVM) gives the device more slots.
I have a better virtio driver, and among the features being added are multiqueue
and merged TX buffer support, which would give a bigger queue.