[dpdk-dev] Segmentation fault when creating a hash table in the example of pipeline (ROUTING type)
Hi all,

I have tried to run the pipeline example with a ROUTING pipeline where I configured n_arp_entries=8 as well as arp_key_offset=192. It compiles perfectly, but when it reaches the call to rte_pipeline_table_create (see below a fragment of the source code of the pipeline_routing_init function in the pipeline_routing.be.cpp file) it always crashes with a segmentation fault inside the rte_table_hash_create_key8_ext() function. I am completely stuck on this issue. Has any bug been reported regarding this table creation function?

/* ARP table configuration */
if (p_rt->params.n_arp_entries) {
	struct rte_table_hash_key8_ext_params table_arp_params;

	table_arp_params.n_entries = p_rt->params.n_arp_entries;
	table_arp_params.n_entries_ext = p_rt->params.n_arp_entries;
	table_arp_params.f_hash = hash_default_key8;
	table_arp_params.seed = 0;
	table_arp_params.signature_offset = 0; /* Unused */
	table_arp_params.key_offset = p_rt->params.arp_key_offset;

	struct rte_pipeline_table_params table_params = {
		.ops = &rte_table_hash_key8_ext_dosig_ops,
		.arg_create = &table_arp_params,
		.f_action_hit = get_arp_table_ah_hit(p_rt),
		.f_action_miss = NULL,
		.arg_ah = p_rt,
		.action_data_size = sizeof(struct arp_table_entry) -
			sizeof(struct rte_pipeline_table_entry),
	};

	int status;

	status = rte_pipeline_table_create(p->p, &table_params,
		&p->table_id[1]);
	if (status) {
		rte_pipeline_free(p->p);
		rte_free(p);
		return NULL;
	}
	p->n_tables++;
}

Thanks

--
Victor
[dpdk-dev] Tx vlan offload problem with igb and DPDK v17.11
Hi all,

I have realized that the PKT_TX_VLAN_PKT flag for Tx VLAN offload doesn't work in my application. According to the NICs I have (IGB), there seems to be a problem between this Tx VLAN offload feature and this version of DPDK, as reported in Bug 17: https://bugs.dpdk.org/show_bug.cgi?id=17

I have tested it using the vfio-pci and igb_uio drivers, as well as SW VLAN insertion (rte_vlan_insert), and the result is exactly the same. Has this bug been solved so far?

These are my NICs:

04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Super Micro Computer Inc Device 10c9
	Flags: fast devsel, IRQ 17
	Memory at fafe (32-bit, non-prefetchable) [disabled] [size=128K]
	Memory at fafc (32-bit, non-prefetchable) [disabled] [size=128K]
	I/O ports at ec00 [disabled] [size=32]
	Memory at fafbc000 (32-bit, non-prefetchable) [disabled] [size=16K]
	[virtual] Expansion ROM at faf8 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable- Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-30-48-ff-ff-bb-17-02
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: vfio-pci
	Kernel modules: igb

04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Super Micro Computer Inc Device 10c9
	Flags: fast devsel, IRQ 16
	Memory at faf6 (32-bit, non-prefetchable) [disabled] [size=128K]
	Memory at faf4 (32-bit, non-prefetchable) [disabled] [size=128K]
	I/O ports at e880 [disabled] [size=32]
	Memory at faf3c000 (32-bit, non-prefetchable) [disabled] [size=16K]
	[virtual] Expansion ROM at faf0 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable- Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-30-48-ff-ff-bb-17-02
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: vfio-pci
	Kernel modules: igb

Thanks for your attention

Regards,

PD: BTW, I have observed that if I capture, for example, an ARP message in an rx queue from which the VLAN was stripped, the answer is sent correctly when I set the PKT_TX_VLAN_PKT flag and the VLAN_TCI is the same... However, if I try to set the VLAN header on a non-VLAN-stripped frame, then it doesn't work.

--
Victor
Re: [dpdk-dev] Tx vlan offload problem with igb and DPDK v17.11
Hi all,

I have worked around the PKT_TX_VLAN_PKT issue using the SW insertion function rte_vlan_insert. However, I would like to tell you what I have seen during my tests; I hope it can shed light on the issue for the developers. When I use m->ol_flags |= PKT_TX_VLAN_PKT, my Wireshark capture reveals that the 802.1Q header and the VLAN tag are attached to the output packet. The problem is the 'ether_proto' field of the VLAN header, which is set again to 0x8100 (VLAN) instead of 0x0800 (IPv4). Apart from this, the rest of the packet is correct. So if this is corrected in the driver, I think it will work.

Regards,

On Mon, 3 Sep 2018 at 19:32, Victor Huertas () wrote:
> Hi all,
>
> I have realized that the PKT_TX_VLAN_PKT flag for Tx VLAN offload doesn't
> work in my application.
>
> According to the NICs I have (IGB), there seems to be a problem between
> this Tx VLAN offload feature and this version of DPDK, as reported in
> Bug 17: https://bugs.dpdk.org/show_bug.cgi?id=17
>
> I have tested it using the vfio-pci and igb_uio drivers, as well as SW
> VLAN insertion (rte_vlan_insert), and the result is exactly the same.
>
> Has this bug been solved so far?
> These are my NICs:
>
> [lspci output for the two 82576 NICs snipped; quoted in full above]
>
> Thanks for your attention
>
> Regards,
>
> PD: BTW, I have observed that if I capture, for example, an ARP message
> in an rx queue from which the VLAN was stripped, the answer is sent
> correctly when I set the PKT_TX_VLAN_PKT flag and the VLAN_TCI is the
> same... However, if I try to set the VLAN header on a non-VLAN-stripped
> frame, then it doesn't work.
>
> --
> Victor

--
Victor
Re: [dpdk-dev] Tx vlan offload problem with igb and DPDK v17.11
Forget about it, I found a bug in my software. Once solved, no problem with PKT_TX_VLAN_PKT at all.

Regards,

On Mon, 3 Sep 2018 at 19:32, Victor Huertas () wrote:
> Hi all,
>
> I have realized that the PKT_TX_VLAN_PKT flag for Tx VLAN offload doesn't
> work in my application.
>
> [rest of the original message snipped; quoted in full above]

--
Victor
Re: [dpdk-dev] Nehalem Intel Xeon X5506 architecture and PCIe NIC association to Numa node
Bruce,

My requirements are not that demanding (500 Mbps, with 1 Gbps desirable). Thanks for the reference links; I will have a look at them.

Regarding NIC detection during DPDK EAL initialization after successfully loading vfio-pci, something strange happens. The call nb_ports = rte_eth_dev_count(); always returns 0, and therefore the app reports an error saying that "port 0 is not present on the board". The EAL seems to detect VFIO support, as the only two logs shown at initialization regarding VFIO are:

EAL: Probing VFIO support...
EAL: VFIO support initialized

And nothing else... Shouldn't the EAL detect the two NICs I bound to vfio-pci? Very strange...

Regards,

PD: I have changed the email account in order to avoid sending those disturbing disclaimers. Sorry.

-----Original Message-----
From: Bruce Richardson [mailto:bruce.richard...@intel.com]
Sent: Thursday, 8 February 2018 17:42
To: Huertas García, Víctor
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Nehalem Intel Xeon X5506 architecture and PCIe NIC association to Numa node

On Thu, Feb 08, 2018 at 04:27:36PM +, Huertas García, Víctor wrote:
> Hi all,
>
> After having tried many ways to make the PCIe NIC card appear associated
> to a NUMA node, I haven't been able to do it. That is, every time I look
> at which NUMA node it belongs to, it always returns -1.
>
> $ cat /sys/bus/pci/devices/\:04\:00.1/numa_node
> -1
>
> $ cat /sys/bus/pci/devices/\:04\:00.0/numa_node
> -1
>
> Using lstopo, I confirm that all PCI cards are "outside" of any NUMA node.
>
> I have read in previously posted messages in the dpdk-dev community that
> this is normal in the Nehalem-generation Xeon architecture and there is
> nothing I can do about it. Can somebody confirm this?

For that generation architecture, it is indeed expected. The NICs are not directly connected to any NUMA node.

> If so, what implications could this have on packet capture and performance?
Unsurprisingly, it's the case that newer platforms will perform better, as you are missing out on performance benefits from improved cores and also features like Intel® DDIO [1]. However, what I/O throughput are you hoping to get from your system? Depending on your requirements, what you have may be enough. Some people use DPDK on lower-end platforms because that is all that they need. You may also find the chart on slide 6 of [2] of use, to see how the max throughput of a platform has improved over time (and has improved further since that chart was published).

> Are the NICs available in my DPDK applications? Do I have to specifically
> "add" them by "-w 04:00.1 -w 04:00.0"?

Yes, your NICs will still be available, even without NUMA affinity, and no, you should not need to explicitly whitelist them - though you can if you want. So long as they are bound to a uio or vfio driver (e.g. igb_uio or vfio-pci), they should be detected by DPDK EAL init and made available to your app.

> Is RSS supported and usable from the DPDK application?

Yes, at least for Intel NICs, and probably most other DPDK-supported NICs too.

> Thanks a lot for your attention
>
> Victor

/Bruce

[1] https://www.intel.com/content/www/us/en/io/data-direct-i-o-technology.html
[2] https://dpdksummit.com/Archive/pdf/2016Germany/DPDK-2016-DPDK_FD_IO_Introduction.pdf

PS: This is a public list, so email disclaimers are rather pointless. It's best if they can be removed from mails sent here.

--
Victor
[dpdk-dev] Fwd: rte_eth_rx_queue_setup is returning error -28 (ENOSPC)
Hi all,

I am trying to write an application that has various rx threads capturing packets from various queues of the same NIC (to increase performance using RSS on one NIC). I am basing the app on the example called l3fwd-thread.

However, when I try to create the rx queue on a port using the rte_eth_rx_queue_setup function, it returns error -28 (ENOSPC). Having a look at the source code of rte_ethdev.c, where this function is implemented, I found the only place where ENOSPC is returned (see below):

if (mp->private_data_size < sizeof(struct rte_pktmbuf_pool_private)) {
	RTE_PMD_DEBUG_TRACE("%s private_data_size %d < %d\n",
		mp->name, (int) mp->private_data_size,
		(int) sizeof(struct rte_pktmbuf_pool_private));
	return -ENOSPC;
}

Stepping through the code (using Eclipse), I saw that the private_data_size of the pktmbuf pool was set to zero, and that was the reason why it returned -ENOSPC. Nevertheless, in the init_mem function of the example, when the pktmbuf pool is created, the value passed for the private data size parameter is 0:

if (pktmbuf_pool[socketid] == NULL) {
	snprintf(s, sizeof(s), "mbuf_pool_%d", socketid);
	pktmbuf_pool[socketid] = rte_pktmbuf_pool_create(s, nb_mbuf,
		MEMPOOL_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, socketid);
	if (pktmbuf_pool[socketid] == NULL)
		rte_exit(EXIT_FAILURE,
			"Cannot init mbuf pool on socket %d\n", socketid);
	else
		printf("Allocated mbuf pool on socket %d\n", socketid);

#if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)
	setup_lpm(socketid);
#else
	setup_hash(socketid);
#endif
}

So this seems contradictory: why does the example initialize the private data size to 0 if that then provokes a bad rx queue setup? Or am I understanding it wrongly?

Thanks for your attention,

--
Victor
[dpdk-dev] (SOLVED) rte_eth_rx_queue_setup is returning error -28 (ENOSPC)
I have found the solution to the problem myself. When the array struct rte_mempool *pktmbuf_pool[NB_SOCKETS]; was declared, it was not initialized to NULL, so the memory pool was never created because the clause 'if (pktmbuf_pool[socketid] == NULL)' skipped it. Once I forced the initialization of all pktmbuf_pool[NB_SOCKETS] positions to NULL, it worked well.

Regards,

--
Victor
[dpdk-dev] Suggestions on how to customize the metadata fields of each packet
Hi all,

In the project I am working on, I need to define custom metadata fields on each packet once they are captured using DPDK. I have seen that each packet has a headroom memory space (128 bytes long) where RSS hashing and other metadata provided by the NIC is stored. This information will be useful for me, but I need to add some further fields to this metadata without having to modify the source code of the library.

Do you know which is the best practice to do it? Do I, for example, have to define a struct that contains the rte_mbuf struct plus an additional metadata struct?

struct new_packet {
	struct rte_mbuf mbuf;
	struct additional_metadata metadata;
};

I would appreciate it if you could provide me with some guidelines on how to implement it.

Thanks a lot

--
Victor
Re: [dpdk-dev] Suggestions on how to customize the metadata fields of each packet
Thanks for your quick answer,

I have read so many documents and web pages on this issue that I probably confused the purpose of the headroom. It is good to know that this 128-byte space is available at my disposal. The fact that it is lost once the NIC transmits the frame is not a problem at all for my application.

However, in case this space is not enough, I have seen in the rte_mbuf struct a (void *) pointer called userdata which is in theory used for extra user-defined metadata. If I wanted to attach an additional metadata struct, I guess that I just have to assign the pointer to this struct to the userdata field. However, what happens if I want the content of this struct to travel with the packet through a software ring in order to be processed by another thread? Should I reserve more space in the ring to allocate such extra metadata?

Thanks again,

PD: I have copied the message to the users mailing list

2018-02-23 4:13 GMT+01:00 :
> Hi,
>
> First, I think your question should be sent to the user mailing list,
> not the dev mailing list.
>
> > I have seen that each packet has a headroom memory space (128 bytes
> > long) where RSS hashing and other metadata provided by the NIC is
> > stored.
>
> If I'm not mistaken, the headroom is not where metadata provided by the
> NIC are stored. Those metadata are stored in the rte_mbuf struct, which
> is also 128 bytes long.
>
> The headroom area is located AFTER the end of rte_mbuf (at offset 128).
> By default the headroom area is also 128 bytes long, so the actual
> packet data is stored at offset 256.
>
> You can store whatever you want in this headroom area. However that
> information is lost as soon as the packet leaves DPDK (the NIC will
> start sending at offset 256).
>
> -BL.

--
Victor
Re: [dpdk-dev] Suggestions on how to customize the metadata fields of each packet
Thanks a lot for your suggestions,

Taking them into account, and having had a look at this example of userdata field usage (http://dpdk.org/doc/api/examples_2bbdev_app_2main_8c-example.html#a19), I have thought of the following plan. I think the most elegant way is to use "userdata" for metadata, leaving the headroom as it is for further and future header manipulation or encapsulation. Allow me to expose it, and tell me whether it is good practice or not:

1. I create two independent mempools: "packet_pool" with N mbufs of capacity to store all captured packets, and a second pool called "metadata_pool" with the same N positions of sizeof(struct additional_metadata).
2. On thread #1, I set up an rx queue on eth port 0, assigning it the "packet_pool".
3. Thread #1 captures a burst of 32 packets and stores them in struct rte_mbuf *packets[32]. At the same time I pick up 32 objects from the "metadata_pool" and store them in struct additional_metadata *custom_metadata[32]. The content of every struct should be empty, but just in case I initialize every struct field to its default value.
4. For every position of the packet vector (32 in total) I perform the assignment: packets[i]->userdata = custom_metadata[i];
5. I modify one field of the metadata for each packet this way: custom_metadata[i]->field1 = X;
6. I send all 32 elements of the "packets" vector through a software ring (which I previously created). I do NOT implement any parallel software ring to carry the 32 custom_metadata objects, as I assume the userdata assignment survives the trip through the software ring.
7. Thread #2 reads 32 packets from the mentioned software ring and prints the content of each packet's metadata field1 to check that the metadata is maintained through the software ring.
8. Thread #2 sends the 32 packets through a tx queue on port 1.
9. Thread #2 frees the 32 rte_mbuf structs AND also frees the objects pointed to by each packet's userdata.
10. Go to point 1 and repeat in a loop all the time.

Is this a valid procedure, or do you think there could be a better one?

Thanks for your attention

Regards,

2018-02-23 11:07 GMT+01:00 :
> Hi all,
>
> Victor, I suggest taking a closer look at section 7.1 here:
> http://dpdk.org/doc/guides/prog_guide/mbuf_lib.html
>
> The approach chosen by DPDK is to store everything, metadata and packet
> data, in contiguous memory. That way, network packets will always have a
> 1-to-1 relationship with DPDK mbufs, no extra pointer needed. Every task
> that you need to perform, from allocating, freeing, to transferring
> mbufs to another lcore via software rings, is handled by DPDK. You don't
> have to worry about them. You can save your metadata either directly in
> the userdata field of struct rte_mbuf or in the headroom area.
>
> I agree with Konstantin that in theory we should think of the userdata
> field as space exclusively for metadata and reserve the headroom area
> for packet header manipulation purposes only. However, in practice I
> tend to think that using the headroom for metadata is more useful, since
> you don't really need to worry about any special configuration when
> creating the mbuf pool. The headroom is going to be there by default and
> you can always adjust its size after initialization. Please let me know
> if I missed something.
>
> -BL
>
> > -----Original Message-----
> > From: konstantin.anan...@intel.com [mailto:konstantin.anan...@intel.com]
> > Sent: Friday, February 23, 2018 4:27 PM
> > To: Victor Huertas ; long...@viettel.com.vn
> > Cc: dev@dpdk.org; us...@dpdk.org
> > Subject: RE: [dpdk-dev] Suggestions on how to customize the metadata
> > fields of each packet
> >
> > Hi Victor,
> >
> > > Thanks for your quick answer,
> > >
> > > I have read so many documents and web pages on this issue that
> > > probably I confused the utility of the headroom. It is good to know
> > > that this 128-byte space is available at my disposal. The fact of
> > > being lost once the NIC transmits the frame is not a problem at all
> > > for my application. However, in case that this space is not enough,
> > > I have seen in the rte_mbuf struct a (void *) pointer called
> > > userdata which is in theory used for extra user-defined metadata.
> > > If I wanted to attach an additional metadata struct, I guess that I
> > > just have to assign the pointer to this struct to the userdata
> > > field. However, what happens if I want that the content of this
> > > struct travels with the packet through a software ring in order to
> > > be p
[dpdk-dev] Proposal to add a new toolchain for dpdk: g++
Hi all,

I am using the DPDK development environment to develop an application from which I have to access C++ code. I managed to modify some internal mk files in the dpdk-stable repository to allow the g++ compiler to be supported.

I have all the modified files well identified, and I wonder whether the support team is interested in adding this toolchain to future DPDK releases.

Regards

--
Victor
[dpdk-dev] Fwd: Proposal to add a new toolchain for dpdk: g++
Thanks Bruce for your answer,

I will try it and let you know. Although I guess it makes no difference if, instead of an exe file, I am compiling a static library (libmylibrary.a), right?

BTW, I would like to insist on the second issue I referred to in my first reply, about the ip_pipeline example using software rings and the latency detected (which may reach 3-4 ms per pipeline transition as long as the two connected pipelines are configured to run on the same logical core and the respective f_run functions are placed in the same thread consecutively). The thing is that I may have in my application up to 5 or 6 interconnected pipelines, and the accumulated delay detected for a ping crossing all these pipelines becomes 55 ms RTT!! The latency problem disappears if I assign a different logical core to every pipeline.

Thanks a lot for your quick response. It is really appreciated.

Regards,

On Mon, 17 Feb 2020 at 15:40, Bruce Richardson (<bruce.richard...@intel.com>) wrote:
> On Mon, Feb 17, 2020 at 11:01:21AM +0100, Victor Huertas wrote:
> > Hi all,
> >
> > I am using DPDK development environment to develop an application from
> > which I have to access C++ code.
> > I managed to modify some internal mk files in the dpdk-stable
> > repository to allow g++ compiler to be supported.
> >
> > I have all the modified files well identified and I wonder if the
> > support team is interested to add this toolchain in future DPDK
> > releases.
>
> Rather than trying to build DPDK with g++, or to use the DPDK makefiles
> with your C++ application, can I recommend instead that you treat DPDK
> as any third-party library and build it independently of your
> application.
>
> If you compile and install DPDK using meson and ninja - or install the
> DPDK package from your linux distro - you will have a 'libdpdk.pc' file
> installed for use by pkg-config. Then for building your application, put
> in the relevant calls to pkg-config, i.e. 'pkg-config --cflags libdpdk'
> and 'pkg-config --libs libdpdk', into your app makefile and work from
> there.
>
> Note too, that all DPDK header files should already be safe for
> inclusion in C++ code - if not, please log a bug.
>
> Regards,
> /Bruce

--
Victor
[dpdk-dev] Fwd: Proposal to add a new toolchain for dpdk: g++
Hi Neil,

Well, the thing is that I wanted to keep using g++ as the compiling tool (and reduce the impact on my original development environment). As my source code is composed of *.cpp files, I decided to modify the DPDK makefiles to accept that extension, as well as to disable some -W flags that are not used by g++. As I have already done the work, I just wanted to find out whether the DPDK people were open to introducing these slight modifications, adding the possibility of using g++ instead of icc (Intel's compiler).

By the way, I had published another issue (on dpdk-users) in which I was wondering about a strange problem I have seen in the "ip_pipeline" DPDK example, related to high latency when using software rings between two pipelines running on the same core id. However, I have received no answer, and this issue worries me a lot, as the behaviour is not acceptable at all in my application (which is based on this ip_pipeline example). Would you mind if I rewrite it in this dpdk-dev thread to see if we can shed light on it?

Regards, and thanks for your quick answer,

On Mon, 17 Feb 2020 at 13:33, Neil Horman () wrote:
> On Mon, Feb 17, 2020 at 11:01:21AM +0100, Victor Huertas wrote:
> > Hi all,
> >
> > I am using DPDK development environment to develop an application from
> > which I have to access C++ code.
> > I managed to modify some internal mk files in the dpdk-stable
> > repository to allow g++ compiler to be supported.
> >
> > I have all the modified files well identified and I wonder if the
> > support team is interested to add this toolchain in future DPDK
> > releases.
> >
> > Regards
> >
> > --
> > Victor
>
> Ostensibly, if you have g++, you have gcc, and there's not much more
> that needs to be done here. You should just be able to wrap all your
> application includes in an extern C {} construct, and that should be it.
>
> Neil

--
Victor
[dpdk-dev] Fwd: high latency detected in IP pipeline example
Hi all,

I am developing my own DPDK application, basing it on the dpdk-stable ip_pipeline example. At this moment I am using the 17.11 LTS version of DPDK, and I am observing some strange behaviour. Maybe it is an old issue that can be solved quickly, so I would appreciate it if some expert could shed light on this.

The ip_pipeline example allows you to develop pipelines that perform specific packet processing functions (ROUTING, FLOW_CLASSIFYING, etc.). The thing is that I am extending some of these pipelines with my own. However, I want to take advantage of the built-in ip_pipeline capability of arbitrarily assigning the logical core where each pipeline (its f_run() function) must be executed, so that I can adapt the packet processing power to the number of cores available.

Taking this into account, I have observed something strange. I show you a simple example below.

Case 1:
[PIPELINE 0 MASTER core=0]
[PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=2] ---SWQ2---> [PIPELINE 3 core=3]

Case 2:
[PIPELINE 0 MASTER core=0]
[PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=1] ---SWQ2---> [PIPELINE 3 core=1]

I send a ping between two hosts connected at both sides of the pipeline model, which makes these pings cross all the pipelines (from 1 to 3). What I observe in Case 1 (each pipeline has its own thread on a different core) is that the reported RTT is less than 1 ms, whereas in Case 2 (all pipelines except MASTER run in the same thread) it is 20 ms. Furthermore, in Case 2, if I increase the packet rate a lot (hundreds of Mbps), this RTT decreases to 3 or 4 ms.

Has somebody observed this behaviour in the past? Can it be solved somehow?

Thanks a lot for your attention

--
Victor
Re: [dpdk-dev] Fwd: high latency detected in IP pipeline example
Thanks James for your quick answer.
I guess that this configuration modification implies that the packets must be written one by one into the sw ring. Did you notice a loss of performance (in throughput) in your application because of that?

Regards

On Tue, 18 Feb 2020 at 0:10, James Huang wrote:
> Yes, I experienced a similar issue in my application. In short, setting
> the swqs write burst value to 1 may reduce the latency significantly.
> The default write burst value is 32.
>
> On Mon., Feb. 17, 2020, 8:41 a.m. Victor Huertas wrote:
>> [original message quoted in full above]
Re: [dpdk-dev] Fwd: high latency detected in IP pipeline example
64 bytes from 192.168.0.101: icmp_seq=182 ttl=63 time=17.9 ms
64 bytes from 192.168.0.101: icmp_seq=183 ttl=63 time=18.5 ms
64 bytes from 192.168.0.101: icmp_seq=184 ttl=63 time=18.9 ms
64 bytes from 192.168.0.101: icmp_seq=185 ttl=63 time=19.8 ms
64 bytes from 192.168.0.101: icmp_seq=186 ttl=63 time=19.8 ms
64 bytes from 192.168.0.101: icmp_seq=187 ttl=63 time=10.7 ms
64 bytes from 192.168.0.101: icmp_seq=188 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=189 ttl=63 time=10.4 ms
64 bytes from 192.168.0.101: icmp_seq=190 ttl=63 time=10.3 ms
64 bytes from 192.168.0.101: icmp_seq=191 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=192 ttl=63 time=10.7 ms

As you mentioned, the delay has decreased a lot, but it is still considerably high (in a normal router this delay is less than 1 ms). A second strange behaviour shows up in the evolution of the RTT: it begins at 10 ms, increases little by little to a peak of approximately 20 ms, then suddenly drops back to 10 ms and starts climbing towards 20 ms again. Is this the behaviour you see in your case when burst_write is set to 1?

Regards,

El mar., 18 feb. 2020 a las 8:18, James Huang () escribió:
> No. We didn't see noticeable throughput difference in our test.
>
> On Mon., Feb. 17, 2020, 11:04 p.m. Victor Huertas wrote:
>
>> [earlier messages quoted in full; snipped]
--
Victor
Re: [dpdk-dev] Proposal to add a new toolchain for dpdk: g++
Thanks Neil for your answer. I think I will try the first option you posted, without touching the makefiles from dpdk-stable, and I will report back how it goes. So, as a conclusion, it seems that none of the modifications I made are necessary in order to build a DPDK application using g++.

Regards,

El mar., 18 feb. 2020 a las 14:13, Neil Horman () escribió:
> On Mon, Feb 17, 2020 at 02:39:58PM +, Bruce Richardson wrote:
> > On Mon, Feb 17, 2020 at 11:01:21AM +0100, Victor Huertas wrote:
> > > Hi all,
> > >
> > > I am using the DPDK development environment to develop an application
> > > from which I have to access C++ code.
> > > I managed to modify some internal mk files in the dpdk-stable
> > > repository to allow the g++ compiler to be supported.
> > >
> > > I have all the modified files well identified and I wonder if the
> > > support team is interested in adding this toolchain in future DPDK
> > > releases.
> > >
> > Rather than trying to build DPDK with g++, or to use the DPDK makefiles
> > with your C++ application, can I recommend instead that you treat DPDK as
> > any third-party library and build it independently of your application.
> >
> > If you compile and install DPDK using meson and ninja - or install the
> > DPDK package from your linux distro - you will have a 'libdpdk.pc' file
> > installed for use by pkg-config. Then for building your application, put
> > in the relevant calls to pkg-config i.e. 'pkg-config --cflags libdpdk' and
> > 'pkg-config --libs libdpdk', into your app makefile and work from there.
>
> Yes, exactly this. The prescribed method of handling issues like this is to
> either:
>
> 1) Build dpdk separately (or just install it from whatever distribution
> you are using, if that's an option), and just link against it (either
> statically or dynamically) when you build your application.
>
> 2) If you embed dpdk source in your environment, and build it at the same
> time as your application, you should interface to its build system by just
> calling ninja/meson or make from a build target in your application - the
> dpdk build file should properly select gcc instead of g++, which you should
> already have if you have g++ installed.
>
> Neil
>
> > Note too, that all DPDK header files should already be safe for inclusion
> > in C++ code - if not, please log a bug.
> >
> > Regards,
> > /Bruce
--
Victor
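[Editor's note] Bruce's suggestion can be sketched as a minimal makefile fragment for a C++ application, assuming DPDK was installed with meson/ninja so that libdpdk.pc is on the pkg-config search path (the target and file names here are illustrative, not from the thread):

```make
# Build a C++ app against an installed DPDK, treating DPDK as an
# ordinary third-party library discovered via pkg-config.
APP      = my_pipeline_app
SRCS     = main.cpp

PKGCONF ?= pkg-config
CXXFLAGS += -O3 $(shell $(PKGCONF) --cflags libdpdk)
LDLIBS   += $(shell $(PKGCONF) --libs libdpdk)

$(APP): $(SRCS)
	$(CXX) $(CXXFLAGS) $(SRCS) -o $@ $(LDLIBS)
```

This keeps the DPDK build untouched: g++ only ever compiles the application, while DPDK itself is still built by its own toolchain.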
Re: [dpdk-dev] Fwd: high latency detected in IP pipeline example
OK James,
Thanks for sharing your own experience.
What I would need right now is to know from the maintainers whether this latency behaviour is inherent to DPDK in the particular case we are discussing. Furthermore, I would also appreciate it if some maintainer could tell us whether there is a workaround or special configuration that completely mitigates this latency. I guess that there is one mitigation mechanism, which is the approach that the new ip_pipeline app example takes: if two or more pipelines are on the same core, the "connection" between them is not a software queue but a "direct table connection".

This approach has a big impact on my application, and I would like to know if there is another mitigation approach for the "old" version of the ip_pipeline example.

Thanks for your attention

El mar., 18 feb. 2020 a las 23:09, James Huang () escribió:
> No. I didn't notice the RTT bouncing symptoms.
> In a high throughput scenario, if multiple pipelines run on a single cpu
> core, it does increase the latency.
>
> Regards,
> James Huang
>
> On Tue, Feb 18, 2020 at 1:50 AM Victor Huertas wrote:
>
>> Dear James,
>>
>> I have done two different tests with the following configuration:
>> [PIPELINE 0 MASTER core=0]
>> [PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=1] ---SWQ2---> [PIPELINE 3 core=1]
>>
>> The first test (sending a single ping across all the pipelines to
>> measure RTT) was done with burst_write set to 32 in SWQ1 and SWQ2.
>> NOTE: every time we use rte_ring_enqueue_burst in pipelines 1 and 2,
>> we set the number of packets to write to 1.
>>
>> The result of this first test is shown below:
>> 64 bytes from 192.168.0.101: icmp_seq=343 ttl=63 time=59.8 ms
>> 64 bytes from 192.168.0.101: icmp_seq=344 ttl=63 time=59.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=345 ttl=63 time=59.2 ms
>> 64 bytes from 192.168.0.101: icmp_seq=346 ttl=63 time=59.0 ms
>> 64 bytes from 192.168.0.101: icmp_seq=347 ttl=63 time=59.0 ms
>> 64 bytes from 192.168.0.101: icmp_seq=348 ttl=63 time=59.2 ms
>> 64 bytes from 192.168.0.101: icmp_seq=349 ttl=63 time=59.3 ms
>> 64 bytes from 192.168.0.101: icmp_seq=350 ttl=63 time=59.1 ms
>> 64 bytes from 192.168.0.101: icmp_seq=351 ttl=63 time=58.9 ms
>> 64 bytes from 192.168.0.101: icmp_seq=352 ttl=63 time=58.5 ms
>> 64 bytes from 192.168.0.101: icmp_seq=353 ttl=63 time=58.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=354 ttl=63 time=58.0 ms
>> 64 bytes from 192.168.0.101: icmp_seq=355 ttl=63 time=58.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=356 ttl=63 time=57.7 ms
>> 64 bytes from 192.168.0.101: icmp_seq=357 ttl=63 time=56.9 ms
>> 64 bytes from 192.168.0.101: icmp_seq=358 ttl=63 time=57.2 ms
>> 64 bytes from 192.168.0.101: icmp_seq=359 ttl=63 time=57.5 ms
>> 64 bytes from 192.168.0.101: icmp_seq=360 ttl=63 time=57.3 ms
>>
>> As you can see, the RTT is quite high and the range of values is more or
>> less stable.
>>
>> The second test is the same as the first one but setting burst_write to 1
>> for all SWQs. The result is this one:
>>
>> 64 bytes from 192.168.0.101: icmp_seq=131 ttl=63 time=10.6 ms
>> 64 bytes from 192.168.0.101: icmp_seq=132 ttl=63 time=10.6 ms
>> 64 bytes from 192.168.0.101: icmp_seq=133 ttl=63 time=10.5 ms
>> 64 bytes from 192.168.0.101: icmp_seq=134 ttl=63 time=10.7 ms
>> 64 bytes from 192.168.0.101: icmp_seq=135 ttl=63 time=10.8 ms
>> 64 bytes from 192.168.0.101: icmp_seq=136 ttl=63 time=10.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=137 ttl=63 time=10.7 ms
>> 64 bytes from 192.168.0.101: icmp_seq=138 ttl=63 time=10.5 ms
>> 64 bytes from 192.168.0.101: icmp_seq=139 ttl=63 time=10.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=140 ttl=63 time=10.2 ms
>> 64 bytes from 192.168.0.101: icmp_seq=141 ttl=63 time=10.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=142 ttl=63 time=10.9 ms
>> 64 bytes from 192.168.0.101: icmp_seq=143 ttl=63 time=11.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=144 ttl=63 time=11.3 ms
>> 64 bytes from 192.168.0.101: icmp_seq=145 ttl=63 time=11.5 ms
>> 64 bytes from 192.168.0.101: icmp_seq=146 ttl=63 time=11.6 ms
>> 64 bytes from 192.168.0.101: icmp_seq=147 ttl=63 time=11.0 ms
>> 64 bytes from 192.168.0.101: icmp_seq=148 ttl=63 time=11.3 ms
>> 64 bytes from 192.168.0.101: icmp_seq=149 ttl=63 time=12.0 ms
>> 64 bytes from 192.168.0.101: icmp_seq=150 ttl=63 time=12.6 ms
>> 64 bytes from 192.168.0.101: icmp_seq=151 ttl=63 time=12.4 ms
>> 64 bytes from 192.168.0.101: icmp_seq=152 ttl=63 time=12.3 ms
>> 64 bytes from 192.168.0.101:
[dpdk-dev] Fwd: Fwd: high latency detected in IP pipeline example
Hi,

I have added some maintainers as recipients who could provide some extra information on this issue. I hope they can shed some light on it.

Regards

El mié., 19 feb. 2020 a las 9:29, Victor Huertas () escribió:
> [previous message quoted in full; snipped]
Re: [dpdk-dev] Fwd: Fwd: high latency detected in IP pipeline example
Hi Olivier,

Thanks for your answer. I think the most appropriate maintainer to answer this issue is Cristian, as he is the maintainer of ip_pipeline.

To reproduce the problem, you should go back to DPDK v17.11 and run the ip_pipeline app with a *.cfg configuration file where you put N pipelines in a row (where N is more than 3). It doesn't matter if the pipelines just apply the default entry of whatever tables they have; the important thing is that the packets cross all the pipelines, not the processing they receive. That is, the point is that several f_run() functions (one per pipeline) fall into the same thread, which is associated with a single logical core (the f_run()s are executed one after the other), and the rte_mbufs are read from and written to software queues.

If you need a particular configuration that you can test quickly, I would have to rebuild this environment in my lab and test it again, which would take a while on my part. I have evolved the pipelines a bit from the original app and I cannot provide you with these pipelines. The only thing I can assure you is that the mechanism of "connecting" the pipelines depending on the logical core where you want to run them is untouched from the original app.

This latency issue worries me quite a lot, because I need to justify to my bosses the use of DPDK as a key library to improve the performance of the application we are developing in my company (please understand that I cannot tell you more). I based the application on the ip_pipeline app because it offered me the opportunity to mix pipelines with built-in tables and built-in f_run() with customized pipelines with a custom f_run() function. I saw a very attractive point in this: flexibility in composing our own packet processing models by concatenating well-defined pipelines. However, I then ran into the latency issue described here. I hope this helps you understand where I am now.
Regards,

El mié., 19 feb. 2020 a las 11:53, Olivier Matz () escribió:
> Hi Victor,
>
> I have no experience with ip_pipeline. I can at least say that this
> latency is much higher than what you should get.
>
> My initial thought was that you were using several pthreads bound to the
> same core, but from what I read in your first mail, this is not the
> case.
>
> Do you have a simple way to reproduce your issue with the original
> example app?
>
> Olivier
>
> On Wed, Feb 19, 2020 at 11:37:21AM +0100, Victor Huertas wrote:
> > [previous messages quoted in full; snipped]
[dpdk-dev] DPDK (v17.11) ACL table field format definition enhancement: 'offset' field covering headroom space.
Hi all,

I am developing an application with DPDK (v17.11) using a concatenation of pipelines. I am now defining an ACL table, and more precisely the format of its ACL fields. One of the parameters of this field format is 'offset', which indicates the number of bytes to the start of the field to match. All the examples I have seen so far assume that an offset value of 0 means the start of the Ethernet header, not the start of the rte_mbuf struct. Is there a way to place the offset origin at the rte_mbuf struct? If not, I think it would be a useful enhancement, which is why I propose it on the developers mailing list. What do you think?

Thanks a lot for your attention.
--
Victor
[dpdk-dev] [DPDK v17.11 LTS] Crash (segmentation fault) in ACL table packet look up
Hi all,

The DPDK lib always crashes when a packet enters an ACL table I created to check IPv6 fragmented packets. If the table is empty nothing happens (the missed packets go to the next table in the pipeline), but as soon as I add some entries, a crash happens when the first packet enters. It seems to happen in the acl_run.h file (in librte_acl), at line 178, inside the static inline function acl_start_next_trie(struct acl_flow_data *flows, struct parms *parms, int n, const struct rte_acl_ctx *ctx). Below is the section of the code where it crashes:

/* set completion parameters and starting index for this slot */
parms[n].cmplt = flows->last_cmplt;
transition = flows->trans[parms[n].data[*parms[n].data_index++] +
    ctx->trie[flows->trie].root_index];

Running in debug mode, Eclipse tells me that the 'trans' member of 'flows' is NULL, which I think is the cause of the segmentation fault. The thing is that the other ACL tables I use don't cause this segmentation fault at all. I have revised the field format configuration, etc., and all seems to be OK. The table creation returns 0 and all the table entry insertions return 0, so the lib doesn't complain at all until the crash happens. I enclose below the sections of my code where the field format is set, as well as the table creation section. Any help is really welcome to see what is happening here.

Thanks a lot for your attention.
Now the code samples:

== ACL fields format (only the one that fails) ==

struct rte_acl_field_def field_format_ipv6_1st_fragment[NUM_FIELDS_IPV6_1ST_FRAGMENT_ACL] = {
	/* Protocol (1 byte): this value will always be the same (44) */
	[0] = { .type = RTE_ACL_FIELD_TYPE_BITMASK, .size = sizeof(uint8_t),
		.field_index = 0, .input_index = 0,
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, proto), },
	/* ethertype (2 bytes): this value will always be the same (0x86DD) */
	[1] = { .type = RTE_ACL_FIELD_TYPE_BITMASK, .size = sizeof(uint16_t),
		.field_index = 1, .input_index = 1, /* this value must be multiple of 4 bytes */
		.offset = offsetof(struct ether_hdr, ether_type), },
	/* tos field (2 bytes) */
	[2] = { .type = RTE_ACL_FIELD_TYPE_BITMASK, .size = sizeof(uint16_t),
		.field_index = 2, .input_index = 1, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, vtc_flow), },
	/* IPv6 source address */
	/* Source IPv6 address [0-3] */
	[3] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 3, .input_index = 2, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, src_addr[0]), },
	/* Source IPv6 address [4-7] */
	[4] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 4, .input_index = 3, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, src_addr[4]), },
	/* Source IPv6 address [8-11] */
	[5] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 5, .input_index = 4, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, src_addr[8]), },
	/* Source IPv6 address [12-15] */
	[6] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 6, .input_index = 5, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, src_addr[12]), },
	/* IPv6 destination address */
	/* Destination IPv6 address [0-3] */
	[7] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 7, .input_index = 6, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, dst_addr[0]), },
	/* Destination IPv6 address [4-7] */
	[8] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 8, .input_index = 7, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, dst_addr[4]), },
	/* Destination IPv6 address [8-11] */
	[9] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 9, .input_index = 8, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, dst_addr[8]), },
	/* Destination IPv6 address [12-15] */
	[10] = { .type = RTE_ACL_FIELD_TYPE_MASK, .size = sizeof(uint32_t),
		.field_index = 10, .input_index = 9, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + offsetof(struct ipv6_hdr, dst_addr[12]), },
	/* next_header+reserved+frag_data (4 bytes) in ipv6 frag header */
	[11] = { .type = RTE_ACL_FIELD_TYPE_BITMASK, .size = sizeof(uint32_t),
		.field_index = 11, .input_index = 10, /* this value must be multiple of 4 bytes */
		.offset = sizeof(struct ether_hdr) + sizeof(struct ipv6_hdr) +
			offsetof(struct ipv6_extension_fragment, next_header), },
	/* Source Port */
	[12] = {
Re: [dpdk-dev] Using valgrind with DPDK app
Hello,

I have exactly the same problem as you. I have also downloaded, compiled and installed the very latest version of Valgrind (v3.17). As soon as the mempool is created, the program gets stuck. If Valgrind cannot be used with DPDK (I am using v18.11.5) as a memory leak debugger, there must be another tool for it. Which one?

Thanks for your attention

El vie., 10 jul. 2020 a las 16:59, Montorsi, Francesco (<fmonto...@empirix.com>) escribió:
> Hi all,
> I would like to know if it's possible to run my DPDK application (I'm
> using DPDK 19.11) under Valgrind.
> I tried but it gets stuck apparently while accessing hugepages:
>
> EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
> EAL: Detected memory type: socket_id:1 hugepage_sz:1073741824
> EAL: Creating 4 segment lists: n_segs:32 socket_id:0 hugepage_sz:1073741824
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x100033000 (size = 0x1000)
> EAL: Memseg list allocated: 0x10kB at socket 0
> EAL: Ask a virtual area of 0x8 bytes
> EAL: Virtual area found at 0x14000 (size = 0x8)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x94000 (size = 0x1000)
> EAL: Memseg list allocated: 0x10kB at socket 0
> EAL: Ask a virtual area of 0x8 bytes
> EAL: WARNING! Base virtual address hint (0xa80001000 != 0x104000) not
> respected!
> EAL: This may cause issues with mapping memory into secondary processes
> EAL: Virtual area found at 0x104000 (size = 0x8)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xac0001000 (size = 0x1000)
> EAL: Memseg list allocated: 0x10kB at socket 0
> EAL: Ask a virtual area of 0x8 bytes
>
> I've seen there was some attempt a few years ago:
> http://mails.dpdk.org/archives/dev/2016-February/033108.html
> has anything changed since then?
>
> Also, I see that Luca has created a project here:
> https://github.com/bluca/valgrind-dpdk
> but it seems there have been no changes in 3 years... I wonder whether it
> works with recent DPDK versions...
>
> Thanks for any hint,
>
> Francesco Montorsi
--
Victor
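[Editor's note] One workaround that is often suggested for memcheck runs (not verified here against every DPDK release) is to keep hugepages out of the picture, since Valgrind struggles to follow hugepage mappings: EAL supports running with --no-huge plus an explicit -m memory size. A hypothetical invocation, with the binary name, core list and app arguments as placeholders:

```
# Run under Valgrind with hugepages disabled so memcheck can intercept
# the allocations.  Debug-only: this forfeits hugepage performance.
valgrind --leak-check=full --show-leak-kinds=all \
    ./my_dpdk_app -l 0-1 -n 4 --no-huge -m 512 -- --my-app-args
```

Expect a large slowdown and likely some EAL-related noise in the report; this is for leak hunting, not for performance or latency measurements.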