[dpdk-dev] raw frame to rte_mbuf

2013-11-12 Thread Etai Lev-Ran
Hi Pepe,

In addition, you may want to consider the frame's lifetime, to ensure its
memory is used and released in a valid way.
When the packet is sent, DPDK may drop its reference to it and consequently
try to free the memory.
Hence, it is important that the raw buffer used for the ARP packet is
allocated with a reference added (or, alternatively, just add a reference
to the packet and ensure it will not be freed by DPDK directly).

Regards,
Etai

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya
Sent: Tuesday, November 12, 2013 11:15 AM
To: Jose Gavine Cueto; dev at dpdk.org
Subject: Re: [dpdk-dev] raw frame to rte_mbuf

Hi Pepe,

Of course a simple cast will not suffice.
Please look at the rte_mbuf structure in the header files and let me know if
you are still confused.
There is a header and a payload. Your raw frame will go in the payload.
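
A minimal sketch of the header-plus-payload idea, and of the reference
counting discussed in this thread. It is a toy model in plain C, not the DPDK
API: struct toy_mbuf, TOY_DATA_ROOM and the toy_* functions are hypothetical
names invented for illustration. In DPDK itself the corresponding steps would
presumably go through rte_pktmbuf_alloc(), rte_pktmbuf_append() and
rte_pktmbuf_free().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DATA_ROOM 2048  /* hypothetical payload capacity */

/* Toy stand-in for an mbuf: a metadata header followed by a payload area. */
struct toy_mbuf {
    uint16_t refcnt;                  /* number of current owners */
    uint16_t data_len;                /* bytes of frame data in payload */
    uint8_t  payload[TOY_DATA_ROOM];  /* the raw frame is copied here */
};

/* Allocate a buffer and copy the raw frame into its payload.
 * Returns NULL if the frame does not fit or allocation fails. */
static struct toy_mbuf *toy_from_frame(const void *frame, uint16_t len)
{
    struct toy_mbuf *m;

    if (frame == NULL || len > TOY_DATA_ROOM)
        return NULL;
    m = malloc(sizeof(*m));
    if (m == NULL)
        return NULL;
    m->refcnt = 1;
    m->data_len = len;
    memcpy(m->payload, frame, len);
    return m;
}

/* Take an extra reference so a release by another owner does not free it. */
static void toy_ref(struct toy_mbuf *m)
{
    m->refcnt++;
}

/* Drop one reference; memory is released only when the last owner lets go.
 * Returns 1 if the buffer was freed, 0 otherwise. */
static int toy_release(struct toy_mbuf *m)
{
    if (--m->refcnt > 0)
        return 0;
    free(m);
    return 1;
}
```

The point of the sketch is that the frame bytes are copied into a buffer that
carries its own metadata, and that an owner who wants the buffer to survive a
release by the transmit path has to hold its own reference.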


Regards
-Prashant

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jose Gavine Cueto
Sent: Tuesday, November 12, 2013 1:49 PM
To: dev at dpdk.org
Subject: [dpdk-dev] raw frame to rte_mbuf

Hi,

In DPDK, how should a raw Ethernet frame be converted to an rte_mbuf *?  For
example, if I have an ARP packet:

void * arp_pkt

how should this be converted to an rte_mbuf * for transmission? Does a
simple cast suffice?

Cheers,
Pepe

--
To stop learning is like to stop loving.





===
Please refer to http://www.aricent.com/legal/email_disclaimer.html
for important disclosures regarding this electronic communication.

===



[dpdk-dev] rte_ring_sc_dequeue returns 0 but sets packet to NULL

2013-11-20 Thread Etai Lev-Ran


Hi Pepe,



I'm assuming you're creating and accessing the ring safely (i.e.,
single/multiple consumers and producers).

Based on the code, these return values are possible if the ring somehow got a
NULL object pointer enqueued to it.
From the ring's perspective the entries are valid, and since the dequeue does
not check for NULL object pointers, you're getting back element(s) that
happen to be NULL.



If this is indeed the case, I would propose the following patch:

- Adding a check for NULL object pointers to ENQUEUE_PTRS in rte_ring.h (in
debug code so as not to hurt performance?)

- Returning an EINVAL error code if any object in a burst is NULL and aborting
the entire enqueue (i.e., all or none)



IMHO, adding NULL objects is likely an error not a legitimate use case for 
adding ring elements.

Can anyone think of a use case where adding NULL pointer objects makes sense?
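
The proposed check can be sketched on a toy ring as follows. This is not the
rte_ring implementation and not a patch against ENQUEUE_PTRS: struct toy_ring,
TOY_RING_SIZE and the toy_* functions are hypothetical, single-threaded
stand-ins that only illustrate the all-or-none rejection of NULL objects.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_RING_SIZE 8  /* hypothetical capacity */

struct toy_ring {
    void *slots[TOY_RING_SIZE];
    unsigned head;   /* next slot to dequeue */
    unsigned count;  /* entries currently stored */
};

/* Enqueue n objects, all or none.
 * Returns 0 on success, -EINVAL if any object is NULL,
 * -ENOBUFS if there is not enough room. */
static int toy_enqueue_bulk(struct toy_ring *r, void * const *objs, unsigned n)
{
    unsigned i;

    /* Validate the whole burst before touching the ring. */
    for (i = 0; i < n; i++)
        if (objs[i] == NULL)
            return -EINVAL;
    if (n > TOY_RING_SIZE - r->count)
        return -ENOBUFS;
    for (i = 0; i < n; i++)
        r->slots[(r->head + r->count + i) % TOY_RING_SIZE] = objs[i];
    r->count += n;
    return 0;
}

/* Dequeue one object. Returns 0 on success, -ENOENT if the ring is empty. */
static int toy_dequeue(struct toy_ring *r, void **obj)
{
    if (r->count == 0)
        return -ENOENT;
    *obj = r->slots[r->head];
    r->head = (r->head + 1) % TOY_RING_SIZE;
    r->count--;
    return 0;
}
```

With the check on the enqueue side, a successful dequeue can never hand back
a NULL pointer, so the symptom described in this thread would surface at the
producer instead of at the consumer.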

Best regards,
Etai


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jose Gavine Cueto
Sent: Tuesday, November 19, 2013 12:35 PM
To: dev at dpdk.org
Subject: [dpdk-dev] rte_ring_sc_dequeue returns 0 but sets packet to NULL

Hi,

I am encountering a strange behavior of rte_ring_sc_dequeue, though I'm not
yet sure what causes this.

I have this code:

rc = rte_ring_sc_dequeue(fwdp->rxtx_rings->xmit_ring, &rpackets);

At the first dequeue, rpackets gets the correct address of an rte_mbuf.
However, at the second dequeue it returns 0, which indicates success, but sets
the rte_mbuf result to NULL.  Is this even possible? It is happening in my
scenario.  Or there could just be something wrong with my code.

Cheers,
Pepe

--
To stop learning is like to stop loving.



[dpdk-dev] Unable to build dpdk : #error "SSSE3 instruction set not enabled"

2013-11-29 Thread Etai Lev-Ran
Hi Surya,

SSSE3 instructions are not enabled by default.
To enable them, you can either tell gcc your CPU architecture (-march=) as
suggested by Marc, or enable just the specific SSE version that's supported
by your CPU (e.g., make TOOLCHAIN_CFLAGS="-msse4").

See http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html for a
list of CPU architectures and instruction flags.
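
One way to see what a given set of flags enables is to test the predefined
macro that the failing header checks. A small sketch, assuming gcc:
built_with_ssse3() is a hypothetical helper name, while __SSSE3__ is the macro
gcc defines when SSSE3 is enabled (for example via -mssse3, -msse4, or a
suitable -march= value).

```c
/* Returns 1 if this translation unit was compiled with SSSE3 enabled,
 * 0 otherwise. tmmintrin.h raises the "#error" quoted in this thread
 * when __SSSE3__ is not defined. */
static int built_with_ssse3(void)
{
#ifdef __SSSE3__
    return 1;
#else
    return 0;
#endif
}
```

Compiling this once with the default flags and once with the flags you intend
to pass through TOOLCHAIN_CFLAGS shows whether the second build would get past
the header check.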

Regards,
Etai

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Marc Sune
Sent: Friday, November 29, 2013 12:53 PM
To: dev at dpdk.org
Subject: Re: [dpdk-dev] Unable to build dpdk : #error "SSSE3 instruction set
not enabled"

Changing the emulated CPU type to a model that supports SSSE3 (e.g. core2duo)
should do the trick. I faced the same problem some time ago.

best
marc

On 29/11/13 11:39, Surya Nimmagadda wrote:
> Hi,
>
> I am a beginner with dpdk and trying to follow the instructions in 
> http://www.dpdk.org/doc/quick-start
>
> I am seeing the following error when doing make with 1.5.0r2 or 
> 1.5.1r1
>
> == Build lib/librte_meter
> == Build lib/librte_sched
>CC rte_sched.o
> In file included from /home/surya/dpdk/dpdk-1.5.1r1/lib/librte_sched/rte_bitmap.h:77:0,
>   from /home/surya/dpdk/dpdk-1.5.1r1/lib/librte_sched/rte_sched.c:47:
> /usr/lib/gcc/x86_64-linux-gnu/4.6/include/tmmintrin.h:31:3: error: #error "SSSE3 instruction set not enabled"
> make[3]: *** [rte_sched.o] Error 1
> make[2]: *** [librte_sched] Error 2
> make[1]: *** [lib] Error 2
> make: *** [all] Error 2
>
> I am running this on a Ubuntu VM (12.04) with gcc version 4.6.3
>
> It built fine on another VM where I have Ubuntu 13.10 with gcc
> version 4.8.1
>
> Should I upgrade to 4.8.1 here as well (it has become a long process with
> a lot of road blocks) or is there any simple fix?
>
> The DPDK doc says I just need gcc versions 4.5.x or later.
>
> Thanks,
> Surya
>




[dpdk-dev] Using DPDK in a multiprocess environment

2014-04-08 Thread Etai Lev Ran
Hi,



I'd like to split DPDK application functionality into a setup (primary)
process and business logic (secondary) processes.

The secondary processes access the hardware queues directly (exclusive queue
per process) and not through software rings.



I'm running into an initialization problem:

-  The primary starts and sets up memory and ports and then goes to
sleep waiting for termination signal

-  Secondary processes fail when probing the PCI bus for devices
(required, otherwise I get 0 ports visible in the secondary)



The error is directly related to the secondary failing to get the *same*
virtual address for mmap'ing the UIO device fd's.

The reason is that the secondary processes have considerably more shared
objects loaded, and some of these are loaded and mapped into addresses which
the primary used to map UIO fd's.

The pci_map_resource() (linuxapp/eal_pci.c) code explicitly requires that
the secondary processes get the same mmap'ed address as given to the primary.
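
The failure mode can be reproduced outside DPDK. The sketch below is not the
pci_map_resource() code; map_at_expected() is a hypothetical helper that maps
anonymous memory instead of a UIO fd. It only illustrates that, without
MAP_FIXED, the requested address is a hint, so a process that already has
something mapped there gets a different address and has to treat that as an
error.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of anonymous memory, asking for address `want`.
 * Without MAP_FIXED the kernel treats `want` only as a hint, so the
 * result must be compared against it. Returns the mapping on a match,
 * or NULL (after unmapping) if the kernel placed it elsewhere. */
static void *map_at_expected(void *want, size_t len)
{
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (got == MAP_FAILED)
        return NULL;
    if (got != want) {
        /* Address already taken, e.g. by a shared object loaded earlier. */
        munmap(got, len);
        return NULL;
    }
    return got;
}
```

Calling map_at_expected() with an address that is already occupied returns
NULL, which corresponds to the situation of a secondary process whose extra
shared objects landed where the primary mapped its UIO resources.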



1)  Is this behavior (same mmap address) required?

2)  If so, is there a workaround to cause PCI areas of UIO devices to be
mapped to the same location in arbitrary processes?



The samples work just fine since all primary and secondary processes have
a similar set and load order of .so's.



Using  v1.6 on Ubuntu 12.04 64b, ixgbe devices, 1GB hugepages, ASLR
disabled.



Thanks,

Etai





[dpdk-dev] Using DPDK in a multiprocess environment

2014-04-10 Thread Etai Lev Ran
Thanks, Bruce.
Yes - artificial linking may be a viable workaround in some cases. 

However, in the general case, it seems that :
a)  multi-process DPDK applications work best when using a single (primary) 
process feeding secondary processes via SW rings; 
This requires a matching map of the shared area (huge pages);
b)  to allow multiple processes to access the HW directly (with exclusive 
queue assignment, though), the shared memory and 
PCI mapping must be the same in all processes, implying that they 
should be as similar as possible (e.g., *before* initializing 
the PCI resources they must load the same objects and map the same 
files in the same order)
Deviations from above may result in an inoperable system due to mismatches in 
the memory maps.

I think DPDK was designed mostly with use-case (a) above in mind (software 
rings), but that has the unfortunate downside of 
dedicating CPU core(s) for HW access.

Regards,
Etai

-Original Message-
From: Richardson, Bruce [mailto:bruce.richard...@intel.com] 
Sent: Wednesday, April 9, 2014 12:25 PM
To: Rogers, Gerald; elevran; Shaw, Jeffrey B
Cc: dev at dpdk.org
Subject: RE: [dpdk-dev] Using DPDK in a multiprocess environment

As a plan B (or C, or D, etc.) you could also try linking your primary process 
against those same shared libraries, even if they are unused by it. Hopefully 
that may have the same effect in the primary as in the secondary processes of 
adjusting your address space region and allow things to get mapped properly.

/Bruce 

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rogers, Gerald
> Sent: Tuesday, April 08, 2014 6:00 PM
> To: elevran; Shaw, Jeffrey B
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Using DPDK in a multiprocess environment
> 
> Etai,
> 
> If this doesn't work, then you will need to change the virtual address
> range that is used by DPDK.  By default this is set dynamically;
> however, with DPDK 1.6 you can change it to any region in the virtual address
> space you want.
> 
> The problem you have is what you stated, the secondary process is 
> built with more shared libraries, which load upon application start, 
> and are occupying the region that DPDK allocates in the primary for shared 
> regions.
> 
> In DPDK version 1.6 there is an option to change the base address.  It
> is --base-virtaddr
> 
> With this option you can set the base address for where the huge pages 
> are mapped into the process virtual address space.
> 
> This is all implemented within
> $DPDK_DIR/lib/librte_eal/linuxapp/eal/eal_memory.c
> 
> Gerald
> 
> 
> 
> 
> 
> On 4/8/14, 9:07 AM, "elevran"  wrote:
> 
> >Jeff,
> >
> >Thanks for the quick reply.
> >
> >I'll see if calling eal_init earlier resolves the problem I'm seeing.
> >I'm not sure this will resolve the issue if shared objects are loaded 
> >before
> >main() starts...
> >
> >I understand the rationale for having the same mbuf addresses across 
> >processes. And indeed they're mapped just fine (--virt-addr also 
> >gives some control over the mapping?).
> >I was wondering if the same logic applies to the mapping of device 
> >PCI addresses. Are they shared or passed around between processes in 
> >the same way?
> >
> >Thanks again for the quick response,
> >Etai
> >On Apr 8, 2014 18:54, "Shaw, Jeffrey B" wrote:
> >
> >> Have you tried calling "rte_eal_init()" closer to the beginning of 
> >> the program in your secondary process (i.e. the first thing in main())?
> >>
> >> The same mmap address is required.  The reason is simple: if
> >> process A thinks the virtual address of an mbuf is 123, and process
> >> B thinks the virtual address of the same mbuf is 456, either
> >> process may segmentation fault, accessing mbuf memory that is not
> >> actually mapped into the process's address space.
> >>
> >> Jeff
> >>
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Etai Lev Ran
> >> Sent: Tuesday, April 08, 2014 8:13 AM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] Using DPDK in a multiprocess environment
> >>
> >> Hi,
> >>
> >>
> >>
> >> I'd like to split DPDK application functionality into a setup
> >> (primary) process and business logic (secondary) processes.
> >>
> >> The secondary processes access the hardware queues directly 
> >> (exclusive queue per process) and not through software rings.
> >>
> &

[dpdk-dev] NUMA CPU Sockets and DPDK

2014-02-12 Thread Etai Lev Ran
Hi Prashant,

Based on our experience, using DPDK across CPU sockets may indeed result in
some performance degradation (~10% for our application vs. staying
in socket; YMMV based on HW, application structure, etc.).
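
Whether a given pair of cores straddles sockets can be read off the lscpu
layout quoted in this thread (even-numbered cores on node0, odd-numbered on
node1). A toy sketch of that check; toy_core_to_node() and
toy_crosses_socket() are hypothetical helpers that hard-code this one
machine's layout. In a DPDK application the equivalent lookup would presumably
use rte_lcore_to_socket_id() rather than arithmetic on the core number.

```c
/* NUMA node of a core on the machine described in this thread only:
 * lscpu lists even cores on node0 and odd cores on node1. */
static int toy_core_to_node(unsigned core)
{
    return (int)(core % 2);
}

/* Returns 1 if the two cores sit on different NUMA nodes, 0 otherwise. */
static int toy_crosses_socket(unsigned core_a, unsigned core_b)
{
    return toy_core_to_node(core_a) != toy_core_to_node(core_b);
}
```

On this layout core 0 and core 1 are on different nodes, so memory and queues
set up from core 0 may end up remote to the core that polls them.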

Regarding CPU utilization on core 1, the one picking up traffic: perhaps I
misunderstood your comment, but I would expect it to always be close
to 100% since it's polling the device via the PMD and is not driven by
interrupts.

Regards,
Etai

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Prashant Upadhyaya
Sent: Wednesday, February 12, 2014 1:28 PM
To: dev at dpdk.org
Subject: [dpdk-dev] NUMA CPU Sockets and DPDK

Hi guys,

What has been your experience of using DPDK based apps in NUMA mode with
multiple sockets, where some cores are present on one socket and other cores
on some other socket?

I am migrating my application from one intel machine with 8 cores, all in
one socket to a 32 core machine where 16 cores are in one socket and 16
other cores in the second socket.
My core 0 does all initialization for mbuf's, nic ports, queues etc. and
uses SOCKET_ID_ANY for socket related parameters.

The usecase works, but I think I am running into performance issues on the
32 core machine.
The lscpu output on my 32 core machine shows the following:
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
I am using core 1 to lift all the data from a single queue of an 82599EB
port and I see that the cpu utilization for this core 1 is way too high even
for lifting traffic of 1 Gbps with packet size of 650 bytes.

In general, does one need to be careful when working with multiple sockets?
Any comments would be helpful.

Regards
-Prashant
