[dpdk-dev] [PATCH 0/4] Broadcom 10G NIC Poll Mode Driver

2015-02-08 Thread Jun Xiao
Thanks Stephen for all the info!

Do you expect this pmd driver will be in sync with FreeBSD repo in the future,
or it's going to be diverged? I am just asking from a maintenance stand point.

On Sat, Feb 7, 2015 at 11:15 PM, Stephen Hemminger
 wrote:
> Source was from FreeBSD.
>
> changes were to make it work and lots of de-uglification.
>
> For example, the last change was to remove custom logging and debug macros.
>
> There are still lots of style issues with the driver, because of the amount
> of
> useless macro wrapping.
>
> I don't believe in the vendor model of taking one driver with lots of macros
> and pretending
> it is generic across OS's.
>
> It has been tested by our QA group as part of our vRouter product release,
> which is
> based on DPDK 1.6. There are no open problem reports. The performance is
> less
> than IXGBE but that is probably because it a a port an not optimized.
>
>
>
> On Sat, Feb 7, 2015 at 1:15 AM, Jun Xiao  wrote:
>>
>> Hi Stephen,
>>
>> Thanks for your great work on this!
>>
>> Could you elaborate on a few things:
>> - What's the methodology used in the driver porting?
>>   e.g. what's the base source you ported from?
>>   what's the major change against the base source?
>> - What kind of tests have been done on Linux? any known limitations?
>>
>> Thanks,
>> Jun
>>
>> On Sat, Feb 7, 2015 at 2:36 AM, Stephen Hemminger
>>  wrote:
>> > From: Stephen Hemminger 
>> >
>> > These are the patches to enable supporting the Broadcom
>> > NetExtreme II 10G devices (show up as bnx2x on Linux).
>> >
>> > The driver has only been tested on Linux, there maybe issues
>> > with firmware loading and PCI config access on BSD.
>> >
>> > Stephen Hemminger (4):
>> >   pci: allow access to PCI config space
>> >   bcm: add BCM pci device ids
>> >   bcm: new poll mode driver
>> >   bcm: enable BCM poll mode driver in config
>> >
>> >  config/common_linuxapp  |10 +
>> >  lib/Makefile| 1 +
>> >  lib/librte_eal/common/include/rte_pci.h |29 +
>> >  lib/librte_eal/common/include/rte_pci_dev_ids.h |30 +
>> >  lib/librte_eal/linuxapp/eal/eal_pci.c   |15 +
>> >  lib/librte_eal/linuxapp/eal/eal_pci_uio.c   |10 +
>> >  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 +
>> >  lib/librte_pmd_bcm/Makefile |28 +
>> >  lib/librte_pmd_bcm/bcm.c| 11817
>> > +++
>> >  lib/librte_pmd_bcm/bcm.h|  1998 
>> >  lib/librte_pmd_bcm/bcm_ethdev.c |   544 +
>> >  lib/librte_pmd_bcm/bcm_ethdev.h |79 +
>> >  lib/librte_pmd_bcm/bcm_logs.h   |51 +
>> >  lib/librte_pmd_bcm/bcm_rxtx.c   |   487 +
>> >  lib/librte_pmd_bcm/bcm_rxtx.h   |85 +
>> >  lib/librte_pmd_bcm/bcm_stats.c  |  1619 +++
>> >  lib/librte_pmd_bcm/bcm_stats.h  |   633 +
>> >  lib/librte_pmd_bcm/bcm_vfpf.c   |   597 +
>> >  lib/librte_pmd_bcm/bcm_vfpf.h   |   315 +
>> >  lib/librte_pmd_bcm/debug.c  |   113 +
>> >  lib/librte_pmd_bcm/ecore_fw_defs.h  |   423 +
>> >  lib/librte_pmd_bcm/ecore_hsi.h  |  6349 ++
>> >  lib/librte_pmd_bcm/ecore_init.h |   842 ++
>> >  lib/librte_pmd_bcm/ecore_init_ops.h |   886 ++
>> >  lib/librte_pmd_bcm/ecore_mfw_req.h  |   207 +
>> >  lib/librte_pmd_bcm/ecore_reg.h  |  3664 ++
>> >  lib/librte_pmd_bcm/ecore_sp.c   |  5455 +
>> >  lib/librte_pmd_bcm/ecore_sp.h   |  1796 +++
>> >  lib/librte_pmd_bcm/elink.c  | 13378
>> > ++
>> >  lib/librte_pmd_bcm/elink.h  |   610 +
>> >  30 files changed, 52073 insertions(+)
>> >  create mode 100644 lib/librte_pmd_bcm/Makefile
>> >  create mode 100644 lib/librte_pmd_bcm/bcm.c
>> >  create mode 100644 lib/librte_pmd_bcm/bcm.h
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_ethdev.c
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_ethdev.h
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_logs.h
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_rxtx.c
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_rxtx.h
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_stats.c
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_stats.h
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_vfpf.c
>> >  create mode 100644 lib/librte_pmd_bcm/bcm_vfpf.h
>> >  create mode 100644 lib/librte_pmd_bcm/debug.c
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_fw_defs.h
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_hsi.h
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_init.h
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_init_ops.h
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_mfw_req.h
>> >  create mode 100644 lib/librte_pmd_bcm/ecore_reg.h
>> >  create mode 100644 lib/li

[dpdk-dev] "virtual" C++ keyword used in rte_devargs.h and causes compilation error in C++

2015-02-08 Thread Ariel Rodriguez
"virtual" is a reserved word in c++. When the c++ compiler "g++" use that
header in a "extern way (just generate standard typo for function
identifiers)", there`s not way that the compiler posible "turn off" the
"virtual" reserved  word. If, for example, you guys use the "new" word ...
its just the same as virtual.

Regards.

On Sun, Feb 8, 2015 at 3:33 AM, Ming Zhao 
wrote:

> In fact the current rte_devargs.h header is enclosed inside  extern C {}
> block already. But it looks like it's not sufficient. Also there is also
> the case that rte_devargs.virtual field could be accessed inside a cpp
> file.
>
> On 02/07/2015 12:23 PM, Neil Horman wrote:
> > On Fri, Feb 06, 2015 at 11:24:15PM -0800, Ming Zhao wrote:
> >> The code is in rte_devargs.h:
> >>
> >> rte_devargs.h:
> >>
> >> /** Used if type is RTE_DEVTYPE_VIRTUAL. */
> >> struct {
> >> /** Driver name. */
> >> char drv_name[32];
> >> } virtual;
> >> };
> >>
> >> Which caused clang compiler to report error when this file is included
> >> by a cpp file, the error message is:
> >>
> >> In file included from net/dpdk/testing/base-test.cc:3:
> >> In file included from net/dpdk/testing/base-test.h:8:
> >> third-party/dpdk/lib/librte_eal/common/include/rte_devargs.h:89:5:
> >> error: 'virtual' can only appear on non-static
> >>   member functions
> >> } virtual;
> >>   ^
> >>
> >> I think we should try to pick another name for this field. I would
> >> suggest calling it "vdev" instead, or I'll be happy to take another name
> >> if someone comes with a different idea.
> >>
> >> Thanks!
> >> Ming
> >>
> > You could do that, but it seems like it shouldn't be necessecary.
> Shouldn't the
> > solution just be to encapsulate either the header file or the #include
> directive
> > from the C++ file with extern C { }?
> > Neil
> >
>


[dpdk-dev] [RFC PATCH] rte_timer: Fix rte_timer_reset return value

2015-02-08 Thread Olivier MATZ
Hi Robert,

On 02/06/2015 06:26 PM, Robert Sanford wrote:
> Hi Olivier,
> 
> Thanks for reviewing this patch.
> Please see my responses to your comments, below.
> 
> I also have one request for you. You probably use git almost every day.
> For people who only use git maybe once per year, could you please show
> us the exact sequence of commands that you run to prepare a patch
> series? We know there are man pages and online documents, etc, but it
> would be an extremely valuable jumpstart if you just give us a snippet
> of your shell history showing the exact commands that you run to prepare
> and email a patch series. I would much rather spend time getting the
> code right, and less time learning (by trial and error) the nuances of
> git apply, add, commit, format-patch, send-email, etc.

The following actions should work fine (please take the time to
understand them before execute them):

# start in your workspace with your patch commit
# "git log -1" shows your timer commit

# remove the commit but keep the files unmodified
git reset HEAD^

# you can see the modified files with "git diff" and "git status"

# Now do the modifications we discussed in the mail in the files

# first patch: relax cpu

# Once it's done we will add the first commit (relax cpu)
# We will add it in the cache with "git add".
# '-p' means it will ask for each hunk.
git add -p lib/librte_timer/rte_timer.c
# so "n" to the first one, and "y" to the second (with the rte_pause)

# show the cache, it should show the content of the commit
git diff --cached
# add the commit log
git commit

# second patch: first restarting

# same than above, we just want to commit a part of the diff
git add -p lib/librte_timer/rte_timer.c
# say "n" until you see the lines with "/* clean up statics, in case
# we run again */"
# Then say "e" (edit).

# In the editor, remove the line
#  "+   rte_atomic32_set(&collisions, 0);"
# (we want to have it in the 3rd patch, not this one)
# then save and exit from your editor

# show the cache, it should show the content of the commit
git diff --cached
# add the commit log
git commit

 3rd patch, the rest

# easy here
git add app/test/test_timer.c lib/librte_timer/rte_timer.c
git commit

 check compilation

git rebase -i HEAD~3
# replace all "pick" by "edit", then save and exit

# git stops at your first commit, check compilation, then:
git rebase --contine

# git stops at your 2nd commit, same...
git rebase --contine

# one last time
git rebase --contine

 send the email

# prepare a directory with your 3 patches
git format-patch -3 --cover-letter \
  --output-directory=timer-patches

# edit the cover letter
${EDITOR} timer-patches/-cover-letter.patch

# send it
git send-email --to dev at dpdk.org --cc olivier.matz at 6wind.com \
 --in-reply-to="1422996127-64370-1-git-send-email-rsanford2 at gmail.com" \
 timer-patches


Here it is. This is one solution, but probably other people do
differently.

The dpdk dev page http://dpdk.org/dev is simple but really useful to
remember the commands. The man pages of git are long, but really well
written, if you have a doubt, you can check there.


> > + ready = 0;
> 
> The lines above should go in another patch as it fixes another problem
> (+ a memory leek).
> "testpmd: allow to restart timer stress tests"
> 
> 
> Yes, I will split it into two patches: rte_timer and test_timer. But, I
> don't see much benefit in splitting test_timer.c changes into separate
> patches for each bug discovered.

There are several reasons why we want to split patches.

- Small patches are easier to understand (one problem -> one solution),
  it's easier to read for the reviewer, but also for all people that
  will browse the history later

- Smaller patches also reduce the risk to miss something, increasing
  code quality

- Some people may be maintaining their own stable dpdk branch. They can
  pick up only the patches they consider critical.

- When a developer takes time to present its patches properly, it's
  seen by the reviewers as a mark of respect by the reviewers, so they
  are happier to do the review.

> > diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> > index 269a992..d18abf5 100644
> > --- a/lib/librte_timer/rte_timer.c
> > +++ b/lib/librte_timer/rte_timer.c
> > @@ -424,10 +424,8 @@ rte_timer_reset(struct rte_timer *tim, uint64_t 
> ticks,
> >   else
> >   period = 0;
> >
> > - __rte_timer_reset(tim,  cur_time + ticks, period, tim_lcore,
> > + return __rte_timer_reset(tim,  cur_time + ticks, period, 
> tim_lcore,
> > fct, arg, 0);
> > -
> > - return 0;
> >  }
> >
> >  /* loop until rte_timer_reset() succeed */
> > @@ -437,7 +435,8 @@ rte_timer_reset_sync(struct rte_timer *tim, 
> uint64_t ticks,
> >rte_timer_cb_t fct, void *arg)
> >  {
> >   wh

[dpdk-dev] "virtual" C++ keyword used in rte_devargs.h and causes compilation error in C++

2015-02-08 Thread Neil Horman
On Sun, Feb 08, 2015 at 04:31:39AM -0300, Ariel Rodriguez wrote:
> "virtual" is a reserved word in c++. When the c++ compiler "g++" use that
> header in a "extern way (just generate standard typo for function
> identifiers)", there`s not way that the compiler posible "turn off" the
> "virtual" reserved  word. If, for example, you guys use the "new" word ...
> its just the same as virtual.
> 
Hmm, you're right, it seems the C++ specification only uses the extern
"Language" construct to define linkage behavior, it in no way defines keyword
recognition or behavior.

Truthfully, while this would be easy to fix by just changing the name, it might
be a good oportunity to make the rte_devargs struct private.  It really doesn't
need to be public at all, if you had an accessor/iterator function for the list,
and that would implicitly solve the problem by moving all the identifiers out of
any publically accessible namespace

Neil

> Regards.
> 
> On Sun, Feb 8, 2015 at 3:33 AM, Ming Zhao 
> wrote:
> 
> > In fact the current rte_devargs.h header is enclosed inside  extern C {}
> > block already. But it looks like it's not sufficient. Also there is also
> > the case that rte_devargs.virtual field could be accessed inside a cpp
> > file.
> >
> > On 02/07/2015 12:23 PM, Neil Horman wrote:
> > > On Fri, Feb 06, 2015 at 11:24:15PM -0800, Ming Zhao wrote:
> > >> The code is in rte_devargs.h:
> > >>
> > >> rte_devargs.h:
> > >>
> > >> /** Used if type is RTE_DEVTYPE_VIRTUAL. */
> > >> struct {
> > >> /** Driver name. */
> > >> char drv_name[32];
> > >> } virtual;
> > >> };
> > >>
> > >> Which caused clang compiler to report error when this file is included
> > >> by a cpp file, the error message is:
> > >>
> > >> In file included from net/dpdk/testing/base-test.cc:3:
> > >> In file included from net/dpdk/testing/base-test.h:8:
> > >> third-party/dpdk/lib/librte_eal/common/include/rte_devargs.h:89:5:
> > >> error: 'virtual' can only appear on non-static
> > >>   member functions
> > >> } virtual;
> > >>   ^
> > >>
> > >> I think we should try to pick another name for this field. I would
> > >> suggest calling it "vdev" instead, or I'll be happy to take another name
> > >> if someone comes with a different idea.
> > >>
> > >> Thanks!
> > >> Ming
> > >>
> > > You could do that, but it seems like it shouldn't be necessecary.
> > Shouldn't the
> > > solution just be to encapsulate either the header file or the #include
> > directive
> > > from the C++ file with extern C { }?
> > > Neil
> > >
> >


[dpdk-dev] [PATCH v3 0/5] New Reorder Library

2015-02-08 Thread Neil Horman
On Fri, Feb 06, 2015 at 03:05:59PM +, Sergio Gonzalez Monroy wrote:
> This series introduces the new reorder library along with unit tests,
> sample app and a new entry in the programmers guide describing the library.
> 
> The library provides reordering of mbufs based on their sequence number.
> 
> As mention in the patch describing the library, one use case is the
> packet distributor.
> The distributor receives packets, assigns them a sequence number and sends
> them to the workers.
> The workers process those packets and return them to the distributor.
> The distributor collects out-of-order packets from the workers and uses
> this library to reorder the packets based on the sequence number they
> were assigned.
> 
> v3:
>  - fix copyright date
>  - add option to sample app to disable reordering
>  - add packet ordering sample guide entry
> 
> v2:
>  - add programmers guide entry describing the library
>  - use malloc instead of memzone to allocate memory
>  - modify create and init implementation, init takes a reorder buffer as input
>and create reserves memory and call init.
>  - update unit tests
> 
> Sergio Gonzalez Monroy (5):
>   reorder: new reorder library
>   app: New reorder unit test
>   examples: new sample app packet_ordering
>   doc: new reorder library description
>   doc: new packet ordering app description
> 
>  app/test/Makefile  |   2 +
>  app/test/test_reorder.c| 393 ++
>  config/common_bsdapp   |   5 +
>  config/common_linuxapp |   5 +
>  doc/guides/prog_guide/index.rst|   1 +
>  doc/guides/prog_guide/reorder_lib.rst  | 115 
>  doc/guides/sample_app_ug/index.rst |   1 +
>  doc/guides/sample_app_ug/packet_ordering.rst   | 102 
>  examples/packet_ordering/Makefile  |  50 ++
>  examples/packet_ordering/main.c| 695 
> +
>  lib/Makefile   |   1 +
>  lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
>  lib/librte_mbuf/rte_mbuf.h |   3 +
>  lib/librte_reorder/Makefile|  50 ++
>  lib/librte_reorder/rte_reorder.c   | 416 +++
>  lib/librte_reorder/rte_reorder.h   | 181 +++
>  mk/rte.app.mk  |   4 +
>  17 files changed, 2026 insertions(+)
>  create mode 100644 app/test/test_reorder.c
>  create mode 100644 doc/guides/prog_guide/reorder_lib.rst
>  create mode 100644 doc/guides/sample_app_ug/packet_ordering.rst
>  create mode 100644 examples/packet_ordering/Makefile
>  create mode 100644 examples/packet_ordering/main.c
>  create mode 100644 lib/librte_reorder/Makefile
>  create mode 100644 lib/librte_reorder/rte_reorder.c
>  create mode 100644 lib/librte_reorder/rte_reorder.h
> 
> -- 
> 1.9.3
> 
> 
Functionally it looks pretty good.  You need to define an export map and
LIBABIVER macro in your makefile though.
Neil



[dpdk-dev] [PATCH] enic: silence log message

2015-02-08 Thread Stephen Hemminger
From: Stephen Hemminger 

Silence is normal. drivers should speak only when spoken to and not
be chatty.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_pmd_enic/enic_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_pmd_enic/enic_main.c b/lib/librte_pmd_enic/enic_main.c
index 48fdca2..dad8922 100644
--- a/lib/librte_pmd_enic/enic_main.c
+++ b/lib/librte_pmd_enic/enic_main.c
@@ -1046,8 +1046,6 @@ int enic_probe(struct enic *enic)
struct rte_pci_device *pdev = enic->pdev;
int err = -1;

-   dev_info(enic, " Initializing ENIC PMD version %s\n", DRV_VERSION);
-
enic->bar0.vaddr = (void *)pdev->mem_resource[0].addr;
enic->bar0.len = pdev->mem_resource[0].len;

-- 
2.1.4



[dpdk-dev] [PATCH] enic: silence log message

2015-02-08 Thread Stephen Hemminger
From: Stephen Hemminger 

Silence is normal. drivers should speak only when spoken to and not
be chatty.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_pmd_enic/enic_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_pmd_enic/enic_main.c b/lib/librte_pmd_enic/enic_main.c
index 48fdca2..dad8922 100644
--- a/lib/librte_pmd_enic/enic_main.c
+++ b/lib/librte_pmd_enic/enic_main.c
@@ -1046,8 +1046,6 @@ int enic_probe(struct enic *enic)
struct rte_pci_device *pdev = enic->pdev;
int err = -1;

-   dev_info(enic, " Initializing ENIC PMD version %s\n", DRV_VERSION);
-
enic->bar0.vaddr = (void *)pdev->mem_resource[0].addr;
enic->bar0.len = pdev->mem_resource[0].len;

-- 
2.1.4



[dpdk-dev] [PATCH] enic: silence log message

2015-02-08 Thread Stephen Hemminger
From: Stephen Hemminger 

Silence is normal. drivers should speak only when spoken to and not
be chatty.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_pmd_enic/enic_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_pmd_enic/enic_main.c b/lib/librte_pmd_enic/enic_main.c
index 48fdca2..dad8922 100644
--- a/lib/librte_pmd_enic/enic_main.c
+++ b/lib/librte_pmd_enic/enic_main.c
@@ -1046,8 +1046,6 @@ int enic_probe(struct enic *enic)
struct rte_pci_device *pdev = enic->pdev;
int err = -1;

-   dev_info(enic, " Initializing ENIC PMD version %s\n", DRV_VERSION);
-
enic->bar0.vaddr = (void *)pdev->mem_resource[0].addr;
enic->bar0.len = pdev->mem_resource[0].len;

-- 
2.1.4



[dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
> as the lcore no longer always 1:1 pinning with physical cpu.
> The lcore now stands for a EAL thread rather than a logical cpu.
> 
> It doesn't change the default behavior of 1:1 mapping, but allows to
> affinity the EAL thread to multiple cpus.
> 
> [...]
> diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c 
> b/lib/librte_eal/bsdapp/eal/eal_memory.c
> index 65ee87d..a34d500 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
> @@ -45,6 +45,8 @@
>  #include "eal_internal_cfg.h"
>  #include "eal_filesystem.h"
>  
> +/* avoid re-defined against with freebsd header */
> +#undef PAGE_SIZE
>  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))

I don't see the link with the patch. Should this go somewhere else?


>  
>  /*
> diff --git a/lib/librte_eal/common/include/rte_lcore.h 
> b/lib/librte_eal/common/include/rte_lcore.h
> index 49b2c03..4c7d6bb 100644
> --- a/lib/librte_eal/common/include/rte_lcore.h
> +++ b/lib/librte_eal/common/include/rte_lcore.h
> @@ -50,6 +50,13 @@ extern "C" {
>  
>  #define LCORE_ID_ANY -1/**< Any lcore. */
>  
> +#if defined(__linux__)
> + typedef cpu_set_t rte_cpuset_t;
> +#elif defined(__FreeBSD__)
> +#include 
> + typedef cpuset_t rte_cpuset_t;
> +#endif
> +

Should we also define RTE_CPU_SETSIZE?
For linux, should  be included?

If I understand well, after the patch series, the user of
rte_thread_set_affinity() and rte_thread_get_affinity() are
supposed to use the macros from sched.h to access to this
cpuset parameter. So I'm wondering if it's not better to
use cpu_set_t from libc instead of redefining rte_cpuset_t.

To reword my question: what is the purpose of redefining
cpu_set_t in rte_cpuset_t if we still need to use all the
libc API to access to it?


Regards,
Olivier


[dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> It supports one new eal long option '--lcores' for EAL thread cpuset 
> assignment.
> 
> The format pattern:
>   --lcores='lcores[@cpus]<,lcores[@cpus]>'
> lcores, cpus could be a single digit/range or a group.
> '(' and ')' are necessary if it's a group.
> If not supply '@cpus', the value of cpus uses the same as lcores.
> 
> e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL thread as below
>   lcore 0 runs on cpuset 0x41 (cpu 0,6)
>   lcore 1 runs on cpuset 0x2 (cpu 1)
>   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
>   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
>   lcore 6 runs on cpuset 0x41 (cpu 0,6)
>   lcore 7 runs on cpuset 0x80 (cpu 7)
>   lcore 8 runs on cpuset 0x100 (cpu 8)
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/common/eal_common_launch.c  |   1 -
>  lib/librte_eal/common/eal_common_options.c | 300 
> -
>  lib/librte_eal/common/eal_options.h|   2 +
>  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
>  4 files changed, 299 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_launch.c 
> b/lib/librte_eal/common/eal_common_launch.c
> index 599f83b..2d732b1 100644
> --- a/lib/librte_eal/common/eal_common_launch.c
> +++ b/lib/librte_eal/common/eal_common_launch.c
> @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
>   rte_eal_wait_lcore(lcore_id);
>   }
>  }
> -


This line should be removed from the patch.


> diff --git a/lib/librte_eal/common/eal_common_options.c 
> b/lib/librte_eal/common/eal_common_options.c
> index 67e02dc..29ebb6f 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -45,6 +45,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "eal_internal_cfg.h"
>  #include "eal_options.h"
> @@ -85,6 +86,7 @@ eal_long_options[] = {
>   {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
>   {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
>   {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> + {OPT_LCORES, 1, 0, OPT_LCORES_NUM},
>   {0, 0, 0, 0}
>  };
>  
> @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
>   if (min == RTE_MAX_LCORE)
>   min = idx;
>   for (idx = min; idx <= max; idx++) {
> - cfg->lcore_role[idx] = ROLE_RTE;
> - lcore_config[idx].core_index = count;
> - count++;
> + if (cfg->lcore_role[idx] != ROLE_RTE) {
> + cfg->lcore_role[idx] = ROLE_RTE;
> + lcore_config[idx].core_index = count;
> + count++;
> + }
>   }
>   min = RTE_MAX_LCORE;
>   } else
> @@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
>   return 0;
>  }
>  
> +/*
> + * Parse elem, the elem could be single number/range or '(' ')' group
> + * Within group elem, '-' used for a range seperator;
> + *',' used for a single number.
> + */
> +static int
> +eal_parse_set(const char *input, uint16_t set[], unsigned num)

It's not very clear what elem is. Maybe it could be a bit reworded.
What about naming the function "eal_parse_cpuset()" instead?


> +{
> + unsigned idx;
> + const char *str = input;
> + char *end = NULL;
> + unsigned min, max;
> +
> + memset(set, 0, num * sizeof(uint16_t));
> +
> + while (isblank(*str))
> + str++;
> +
> + /* only digit or left bracket is qulify for start point */
> + if ((!isdigit(*str) && *str != '(') || *str == '\0')
> + return -1;
> +
> + /* process single number or single range of number */
> + if (*str != '(') {
> + errno = 0;
> + idx = strtoul(str, &end, 10);
> + if (errno || end == NULL || idx >= num)
> + return -1;
> + else {
> + while (isblank(*end))
> + end++;
> +
> + min = idx;
> + max = idx;
> + if (*end == '-') {
> + /* proccess single - */
> + end++;
> + while (isblank(*end))
> + end++;
> + if (!isdigit(*end))
> + return -1;
> +
> + errno = 0;
> + idx = strtoul(end, &end, 10);
> + if (errno || end == NULL || idx >= num)
> + return -1;
> + max = idx;
> + while (isblank(*end))
> + end++;
> +  

[dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The problem is that strnlen() here may return invalid value with 32bit icc.
> (actually it returns it?s second parameter,e.g: sysconf(_SC_ARG_MAX)).
> It starts to manifest hwen max_len parameter is > 2M and using icc ?m32 ?O2 
> (or above).
> 
> Suggested-by: Konstantin Ananyev 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/common/eal_common_options.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_options.c 
> b/lib/librte_eal/common/eal_common_options.c
> index 29ebb6f..22d5d37 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -227,7 +227,7 @@ eal_parse_corelist(const char *corelist)
>   /* Remove all blank characters ahead and after */
>   while (isblank(*corelist))
>   corelist++;
> - i = strnlen(corelist, sysconf(_SC_ARG_MAX));
> + i = strnlen(corelist, PATH_MAX);
>   while ((i > 0) && isblank(corelist[i - 1]))
>   i--;
>  
> @@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
>   /* Remove all blank characters ahead and after */
>   while (isblank(*lcores))
>   lcores++;
> - i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> + i = strnlen(lcores, PATH_MAX);
>   while ((i > 0) && isblank(lcores[i - 1]))
>   i--;
>  
> 

I think PATH_MAX is not equivalent to _SC_ARG_MAX.

But the main question is: why do we need to use strnlen() here instead
of strlen? We can expect that argv[] pointers are always nul-terminated.
Replacing them by strlen() would probably also solve the icc issue.

Regards,
Olivier


[dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> It returns the socket_id if all cpus in the cpuset belongs
> to the same NUMA node, otherwise it will return SOCKET_ID_ANY.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +
>  lib/librte_eal/common/eal_thread.h  | 52 
> +
>  lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +
>  3 files changed, 66 insertions(+)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c 
> b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> index 72f8ac2..162fb4f 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> @@ -41,6 +41,7 @@
>  #include 
>  
>  #include "eal_private.h"
> +#include "eal_thread.h"
>  
>  /* No topology information available on FreeBSD including NUMA info */
>  #define cpu_core_id(X) 0
> @@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
>  
>   return 0;
>  }
> +
> +unsigned
> +eal_cpu_socket_id(__rte_unused unsigned cpu_id)
> +{
> + return cpu_socket_id(cpu_id);
> +}
> diff --git a/lib/librte_eal/common/eal_thread.h 
> b/lib/librte_eal/common/eal_thread.h
> index b53b84d..a25ee86 100644
> --- a/lib/librte_eal/common/eal_thread.h
> +++ b/lib/librte_eal/common/eal_thread.h
> @@ -34,6 +34,10 @@
>  #ifndef EAL_THREAD_H
>  #define EAL_THREAD_H
>  
> +#include 
> +
> +#include 
> +
>  /**
>   * basic loop of thread, called for each thread by eal_init().
>   *
> @@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
>   */
>  void eal_thread_init_master(unsigned lcore_id);
>  
> +/**
> + * Get the NUMA socket id from cpu id.
> + * This function is private to EAL.
> + *
> + * @param cpu_id
> + *   The logical process id.
> + * @return
> + *   socket_id or SOCKET_ID_ANY
> + */
> +unsigned eal_cpu_socket_id(unsigned cpu_id);

Wouldn't it be better to rename the existing function cpu_socket_id()
in eal_cpu_socket_id() and export it in eal_thread.h?

In case of bsd where cpu_socket_id() is implemented using a #define,
a new function should be created returning 0.


> +
> +/**
> + * Get the NUMA socket id from cpuset.
> + * This function is private to EAL.
> + *
> + * @param cpusetp
> + *   The point to a valid cpu set.
> + * @return
> + *   socket_id or SOCKET_ID_ANY
> + */
> +static inline int
> +eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
> +{
> + unsigned cpu = 0;
> + int socket_id = SOCKET_ID_ANY;
> + int sid;
> +
> + if (cpusetp == NULL)
> + return SOCKET_ID_ANY;

SOCKET_ID_ANY is not defined, maybe  should be included
somewhere.

> +
> + do {
> + if (!CPU_ISSET(cpu, cpusetp))
> + continue;
> +
> + if (socket_id == SOCKET_ID_ANY)
> + socket_id = eal_cpu_socket_id(cpu);
> +
> + sid = eal_cpu_socket_id(cpu);
> + if (socket_id != sid) {
> + socket_id = SOCKET_ID_ANY;
> + break;
> + }
> +
> + } while (++cpu < RTE_MAX_LCORE);
> +
> + return socket_id;
> +}


I don't think this function should be inlined.

As this function is not used, it could be interesting for reviewers
to understand when

> +
>  #endif /* EAL_THREAD_H */
> diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c 
> b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> index 29615f8..922af6d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> @@ -45,6 +45,7 @@
>  
>  #include "eal_private.h"
>  #include "eal_filesystem.h"
> +#include "eal_thread.h"
>  
>  #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
>  #define CORE_ID_FILE "topology/core_id"
> @@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
>  
>   return 0;
>  }
> +
> +unsigned
> +eal_cpu_socket_id(unsigned cpu_id)
> +{
> + return cpu_socket_id(cpu_id);
> +}
> 


[dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> 1. add two TLS *_socket_id* and *_cpuset*
> 2. add two external API rte_thread_set/get_affinity
> 3. add one internal API eal_thread_dump_affinity

To me, it's a bit strage to add an API withtout the associated code.
Maybe you have a good reason to do that, but I think in this case it
should be explained in the commit log.

> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/bsdapp/eal/eal_thread.c|  2 ++
>  lib/librte_eal/common/eal_thread.h| 14 ++
>  lib/librte_eal/common/include/rte_lcore.h | 29 +++--
>  lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
>  4 files changed, 45 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index ab05368..10220c7 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -56,6 +56,8 @@
>  #include "eal_thread.h"
>  
>  RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
>  
>  /*
>   * Send a message to a slave lcore identified by slave_id to call a
> diff --git a/lib/librte_eal/common/eal_thread.h 
> b/lib/librte_eal/common/eal_thread.h
> index a25ee86..28edf51 100644
> --- a/lib/librte_eal/common/eal_thread.h
> +++ b/lib/librte_eal/common/eal_thread.h
> @@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
>   return socket_id;
>  }
>  
> +/**
> + * Dump the current pthread cpuset.
> + * This function is private to EAL.
> + *
> + * @param str
> + *   The string buffer the cpuset will dump to.
> + * @param size
> + *   The string buffer size.
> + */
> +#define CPU_STR_LEN256
> +void
> +eal_thread_dump_affinity(char str[], unsigned size);

Although it's equivalent for function arguments, I think "char *str" is
usually preferred over "char str[]". See for instance in snprintf() or
fgets().

What is the purpose of CPU_STR_LEN?

What occurs if the size of the dump is greater than the size of the
given buffer? Is the string truncated? Is there a \0 at the end?
This should be described in the API comments. Maybe adding a return
value could help the user to determine if the string was truncated.

> +
> +
>  #endif /* EAL_THREAD_H */
> diff --git a/lib/librte_eal/common/include/rte_lcore.h 
> b/lib/librte_eal/common/include/rte_lcore.h
> index 4c7d6bb..facdbdc 100644
> --- a/lib/librte_eal/common/include/rte_lcore.h
> +++ b/lib/librte_eal/common/include/rte_lcore.h
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -80,7 +81,9 @@ struct lcore_config {
>   */
>  extern struct lcore_config lcore_config[RTE_MAX_LCORE];
>  
> -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
> +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
> +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
> +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
>  
>  /**
>   * Return the ID of the execution unit we are running on.
> @@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
>  static inline unsigned
>  rte_socket_id(void)
>  {
> - return lcore_config[rte_lcore_id()].socket_id;
> + return RTE_PER_LCORE(_socket_id);
>  }

I don't see where the _socket_id variable is assigned. I think there
is probably an issue with the splitting of the patches.

Regards,
Olivier


[dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The API works for both EAL thread and none EAL thread.
> When calling rte_thread_set_affinity, the *_socket_id* and
> *_cpuset* of calling thread will be updated if the thread
> successful set the cpu affinity.
> 
> [...]
> +int
> +rte_thread_set_affinity(rte_cpuset_t *cpusetp)
> +{
> + int s;
> + unsigned lcore_id;
> + pthread_t tid;
> +
> + if (!cpusetp)
> + return -1;

Is it really needed to test that cpusetp is not NULL?

> +
> + lcore_id = rte_lcore_id();
> + if (lcore_id != (unsigned)LCORE_ID_ANY) {

This is strange to see something that cannot happen:
lcore_id == LCORE_ID_ANY is only possible after your patch is 12/17
is added. Maybe it can be reordered to avoid this inconsistency?

> + /* EAL thread */
> + tid = lcore_config[lcore_id].thread_id;
> +
> + s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> + if (s != 0) {
> + RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> + return -1;
> + }
> +
> + /* store socket_id in TLS for quick access */
> + RTE_PER_LCORE(_socket_id) =
> + eal_cpuset_socket_id(cpusetp);
> +
> + /* store cpuset in TLS for quick access */
> + rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> +sizeof(rte_cpuset_t));
> +
> + /* update lcore_config */
> + lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
> + rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
> +sizeof(rte_cpuset_t));
> + } else {
> + /* none EAL thread */
> + tid = pthread_self();
> +
> + s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> + if (s != 0) {
> + RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> + return -1;
> + }
> +
> + /* store cpuset in TLS for quick access */
> + rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> +sizeof(rte_cpuset_t));
> +
> + /* store socket_id in TLS for quick access */
> + RTE_PER_LCORE(_socket_id) =
> + eal_cpuset_socket_id(cpusetp);
> + }

Why not always using pthread_self() to get the tid?

I think most of the code could be factorized here. The only difference
(which is hard to see as is as code is not exactly ordered in the same
manner) is that the config is updated in case it's an EAL thread.



> +
> + return 0;
> +}
> +
> +int
> +rte_thread_get_affinity(rte_cpuset_t *cpusetp)
> +{
> + if (!cpusetp)
> + return -1;

Same here. This is the only reason why rte_thread_get_affinity() could
fail. Removing this test would allow to change the API to return void
instead. It will avoid a useless test below in
eal_thread_dump_affinity().

> +
> + rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
> +sizeof(rte_cpuset_t));
> +
> + return 0;
> +}
> +
> +void
> +eal_thread_dump_affinity(char str[], unsigned size)
> +{
> + rte_cpuset_t cpuset;
> + unsigned cpu;
> + int ret;
> + unsigned int out = 0;
> +
> + if (rte_thread_get_affinity(&cpuset) < 0) {
> + str[0] = '\0';
> + return;
> + }

This one could be removed it the (== NULL) test is removed.

> +
> + for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
> + if (!CPU_ISSET(cpu, &cpuset))
> + continue;
> +
> + ret = snprintf(str + out,
> +size - out, "%u,", cpu);
> + if (ret < 0 || (unsigned)ret >= size - out)
> + break;

On the contrary, I think here returning an error to the user
would be useful so he can knows that the dump is not complete.


Regards,
Olivier


[dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The rte_gettid() wraps the linux and freebsd syscall gettid().
> It provides a persistent unique thread id for the calling thread.
> It will save the unique id in TLS on the first time.
> 
> [...]
>
> +/**
> + * A wrap API for syscall gettid.
> + *
> + * @return
> + *   On success, returns the thread ID of calling process.
> + *   It always successful.
> + */
> +int rte_sys_gettid(void);
> +
> +/**
> + * Get system unique thread id.
> + *
> + * @return
> + *   On success, returns the thread ID of calling process.
> + *   It always successful.
> + */
> +static inline int rte_gettid(void)
> +{
> + static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
> + if (RTE_PER_LCORE(_thread_id) == -1)
> + RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
> + return RTE_PER_LCORE(_thread_id);
> +}

Instead of doing the test each time rte_gettid() is called, why not
having 2 functions:
  rte_init_tid() -> assign the per_lcore variable
  rte_gettid() -> return the per_lcore variable



Regards,
Olivier


[dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> EAL threads use assigned cpuset to set core affinity during startup.
> It keeps 1:1 mapping, if no '--lcores' option is used.
> 
> [...]
>
>  lib/librte_eal/bsdapp/eal/eal.c  | 13 ---
>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +-
>  lib/librte_eal/linuxapp/eal/eal.c|  7 +++-
>  lib/librte_eal/linuxapp/eal/eal_thread.c | 67 
> +++-
>  4 files changed, 54 insertions(+), 96 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index 69f3c03..98c5a83 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
>   int i, fctret, ret;
>   pthread_t thread_id;
>   static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
> + char cpuset[CPU_STR_LEN];
>  
>   if (!rte_atomic32_test_and_set(&run_once))
>   return -1;
> @@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
>   if (rte_eal_pci_init() < 0)
>   rte_panic("Cannot init PCI\n");
>  
> - RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
> - rte_config.master_lcore, thread_id);
> -
>   eal_check_mem_on_local_socket();
>  
>   rte_eal_mcfg_complete();
>  
> + eal_thread_init_master(rte_config.master_lcore);
> +
> + eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
> +
> + RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
> + rte_config.master_lcore, thread_id, cpuset);
> +
>   if (rte_eal_dev_init() < 0)
>   rte_panic("Cannot init pmd devices\n");
>  
> @@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
>   rte_panic("Cannot create thread\n");
>   }
>  
> - eal_thread_init_master(rte_config.master_lcore);
> -
>   /*
>* Launch a dummy function on all slave lcores, so that master lcore
>* knows they are all ready when this function returns.

I wonder if changing this may have an impact on third-party drivers
that already use a management thread. Before the patch, the init()
function of the external library was called with default affinities,
and now it's called with the affinity from master lcore.

I think it should at least be noticed in the commit log.

Why are you doing this change? (I don't say it's a bad change, but
I don't understand why you are doing it here)


> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index d0c077b..5b16302 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
>  {
>   int s;
>   pthread_t thread;
> -
> -/*
> - * According to the section VERSIONS of the CPU_ALLOC man page:
> - *
> - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
> - * in glibc 2.3.3.
> - *
> - * CPU_COUNT() first appeared in glibc 2.6.
> - *
> - * CPU_AND(), CPU_OR(), CPU_XOR(),CPU_EQUAL(),CPU_ALLOC(),
> - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
> - * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
> - * first appeared in glibc 2.7.
> - */
> -#if defined(CPU_ALLOC)
> - size_t size;
> - cpu_set_t *cpusetp;
> -
> - cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
> - if (cpusetp == NULL) {
> - RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
> - return -1;
> - }
> -
> - size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
> - CPU_ZERO_S(size, cpusetp);
> - CPU_SET_S(rte_lcore_id(), size, cpusetp);
> + unsigned lcore_id = rte_lcore_id();
>  
>   thread = pthread_self();
> - s = pthread_setaffinity_np(thread, size, cpusetp);
> + s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
> +&lcore_config[lcore_id].cpuset);
>   if (s != 0) {
>   RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> - CPU_FREE(cpusetp);
>   return -1;
>   }
>  
> - CPU_FREE(cpusetp);
> -#else /* CPU_ALLOC */
> - cpuset_t cpuset;
> - CPU_ZERO( &cpuset );
> - CPU_SET( rte_lcore_id(), &cpuset );
> + /* acquire system unique id  */
> + rte_gettid();

As suggested in the previous patch, I think having rte_init_tid() would
be clearer here.

> +
> + /* store socket_id in TLS for quick access */
> + RTE_PER_LCORE(_socket_id) =
> + eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
> +
> + CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
> +
> + lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
>  
> - thread = pthread_self();
> - s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
> - if (s != 0) {
> - RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> -

[dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Some macro already been defined by freebsd 'sys/param.h'.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_pmd_enic/enic.h| 1 +
>  lib/librte_pmd_enic/enic_compat.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
> index c43417c..189c3b9 100644
> --- a/lib/librte_pmd_enic/enic.h
> +++ b/lib/librte_pmd_enic/enic.h
> @@ -66,6 +66,7 @@
>  #define ENIC_CALC_IP_CKSUM  1
>  #define ENIC_CALC_TCP_UDP_CKSUM 2
>  #define ENIC_MAX_MTU9000
> +#undef PAGE_SIZE
>  #define PAGE_SIZE   4096
>  #define PAGE_ROUND_UP(x) \
>   unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
> diff --git a/lib/librte_pmd_enic/enic_compat.h 
> b/lib/librte_pmd_enic/enic_compat.h
> index b1af838..b84c766 100644
> --- a/lib/librte_pmd_enic/enic_compat.h
> +++ b/lib/librte_pmd_enic/enic_compat.h
> @@ -67,6 +67,7 @@
>  #define pr_warn(y, args...) dev_warning(0, y, ##args)
>  #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
>  
> +#undef ALIGN
>  #define ALIGN(x, a)  __ALIGN_MASK(x, (typeof(x))(a)-1)
>  #define __ALIGN_MASK(x, mask)(((x)+(mask))&~(mask))
>  #define udelay usleep
> 

Is the issue caused by a change you've made previously in the patch
series?

Wouldn't it be better to rename the macros in enic instead of doing
#undef?

Regards,
Olivier


[dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Add check for rte_socket_id(), avoid get unexpected return like (-1).
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_malloc/malloc_heap.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
> index b4aec45..a47136d 100644
> --- a/lib/librte_malloc/malloc_heap.h
> +++ b/lib/librte_malloc/malloc_heap.h
> @@ -44,7 +44,12 @@ extern "C" {
>  static inline unsigned
>  malloc_get_numa_socket(void)
>  {
> - return rte_socket_id();
> + unsigned socket_id = rte_socket_id();
> +
> + if (socket_id == (unsigned)SOCKET_ID_ANY)
> + return 0;
> +
> + return socket_id;
>  }
>  
>  void *
> 

The documentation off rte_malloc_socket() says:

@param socket
  NUMA socket to allocate memory on. If SOCKET_ID_ANY is used, this
  function will behave the same as rte_malloc().

void *
rte_malloc_socket(const char *type, size_t size, unsigned align, int
socket);


Your patch changes the behavior of rte_malloc() without explaining
why, and the documentation becomes wrong.

Can you explain why you need this change?

Regards,
Olivier


[dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For those non-EAL thread, *_lcore_id* is invalid and probably larger than 
> RTE_MAX_LCORE.
> The patch adds the check and allows only EAL thread using EAL per thread log 
> level and log type.
> Others shares the global log level.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/common/eal_common_log.c  | 17 +++--
>  lib/librte_eal/common/include/rte_log.h |  5 +
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_log.c 
> b/lib/librte_eal/common/eal_common_log.c
> index cf57619..e8dc94a 100644
> --- a/lib/librte_eal/common/eal_common_log.c
> +++ b/lib/librte_eal/common/eal_common_log.c
> @@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
>   rte_logs.type &= (~type);
>  }
>  
> +/* Get global log type */
> +uint32_t
> +rte_get_log_type(void)
> +{
> + return rte_logs.type;
> +}
> +
>  /* get the current loglevel for the message beeing processed */
>  int rte_log_cur_msg_loglevel(void)
>  {
>   unsigned lcore_id;
>   lcore_id = rte_lcore_id();
> + if (lcore_id >= RTE_MAX_LCORE)
> + return rte_get_log_level();
>   return log_cur_msg[lcore_id].loglevel;
>  }
>  
> @@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
>  {
>   unsigned lcore_id;
>   lcore_id = rte_lcore_id();
> + if (lcore_id >= RTE_MAX_LCORE)
> + return rte_get_log_type();
>   return log_cur_msg[lcore_id].logtype;
>  }
>  
> @@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
>  
>   /* save loglevel and logtype in a global per-lcore variable */
>   lcore_id = rte_lcore_id();
> - log_cur_msg[lcore_id].loglevel = level;
> - log_cur_msg[lcore_id].logtype = logtype;
> + if (lcore_id < RTE_MAX_LCORE) {
> + log_cur_msg[lcore_id].loglevel = level;
> + log_cur_msg[lcore_id].logtype = logtype;
> + }
>  
>   ret = vfprintf(f, format, ap);
>   fflush(f);
> diff --git a/lib/librte_eal/common/include/rte_log.h 
> b/lib/librte_eal/common/include/rte_log.h
> index db1ea08..f83a0d9 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
>  void rte_set_log_type(uint32_t type, int enable);
>  
>  /**
> + * Get the global log type.
> + */
> +uint32_t rte_get_log_type(void);
> +
> +/**
>   * Get the current loglevel for the message being processed.
>   *
>   * Before calling the user-defined stream for logging, the log
> 

Wouldn't it be better to change the variable:
static struct log_cur_msg log_cur_msg[RTE_MAX_LCORE];
into a pthread (tls) variable?

With your patch, the log level and log type are not saved for
non-EAL threads. If TLS were used, I think it would work in any case.

Regards,
Olivier


[dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For those none EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
> The libraries using *_lcore_id* as index need to take care.
> *_socket_id* always be SOCKET_ID_ANY unitl the thread changes the affinity

unitl -> until

> by rte_thread_set_affinity()
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
>  lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c 
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index 5b16302..2b3c9a8 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -56,8 +56,8 @@
>  #include "eal_private.h"
>  #include "eal_thread.h"
>  
> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
>  
>  /*
> diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c 
> b/lib/librte_eal/linuxapp/eal/eal_thread.c
> index 6eb1525..ab94e20 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
> @@ -57,8 +57,8 @@
>  #include "eal_private.h"
>  #include "eal_thread.h"
>  
> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

As far as I understand, now a rte_lcore_id() can return LCORE_ID_ANY.
This should be modified in the rte_lcore_id() API comments.

Same for rte_socket_id().

I also wonder if the API of these functions should be modified to
return an int instead of an unsigned as LCORE_ID_ANY is -1.

Regards,
Olivier


[dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread

2015-02-08 Thread Olivier MATZ
Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For non-EAL thread, bypass per lcore cache, directly use ring pool.
> It allows using rte_mempool in either EAL thread or any user pthread.
> As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> It doesn't suggest to run multi-pthread/cpu which compete the rte_mempool.
> It will get bad performance and has critical risk if scheduling policy is RT.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_mempool/rte_mempool.h | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index 3314651..4845f27 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -198,10 +198,12 @@ struct rte_mempool {
>   *   Number to add to the object-oriented statistics.
>   */
>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> -#define __MEMPOOL_STAT_ADD(mp, name, n) do { \
> - unsigned __lcore_id = rte_lcore_id();   \
> - mp->stats[__lcore_id].name##_objs += n; \
> - mp->stats[__lcore_id].name##_bulk += 1; \
> +#define __MEMPOOL_STAT_ADD(mp, name, n) do {\
> + unsigned __lcore_id = rte_lcore_id();   \
> + if (__lcore_id < RTE_MAX_LCORE) {   \
> + mp->stats[__lcore_id].name##_objs += n; \
> + mp->stats[__lcore_id].name##_bulk += 1; \
> + }   \

Does it mean that we have no statistics for non-EAL threads?
(same question for rings and timers in the next patches)


>   } while(0)
>  #else
>  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const 
> *obj_table,
>   __MEMPOOL_STAT_ADD(mp, put, n);
>  
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> - /* cache is not enabled or single producer */
> - if (unlikely(cache_size == 0 || is_mp == 0))
> + /* cache is not enabled or single producer or none EAL thread */
> + if (unlikely(cache_size == 0 || is_mp == 0 ||
> +  lcore_id >= RTE_MAX_LCORE))
>   goto ring_enqueue;
>  
>   /* Go straight to ring if put would overflow mem allocated for cache */
> @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void 
> **obj_table,
>   uint32_t cache_size = mp->cache_size;
>  
>   /* cache is not enabled or single consumer */
> - if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> + if (unlikely(cache_size == 0 || is_mc == 0 ||
> +  n >= cache_size || lcore_id >= RTE_MAX_LCORE))
>   goto ring_dequeue;
>  
>   cache = &mp->local_cache[lcore_id];
> 

What is the performance impact of adding this test?


Regards,
Olivier