[dpdk-dev] [PATCH v3 00/10] split architecture specific operations

2014-11-03 Thread Chao CH Zhu
David,

I acked the set of patches.

Thanks a lot!

Best Regards!
--
Chao Zhu 




From:   David Marchand 
To: dev at dpdk.org
Cc: Chao CH Zhu/China/IBM at IBMCN
Date:   2014/10/28 20:50
Subject:[PATCH v3 00/10] split architecture specific operations



The set of patches split x86 architecture specific operations from DPDK 
and put
them to x86 arch directory.
This will make the adoption of DPDK much easier on other computer 
architecture.
For a new architecture, just add an architecture specific directory and
necessary building configuration files, then DPDK eal library can support 
it.


Reviewing patchset from Chao, I ended up modifying it along the way,
so here is a new iteration of this patchset.

Changes since Chao v2 patchset :

- added a preliminary patch for moving rte_atomic.h (for better 
readability)
- fixed documentation generation
- implemented a generic header for each arch specific header (cpuflags, 
memcpy,
  prefetch were missing)
- removed C++ stuff from generic headers
- centralised all doxygen stuff in generic headers (no need to have 
duplicates)
- refactored rte_cycles functions
- moved vmware tsc stuff to arch rte_cycles.h headers
- finished x86 factorisation


Little summary of current state :

- all applications continue to include the eal headers as before, these 
headers
  are the arch-specific ones
- the arch specific headers always include the generic ones. The generic 
headers
  contain the doxygen documentation and code common to all architectures
- a x86 architecture has been defined which handles both 32bits and 64bits
  peculiarities


It builds fine for 32/64 bits (w and w/o "force intrinsics"), but I really 
would
like a lot of eyes on this (and I would say, especially, rte_cycles, 
rte_memcpy
and rte_cpuflags).
I still have some concerns about the use of intrinsics for architecture != 
x86
but I think Chao will be the best to look at this.


-- 
David Marchand

Chao Zhu (7):
  eal: split atomic operations to architecture specific
  eal: split byte order operations to architecture specific
  eal: split CPU cycle operation to architecture specific
  eal: split prefetch operations to architecture specific
  eal: split spinlock operations to architecture specific
  eal: split memcpy operation to architecture specific
  eal: split CPU flags operations to architecture specific

David Marchand (3):
  eal: move rte_atomic.h header
  eal: install all arch headers
  eal: factorize x86 headers

 doc/api/doxy-api.conf  |1 +
 lib/librte_eal/common/Makefile |   24 +-
 lib/librte_eal/common/eal_common_cpuflags.c|  190 
 .../common/include/arch/x86/rte_atomic.h   |  216 
 .../common/include/arch/x86/rte_atomic_32.h|  222 
 .../common/include/arch/x86/rte_atomic_64.h|  191 
 .../common/include/arch/x86/rte_byteorder.h|  121 +++
 .../common/include/arch/x86/rte_byteorder_32.h |   51 +
 .../common/include/arch/x86/rte_byteorder_64.h |   52 +
 .../common/include/arch/x86/rte_cpuflags.h |  310 ++
 .../common/include/arch/x86/rte_cycles.h   |  121 +++
 .../common/include/arch/x86/rte_memcpy.h   |  297 +
 .../common/include/arch/x86/rte_prefetch.h |   62 ++
 .../common/include/arch/x86/rte_spinlock.h |   94 ++
 lib/librte_eal/common/include/generic/rte_atomic.h |  918 

 .../common/include/generic/rte_byteorder.h |  189 
 .../common/include/generic/rte_cpuflags.h  |  110 ++
 lib/librte_eal/common/include/generic/rte_cycles.h |  209 
 lib/librte_eal/common/include/generic/rte_memcpy.h |  144 +++
 .../common/include/generic/rte_prefetch.h  |   71 ++
 .../common/include/generic/rte_spinlock.h  |  226 
 .../common/include/i686/arch/rte_atomic.h  |  373 ---
 lib/librte_eal/common/include/rte_atomic.h | 1133 

 lib/librte_eal/common/include/rte_byteorder.h  |  270 -
 lib/librte_eal/common/include/rte_cpuflags.h   |  182 
 lib/librte_eal/common/include/rte_cycles.h |  266 -
 lib/librte_eal/common/include/rte_memcpy.h |  376 ---
 lib/librte_eal/common/include/rte_prefetch.h   |   88 --
 lib/librte_eal/common/include/rte_spinlock.h   |  258 -
 .../common/include/x86_64/arch/rte_atomic.h|  335 --
 mk/arch/i686/rte.vars.mk   |2 +
 mk/arch/x86_64/rte.vars.mk |2 +
 32 files changed, 3624 insertions(+), 3480 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_byteorder.h
 create mode 100644 
lib/librte_eal/common/includ

[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-09-29 Thread Chao CH Zhu
Hi, Hemant 

Actually, I submitted another set of patches to split the architecture 
specific operations which includes the patch to 
librte_eal\common\include\rte_atomic.h. Please refer to the previous 
email.   

Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   "Hemant at freescale.com" 
To:     Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/09/29 14:15
Subject:RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM 
Power   architecture



Hi Chao,

This Patch seems to be incomplete. You may also need to patch the 
librte_eal\common\include\rte_atomic.h 
e.g.
#if !(defined RTE_ARCH_X86_64) || !(defined RTE_ARCH_I686)
#include 
#else /* if Intel*/

Otherwise you shall be getting compilation errors for "_mm_mfence"

Similar is true for other common header files as well.


Regards,
Hemant

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: 26/Sep/2014 3:06 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power
> architecture
> 
> The atomic operations implemented with assembly code in DPDK only 
support
> x86. This patch add architecture specific atomic operations for IBM 
Power
> architecture.
> 
> Signed-off-by: Chao Zhu 
> ---
>  .../common/include/powerpc/arch/rte_atomic.h   |  387
> 
>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318
> 
>  2 files changed, 705 insertions(+), 0 deletions(-)  create mode 100644
> lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644
> lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
> diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> b/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> new file mode 100644
> index 000..7f5214e
> --- /dev/null
> +++ b/lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> @@ -0,0 +1,387 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above 
copyright
> + *   notice, this list of conditions and the following disclaimer 
in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of IBM Corporation nor the names of its
> + *   contributors may be used to endorse or promote products 
derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR 
TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +*/
> +
> +/*
> + * Inspired from FreeBSD src/sys/powerpc/include/atomic.h
> + * Copyright (c) 2008 Marcel Moolenaar
> + * Copyright (c) 2001 Benno Rice
> + * Copyright (c) 2001 David E. O'Brien
> + * Copyright (c) 1998 Doug Rabson
> + * All rights reserved.
> + */
> +
> +#ifndef _RTE_ATOMIC_H_
> +#error "don't include this file directly, please include generic 
"
> +#endif
> +
> +#ifndef _RTE_POWERPC_64_ATOMIC_H_
> +#define _RTE_POWERPC_64_ATOMIC_H_
> +
> +/*- 64 bit atomic operations
> +-*/
> +
> +/**
> + * An atomic compare and set function used by the mutex functions.
> + * (atomic) equivalent to:
> + *   if (*dst == exp)
> + * *dst = src (a

[dpdk-dev] [PATCH 10/12] Add cache size define for IBM Power Architecture

2014-09-29 Thread Chao CH Zhu
Hi,Hemant, 

Actually, the set of patches is only for IBM Power7/8 which has difference 
cache line size. Of cause, a better way may be detecting the cache line 
size at runtime not from configuration files... May be we can submit this 
kind of patch later. 

Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   "Hemant at freescale.com" 
To:     Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/09/29 14:20
Subject:RE: [dpdk-dev] [PATCH 10/12] Add cache size define for IBM 
Power   Architecture



> --- a/mk/arch/powerpc/rte.vars.mk
> +++ b/mk/arch/powerpc/rte.vars.mk
> @@ -32,7 +32,7 @@
>  ARCH  ?= powerpc
>  CROSS ?=
> 
> -CPU_CFLAGS  ?= -m64
> +CPU_CFLAGS  ?= -m64 -DCACHE_LINE_SIZE=128

 [hemant]  Instead of hardcoding the CACHE_LINE_SIZE,  can you drive the 
CACHE_LINE_SIZE from config file.  Other powerpc processor have it as 64.


>  CPU_LDFLAGS ?=
>  CPU_ASFLAGS ?= -felf64
> 
> --
> 1.7.1





[dpdk-dev] [PATCH 1/7] Split atomic operations to architecture specific

2014-09-30 Thread Chao CH Zhu
Bruce and Neil,

Thanks for your comments! Actually, the compiler hides the difference with 
different architecture.
I'll submit another patch to correct this!


Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   Neil Horman 
To: Bruce Richardson 
Cc: Chao CH Zhu/China/IBM at IBMCN, dev at dpdk.org
Date:   2014/09/29 23:23
Subject:Re: [dpdk-dev] [PATCH 1/7] Split atomic operations to 
architecture specific



On Mon, Sep 29, 2014 at 12:05:22PM +0100, Bruce Richardson wrote:
> On Fri, Sep 26, 2014 at 05:33:32AM -0400, Chao Zhu wrote:
> > This patch splits the atomic operations from DPDK and push them to
> > architecture specific arch directories, so that other processor
> > architecture to support DPDK can be easily adopted.
> > 
> > Signed-off-by: Chao Zhu 
> > ---
> >  lib/librte_eal/common/Makefile |2 +-
> >  .../common/include/i686/arch/rte_atomic_arch.h |  378 

> >  lib/librte_eal/common/include/rte_atomic.h |  172 +
> >  .../common/include/x86_64/arch/rte_atomic_arch.h   |  378 

> >  4 files changed, 772 insertions(+), 158 deletions(-)
> >  create mode 100644 
lib/librte_eal/common/include/i686/arch/rte_atomic_arch.h
> >  create mode 100644 
lib/librte_eal/common/include/x86_64/arch/rte_atomic_arch.h
> > 
> <...snip...>
> > +#define rte_compiler_barrier() 
rte_arch_compiler_barrier()
> 
> Small question: shouldn't the compiler barrier be independent of 
> architecture?
> 
Agreed, compiler intrinsics I thought were used to define barriers, 
regardless
of arch (__memory_barrier() is the gcc intrinsic IIRC)
Neil

> /Bruce
> 
> 





[dpdk-dev] [PATCH 0/7] Patches to split architecture specific operations from DPDK

2014-10-12 Thread Chao CH Zhu
Cyril,

Thanks for your comments! You are right. SSE needs to be splited. The 
current split is not a completed one. I'll continue to contribute.

Best Regards!
--
Chao Zhu (??)
Research Staff Member
Cloud Infrastructure and Technology Group
IBM China Research Lab
Building 19 Zhongguancun Software Park
8 Dongbeiwang West Road, Haidian District,
Beijing, PRC. 100193
Tel: +86-10-58748711
Email: bjzhuc at cn.ibm.com




From:   Cyril Chemparathy 
To: Chao CH Zhu/China/IBM at IBMCN, 
Date:   2014/10/07 05:39
Subject:Re: [dpdk-dev] [PATCH 0/7] Patches to split architecture 
specific operations from DPDK



On 9/26/2014 2:33 AM, Chao Zhu wrote:
> The set of patches split x86 architecture specific operations from DPDK 
and put them to the
> arch directories of i686 and x86_64 architecture. This will make the 
adpotion of DPDK much easier
> on other computer architecture. For a new architecture, just add an 
architecture specific
> directory and necessary building configuration files, then DPDK can 
support it.

Wouldn't the SSE specifics in rte_common.h and rte_common_vect.h need to 
be similarly split out into architecture specifics?

Thanks
-- Cyril.





[dpdk-dev] [PATCH 0/7] Patches to split architecture specific operations from DPDK

2014-10-12 Thread Chao CH Zhu
David,

I agree that your idea may be better for the splitting. However, as Bruce 
said, I think people would like to see the multi-architecture support 
feature of DPDK first. We can improve it gradually. Do you have some 
comments?

Best Regards!
--
Chao Zhu 




From:   Bruce Richardson 
To: David Marchand 
Cc: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/10/03 21:28
Subject:Re: [dpdk-dev] [PATCH 0/7] Patches to split architecture 
specific operations from DPDK



On Fri, Oct 03, 2014 at 03:21:53PM +0200, David Marchand wrote:
> Hello Chao,
> 
> On Fri, Sep 26, 2014 at 11:33 AM, Chao Zhu  wrote:
> 
> > The set of patches split x86 architecture specific operations from 
DPDK
> > and put them to the
> > arch directories of i686 and x86_64 architecture. This will make the
> > adpotion of DPDK much easier
> > on other computer architecture. For a new architecture, just add an
> > architecture specific
> > directory and necessary building configuration files, then DPDK can
> > support it.
> >
> >
> Here is a different approach for the headers splitting.
> 
> If we are going to support multiple architectures, the best would be to
> have a specific header for each arch which implements a common API (no 
need
> for any _arch suffix).
> These headers would be located in 
lib/librte_eal/common/include/arch/$arch/
> rather than lib/librte_eal/common/include/$arch/arch/ (which looks odd 
to
> me).
> Makefiles can add some -I for dpdk to build itself (and we can remove 
those
> symlinks from the makefiles).
> Makefiles only install the specific headers in RTE_SDK/include for use 
by
> applications.
> 
> For common code and documentation, we can add a "generic" directory in
> lib/librte_eal/common/include (or "arch-generic", or "shared" ... any
> better idea ?).
> DPDK makefiles installs the generic headers in RTE_SDK/include/generic.
> arch headers (like rte_atomic.h) include the generic one
> ().
> 
> These generic headers can be implemented using compiler intrinsics when
> possible.
> They also include the doxygen stuff in a single place.
> 
> 
> This would look like something like this, for rte_atomic.h :
> - in DPDK sources
> $ ls lib/librte_eal/common/include/*/rte_atomic.h
> lib/librte_eal/common/include/i686/rte_atomic.h
> lib/librte_eal/common/include/x86_64/rte_atomic.h
> lib/librte_eal/common/include/generic/rte_atomic.h
> 
> - in installed RTE_SDK
> $ ls RTE_SDK/include/{,*/}rte_atomic.h
> RTE_SDK/include/rte_atomic.h
> RTE_SDK/include/generic/rte_atomic.h
> 
> Comments ?
> 
> 
> I am only focusing on the first patchset at the moment, but if we can 
find
> consensus here, a respin of the two patchsets would be great.
> 
> Thanks.
> 
> -- 
> David Marchand


I would have no objection to such a scheme. However, I'm not seeing much 
advantage over the existing way of doing things. I think I'd rather see 
the 
proposed patch sets merged first and then any additional cleanup done, 
rather than holding up a worthwhile submission for a bit of tidy-up.

/Bruce




[dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM Power architecture

2014-10-13 Thread Chao CH Zhu
OK. I'll update the patches.
Thanks for your comments!

Best Regards!
--
Chao Zhu 




From:   "Ananyev, Konstantin" 
To: Cyril Chemparathy , Chao CH 
Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/10/07 22:45
Subject:RE: [dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM 
Power architecture



> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cyril Chemparathy
> Sent: Monday, October 06, 2014 11:04 PM
> To: Chao Zhu; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM 
Power architecture
> 
> On 9/26/2014 2:36 AM, Chao Zhu wrote:
> > iopl() call is mostly for the i386 architecture. In Power 
architecture.
> > It doesn't exist. This patch modified rte_eal_iopl_init() and make it
> > return -1 on Power. This means rte_config.flags will not contain
> > EAL_FLG_HIGH_IOPL flag on IBM Power architecture.
> 
> Since iopl() is an x86-only thing, shouldn't the code be conditional on
> defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) instead of below?
> 
> Better still, should we maybe break out an architecture specific init
> function?  This function could set iopl on x86, and possibly do other
> lowlevel init things on other architectures...

Yep, that sounds like a good way to me too. 

> 
> > Signed-off-by: Chao Zhu 
> > ---
> >   lib/librte_eal/linuxapp/eal/eal.c |   11 +++
> >   1 files changed, 11 insertions(+), 0 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
> > index 4869e7c..8cc1f21 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > @@ -50,7 +50,10 @@
> >   #include 
> >   #include 
> >   #include 
> > +/* Power architecture doesn't have this header file */
> > +#ifndef RTE_ARCH_PPC_64
> >   #include 
> > +#endif
> >
> >   #include 
> >   #include 
> > @@ -1019,11 +1022,19 @@ rte_eal_mcfg_complete(void)
> >
> >   /*
> >* Request iopl privilege for all RPL, returns 0 on success
> > + *
> > + * Power architecture doesn't have iopl function, so this function
> > + * return -1 on Power architecture, because this function is only 
used
> > + * in rte_eal_init to add EAL_FLG_HIGH_IOPL to rte_config.flags.
> >*/
> >   static int
> >   rte_eal_iopl_init(void)
> >   {
> > +#ifndef RTE_ARCH_PPC_64
> >  return iopl(HIGHEST_RPL);
> > +#else
> > +return -1;
> > +#endif
> >   }
> >
> >   /* Launch threads, called at application init(). */




[dpdk-dev] [PATCH 0/7] Patches to split architecture specific operations from DPDK

2014-10-13 Thread Chao CH Zhu
David,

I'll update the patches acccording to your comments. 
Thanks!

Best Regards!
--
Chao Zhu 



From:   David Marchand 
To: Chao CH Zhu/China/IBM at IBMCN
Cc: "dev at dpdk.org" 
Date:   2014/10/03 21:21
Subject:Re: [dpdk-dev] [PATCH 0/7] Patches to split architecture 
specific operations from DPDK



Hello Chao, 

On Fri, Sep 26, 2014 at 11:33 AM, Chao Zhu  wrote:
The set of patches split x86 architecture specific operations from DPDK 
and put them to the
arch directories of i686 and x86_64 architecture. This will make the 
adpotion of DPDK much easier
on other computer architecture. For a new architecture, just add an 
architecture specific
directory and necessary building configuration files, then DPDK can 
support it.


Here is a different approach for the headers splitting.

If we are going to support multiple architectures, the best would be to 
have a specific header for each arch which implements a common API (no 
need for any _arch suffix).
These headers would be located in 
lib/librte_eal/common/include/arch/$arch/ rather than 
lib/librte_eal/common/include/$arch/arch/ (which looks odd to me).
Makefiles can add some -I for dpdk to build itself (and we can remove 
those symlinks from the makefiles).
Makefiles only install the specific headers in RTE_SDK/include for use by 
applications.

For common code and documentation, we can add a "generic" directory in 
lib/librte_eal/common/include (or "arch-generic", or "shared" ... any 
better idea ?).
DPDK makefiles installs the generic headers in RTE_SDK/include/generic.
arch headers (like rte_atomic.h) include the generic one 
().

These generic headers can be implemented using compiler intrinsics when 
possible.
They also include the doxygen stuff in a single place.


This would look like something like this, for rte_atomic.h :
- in DPDK sources
$ ls lib/librte_eal/common/include/*/rte_atomic.h
lib/librte_eal/common/include/i686/rte_atomic.h
lib/librte_eal/common/include/x86_64/rte_atomic.h
lib/librte_eal/common/include/generic/rte_atomic.h

- in installed RTE_SDK
$ ls RTE_SDK/include/{,*/}rte_atomic.h
RTE_SDK/include/rte_atomic.h
RTE_SDK/include/generic/rte_atomic.h

Comments ?


I am only focusing on the first patchset at the moment, but if we can find 
consensus here, a respin of the two patchsets would be great.

Thanks.

-- 
David Marchand


[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Chao CH Zhu
Konstantin,

In my understanding, compiler barrier is a kind of software barrier which 
prevents the compiler from moving memory accesses across the barrier. This 
should be architecture-independent. And the "sync" instruction is a 
hardware barrier which depends on PowerPC architecture. So I think the 
compiler barrier should be the same on x86 and PowerPC. Any comments? 
Please correct me if I was wrong.

Thanks a lot! 

Best Regards!
--
Chao Zhu 




From:   "Ananyev, Konstantin" 
To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:   2014/10/16 08:38
Subject:RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM 
Power   architecture




Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: Friday, September 26, 2014 10:36 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
architecture
> 
> The atomic operations implemented with assembly code in DPDK only
> support x86. This patch add architecture specific atomic operations for
> IBM Power architecture.
> 
> Signed-off-by: Chao Zhu 
> ---
>  .../common/include/powerpc/arch/rte_atomic.h   |  387 

>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318 

>  2 files changed, 705 insertions(+), 0 deletions(-)
>  create mode 100644 
lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644 
lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
...
> +
> diff --git 
a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> new file mode 100644
> index 000..fe5666e
> --- /dev/null
> +
...
>+#definerte_arch_rmb() asm volatile("sync" : : : 
"memory")
>+
> +#define   rte_arch_compiler_barrier() do {\
> +  asm volatile ("" : : : "memory");   \
> +} while(0)

I don't know much about PPC architecture, but as I remember it uses a 
weakly-ordering memory model.
Is that correct?
If so, then you probably need rte_arch_compiler_barrier() to be "sync" 
instruction (like mb()s above) .
The reason is that IA has much stronger memory ordering model and there 
are a lot of places in the code where it implies that  ordering.
For example - ring enqueue/dequeue functions. 

Konstantin