[PATCH 1/2] ARM: Correct build architecture detection logic

2013-08-27 Thread Steve Capper
Our initial patches appear to have been mangled a bit.
We check for armv7l (not seventy-one), and correct the whitespace.
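
For illustration only, the mapping this makefile fragment performs can be
written out in C. This is a sketch, not part of the TBB build; the function
name tbb_arch_from_machine is hypothetical, and the only mappings shown are
the two visible in the hunk below (sparc64 and armv7l).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring the ifeq chain in build/linux.inc:
 * translate a `uname -m` string into the arch name used by the build.
 * Unknown machines fall through unchanged, as the makefile does. */
static const char *tbb_arch_from_machine(const char *machine)
{
    assert(machine != NULL);
    if (strcmp(machine, "sparc64") == 0)
        return "sparc";
    if (strcmp(machine, "armv7l") == 0)   /* lower-case L, not the digit 1 */
        return "armv7";
    return machine;
}
```

With the old test against "armv71" the ARM case never matched, so arch
was left as the raw uname string.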

Signed-off-by: Steve Capper 
---
 build/linux.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/build/linux.inc b/build/linux.inc
index 1c96d86..16ba6f5 100644
--- a/build/linux.inc
+++ b/build/linux.inc
@@ -56,8 +56,8 @@ ifndef arch
 ifeq ($(uname_m),sparc64)
 export arch:=sparc
 endif
-ifeq ($(uname_m),armv71)
-export arch :=armv7
+ifeq ($(uname_m),armv7l)
+export arch:=armv7
 endif
 ifndef arch
 export arch:=$(uname_m)
-- 
1.8.1.4


___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


[PATCH 0/2] Fixes for TBB ARM support

2013-08-27 Thread Steve Capper
This series fixes a couple of problems with the ARM support in
libTBB; the build system did not detect ARM correctly and external
code that referenced TBB may have had problems building for Thumb
state.

These patches apply against tbb41_20130516oss, and have been
tested under Fedora 18 and Debian Wheezy (with `make test' and
`make examples').

Any comments/critique welcome, as I aim to send these patches
upstream.

Leif Lindholm (1):
  ARM: Add IT instructions to inline assembler

Steve Capper (1):
  ARM: Correct build architecture detection logic

 build/linux.inc | 4 ++--
 include/tbb/machine/gcc_armv7.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

-- 
1.8.1.4




[PATCH 2/2] ARM: Add IT instructions to inline assembler

2013-08-27 Thread Steve Capper
From: Leif Lindholm 

External code that uses libTBB will pull in gcc_armv7.h, which
has inline assembler that contains conditional instructions.

Unfortunately these external programs won't necessarily have the
`-Wa,-mimplicit-it=thumb' build option when compiling for Thumb
state, thus may fail to build for Thumb under older build systems.

To remedy this, we add the IT instructions to gcc_armv7.h that
would normally be added implicitly by the assembler.
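
As a rough sketch of the idea, here is a 32-bit compare-and-swap with the
IT instruction written out explicitly. The function name cmpswp4_sketch is
hypothetical. The ARM path follows the instruction sequence in the hunk
below but uses __sync_synchronize() as a stand-in for the fences, and the
non-ARM fallback via a GCC builtin is an assumption added so that the
sketch builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone compare-and-swap: stores `value` into *ptr if
 * *ptr equals `comparand`, and returns the value that was observed. */
static inline int32_t cmpswp4_sketch(volatile int32_t *ptr, int32_t value,
                                     int32_t comparand)
{
    assert(ptr != NULL);
#if defined(__arm__) && defined(__ARM_ARCH_7A__)
    int32_t oldval, res;

    __sync_synchronize();
    do {
        __asm__ __volatile__(
            "ldrex      %1, [%3]\n"
            "mov        %0, #0\n"
            "cmp        %1, %4\n"
            "it         eq\n"          /* explicit IT block for Thumb-2 */
            "strexeq    %0, %5, [%3]\n"
            : "=&r" (res), "=&r" (oldval), "+Qo" (*ptr)
            : "r" (ptr), "Ir" (comparand), "r" (value)
            : "cc");
    } while (res);                     /* retry if the exclusive store failed */
    __sync_synchronize();

    return oldval;
#else
    /* Assumption: on other targets defer to the compiler builtin. */
    return __sync_val_compare_and_swap(ptr, comparand, value);
#endif
}
```

With the IT instruction spelled out, a user of the header should not need
to pass -Wa,-mimplicit-it=thumb when building for Thumb state.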

Signed-off-by: Steve Capper 
---
 include/tbb/machine/gcc_armv7.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/tbb/machine/gcc_armv7.h b/include/tbb/machine/gcc_armv7.h
index fde1f7a..05ef6a9 100644
--- a/include/tbb/machine/gcc_armv7.h
+++ b/include/tbb/machine/gcc_armv7.h
@@ -88,6 +88,7 @@ static inline int32_t __TBB_machine_cmpswp4(volatile void *ptr, int32_t value, i
             "ldrex      %1, [%3]\n"
             "mov        %0, #0\n"
             "cmp        %1, %4\n"
+            "it         eq\n"
             "strexeq    %0, %5, [%3]\n"
 : "=&r" (res), "=&r" (oldval), "+Qo" (*(volatile int32_t*)ptr)
 : "r" ((int32_t *)ptr), "Ir" (comparand), "r" (value)
@@ -118,7 +119,9 @@ static inline int64_t __TBB_machine_cmpswp8(volatile void *ptr, int64_t value, i
             "mov        %0, #0\n"
             "ldrexd     %1, %H1, [%3]\n"
             "cmp        %1, %4\n"
+            "it         eq\n"
             "cmpeq      %H1, %H4\n"
+            "it         eq\n"
             "strexdeq   %0, %5, %H5, [%3]"
 : "=&r" (res), "=&r" (oldval), "+Qo" (*(volatile int64_t*)ptr)
 : "r" ((int64_t *)ptr), "r" (comparand), "r" (value)
-- 
1.8.1.4




[PATCH 4/5] Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCRIPTS

2013-09-10 Thread Steve Capper
If one compiles 64 bit with CUSTOM_LDSCRIPTS==no, then the linkhuge_rw
test is not compiled even though the logic to build it exists. For
32 bit targets these tests are compiled.

This patch adds $(HUGELINK_RW_TESTS) to the set of tests that are
compiled for 64 bit in this case.

Signed-off-by: Steve Capper 
---
 tests/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/Makefile b/tests/Makefile
index 231e3b0..9140e72 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -54,7 +54,7 @@ ifeq ($(CUSTOM_LDSCRIPTS),yes)
 TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS) $(HUGELINK_TESTS:%=xB.%) \
$(HUGELINK_TESTS:%=xBDT.%)
 else
-TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS)
+TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS) $(HUGELINK_RW_TESTS)
 endif
 
 endif
-- 
1.8.1.4




[PATCH 1/5] Aarch64 support.

2013-09-10 Thread Steve Capper
This patch adds support for Aarch64.

As with ARMv7, we do not add the xBT/xBDT style linker scripts as
these have been deprecated in favour of adjusting the page sizes
via command line parameter to ld.

Signed-off-by: Steve Capper 
---
 Makefile |  7 +++
 ld.hugetlbfs |  2 +-
 sys-aarch64elf.S | 34 ++
 3 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 sys-aarch64elf.S

diff --git a/Makefile b/Makefile
index 48205af..0e61701 100644
--- a/Makefile
+++ b/Makefile
@@ -57,6 +57,12 @@ TMPLIB32 = lib
 ELF32 += armelf_linux_eabi
 CUSTOM_LDSCRIPTS = no
 else
+ifeq ($(ARCH),aarch64)
+CC64 = gcc
+ELF64 = aarch64elf
+TMPLIB64 = lib64
+CUSTOM_LDSCRIPTS = no
+else
 ifeq ($(ARCH),i386)
 CC32 = gcc
 ELF32 = elf_i386
@@ -100,6 +106,7 @@ endif
 endif
 endif
 endif
+endif
 
 ifdef CC32
 OBJDIRS += obj32
diff --git a/ld.hugetlbfs b/ld.hugetlbfs
index d6d12c4..5128aa2 100755
--- a/ld.hugetlbfs
+++ b/ld.hugetlbfs
@@ -91,7 +91,7 @@ case "$EMU" in
 elf32ppclinux|elf64ppc) HPAGE_SIZE=$((16*$MB)) SLICE_SIZE=$((256*$MB)) ;;
 elf_i386|elf_x86_64)   HPAGE_SIZE=$((4*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
 elf_s390|elf64_s390)   HPAGE_SIZE=$((1*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
-armelf_linux_eabi) HPAGE_SIZE=$((2*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
+armelf_linux_eabi|aarch64elf)  HPAGE_SIZE=$((2*MB)) SLICE_SIZE=$HPAGE_SIZE ;;
 esac
 
 if [ "$HTLB_ALIGN" == "slice" ]; then
diff --git a/sys-aarch64elf.S b/sys-aarch64elf.S
new file mode 100644
index 0000000..54799d3
--- /dev/null
+++ b/sys-aarch64elf.S
@@ -0,0 +1,34 @@
+/*
+ * libhugetlbfs - Easy use of Linux hugepages
+ * Copyright (C) 2013 Linaro Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * version 2.1 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+.text
+
+.globl  direct_syscall
+
+
+direct_syscall:
+   uxtw x8, w0
+   mov x0, x1
+   mov x1, x2
+   mov x2, x3
+   mov x3, x4
+   mov x4, x5
+   mov x5, x6
+   mov x6, x7
+   svc 0x0
+   ret
-- 
1.8.1.4




[PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes

2013-09-10 Thread Steve Capper
Hello,
This series adds Aarch64 support and makes some minor tweaks.

The first two patches of this series add Aarch64 support to
libhugetlbfs. (Starting from 3.11-rc1, the Linux Kernel supports
HugeTLB and THP for ARM64).

Some general changes are also made:
PROT_NONE is added to the mprotect unit test, and the
linkhuge_rw test is enabled for 64 bit where there aren't any
custom ldscripts.

The final patch clears up the superfluous ARM ld.hugetlbfs
HTLB_LINK logic.

Any comments would be appreciated.

Cheers,
--
Steve

Steve Capper (5):
  Aarch64 support.
  Aarch64 unit test fixes.
  Add PROT_NONE to the mprotect test.
  Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCRIPTS
  Cleanup ARM ld.hugetlbfs HTLB_LINK logic

 Makefile  |  7 +++
 ld.hugetlbfs  |  7 +--
 sys-aarch64elf.S  | 34 ++
 tests/Makefile|  2 +-
 tests/icache-hygiene.c|  7 ---
 tests/mprotect.c  |  6 ++
 tests/mremap-expand-slice-collision.c |  2 +-
 7 files changed, 54 insertions(+), 11 deletions(-)
 create mode 100644 sys-aarch64elf.S

-- 
1.8.1.4




[PATCH 2/5] Aarch64 unit test fixes.

2013-09-10 Thread Steve Capper
On Aarch64, zero bytes are an illegal instruction, so Aarch64 is added
to the architectures handled by the icache-hygiene test.

In mremap-expand-slice-collision, if __LP64__ is defined then
mappings are attempted at 1TB boundaries which are outside the
allowable mmap region for Aarch64. For __aarch64__ we change this
mapping back to 256MB slices.

Signed-off-by: Steve Capper 
---
 tests/icache-hygiene.c| 7 ---
 tests/mremap-expand-slice-collision.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/icache-hygiene.c b/tests/icache-hygiene.c
index 51792b3..876ce10 100644
--- a/tests/icache-hygiene.c
+++ b/tests/icache-hygiene.c
@@ -54,7 +54,7 @@ static void cacheflush(void *p)
 {
 #if defined(__powerpc__)
asm volatile("dcbst 0,%0; sync; icbi 0,%0; isync" : : "r"(p));
-#elif defined(__arm__)
+#elif defined(__arm__) || defined(__aarch64__)
__clear_cache(p, p + COPY_SIZE);
 #endif
 }
@@ -87,8 +87,9 @@ static void *sig_expected;
 static void sig_handler(int signum, siginfo_t *si, void *uc)
 {
 #if defined(__powerpc__) || defined(__powerpc64__) || defined(__ia64__) || \
-defined(__s390__) || defined(__s390x__) || defined(__sparc__)
-   /* On powerpc and ia64 and s390, 0 bytes are an illegal
+defined(__s390__) || defined(__s390x__) || defined(__sparc__) || \
+defined(__aarch64__)
+   /* On powerpc, ia64, s390 and Aarch64, 0 bytes are an illegal
 * instruction, so, if the icache is cleared properly, we SIGILL
 * as soon as we jump into the cleared page */
if (signum == SIGILL) {
diff --git a/tests/mremap-expand-slice-collision.c b/tests/mremap-expand-slice-collision.c
index c25f4c6..853f3c3 100644
--- a/tests/mremap-expand-slice-collision.c
+++ b/tests/mremap-expand-slice-collision.c
@@ -38,7 +38,7 @@ void init_slice_boundary(int fd)
unsigned long slice_size;
void *p1, *p2, *heap;
int slices_ok, i, rc;
-#ifdef __LP64__
+#if defined(__LP64__) && !defined(__aarch64__)
/* powerpc: 1TB slices starting at 1 TB */
   slice_boundary = 0x10000000000;
   slice_size = 0x10000000000;
-- 
1.8.1.4




[PATCH 3/5] Add PROT_NONE to the mprotect test.

2013-09-10 Thread Steve Capper
The mprotect unit test checks PROT_READ and PROT_READ | PROT_WRITE
protections. We recently found that PROT_NONE wasn't properly
supported in an early version of our huge page kernel code.

This patch adds PROT_NONE tests to mprotect. The expected behaviour
is that neither reads nor writes should succeed.
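
The expected behaviour can be sketched outside the test suite as follows.
This is only an illustration under stated assumptions: it uses an ordinary
anonymous page rather than a hugetlbfs mapping, and the helper names
(access_faults, prot_none_then_rw) are hypothetical and are not the ones
used by tests/mprotect.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

static void on_fault(int signum)
{
    (void)signum;
    siglongjmp(fault_env, 1);
}

/* Returns 1 if a one-byte read (do_write == 0) or write (do_write != 0)
 * at p raises SIGSEGV or SIGBUS, and 0 if the access succeeds. */
static int access_faults(volatile char *p, int do_write)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(fault_env, 1) == 0) {
        if (do_write) {
            *p = 1;
        } else {
            char c = *p;
            (void)c;
        }
        faulted = 0;
    } else {
        faulted = 1;    /* we arrived here from the signal handler */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}

/* Returns 1 if a PROT_NONE page rejects reads and writes, and accepts
 * both after being switched to PROT_READ|PROT_WRITE; otherwise 0. */
static int prot_none_then_rw(void)
{
    long psize = sysconf(_SC_PAGESIZE);
    char *p;
    int ok;

    assert(psize > 0);
    p = mmap(NULL, (size_t)psize, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 0;

    ok = access_faults(p, 0) && access_faults(p, 1);
    if (mprotect(p, (size_t)psize, PROT_READ | PROT_WRITE) != 0)
        ok = 0;
    else
        ok = ok && !access_faults(p, 1) && !access_faults(p, 0);

    munmap(p, (size_t)psize);
    return ok;
}
```

A kernel that mishandles PROT_NONE on the mapping would let one of the
first two accesses through, and prot_none_then_rw() would return 0.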

Signed-off-by: Steve Capper 
---
 tests/mprotect.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/mprotect.c b/tests/mprotect.c
index aa4673e..db6a662 100644
--- a/tests/mprotect.c
+++ b/tests/mprotect.c
@@ -213,5 +213,11 @@ int main(int argc, char *argv[])
test_mprotect(fd, "RW->R 1/2", 2*hpage_size, PROT_READ|PROT_WRITE,
  hpage_size, PROT_READ);
 
+   /* PROT_NONE tests */
+   test_mprotect(fd, "NONE->R", hpage_size, PROT_NONE,
+ hpage_size, PROT_READ);
+   test_mprotect(fd, "NONE->RW", hpage_size, PROT_NONE,
+ hpage_size, PROT_READ|PROT_WRITE);
+
PASS();
 }
-- 
1.8.1.4




[PATCH 5/5] Cleanup ARM ld.hugetlbfs HTLB_LINK logic

2013-09-10 Thread Steve Capper
When ld.hugetlbfs is executed with --hugetlbfs-link, there is code
to check for the ARM platform and warn that this is not supported.

There is also code to check for CUSTOM_LDSCRIPTS being false and
give a similar warning.

This patch removes the ARM check as the CUSTOM_LDSCRIPTS check will
catch this.

Signed-off-by: Steve Capper 
---
 ld.hugetlbfs | 5 -
 1 file changed, 5 deletions(-)

diff --git a/ld.hugetlbfs b/ld.hugetlbfs
index 5128aa2..ba9e00a 100755
--- a/ld.hugetlbfs
+++ b/ld.hugetlbfs
@@ -79,11 +79,6 @@ if [ -n "$HTLB_LINK" ]; then
 HTLB_ALIGN="" # --hugetlbfs-link overrides --hugetlbfs-align
 LDSCRIPT="$EMU.x$HTLB_LINK"
 HTLBOPTS="-T${HUGETLB_LDSCRIPT_PATH}/${LDSCRIPT}"
-
-if [ "$EMU" == "armelf_linux_eabi" ]; then
-echo "Please use --hugetlbfs-align when targeting ARM."
-   exit -1
-fi
 fi
 
 MB=$((1024*1024))
-- 
1.8.1.4




Re: [PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes

2013-10-18 Thread Steve Capper
On Tue, Sep 10, 2013 at 02:11:27PM +0100, Steve Capper wrote:
> Hello,
> This series adds Aarch64 support and makes some minor tweaks.
> 
> The first two patches of this series add Aarch64 support to
> libhugetlbfs. (Starting from 3.11-rc1, the Linux Kernel supports
> HugeTLB and THP for ARM64).
> 
> Some general changes are also made:
> PROT_NONE is added to the mprotect unit test, and the
> linkhuge_rw test is enabled for 64 bit where there aren't any
> custom ldscripts.
> 
> The final patch clears up the superfluous ARM ld.hugetlbfs
> HTLB_LINK logic.
> 
> Any comments would be appreciated.
> 
> Cheers,
> --
> Steve

Hi guys,
This is just a ping on this series.
Do these patches look ok for inclusion into libhugetlbfs?
Should I resend the series?

Thanks,
-- 
Steve

> 
> Steve Capper (5):
>   Aarch64 support.
>   Aarch64 unit test fixes.
>   Add PROT_NONE to the mprotect test.
>   Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCIPTS
>   Cleanup ARM ld.hugetlbfs HTLB_LINK logic
> 
>  Makefile  |  7 +++
>  ld.hugetlbfs  |  7 +--
>  sys-aarch64elf.S  | 34 ++
>  tests/Makefile|  2 +-
>  tests/icache-hygiene.c|  7 ---
>  tests/mprotect.c  |  6 ++
>  tests/mremap-expand-slice-collision.c |  2 +-
>  7 files changed, 54 insertions(+), 11 deletions(-)
>  create mode 100644 sys-aarch64elf.S
> 
> -- 
> 1.8.1.4
> 



Re: [Libhugetlbfs-devel] [PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes

2013-10-22 Thread Steve Capper
No probs, thank you Eric.

Cheers,
--
Steve



Re: [PATCH Ceph] crc32c: add arm64/aarch64 optimized crc32c implementation

2015-01-22 Thread Steve Capper
Hi Yazen,

I have a few comments below.

Cheers,
-- 
Steve


On Wed, Jan 21, 2015 at 02:17:47PM -0600, Yazen Ghannam wrote:
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> ~4x performance increase versus sctp.

Was that perf result from the unit test?

Also describe the reason for the patch in the commit log; we have an
optional CRC instruction that can be used instead of the Ceph in-built
table lookup. This patch uses the CRC instruction where available...

Unit tests are added and they show a ...

Probably also mention that this is based off Ed Nevill's Hadoop patch.
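
To make the "use the CRC instruction where available" idea concrete, a
minimal sketch might look like the following. It is not the code from the
patch: the function names are hypothetical, the hardware path uses the
ACLE __crc32cb intrinsic rather than the patch's own implementation, and
the software fallback is a simple bit-at-a-time loop rather than a table
lookup. The init/final-XOR convention is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__) && defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* __crc32cb */
#include <asm/hwcap.h>  /* HWCAP_CRC32 */
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#define SKETCH_HAVE_HW_CRC32C 1
#endif

/* Portable fallback: CRC32C (Castagnoli), reflected polynomial 0x82F63B78,
 * processed one bit at a time. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

#ifdef SKETCH_HAVE_HW_CRC32C
/* Hardware path: one CRC32CB instruction per input byte. */
static uint32_t crc32c_hw(uint32_t crc, const unsigned char *buf, size_t len)
{
    while (len--)
        crc = __crc32cb(crc, *buf++);
    return crc;
}
#endif

/* Pick the implementation at run time from the HWCAP bits. */
static uint32_t crc32c_update(uint32_t crc, const unsigned char *buf,
                              size_t len)
{
    assert(buf != NULL || len == 0);
#ifdef SKETCH_HAVE_HW_CRC32C
    if (getauxval(AT_HWCAP) & HWCAP_CRC32)
        return crc32c_hw(crc, buf, len);
#endif
    return crc32c_sw(crc, buf, len);
}
```

The hardware path is only compiled when the compiler has the CRC
extension enabled (for example with -march=armv8-a+crc); the run-time
HWCAP check then decides whether it is actually used.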

> 
> Signed-off-by: Yazen Ghannam 
> ---
>  configure.ac   |  1 +
>  m4/ax_arm.m4   | 18 +---
>  src/arch/arm.c |  2 ++
>  src/arch/arm.h |  1 +
>  src/common/Makefile.am | 11 --
>  src/common/crc32c.cc   |  6 ++
>  src/common/crc32c_arm64.c  | 48 
> ++
>  src/common/crc32c_arm64.h  | 25 ++
>  src/test/common/test_crc32c.cc |  9 
>  9 files changed, 116 insertions(+), 5 deletions(-)
>  create mode 100644 src/common/crc32c_arm64.c
>  create mode 100644 src/common/crc32c_arm64.h
> 
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = 
> "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..4ef8081 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,25 @@ AC_DEFUN([AX_ARM_FEATURES],
>fi
>  ;;
>  aarch64*)
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +  if test x"$ax_cv_support_armv8" = x"yes"; then
> +ARM_ARCH_FLAGS="-march=armv8-a"
> +ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +  fi
>AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, 
> [])
>if test x"$ax_cv_support_neon_ext" = x"yes"; then
> -ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -AC_SUBST(ARM_NEON_FLAGS)
> -ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>  AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
>fi
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, 
> [])
> +  if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +  fi
> +ARM_NEON_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
> +AC_SUBST(ARM_NEON_FLAGS)
> +ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"

So if the compiler supports CRC, then ARM_FLAGS will be set with CRC
compile options. This will also affect the jerasure codegen (which
doesn't check for the CRC HWCAP). Although this is unlikely to cause
problems, I would feel more comfortable with something like:
ARM_CRC_FLAGS
That way you disambiguate between NEON (for jerasure) and CRC.

>  ;;
>esac
>  
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..2ea97eb 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>  
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_arm64_crc32 = 0;


Can we rename this variable: ceph_arch_aarch64_crc32 ?


>  
>  #include 
>  
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>   ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>   ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> + ceph_arch_arm64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>   if (0)
>   get_hwcap();  // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..4ac716a 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>  
>  extern int ceph_arch_neon;  /* true if we have ARM NEON or ASIMD abilities */
> +extern int ceph_arch_arm64_crc32;  /* true if we have Aarch64 CRC32/CRC32C 
> abilities */

"AArch64", we have two capital A's.

>  
>  extern int ceph_arch_arm_probe(void);
>  
> diff --git a/src/common/Makefile.am b/src/common/Makefile.am
> index 2888194..4c36216 100644
> --- a/src/common/Makefile.am
> +++ b/src/common/Makefile.am
> @@ -103,12 +103,18 @@ libcommon_crc_la_SOURCES = \
>   common/sctp_crc32.c \
>   common/crc32c.cc \
>   common/crc32c_intel_baseline.c \
> - comm

Re: [PATCH v2 Ceph] crc32c: add aarch64 optimized crc32c implementation

2015-01-23 Thread Steve Capper
Hi Yazen,

This is looking good, just a few minor comments below.

Cheers,
-- 
Steve

On Fri, Jan 23, 2015 at 09:13:42AM -0600, Yazen Ghannam wrote:
> ARMv8 defines a set of optional CRC32/CRC32C instructions.
> This patch defines an optimized function that uses these
> instructions when available rather than table-based lookup.
> Optimized function based on a Hadoop patch by Ed Nevill.
> 
> Autotools updated to check for compiler support.
> Optimized function is selected at runtime based on HWCAP_CRC32.
> Added crc32c "performance" unit test and arch unit test.
> 
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> Unit test shows ~4x performance increase versus sctp.
> 
> Signed-off-by: Yazen Ghannam 
> ---
>  configure.ac   |  1 +
>  m4/ax_arm.m4   | 18 +++--
>  src/arch/arm.c |  2 ++
>  src/arch/arm.h |  1 +
>  src/common/Makefile.am | 10 +++-
>  src/common/crc32c.cc   |  6 +
>  src/common/crc32c_aarch64.c| 58 
> ++
>  src/common/crc32c_aarch64.h| 25 ++
>  src/test/common/test_crc32c.cc | 11 
>  src/test/test_arch.cc  |  3 +++
>  10 files changed, 132 insertions(+), 3 deletions(-)
>  create mode 100644 src/common/crc32c_aarch64.c
>  create mode 100644 src/common/crc32c_aarch64.h
> 
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = 
> "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..37ea0aa 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,27 @@ AC_DEFUN([AX_ARM_FEATURES],
>fi
>  ;;
>  aarch64*)
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +  if test x"$ax_cv_support_armv8" = x"yes"; then
> +ARM_ARCH_FLAGS="-march=armv8-a"
> +ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +  fi
>AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, 
> [])
>if test x"$ax_cv_support_neon_ext" = x"yes"; then
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>  ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -AC_SUBST(ARM_NEON_FLAGS)
> -ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
>  AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
> +AC_SUBST(ARM_NEON_FLAGS)
> +  fi
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, 
> [])
> +  if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +ARM_CRC_FLAGS="-march=armv8-a+crc -DARCH_AARCH64"
> +AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +AC_SUBST(ARM_CRC_FLAGS)
>fi
> +ARM_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
>  ;;
>esac
>  
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..5a47e33 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>  
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_aarch64_crc32 = 0;
>  
>  #include 
>  
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>   ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>   ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> + ceph_arch_aarch64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>   if (0)
>   get_hwcap();  // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..1659b2e 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>  
>  extern int ceph_arch_neon;  /* true if we have ARM NEON or ASIMD abilities */
> +extern int ceph_arch_aarch64_crc32;  /* true if we have AArch64 CRC32/CRC32C 
> abilities */
>  
>  extern int ceph_arch_arm_probe(void);
>  
> diff --git a/src/common/Makefile.am b/src/common/Makefile.am
> index 2888194..37d1404 100644
> --- a/src/common/Makefile.am
> +++ b/src/common/Makefile.am
> @@ -112,11 +112,19 @@ endif
>  LIBCOMMON_DEPS += libcommon_crc.la
>  noinst_LTLIBRARIES += libcommon_crc.la
>  
> +if HAVE_ARMV8_CRC
> +libcommon_crc_aarch64_la_SOURCES = common/crc32c_aarch64.c
> +libcommon_crc_aarch64_la_CFLAGS = $(AM_CFLAGS) $(ARM_CRC_FLAGS)
> +LIBCOMMON_DEPS += libcommon_crc_aarch64.la
> +noinst_LTLIBRARIES += libcommon_crc_aarch64.la
> +endif
> +
>  noinst_HEADERS += \
>   common/bloom_fil

Re: [PATCH v3 Ceph] crc32c: add aarch64 optimized crc32c implementation

2015-01-26 Thread Steve Capper
On Fri, Jan 23, 2015 at 12:43:31PM -0500, Yazen Ghannam wrote:
> ARMv8 defines a set of optional CRC32/CRC32C instructions.
> This patch defines an optimized function that uses these
> instructions when available rather than table-based lookup.
> Optimized function based on a Hadoop patch by Ed Nevill.
> 
> Autotools updated to check for compiler support.
> Optimized function is selected at runtime based on HWCAP_CRC32.
> Added crc32c "performance" unit test and arch unit test.
> 
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> Unit test shows ~4x performance increase versus sctp.

Hi Yazen,

Thanks, this is looking good.

> 
> Signed-off-by: Yazen Ghannam 

Please add:

Reviewed-by: Steve Capper 

This can be sent to ceph-de...@vger.kernel.org, let's see what upstream
think ;-).

Cheers,
-- 
Steve

> ---
>  configure.ac   |  1 +
>  m4/ax_arm.m4   | 18 ++--
>  src/arch/arm.c |  2 ++
>  src/arch/arm.h |  1 +
>  src/common/Makefile.am | 10 -
>  src/common/crc32c.cc   |  6 ++
>  src/common/crc32c_aarch64.c| 47 
> ++
>  src/common/crc32c_aarch64.h| 27 
>  src/test/common/test_crc32c.cc | 10 +
>  src/test/test_arch.cc  | 14 +
>  10 files changed, 133 insertions(+), 3 deletions(-)
>  create mode 100644 src/common/crc32c_aarch64.c
>  create mode 100644 src/common/crc32c_aarch64.h
> 
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = 
> "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..37ea0aa 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,27 @@ AC_DEFUN([AX_ARM_FEATURES],
>fi
>  ;;
>  aarch64*)
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +  if test x"$ax_cv_support_armv8" = x"yes"; then
> +ARM_ARCH_FLAGS="-march=armv8-a"
> +ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +  fi
>AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, 
> [])
>if test x"$ax_cv_support_neon_ext" = x"yes"; then
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>  ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -AC_SUBST(ARM_NEON_FLAGS)
> -ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
>  AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
> +AC_SUBST(ARM_NEON_FLAGS)
> +  fi
> +  AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, 
> [])
> +  if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +ARM_CRC_FLAGS="-march=armv8-a+crc -DARCH_AARCH64"
> +AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +AC_SUBST(ARM_CRC_FLAGS)
>fi
> +ARM_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
>  ;;
>esac
>  
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..5a47e33 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>  
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_aarch64_crc32 = 0;
>  
>  #include 
>  
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>   ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>   ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> + ceph_arch_aarch64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>   if (0)
>   get_hwcap();  // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..1659b2e 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>  
>  extern int ceph_arch_neon;  /* true if we have ARM NEON or ASIMD 

Cortex A57 Optimisation Guide

2015-03-04 Thread Steve Capper
For those interested in optimising code for the Cortex-A57, the
following guide has just been released:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0015a/cortex_a57_software_optimisation_guide_external.pdf

Cheers,
--
Steve



Re: JITs and 52-bit VA

2016-04-28 Thread Steve Capper
On 28 April 2016 at 14:24, Peter Maydell  wrote:
> On 28 April 2016 at 14:17, Arnd Bergmann  wrote:
>> One simple (from the kernel's perspective, not from the JIT) approach
>> might be to always use MAP_FIXED whenever an allocation is made for
>> memory that needs these special pointers, and then manage the available
>> address space explicitly. Would that work, or do you require everything
>> including the binary itself to be below the address?
>
> The trouble IME with this idea is that in practice you're
> linking with glibc, which means glibc is managing (and using)
> the address space, not the JIT. So MAP_FIXED is pretty awkward
> to use.
>
> thanks
> -- PMM

Hi,

One can find holes in the VA space by examining /proc/self/maps, and
from those deduce suitable addresses to pass with MAP_FIXED.
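
A minimal sketch of that approach, assuming Linux and a single-threaded
caller (another thread could fill the hole between the scan and the
mmap). The function names and the choice of a lowest acceptable address
are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Scan /proc/self/maps (sorted by address) for the first unmapped gap of
 * at least `want` bytes that starts at or above `lowest`.
 * Returns the start of the gap, or 0 if none was found. */
static uint64_t find_va_hole(uint64_t lowest, uint64_t want)
{
    FILE *f;
    uint64_t start, end, prev_end = lowest, found = 0;

    assert(want > 0);
    f = fopen("/proc/self/maps", "r");
    if (!f)
        return 0;

    /* Each line begins with "start-end" in hex; skip the rest of it. */
    while (fscanf(f, "%" SCNx64 "-%" SCNx64, &start, &end) == 2) {
        int c;
        while ((c = fgetc(f)) != EOF && c != '\n')
            ;
        if (start > prev_end && start - prev_end >= want) {
            found = prev_end;
            break;
        }
        if (end > prev_end)
            prev_end = end;
    }
    fclose(f);
    return found;
}

/* Map `len` bytes of anonymous memory into the first suitable hole.
 * `lowest` should be page aligned. Returns MAP_FAILED on failure. */
static void *map_in_hole(uint64_t lowest, size_t len)
{
    uint64_t addr = find_va_hole(lowest, len);

    if (addr == 0)
        return MAP_FAILED;
    return mmap((void *)(uintptr_t)addr, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
}
```

Note that MAP_FIXED silently replaces anything already mapped at the
address; on kernels that have it (Linux 4.17 and later),
MAP_FIXED_NOREPLACE is a safer choice.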

The other problem is, as Arnd alluded to, if a JIT'ed object needs to
then refer to something allocated outside of the JIT. This could be
remedied by another level of indirection/trampoline.

Taking two steps back though, I would view VA space squeezing as a
stop-gap before removing tags from the upper bits of a pointer
altogether (tagging the bottom bits, by controlling alignment, is
perfectly safe). The larger the VA space, the more scope mechanisms
such as Address Space Layout Randomisation have to improve security.
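
As an illustration of bottom-bit tagging, here is a small sketch in which
allocations are aligned to 8 bytes so that the low three bits of every
pointer are free to carry a tag. All names and the choice of three tag
bits are assumptions made for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for posix_memalign under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TAG_BITS  3u
#define TAG_ALIGN ((size_t)1 << TAG_BITS)       /* 8-byte alignment */
#define TAG_MASK  ((uintptr_t)(TAG_ALIGN - 1))  /* low three bits   */

/* Allocate memory whose address has the low TAG_BITS bits clear. */
static void *alloc_taggable(size_t size)
{
    void *p = NULL;

    if (posix_memalign(&p, TAG_ALIGN, size) != 0)
        return NULL;
    return p;
}

/* Pack a small tag into the unused low bits of an aligned pointer. */
static inline uintptr_t tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);
    assert(tag <= TAG_MASK);
    return (uintptr_t)p | tag;
}

/* Recover the original pointer by masking the tag off again. */
static inline void *untag_ptr(uintptr_t v)
{
    return (void *)(v & ~TAG_MASK);
}

/* Extract the tag. */
static inline unsigned ptr_tag(uintptr_t v)
{
    return (unsigned)(v & TAG_MASK);
}
```

Because the tag lives in bits that alignment guarantees to be zero, this
scheme does not depend on how many upper address bits the kernel uses.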

Cheers,
--
Steve