[PATCH 1/2] ARM: Correct build architecture detection logic
Our initial patches appear to have been mangled a bit.
We check for armv7l (not seventy-one), and correct the whitespace.

Signed-off-by: Steve Capper
---
 build/linux.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/build/linux.inc b/build/linux.inc
index 1c96d86..16ba6f5 100644
--- a/build/linux.inc
+++ b/build/linux.inc
@@ -56,8 +56,8 @@ ifndef arch
 ifeq ($(uname_m),sparc64)
 export arch:=sparc
 endif
-ifeq ($(uname_m),armv71)
-export arch :=armv7
+ifeq ($(uname_m),armv7l)
+export arch:=armv7
 endif
 ifndef arch
 export arch:=$(uname_m)
--
1.8.1.4

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
[PATCH 0/2] Fixes for TBB ARM support
This series fixes a couple of problems with the ARM support in libTBB;
the build system did not detect ARM correctly and external code that
referenced TBB may have had problems building for Thumb state.

These patches apply against tbb41_20130516oss, and have been tested
under Fedora 18 and Debian Wheezy (with `make test' and `make examples').

Any comments/critique welcome, as I aim to send these patches upstream.

Leif Lindholm (1):
  ARM: Add IT instructions to inline assembler

Steve Capper (1):
  ARM: Correct build architecture detection logic

 build/linux.inc                 | 4 ++--
 include/tbb/machine/gcc_armv7.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

--
1.8.1.4
[PATCH 2/2] ARM: Add IT instructions to inline assembler
From: Leif Lindholm

External code that uses libTBB will pull in gcc_armv7.h, which has
inline assembler that contains conditional instructions. Unfortunately
these external programs won't necessarily have the
`-Wa,-mimplicit-it=thumb' build option when compiling for Thumb state,
thus may fail to build for Thumb under older build systems.

To remedy this, we add the IT instructions to gcc_armv7.h that would
normally be added implicitly by the assembler.

Signed-off-by: Steve Capper
---
 include/tbb/machine/gcc_armv7.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/tbb/machine/gcc_armv7.h b/include/tbb/machine/gcc_armv7.h
index fde1f7a..05ef6a9 100644
--- a/include/tbb/machine/gcc_armv7.h
+++ b/include/tbb/machine/gcc_armv7.h
@@ -88,6 +88,7 @@ static inline int32_t __TBB_machine_cmpswp4(volatile void *ptr, int32_t value, i
             "ldrex %1, [%3]\n"
             "mov %0, #0\n"
             "cmp %1, %4\n"
+            "it eq\n"
             "strexeq %0, %5, [%3]\n"
         : "=&r" (res), "=&r" (oldval), "+Qo" (*(volatile int32_t*)ptr)
         : "r" ((int32_t *)ptr), "Ir" (comparand), "r" (value)
@@ -118,7 +119,9 @@ static inline int64_t __TBB_machine_cmpswp8(volatile void *ptr, int64_t value, i
             "mov %0, #0\n"
             "ldrexd %1, %H1, [%3]\n"
             "cmp %1, %4\n"
+            "it eq\n"
             "cmpeq %H1, %H4\n"
+            "it eq\n"
             "strexdeq %0, %5, %H5, [%3]"
         : "=&r" (res), "=&r" (oldval), "+Qo" (*(volatile int64_t*)ptr)
         : "r" ((int64_t *)ptr), "r" (comparand), "r" (value)
--
1.8.1.4
[PATCH 4/5] Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCIPTS
If one compiles 64 bit with CUSTOM_LDSCRIPTS==no, then the linkhuge_rw
test is not compiled even though the logic to build it exists. For 32
bit targets these tests are compiled.

This patch adds $(HUGELINK_RW_TESTS) to the set of tests that are
compiled for 64 bit in this case.

Signed-off-by: Steve Capper
---
 tests/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/Makefile b/tests/Makefile
index 231e3b0..9140e72 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -54,7 +54,7 @@ ifeq ($(CUSTOM_LDSCRIPTS),yes)
 TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS) $(HUGELINK_TESTS:%=xB.%) \
     $(HUGELINK_TESTS:%=xBDT.%)
 else
-TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS)
+TESTS += $(LDSCRIPT_TESTS) $(HUGELINK_TESTS) $(HUGELINK_RW_TESTS)
 endif
 endif
--
1.8.1.4
[PATCH 1/5] Aarch64 support.
This patch adds support for Aarch64. As with ARMv7, we do not add the
xBT/xBDT style linker scripts, as these have been deprecated in favour
of adjusting the page sizes via command line parameter to ld.

Signed-off-by: Steve Capper
---
 Makefile         |  7 +++
 ld.hugetlbfs     |  2 +-
 sys-aarch64elf.S | 34 ++
 3 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 sys-aarch64elf.S

diff --git a/Makefile b/Makefile
index 48205af..0e61701 100644
--- a/Makefile
+++ b/Makefile
@@ -57,6 +57,12 @@ TMPLIB32 = lib
 ELF32 += armelf_linux_eabi
 CUSTOM_LDSCRIPTS = no
 else
+ifeq ($(ARCH),aarch64)
+CC64 = gcc
+ELF64 = aarch64elf
+TMPLIB64 = lib64
+CUSTOM_LDSCRIPTS = no
+else
 ifeq ($(ARCH),i386)
 CC32 = gcc
 ELF32 = elf_i386
@@ -100,6 +106,7 @@ endif
 endif
 endif
 endif
+endif
 ifdef CC32
 OBJDIRS += obj32
diff --git a/ld.hugetlbfs b/ld.hugetlbfs
index d6d12c4..5128aa2 100755
--- a/ld.hugetlbfs
+++ b/ld.hugetlbfs
@@ -91,7 +91,7 @@ case "$EMU" in
 elf32ppclinux|elf64ppc) HPAGE_SIZE=$((16*$MB)) SLICE_SIZE=$((256*$MB)) ;;
 elf_i386|elf_x86_64)    HPAGE_SIZE=$((4*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
 elf_s390|elf64_s390)    HPAGE_SIZE=$((1*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
-armelf_linux_eabi)      HPAGE_SIZE=$((2*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
+armelf_linux_eabi|aarch64elf) HPAGE_SIZE=$((2*MB)) SLICE_SIZE=$HPAGE_SIZE ;;
 esac
 if [ "$HTLB_ALIGN" == "slice" ]; then
diff --git a/sys-aarch64elf.S b/sys-aarch64elf.S
new file mode 100644
index 000..54799d3
--- /dev/null
+++ b/sys-aarch64elf.S
@@ -0,0 +1,34 @@
+/*
+ * libhugetlbfs - Easy use of Linux hugepages
+ * Copyright (C) 2013 Linaro Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * version 2.1 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+.text
+
+.globl direct_syscall
+
+
+direct_syscall:
+        uxtw    x8, w0
+        mov     x0, x1
+        mov     x1, x2
+        mov     x2, x3
+        mov     x3, x4
+        mov     x4, x5
+        mov     x5, x6
+        mov     x6, x7
+        svc     0x0
+        ret
--
1.8.1.4
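For anyone less familiar with the AArch64 syscall convention (syscall number in x8, arguments from x0 upwards, then svc), the register shuffle in sys-aarch64elf.S can be modelled in C roughly as below. This is only a sketch under assumptions: direct_syscall_model is a hypothetical name, and it assumes the C-side caller passes the syscall number first, followed by the arguments. It goes through libc's syscall(), so unlike the assembly it returns -1 with errno set rather than the kernel's raw negative error value.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical C model of the assembly stub (not part of the patch).
 * The first parameter is the syscall number (the stub moves w0 to x8);
 * the remaining parameters each shift down by one register to become
 * the kernel's arguments.
 */
long direct_syscall_model(int sysnum, long a0, long a1, long a2,
                          long a3, long a4, long a5)
{
    /* libc performs the same "number, then arguments" hand-off. */
    return syscall(sysnum, a0, a1, a2, a3, a4, a5);
}
```

For example, direct_syscall_model(SYS_getpid, 0, 0, 0, 0, 0, 0) should return the same value as getpid().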
[PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes
Hello,
This series adds Aarch64 support and makes some minor tweaks.

The first two patches of this series add Aarch64 support to
libhugetlbfs. (Starting from 3.11-rc1, the Linux Kernel supports
HugeTLB and THP for ARM64).

Some general changes are also made:
PROT_NONE is added to the mprotect unit test, and the
linkhuge_rw test is enabled for 64 bit where there aren't any
custom ldscripts.

The final patch clears up the superfluous ARM ld.hugetlbfs
HTLB_LINK logic.

Any comments would be appreciated.

Cheers,
--
Steve

Steve Capper (5):
  Aarch64 support.
  Aarch64 unit test fixes.
  Add PROT_NONE to the mprotect test.
  Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCIPTS
  Cleanup ARM ld.hugetlbfs HTLB_LINK logic

 Makefile                              |  7 +++
 ld.hugetlbfs                          |  7 +--
 sys-aarch64elf.S                      | 34 ++
 tests/Makefile                        |  2 +-
 tests/icache-hygiene.c                |  7 ---
 tests/mprotect.c                      |  6 ++
 tests/mremap-expand-slice-collision.c |  2 +-
 7 files changed, 54 insertions(+), 11 deletions(-)
 create mode 100644 sys-aarch64elf.S

--
1.8.1.4
[PATCH 2/5] Aarch64 unit test fixes.
On Aarch64, zero bytes are illegal instructions; this is added to the
icache-hygiene test.

In mremap-expand-slice-collision, if __LP64__ is defined then
mappings are attempted at 1TB boundaries, which are outside the
allowable mmap region for Aarch64. For __aarch64__ we change this
mapping back to 256MB slices.

Signed-off-by: Steve Capper
---
 tests/icache-hygiene.c                | 7 ---
 tests/mremap-expand-slice-collision.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/icache-hygiene.c b/tests/icache-hygiene.c
index 51792b3..876ce10 100644
--- a/tests/icache-hygiene.c
+++ b/tests/icache-hygiene.c
@@ -54,7 +54,7 @@ static void cacheflush(void *p)
 {
 #if defined(__powerpc__)
     asm volatile("dcbst 0,%0; sync; icbi 0,%0; isync" : : "r"(p));
-#elif defined(__arm__)
+#elif defined(__arm__) || defined(__aarch64__)
     __clear_cache(p, p + COPY_SIZE);
 #endif
 }
@@ -87,8 +87,9 @@ static void *sig_expected;
 static void sig_handler(int signum, siginfo_t *si, void *uc)
 {
 #if defined(__powerpc__) || defined(__powerpc64__) || defined(__ia64__) || \
-defined(__s390__) || defined(__s390x__) || defined(__sparc__)
-    /* On powerpc and ia64 and s390, 0 bytes are an illegal
+defined(__s390__) || defined(__s390x__) || defined(__sparc__) || \
+defined(__aarch64__)
+    /* On powerpc, ia64, s390 and Aarch64, 0 bytes are an illegal
      * instruction, so, if the icache is cleared properly, we SIGILL
      * as soon as we jump into the cleared page */
     if (signum == SIGILL) {
diff --git a/tests/mremap-expand-slice-collision.c b/tests/mremap-expand-slice-collision.c
index c25f4c6..853f3c3 100644
--- a/tests/mremap-expand-slice-collision.c
+++ b/tests/mremap-expand-slice-collision.c
@@ -38,7 +38,7 @@ void init_slice_boundary(int fd)
     unsigned long slice_size;
     void *p1, *p2, *heap;
     int slices_ok, i, rc;
-#ifdef __LP64__
+#if defined(__LP64__) && !defined(__aarch64__)
     /* powerpc: 1TB slices starting at 1 TB */
     slice_boundary = 0x100;
     slice_size = 0x100;
--
1.8.1.4
[PATCH 3/5] Add PROT_NONE to the mprotect test.
The mprotect unit test checks PROT_READ and PROT_READ | PROT_WRITE
protections. We recently found that PROT_NONE wasn't properly
supported in an early version of our huge page kernel code.

This patch adds PROT_NONE tests to mprotect. The expected behaviour
is that neither reads nor writes should succeed.

Signed-off-by: Steve Capper
---
 tests/mprotect.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/mprotect.c b/tests/mprotect.c
index aa4673e..db6a662 100644
--- a/tests/mprotect.c
+++ b/tests/mprotect.c
@@ -213,5 +213,11 @@ int main(int argc, char *argv[])
     test_mprotect(fd, "RW->R 1/2", 2*hpage_size, PROT_READ|PROT_WRITE,
                   hpage_size, PROT_READ);
+    /* PROT_NONE tests */
+    test_mprotect(fd, "NONE->R", hpage_size, PROT_NONE,
+                  hpage_size, PROT_READ);
+    test_mprotect(fd, "NONE->RW", hpage_size, PROT_NONE,
+                  hpage_size, PROT_READ|PROT_WRITE);
+
     PASS();
 }
--
1.8.1.4
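To make the expected PROT_NONE behaviour concrete, here is a minimal standalone sketch of the same idea. It is not the libhugetlbfs test itself: it assumes an ordinary anonymous page rather than a hugetlbfs mapping, and the names read_faults and prot_none_check are hypothetical. A read from a PROT_NONE page should fault; after mprotect() to PROT_READ the same read should succeed.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

static void on_fault(int signum)
{
    (void)signum;
    siglongjmp(fault_env, 1);   /* unwind out of the faulting read */
}

/* Returns 1 if a one-byte read from p faults, 0 if it succeeds. */
int read_faults(const volatile char *p)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(fault_env, 1) == 0) {
        char c = *p;            /* faults if the page is not readable */
        (void)c;
        faulted = 0;
    } else {
        faulted = 1;            /* we arrived here via siglongjmp */
    }

    /* Put the previous handlers back. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}

/* Returns 1 if PROT_NONE and NONE->R behave as expected, 0 if not,
 * and -1 if the page could not be mapped at all. */
int prot_none_check(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (p == MAP_FAILED)
        return -1;

    ok = read_faults(p) == 1;               /* NONE: read must fault */
    if (mprotect(p, len, PROT_READ) != 0)
        ok = 0;
    else
        ok = ok && read_faults(p) == 0;     /* NONE->R: read must work */

    munmap(p, len);
    return ok;
}
```

The real test drives this through test_mprotect() on huge pages; the sketch only shows the fault-then-relax pattern the new cases exercise.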
[PATCH 5/5] Cleanup ARM ld.hugetlbfs HTLB_LINK logic
When ld.hugetlbfs is executed with --hugetlbfs-link, there is code
to check for the ARM platform and warn that this is not supported.
There is also code to check for CUSTOM_LDSCRIPTS being false and give
a similar warning.

This patch removes the ARM check as the CUSTOM_LDSCRIPTS check will
catch this.

Signed-off-by: Steve Capper
---
 ld.hugetlbfs | 5 -
 1 file changed, 5 deletions(-)

diff --git a/ld.hugetlbfs b/ld.hugetlbfs
index 5128aa2..ba9e00a 100755
--- a/ld.hugetlbfs
+++ b/ld.hugetlbfs
@@ -79,11 +79,6 @@ if [ -n "$HTLB_LINK" ]; then
     HTLB_ALIGN="" # --hugetlbfs-link overrides --hugetlbfs-align
     LDSCRIPT="$EMU.x$HTLB_LINK"
     HTLBOPTS="-T${HUGETLB_LDSCRIPT_PATH}/${LDSCRIPT}"
-
-    if [ "$EMU" == "armelf_linux_eabi" ]; then
-        echo "Please use --hugetlbfs-align when targeting ARM."
-        exit -1
-    fi
 fi
 MB=$((1024*1024))
--
1.8.1.4
Re: [PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes
On Tue, Sep 10, 2013 at 02:11:27PM +0100, Steve Capper wrote:
> Hello,
> This series adds Aarch64 support and makes some minor tweaks.
>
> The first two patches of this series add Aarch64 support to
> libhugetlbfs. (Starting from 3.11-rc1, the Linux Kernel supports
> HugeTLB and THP for ARM64).
>
> Some general changes are also made:
> PROT_NONE is added to the mprotect unit test, and the
> linkhuge_rw test is enabled for 64 bit where there aren't any
> custom ldscripts.
>
> The final patch clears up the superfluous ARM ld.hugetlbfs
> HTLB_LINK logic.
>
> Any comments would be appreciated.
>
> Cheers,
> --
> Steve

Hi guys,
This is just a ping on this series.

Do these patches look ok for inclusion into libhugetlbfs?
Should I resend the series?

Thanks,
--
Steve

>
> Steve Capper (5):
>   Aarch64 support.
>   Aarch64 unit test fixes.
>   Add PROT_NONE to the mprotect test.
>   Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCIPTS
>   Cleanup ARM ld.hugetlbfs HTLB_LINK logic
>
>  Makefile                              |  7 +++
>  ld.hugetlbfs                          |  7 +--
>  sys-aarch64elf.S                      | 34 ++
>  tests/Makefile                        |  2 +-
>  tests/icache-hygiene.c                |  7 ---
>  tests/mprotect.c                      |  6 ++
>  tests/mremap-expand-slice-collision.c |  2 +-
>  7 files changed, 54 insertions(+), 11 deletions(-)
>  create mode 100644 sys-aarch64elf.S
>
> --
> 1.8.1.4
>
Re: [Libhugetlbfs-devel] [PATCH 0/5] libhugetlbfs - Aarch64 support and some fixes
No probs, thank you Eric.

Cheers,
--
Steve
Re: [PATCH Ceph] crc32c: add arm64/aarch64 optimized crc32c implementation
Hi Yazen,
I have a few comments below.

Cheers,
--
Steve

On Wed, Jan 21, 2015 at 02:17:47PM -0600, Yazen Ghannam wrote:
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> ~4x performance increase versus sctp.

Was that perf result from the unit test?

Also describe the reason for the patch in the commit log;
we have an optional CRC instruction that can be used instead of the
Ceph in-built table lookup. This patch uses the CRC instruction where
available...
Unit tests are added and they show a ...

Probably also mention that this is based off Ed Nevill's Hadoop patch.

>
> Signed-off-by: Yazen Ghannam
> ---
>  configure.ac                   |  1 +
>  m4/ax_arm.m4                   | 18 +---
>  src/arch/arm.c                 |  2 ++
>  src/arch/arm.h                 |  1 +
>  src/common/Makefile.am         | 11 --
>  src/common/crc32c.cc           |  6 ++
>  src/common/crc32c_arm64.c      | 48 ++
>  src/common/crc32c_arm64.h      | 25 ++
>  src/test/common/test_crc32c.cc |  9
>  9 files changed, 116 insertions(+), 5 deletions(-)
>  create mode 100644 src/common/crc32c_arm64.c
>  create mode 100644 src/common/crc32c_arm64.h
>
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..4ef8081 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,25 @@ AC_DEFUN([AX_ARM_FEATURES],
>        fi
>      ;;
>      aarch64*)
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +      if test x"$ax_cv_support_armv8" = x"yes"; then
> +        ARM_ARCH_FLAGS="-march=armv8-a"
> +        ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +      fi
>        AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, [])
>        if test x"$ax_cv_support_neon_ext" = x"yes"; then
> -        ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -        AC_SUBST(ARM_NEON_FLAGS)
> -        ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +        ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>          AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
>        fi
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, [])
> +      if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +        AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +      fi
> +        ARM_NEON_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
> +        AC_SUBST(ARM_NEON_FLAGS)
> +        ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"

So if the compiler supports CRC, then ARM_FLAGS will be set with CRC
compile options. This will also affect the jerasure codegen (which
doesn't check for the CRC HWCAP). Although this is unlikely to cause
problems, I would feel more comfortable with something like:
ARM_CRC_FLAGS

That way you disambiguate between NEON (for jerasure) and CRC.

>      ;;
>    esac
>
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..2ea97eb 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_arm64_crc32 = 0;

Can we rename this variable:
ceph_arch_aarch64_crc32 ?

>
>  #include
>
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>    ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>    ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> +  ceph_arch_arm64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>    if (0)
>      get_hwcap(); // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..4ac716a 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>
>  extern int ceph_arch_neon; /* true if we have ARM NEON or ASIMD abilities */
> +extern int ceph_arch_arm64_crc32; /* true if we have Aarch64 CRC32/CRC32C abilities */

"AArch64", we have two capital A's.

>
>  extern int ceph_arch_arm_probe(void);
>
> diff --git a/src/common/Makefile.am b/src/common/Makefile.am
> index 2888194..4c36216 100644
> --- a/src/common/Makefile.am
> +++ b/src/common/Makefile.am
> @@ -103,12 +103,18 @@ libcommon_crc_la_SOURCES = \
>  common/sctp_crc32.c \
>  common/crc32c.cc \
>  common/crc32c_intel_baseline.c \
> - comm
Re: [PATCH v2 Ceph] crc32c: add aarch64 optimized crc32c implementation
Hi Yazen,
This is looking good, just a few minor comments below.

Cheers,
--
Steve

On Fri, Jan 23, 2015 at 09:13:42AM -0600, Yazen Ghannam wrote:
> ARMv8 defines a set of optional CRC32/CRC32C instructions.
> This patch defines an optimized function that uses these
> instructions when available rather than table-based lookup.
> Optimized function based on a Hadoop patch by Ed Nevill.
>
> Autotools updated to check for compiler support.
> Optimized function is selected at runtime based on HWCAP_CRC32.
> Added crc32c "performance" unit test and arch unit test.
>
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> Unit test shows ~4x performance increase versus sctp.
>
> Signed-off-by: Yazen Ghannam
> ---
>  configure.ac                   |  1 +
>  m4/ax_arm.m4                   | 18 +++--
>  src/arch/arm.c                 |  2 ++
>  src/arch/arm.h                 |  1 +
>  src/common/Makefile.am         | 10 +++-
>  src/common/crc32c.cc           |  6 +
>  src/common/crc32c_aarch64.c    | 58 ++
>  src/common/crc32c_aarch64.h    | 25 ++
>  src/test/common/test_crc32c.cc | 11
>  src/test/test_arch.cc          |  3 +++
>  10 files changed, 132 insertions(+), 3 deletions(-)
>  create mode 100644 src/common/crc32c_aarch64.c
>  create mode 100644 src/common/crc32c_aarch64.h
>
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..37ea0aa 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,27 @@ AC_DEFUN([AX_ARM_FEATURES],
>        fi
>      ;;
>      aarch64*)
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +      if test x"$ax_cv_support_armv8" = x"yes"; then
> +        ARM_ARCH_FLAGS="-march=armv8-a"
> +        ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +      fi
>        AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, [])
>        if test x"$ax_cv_support_neon_ext" = x"yes"; then
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +        ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>          ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -        AC_SUBST(ARM_NEON_FLAGS)
> -        ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
>          AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
> +        AC_SUBST(ARM_NEON_FLAGS)
> +      fi
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, [])
> +      if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +        ARM_CRC_FLAGS="-march=armv8-a+crc -DARCH_AARCH64"
> +        AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +        AC_SUBST(ARM_CRC_FLAGS)
>        fi
> +      ARM_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
>      ;;
>    esac
>
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..5a47e33 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_aarch64_crc32 = 0;
>
>  #include
>
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>    ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>    ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> +  ceph_arch_aarch64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>    if (0)
>      get_hwcap(); // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..1659b2e 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>
>  extern int ceph_arch_neon; /* true if we have ARM NEON or ASIMD abilities */
> +extern int ceph_arch_aarch64_crc32; /* true if we have AArch64 CRC32/CRC32C abilities */
>
>  extern int ceph_arch_arm_probe(void);
>
> diff --git a/src/common/Makefile.am b/src/common/Makefile.am
> index 2888194..37d1404 100644
> --- a/src/common/Makefile.am
> +++ b/src/common/Makefile.am
> @@ -112,11 +112,19 @@ endif
>  LIBCOMMON_DEPS += libcommon_crc.la
>  noinst_LTLIBRARIES += libcommon_crc.la
>
> +if HAVE_ARMV8_CRC
> +libcommon_crc_aarch64_la_SOURCES = common/crc32c_aarch64.c
> +libcommon_crc_aarch64_la_CFLAGS = $(AM_CFLAGS) $(ARM_CRC_FLAGS)
> +LIBCOMMON_DEPS += libcommon_crc_aarch64.la
> +noinst_LTLIBRARIES += libcommon_crc_aarch64.la
> +endif
> +
>  noinst_HEADERS += \
>  common/bloom_fil
Re: [PATCH v3 Ceph] crc32c: add aarch64 optimized crc32c implementation
On Fri, Jan 23, 2015 at 12:43:31PM -0500, Yazen Ghannam wrote:
> ARMv8 defines a set of optional CRC32/CRC32C instructions.
> This patch defines an optimized function that uses these
> instructions when available rather than table-based lookup.
> Optimized function based on a Hadoop patch by Ed Nevill.
>
> Autotools updated to check for compiler support.
> Optimized function is selected at runtime based on HWCAP_CRC32.
> Added crc32c "performance" unit test and arch unit test.
>
> Tested on AMD Seattle.
> Passes all crc32c unit tests.
> Unit test shows ~4x performance increase versus sctp.

Hi Yazen,
Thanks, this is looking good.

>
> Signed-off-by: Yazen Ghannam

Please add:
Reviewed-by: Steve Capper

This can be sent to ceph-de...@vger.kernel.org, let's see what upstream
think ;-).

Cheers,
--
Steve

> ---
>  configure.ac                   |  1 +
>  m4/ax_arm.m4                   | 18 ++--
>  src/arch/arm.c                 |  2 ++
>  src/arch/arm.h                 |  1 +
>  src/common/Makefile.am         | 10 -
>  src/common/crc32c.cc           |  6 ++
>  src/common/crc32c_aarch64.c    | 47 ++
>  src/common/crc32c_aarch64.h    | 27
>  src/test/common/test_crc32c.cc | 10 +
>  src/test/test_arch.cc          | 14 +
>  10 files changed, 133 insertions(+), 3 deletions(-)
>  create mode 100644 src/common/crc32c_aarch64.c
>  create mode 100644 src/common/crc32c_aarch64.h
>
> diff --git a/configure.ac b/configure.ac
> index d836b02..60e4feb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -575,6 +575,7 @@ AC_LANG_POP([C++])
>  # Find supported SIMD / NEON / SSE extensions supported by the compiler
>  AX_ARM_FEATURES()
>  AM_CONDITIONAL(HAVE_NEON, [ test "x$ax_cv_support_neon_ext" = "xyes"])
> +AM_CONDITIONAL(HAVE_ARMV8_CRC, [ test "x$ax_cv_support_crc_ext" = "xyes"])
>  AX_INTEL_FEATURES()
>  AM_CONDITIONAL(HAVE_SSSE3, [ test "x$ax_cv_support_ssse3_ext" = "xyes"])
>  AM_CONDITIONAL(HAVE_SSE4_PCLMUL, [ test "x$ax_cv_support_pclmuldq_ext" = "xyes"])
> diff --git a/m4/ax_arm.m4 b/m4/ax_arm.m4
> index 2ccc9a9..37ea0aa 100644
> --- a/m4/ax_arm.m4
> +++ b/m4/ax_arm.m4
> @@ -13,13 +13,27 @@ AC_DEFUN([AX_ARM_FEATURES],
>        fi
>      ;;
>      aarch64*)
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a, ax_cv_support_armv8=yes, [])
> +      if test x"$ax_cv_support_armv8" = x"yes"; then
> +        ARM_ARCH_FLAGS="-march=armv8-a"
> +        ARM_DEFINE_FLAGS="-DARCH_AARCH64"
> +      fi
>        AX_CHECK_COMPILE_FLAG(-march=armv8-a+simd, ax_cv_support_neon_ext=yes, [])
>        if test x"$ax_cv_support_neon_ext" = x"yes"; then
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+simd"
> +        ARM_DEFINE_FLAGS="$ARM_DEFINE_FLAGS -DARM_NEON"
>          ARM_NEON_FLAGS="-march=armv8-a+simd -DARCH_AARCH64 -DARM_NEON"
> -        AC_SUBST(ARM_NEON_FLAGS)
> -        ARM_FLAGS="$ARM_FLAGS $ARM_NEON_FLAGS"
>          AC_DEFINE(HAVE_NEON,,[Support NEON instructions])
> +        AC_SUBST(ARM_NEON_FLAGS)
> +      fi
> +      AX_CHECK_COMPILE_FLAG(-march=armv8-a+crc, ax_cv_support_crc_ext=yes, [])
> +      if test x"$ax_cv_support_crc_ext" = x"yes"; then
> +        ARM_ARCH_FLAGS="$ARM_ARCH_FLAGS+crc"
> +        ARM_CRC_FLAGS="-march=armv8-a+crc -DARCH_AARCH64"
> +        AC_DEFINE(HAVE_ARMV8_CRC,,[Support ARMv8 CRC instructions])
> +        AC_SUBST(ARM_CRC_FLAGS)
>        fi
> +      ARM_FLAGS="$ARM_ARCH_FLAGS $ARM_DEFINE_FLAGS"
>      ;;
>    esac
>
> diff --git a/src/arch/arm.c b/src/arch/arm.c
> index 93d079a..5a47e33 100644
> --- a/src/arch/arm.c
> +++ b/src/arch/arm.c
> @@ -2,6 +2,7 @@
>
>  /* flags we export */
>  int ceph_arch_neon = 0;
> +int ceph_arch_aarch64_crc32 = 0;
>
>  #include
>
> @@ -47,6 +48,7 @@ int ceph_arch_arm_probe(void)
>    ceph_arch_neon = (get_hwcap() & HWCAP_NEON) == HWCAP_NEON;
>  #elif __aarch64__ && __linux__
>    ceph_arch_neon = (get_hwcap() & HWCAP_ASIMD) == HWCAP_ASIMD;
> +  ceph_arch_aarch64_crc32 = (get_hwcap() & HWCAP_CRC32) == HWCAP_CRC32;
>  #else
>    if (0)
>      get_hwcap(); // make compiler shut up
> diff --git a/src/arch/arm.h b/src/arch/arm.h
> index f613438..1659b2e 100644
> --- a/src/arch/arm.h
> +++ b/src/arch/arm.h
> @@ -6,6 +6,7 @@ extern "C" {
>  #endif
>
>  extern int ceph_arch_neon; /* true if we have ARM NEON or ASIMD
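As a rough illustration of the approach described in the commit message (use the CRC32C instruction when the HWCAP says it is there, otherwise fall back to software), here is a small self-contained sketch. It is not the code from the patch: the function names are made up for this example, the fallback is a plain bitwise loop rather than Ceph's table lookup, and the hardware path is only compiled when the compiler defines __ARM_FEATURE_CRC32 (for instance with -march=armv8-a+crc) on AArch64 Linux.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__) && defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#include <asm/hwcap.h>
#include <sys/auxv.h>
#define CRC32C_HAVE_HW_PATH 1
#endif

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
 * Updates a raw CRC state; no initial or final inversion here. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

#ifdef CRC32C_HAVE_HW_PATH
/* Same state update, one byte at a time, via the ACLE intrinsic. */
static uint32_t crc32c_hw(uint32_t crc, const unsigned char *buf, size_t len)
{
    while (len--)
        crc = __crc32cb(crc, *buf++);
    return crc;
}
#endif

/* Hypothetical dispatcher: picks the hardware path at runtime when the
 * kernel reports HWCAP_CRC32, otherwise uses the software loop. */
uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *buf = data;

#ifdef CRC32C_HAVE_HW_PATH
    if (getauxval(AT_HWCAP) & HWCAP_CRC32)
        return crc32c_hw(crc, buf, len);
#endif
    return crc32c_sw(crc, buf, len);
}

/* Conventional CRC32C of a whole buffer (init and final XOR applied). */
uint32_t crc32c(const void *data, size_t len)
{
    return ~crc32c_update(0xFFFFFFFFu, data, len);
}
```

With this sketch, crc32c("123456789", 9) gives 0xE3069283, the standard CRC-32C check value. A real implementation would process 8 bytes per instruction where possible and, as in the patch, build the hardware file with its own CRC flags so the rest of the tree is not compiled with +crc.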
Cortex A57 Optimisation Guide
For those interested in optimising code for the Cortex-A57, the
following guide has just been released:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0015a/cortex_a57_software_optimisation_guide_external.pdf

Cheers,
--
Steve
Re: JITs and 52-bit VA
On 28 April 2016 at 14:24, Peter Maydell wrote:
> On 28 April 2016 at 14:17, Arnd Bergmann wrote:
>> One simple (from the kernel's perspective, not from the JIT) approach
>> might be to always use MAP_FIXED whenever an allocation is made for
>> memory that needs these special pointers, and then manage the available
>> address space explicitly. Would that work, or do you require everything
>> including the binary itself to be below the address?
>
> The trouble IME with this idea is that in practice you're
> linking with glibc, which means glibc is managing (and using)
> the address space, not the JIT. So MAP_FIXED is pretty awkward
> to use.
>
> thanks
> -- PMM

Hi,
One can find holes in the VA space by examining /proc/self/maps, thus
selection of pointers for MAP_FIXED can be deduced.

The other problem is, as Arnd alluded to, if a JIT'ed object needs to
then refer to something allocated outside of the JIT. This could be
remedied by another level of indirection/trampoline.

Taking two steps back though, I would view VA space squeezing as a
stop-gap before removing tags from the upper bits of a pointer
altogether (tagging the bottom bits, by controlling alignment is
perfectly safe). The larger the VA space, the more scope mechanisms
such as Address Space Layout Randomisation have to improve security.

Cheers,
--
Steve
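To sketch the /proc/self/maps idea mentioned above: the function below walks the (address-sorted) mapping list and reports the first gap of a requested size at or above a given floor. It is an illustration only; find_va_hole and its parameters are names invented here, it is Linux-specific, and it ignores any space above the last mapping. The result is a candidate only: another thread or glibc may map into the gap before you do, and plain MAP_FIXED silently replaces whatever is already there, so on newer kernels MAP_FIXED_NOREPLACE may be the safer flag.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan /proc/self/maps for the first unmapped gap of at least `len`
 * bytes that starts at or above `floor`. On success returns 0 and
 * stores the gap's start address in *out; returns -1 if the file
 * cannot be read or no such gap lies between existing mappings.
 */
int find_va_hole(uintptr_t floor, size_t len, uintptr_t *out)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    uint64_t prev_end = floor;
    int at_line_start = 1;
    int rc = -1;

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        uint64_t start, end;
        int parse = at_line_start;

        /* A chunk without '\n' means the next chunk continues this line. */
        at_line_start = strchr(line, '\n') != NULL;
        if (!parse)
            continue;

        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) != 2)
            continue;

        if (start > prev_end && start - prev_end >= len) {
            *out = (uintptr_t)prev_end;   /* gap is [prev_end, start) */
            rc = 0;
            break;
        }
        if (end > prev_end)
            prev_end = end;
    }

    fclose(f);
    return rc;
}
```

A JIT could call this with its required size and a floor above mmap_min_addr, then attempt the mapping at the returned address and retry elsewhere if that fails.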