+1, pls add the typedef...

Thanks… Dave

-----Original Message-----
From: Damjan Marion (damarion) 
Sent: Saturday, August 19, 2017 9:09 AM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: George Zhao <george.y.z...@huawei.com>; vpp-dev@lists.fd.io; 
disc...@lists.fd.io; csit-...@lists.fd.io
Subject: Re: [discuss] Question about VPP support for ARM 64


GCC is able to compile ARM64 code with 256-bit vectors even if the target platform 
has only 128-bit registers.

I.e. for the u8x32 version of that function it generates:

ARM64:
dpdk_buffer_init_from_template(void*, void*, void*, void*, void*):
        ld1     {v0.16b - v1.16b}, [x4], 32
        st1     {v0.16b - v1.16b}, [x3], 32
        st1     {v0.16b - v1.16b}, [x2], 32
        st1     {v0.16b - v1.16b}, [x1], 32
        st1     {v0.16b - v1.16b}, [x0], 32
        ld1     {v0.16b - v1.16b}, [x4]
        st1     {v0.16b - v1.16b}, [x3]
        st1     {v0.16b - v1.16b}, [x2]
        st1     {v0.16b - v1.16b}, [x1]
        st1     {v0.16b - v1.16b}, [x0]
        ret

Intel x86-64 without AVX2:

dpdk_buffer_init_from_template(void*, void*, void*, void*, void*):
 push   %rbp
 mov    %rsp,%rbp
 and    $0xffffffffffffffe0,%rsp
 lea    0x10(%rsp),%rsp
 movdqa (%r8),%xmm1
 movdqa 0x10(%r8),%xmm0
 movdqa %xmm0,0x10(%rcx)
 movdqa %xmm1,(%rcx)
 movdqa %xmm1,(%rdx)
 movdqa %xmm0,0x10(%rdx)
 movdqa %xmm1,(%rsi)
 movdqa %xmm0,0x10(%rsi)
 movdqa %xmm1,(%rdi)
 movdqa %xmm0,0x10(%rdi)
 movdqa 0x20(%r8),%xmm1
 movdqa 0x30(%r8),%xmm0
 movdqa %xmm0,0x30(%rcx)
 movdqa %xmm1,0x20(%rcx)
 movdqa %xmm1,0x20(%rdx)
 movdqa %xmm0,0x30(%rdx)
 movdqa %xmm1,0x20(%rsi)
 movdqa %xmm0,0x30(%rsi)
 movdqa %xmm1,0x20(%rdi)
 movdqa %xmm0,0x30(%rdi)
 leaveq 
 retq   


So I think the only thing missing here is the typedef.
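
A minimal sketch of what the missing typedef and the 256-bit variant might look like. It assumes GCC's vector_size attribute is used directly (vppinfra wraps it in its _vector_size macro), that all five buffers are 32-byte aligned, and that the template is 64 bytes, matching the four 16-byte iterations of the u8x16 version. The function name buffer_init_from_template_256 is hypothetical and not taken from the VPP tree.

```c
#include <stdint.h>

typedef uint8_t u8;

/* Assumption: the raw GCC/Clang vector extension is spelled out here;
   in vppinfra the same thing is written with the _vector_size macro. */
typedef u8 u8x32 __attribute__ ((vector_size (32)));

/* Hypothetical name. Copies a 64-byte template from s into d0..d3,
   32 bytes at a time. All pointers are assumed to be 32-byte aligned.
   On targets without 256-bit registers the compiler splits each
   u8x32 access into narrower loads and stores. */
static inline void
buffer_init_from_template_256 (void *d0, void *d1, void *d2, void *d3,
                               void *s)
{
  int i;
  for (i = 0; i < 2; i++)
    {
      u8x32 v = *(u8x32 *) (((u8 *) s) + i * 32);
      *(u8x32 *) (((u8 *) d0) + i * 32) = v;
      *(u8x32 *) (((u8 *) d1) + i * 32) = v;
      *(u8x32 *) (((u8 *) d2) + i * 32) = v;
      *(u8x32 *) (((u8 *) d3) + i * 32) = v;
    }
}
```

Loading the vector into a local first, rather than chaining the assignments, keeps the single load and four stores per iteration explicit; the generated code should be equivalent.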


> On 19 Aug 2017, at 14:51, Dave Barach (dbarach) <dbar...@cisco.com> wrote:
> 
> Dear George,
>  
> This specific issue isn’t anywhere near as bad as you might think. As given, 
> the code confuses 128-bit vectors with 256-bit vectors, and 64-bit vectors 
> with 128-bit vectors.
>  
> Question: does the hardware involved support 256-bit vectors? Probably not... 
> It almost certainly does support 128-bit vectors.
>  
> To make progress, use the known-good u8x16 / 128-bit vector code:   
>  
> static_always_inline void
> dpdk_buffer_init_from_template (void *d0, void *d1, void *d2, void *d3,
>                       void *s)
> {
> #if defined(CLIB_HAVE_VEC128)
>   int i;
>   for (i = 0; i < 4; i++)
>     {
>       *(u8x16 *) (((u8 *) d0) + i * 16) =
>      *(u8x16 *) (((u8 *) d1) + i * 16) =
>      *(u8x16 *) (((u8 *) d2) + i * 16) =
>      *(u8x16 *) (((u8 *) d3) + i * 16) = *(u8x16 *) (((u8 *) s) + i * 16);
>     }
> #else
> #error "CLIB_HAVE_VEC128 has to be defined"
> #endif
> }
>  
> Responsible parties - they know who they are - will be back from PTO shortly. 
> We need to clean up / create CLIB_HAVE_VEC_256 and move the 256-bit vector 
> engine code...
>  
> You could also try adding “typedef u8 u8x32 _vector_size(32)” but I somehow 
> doubt that will produce anything other than a compiler error.
>  
> HTH… Dave
>  
> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On 
> Behalf Of George Zhao
> Sent: Friday, August 18, 2017 7:32 PM
> To: 'vpp-dev@lists.fd.io' <vpp-dev@lists.fd.io>; 'disc...@lists.fd.io' 
> <disc...@lists.fd.io>; 'csit-...@lists.fd.io' <csit-...@lists.fd.io>
> Subject: [vpp-dev] Question about VPP support for ARM 64
>  
> We encountered the following issues while trying to build VPP on ARM 64. It seems 
> that right now only ARM32 is supported in the code. I list the steps we tried below 
> and hope VPP folks can help us work around this issue.
>  
> Steps: 
> 1. install Ubuntu 16.04 on OD1K  
> $>> uname -a
> Linux OD1K 4.4.0-92-generic #115-Ubuntu SMP Thu Aug 10 09:10:33 UTC 2017 
> aarch64 aarch64 aarch64 GNU/Linux
>  
> 2. git clone VPP 17.04 and build VPP
> ## Error:
> make[2]: Entering directory '/home/huawei/GIT/vpp.1704/dpdk'
> cat: '/sys/bus/pci/devices/0000:00:01.0/uevent': No such file or directory
>  
> **Workaround to bypass the Makefile check:
> ##############################################################################
> # Cavium ThunderX
> ##############################################################################
> #else ifneq (,$(findstring thunder,$(shell cat 
> /sys/bus/pci/devices/0000:00:01.0/uevent | grep cavium)))
> else
> export CROSS=""
> DPDK_TARGET           ?= arm64-thunderx-linuxapp-$(DPDK_CC)
> DPDK_MACHINE          ?= thunderx
> DPDK_TUNE             ?= generic
>  
> 3. Then ran make build, which failed with the following:
> /home/huawei/GIT/vpp.1704/build-data/../src/plugins/dpdk/device/node.c:276:9: 
> error: `u8x32' undeclared (first use in this function)
>        *(u8x32 *) (((u8 *) d0) + i * 32) =
>  
> ** Checked vppinfra/vppinfra/vector.h and did not find u8x32 under “aarch64”:
> #if defined (__aarch64__) || defined (__arm__)
> typedef unsigned int u32x4 _vector_size (16);
> typedef u8 u8x16 _vector_size (16);
> typedef u16 u16x8 _vector_size (16);
> typedef u32 u32x4 _vector_size (16);
> typedef u64 u64x2 _vector_size (16);
> #endif
>  
> 4. According to https://wiki.fd.io/view/VPP/Alternative_builds,
> VPP seems to support arm32 only:
> export PLATFORM=arm32
>  
>  
> *Questions:
> Did I miss some steps, or should I include other header files that define u8x32?
>  
>  
> Thanks,
> George
>  
> _______________________________________________
> discuss mailing list
> disc...@lists.fd.io
> https://lists.fd.io/mailman/listinfo/discuss

_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
