Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Yao, Lei A
Hi, Lucero, Thomas

This patch set causes a deadlock during memory initialization.
rte_memseg_walk and try_expand_heap both take
the lock &mcfg->memory_hotplug_lock, so a deadlock occurs.

#0   rte_memseg_walk
#1  <-rte_eal_check_dma_mask
#2  <-alloc_pages_on_heap
#3  <-try_expand_heap_primary   
#4  <-try_expand_heap
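For illustration, what this stack describes is a non-recursive lock being re-acquired on the same thread. A minimal standalone sketch of that pattern, assuming the EAL's rte_rwlock and simplified stand-in bodies (which path takes the read lock and which the write lock is an assumption here; either way the second acquisition never succeeds). Build against the DPDK headers, e.g. with pkg-config --cflags libdpdk:

#include <stdio.h>
#include <rte_rwlock.h>

/* Stand-in for &mcfg->memory_hotplug_lock. */
static rte_rwlock_t hotplug_lock = RTE_RWLOCK_INITIALIZER;

/* Stand-in for rte_memseg_walk(), which takes the lock around the walk. */
static void walk_memsegs(void)
{
	rte_rwlock_read_lock(&hotplug_lock); /* spins: this thread already holds the write lock */
	/* ... visit memsegs ... */
	rte_rwlock_read_unlock(&hotplug_lock);
}

/* Stand-in for try_expand_heap_primary(), which runs with the lock held and,
 * via the new DMA-mask check, calls back into the walker. */
static void expand_heap(void)
{
	rte_rwlock_write_lock(&hotplug_lock);
	walk_memsegs();
	rte_rwlock_write_unlock(&hotplug_lock);
}

int main(void)
{
	expand_heap(); /* never returns: this is the reported hang */
	printf("not reached\n");
	return 0;
}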

Log as follows:
EAL: TSC frequency is ~2494156 KHz
EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
[New Thread 0x75e0d700 (LWP 330350)]
EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0

Could you take a look at this? A lot of test cases in our validation
team are failing because of it. Thanks a lot!

BRs
Lei


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Monday, October 29, 2018 5:04 AM
> To: Alejandro Lucero 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA
> mask
> 
> 05/10/2018 14:45, Alejandro Lucero:
> > I sent a patchset about this to be applied on 17.11 stable. The memory
> > code has had main changes since that version, so here it is the patchset
> > adjusted to current master repo.
> >
> > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > restricted range due to addressing limitations with some devices. There
> > are two known cases: NFP and IOMMU VT-d emulation.
> >
> > With this check IOVAs out of range are detected and PMDs can abort
> > initialization. For the VT-d case, IOVA VA mode is allowed as long as
> > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > default.
> >
> > For the addressing limitations known cases, there are just 40(NFP) or
> > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those limitations
> > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely enough for
> > most systems. With machines using more memory, the added check will
> > ensure IOVAs within the range.
> >
> > With IOVA VA, and because the way the Linux kernel serves mmap calls
> > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > give an address hint with a lower starting address than the default one
> > used by the kernel, and then ensuring the mmap uses that hint or hint plus
> > some offset. With 64 bits systems, the process virtual address space is
> > large enough for doing the hugepages mmapping within the supported
> range
> > when those addressing limitations exist. This patchset also adds a change
> > for using such a hint making the use of IOVA VA a more than likely
> > possibility when there are those addressing limitations.
> >
> > The check is not done by default but just when it is required. This
> > patchset adds the check for NFP initialization and for setting the IOVA
> > mode if an emulated VT-d is detected. Also, because of the recent patchset
> > adding dynamic memory allocation, the check is also invoked for ensuring
> > the new memsegs are within the required range.
> >
> > This patchset could be applied to stable 18.05.
> 
> Applied, thanks
> 
> 
> 



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Thomas Monjalon
29/10/2018 09:23, Yao, Lei A:
> Hi, Lucero, Thomas
> 
> This patch set will cause deadlock during memory initialization. 
> rte_memseg_walk and try_expand_heap both will lock 
> the file &mcfg->memory_hotplug_lock. So dead lock will occur. 
> 
> #0   rte_memseg_walk
> #1  <-rte_eal_check_dma_mask
> #2  <-alloc_pages_on_heap
> #3  <-try_expand_heap_primary   
> #4  <-try_expand_heap
> 
> Log as following:
> EAL: TSC frequency is ~2494156 KHz
> EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> [New Thread 0x75e0d700 (LWP 330350)]
> EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> EAL: Trying to obtain current memory policy.
> EAL: Setting policy MPOL_PREFERRED for socket 0
> EAL: Restoring previous memory policy: 0
> 
> Could you have a check on this? A lot of test cases in our validation 
> team fail because of this. Thanks a lot!

Can we just call rte_memseg_walk_thread_unsafe()?

+Cc Anatoly
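For reference, the swap being suggested would land inside the DMA-mask check and look roughly like this. This is a paraphrase for illustration only, not the actual EAL code: the check_iova callback body, the omitted logging, and the bounds handling are all simplified assumptions.

#include <stdint.h>
#include <rte_common.h>
#include <rte_memory.h>

/* Walker callback matching rte_memseg_walk_t: a non-zero return stops the walk. */
static int
check_iova(const struct rte_memseg_list *msl __rte_unused,
	   const struct rte_memseg *ms, void *arg)
{
	uint64_t *mask = arg;

	/* Any IOVA bit above the DMA mask makes the segment unusable. */
	return (ms->iova & *mask) ? 1 : 0;
}

int
rte_eal_check_dma_mask(uint8_t maskbits)
{
	uint64_t mask;

	if (maskbits > 63)
		return -1;
	mask = ~((1ULL << maskbits) - 1);

	/*
	 * The malloc heap expansion path already holds memory_hotplug_lock,
	 * so walk without re-taking it.
	 */
	if (rte_memseg_walk_thread_unsafe(check_iova, &mask))
		return -1;
	return 0;
}

The trade-off, as the rest of the thread shows, is that the check must then only be called from contexts that already hold the hotplug lock.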


> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > 05/10/2018 14:45, Alejandro Lucero:
> > > I sent a patchset about this to be applied on 17.11 stable. The memory
> > > code has had main changes since that version, so here it is the patchset
> > > adjusted to current master repo.
> > >
> > > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > > restricted range due to addressing limitations with some devices. There
> > > are two known cases: NFP and IOMMU VT-d emulation.
> > >
> > > With this check IOVAs out of range are detected and PMDs can abort
> > > initialization. For the VT-d case, IOVA VA mode is allowed as long as
> > > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > > default.
> > >
> > > For the addressing limitations known cases, there are just 40(NFP) or
> > > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those limitations
> > > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely enough for
> > > most systems. With machines using more memory, the added check will
> > > ensure IOVAs within the range.
> > >
> > > With IOVA VA, and because the way the Linux kernel serves mmap calls
> > > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > > give an address hint with a lower starting address than the default one
> > > used by the kernel, and then ensuring the mmap uses that hint or hint plus
> > > some offset. With 64 bits systems, the process virtual address space is
> > > large enough for doing the hugepages mmapping within the supported
> > range
> > > when those addressing limitations exist. This patchset also adds a change
> > > for using such a hint making the use of IOVA VA a more than likely
> > > possibility when there are those addressing limitations.
> > >
> > > The check is not done by default but just when it is required. This
> > > patchset adds the check for NFP initialization and for setting the IOVA
> > > mode if an emulated VT-d is detected. Also, because of the recent patchset
> > > adding dynamic memory allocation, the check is also invoked for ensuring
> > > the new memsegs are within the required range.
> > >
> > > This patchset could be applied to stable 18.05.
> > 
> > Applied, thanks





Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Thomas Monjalon
One more comment about this issue,

There was no reply to the question asked by Alejandro on October 11th:
http://mails.dpdk.org/archives/dev/2018-October/115402.html
and there were no more reviews despite all my requests:
http://mails.dpdk.org/archives/dev/2018-October/117475.html
Without any more comment, I had to apply the patchset.

Now we need to find a solution. Please suggest.


29/10/2018 09:42, Thomas Monjalon:
> 29/10/2018 09:23, Yao, Lei A:
> > Hi, Lucero, Thomas
> > 
> > This patch set will cause deadlock during memory initialization. 
> > rte_memseg_walk and try_expand_heap both will lock 
> > the file &mcfg->memory_hotplug_lock. So dead lock will occur. 
> > 
> > #0   rte_memseg_walk
> > #1  <-rte_eal_check_dma_mask
> > #2  <-alloc_pages_on_heap
> > #3  <-try_expand_heap_primary   
> > #4  <-try_expand_heap
> > 
> > Log as following:
> > EAL: TSC frequency is ~2494156 KHz
> > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > [New Thread 0x75e0d700 (LWP 330350)]
> > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > EAL: Trying to obtain current memory policy.
> > EAL: Setting policy MPOL_PREFERRED for socket 0
> > EAL: Restoring previous memory policy: 0
> > 
> > Could you have a check on this? A lot of test cases in our validation 
> > team fail because of this. Thanks a lot!
> 
> Can we just call rte_memseg_walk_thread_unsafe()?
> 
> +Cc Anatoly
> 
> 
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > 05/10/2018 14:45, Alejandro Lucero:
> > > > I sent a patchset about this to be applied on 17.11 stable. The memory
> > > > code has had main changes since that version, so here it is the patchset
> > > > adjusted to current master repo.
> > > >
> > > > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > > > restricted range due to addressing limitations with some devices. There
> > > > are two known cases: NFP and IOMMU VT-d emulation.
> > > >
> > > > With this check IOVAs out of range are detected and PMDs can abort
> > > > initialization. For the VT-d case, IOVA VA mode is allowed as long as
> > > > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > > > default.
> > > >
> > > > For the addressing limitations known cases, there are just 40(NFP) or
> > > > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those limitations
> > > > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely enough for
> > > > most systems. With machines using more memory, the added check will
> > > > ensure IOVAs within the range.
> > > >
> > > > With IOVA VA, and because the way the Linux kernel serves mmap calls
> > > > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > > > give an address hint with a lower starting address than the default one
> > > > used by the kernel, and then ensuring the mmap uses that hint or hint 
> > > > plus
> > > > some offset. With 64 bits systems, the process virtual address space is
> > > > large enough for doing the hugepages mmapping within the supported
> > > range
> > > > when those addressing limitations exist. This patchset also adds a 
> > > > change
> > > > for using such a hint making the use of IOVA VA a more than likely
> > > > possibility when there are those addressing limitations.
> > > >
> > > > The check is not done by default but just when it is required. This
> > > > patchset adds the check for NFP initialization and for setting the IOVA
> > > > mode if an emulated VT-d is detected. Also, because of the recent patchset
> > > > adding dynamic memory allocation, the check is also invoked for ensuring
> > > > the new memsegs are within the required range.
> > > >
> > > > This patchset could be applied to stable 18.05.
> > > 
> > > Applied, thanks





Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
Can we have the configuration triggering this issue?

On Mon, Oct 29, 2018 at 9:07 AM Thomas Monjalon  wrote:

> One more comment about this issue,
>
> There was no reply to the question asked by Alejandro on October 11th:
> http://mails.dpdk.org/archives/dev/2018-October/115402.html
> and there were no more reviews despite all my requests:
> http://mails.dpdk.org/archives/dev/2018-October/117475.html
> Without any more comment, I had to apply the patchset.
>
> Now we need to find a solution. Please suggest.
>
>
> 29/10/2018 09:42, Thomas Monjalon:
> > 29/10/2018 09:23, Yao, Lei A:
> > > Hi, Lucero, Thomas
> > >
> > > This patch set will cause deadlock during memory initialization.
> > > rte_memseg_walk and try_expand_heap both will lock
> > > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
> > >
> > > #0   rte_memseg_walk
> > > #1  <-rte_eal_check_dma_mask
> > > #2  <-alloc_pages_on_heap
> > > #3  <-try_expand_heap_primary
> > > #4  <-try_expand_heap
> > >
> > > Log as following:
> > > EAL: TSC frequency is ~2494156 KHz
> > > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > > [New Thread 0x75e0d700 (LWP 330350)]
> > > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > > EAL: Trying to obtain current memory policy.
> > > EAL: Setting policy MPOL_PREFERRED for socket 0
> > > EAL: Restoring previous memory policy: 0
> > >
> > > Could you have a check on this? A lot of test cases in our validation
> > > team fail because of this. Thanks a lot!
> >
> > Can we just call rte_memseg_walk_thread_unsafe()?
> >
> > +Cc Anatoly
> >
> >
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > > 05/10/2018 14:45, Alejandro Lucero:
> > > > > I sent a patchset about this to be applied on 17.11 stable. The
> memory
> > > > > code has had main changes since that version, so here it is the
> patchset
> > > > > adjusted to current master repo.
> > > > >
> > > > > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > > > > restricted range due to addressing limitations with some devices.
> There
> > > > > are two known cases: NFP and IOMMU VT-d emulation.
> > > > >
> > > > > With this check IOVAs out of range are detected and PMDs can abort
> > > > > initialization. For the VT-d case, IOVA VA mode is allowed as long
> as
> > > > > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > > > > default.
> > > > >
> > > > > For the addressing limitations known cases, there are just 40(NFP)
> or
> > > > > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those
> limitations
> > > > > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely
> enough for
> > > > > most systems. With machines using more memory, the added check will
> > > > > ensure IOVAs within the range.
> > > > >
> > > > > With IOVA VA, and because the way the Linux kernel serves mmap
> calls
> > > > > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > > > > give an address hint with a lower starting address than the
> default one
> > > > > used by the kernel, and then ensuring the mmap uses that hint or
> hint plus
> > > > > some offset. With 64 bits systems, the process virtual address
> space is
> > > > > large enough for doing the hugepages mmapping within the supported
> > > > range
> > > > > when those addressing limitations exist. This patchset also adds a
> change
> > > > > for using such a hint making the use of IOVA VA a more than likely
> > > > > possibility when there are those addressing limitations.
> > > > >
> > > > > The check is not done by default but just when it is required. This
> > > > > patchset adds the check for NFP initialization and for setting the
> IOVA
> > > > > mode if an emulated VT-d is detected. Also, because of the recent
> patchset
> > > > > adding dynamic memory allocation, the check is also invoked for
> ensuring
> > > > > the new memsegs are within the required range.
> > > > >
> > > > > This patchset could be applied to stable 18.05.
> > > >
> > > > Applied, thanks
>
>
>
>


Re: [dpdk-dev] [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation routine

2018-10-29 Thread Slava Ovsiienko
> -Original Message-
> From: Yongseok Koh
> Sent: Saturday, October 27, 2018 0:57
> To: Slava Ovsiienko 
> Cc: Shahaf Shuler ; dev@dpdk.org
> Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation
> routine
> 
> On Fri, Oct 26, 2018 at 01:39:38AM -0700, Slava Ovsiienko wrote:
> > > -Original Message-
> > > From: Yongseok Koh
> > > Sent: Friday, October 26, 2018 6:07
> > > To: Slava Ovsiienko 
> > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation
> > > routine
> > >
> > > On Thu, Oct 25, 2018 at 06:53:11AM -0700, Slava Ovsiienko wrote:
> > > > > -Original Message-
> > > > > From: Yongseok Koh
> > > > > Sent: Tuesday, October 23, 2018 13:05
> > > > > To: Slava Ovsiienko 
> > > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > > Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow
> > > > > validation routine
> > > > >
> > > > > On Mon, Oct 15, 2018 at 02:13:30PM +, Viacheslav Ovsiienko
> wrote:
> > > [...]
> > > > > > @@ -1114,7 +1733,6 @@ struct pedit_parser {
> > > > > >error);
> > > > > > if (ret < 0)
> > > > > > return ret;
> > > > > > -   item_flags |=
> MLX5_FLOW_LAYER_OUTER_L3_IPV4;
> > > > > > mask.ipv4 = flow_tcf_item_mask
> > > > > > (items, &rte_flow_item_ipv4_mask,
> > > > > >  &flow_tcf_mask_supported.ipv4,
> @@ -1135,13 +1753,22 @@
> > > > > > struct pedit_parser {
> > > > > > next_protocol =
> > > > > > ((const struct
> rte_flow_item_ipv4 *)
> > > > > >  (items->spec))-
> >hdr.next_proto_id;
> > > > > > +   if (item_flags &
> > > > > MLX5_FLOW_LAYER_OUTER_L3_IPV4) {
> > > > > > +   /*
> > > > > > +* Multiple outer items are not
> allowed as
> > > > > > +* tunnel parameters, will raise an
> error later.
> > > > > > +*/
> > > > > > +   ipv4 = NULL;
> > > > >
> > > > > Can't it be inner then?
> > > > AFAIK,  no for tc rules, we can not specify multiple levels (inner
> > > > + outer) for
> > > them.
> > > > There is just no TCA_FLOWER_KEY_xxx attributes  for specifying
> > > > inner
> > > items
> > > > to match by flower.
> > >
> > > When I briefly read the kernel code, I thought TCA_FLOWER_KEY_* are
> > > for inner header before decap. I mean TCA_FLOWER_KEY_IPV4_SRC is
> for
> > > inner L3 and TCA_FLOWER_KEY_ENC_IPV4_SRC is for outer tunnel
> header.
> > > Please do some experiments with tc-flower command.
> >
> > Hm. Interesting. I will check.
> >
> > > > It is quite unclear comment, not the best one, sorry. I did not
> > > > like it too, just forgot to rewrite.
> > > >
> > > > ipv4, ipv6 , udp variables gather the matching items during the
> > > > item list
> > > scanning,
> > > > later variables are used for VXLAN decap action validation only.
> > > > So, the
> > > "outer"
> > > > means that ipv4 variable contains the VXLAN decap outer addresses,
> > > > and should be NULL-ed if multiple items are found in the items list.
> > > >
> > > > But we can generate an error here if we have valid action_flags
> > > > (gathered by prepare function) and VXLAN decap is set. Raising an
> > > > error looks more relevant and clear.
> > >
> > > You can't use flags at this point. It is validate() so prepare()
> > > might not be preceded.
> > >
> > > > >   flow create 1 ingress transfer
> > > > > pattern eth src is 66:77:88:99:aa:bb
> > > > >   dst is 00:11:22:33:44:55 / ipv4 src is 2.2.2.2 dst is 1.1.1.1 /
> > > > >   udp src is 4789 dst is 4242 / vxlan vni is 0x112233 /
> > > > >   eth / ipv6 / tcp dst is 42 / end
> > > > > actions vxlan_decap / port_id id 2 / end
> > > > >
> > > > > Is this flow supported by linux tcf? I took this example from
> > > > > Adrien's
> > > patch -
> > > > > "[8/8] net/mlx5: add VXLAN decap support to switch flow rules".
> > > > > If so,
> > > isn't it
> > > > > possible to have inner L3 layer (MLX5_FLOW_LAYER_INNER_*)? If
> > > > > not,
> > > you
> > > > > should return error in this case. I don't see any code to check
> > > > > redundant outer items.
> > > > > Did I miss something?
> > > >
> > > > Interesting, besides rule has correct syntax, I'm not sure whether
> > > > it can be
> > > applied w/o errors.
> > >
> > > Please try. You owns this patchset. However, you just can prohibit
> > > such flows (tunneled item) and come up with follow-up patches to
> > > enable it later if it is support by tcf as this whole patchset
> > > itself is pretty huge enough and we don't have much time.
> > >
> > > > At least our current flow_tcf_translate() implementation does not
> > > > support
> > > any INNERs.
> > > > But it seems the flo

Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Yao, Lei A


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Monday, October 29, 2018 4:43 PM
> To: Yao, Lei A 
> Cc: Alejandro Lucero ; dev@dpdk.org;
> Xu, Qian Q ; Lin, Xueqin ;
> Burakov, Anatoly 
> Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA
> mask
> 
> 29/10/2018 09:23, Yao, Lei A:
> > Hi, Lucero, Thomas
> >
> > This patch set will cause deadlock during memory initialization.
> > rte_memseg_walk and try_expand_heap both will lock
> > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
> >
> > #0   rte_memseg_walk
> > #1  <-rte_eal_check_dma_mask
> > #2  <-alloc_pages_on_heap
> > #3  <-try_expand_heap_primary
> > #4  <-try_expand_heap
> >
> > Log as following:
> > EAL: TSC frequency is ~2494156 KHz
> > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > [New Thread 0x75e0d700 (LWP 330350)]
> > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > EAL: Trying to obtain current memory policy.
> > EAL: Setting policy MPOL_PREFERRED for socket 0
> > EAL: Restoring previous memory policy: 0
> >
> > Could you have a check on this? A lot of test cases in our validation
> > team fail because of this. Thanks a lot!
> 
> Can we just call rte_memseg_walk_thread_unsafe()?
> 
> +Cc Anatoly

Hi, Thomas

I changed to rte_memseg_walk_thread_unsafe(), and it still
doesn't work.

EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: memseg iova 14000, len 4000, out of range
EAL:using dma mask 
EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 1
EAL: Restoring previous memory policy: 0
EAL: memseg iova 1bc000, len 4000, out of range
EAL:using dma mask 
EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
error allocating rte services array
EAL: FATAL: rte_service_init() failed
EAL: rte_service_init() failed
PANIC in main():

BRs
Lei
> 
> 
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > 05/10/2018 14:45, Alejandro Lucero:
> > > > I sent a patchset about this to be applied on 17.11 stable. The memory
> > > > code has had main changes since that version, so here it is the patchset
> > > > adjusted to current master repo.
> > > >
> > > > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > > > restricted range due to addressing limitations with some devices. There
> > > > are two known cases: NFP and IOMMU VT-d emulation.
> > > >
> > > > With this check IOVAs out of range are detected and PMDs can abort
> > > > initialization. For the VT-d case, IOVA VA mode is allowed as long as
> > > > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > > > default.
> > > >
> > > > For the addressing limitations known cases, there are just 40(NFP) or
> > > > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those limitations
> > > > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely enough for
> > > > most systems. With machines using more memory, the added check
> will
> > > > ensure IOVAs within the range.
> > > >
> > > > With IOVA VA, and because the way the Linux kernel serves mmap calls
> > > > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > > > give an address hint with a lower starting address than the default one
> > > > used by the kernel, and then ensuring the mmap uses that hint or hint
> plus
> > > > some offset. With 64 bits systems, the process virtual address space is
> > > > large enough for doing the hugepages mmapping within the supported
> > > range
> > > > when those addressing limitations exist. This patchset also adds a
> change
> > > > for using such a hint making the use of IOVA VA a more than likely
> > > > possibility when there are those addressing limitations.
> > > >
> > > > The check is not done by default but just when it is required. This
> > > > patchset adds the check for NFP initialization and for setting the IOVA
> > > > mode if an emulated VT-d is detected. Also, because of the recent
> patchset
> > > > adding dynamic memory allocation, the check is also invoked for
> ensuring
> > > > the new memsegs are within the required range.
> > > >
> > > > This patchset could be applied to stable 18.05.
> > >
> > > Applied, thanks
> 
> 



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Yao, Lei A
Hi, Lucero

My server info:
Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
Hugepage: 1G
Kernel: 4.15.0
OS: Ubuntu

Steps are simple:

1.   Bind one i40e/ixgbe  NIC to igb_uio

2.   Launch testpmd:

./x86_64-native-linuxapp-gcc/app/testpmd -c 0x03 -n 4 --log-level=eal,8 -- -i

BRs
Lei


From: Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
Sent: Monday, October 29, 2018 5:26 PM
To: Thomas Monjalon 
Cc: Yao, Lei A ; dev ; Xu, Qian Q 
; Lin, Xueqin ; Burakov, Anatoly 
; Yigit, Ferruh ; 
Richardson, Bruce 
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

Can we have the configuration triggering this issue?

On Mon, Oct 29, 2018 at 9:07 AM Thomas Monjalon 
mailto:tho...@monjalon.net>> wrote:
One more comment about this issue,

There was no reply to the question asked by Alejandro on October 11th:
http://mails.dpdk.org/archives/dev/2018-October/115402.html
and there were no more reviews despite all my requests:
http://mails.dpdk.org/archives/dev/2018-October/117475.html
Without any more comment, I had to apply the patchset.

Now we need to find a solution. Please suggest.


29/10/2018 09:42, Thomas Monjalon:
> 29/10/2018 09:23, Yao, Lei A:
> > Hi, Lucero, Thomas
> >
> > This patch set will cause deadlock during memory initialization.
> > rte_memseg_walk and try_expand_heap both will lock
> > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
> >
> > #0   rte_memseg_walk
> > #1  <-rte_eal_check_dma_mask
> > #2  <-alloc_pages_on_heap
> > #3  <-try_expand_heap_primary
> > #4  <-try_expand_heap
> >
> > Log as following:
> > EAL: TSC frequency is ~2494156 KHz
> > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > [New Thread 0x75e0d700 (LWP 330350)]
> > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > EAL: Trying to obtain current memory policy.
> > EAL: Setting policy MPOL_PREFERRED for socket 0
> > EAL: Restoring previous memory policy: 0
> >
> > Could you have a check on this? A lot of test cases in our validation
> > team fail because of this. Thanks a lot!
>
> Can we just call rte_memseg_walk_thread_unsafe()?
>
> +Cc Anatoly
>
>
> > From: dev [mailto:dev-boun...@dpdk.org] On 
> > Behalf Of Thomas Monjalon
> > > 05/10/2018 14:45, Alejandro Lucero:
> > > > I sent a patchset about this to be applied on 17.11 stable. The memory
> > > > code has had main changes since that version, so here it is the patchset
> > > > adjusted to current master repo.
> > > >
> > > > This patchset adds, mainly, a check for ensuring IOVAs are within a
> > > > restricted range due to addressing limitations with some devices. There
> > > > are two known cases: NFP and IOMMU VT-d emulation.
> > > >
> > > > With this check IOVAs out of range are detected and PMDs can abort
> > > > initialization. For the VT-d case, IOVA VA mode is allowed as long as
> > > > IOVAs are within the supported range, avoiding to forbid IOVA VA by
> > > > default.
> > > >
> > > > For the addressing limitations known cases, there are just 40(NFP) or
> > > > 39(VT-d) bits for handling IOVAs. When using IOVA PA, those limitations
> > > > imply 1TB(NFP) or 512GB(VT-d) as upper limits, which is likely enough for
> > > > most systems. With machines using more memory, the added check will
> > > > ensure IOVAs within the range.
> > > >
> > > > With IOVA VA, and because the way the Linux kernel serves mmap calls
> > > > in 64 bits systems, 39 or 40 bits are not enough. It is possible to
> > > > give an address hint with a lower starting address than the default one
> > > > used by the kernel, and then ensuring the mmap uses that hint or hint 
> > > > plus
> > > > some offset. With 64 bits systems, the process virtual address space is
> > > > large enough for doing the hugepages mmapping within the supported
> > > range
> > > > when those addressing limitations exist. This patchset also adds a 
> > > > change
> > > > for using such a hint making the use of IOVA VA a more than likely
> > > > possibility when there are those addressing limitations.
> > > >
> > > > The check is not done by default but just when it is required. This
> > > > patchset adds the check for NFP initialization and for setting the IOVA
> > > > mode if an emulated VT-d is detected. Also, because of the recent patchset
> > > > adding dynamic memory allocation, the check is also invoked for ensuring
> > > > the new memsegs are within the required range.
> > > >
> > > > This patchset could be applied to stable 18.05.
> > >
> > > Applied, thanks




Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Thomas Monjalon
29/10/2018 10:36, Yao, Lei A:
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > 29/10/2018 09:23, Yao, Lei A:
> > > Hi, Lucero, Thomas
> > >
> > > This patch set will cause deadlock during memory initialization.
> > > rte_memseg_walk and try_expand_heap both will lock
> > > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
> > >
> > > #0   rte_memseg_walk
> > > #1  <-rte_eal_check_dma_mask
> > > #2  <-alloc_pages_on_heap
> > > #3  <-try_expand_heap_primary
> > > #4  <-try_expand_heap
> > >
> > > Log as following:
> > > EAL: TSC frequency is ~2494156 KHz
> > > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > > [New Thread 0x75e0d700 (LWP 330350)]
> > > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > > EAL: Trying to obtain current memory policy.
> > > EAL: Setting policy MPOL_PREFERRED for socket 0
> > > EAL: Restoring previous memory policy: 0
> > >
> > > Could you have a check on this? A lot of test cases in our validation
> > > team fail because of this. Thanks a lot!
> > 
> > Can we just call rte_memseg_walk_thread_unsafe()?
> > 
> > +Cc Anatoly
> 
> Hi, Thomas
> 
> I change to rte_memseg_walk_thread_unsafe(), still
> Can't work. 
> 
> EAL: Setting policy MPOL_PREFERRED for socket 0
> EAL: Restoring previous memory policy: 0
> EAL: memseg iova 14000, len 4000, out of range
> EAL:using dma mask 
> EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
> EAL: Trying to obtain current memory policy.
> EAL: Setting policy MPOL_PREFERRED for socket 1
> EAL: Restoring previous memory policy: 0
> EAL: memseg iova 1bc000, len 4000, out of range
> EAL:using dma mask 
> EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
> error allocating rte services array
> EAL: FATAL: rte_service_init() failed
> EAL: rte_service_init() failed
> PANIC in main():

I think it is showing there are at least 2 issues:
1/ deadlock
2/ allocation does not comply with mask check (out of range)
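To make the second point concrete, the range test behind those "out of range" messages boils down to rejecting any IOVA that uses an address bit at or above the mask width. A small illustration (iova_fits_dma_mask is an illustrative helper, not an EAL function, and the example addresses are made up, not taken from the log above):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True if the IOVA uses no address bit at or above 'maskbits'. */
static bool iova_fits_dma_mask(uint64_t iova, uint8_t maskbits)
{
	uint64_t mask = ~((1ULL << maskbits) - 1); /* maskbits=40 -> bits 40..63 set */

	return (iova & mask) == 0;
}

int main(void)
{
	/* With a 40-bit (NFP-style) mask, anything mapped at or above 1TB fails. */
	printf("%d\n", iova_fits_dma_mask(0x140000000000ULL, 40)); /* 0: out of range */
	printf("%d\n", iova_fits_dma_mask(0x00003c000000ULL, 40)); /* 1: in range */
	return 0;
}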




Re: [dpdk-dev] [PATCH v3] doc: support building HTML guides with meson

2018-10-29 Thread Luca Boccassi
On Sat, 2018-10-27 at 23:54 +0200, Thomas Monjalon wrote:
> 20/09/2018 15:51, Timothy Redaelli:
> > On Thu, 20 Sep 2018 14:22:08 +0100
> > Luca Boccassi  wrote:
> > 
> > > From: Bruce Richardson 
> > > 
> > > Signed-off-by: Bruce Richardson 
> > > Signed-off-by: Luca Boccassi 
> > 
> > Tested on Fedora 28 (sphinx 1.7.5) and RHEL 7.5 (sphinx 1.1.3)
> > 
> > Tested-by: Timothy Redaelli 
> 
> Applied, thanks
> 
> In the build directory, docs are in
>   - doc/api/api/
>   - doc/guides/guides/
> Why having the last level of (redundant) directory?

It comes from the intersection of Meson and Sphinx/Doxygen and their
respective quirks - it needs to be a new directory, and it needs to be
called "guides/api" as that's what gets installed via ninja install.
Yes, it's not the prettiest layout in the build tree - suggestions to improve
it are most welcome, of course.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v6 0/6] add encap and decap actions to Direct Verbs flow in MLX5 PMD

2018-10-29 Thread Shahaf Shuler
Hi Dekel,

Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v6 0/6] add encap and decap actions to Direct
> Verbs flow in MLX5 PMD
> 
> This series adds support of encap and decap actions in DV format.
> L2 tunnel support for VXLAN and NVGRE, and L2/L3 tunnel support using raw
> data buffer.
> It is using the generic encapsulation framework from [1].
> 
> [1] "ethdev: add generic L2/L3 tunnel encapsulation actions"
> 
> http://mails.dpdk.org/archives/dev/2018-October/114654.html

In general it looks good; please see some comments to address.

> 
> ---
> v6:
> * Adapt L2 tunnel to VXLAN and NVGRE
> * Add encap/decap using raw data
> v5:
> * Move DV actions code under common DV flag.
> v4:
> * Update in glue functions.
> v3:
> * Apply additional code review comments.
> v2:
> * Rebase on tip.
> * Apply code review comments.
> 
> 
> Dekel Peled (6):
>   net/mlx5: add flow action functions to glue
>   net/mlx5: add VXLAN encap action to Direct Verbs
>   net/mlx5: add VXLAN decap action to Direct Verbs
>   net/mlx5: add NVGRE encap action to Direct Verbs
>   net/mlx5: add NVGRE decap action to Direct Verbs
>   net/mlx5: add raw data encap decap to Direct Verbs
> 
>  drivers/net/mlx5/Makefile   |   2 +-
>  drivers/net/mlx5/meson.build|   2 +-
>  drivers/net/mlx5/mlx5_flow.h|  15 +
>  drivers/net/mlx5/mlx5_flow_dv.c | 671
> +++-
>  drivers/net/mlx5/mlx5_glue.c|  38 +++
>  drivers/net/mlx5/mlx5_glue.h|  10 +
>  6 files changed, 731 insertions(+), 7 deletions(-)
> 
> --
> 1.8.3.1



Re: [dpdk-dev] [PATCH v6 1/6] net/mlx5: add flow action functions to glue

2018-10-29 Thread Shahaf Shuler
Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v6 1/6] net/mlx5: add flow action functions to
> glue
> 
> This patch adds glue functions for operations:
> - Create packet reformat (encap/decap) flow action.
> - Destroy flow action.
> 
> The new operations depend on HAVE_IBV_FLOW_DV_SUPPORT.
> 
> Signed-off-by: Dekel Peled 
> Acked-by: Yongseok Koh 

Acked-by: Shahaf Shuler 



Re: [dpdk-dev] [PATCH v6 2/6] net/mlx5: add VXLAN encap action to Direct Verbs

2018-10-29 Thread Shahaf Shuler
Hi Dekel,

Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v6 2/6] net/mlx5: add VXLAN encap action to
> Direct Verbs
> 
> This patch implements the VXLAN encap action in DV flow for MLX5 PMD.
> 
> Signed-off-by: Dekel Peled 
> ---
>  drivers/net/mlx5/mlx5_flow.h|   2 +
>  drivers/net/mlx5/mlx5_flow_dv.c | 351
> +++-
>  2 files changed, 348 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 61299d6..6e92afe 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -92,6 +92,7 @@
>  #define MLX5_FLOW_ACTION_DEC_TTL (1u << 19)  #define
> MLX5_FLOW_ACTION_SET_MAC_SRC (1u << 20)  #define
> MLX5_FLOW_ACTION_SET_MAC_DST (1u << 21)
> +#define MLX5_FLOW_ACTION_VXLAN_ENCAP (1u << 22)
> 
>  #define MLX5_FLOW_FATE_ACTIONS \
>   (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> MLX5_FLOW_ACTION_RSS) @@ -181,6 +182,7 @@ struct mlx5_flow_dv {
> #ifdef HAVE_IBV_FLOW_DV_SUPPORT
>   struct mlx5dv_flow_action_attr
> actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS];
>   /**< Action list. */
> + struct ibv_flow_action *verbs_action; /**< Verbs encap/decap

The ibv_flow_action is already part of a union inside mlx5dv_flow_action_attr,
so why do you need it separately?
I also see in the code below that you copy it from the action list to this
specific field. Can you elaborate why?

> object.
> +*/
>  #endif
>   int actions_n; /**< number of actions. */  }; diff --git
> a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index 8f729f4..14110c5 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -34,6 +34,12 @@
> 
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> 
> +/*
> + * Encap buf length, max:
> + *   Eth:14/VLAN:8/IPv6:40/TCP:36/TUNNEL:20/Eth:14

VLAN is 4B, not 8B. Which tunnel is 20B?

> + */
> +#define MLX5_ENCAP_MAX_LEN 132
> +
>  /**
>   * Validate META item.
>   *
> @@ -96,6 +102,300 @@
>  }
> 
>  /**
> + * Validate the L2 encap action.
> + * Used for VXLAN encap action.

No need for that. Later on you put more supported protocols. This is a generic 
function for L2 encap validation. 

> + *
> + * @param[in] action_flags
> + *   Holds the actions detected until now.
> + * @param[in] action
> + *   Pointer to the encap action.
> + * @param[in] attr
> + *   Pointer to flow attributes
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_action_l2_encap(uint64_t action_flags,
> +  const struct rte_flow_action *action,
> +  const struct rte_flow_attr *attr,
> +  struct rte_flow_error *error)
> +{
> + if (!(action->conf))
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> action,
> +   "configuration cannot be null");
> + if (action_flags & MLX5_FLOW_ACTION_DROP)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can't drop and encap in same
> flow");
> + if (action_flags & MLX5_FLOW_ACTION_VXLAN_ENCAP)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can only have a single encap"
> +   " action in a flow");
> + if (attr->ingress)
> + return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +   NULL,
> +   "encap action not supported for "
> +   "ingress");
> + return 0;
> +}
> +
> +/**
> + * Get the size of specific rte_flow_item_type
> + *
> + * @param[in] item_type
> + *   Tested rte_flow_item_type.
> + *
> + * @return
> + *   sizeof struct item_type, 0 if void or irrelevant.
> + */
> +static size_t
> +flow_dv_get_item_len(const enum rte_flow_item_type item_type) {

Can we have this function as a macro?

#define flow_dv_get_item_len(t) \
	(strncmp(#t, "VOID", 4) ? sizeof(struct rte_flow_item_##t) : 0)

Usage: flow_dv_get_item_len(ETH)


> + size_t retval;
> +
> + switch (item_type) {
> + case RTE_FLOW_ITEM_TYPE_ETH:
> + retval = sizeof(struct rte_flow_item_eth);
> + break;
> + case RTE_FLOW_ITEM_TYPE_VLAN:
> + retval = sizeof(struct rte_flow_item_vlan);
> + break;
> + case RTE_FLOW_ITEM_TYPE_IPV4:
> + retval = sizeof(struct rte_flow_item_ipv4);
> + break;
> + case RTE_FLOW_ITEM_TYPE_IPV6:
> + retval = sizeof(struct 

Re: [dpdk-dev] [PATCH v6 6/6] net/mlx5: add raw data encap decap to Direct Verbs

2018-10-29 Thread Shahaf Shuler
Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v6 6/6] net/mlx5: add raw data encap decap to
> Direct Verbs
> 
> This patch implements the encap and decap actions, using raw data, in DV
> flow for MLX5 PMD.
> 
> Signed-off-by: Dekel Peled 
> ---
>  drivers/net/mlx5/mlx5_flow.h|  12 ++-
>  drivers/net/mlx5/mlx5_flow_dv.c | 227
> ++--
>  2 files changed, 224 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 2d73a05..92ba111 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -96,15 +96,19 @@
>  #define MLX5_FLOW_ACTION_VXLAN_DECAP (1u << 23)  #define
> MLX5_FLOW_ACTION_NVGRE_ENCAP (1u << 24)  #define
> MLX5_FLOW_ACTION_NVGRE_DECAP (1u << 25)
> +#define MLX5_FLOW_ACTION_RAW_ENCAP (1u << 26) #define
> +MLX5_FLOW_ACTION_RAW_DECAP (1u << 27)
> 
>  #define MLX5_FLOW_FATE_ACTIONS \
>   (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> MLX5_FLOW_ACTION_RSS)
> 
> -#define MLX5_FLOW_ENCAP_ACTIONS  \
> - (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> MLX5_FLOW_ACTION_NVGRE_ENCAP)
> +#define MLX5_FLOW_ENCAP_ACTIONS
>   (MLX5_FLOW_ACTION_VXLAN_ENCAP | \
> +  MLX5_FLOW_ACTION_NVGRE_ENCAP | \
> +  MLX5_FLOW_ACTION_RAW_ENCAP)
> 
> -#define MLX5_FLOW_DECAP_ACTIONS  \
> - (MLX5_FLOW_ACTION_VXLAN_DECAP |
> MLX5_FLOW_ACTION_NVGRE_DECAP)
> +#define MLX5_FLOW_DECAP_ACTIONS
>   (MLX5_FLOW_ACTION_VXLAN_DECAP | \
> +  MLX5_FLOW_ACTION_NVGRE_DECAP | \
> +  MLX5_FLOW_ACTION_RAW_DECAP)
> 
>  #ifndef IPPROTO_MPLS
>  #define IPPROTO_MPLS 137
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> b/drivers/net/mlx5/mlx5_flow_dv.c index d7d0c6b..ca75645 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -186,6 +186,82 @@
>  }
> 
>  /**
> + * Validate the raw encap action.
> + *
> + * @param[in] action_flags
> + *   Holds the actions detected until now.
> + * @param[in] action
> + *   Pointer to the encap action.
> + * @param[in] attr
> + *   Pointer to flow attributes
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_action_raw_encap(uint64_t action_flags,
> +   const struct rte_flow_action *action,
> +   const struct rte_flow_attr *attr,
> +   struct rte_flow_error *error)
> +{
> + if (!(action->conf))
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> action,
> +   "configuration cannot be null");
> + if (action_flags & MLX5_FLOW_ACTION_DROP)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can't drop and encap in same
> flow");
> + if (action_flags & MLX5_FLOW_ENCAP_ACTIONS)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can only have a single encap"
> +   " action in a flow");
> + /* encap without preceding decap is not supported for ingress */
> + if (attr->ingress && !(action_flags &
> MLX5_FLOW_ACTION_RAW_DECAP))
> + return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +   NULL,
> +   "encap action not supported for "
> +   "ingress");

The error message doesn't seem informative enough.

> + return 0;
> +}
> +
> +/**
> + * Validate the raw decap action.
> + *
> + * @param[in] action_flags
> + *   Holds the actions detected until now.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_action_raw_decap(uint64_t action_flags,
> +   struct rte_flow_error *error)
> +{
> + if (action_flags & MLX5_FLOW_ACTION_DROP)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can't drop and decap in same
> flow");
> + if (action_flags & MLX5_FLOW_ENCAP_ACTIONS)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can't have encap action before"
> +   " d

Re: [dpdk-dev] [PATCH v6 3/6] net/mlx5: add VXLAN decap action to Direct Verbs

2018-10-29 Thread Shahaf Shuler
Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> Subject: [dpdk-dev] [PATCH v6 3/6] net/mlx5: add VXLAN decap action to
> Direct Verbs
> 
> This patch implements the VXLAN decap action in DV flow for MLX5 PMD.
> 
> Signed-off-by: Dekel Peled 
> ---
>  drivers/net/mlx5/mlx5_flow.h|   1 +
>  drivers/net/mlx5/mlx5_flow_dv.c | 103
> ++--
>  2 files changed, 100 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 6e92afe..3795644 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -93,6 +93,7 @@
>  #define MLX5_FLOW_ACTION_SET_MAC_SRC (1u << 20)  #define
> MLX5_FLOW_ACTION_SET_MAC_DST (1u << 21)  #define
> MLX5_FLOW_ACTION_VXLAN_ENCAP (1u << 22)
> +#define MLX5_FLOW_ACTION_VXLAN_DECAP (1u << 23)
> 
>  #define MLX5_FLOW_FATE_ACTIONS \
>   (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> MLX5_FLOW_ACTION_RSS) diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> b/drivers/net/mlx5/mlx5_flow_dv.c index 14110c5..e84a2b6 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -131,11 +131,12 @@
>   return rte_flow_error_set(error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> "can't drop and encap in same
> flow");
> - if (action_flags & MLX5_FLOW_ACTION_VXLAN_ENCAP)
> + if (action_flags & (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> + MLX5_FLOW_ACTION_VXLAN_DECAP))
>   return rte_flow_error_set(error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> -   "can only have a single encap"
> -   " action in a flow");
> +   "can only have a single encap or"
> +   " decap action in a flow");
>   if (attr->ingress)
>   return rte_flow_error_set(error, ENOTSUP,
> 
> RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, @@ -146,6 +147,47 @@  }
> 
>  /**
> + * Validate the L2 decap action.
> + * Used for VXLAN decap action.

Same as before

> + *
> + * @param[in] action_flags
> + *   Holds the actions detected until now.
> + * @param[in] action
> + *   Pointer to the decap action.
> + * @param[in] attr
> + *   Pointer to flow attributes
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_action_l2_decap(uint64_t action_flags,
> + const struct rte_flow_action *action __rte_unused,
> + const struct rte_flow_attr *attr,
> + struct rte_flow_error *error)
> +{
> + if (action_flags & MLX5_FLOW_ACTION_DROP)
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can't drop and decap in same
> flow");
> + if (action_flags & (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> + MLX5_FLOW_ACTION_VXLAN_DECAP))
> + return rte_flow_error_set(error, EINVAL,
> +   RTE_FLOW_ERROR_TYPE_ACTION,
> NULL,
> +   "can only have a single encap or"
> +   " decap action in a flow");
> + if (attr->egress)
> + return rte_flow_error_set(error, ENOTSUP,
> +
> RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> +   NULL,
> +   "decap action not supported for "
> +   "egress");
> + return 0;
> +}
> +
> +/**
>   * Get the size of specific rte_flow_item_type
>   *
>   * @param[in] item_type
> @@ -396,6 +438,38 @@
>  }
> 
>  /**
> + * Convert L2 decap action to DV specification.
> + * Used for VXLAN decap action.

Same 

> + *
> + * @param[in] dev
> + *   Pointer to rte_eth_dev structure.
> + * @param[in] action
> + *   Pointer to action structure.
> + * @param[out] error
> + *   Pointer to the error structure.
> + *
> + * @return
> + *   Pointer to action on success, NULL otherwise and rte_errno is set.
> + */
> +static struct ibv_flow_action *
> +flow_dv_create_action_l2_decap(struct rte_eth_dev *dev,
> + const struct rte_flow_action *action __rte_unused,
> + struct rte_flow_error *error)
> +{
> + struct ibv_flow_action *verbs_action = NULL;
> + struct priv *priv = dev->data->dev_private;
> +
> + verbs_action = mlx5_glue-
> >dv_create_flow_action_packet_reformat
> + (priv->ctx, 0, NULL,
> +
> MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2,
> +  MLX5DV_FLOW_TABLE_TYPE_NIC_RX);
> + if (!verbs_action)
> + rte_flow_error_set(e

Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
I know what is going on.

In patchset version 3 I forgot to remove some old code. Anatoly spotted that
and I was going to send another version to fix it. Before sending the
new version I saw the report about a problem with dma_mask, and I'm afraid
I did not send another version with the fix ...

Yao, can you try with next patch?:

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index ef656bbad..26adf46c0 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -458,10 +458,6 @@ rte_eal_check_dma_mask(uint8_t maskbits)
 		return -1;
 	}
 
-	/* keep the more restricted maskbit */
-	if (!mcfg->dma_maskbits || maskbits < mcfg->dma_maskbits)
-		mcfg->dma_maskbits = maskbits;
-
 	/* create dma mask */
 	mask = ~((1ULL << maskbits) - 1);

On Mon, Oct 29, 2018 at 9:48 AM Thomas Monjalon  wrote:

> 29/10/2018 10:36, Yao, Lei A:
> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > 29/10/2018 09:23, Yao, Lei A:
> > > > Hi, Lucero, Thomas
> > > >
> > > > This patch set will cause deadlock during memory initialization.
> > > > rte_memseg_walk and try_expand_heap both will lock
> > > > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
> > > >
> > > > #0   rte_memseg_walk
> > > > #1  <-rte_eal_check_dma_mask
> > > > #2  <-alloc_pages_on_heap
> > > > #3  <-try_expand_heap_primary
> > > > #4  <-try_expand_heap
> > > >
> > > > Log as following:
> > > > EAL: TSC frequency is ~2494156 KHz
> > > > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
> > > > [New Thread 0x75e0d700 (LWP 330350)]
> > > > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
> > > > EAL: Trying to obtain current memory policy.
> > > > EAL: Setting policy MPOL_PREFERRED for socket 0
> > > > EAL: Restoring previous memory policy: 0
> > > >
> > > > Could you have a check on this? A lot of test cases in our validation
> > > > team fail because of this. Thanks a lot!
> > >
> > > Can we just call rte_memseg_walk_thread_unsafe()?
> > >
> > > +Cc Anatoly
> >
> > Hi, Thomas
> >
> > I change to rte_memseg_walk_thread_unsafe(), still
> > Can't work.
> >
> > EAL: Setting policy MPOL_PREFERRED for socket 0
> > EAL: Restoring previous memory policy: 0
> > EAL: memseg iova 14000, len 4000, out of range
> > EAL:using dma mask 
> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
> > EAL: Trying to obtain current memory policy.
> > EAL: Setting policy MPOL_PREFERRED for socket 1
> > EAL: Restoring previous memory policy: 0
> > EAL: memseg iova 1bc000, len 4000, out of range
> > EAL:using dma mask 
> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
> > error allocating rte services array
> > EAL: FATAL: rte_service_init() failed
> > EAL: rte_service_init() failed
> > PANIC in main():
>
> I think it is showing there are at least 2 issues:
> 1/ deadlock
> 2/ allocation does not comply with mask check (out of range)
>
>
>


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
Apologies. Forget my previous email. Just using the wrong repo.

Looking at solving this asap.

On Mon, Oct 29, 2018 at 10:11 AM Alejandro Lucero <
alejandro.luc...@netronome.com> wrote:

> I know what is going on.
>
> In patchset version 3 I forgot to remove some old code. Anatoly spotted that
> and I was going to send another version to fix it. Before sending the
> new version I saw the report about a problem with dma_mask, and I'm afraid
> I did not send another version with the fix ...
>
> Yao, can you try with next patch?:
>
> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> index ef656bbad..26adf46c0 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -458,10 +458,6 @@ rte_eal_check_dma_mask(uint8_t maskbits)
>  		return -1;
>  	}
>  
> -	/* keep the more restricted maskbit */
> -	if (!mcfg->dma_maskbits || maskbits < mcfg->dma_maskbits)
> -		mcfg->dma_maskbits = maskbits;
> -
>  	/* create dma mask */
>  	mask = ~((1ULL << maskbits) - 1);
>
> On Mon, Oct 29, 2018 at 9:48 AM Thomas Monjalon 
> wrote:
>
>> 29/10/2018 10:36, Yao, Lei A:
>> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
>> > > 29/10/2018 09:23, Yao, Lei A:
>> > > > Hi, Lucero, Thomas
>> > > >
>> > > > This patch set will cause deadlock during memory initialization.
>> > > > rte_memseg_walk and try_expand_heap both will lock
>> > > > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
>> > > >
>> > > > #0   rte_memseg_walk
>> > > > #1  <-rte_eal_check_dma_mask
>> > > > #2  <-alloc_pages_on_heap
>> > > > #3  <-try_expand_heap_primary
>> > > > #4  <-try_expand_heap
>> > > >
>> > > > Log as following:
>> > > > EAL: TSC frequency is ~2494156 KHz
>> > > > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
>> > > > [New Thread 0x75e0d700 (LWP 330350)]
>> > > > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
>> > > > EAL: Trying to obtain current memory policy.
>> > > > EAL: Setting policy MPOL_PREFERRED for socket 0
>> > > > EAL: Restoring previous memory policy: 0
>> > > >
>> > > > Could you have a check on this? A lot of test cases in our
>> validation
>> > > > team fail because of this. Thanks a lot!
>> > >
>> > > Can we just call rte_memseg_walk_thread_unsafe()?
>> > >
>> > > +Cc Anatoly
>> >
>> > Hi, Thomas
>> >
>> > I change to rte_memseg_walk_thread_unsafe(), still
>> > Can't work.
>> >
>> > EAL: Setting policy MPOL_PREFERRED for socket 0
>> > EAL: Restoring previous memory policy: 0
>> > EAL: memseg iova 14000, len 4000, out of range
>> > EAL:using dma mask 
>> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
>> > EAL: Trying to obtain current memory policy.
>> > EAL: Setting policy MPOL_PREFERRED for socket 1
>> > EAL: Restoring previous memory policy: 0
>> > EAL: memseg iova 1bc000, len 4000, out of range
>> > EAL:using dma mask 
>> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
>> > error allocating rte services array
>> > EAL: FATAL: rte_service_init() failed
>> > EAL: rte_service_init() failed
>> > PANIC in main():
>>
>> I think it is showing there are at least 2 issues:
>> 1/ deadlock
>> 2/ allocation does not comply with mask check (out of range)
>>
>>
>>


Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of the tail

2018-10-29 Thread Jerin Jacob
-Original Message-
> Date: Mon, 29 Oct 2018 02:57:17 +
> From: "Gavin Hu (Arm Technology China)" 
> To: Jerin Jacob , Thomas Monjalon
>  
> CC: "dev@dpdk.org" , Honnappa Nagarahalli
>  , "sta...@dpdk.org" , Ola
>  Liljedahl , "olivier.m...@6wind.com"
>  , "chao...@linux.vnet.ibm.com"
>  , "bruce.richard...@intel.com"
>  , "konstantin.anan...@intel.com"
>  
> Subject: RE: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of
>  the tail
> 
> 
> Hi Thomas and Jerin,
> 
> The patches were extensively reviewed by Arm internally; as the 1st patch could
> not be concluded, I created a new patch series (2 patches).
> How can I clean up this mess?
> 1. make all the previous patches Superseded?
> 2. We have two more new patches, should I submit the 4 patches (the old 2 
> patches + 2 new patches) with V2?

I would suggest superseding the old patches (not only in this case, but in any
case when you send a new version and update the version number).

I would suggest sending the new patches as a separate series. If they have a
dependency on existing Acked patches, please mention that in the cover letter.


> 
> Best Regards,
> Gavin
> 
> 
> > -Original Message-
> > From: Jerin Jacob 
> > Sent: Saturday, October 27, 2018 11:34 PM
> > To: Thomas Monjalon 
> > Cc: Gavin Hu (Arm Technology China) ; dev@dpdk.org;
> > Honnappa Nagarahalli ; sta...@dpdk.org;
> > Ola Liljedahl ; olivier.m...@6wind.com;
> > chao...@linux.vnet.ibm.com; bruce.richard...@intel.com;
> > konstantin.anan...@intel.com
> > Subject: Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of
> > the tail
> >
> > -Original Message-
> > > Date: Sat, 27 Oct 2018 17:13:10 +0200
> > > From: Thomas Monjalon 
> > > To: Jerin Jacob 
> > > Cc: "Gavin Hu (Arm Technology China)" ,
> > "dev@dpdk.org"
> > >  , Honnappa Nagarahalli
> > ,
> > > "sta...@dpdk.org" , Ola Liljedahl
> > > , "olivier.m...@6wind.com"
> > >  , "chao...@linux.vnet.ibm.com"
> > >  , "bruce.richard...@intel.com"
> > >  , "konstantin.anan...@intel.com"
> > >  
> > > Subject: Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and
> > > store of  the tail
> > >
> > >
> > > 27/10/2018 17:00, Jerin Jacob:
> > > > From: Thomas Monjalon 
> > > > > 17/10/2018 08:35, Gavin Hu (Arm Technology China):
> > > > > > Hi Jerin
> > > > > >
> > > > > > As the 1st one of the 3-patch set was not concluded, I submit this 
> > > > > > 2-
> > patch series to unblock the merge.
> > > > >
> > > > > The thread is totally messed up because:
> > > > > - there is no cover letter
> > > > > - some different series (testpmd, i40e and doc) are in the 
> > > > > same
> > thread
> > > > > - v4 replies to a different series
> > > > > - this version should be a v5 but has no number
> > > > > - this version replies to the v3
> > > > > - patchwork still shows v3 and "v5"
> > > > > - replies from Ola are not quoting previous discussion
> > > > >
> > > > > Because of all of this, it is really difficult to follow.
> > > > > This is probably the reason of the lack of review outside of Arm.
> > > > >
> > > > > One more issue: you must Cc the relevant maintainers.
> > > > > Here:
> > > > > - Olivier for rte_ring
> > > > > - Chao for IBM platform
> > > > > - Bruce and Konstantin for x86
> > > > >
> > > > > Guys, it is really cool to have more Arm developpers in DPDK.
> > > > > But please consider better formatting your discussions, it is
> > > > > really important in our contribution workflow.
> > > > >
> > > > > I don't know what to do.
> > > > > I suggest to wait for more feedbacks and integrate it in -rc2.
> > > >
> > > > This series has been acked and tested. Sure, if we are looking for
> > > > some more feedback we can push to -rc2 if not it a good candidate to
> > > > be selected for -rc1.
> > >
> > > It has been acked and tested only for Arm platforms.
> > > And Olivier, the ring maintainer, was not Cc.
> > >
> > > I feel it is not enough.
> >
> > Sure, More reviews is already better. But lets keep as -rc2 target.
> >
> >
> > >
> > >


Re: [dpdk-dev] [RFC v2 5/9] ipsec: add SA data-path API

2018-10-29 Thread Jerin Jacob
-Original Message-
> Date: Sun, 28 Oct 2018 20:37:23 +
> From: "Ananyev, Konstantin" 
> To: Jerin Jacob 
> CC: "dev@dpdk.org" , "Awal, Mohammad Abdul"
>  , "Joseph, Anoob"
>  , "Athreya, Narayana Prasad"
>  
> Subject: RE: [dpdk-dev] [RFC v2 5/9] ipsec: add SA data-path API
> 

> 
> Hi Jerin,

Hi Konstantin,

> 
> > > > > +
> > > > > +/**
> > > > > + * Checks that inside given rte_ipsec_session crypto/security fields
> > > > > + * are filled correctly and setups function pointers based on these 
> > > > > values.
> > > > > + * @param ss
> > > > > + *   Pointer to the *rte_ipsec_session* object
> > > > > + * @return
> > > > > + *   - Zero if operation completed successfully.
> > > > > + *   - -EINVAL if the parameters are invalid.
> > > > > + */
> > > > > +int __rte_experimental
> > > > > +rte_ipsec_session_prepare(struct rte_ipsec_session *ss);
> > > > > +
> > > > > +/**
> > > > > + * For input mbufs and given IPsec session prepare crypto ops that 
> > > > > can be
> > > > > + * enqueued into the cryptodev associated with given session.
> > > > > + * expects that for each input packet:
> > > > > + *  - l2_len, l3_len are setup correctly
> > > > > + * Note that erroneous mbufs are not freed by the function,
> > > > > + * but are placed beyond last valid mbuf in the *mb* array.
> > > > > + * It is a user responsibility to handle them further.
> > > > > + * @param ss
> > > > > + *   Pointer to the *rte_ipsec_session* object the packets belong to.
> > > > > + * @param mb
> > > > > + *   The address of an array of *num* pointers to *rte_mbuf* 
> > > > > structures
> > > > > + *   which contain the input packets.
> > > > > + * @param cop
> > > > > + *   The address of an array of *num* pointers to the output 
> > > > > *rte_crypto_op*
> > > > > + *   structures.
> > > > > + * @param num
> > > > > + *   The maximum number of packets to process.
> > > > > + * @return
> > > > > + *   Number of successfully processed packets, with error code set 
> > > > > in rte_errno.
> > > > > + */
> > > > > +static inline uint16_t __rte_experimental
> > > > > +rte_ipsec_crypto_prepare(const struct rte_ipsec_session *ss,
> > > > > +   struct rte_mbuf *mb[], struct rte_crypto_op *cop[], uint16_t 
> > > > > num)
> > > > > +{
> > > > > +   return ss->func.prepare(ss, mb, cop, num);
> > > > > +}
> > > > > +
> > > > static inline uint16_t __rte_experimental
> > > > rte_ipsec_event_process(const struct rte_ipsec_session *ss, struct 
> > > > rte_event *ev[], uint16_t num)
> > > > {
> > > >return ss->func.event_process(ss, ev, num);
> > > > }
> > >
> > > To fulfill that, we can either have 2 separate function pointers:
> > > uint16_t (*pkt_process)( const struct rte_ipsec_session *ss, struct 
> > > rte_mbuf *mb[],uint16_t num);
> > > uint16_t (*event_process)( const struct rte_ipsec_session *ss, struct 
> > > rte_event *ev[],uint16_t num);
> > >
> > > Or we can keep one function pointer, but change it to accept just array 
> > > of pointers:
> > > uint16_t (*process)( const struct rte_ipsec_session *ss, void 
> > > *in[],uint16_t num);
> > > and then make session_prepare() to choose a proper function based on 
> > > input.
> > >
> > > I am ok with both schemes, but second one seems a bit nicer to me.
> >
> > How about best of both worlds, i.e save space and enable compile check
> > by anonymous union of both functions
> >
> > RTE_STD_C11
> > union {
> >   uint16_t (*pkt_process)( const struct rte_ipsec_session *ss,struct 
> > rte_mbuf *mb[],uint16_t num);
> >   uint16_t (*event_process)( const struct rte_ipsec_session *ss, struct 
> > rte_event *ev[],uint16_t num);
> > };
> >
> 
> Yes, it is definitely possible, but then we still need 2 API functions,
> Depending on input type, i.e:
> 
> static inline uint16_t __rte_experimental
> rte_ipsec_event_process(const struct rte_ipsec_session *ss, struct rte_event 
> *ev[], uint16_t num)
>  {
> return ss->func.event_process(ss, ev, num);
> }
> 
> static inline uint16_t __rte_experimental
> rte_ipsec_pkt_process(const struct rte_ipsec_session *ss, struct rte_mbuf 
> *mb[], uint16_t num)
>  {
> return ss->func.pkt_process(ss, mb, num);
> }
> 
> While if we'll have void *[], we can have just one function for both cases:
> 
> static inline uint16_t __rte_experimental
> rte_ipsec_process(const struct rte_ipsec_session *ss, void *in[], uint16_t 
> num)
>  {
> return ss->func.process(ss, in, num);
> }

Since it will be called from different application code paths, I would
prefer to have separate functions to allow strict compiler checks.
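
For illustration, a self-contained toy of the trade-off being discussed: one
union-backed callback slot (saving space, as suggested above) plus two typed
entry points, so that passing the wrong array type fails at compile time,
whereas a single void *[] entry point would silently accept either. All names
below are stand-ins, not the actual rte_ipsec API.

#include <stdint.h>
#include <stdio.h>

struct pkt   { int id; };   /* stands in for rte_mbuf  */
struct event { int id; };   /* stands in for rte_event */

struct session {
	union {
		uint16_t (*pkt_process)(struct pkt *mb[], uint16_t num);
		uint16_t (*event_process)(struct event *ev[], uint16_t num);
	} func;                 /* one slot, two possible signatures */
};

static uint16_t
count_pkts(struct pkt *mb[], uint16_t num)
{
	(void)mb;
	return num;
}

/* Typed wrappers: the compiler rejects an event-style array here. */
static inline uint16_t
session_pkt_process(const struct session *ss, struct pkt *mb[], uint16_t num)
{
	return ss->func.pkt_process(mb, num);
}

static inline uint16_t
session_event_process(const struct session *ss, struct event *ev[], uint16_t num)
{
	return ss->func.event_process(ev, num);
}

int main(void)
{
	struct session ss = { .func = { .pkt_process = count_pkts } };
	struct pkt p0, p1;
	struct pkt *burst[] = { &p0, &p1 };

	printf("processed %u packets\n",
	       (unsigned int)session_pkt_process(&ss, burst, 2));
	/* session_event_process(&ss, burst, 2);  <-- would not compile */
	return 0;
}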



> 
> Konstantin


Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of the tail

2018-10-29 Thread Thomas Monjalon
29/10/2018 11:16, Jerin Jacob:
> From: "Gavin Hu (Arm Technology China)" 
> > 
> > Hi Thomas and Jerin,
> > 
> > The patches were extensively reviewed by Arm internally, as the 1st patch 
> > was not able to be concluded, I created a new patch series(2 patches).
> > How can I clean up this mess?
> > 1. make all the previous patches Superseded?
> > 2. We have two more new patches, should I submit the 4 patches (the old 2 
> > patches + 2 new patches) with V2?
> 
> I would suggest to supersede the old patches(not in this case, in any case 
> when you
> send new version and update the version number).

Why not in this case?
There are some old patches in patchwork which should be superseded.

> I would suggest send new patches as separate series. If it has dependency on
> exiting Acked patches please mention that in cover letter.

I would suggest also to stop top-posting, it doesn't help reading threads.


> > From: Jerin Jacob 
> > > From: Thomas Monjalon 
> > > > 27/10/2018 17:00, Jerin Jacob:
> > > > > From: Thomas Monjalon 
> > > > > > 17/10/2018 08:35, Gavin Hu (Arm Technology China):
> > > > > > > Hi Jerin
> > > > > > >
> > > > > > > As the 1st one of the 3-patch set was not concluded, I submit 
> > > > > > > this 2-
> > > patch series to unblock the merge.
> > > > > >
> > > > > > The thread is totally messed up because:
> > > > > > - there is no cover letter
> > > > > > - some different series (testpmd, i40e and doc) are in the 
> > > > > > same
> > > thread
> > > > > > - v4 replies to a different series
> > > > > > - this version should be a v5 but has no number
> > > > > > - this version replies to the v3
> > > > > > - patchwork still shows v3 and "v5"
> > > > > > - replies from Ola are not quoting previous discussion
> > > > > >
> > > > > > Because of all of this, it is really difficult to follow.
> > > > > > This is probably the reason of the lack of review outside of Arm.
> > > > > >
> > > > > > One more issue: you must Cc the relevant maintainers.
> > > > > > Here:
> > > > > > - Olivier for rte_ring
> > > > > > - Chao for IBM platform
> > > > > > - Bruce and Konstantin for x86
> > > > > >
> > > > > > Guys, it is really cool to have more Arm developpers in DPDK.
> > > > > > But please consider better formatting your discussions, it is
> > > > > > really important in our contribution workflow.
> > > > > >
> > > > > > I don't know what to do.
> > > > > > I suggest to wait for more feedbacks and integrate it in -rc2.
> > > > >
> > > > > This series has been acked and tested. Sure, if we are looking for
> > > > > some more feedback we can push to -rc2 if not it a good candidate to
> > > > > be selected for -rc1.
> > > >
> > > > It has been acked and tested only for Arm platforms.
> > > > And Olivier, the ring maintainer, was not Cc.
> > > >
> > > > I feel it is not enough.
> > >
> > > Sure, More reviews is already better. But lets keep as -rc2 target.
> > >
> > >
> > > >
> > > >







Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of the tail

2018-10-29 Thread Jerin Jacob
-Original Message-
> Date: Mon, 29 Oct 2018 11:47:05 +0100
> From: Thomas Monjalon 
> To: Jerin Jacob , "Gavin Hu (Arm Technology
>  China)" 
> Cc: "dev@dpdk.org" , Honnappa Nagarahalli
>  , "sta...@dpdk.org" , Ola
>  Liljedahl , "olivier.m...@6wind.com"
>  , "chao...@linux.vnet.ibm.com"
>  , "bruce.richard...@intel.com"
>  , "konstantin.anan...@intel.com"
>  
> Subject: Re: [dpdk-dev] [PATCH 1/2] ring: synchronize the load and store of
>  the tail
> 
> 
> 29/10/2018 11:16, Jerin Jacob:
> > From: "Gavin Hu (Arm Technology China)" 
> > >
> > > Hi Thomas and Jerin,
> > >
> > > The patches were extensively reviewed by Arm internally, as the 1st patch 
> > > was not able to be concluded, I created a new patch series(2 patches).
> > > How can I clean up this mess?
> > > 1. make all the previous patches Superseded?
> > > 2. We have two more new patches, should I submit the 4 patches (the old 2 
> > > patches + 2 new patches) with V2?
> >
> > I would suggest to supersede the old patches(not in this case, in any case 
> > when you
> > send new version and update the version number).
> 
> Why not in this case?

I did not mean this case in particular. I meant in all cases.

> There are some old patches in patchwork which should be superseded.
> 
> > I would suggest send new patches as separate series. If it has dependency on
> > exiting Acked patches please mention that in cover letter.
> 
> I would suggest also to stop top-posting, it doesn't help reading threads.
> 
> 
> > > From: Jerin Jacob 
> > > > From: Thomas Monjalon 
> > > > > 27/10/2018 17:00, Jerin Jacob:
> > > > > > From: Thomas Monjalon 
> > > > > > > 17/10/2018 08:35, Gavin Hu (Arm Technology China):
> > > > > > > > Hi Jerin
> > > > > > > >
> > > > > > > > As the 1st one of the 3-patch set was not concluded, I submit 
> > > > > > > > this 2-
> > > > patch series to unblock the merge.
> > > > > > >
> > > > > > > The thread is totally messed up because:
> > > > > > > - there is no cover letter
> > > > > > > - some different series (testpmd, i40e and doc) are in 
> > > > > > > the same
> > > > thread
> > > > > > > - v4 replies to a different series
> > > > > > > - this version should be a v5 but has no number
> > > > > > > - this version replies to the v3
> > > > > > > - patchwork still shows v3 and "v5"
> > > > > > > - replies from Ola are not quoting previous discussion
> > > > > > >
> > > > > > > Because of all of this, it is really difficult to follow.
> > > > > > > This is probably the reason of the lack of review outside of Arm.
> > > > > > >
> > > > > > > One more issue: you must Cc the relevant maintainers.
> > > > > > > Here:
> > > > > > > - Olivier for rte_ring
> > > > > > > - Chao for IBM platform
> > > > > > > - Bruce and Konstantin for x86
> > > > > > >
> > > > > > > Guys, it is really cool to have more Arm developpers in DPDK.
> > > > > > > But please consider better formatting your discussions, it is
> > > > > > > really important in our contribution workflow.
> > > > > > >
> > > > > > > I don't know what to do.
> > > > > > > I suggest to wait for more feedbacks and integrate it in -rc2.
> > > > > >
> > > > > > This series has been acked and tested. Sure, if we are looking for
> > > > > > some more feedback we can push to -rc2 if not it a good candidate to
> > > > > > be selected for -rc1.
> > > > >
> > > > > It has been acked and tested only for Arm platforms.
> > > > > And Olivier, the ring maintainer, was not Cc.
> > > > >
> > > > > I feel it is not enough.
> > > >
> > > > Sure, More reviews is already better. But lets keep as -rc2 target.
> > > >
> > > >
> > > > >
> > > > >
> 
> 
> 
> 
> 


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
I got a patch that solves a bug when calling rte_eal_dma_mask using the
mask instead of the maskbits. However, this does not solve the deadlock.

Interestingly, the problem looks like a compiler one. Calling
rte_memseg_walk does not return when called inside rte_eal_dma_mask, but if
you modify the call like this:

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 12dcedf5c..69b26e464 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -462,7 +462,7 @@ rte_eal_check_dma_mask(uint8_t maskbits)
 
 	/* create dma mask */
 	mask = ~((1ULL << maskbits) - 1);
 
-	if (rte_memseg_walk(check_iova, &mask))
+	if (!rte_memseg_walk(check_iova, &mask))
 		/*
 		 * Dma mask precludes hugepage usage.
 		 * This device can not be used and we do not need to keep

it works, although the value returned to the invoker changes, of course.
But the point here is that the behaviour of calling rte_memseg_walk should
be the same as before, and it is not.


Anatoly, maybe you can see something I can not.
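
As an aside, a self-contained toy of the mask arithmetic used by
rte_eal_check_dma_mask above — illustrative values only, not the EAL code:

#include <stdint.h>
#include <stdio.h>

/* All bits above 'maskbits' set; an IOVA fits only if none of them are set. */
static int
iova_fits(uint64_t iova, uint8_t maskbits)
{
	uint64_t mask = ~((1ULL << maskbits) - 1);

	return (iova & mask) == 0;
}

int main(void)
{
	uint64_t iova = 1ULL << 39;	/* first address that needs 40 bits */

	printf("fits a 39-bit mask: %d\n", iova_fits(iova, 39));	/* 0 */
	printf("fits a 40-bit mask: %d\n", iova_fits(iova, 40));	/* 1 */
	return 0;
}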



On Mon, Oct 29, 2018 at 10:15 AM Alejandro Lucero <
alejandro.luc...@netronome.com> wrote:

> Apologies. Forget my previous email. Just using the wrong repo.
>
> Looking at solving this asap.
>
> On Mon, Oct 29, 2018 at 10:11 AM Alejandro Lucero <
> alejandro.luc...@netronome.com> wrote:
>
>> I know what is going on.
>>
>> In patchset version 3 I forgot to remove an old code. Anatoly spotted
>> that and I was going to send another version for fixing it. Before sending
>> the new version I saw that report about a problem with dma_mask and I'm
>> afraid I did not send another version with the fix ...
>>
>> Yao, can you try with next patch?:
>>
>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
>> index ef656bbad..26adf46c0 100644
>> --- a/lib/librte_eal/common/eal_common_memory.c
>> +++ b/lib/librte_eal/common/eal_common_memory.c
>> @@ -458,10 +458,6 @@ rte_eal_check_dma_mask(uint8_t maskbits)
>>  		return -1;
>>  	}
>> 
>> -	/* keep the more restricted maskbit */
>> -	if (!mcfg->dma_maskbits || maskbits < mcfg->dma_maskbits)
>> -		mcfg->dma_maskbits = maskbits;
>> -
>>  	/* create dma mask */
>>  	mask = ~((1ULL << maskbits) - 1);
>>
>> On Mon, Oct 29, 2018 at 9:48 AM Thomas Monjalon 
>> wrote:
>>
>>> 29/10/2018 10:36, Yao, Lei A:
>>> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
>>> > > 29/10/2018 09:23, Yao, Lei A:
>>> > > > Hi, Lucero, Thomas
>>> > > >
>>> > > > This patch set will cause deadlock during memory initialization.
>>> > > > rte_memseg_walk and try_expand_heap both will lock
>>> > > > the file &mcfg->memory_hotplug_lock. So dead lock will occur.
>>> > > >
>>> > > > #0   rte_memseg_walk
>>> > > > #1  <-rte_eal_check_dma_mask
>>> > > > #2  <-alloc_pages_on_heap
>>> > > > #3  <-try_expand_heap_primary
>>> > > > #4  <-try_expand_heap
>>> > > >
>>> > > > Log as following:
>>> > > > EAL: TSC frequency is ~2494156 KHz
>>> > > > EAL: Master lcore 0 is ready (tid=77fe3c00;cpuset=[0])
>>> > > > [New Thread 0x75e0d700 (LWP 330350)]
>>> > > > EAL: lcore 1 is ready (tid=75e0d700;cpuset=[1])
>>> > > > EAL: Trying to obtain current memory policy.
>>> > > > EAL: Setting policy MPOL_PREFERRED for socket 0
>>> > > > EAL: Restoring previous memory policy: 0
>>> > > >
>>> > > > Could you have a check on this? A lot of test cases in our
>>> validation
>>> > > > team fail because of this. Thanks a lot!
>>> > >
>>> > > Can we just call rte_memseg_walk_thread_unsafe()?
>>> > >
>>> > > +Cc Anatoly
>>> >
>>> > Hi, Thomas
>>> >
>>> > I change to rte_memseg_walk_thread_unsafe(), still
>>> > Can't work.
>>> >
>>> > EAL: Setting policy MPOL_PREFERRED for socket 0
>>> > EAL: Restoring previous memory policy: 0
>>> > EAL: memseg iova 14000, len 4000, out of range
>>> > EAL:using dma mask 
>>> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
>>> > EAL: Trying to obtain current memory policy.
>>> > EAL: Setting policy MPOL_PREFERRED for socket 1
>>> > EAL: Restoring previous memory policy: 0
>>> > EAL: memseg iova 1bc000, len 4000, out of range
>>> > EAL:using dma mask 
>>> > EAL: alloc_pages_on_heap(): couldn't allocate memory due to DMA mask
>>> > error allocating rte services array
>>> > EAL: FATAL: rte_service_init() failed
>>> > EAL: rte_service_init() failed
>>> > PANIC in main():
>>>
>>> I think it is showing there are at least 2 issues:
>>> 1/ deadlock
>>> 2/ allocation does not comply with mask check (out of range)
>>>
>>>
>>>


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Thomas Monjalon
29/10/2018 12:39, Alejandro Lucero:
> I got a patch that solves a bug when calling rte_eal_dma_mask using the
> mask instead of the maskbits. However, this does not solves the deadlock.

The deadlock is a bigger concern I think.

> Interestingly, the problem looks like a compiler one. Calling
> rte_memseg_walk does not return when calling inside rt_eal_dma_mask, but if
> you modify the call like this:
> 
> -   if (rte_memseg_walk(check_iova, &mask))
> +   if (!rte_memseg_walk(check_iova, &mask))
> 
> it works, although the value returned to the invoker changes, of course.
> But the point here is it should be the same behaviour when calling
> rte_memseg_walk than before and it is not.

Anyway, the coding style requires to save the return value in a variable,
instead of nesting the call in an "if" condition.
And the "if" check should be explicitly != 0 because it is not a real boolean.

PS: please do not top post and avoid HTML emails, thanks




[dpdk-dev] [PATCH v2] eal: fix IPC memleak on device hotplug

2018-10-29 Thread Darek Stojaczyk
rte_mp_request_sync() says that the caller is responsible
for freeing one of its parameters afterwards. EAL didn't
do that, causing a memory leak.

Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Cc: qi.z.zh...@intel.com
Cc: anatoly.bura...@intel.com

Signed-off-by: Darek Stojaczyk 
---
 lib/librte_eal/common/hotplug_mp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_eal/common/hotplug_mp.c 
b/lib/librte_eal/common/hotplug_mp.c
index 84f59d95b..b68e4cabb 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -355,6 +355,7 @@ int eal_dev_hotplug_request_to_primary(struct 
eal_dev_mp_req *req)
resp = (struct eal_dev_mp_req *)mp_reply.msgs[0].param;
req->result = resp->result;
 
+   free(mp_reply.msgs);
return ret;
 }
 
@@ -379,6 +380,7 @@ int eal_dev_hotplug_request_to_secondary(struct 
eal_dev_mp_req *req)
 
if (mp_reply.nb_sent != mp_reply.nb_received) {
RTE_LOG(ERR, EAL, "not all secondary reply\n");
+   free(mp_reply.msgs);
return -1;
}
 
@@ -397,6 +399,7 @@ int eal_dev_hotplug_request_to_secondary(struct 
eal_dev_mp_req *req)
}
}
 
+   free(mp_reply.msgs);
return 0;
 }
 
-- 
2.17.1
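
For context, a minimal sketch of the ownership rule this fix enforces:
rte_mp_request_sync() hands back an allocated reply.msgs array that the
caller must free(). This fragment assumes a populated struct rte_mp_msg
named req and the usual EAL headers; it is not the actual hotplug code.

	struct rte_mp_reply mp_reply;
	struct timespec ts = { .tv_sec = 5, .tv_nsec = 0 };
	int ret;

	memset(&mp_reply, 0, sizeof(mp_reply));	/* msgs stays NULL until replies arrive */
	ret = rte_mp_request_sync(&req, &mp_reply, &ts);
	if (ret == 0) {
		/* ... inspect mp_reply.msgs[0 .. mp_reply.nb_received - 1] ... */
	}
	free(mp_reply.msgs);	/* caller owns the reply buffer; free(NULL) is a no-op */
	return ret;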



Re: [dpdk-dev] [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices management

2018-10-29 Thread Slava Ovsiienko
> -Original Message-
> From: Yongseok Koh
> Sent: Saturday, October 27, 2018 1:43
> To: Slava Ovsiienko 
> Cc: Shahaf Shuler ; dev@dpdk.org
> Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices
> management
> 
> On Fri, Oct 26, 2018 at 02:35:24AM -0700, Slava Ovsiienko wrote:
> > > -Original Message-
> > > From: Yongseok Koh
> > > Sent: Friday, October 26, 2018 9:26
> > > To: Slava Ovsiienko 
> > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices
> > > management
> > >
> > > On Thu, Oct 25, 2018 at 01:21:12PM -0700, Slava Ovsiienko wrote:
> > > > > -Original Message-
> > > > > From: Yongseok Koh
> > > > > Sent: Thursday, October 25, 2018 3:28
> > > > > To: Slava Ovsiienko 
> > > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > > Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel
> > > > > devices management
> > > > >
> > > > > On Mon, Oct 15, 2018 at 02:13:33PM +, Viacheslav Ovsiienko
> wrote:
> > > > > > VXLAN interfaces are dynamically created for each local UDP
> > > > > > port of outer networks and then used as targets for TC
> > > > > > "flower" filters in order to perform encapsulation. These
> > > > > > VXLAN interfaces are system-wide, the only one device with
> > > > > > given UDP port can exist in the system (the attempt of
> > > > > > creating another device with the same UDP local port returns
> > > > > > EEXIST), so PMD should support the shared device instances
> > > > > > database for PMD instances. These VXLAN implicitly created devices
> are called VTEPs (Virtual Tunnel End Points).
> > > > > >
> > > > > > Creation of the VTEP occurs at the moment of rule applying.
> > > > > > The link is set up, root ingress qdisc is also initialized.
> > > > > >
> > > > > > Encapsulation VTEPs are created on per port basis, the single
> > > > > > VTEP is attached to the outer interface and is shared for all
> > > > > > encapsulation rules on this interface. The source UDP port is
> > > > > > automatically selected in range 3-6.
> > > > > >
> > > > > > For decapsulaton one VTEP is created per every unique UDP
> > > > > > local port to accept tunnel traffic. The name of created VTEP
> > > > > > consists of prefix "vmlx_" and the number of UDP port in
> > > > > > decimal digits without leading zeros (vmlx_4789). The VTEP can
> > > > > > be preliminary created in the system before the launching
> > > > > > application, it allows to share UDP ports between primary
> > > > > > and secondary processes.
> > > > > >
> > > > > > Suggested-by: Adrien Mazarguil 
> > > > > > Signed-off-by: Viacheslav Ovsiienko 
> > > > > > ---
> > > > > >  drivers/net/mlx5/mlx5_flow_tcf.c | 503
> > > > > > ++-
> > > > > >  1 file changed, 499 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > b/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > index d6840d5..efa9c3b 100644
> > > > > > --- a/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > +++ b/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > @@ -3443,6 +3443,432 @@ struct pedit_parser {
> > > > > > return -err;
> > > > > >  }
> > > > > >
> > > > > > +/* VTEP device list is shared between PMD port instances. */
> > > > > > +static LIST_HEAD(, mlx5_flow_tcf_vtep)
> > > > > > +   vtep_list_vxlan = LIST_HEAD_INITIALIZER();
> static
> > > > > pthread_mutex_t
> > > > > > +vtep_list_mutex = PTHREAD_MUTEX_INITIALIZER;
> > > > >
> > > > > What's the reason for choosing pthread_mutex instead of
> rte_*_lock?
> > > >
> > > > The sharing this database for secondary processes?
> > >
> > > The static variable isn't shared with sec proc. But you can leave it as 
> > > is.
> >
> > Yes. The sharing just was assumed, not implemented yet.
> >
> > >
> > > > > > +
> > > > > > +/**
> > > > > > + * Deletes VTEP network device.
> > > > > > + *
> > > > > > + * @param[in] tcf
> > > > > > + *   Context object initialized by mlx5_flow_tcf_context_create().
> > > > > > + * @param[in] vtep
> > > > > > + *   Object represinting the network device to delete. Memory
> > > > > > + *   allocated for this object is freed by routine.
> > > > > > + */
> > > > > > +static void
> > > > > > +flow_tcf_delete_iface(struct mlx5_flow_tcf_context *tcf,
> > > > > > + struct mlx5_flow_tcf_vtep *vtep) {
> > > > > > +   struct nlmsghdr *nlh;
> > > > > > +   struct ifinfomsg *ifm;
> > > > > > +   alignas(struct nlmsghdr)
> > > > > > +   uint8_t buf[mnl_nlmsg_size(MNL_ALIGN(sizeof(*ifm))) + 8];
> > > > > > +   int ret;
> > > > > > +
> > > > > > +   assert(!vtep->refcnt);
> > > > > > +   if (vtep->created && vtep->ifindex) {
> > > > >
> > > > > First of all vtep->created seems of no use. It is introduced to
> > > > > select the error message in flow_tcf_create_iface(). I don't see
> > > > > any necessity to distinguish between 'vtep is allocated by
> > > > > rte_malloc()' and
> > > 'vtep is created i

Re: [dpdk-dev] [PATCH] eal: fix rte_mp_request_sync() memleak on device hotplug

2018-10-29 Thread Stojaczyk, Dariusz
> -Original Message-
> From: Burakov, Anatoly
> Sent: Friday, October 26, 2018 4:22 PM
> To: Stojaczyk, Dariusz ; dev@dpdk.org
> Cc: Zhang, Qi Z ; sta...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] eal: fix rte_mp_request_sync() memleak
> on device hotplug
> 
> 
>
> This is correct but incomplete. There are also numerous error conditions
> which check for number of received responses to be a particular number,
> and if the number don't match, we just exit without freeing memory.
> Those errors need to free the memory as well.
> 

Yup, thanks. I pushed v2 with one extra free() in the function notifying 
secondary processes.
The function which notifies the primary process has a similar error check
- `if (mp_reply.nb_received == 1)` - but I figured if there's more than 1
primary process replying, then the memory leak is your smallest problem.

> --
> Thanks,
> Anatoly


[dpdk-dev] [PATCH] crypto/caam_jr: fix incorrect check

2018-10-29 Thread Gagandeep Singh
Check should be on parameter uio_fd instead of
local variable job_ring

Fixes: e7a45f3cc2 ("crypto/caam_jr: add UIO specific operations")

Signed-off-by: Gagandeep Singh 
---
changes:
 * fixed incorrect check in free_job_ring (comment by Ferruh)

 drivers/crypto/caam_jr/caam_jr_uio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/caam_jr/caam_jr_uio.c 
b/drivers/crypto/caam_jr/caam_jr_uio.c
index c07d9db01..d94101c2f 100644
--- a/drivers/crypto/caam_jr/caam_jr_uio.c
+++ b/drivers/crypto/caam_jr/caam_jr_uio.c
@@ -332,7 +332,7 @@ free_job_ring(uint32_t uio_fd)
struct uio_job_ring *job_ring = NULL;
int i;
 
-   if (!job_ring->uio_fd)
+   if (!uio_fd)
return;
 
for (i = 0; i < MAX_SEC_JOB_RINGS; i++) {
-- 
2.17.1



Re: [dpdk-dev] [PATCH v5 04/15] crypto/caam_jr: add UIO specific operations

2018-10-29 Thread Gagandeep Singh
Hi

> -Original Message-
> From: Ferruh Yigit 
> Sent: Sunday, October 28, 2018 6:05 AM
> To: Gagandeep Singh ; dev@dpdk.org; Akhil Goyal
> 
> Cc: Hemant Agrawal 
> Subject: Re: [dpdk-dev] [PATCH v5 04/15] crypto/caam_jr: add UIO specific
> operations
> 
> On 10/22/2018 3:57 PM, Gagandeep Singh wrote:
> > caam_jr need support from kernel caam driver for job ring
> > initialisation. So to access register space for job ring and allow re
> > configure and map to userspace UIO interface is used. This also allows
> > to handle the caam interrupts from the user space.
> >
> > This patch adds UIO specific operations
> >
> > Signed-off-by: Gagandeep Singh 
> > Signed-off-by: Hemant Agrawal 
> > Acked-by: Akhil Goyal 
> 
> <...>
> 
> > +void
> > +free_job_ring(uint32_t uio_fd)
> > +{
> > +   struct uio_job_ring *job_ring = NULL;
> > +   int i;
> > +
> > +   if (!job_ring->uio_fd)
> > +   return;
> 
> It seems this function is not called. Can the intention be:
> 
>  if (!uio_fd)
>return;

Patch has been sent http://patchwork.dpdk.org/patch/47557/ 


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
wrote:

> 29/10/2018 12:39, Alejandro Lucero:
> > I got a patch that solves a bug when calling rte_eal_dma_mask using the
> > mask instead of the maskbits. However, this does not solves the deadlock.
>
> The deadlock is a bigger concern I think.
>
>
I think once the call to rte_eal_check_dma_mask uses the maskbits instead
of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.

Yao, can you try with the attached patch?


> > Interestingly, the problem looks like a compiler one. Calling
> > rte_memseg_walk does not return when calling inside rt_eal_dma_mask, but
> if
> > you modify the call like this:
> >
> > -   if (rte_memseg_walk(check_iova, &mask))
> > +   if (!rte_memseg_walk(check_iova, &mask))
> >
> > it works, although the value returned to the invoker changes, of course.
> > But the point here is it should be the same behaviour when calling
> > rte_memseg_walk than before and it is not.
>
> Anyway, the coding style requires to save the return value in a variable,
> instead of nesting the call in an "if" condition.
> And the "if" check should be explicitly != 0 because it is not a real
> boolean.
>
> PS: please do not top post and avoid HTML emails, thanks
>
>
>


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Yao, Lei A


From: Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
Sent: Monday, October 29, 2018 8:56 PM
To: Thomas Monjalon 
Cc: Yao, Lei A ; dev ; Xu, Qian Q 
; Lin, Xueqin ; Burakov, Anatoly 
; Yigit, Ferruh 
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon  wrote:
29/10/2018 12:39, Alejandro Lucero:
> I got a patch that solves a bug when calling rte_eal_dma_mask using the
> mask instead of the maskbits. However, this does not solves the deadlock.

The deadlock is a bigger concern I think.

I think once the call to rte_eal_check_dma_mask uses the maskbits instead of 
the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.

Yao, can you try with the attached patch?

Hi, Lucero

This patch can fix the issue at my side. Thanks a lot
for your quick action.

BRs
Lei

> Interestingly, the problem looks like a compiler one. Calling
> rte_memseg_walk does not return when calling inside rt_eal_dma_mask, but if
> you modify the call like this:
>
> -   if (rte_memseg_walk(check_iova, &mask))
> +   if (!rte_memseg_walk(check_iova, &mask))
>
> it works, although the value returned to the invoker changes, of course.
> But the point here is it should be the same behaviour when calling
> rte_memseg_walk than before and it is not.

Anyway, the coding style requires to save the return value in a variable,
instead of nesting the call in an "if" condition.
And the "if" check should be explicitly != 0 because it is not a real boolean.

PS: please do not top post and avoid HTML emails, thanks



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A  wrote:

>
>
>
>
> *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
> *Sent:* Monday, October 29, 2018 8:56 PM
> *To:* Thomas Monjalon 
> *Cc:* Yao, Lei A ; dev ; Xu, Qian Q <
> qian.q...@intel.com>; Lin, Xueqin ; Burakov,
> Anatoly ; Yigit, Ferruh  >
> *Subject:* Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask
>
>
>
>
>
> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
> wrote:
>
> 29/10/2018 12:39, Alejandro Lucero:
> > I got a patch that solves a bug when calling rte_eal_dma_mask using the
> > mask instead of the maskbits. However, this does not solves the
> deadlock.
>
> The deadlock is a bigger concern I think.
>
>
>
> I think once the call to rte_eal_check_dma_mask uses the maskbits instead
> of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
>
>
>
> Yao, can you try with the attached patch?
>
>
>
> Hi, Lucero
>
>
>
> This patch can fix the issue at my side. Thanks a lot
>
> for you quick action.
>
>
>

Great!

I will send an official patch with the changes.

I have to say that I tested the patchset, but I think that was when
legacy_mem was still in use, and therefore the dynamic memory allocation
code was not exercised during memory initialization.

There is something that concerns me though. Using
rte_memseg_walk_thread_unsafe could be a problem under some situations,
although those situations are unlikely.

Usually, calling rte_eal_check_dma_mask happens during initialization. Then
it is safe to use the unsafe function for walking memsegs, but with device
hotplug and dynamic memory allocation, there exists a potential race
condition when the primary process is allocating more memory and
concurrently a device is hotplugged and a secondary process does the device
initialization. By now, this is just a problem with the NFP, and the
potential race condition window is really unlikely, but I will work on this
asap.


> BRs
>
> Lei
>
>
>
> > Interestingly, the problem looks like a compiler one. Calling
> > rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
> but if
> > you modify the call like this:
> >
> > -   if (rte_memseg_walk(check_iova, &mask))
> > +   if (!rte_memseg_walk(check_iova, &mask))
> >
> > it works, although the value returned to the invoker changes, of course.
> > But the point here is it should be the same behaviour when calling
> > rte_memseg_walk than before and it is not.
>
> Anyway, the coding style requires to save the return value in a variable,
> instead of nesting the call in an "if" condition.
> And the "if" check should be explicitly != 0 because it is not a real
> boolean.
>
> PS: please do not top post and avoid HTML emails, thanks
>
>


Re: [dpdk-dev] [PATCH v2] build: add meson.build for kni kernel module

2018-10-29 Thread Ilya Maximets
> On Fri, 2018-10-12 at 17:29 +0100, Bruce Richardson wrote:
>> On Fri, Oct 12, 2018 at 04:12:21PM +0100, Luca Boccassi wrote:
>> > A Kbuild is also included to allow users to use DKMS natively
>> > without
>> > additional code.
>> > 
>> > Signed-off-by: Luca Boccassi 
>> > ---
>> > v2: add local directory to -I and build sources list recursively to
>> > fix build on Fedora
>> > 
>> 
>> Yep, now seems to build for me on Fedora. One minor suggestion is to
>> look
>> to use the "console" keyword on the custom_target if possible. It
>> should
>> help prevent noticable stalls as make runs in the background.
>> [Unfortunately, it's meson 0.48 onward only, so if conditionals are
>> needed,
>> just ignore this suggestion]
> 
> Good idea, I just with 0.48, 0.47 and 0.41 and in all cases there are
> no complaints or errors, it's simply ignored in the older versions.

Hmm.
I have following on my Ubuntu 18.04 with dpdk v18.11-rc1:

$ meson --version
0.45.1

$ meson build
<...>
kernel/linux/kni/meson.build:16: WARNING: Passed invalid keyword argument 
"console".
WARNING: This will become a hard error in the future.
WARNING: Unknown keyword arguments in target rte_kni: console
<...>

It builds, but prints above warning.

Best regards, Ilya Maximets.

> 
>> Also, do we want to use "make -j" for this, given the number of
>> files?
>> [Maybe -j4 just?]
> 
> Yeah makes sense, added j4.
> 
> Sent v3 with both changes and the ack, thanks.
> 
> -- 
> Kind regards,
> Luca Boccassi



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Thomas Monjalon
29/10/2018 14:40, Alejandro Lucero:
> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A  wrote:
> > *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
> > On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
> > wrote:
> >
> > 29/10/2018 12:39, Alejandro Lucero:
> > > I got a patch that solves a bug when calling rte_eal_dma_mask using the
> > > mask instead of the maskbits. However, this does not solves the
> > deadlock.
> >
> > The deadlock is a bigger concern I think.
> >
> > I think once the call to rte_eal_check_dma_mask uses the maskbits instead
> > of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
> >
> > Yao, can you try with the attached patch?
> >
> > Hi, Lucero
> >
> > This patch can fix the issue at my side. Thanks a lot
> > for you quick action.
> 
> Great!
> 
> I will send an official patch with the changes.

Please, do not forget my other request to better comment functions.


> I have to say that I tested the patchset, but I think it was where
> legacy_mem was still there and therefore dynamic memory allocation code not
> used during memory initialization.
> 
> There is something that concerns me though. Using
> rte_memseg_walk_thread_unsafe could be a problem under some situations
> although those situations being unlikely.
> 
> Usually, calling rte_eal_check_dma_mask happens during initialization. Then
> it is safe to use the unsafe function for walking memsegs, but with device
> hotplug and dynamic memory allocation, there exists a potential race
> condition when the primary process is allocating more memory and
> concurrently a device is hotplugged and a secondary process does the device
> initialization. By now, this is just a problem with the NFP, and the
> potential race condition window really unlikely, but I will work on this
> asap.

Yes, this is what concerns me.
You can add a comment explaining the unsafe which is not handled.


> > > Interestingly, the problem looks like a compiler one. Calling
> > > rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
> > but if
> > > you modify the call like this:
> > >
> > > -   if (rte_memseg_walk(check_iova, &mask))
> > > +   if (!rte_memseg_walk(check_iova, &mask))
> > >
> > > it works, although the value returned to the invoker changes, of course.
> > > But the point here is it should be the same behaviour when calling
> > > rte_memseg_walk than before and it is not.
> >
> > Anyway, the coding style requires to save the return value in a variable,
> > instead of nesting the call in an "if" condition.
> > And the "if" check should be explicitly != 0 because it is not a real
> > boolean.
> >
> > PS: please do not top post and avoid HTML emails, thanks
> >
> >
> 







Re: [dpdk-dev] [PATCH] eal: fix rte_mp_request_sync() memleak on device hotplug

2018-10-29 Thread Zhang, Qi Z


> -Original Message-
> From: Stojaczyk, Dariusz
> Sent: Monday, October 29, 2018 7:02 AM
> To: Burakov, Anatoly ; dev@dpdk.org
> Cc: Zhang, Qi Z ; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] eal: fix rte_mp_request_sync() memleak on
> device hotplug
> 
> > -Original Message-
> > From: Burakov, Anatoly
> > Sent: Friday, October 26, 2018 4:22 PM
> > To: Stojaczyk, Dariusz ; dev@dpdk.org
> > Cc: Zhang, Qi Z ; sta...@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] eal: fix rte_mp_request_sync() memleak
> > on device hotplug
> >
> > 
> >
> > This is correct but incomplete. There are also numerous error
> > conditions which check for number of received responses to be a
> > particular number, and if the number don't match, we just exit without
> freeing memory.
> > Those errors need to free the memory as well.
> >
> 
> Yup, thanks. I pushed v2 with one extra free() in the function notifying
> secondary processes.
> The function which notifies the primary process has a similar error check -
>  - `if (mp_reply.nb_received == 1)`  - but I figured if there's more than 1
> primary process replying, then the memory leak is your smallest problem.


Good capture!
Thanks for fixing this.

> 
> > --
> > Thanks,
> > Anatoly


Re: [dpdk-dev] [PATCH v2 6/9] app/procinfo: add code for debug crypto

2018-10-29 Thread Varghese, Vipin
Hi Stephen,



> > > This is a very big macro, better have static function for this instead of
> macro.
> > >
> >
> > There are two thoughts in choosing MACRO over function.
> > 1. The information need to display in certain format within the same 
> > context.
> > 2. As the API are modified, co locating all at same scope is easier to 
> > clean up
> and correct in future.
> >
> > So I feel use of MACRO over function in this instance.
> 
> 
> I don't agree with your arguments. Macros, are ugly and error prone. This is
> not performance critical so it should be a function. The only reason to use
> macro's is if it is not possible to write it as a function (as in a template 
> for code
> generation).

Thanks for sharing your feedback. But in my comments I never stated that the
macro was used for performance-critical blocks. As mentioned in my
explanation, the macro is used for printing in the desired format and for
keeping the code organized.

Since I have received two requests for this change, I will do the same and release v3.
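
For reference, a minimal, self-contained sketch of the direction suggested
here — a static helper instead of a large display macro. All names and
values are hypothetical, not the actual app/procinfo code.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical replacement for a display macro: same output format,
 * but type-checked and steppable in a debugger like any function. */
static void
show_stat(const char *name, uint64_t value)
{
	printf("\t%-32s: %" PRIu64 "\n", name, value);
}

int main(void)
{
	show_stat("crypto enqueued ops", 12345);
	show_stat("crypto dequeued ops", 12300);
	return 0;
}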

Thanks
Vipin Varghese


Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
On Mon, Oct 29, 2018 at 2:18 PM Thomas Monjalon  wrote:

> 29/10/2018 14:40, Alejandro Lucero:
> > On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A  wrote:
> > > *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
> > > On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
> > > wrote:
> > >
> > > 29/10/2018 12:39, Alejandro Lucero:
> > > > I got a patch that solves a bug when calling rte_eal_dma_mask using
> the
> > > > mask instead of the maskbits. However, this does not solves the
> > > deadlock.
> > >
> > > The deadlock is a bigger concern I think.
> > >
> > > I think once the call to rte_eal_check_dma_mask uses the maskbits
> instead
> > > of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
> > >
> > > Yao, can you try with the attached patch?
> > >
> > > Hi, Lucero
> > >
> > > This patch can fix the issue at my side. Thanks a lot
> > > for you quick action.
> >
> > Great!
> >
> > I will send an official patch with the changes.
>
> Please, do not forget my other request to better comment functions.
>
>
>
Sure.


> > I have to say that I tested the patchset, but I think it was where
> > legacy_mem was still there and therefore dynamic memory allocation code
> not
> > used during memory initialization.
> >
> > There is something that concerns me though. Using
> > rte_memseg_walk_thread_unsafe could be a problem under some situations
> > although those situations being unlikely.
> >
> > Usually, calling rte_eal_check_dma_mask happens during initialization.
> Then
> > it is safe to use the unsafe function for walking memsegs, but with
> device
> > hotplug and dynamic memory allocation, there exists a potential race
> > condition when the primary process is allocating more memory and
> > concurrently a device is hotplugged and a secondary process does the
> device
> > initialization. By now, this is just a problem with the NFP, and the
> > potential race condition window really unlikely, but I will work on this
> > asap.
>
> Yes, this is what concerns me.
> You can add a comment explaining the unsafe which is not handled.
>
>
I'll do.

Thanks!


>
> > > > Interestingly, the problem looks like a compiler one. Calling
> > > > rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
> > > but if
> > > > you modify the call like this:
> > > >
> > > > -   if (rte_memseg_walk(check_iova, &mask))
> > > > +   if (!rte_memseg_walk(check_iova, &mask))
> > > >
> > > > it works, although the value returned to the invoker changes, of
> course.
> > > > But the point here is it should be the same behaviour when calling
> > > > rte_memseg_walk than before and it is not.
> > >
> > > Anyway, the coding style requires to save the return value in a
> variable,
> > > instead of nesting the call in an "if" condition.
> > > And the "if" check should be explicitly != 0 because it is not a real
> > > boolean.
> > >
> > > PS: please do not top post and avoid HTML emails, thanks
> > >
> > >
> >
>
>
>
>
>
>


Re: [dpdk-dev] [PATCH v6 0/6] add encap and decap actions to Direct Verbs flow in MLX5 PMD

2018-10-29 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Shahaf Shuler
> Sent: Monday, October 29, 2018 12:03 PM
> To: Dekel Peled ; Yongseok Koh
> 
> Cc: dev@dpdk.org; Ori Kam 
> Subject: RE: [dpdk-dev] [PATCH v6 0/6] add encap and decap actions to
> Direct Verbs flow in MLX5 PMD
> 
> Hi Dekel,
> 
> Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> > Subject: [dpdk-dev] [PATCH v6 0/6] add encap and decap actions to
> > Direct Verbs flow in MLX5 PMD
> >
> > This series adds support of encap and decap actions in DV format.
> > L2 tunnel support for VXLAN and NVGRE, and L2/L3 tunnel support using
> > raw data buffer.
> > It is using the generic encapsulation framework from [1].
> >
> > [1] "ethdev: add generic L2/L3 tunnel encapsulation actions"
> >
> >
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail
> > s.dpdk.org%2Farchives%2Fdev%2F2018-
> >
> October%2F114654.html&data=02%7C01%7Cshahafs%40mellanox.com
> >
> %7C5c61dd00f0ea45acba0608d63ab5c0e6%7Ca652971c7d2e4d9ba6a4d14925
> >
> 6f461b%7C0%7C0%7C636760949640981142&sdata=tdOJhJfndTAP6Uotk
> > MJrkgBsa7AXljXJRMPsivfveUo%3D&reserved=0
> 
> In general looks good, please see some comments to address.

Thanks, will do.

> 
> >
> > ---
> > v6:
> > * Adapt L2 tunnel to VXLAN and NVGRE
> > * Add encap/decap using raw data
> > v5:
> > * Move DV actions code under common DV flag.
> > v4:
> > * Update in glue functions.
> > v3:
> > * Apply additional code review comments.
> > v2:
> > * Rebase on tip.
> > * Apply code review comments.
> >
> >
> > Dekel Peled (6):
> >   net/mlx5: add flow action functions to glue
> >   net/mlx5: add VXLAN encap action to Direct Verbs
> >   net/mlx5: add VXLAN decap action to Direct Verbs
> >   net/mlx5: add NVGRE encap action to Direct Verbs
> >   net/mlx5: add NVGRE decap action to Direct Verbs
> >   net/mlx5: add raw data encap decap to Direct Verbs
> >
> >  drivers/net/mlx5/Makefile   |   2 +-
> >  drivers/net/mlx5/meson.build|   2 +-
> >  drivers/net/mlx5/mlx5_flow.h|  15 +
> >  drivers/net/mlx5/mlx5_flow_dv.c | 671
> > +++-
> >  drivers/net/mlx5/mlx5_glue.c|  38 +++
> >  drivers/net/mlx5/mlx5_glue.h|  10 +
> >  6 files changed, 731 insertions(+), 7 deletions(-)
> >
> > --
> > 1.8.3.1



Re: [dpdk-dev] [PATCH v2] build: add meson.build for kni kernel module

2018-10-29 Thread Luca Boccassi
On Mon, 2018-10-29 at 17:09 +0300, Ilya Maximets wrote:
> > On Fri, 2018-10-12 at 17:29 +0100, Bruce Richardson wrote:
> > > On Fri, Oct 12, 2018 at 04:12:21PM +0100, Luca Boccassi wrote:
> > > > A Kbuild is also included to allow users to use DKMS natively
> > > > without
> > > > additional code.
> > > > 
> > > > Signed-off-by: Luca Boccassi 
> > > > ---
> > > > v2: add local directory to -I and build sources list
> > > > recursively to
> > > > fix build on Fedora
> > > > 
> > > 
> > > Yep, now seems to build for me on Fedora. One minor suggestion is
> > > to
> > > look
> > > to use the "console" keyword on the custom_target if possible. It
> > > should
> > > help prevent noticable stalls as make runs in the background.
> > > [Unfortunately, it's meson 0.48 onward only, so if conditionals
> > > are
> > > needed,
> > > just ignore this suggestion]
> > 
> > Good idea, I just with 0.48, 0.47 and 0.41 and in all cases there
> > are
> > no complaints or errors, it's simply ignored in the older versions.
> 
> Hmm.
> I have following on my Ubuntu 18.04 with dpdk v18.11-rc1:
> 
> $ meson --version
> 0.45.1
> 
> $ meson build
> <...>
> kernel/linux/kni/meson.build:16: WARNING: Passed invalid keyword
> argument "console".
> WARNING: This will become a hard error in the future.
> WARNING: Unknown keyword arguments in target rte_kni: console
> <...>
> 
> It builds, but prints above warning.
> 
> Best regards, Ilya Maximets.

Interesting, wonder why I didn't see that on 0.41! Anyway those
warnings are fine to ignore, especially in this case.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v3 2/6] mem: use address hint for mapping hugepages

2018-10-29 Thread Dariusz Stojaczyk
On Fri, Oct 5, 2018 at 2:47 PM Alejandro Lucero
 wrote:
>
> Linux kernel uses a really high address as starting address for
> serving mmaps calls. If there exist addressing limitations and
> IOVA mode is VA, this starting address is likely too high for
> those devices. However, it is possible to use a lower address in
> the process virtual address space as with 64 bits there is a lot
> of available space.
>
> This patch adds an address hint as starting address for 64 bits
> systems and increments the hint for next invocations. If the mmap
> call does not use the hint address, repeat the mmap call using
> the hint address incremented by page size.
>
> Signed-off-by: Alejandro Lucero 
> Reviewed-by: Anatoly Burakov 
> ---
>  lib/librte_eal/common/eal_common_memory.c | 34 
> ++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/common/eal_common_memory.c 
> b/lib/librte_eal/common/eal_common_memory.c
> index c482f0d..853c44c 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -37,6 +37,23 @@
>  static void *next_baseaddr;
>  static uint64_t system_page_sz;
>
> +#ifdef RTE_ARCH_64
> +/*
> + * Linux kernel uses a really high address as starting address for serving
> + * mmaps calls. If there exists addressing limitations and IOVA mode is VA,
> + * this starting address is likely too high for those devices. However, it
> + * is possible to use a lower address in the process virtual address space
> + * as with 64 bits there is a lot of available space.
> + *
> + * Current known limitations are 39 or 40 bits. Setting the starting address
> + * at 4GB implies there are 508GB or 1020GB for mapping the available
> + * hugepages. This is likely enough for most systems, although a device with
> + * addressing limitations should call rte_eal_check_dma_mask for ensuring all
> + * memory is within supported range.
> + */
> +static uint64_t baseaddr = 0x1;
> +#endif

This breaks running with ASAN unless a custom --base-virtaddr option
is specified. The default base-virtaddr introduced by this patch falls
into an area that's already reserved by ASAN.

See here: https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm
The only available address space starts at 0x10007fff8000, which
unfortunately doesn't fit in 39 bits.

Right now the very first eal_get_virtual_area() in EAL initialization
is used with 4KB pagesize, meaning that DPDK will try to mmap at each
4KB-aligned offset all the way from 0x1 to 0x10007fff8000,
which takes quite a long, long time.
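
For reference, a self-contained toy of the hint-and-retry loop in question,
using the 4GB starting address mentioned in the commit message (64-bit Linux
only; simplified, not the EAL code). Under ASAN this is the loop that ends
up probing page by page.

#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	const size_t page_sz = 4096;
	const size_t len = 16 * page_sz;
	uintptr_t hint = 0x100000000ULL;	/* 4GB starting hint */
	void *addr;

	for (;;) {
		addr = mmap((void *)hint, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (addr == MAP_FAILED)
			return 1;
		if ((uintptr_t)addr == hint)
			break;			/* the kernel honoured the hint */
		munmap(addr, len);		/* otherwise retry one page higher */
		hint += page_sz;
	}
	printf("mapped %zu bytes at %p\n", len, addr);
	munmap(addr, len);
	return 0;
}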

I'm not sure about the solution to this problem, but I verified that
starting DPDK 18.11-rc1 with `--base-virtaddr 0x2000` works
just fine under ASAN.

D.

>
> 


Re: [dpdk-dev] [PATCH 2/3] service: fix possible NULL access

2018-10-29 Thread Van Haaren, Harry
> -Original Message-
> From: Yigit, Ferruh
> Sent: Saturday, October 27, 2018 6:09 PM
> To: Van Haaren, Harry 
> Cc: dev@dpdk.org; Yigit, Ferruh ; sta...@dpdk.org
> Subject: [PATCH 2/3] service: fix possible NULL access
> 
> Fixes: 21698354c832 ("service: introduce service cores concept")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ferruh Yigit 
> ---

Acked-by: Harry van Haaren 



Re: [dpdk-dev] [PATCH] net/mlx5: set RSS key len 0 to indicate default RSS

2018-10-29 Thread Ophir Munk
Please find comments inline.

> -Original Message-
> From: Yongseok Koh
> Sent: Wednesday, October 03, 2018 9:56 PM
> To: Ophir Munk 
> Cc: dev@dpdk.org; Asaf Penso ; Shahaf Shuler
> ; Thomas Monjalon ;
> Olga Shern 
> Subject: Re: [PATCH] net/mlx5: set RSS key len 0 to indicate default RSS
> 
> 
> > On Oct 3, 2018, at 10:37 AM, Ophir Munk 
> wrote:
> >
> > Applications which add an RSS flow must supply an RSS key table and an
> > RSS key length. If an application needs to add the default RSS flow it
> > should not care about the exact RSS default key table and its length.
> > By setting key length to 0 - the PMD will know that it should use the
> > default RSS key table and length.
> >
> > Signed-off-by: Ophir Munk 
> > ---
> > drivers/net/mlx5/mlx5_flow.c | 7 ---
> 
> Please rebase the code on top of the latest dpdk-next-net-mlx.
> Actually, it is better to rebase on top of PR#878.
> 
> Thanks,
> Yongseok

I have rebased the code on master branch dated 18.11 28-Oct-18.
After the rebase I have discovered more seg fault scenarios which are fixed in 
V2.

> 
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 3f548a9..18eacf5 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -2062,7 +2062,8 @@ struct mlx5_flow_tunnel_info {
> >
> RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> >   &rss->level,
> >   "tunnel RSS is not supported");
> > -   if (rss->key_len < MLX5_RSS_HASH_KEY_LEN)
> > +   /* key_len 0 means using default RSS key */
> > +   if (rss->key_len > 0 && rss->key_len < MLX5_RSS_HASH_KEY_LEN)
> > return rte_flow_error_set(error, ENOTSUP,
> >
> RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> >   &rss->key_len,
> > @@ -2106,7 +2107,7 @@ struct mlx5_flow_tunnel_info {
> > memcpy((*flow->queue), rss->queue,
> >rss->queue_num * sizeof(uint16_t));
> > flow->rss.queue_num = rss->queue_num;
> > -   memcpy(flow->key, rss->key, MLX5_RSS_HASH_KEY_LEN);
> > +   memcpy(flow->key, rss->key, rss->key_len);
> > flow->rss.types = rss->types;
> > flow->rss.level = rss->level;
> > flow->fate |= MLX5_FLOW_FATE_RSS;
> > @@ -2948,7 +2949,7 @@ struct mlx5_flow_tunnel_info {
> >  flow->rss.queue_num);
> > if (!hrxq)
> > hrxq = mlx5_hrxq_new(dev, flow->key,
> > -
> MLX5_RSS_HASH_KEY_LEN,
> > +flow->rss.key_len,
> >  verbs->hash_fields,
> >  (*flow->queue),
> >  flow->rss.queue_num,
> > --
> > 1.8.3.1
> >



Re: [dpdk-dev] [PATCH] net/mlx5: set RSS key len 0 to indicate default RSS

2018-10-29 Thread Ophir Munk
> -Original Message-
> From: Shahaf Shuler
> Sent: Sunday, October 07, 2018 2:21 PM
> To: Yongseok Koh ; Ophir Munk
> 
> Cc: dev@dpdk.org; Asaf Penso ; Thomas Monjalon
> ; Olga Shern 
> Subject: RE: [PATCH] net/mlx5: set RSS key len 0 to indicate default RSS
> 
> > > On Oct 3, 2018, at 10:37 AM, Ophir Munk 
> > wrote:
> > >
> > > Applications which add an RSS flow must supply an RSS key table and
> > > an RSS key length. If an application needs to add the default RSS
> > > flow it should not care about the exact RSS default key table and its
> length.
> > > By setting key length to 0 - the PMD will know that it should use
> > > the default RSS key table and length.
> 
> Can you refer where in the rte_flow API this is documented?  I couldn't find 
> it
> under struct rte_flow_action_rss  definition.
> I agree it is better for the PMD to behave this way, but this should be
> explained in the API, so for this patch to be accepted we need also doc
> update in rte_flow.h.
> 

Will update documentation in v2

> In my opinion, it is better for the key to be NULL to request from the PMD to
> use the default key, similar to struct rte_eth_rss_conf.
> 

I suggest considering both (NULL key or key_len 0) as criteria for using the
default RSS key.
In addition, if type==0 it will be considered a default type and mlx5 will use 
ETH_RSS_IP.
Will update in v2.
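
For illustration, a hedged sketch of how an application could rely on that
convention once documented — key == NULL / key_len == 0 asking the PMD for
its default RSS key (queue numbers are illustrative; this is not mlx5 code):

#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

static const uint16_t rss_queues[] = { 0, 1, 2, 3 };

static const struct rte_flow_action_rss rss_default_key = {
	.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
	.level = 0,
	.types = ETH_RSS_IP,	/* or 0 for "default types" per the proposal */
	.key_len = 0,		/* 0 => let the PMD pick its default RSS key */
	.key = NULL,
	.queue_num = RTE_DIM(rss_queues),
	.queue = rss_queues,
};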

> > > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > > b/drivers/net/mlx5/mlx5_flow.c index 3f548a9..18eacf5 100644
> > > --- a/drivers/net/mlx5/mlx5_flow.c
> > > +++ b/drivers/net/mlx5/mlx5_flow.c
> > > @@ -2062,7 +2062,8 @@ struct mlx5_flow_tunnel_info {
> > >
> > > @@ -2106,7 +2107,7 @@ struct mlx5_flow_tunnel_info {
> > >   memcpy((*flow->queue), rss->queue,
> > >  rss->queue_num * sizeof(uint16_t));
> > >   flow->rss.queue_num = rss->queue_num;
> > > - memcpy(flow->key, rss->key, MLX5_RSS_HASH_KEY_LEN);
> > > + memcpy(flow->key, rss->key, rss->key_len);
> 
> In Mellanox devices 40B of a key is a must. If you intended to copy 0 bytes, 
> it
> is better to do it in other ways.

rss->key_len can be 0 or 40. Please consider this memcpy() generic code which 
fits both cases.

> 


Re: [dpdk-dev] [PATCH v3 2/6] mem: use address hint for mapping hugepages

2018-10-29 Thread Alejandro Lucero
Hi Dariusz,

On Mon, Oct 29, 2018 at 4:08 PM Dariusz Stojaczyk 
wrote:

> On Fri, Oct 5, 2018 at 2:47 PM Alejandro Lucero
>  wrote:
> >
> > Linux kernel uses a really high address as starting address for
> > serving mmaps calls. If there exist addressing limitations and
> > IOVA mode is VA, this starting address is likely too high for
> > those devices. However, it is possible to use a lower address in
> > the process virtual address space as with 64 bits there is a lot
> > of available space.
> >
> > This patch adds an address hint as starting address for 64 bits
> > systems and increments the hint for next invocations. If the mmap
> > call does not use the hint address, repeat the mmap call using
> > the hint address incremented by page size.
> >
> > Signed-off-by: Alejandro Lucero 
> > Reviewed-by: Anatoly Burakov 
> > ---
> >  lib/librte_eal/common/eal_common_memory.c | 34
> ++-
> >  1 file changed, 33 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_memory.c
> b/lib/librte_eal/common/eal_common_memory.c
> > index c482f0d..853c44c 100644
> > --- a/lib/librte_eal/common/eal_common_memory.c
> > +++ b/lib/librte_eal/common/eal_common_memory.c
> > @@ -37,6 +37,23 @@
> >  static void *next_baseaddr;
> >  static uint64_t system_page_sz;
> >
> > +#ifdef RTE_ARCH_64
> > +/*
> > + * Linux kernel uses a really high address as starting address for
> serving
> > + * mmaps calls. If there exists addressing limitations and IOVA mode is
> VA,
> > + * this starting address is likely too high for those devices. However,
> it
> > + * is possible to use a lower address in the process virtual address
> space
> > + * as with 64 bits there is a lot of available space.
> > + *
> > + * Current known limitations are 39 or 40 bits. Setting the starting
> address
> > + * at 4GB implies there are 508GB or 1020GB for mapping the available
> > + * hugepages. This is likely enough for most systems, although a device
> with
> > + * addressing limitations should call rte_eal_check_dma_mask for
> ensuring all
> > + * memory is within supported range.
> > + */
> > +static uint64_t baseaddr = 0x1;
> > +#endif
>
> This breaks running with ASAN unless a custom --base-virtaddr option
> is specified. The default base-virtaddr introduced by this patch falls
> into an area that's already reserved by ASAN.
>
> See here:
> https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm
> The only available address space starts at 0x10007fff8000, which
> unfortunately doesn't fit in 39 bits.
>
> Right now the very first eal_get_virtual_area() in EAL initialization
> is used with 4KB pagesize, meaning that DPDK will try to mmap at each
> 4KB-aligned offset all the way from 0x1 to 0x10007fff8000,
> which takes quite a long, long time.
>
> I'm not sure about the solution to this problem, but I verify that
> starting DPDK 18.11-rc1 with `--base-virtaddr 0x2000` works
> just fine under ASAN.
>
>
Do we have documentation about using Address Sanitizer?
I understand the goal, but what is the cost? Do you have numbers on the
performance impact?

Solving this is not trivial. I would say someone interested in ASAN but
using hardware with addressing limitations needs to choose.
Could it be possible to modify the virtual addresses ASAN uses by default? I
guess the shadow regions could be placed higher than the default ones.
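
As a side note, here is a minimal sketch of the hint-then-retry mapping the
commit message describes (illustrative names, not the actual
eal_get_virtual_area() code); it also shows why a rejected hint advances only
one page at a time, which is what makes the ASAN case above so slow:

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Try to map 'len' bytes at '*hint'. If the kernel places the mapping
 * somewhere else, drop it, bump the hint by one page and let the caller
 * retry. On success the hint is advanced past the new mapping. */
static void *
map_at_hint(void **hint, size_t len, size_t page_sz)
{
	void *addr = mmap(*hint, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (addr == MAP_FAILED)
		return NULL;
	if (addr != *hint) {
		munmap(addr, len);
		*hint = (void *)((uintptr_t)*hint + page_sz);
		return NULL; /* caller retries with the bumped hint */
	}
	*hint = (void *)((uintptr_t)addr + len);
	return addr;
}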




> D.
>
> >
> > 
>


Re: [dpdk-dev] [PATCH v6 2/6] net/mlx5: add VXLAN encap action to Direct Verbs

2018-10-29 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Shahaf Shuler
> Sent: Monday, October 29, 2018 12:03 PM
> To: Dekel Peled ; Yongseok Koh
> 
> Cc: dev@dpdk.org; Ori Kam 
> Subject: RE: [dpdk-dev] [PATCH v6 2/6] net/mlx5: add VXLAN encap action to
> Direct Verbs
> 
> Hi Dekel,
> 
> Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> > Subject: [dpdk-dev] [PATCH v6 2/6] net/mlx5: add VXLAN encap action to
> > Direct Verbs
> >
> > This patch implements the VXLAN encap action in DV flow for MLX5 PMD.
> >
> > Signed-off-by: Dekel Peled 
> > ---
> >  drivers/net/mlx5/mlx5_flow.h|   2 +
> >  drivers/net/mlx5/mlx5_flow_dv.c | 351
> > +++-
> >  2 files changed, 348 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 61299d6..6e92afe 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -92,6 +92,7 @@
> >  #define MLX5_FLOW_ACTION_DEC_TTL (1u << 19)  #define
> > MLX5_FLOW_ACTION_SET_MAC_SRC (1u << 20)  #define
> > MLX5_FLOW_ACTION_SET_MAC_DST (1u << 21)
> > +#define MLX5_FLOW_ACTION_VXLAN_ENCAP (1u << 22)
> >
> >  #define MLX5_FLOW_FATE_ACTIONS \
> > (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> > MLX5_FLOW_ACTION_RSS) @@ -181,6 +182,7 @@ struct mlx5_flow_dv {
> #ifdef
> > HAVE_IBV_FLOW_DV_SUPPORT
> > struct mlx5dv_flow_action_attr
> > actions[MLX5_DV_MAX_NUMBER_OF_ACTIONS];
> > /**< Action list. */
> > +   struct ibv_flow_action *verbs_action; /**< Verbs encap/decap
> 
> The ibv_flow_action is already part of a union inside
> mlx5dv_flow_action_attr, so why do you need it separately?
> I also see in the code below that you copy it from the action list to this
> specific field. Can you elaborate why?
> 

I added it for use when a flow rule is removed, for easy access to the action
to destroy.
I am now changing the code per 17.11 PR 876, adding a cache of encap/decap
actions, so this member is no longer needed and will be removed.

> > object.
> > +*/
> >  #endif
> > int actions_n; /**< number of actions. */  }; diff --git
> > a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index 8f729f4..14110c5 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -34,6 +34,12 @@
> >
> >  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> >
> > +/*
> > + * Encap buf length, max:
> > + *   Eth:14/VLAN:8/IPv6:40/TCP:36/TUNNEL:20/Eth:14
> 
> VLAN is 4B not 8B. which tunnel is for 20B?

This is code I reused from 17.11; I will verify it.
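
For reference, the sizes listed in that comment do add up to the defined limit:

  Eth(14) + VLAN(8) + IPv6(40) + TCP(36) + TUNNEL(20) + Eth(14) = 132

With a 4-byte VLAN tag, as noted above, the same sum would be 128.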

> 
> > + */
> > +#define MLX5_ENCAP_MAX_LEN 132
> > +
> >  /**
> >   * Validate META item.
> >   *
> > @@ -96,6 +102,300 @@
> >  }
> >
> >  /**
> > + * Validate the L2 encap action.
> > + * Used for VXLAN encap action.
> 
> No need for that. Later on you put more supported protocols. This is a
> generic function for L2 encap validation.

I will remove it.

> 
> > + *
> > + * @param[in] action_flags
> > + *   Holds the actions detected until now.
> > + * @param[in] action
> > + *   Pointer to the encap action.
> > + * @param[in] attr
> > + *   Pointer to flow attributes
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_action_l2_encap(uint64_t action_flags,
> > +const struct rte_flow_action *action,
> > +const struct rte_flow_attr *attr,
> > +struct rte_flow_error *error)
> > +{
> > +   if (!(action->conf))
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > action,
> > + "configuration cannot be null");
> > +   if (action_flags & MLX5_FLOW_ACTION_DROP)
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can't drop and encap in same
> > flow");
> > +   if (action_flags & MLX5_FLOW_ACTION_VXLAN_ENCAP)
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can only have a single encap"
> > + " action in a flow");
> > +   if (attr->ingress)
> > +   return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> > + NULL,
> > + "encap action not supported for "
> > + "ingress");
> > +   return 0;
> > +}
> > +
> > +/**
> > + * Get the size of specific rte_flow_item_type
> > + *
> > + * @param[in] item_type
> > + *   Tested rte_flow_item_type.
> > + *
> > + * @return
> > + *   sizeof struct item_type, 0 if void or irrelev

Re: [dpdk-dev] [PATCH v6 3/6] net/mlx5: add VXLAN decap action to Direct Verbs

2018-10-29 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Shahaf Shuler
> Sent: Monday, October 29, 2018 12:03 PM
> To: Dekel Peled ; Yongseok Koh
> 
> Cc: dev@dpdk.org; Ori Kam 
> Subject: RE: [dpdk-dev] [PATCH v6 3/6] net/mlx5: add VXLAN decap action to
> Direct Verbs
> 
> Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> > Subject: [dpdk-dev] [PATCH v6 3/6] net/mlx5: add VXLAN decap action to
> > Direct Verbs
> >
> > This patch implements the VXLAN decap action in DV flow for MLX5 PMD.
> >
> > Signed-off-by: Dekel Peled 
> > ---
> >  drivers/net/mlx5/mlx5_flow.h|   1 +
> >  drivers/net/mlx5/mlx5_flow_dv.c | 103
> > ++--
> >  2 files changed, 100 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 6e92afe..3795644 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -93,6 +93,7 @@
> >  #define MLX5_FLOW_ACTION_SET_MAC_SRC (1u << 20)  #define
> > MLX5_FLOW_ACTION_SET_MAC_DST (1u << 21)  #define
> > MLX5_FLOW_ACTION_VXLAN_ENCAP (1u << 22)
> > +#define MLX5_FLOW_ACTION_VXLAN_DECAP (1u << 23)
> >
> >  #define MLX5_FLOW_FATE_ACTIONS \
> > (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> > MLX5_FLOW_ACTION_RSS) diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> > b/drivers/net/mlx5/mlx5_flow_dv.c index 14110c5..e84a2b6 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -131,11 +131,12 @@
> > return rte_flow_error_set(error, EINVAL,
> >   RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> >   "can't drop and encap in same
> > flow");
> > -   if (action_flags & MLX5_FLOW_ACTION_VXLAN_ENCAP)
> > +   if (action_flags & (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> > +   MLX5_FLOW_ACTION_VXLAN_DECAP))
> > return rte_flow_error_set(error, EINVAL,
> >   RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > - "can only have a single encap"
> > - " action in a flow");
> > + "can only have a single encap or"
> > + " decap action in a flow");
> > if (attr->ingress)
> > return rte_flow_error_set(error, ENOTSUP,
> >
> > RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, @@ -146,6 +147,47 @@  }
> >
> >  /**
> > + * Validate the L2 decap action.
> > + * Used for VXLAN decap action.
> 
> Same like before

I will remove it.

> 
> > + *
> > + * @param[in] action_flags
> > + *   Holds the actions detected until now.
> > + * @param[in] action
> > + *   Pointer to the decap action.
> > + * @param[in] attr
> > + *   Pointer to flow attributes
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_action_l2_decap(uint64_t action_flags,
> > +   const struct rte_flow_action *action __rte_unused,
> > +   const struct rte_flow_attr *attr,
> > +   struct rte_flow_error *error)
> > +{
> > +   if (action_flags & MLX5_FLOW_ACTION_DROP)
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can't drop and decap in same
> > flow");
> > +   if (action_flags & (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> > +   MLX5_FLOW_ACTION_VXLAN_DECAP))
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can only have a single encap or"
> > + " decap action in a flow");
> > +   if (attr->egress)
> > +   return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> > + NULL,
> > + "decap action not supported for "
> > + "egress");
> > +   return 0;
> > +}
> > +
> > +/**
> >   * Get the size of specific rte_flow_item_type
> >   *
> >   * @param[in] item_type
> > @@ -396,6 +438,38 @@
> >  }
> >
> >  /**
> > + * Convert L2 decap action to DV specification.
> > + * Used for VXLAN decap action.
> 
> Same

I will remove it.

> 
> > + *
> > + * @param[in] dev
> > + *   Pointer to rte_eth_dev structure.
> > + * @param[in] action
> > + *   Pointer to action structure.
> > + * @param[out] error
> > + *   Pointer to the error structure.
> > + *
> > + * @return
> > + *   Pointer to action on success, NULL otherwise and rte_errno is set.
> > + */
> > +static struct ibv_flow_action *
> > +flow_dv_create_action_l2_decap(struct rte_eth_dev *dev,
> > +   const struct rte_flow_actio

Re: [dpdk-dev] [PATCH v6 6/6] net/mlx5: add raw data encap decap to Direct Verbs

2018-10-29 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Shahaf Shuler
> Sent: Monday, October 29, 2018 12:03 PM
> To: Dekel Peled ; Yongseok Koh
> 
> Cc: dev@dpdk.org; Ori Kam 
> Subject: RE: [dpdk-dev] [PATCH v6 6/6] net/mlx5: add raw data encap decap
> to Direct Verbs
> 
> Thursday, October 25, 2018 11:08 PM, Dekel Peled:
> > Subject: [dpdk-dev] [PATCH v6 6/6] net/mlx5: add raw data encap decap
> > to Direct Verbs
> >
> > This patch implements the encap and decap actions, using raw data, in
> > DV flow for MLX5 PMD.
> >
> > Signed-off-by: Dekel Peled 
> > ---
> >  drivers/net/mlx5/mlx5_flow.h|  12 ++-
> >  drivers/net/mlx5/mlx5_flow_dv.c | 227
> > ++--
> >  2 files changed, 224 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 2d73a05..92ba111 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -96,15 +96,19 @@
> >  #define MLX5_FLOW_ACTION_VXLAN_DECAP (1u << 23)  #define
> > MLX5_FLOW_ACTION_NVGRE_ENCAP (1u << 24)  #define
> > MLX5_FLOW_ACTION_NVGRE_DECAP (1u << 25)
> > +#define MLX5_FLOW_ACTION_RAW_ENCAP (1u << 26) #define
> > +MLX5_FLOW_ACTION_RAW_DECAP (1u << 27)
> >
> >  #define MLX5_FLOW_FATE_ACTIONS \
> > (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> > MLX5_FLOW_ACTION_RSS)
> >
> > -#define MLX5_FLOW_ENCAP_ACTIONS\
> > -   (MLX5_FLOW_ACTION_VXLAN_ENCAP |
> > MLX5_FLOW_ACTION_NVGRE_ENCAP)
> > +#define MLX5_FLOW_ENCAP_ACTIONS
> > (MLX5_FLOW_ACTION_VXLAN_ENCAP | \
> > +MLX5_FLOW_ACTION_NVGRE_ENCAP | \
> > +MLX5_FLOW_ACTION_RAW_ENCAP)
> >
> > -#define MLX5_FLOW_DECAP_ACTIONS\
> > -   (MLX5_FLOW_ACTION_VXLAN_DECAP |
> > MLX5_FLOW_ACTION_NVGRE_DECAP)
> > +#define MLX5_FLOW_DECAP_ACTIONS
> > (MLX5_FLOW_ACTION_VXLAN_DECAP | \
> > +MLX5_FLOW_ACTION_NVGRE_DECAP | \
> > +MLX5_FLOW_ACTION_RAW_DECAP)
> >
> >  #ifndef IPPROTO_MPLS
> >  #define IPPROTO_MPLS 137
> > diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> > b/drivers/net/mlx5/mlx5_flow_dv.c index d7d0c6b..ca75645 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -186,6 +186,82 @@
> >  }
> >
> >  /**
> > + * Validate the raw encap action.
> > + *
> > + * @param[in] action_flags
> > + *   Holds the actions detected until now.
> > + * @param[in] action
> > + *   Pointer to the encap action.
> > + * @param[in] attr
> > + *   Pointer to flow attributes
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_action_raw_encap(uint64_t action_flags,
> > + const struct rte_flow_action *action,
> > + const struct rte_flow_attr *attr,
> > + struct rte_flow_error *error)
> > +{
> > +   if (!(action->conf))
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > action,
> > + "configuration cannot be null");
> > +   if (action_flags & MLX5_FLOW_ACTION_DROP)
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can't drop and encap in same
> > flow");
> > +   if (action_flags & MLX5_FLOW_ENCAP_ACTIONS)
> > +   return rte_flow_error_set(error, EINVAL,
> > + RTE_FLOW_ERROR_TYPE_ACTION,
> > NULL,
> > + "can only have a single encap"
> > + " action in a flow");
> > +   /* encap without preceding decap is not supported for ingress */
> > +   if (attr->ingress && !(action_flags &
> > MLX5_FLOW_ACTION_RAW_DECAP))
> > +   return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> > + NULL,
> > + "encap action not supported for "
> > + "ingress");
> 
> The error message doesn't seem informative enough.

This is the case when a flow rule is created with the ingress attribute and
only a raw_encap action, without raw_decap.
I used the same error message as in the L2 encap case.
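
A slightly more specific wording could be, reusing the call already in the
patch (suggestion only):

	return rte_flow_error_set(error, ENOTSUP,
				  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, NULL,
				  "raw encap without a preceding raw decap"
				  " is not supported on ingress");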

> 
> > +   return 0;
> > +}
> > +
> > +/**
> > + * Validate the raw decap action.
> > + *
> > + * @param[in] action_flags
> > + *   Holds the actions detected until now.
> > + * @param[out] error
> > + *   Pointer to error structure.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +flow_dv_validate_action_raw_decap(uint64_t action_flags,
> > + struct rte

[dpdk-dev] [PATCH] net/mlx5: fix missing memory deallocation

2018-10-29 Thread Dekel Peled
Add freeing of allocated memory before exiting on mlx5dv error.

Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items")
Cc: or...@mellanox.com

Signed-off-by: Dekel Peled 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 8f729f4..0e1f715 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1203,10 +1203,12 @@
dv_attr.flags |= IBV_FLOW_ATTR_FLAGS_EGRESS;
cache_matcher->matcher_object =
mlx5_glue->dv_create_flow_matcher(priv->ctx, &dv_attr);
-   if (!cache_matcher->matcher_object)
+   if (!cache_matcher->matcher_object) {
+   rte_free(cache_matcher);
return rte_flow_error_set(error, ENOMEM,
  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
  NULL, "cannot create matcher");
+   }
rte_atomic32_inc(&cache_matcher->refcnt);
LIST_INSERT_HEAD(&priv->matchers, cache_matcher, next);
dev_flow->dv.matcher = cache_matcher;
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] compress/qat: fix build issue with clang 7.0.0

2018-10-29 Thread Trahe, Fiona
Hi Jerin,

> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Saturday, October 27, 2018 9:48 AM
> To: dev@dpdk.org; Trahe, Fiona ; De Lara Guarch, Pablo
> ; Gupta, Ashish 
> Cc: tho...@monjalon.net; Jacob, Jerin 
> Subject: [dpdk-dev] [PATCH] compress/qat: fix build issue with clang 7.0.0
> 
> QAT_NUM_BUFS_IN_IM_SGL defined as 1 the code access beyond
> the first element.
> 
> error log:
> /export/dpdk.org/drivers/compress/qat/qat_comp_pmd.c:214:3:
> error: array index 1 is past the end of the array
> (which contains 1 element) [-Werror,-Warray-bounds]
> sgl->buffers[1].addr = mz_start_phys + offset_of_flat_buffs +
> 
> Fixes: a124830a6f00 ("compress/qat: enable dynamic huffman encoding")
> 
> Signed-off-by: Jerin Jacob 
> ---
>  drivers/compress/qat/qat_comp.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/compress/qat/qat_comp.h b/drivers/compress/qat/qat_comp.h
> index 99a4462eb..24d0d9cb4 100644
> --- a/drivers/compress/qat/qat_comp.h
> +++ b/drivers/compress/qat/qat_comp.h
> @@ -17,7 +17,7 @@
> 
>  #define QAT_64_BYTE_ALIGN_MASK (~0x3f)
>  #define QAT_64_BYTE_ALIGN (64)
> -#define QAT_NUM_BUFS_IN_IM_SGL 1
> +#define QAT_NUM_BUFS_IN_IM_SGL 2
[Fiona] Thanks for this. I'm surprised gcc didn't give a warning.
However, setting it to 2 causes issues on some platforms, which
was why we reverted to 1 in an earlier iteration.
So, NACK.
I'll send a different patch to fix this.
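
For anyone following along, here is a standalone illustration of the pattern
clang 7 now rejects (simplified, not the real qat_comp structures):

#include <stdint.h>

struct im_sgl {
	uint32_t num_bufs;
	struct { uint64_t addr; uint32_t len; } buffers[1]; /* one element declared */
};

static void
fill(struct im_sgl *sgl, uint64_t base)
{
	sgl->buffers[0].addr = base;
	/* -Warray-bounds: index 1 is past the end of the one-element array,
	 * even if extra space was allocated behind the struct. */
	sgl->buffers[1].addr = base + 64;
}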

> 
>  #define ERR_CODE_QAT_COMP_WRONG_FW -99
> 
> --
> 2.19.1



Re: [dpdk-dev] [PATCH] net/mlx5: fix Direct Verbs getting item and action flags

2018-10-29 Thread Yongseok Koh
> On Oct 28, 2018, at 11:03 PM, Ori Kam  wrote:
> 
> Why should DV prepare function return the list of actions?
> 
> The only reason I can think of, is if you want to remove the for loop in
> dv_translate. And then in flow_dv_create_action change the switch case
> to ifs.

Then, I should ask you a question. Why did you put actions/layers in the
rte_flow struct in the first place? The _prepare() API even takes pointers to
item_flags and actions_flags in order to fill in the structs for nothing.
Verbs and TCF fill in the structs but DV doesn't. I think you just wanted to
use flow->actions for _apply() and it is filled in _translate(). But, in the
case of TCF, _apply() doesn't need to know the action_flags and its
_translate() doesn't even fill in the struct, while its _prepare() does in
vain. Where's the consistency? What is the definition of the APIs?

If people start to use the fields (because they are there), it won't work as
expected. Slava wants to use the flags in _translate() for his vxlan patch.

So, I think _prepare is the right place to fill in the flags.

Let me know how you want to change it.


Thanks,
Yongseok

> 
>> -Original Message-
>> From: Yongseok Koh
>> Sent: Sunday, October 28, 2018 7:35 PM
>> To: Shahaf Shuler 
>> Cc: dev@dpdk.org; Yongseok Koh ; Ori Kam
>> 
>> Subject: [PATCH] net/mlx5: fix Direct Verbs getting item and action flags
>> 
>> Flow driver has to provide detected item flags and action flags via
>> flow_drv_prepare(). DV doesn't return flags at all.
>> 
>> Fixes: 865a0c15672c ("net/mlx5: add Direct Verbs prepare function")
>> Cc: or...@mellanox.com
>> 
>> Signed-off-by: Yongseok Koh 
>> ---
>> drivers/net/mlx5/mlx5_flow_dv.c | 115
>> 
>> 1 file changed, 106 insertions(+), 9 deletions(-)
>> 
>> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
>> b/drivers/net/mlx5/mlx5_flow_dv.c
>> index 8f729f44f8..67c133c2fb 100644
>> --- a/drivers/net/mlx5/mlx5_flow_dv.c
>> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
>> @@ -354,6 +354,103 @@ flow_dv_validate(struct rte_eth_dev *dev, const
>> struct rte_flow_attr *attr,
>> }
>> 
>> /**
>> + * Extract item flags and action flags.
>> + *
>> + * @param[in] items
>> + *   Pointer to the list of items.
>> + * @param[in] actions
>> + *   Pointer to the list of actions.
>> + * @param[out] item_flags
>> + *   Pointer to bit mask of all items detected.
>> + * @param[out] action_flags
>> + *   Pointer to bit mask of all actions detected.
>> + */
>> +static void
>> +flow_dv_get_item_action_flags(const struct rte_flow_item items[],
>> +  const struct rte_flow_action actions[],
>> +  uint64_t *item_flags, uint64_t *action_flags)
>> +{
>> +uint64_t detected_items = 0;
>> +uint64_t detected_actions = 0;
>> +int tunnel;
>> +
>> +for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) {
>> +tunnel = !!(detected_items & MLX5_FLOW_LAYER_TUNNEL);
>> +switch (items->type) {
>> +case RTE_FLOW_ITEM_TYPE_ETH:
>> +detected_items |= tunnel ?
>> MLX5_FLOW_LAYER_INNER_L2 :
>> +
>> MLX5_FLOW_LAYER_OUTER_L2;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_VLAN:
>> +detected_items |= tunnel ?
>> MLX5_FLOW_LAYER_INNER_VLAN :
>> +
>> MLX5_FLOW_LAYER_OUTER_VLAN;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_IPV4:
>> +detected_items |= tunnel ?
>> +  MLX5_FLOW_LAYER_INNER_L3_IPV4
>> :
>> +
>> MLX5_FLOW_LAYER_OUTER_L3_IPV4;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_IPV6:
>> +detected_items |= tunnel ?
>> +  MLX5_FLOW_LAYER_INNER_L3_IPV6
>> :
>> +
>> MLX5_FLOW_LAYER_OUTER_L3_IPV6;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_TCP:
>> +detected_items |= tunnel ?
>> +  MLX5_FLOW_LAYER_INNER_L4_TCP :
>> +
>> MLX5_FLOW_LAYER_OUTER_L4_TCP;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_UDP:
>> +detected_items |= tunnel ?
>> +  MLX5_FLOW_LAYER_INNER_L4_UDP
>> :
>> +
>> MLX5_FLOW_LAYER_OUTER_L4_UDP;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_GRE:
>> +case RTE_FLOW_ITEM_TYPE_NVGRE:
>> +detected_items |= MLX5_FLOW_LAYER_GRE;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_VXLAN:
>> +detected_items |= MLX5_FLOW_LAYER_VXLAN;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
>> +detected_items |= MLX5_FLOW_LAYER_VXLAN_GPE;
>> +break;
>> +case RTE_FLOW_ITEM_TYPE_META:
>> +detected_items |= MLX5_FLOW_ITEM_METADATA;
>> +break;
>> +default:
>> +  

Re: [dpdk-dev] [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation routine

2018-10-29 Thread Yongseok Koh
On Mon, Oct 29, 2018 at 02:33:03AM -0700, Slava Ovsiienko wrote:
> > -Original Message-
> > From: Yongseok Koh
> > Sent: Saturday, October 27, 2018 0:57
> > To: Slava Ovsiienko 
> > Cc: Shahaf Shuler ; dev@dpdk.org
> > Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation
> > routine
> > 
> > On Fri, Oct 26, 2018 at 01:39:38AM -0700, Slava Ovsiienko wrote:
> > > > -Original Message-
> > > > From: Yongseok Koh
> > > > Sent: Friday, October 26, 2018 6:07
> > > > To: Slava Ovsiienko 
> > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow validation
> > > > routine
> > > >
> > > > On Thu, Oct 25, 2018 at 06:53:11AM -0700, Slava Ovsiienko wrote:
> > > > > > -Original Message-
> > > > > > From: Yongseok Koh
> > > > > > Sent: Tuesday, October 23, 2018 13:05
> > > > > > To: Slava Ovsiienko 
> > > > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > > > Subject: Re: [PATCH v2 2/7] net/mlx5: e-switch VXLAN flow
> > > > > > validation routine
> > > > > >
> > > > > > On Mon, Oct 15, 2018 at 02:13:30PM +, Viacheslav Ovsiienko
> > wrote:
> > > > [...]
> > > > > > > @@ -1114,7 +1733,6 @@ struct pedit_parser {
> > > > > > >  error);
> > > > > > >   if (ret < 0)
> > > > > > >   return ret;
> > > > > > > - item_flags |=
> > MLX5_FLOW_LAYER_OUTER_L3_IPV4;
> > > > > > >   mask.ipv4 = flow_tcf_item_mask
> > > > > > >   (items, &rte_flow_item_ipv4_mask,
> > > > > > >&flow_tcf_mask_supported.ipv4,
> > @@ -1135,13 +1753,22 @@
> > > > > > > struct pedit_parser {
> > > > > > >   next_protocol =
> > > > > > >   ((const struct
> > rte_flow_item_ipv4 *)
> > > > > > >(items->spec))-
> > >hdr.next_proto_id;
> > > > > > > + if (item_flags &
> > > > > > MLX5_FLOW_LAYER_OUTER_L3_IPV4) {
> > > > > > > + /*
> > > > > > > +  * Multiple outer items are not
> > allowed as
> > > > > > > +  * tunnel parameters, will raise an
> > error later.
> > > > > > > +  */
> > > > > > > + ipv4 = NULL;
> > > > > >
> > > > > > Can't it be inner then?
> > > > > AFAIK,  no for tc rules, we can not specify multiple levels (inner
> > > > > + outer) for
> > > > them.
> > > > > There is just no TCA_FLOWER_KEY_xxx attributes  for specifying
> > > > > inner
> > > > items
> > > > > to match by flower.
> > > >
> > > > When I briefly read the kernel code, I thought TCA_FLOWER_KEY_* are
> > > > for inner header before decap. I mean TCA_FLOWER_KEY_IPV4_SRC is
> > for
> > > > inner L3 and TCA_FLOWER_KEY_ENC_IPV4_SRC is for outer tunnel
> > header.
> > > > Please do some experiments with tc-flower command.
> > >
> > > Hm. Interesting. I will check.
> > >
> > > > > It is quite unclear comment, not the best one, sorry. I did not
> > > > > like it too, just forgot to rewrite.
> > > > >
> > > > > ipv4, ipv6 , udp variables gather the matching items during the
> > > > > item list
> > > > scanning,
> > > > > later variables are used for VXLAN decap action validation only.
> > > > > So, the
> > > > "outer"
> > > > > means that ipv4 variable contains the VXLAN decap outer addresses,
> > > > > and should be NULL-ed if multiple items are found in the items list.
> > > > >
> > > > > But we can generate an error here if we have valid action_flags
> > > > > (gathered by prepare function) and VXLAN decap is set. Raising an
> > > > > error looks more relevant and clear.
> > > >
> > > > You can't use flags at this point. It is validate() so prepare()
> > > > might not be preceded.
> > > >
> > > > > >   flow create 1 ingress transfer
> > > > > > pattern eth src is 66:77:88:99:aa:bb
> > > > > >   dst is 00:11:22:33:44:55 / ipv4 src is 2.2.2.2 dst is 1.1.1.1 
> > > > > > /
> > > > > >   udp src is 4789 dst is 4242 / vxlan vni is 0x112233 /
> > > > > >   eth / ipv6 / tcp dst is 42 / end
> > > > > > actions vxlan_decap / port_id id 2 / end
> > > > > >
> > > > > > Is this flow supported by linux tcf? I took this example from
> > > > > > Adrien's
> > > > patch -
> > > > > > "[8/8] net/mlx5: add VXLAN decap support to switch flow rules".
> > > > > > If so,
> > > > isn't it
> > > > > > possible to have inner L3 layer (MLX5_FLOW_LAYER_INNER_*)? If
> > > > > > not,
> > > > you
> > > > > > should return error in this case. I don't see any code to check
> > > > > > redundant outer items.
> > > > > > Did I miss something?
> > > > >
> > > > > Interesting, besides rule has correct syntax, I'm not sure whether
> > > > > it can be
> > > > applied w/o errors.
> > > >
> > > > Please try. You owns this patchset. However, you just can prohibit
> > > > such flows (tunneled item) and come up with follow-up patc

Re: [dpdk-dev] [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices management

2018-10-29 Thread Yongseok Koh
On Mon, Oct 29, 2018 at 04:53:34AM -0700, Slava Ovsiienko wrote:
> > -Original Message-
> > From: Yongseok Koh
> > Sent: Saturday, October 27, 2018 1:43
> > To: Slava Ovsiienko 
> > Cc: Shahaf Shuler ; dev@dpdk.org
> > Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices
> > management
> > 
> > On Fri, Oct 26, 2018 at 02:35:24AM -0700, Slava Ovsiienko wrote:
> > > > -Original Message-
> > > > From: Yongseok Koh
> > > > Sent: Friday, October 26, 2018 9:26
> > > > To: Slava Ovsiienko 
> > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices
> > > > management
> > > >
> > > > On Thu, Oct 25, 2018 at 01:21:12PM -0700, Slava Ovsiienko wrote:
> > > > > > -Original Message-
> > > > > > From: Yongseok Koh
> > > > > > Sent: Thursday, October 25, 2018 3:28
> > > > > > To: Slava Ovsiienko 
> > > > > > Cc: Shahaf Shuler ; dev@dpdk.org
> > > > > > Subject: Re: [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel
> > > > > > devices management
> > > > > >
> > > > > > On Mon, Oct 15, 2018 at 02:13:33PM +, Viacheslav Ovsiienko
> > wrote:
> > > > > > > VXLAN interfaces are dynamically created for each local UDP
> > > > > > > port of outer networks and then used as targets for TC
> > > > > > > "flower" filters in order to perform encapsulation. These
> > > > > > > VXLAN interfaces are system-wide, the only one device with
> > > > > > > given UDP port can exist in the system (the attempt of
> > > > > > > creating another device with the same UDP local port returns
> > > > > > > EEXIST), so PMD should support the shared device instances
> > > > > > > database for PMD instances. These VXLAN implicitly created devices
> > are called VTEPs (Virtual Tunnel End Points).
> > > > > > >
> > > > > > > Creation of the VTEP occurs at the moment of rule applying.
> > > > > > > The link is set up, root ingress qdisc is also initialized.
> > > > > > >
> > > > > > > Encapsulation VTEPs are created on per port basis, the single
> > > > > > > VTEP is attached to the outer interface and is shared for all
> > > > > > > encapsulation rules on this interface. The source UDP port is
> > > > > > > automatically selected in range 3-6.
> > > > > > >
> > > > > > > For decapsulaton one VTEP is created per every unique UDP
> > > > > > > local port to accept tunnel traffic. The name of created VTEP
> > > > > > > consists of prefix "vmlx_" and the number of UDP port in
> > > > > > > decimal digits without leading zeros (vmlx_4789). The VTEP can
> > > > > > > be preliminary created in the system before the launching
> > > > > > > application, it allows to share   UDP ports between primary
> > > > > > > and secondary processes.
> > > > > > >
> > > > > > > Suggested-by: Adrien Mazarguil 
> > > > > > > Signed-off-by: Viacheslav Ovsiienko 
> > > > > > > ---
> > > > > > >  drivers/net/mlx5/mlx5_flow_tcf.c | 503
> > > > > > > ++-
> > > > > > >  1 file changed, 499 insertions(+), 4 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > > b/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > > index d6840d5..efa9c3b 100644
> > > > > > > --- a/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > > +++ b/drivers/net/mlx5/mlx5_flow_tcf.c
> > > > > > > @@ -3443,6 +3443,432 @@ struct pedit_parser {
> > > > > > >   return -err;
> > > > > > >  }
> > > > > > >
> > > > > > > +/* VTEP device list is shared between PMD port instances. */
> > > > > > > +static LIST_HEAD(, mlx5_flow_tcf_vtep)
> > > > > > > + vtep_list_vxlan = LIST_HEAD_INITIALIZER();
> > static
> > > > > > pthread_mutex_t
> > > > > > > +vtep_list_mutex = PTHREAD_MUTEX_INITIALIZER;
> > > > > >
> > > > > > What's the reason for choosing pthread_mutex instead of
> > rte_*_lock?
> > > > >
> > > > > The sharing this database for secondary processes?
> > > >
> > > > The static variable isn't shared with sec proc. But you can leave it as 
> > > > is.
> > >
> > > Yes. The sharing just was assumed, not implemented yet.
> > >
> > > >
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Deletes VTEP network device.
> > > > > > > + *
> > > > > > > + * @param[in] tcf
> > > > > > > + *   Context object initialized by 
> > > > > > > mlx5_flow_tcf_context_create().
> > > > > > > + * @param[in] vtep
> > > > > > > + *   Object represinting the network device to delete. Memory
> > > > > > > + *   allocated for this object is freed by routine.
> > > > > > > + */
> > > > > > > +static void
> > > > > > > +flow_tcf_delete_iface(struct mlx6_flow_tcf_context *tcf,
> > > > > > > +   struct mlx5_flow_tcf_vtep *vtep) {
> > > > > > > + struct nlmsghdr *nlh;
> > > > > > > + struct ifinfomsg *ifm;
> > > > > > > + alignas(struct nlmsghdr)
> > > > > > > + uint8_t buf[mnl_nlmsg_size(MNL_ALIGN(sizeof(*ifm))) + 8];
> > > > > > > + int ret;
> > > > > > > +
> > > > > > > + assert(!vtep->refcnt);
> > > > > > > + if (vtep->created && vtep->ifindex) {
> > 

Re: [dpdk-dev] [PATCH] net/mlx5: fix missing memory deallocation

2018-10-29 Thread Yongseok Koh


> On Oct 29, 2018, at 9:09 AM, Dekel Peled  wrote:
> 
> Add freeing of allocated memory before exiting on mlx5dv error.
> 
> Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items")
> Cc: or...@mellanox.com
> 
> Signed-off-by: Dekel Peled 
> ---

Nice catch!

Acked-by: Yongseok Koh 
 
Thanks

> drivers/net/mlx5/mlx5_flow_dv.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index 8f729f4..0e1f715 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -1203,10 +1203,12 @@
>   dv_attr.flags |= IBV_FLOW_ATTR_FLAGS_EGRESS;
>   cache_matcher->matcher_object =
>   mlx5_glue->dv_create_flow_matcher(priv->ctx, &dv_attr);
> - if (!cache_matcher->matcher_object)
> + if (!cache_matcher->matcher_object) {
> + rte_free(cache_matcher);
>   return rte_flow_error_set(error, ENOMEM,
> RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> NULL, "cannot create matcher");
> + }
>   rte_atomic32_inc(&cache_matcher->refcnt);
>   LIST_INSERT_HEAD(&priv->matchers, cache_matcher, next);
>   dev_flow->dv.matcher = cache_matcher;
> -- 
> 1.8.3.1
> 



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Yongseok Koh


> On Oct 29, 2018, at 7:18 AM, Thomas Monjalon  wrote:
> 
> 29/10/2018 14:40, Alejandro Lucero:
>> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A  wrote:
>>> *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
>>> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
>>> wrote:
>>> 
>>> 29/10/2018 12:39, Alejandro Lucero:
 I got a patch that solves a bug when calling rte_eal_dma_mask using the
 mask instead of the maskbits. However, this does not solves the
>>> deadlock.
>>> 
>>> The deadlock is a bigger concern I think.
>>> 
>>> I think once the call to rte_eal_check_dma_mask uses the maskbits instead
>>> of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
>>> 
>>> Yao, can you try with the attached patch?
>>> 
>>> Hi, Lucero
>>> 
>>> This patch can fix the issue at my side. Thanks a lot
>>> for you quick action.
>> 
>> Great!
>> 
>> I will send an official patch with the changes.
> 
> Please, do not forget my other request to better comment functions.

Alejandro,

This patchset has been merged to stable/17.11 per your request for the last
release.
You must send a fix to stable/17.11 as well if you think the same issue exists
there.

Thanks,
Yongseok

>> I have to say that I tested the patchset, but I think it was where
>> legacy_mem was still there and therefore dynamic memory allocation code not
>> used during memory initialization.
>> 
>> There is something that concerns me though. Using
>> rte_memseg_walk_thread_unsafe could be a problem under some situations
>> although those situations being unlikely.
>> 
>> Usually, calling rte_eal_check_dma_mask happens during initialization. Then
>> it is safe to use the unsafe function for walking memsegs, but with device
>> hotplug and dynamic memory allocation, there exists a potential race
>> condition when the primary process is allocating more memory and
>> concurrently a device is hotplugged and a secondary process does the device
>> initialization. By now, this is just a problem with the NFP, and the
>> potential race condition window really unlikely, but I will work on this
>> asap.
> 
> Yes, this is what concerns me.
> You can add a comment explaining the unsafe which is not handled.
> 
> 
 Interestingly, the problem looks like a compiler one. Calling
 rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
>>> but if
 you modify the call like this:
 
 -   if (rte_memseg_walk(check_iova, &mask))
 +   if (!rte_memseg_walk(check_iova, &mask))
 
 it works, although the value returned to the invoker changes, of course.
 But the point here is it should be the same behaviour when calling
 rte_memseg_walk than before and it is not.
>>> 
>>> Anyway, the coding style requires to save the return value in a variable,
>>> instead of nesting the call in an "if" condition.
>>> And the "if" check should be explicitly != 0 because it is not a real
>>> boolean.
>>> 
>>> PS: please do not top post and avoid HTML emails, thanks
>>> 
>>> 
>> 
> 
> 
> 
> 
> 



Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Alejandro Lucero
On Mon, Oct 29, 2018 at 6:54 PM Yongseok Koh  wrote:

>
> > On Oct 29, 2018, at 7:18 AM, Thomas Monjalon 
> wrote:
> >
> > 29/10/2018 14:40, Alejandro Lucero:
> >> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A  wrote:
> >>> *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
> >>> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon 
> >>> wrote:
> >>>
> >>> 29/10/2018 12:39, Alejandro Lucero:
>  I got a patch that solves a bug when calling rte_eal_dma_mask using
> the
>  mask instead of the maskbits. However, this does not solves the
> >>> deadlock.
> >>>
> >>> The deadlock is a bigger concern I think.
> >>>
> >>> I think once the call to rte_eal_check_dma_mask uses the maskbits
> instead
> >>> of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
> >>>
> >>> Yao, can you try with the attached patch?
> >>>
> >>> Hi, Lucero
> >>>
> >>> This patch can fix the issue at my side. Thanks a lot
> >>> for you quick action.
> >>
> >> Great!
> >>
> >> I will send an official patch with the changes.
> >
> > Please, do not forget my other request to better comment functions.
>
> Alejandro,
>
> This patchset has been merged to stable/17.11 per your request for the
> last release.
> You must send a fix to stable/17.11 as well if you think the same issue
> exists there.
>
>
The patchset for 17.11 was much simpler. There have been a lot of
changes to the memory code since 17.11, and this problem should not be
present in stable 17.11.

That said, if there are any reports of a problem with this
patchset in 17.11, I will work on it as a priority.

Thanks.


> Thanks,
> Yongseok
>
> >> I have to say that I tested the patchset, but I think it was where
> >> legacy_mem was still there and therefore dynamic memory allocation code
> not
> >> used during memory initialization.
> >>
> >> There is something that concerns me though. Using
> >> rte_memseg_walk_thread_unsafe could be a problem under some situations
> >> although those situations being unlikely.
> >>
> >> Usually, calling rte_eal_check_dma_mask happens during initialization.
> Then
> >> it is safe to use the unsafe function for walking memsegs, but with
> device
> >> hotplug and dynamic memory allocation, there exists a potential race
> >> condition when the primary process is allocating more memory and
> >> concurrently a device is hotplugged and a secondary process does the
> device
> >> initialization. By now, this is just a problem with the NFP, and the
> >> potential race condition window really unlikely, but I will work on this
> >> asap.
> >
> > Yes, this is what concerns me.
> > You can add a comment explaining the unsafe which is not handled.
> >
> >
>  Interestingly, the problem looks like a compiler one. Calling
>  rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
> >>> but if
>  you modify the call like this:
> 
>  -   if (rte_memseg_walk(check_iova, &mask))
>  +   if (!rte_memseg_walk(check_iova, &mask))
> 
>  it works, although the value returned to the invoker changes, of
> course.
>  But the point here is it should be the same behaviour when calling
>  rte_memseg_walk than before and it is not.
> >>>
> >>> Anyway, the coding style requires to save the return value in a
> variable,
> >>> instead of nesting the call in an "if" condition.
> >>> And the "if" check should be explicitly != 0 because it is not a real
> >>> boolean.
> >>>
> >>> PS: please do not top post and avoid HTML emails, thanks
> >>>
> >>>
> >>
> >
> >
> >
> >
> >
>
>


Re: [dpdk-dev] PDF guides broken

2018-10-29 Thread Dan Gora
It looks like it's choking on a figure in the mvpp2.rst file:

guides/nics/mvpp2.rst:.. figure:: img/mvpp2_tm.svg

Reverting commit 0ba610ca1d178cbeedba6e033ee399dfe773801e fixes it.

This fixes it too:

diff --git a/doc/guides/nics/mvpp2.rst b/doc/guides/nics/mvpp2.rst
index 59fa0e10d..10303a1c1 100644
--- a/doc/guides/nics/mvpp2.rst
+++ b/doc/guides/nics/mvpp2.rst
@@ -644,7 +644,7 @@ Node which has a parent is called a leaf whereas
node without
parent is called a non-leaf (root).
MVPP2 PMD supports two level hierarchy where level 0 represents ports
and level 1 represents tx queues of a given port.

-.. figure:: img/mvpp2_tm.svg
+.. figure:: img/mvpp2_tm.*



On Mon, Oct 29, 2018 at 12:38 AM Thomas Monjalon  wrote:
>
> Hi,
>
> The command make-guides-pdf fails currently (18.11-rc1).
> There is probably a subtle syntax error in one of the RST file.
>
> Please, who wants to look at the issue and fix it?
>
> Thank you
>
>


[dpdk-dev] [PATCH] net/mvpp2: fix building pdf documentation

2018-10-29 Thread Dan Gora
Don't use the .svg extension in .. figure:: references. PDF versions of
the documents use .png images generated from the .svg images.

Fixes: 0ba610ca1d17 ("net/mvpp2: document MTR and TM usage")
Cc: nsams...@marvell.com

Signed-off-by: Dan Gora 
---
 doc/guides/nics/mvpp2.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/nics/mvpp2.rst b/doc/guides/nics/mvpp2.rst
index 59fa0e10d..10303a1c1 100644
--- a/doc/guides/nics/mvpp2.rst
+++ b/doc/guides/nics/mvpp2.rst
@@ -644,7 +644,7 @@ Node which has a parent is called a leaf whereas node 
without
 parent is called a non-leaf (root).
 MVPP2 PMD supports two level hierarchy where level 0 represents ports and 
level 1 represents tx queues of a given port.
 
-.. figure:: img/mvpp2_tm.svg
+.. figure:: img/mvpp2_tm.*
 
 Nodes hold following types of settings:
 
-- 
2.19.0



[dpdk-dev] [PATCH] net/mlx5: make vectorized Tx threshold configurable

2018-10-29 Thread Yongseok Koh
Add the txqs_min_vec parameter to configure the maximum number of Tx queues
for which vectorized Tx is enabled. Its default value is set according to the
architecture and device type.

Signed-off-by: Yongseok Koh 
---
 doc/guides/nics/mlx5.rst | 16 +++-
 drivers/net/mlx5/mlx5.c  | 16 
 drivers/net/mlx5/mlx5.h  |  1 +
 drivers/net/mlx5/mlx5_defs.h |  2 ++
 drivers/net/mlx5/mlx5_rxtx_vec.c |  2 +-
 5 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 1dc32829ff..9ed015c377 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -338,6 +338,20 @@ Run-time configuration
 
 - Set to 8 by default.
 
+- ``txqs_min_vec`` parameter [int]
+
+  Enable vectorized Tx only when the number of TX queues is less than or
+  equal to this value. Effective only when ``tx_vec_en`` is enabled.
+
+  On ConnectX-5:
+
+- Set to 8 by default on ARMv8.
+- Set to 4 by default otherwise.
+
+  On Bluefield
+
+- Set to 16 by default.
+
 - ``txq_mpw_en`` parameter [int]
 
   A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and
@@ -383,7 +397,7 @@ Run-time configuration
 - ``tx_vec_en`` parameter [int]
 
   A nonzero value enables Tx vector on ConnectX-5 and Bluefield NICs if the 
number of
-  global Tx queues on the port is lesser than MLX5_VPMD_MIN_TXQS.
+  global Tx queues on the port is less than ``txqs_min_vec``.
 
   This option cannot be used with certain offloads such as 
``DEV_TX_OFFLOAD_TCP_TSO,
   DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, 
DEV_TX_OFFLOAD_VLAN_INSERT``.
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 6fa50ba1b1..544431e68e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -75,6 +75,12 @@
  */
 #define MLX5_TXQS_MIN_INLINE "txqs_min_inline"
 
+/*
+ * Device parameter to configure the number of TX queues threshold for
+ * enabling vectorized Tx.
+ */
+#define MLX5_TXQS_MIN_VEC "txqs_min_vec"
+
 /* Device parameter to enable multi-packet send WQEs. */
 #define MLX5_TXQ_MPW_EN "txq_mpw_en"
 
@@ -496,6 +502,8 @@ mlx5_args_check(const char *key, const char *val, void 
*opaque)
config->txq_inline = tmp;
} else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) {
config->txqs_inline = tmp;
+   } else if (strcmp(MLX5_TXQS_MIN_VEC, key) == 0) {
+   config->txqs_vec = tmp;
} else if (strcmp(MLX5_TXQ_MPW_EN, key) == 0) {
config->mps = !!tmp;
} else if (strcmp(MLX5_TXQ_MPW_HDR_DSEG_EN, key) == 0) {
@@ -543,6 +551,7 @@ mlx5_args(struct mlx5_dev_config *config, struct 
rte_devargs *devargs)
MLX5_RXQS_MIN_MPRQ,
MLX5_TXQ_INLINE,
MLX5_TXQS_MIN_INLINE,
+   MLX5_TXQS_MIN_VEC,
MLX5_TXQ_MPW_EN,
MLX5_TXQ_MPW_HDR_DSEG_EN,
MLX5_TXQ_MAX_INLINE_LEN,
@@ -1443,6 +1452,8 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
};
/* Device speicific configuration. */
switch (pci_dev->id.device_id) {
+   case PCI_DEVICE_ID_MELLANOX_CONNECTX5BF:
+   dev_config.txqs_vec = MLX5_VPMD_MIN_TXQS_BLUEFIELD;
case PCI_DEVICE_ID_MELLANOX_CONNECTX4VF:
case PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF:
case PCI_DEVICE_ID_MELLANOX_CONNECTX5VF:
@@ -1450,6 +1461,11 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
dev_config.vf = 1;
break;
default:
+#if defined(RTE_ARCH_ARM64)
+   dev_config.txqs_vec = MLX5_VPMD_MIN_TXQS_ARM64;
+#else
+   dev_config.txqs_vec = MLX5_VPMD_MIN_TXQS;
+#endif
break;
}
for (i = 0; i != n; ++i) {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 24a3415c8d..0b4418b80b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -140,6 +140,7 @@ struct mlx5_dev_config {
unsigned int ind_table_max_size; /* Maximum indirection table size. */
int txq_inline; /* Maximum packet size for inlining. */
int txqs_inline; /* Queue number threshold for inlining. */
+   int txqs_vec; /* Queue number threshold for vectorized Tx. */
int inline_max_packet_sz; /* Max packet size for inlining. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index f2a1679511..12b78099e1 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -62,6 +62,8 @@
 
 /* Default minimum number of Tx queues for vectorized Tx. */
 #define MLX5_VPMD_MIN_TXQS 4
+#define MLX5_VPMD_MIN_TXQS_ARM64 8
+#define MLX5_VPMD_MIN_TXQS_BLUEFIELD 16
 
 /* Threshold of buffer replenishment for vectorized Rx. */
 #define MLX5_VPMD_RXQ_RPLNSH_THRESH(n) \
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxtx_vec.c
index 1453f4ff63..340292addf 100644
--- a/drivers/net/mlx

Re: [dpdk-dev] [PATCH v2] test/pmd_ring: release ring resources after test

2018-10-29 Thread Phil Yang (Arm Technology China)
Hi,

The commit 6e20a08 ("test/pmd_ring: restructure and cleanup") has already fixed 
this issue.
Abandon this one.

Thanks,
Phil Yang

> -Original Message-
> From: dev  On Behalf Of Phil Yang
> Sent: Friday, October 19, 2018 7:12 PM
> To: dev@dpdk.org
> Cc: nd 
> Subject: [dpdk-dev] [PATCH v2] test/pmd_ring: release ring resources after 
> test
> 
> Need to release the port and the ring resources after the test. Otherwise, it
> will cause a memory allocation failure when re-entering the test.
> 
> Fixes: 4ea3801 ("app/test: fix ring unit test")
> 
> Signed-off-by: Phil Yang 
> Reviewed-by: Gavin Hu 
> ---
>  test/test/test_pmd_ring.c | 100 
> --
>  1 file changed, 62 insertions(+), 38 deletions(-)
> 
> diff --git a/test/test/test_pmd_ring.c b/test/test/test_pmd_ring.c index
> 19d7d20..f332df9 100644
> --- a/test/test/test_pmd_ring.c
> +++ b/test/test/test_pmd_ring.c
> @@ -7,15 +7,16 @@
> 
>  #include 
>  #include 
> -
> -static struct rte_mempool *mp;
> -static int tx_porta, rx_portb, rxtx_portc, rxtx_portd, rxtx_porte;
> +#include 
> 
>  #define SOCKET0 0
>  #define RING_SIZE 256
>  #define NUM_RINGS 2
>  #define NB_MBUF 512
> +#define NUM_PORTS 5
> 
> +static struct rte_mempool *mp;
> +static int ports[NUM_PORTS];
> 
>  static int
>  test_ethdev_configure_port(int port)
> @@ -64,18 +65,19 @@ test_send_basic_packets(void)
>   struct rte_mbuf *pbufs[RING_SIZE];
>   int i;
> 
> - printf("Testing send and receive RING_SIZE/2 packets (tx_porta ->
> rx_portb)\n");
> + printf("Testing send and receive RING_SIZE/2 "
> + "packets (ports[0] -> ports[1])\n");
> 
>   for (i = 0; i < RING_SIZE/2; i++)
>   pbufs[i] = &bufs[i];
> 
> - if (rte_eth_tx_burst(tx_porta, 0, pbufs, RING_SIZE/2) < RING_SIZE/2) {
> - printf("Failed to transmit packet burst port %d\n", tx_porta);
> + if (rte_eth_tx_burst(ports[0], 0, pbufs, RING_SIZE/2) < RING_SIZE/2) {
> + printf("Failed to transmit packet burst port %d\n", ports[0]);
>   return -1;
>   }
> 
> - if (rte_eth_rx_burst(rx_portb, 0, pbufs, RING_SIZE) != RING_SIZE/2) {
> - printf("Failed to receive packet burst on port %d\n", rx_portb);
> + if (rte_eth_rx_burst(ports[1], 0, pbufs, RING_SIZE) != RING_SIZE/2) {
> + printf("Failed to receive packet burst on port %d\n", ports[1]);
>   return -1;
>   }
> 
> @@ -95,7 +97,8 @@ test_send_basic_packets_port(int port)
>   struct rte_mbuf *pbufs[RING_SIZE];
>   int i;
> 
> - printf("Testing send and receive RING_SIZE/2 packets (cmdl_port0 ->
> cmdl_port0)\n");
> + printf("Testing send and receive RING_SIZE/2 packets"
> + "(cmdl_port0 -> cmdl_port0)\n");
> 
>   for (i = 0; i < RING_SIZE/2; i++)
>   pbufs[i] = &bufs[i];
> @@ -232,8 +235,10 @@ test_pmd_ring_pair_create_attach(int portd, int porte)
>   return -1;
>   }
> 
> - if ((rte_eth_rx_queue_setup(portd, 0, RING_SIZE, SOCKET0, NULL, mp) <
> 0)
> - || (rte_eth_rx_queue_setup(porte, 0, RING_SIZE, SOCKET0,
> NULL, mp) < 0)) {
> + if ((rte_eth_rx_queue_setup(portd, 0, RING_SIZE,
> + SOCKET0, NULL, mp) < 0)
> + || (rte_eth_rx_queue_setup(porte, 0, RING_SIZE,
> + SOCKET0, NULL, mp) < 0)) {
>   printf("RX queue setup failed\n");
>   return -1;
>   }
> @@ -388,9 +393,6 @@ test_pmd_ring_pair_create_attach(int portd, int porte)
>   return -1;
>   }
> 
> - rte_eth_dev_stop(portd);
> - rte_eth_dev_stop(porte);
> -
>   return 0;
>  }
> 
> @@ -400,6 +402,7 @@ test_pmd_ring(void)
>   struct rte_ring *rxtx[NUM_RINGS];
>   int port, cmdl_port0 = -1;
>   uint8_t nb_ports;
> + char port_name[RTE_ETH_NAME_MAX_LEN];
> 
>   nb_ports = rte_eth_dev_count_avail();
>   printf("nb_ports=%d\n", (int)nb_ports); @@ -409,29 +412,36 @@
> test_pmd_ring(void)
>*
>*  Test with the command line option --vdev=net_ring0 to test
> rte_pmd_ring_devinit.
>*/
> - rxtx[0] = rte_ring_create("R0", RING_SIZE, SOCKET0,
> RING_F_SP_ENQ|RING_F_SC_DEQ);
> + rxtx[0] = rte_ring_create("R0", RING_SIZE, SOCKET0,
> + RING_F_SP_ENQ|RING_F_SC_DEQ);
>   if (rxtx[0] == NULL) {
>   printf("rte_ring_create R0 failed");
>   return -1;
>   }
> 
> - rxtx[1] = rte_ring_create("R1", RING_SIZE, SOCKET0,
> RING_F_SP_ENQ|RING_F_SC_DEQ);
> + rxtx[1] = rte_ring_create("R1", RING_SIZE, SOCKET0,
> + RING_F_SP_ENQ|RING_F_SC_DEQ);
>   if (rxtx[1] == NULL) {
>   printf("rte_ring_create R1 failed");
>   return -1;
>   }
> 
> - tx_porta = rte_eth_from_rings("net_ringa", rxtx, NUM_RINGS, rxtx,
> NUM_RINGS, SOCKET0);
> - rx_portb = rte_eth_from_rings("net_ringb", rxtx, NUM_RINGS, rxtx,

Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask

2018-10-29 Thread Lin, Xueqin
Hi Lucero & Thomas,

We find that the patch can't fix multi-process cases.
Steps:

1.   Setup primary process successfully

./hotplug_mp --proc-type=auto



2.   Fail to setup secondary process

./hotplug_mp --proc-type=auto

EAL: Detected 88 lcore(s)

EAL: Detected 2 NUMA nodes

EAL: Auto-detected process type: SECONDARY

EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_147212_2bfe08ee88d23

Segmentation fault (core dumped)


More information as below:

Thread 1 "hotplug_mp" received signal SIGSEGV, Segmentation fault.

0x00597cfb in find_next (arr=0x77ff20a4, start=0, used=true)

at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:264

264 for (idx = first; idx < msk->n_masks; idx++) {

#0  0x00597cfb in find_next (arr=0x77ff20a4, start=0, used=true)

at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:264

#1  0x00598573 in fbarray_find (arr=0x77ff20a4, start=0, next=true,

used=true) at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:1001

#2  0x0059929b in rte_fbarray_find_next_used (arr=0x77ff20a4, 
start=0)

at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:1018

#3  0x0058c877 in rte_memseg_walk_thread_unsafe (func=0x58c401 
,

arg=0x7fffcc38) at 
/root/dpdk/lib/librte_eal/common/eal_common_memory.c:589

#4  0x0058ce08 in rte_eal_check_dma_mask (maskbits=48 '0')

at /root/dpdk/lib/librte_eal/common/eal_common_memory.c:465

#5  0x005b96c4 in pci_one_device_iommu_support_va (dev=0x11b3d90)

at /root/dpdk/drivers/bus/pci/linux/pci.c:593

#6  0x005b9738 in pci_devices_iommu_support_va ()

at /root/dpdk/drivers/bus/pci/linux/pci.c:626

#7  0x005b97a7 in rte_pci_get_iommu_class ()

at /root/dpdk/drivers/bus/pci/linux/pci.c:650

#8  0x0058f1ce in rte_bus_get_iommu_class ()

at /root/dpdk/lib/librte_eal/common/eal_common_bus.c:237

#9  0x00577c7a in rte_eal_init (argc=2, argv=0x7fffdf98)

at /root/dpdk/lib/librte_eal/linuxapp/eal/eal.c:919

#10 0x0045dd56 in main (argc=2, argv=0x7fffdf98)

at /root/dpdk/examples/multi_process/hotplug_mp/main.c:28


Best regards,
Xueqin

From: Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
Sent: Monday, October 29, 2018 9:41 PM
To: Yao, Lei A 
Cc: Thomas Monjalon ; dev ; Xu, Qian Q 
; Lin, Xueqin ; Burakov, Anatoly 
; Yigit, Ferruh 
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A <lei.a@intel.com> wrote:


From: Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
Sent: Monday, October 29, 2018 8:56 PM
To: Thomas Monjalon <tho...@monjalon.net>
Cc: Yao, Lei A <lei.a@intel.com>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q...@intel.com>; Lin, Xueqin <xueqin@intel.com>; Burakov, Anatoly <anatoly.bura...@intel.com>; Yigit, Ferruh <ferruh.yi...@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon <tho...@monjalon.net> wrote:
29/10/2018 12:39, Alejandro Lucero:
> I got a patch that solves a bug when calling rte_eal_dma_mask using the
> mask instead of the maskbits. However, this does not solves the deadlock.

The deadlock is a bigger concern I think.

I think once the call to rte_eal_check_dma_mask uses the maskbits instead of 
the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.

Yao, can you try with the attached patch?

Hi, Lucero

This patch can fix the issue at my side. Thanks a lot
for you quick action.


Great!

I will send an official patch with the changes.

I have to say that I tested the patchset, but I think it was where legacy_mem 
was still there and therefore dynamic memory allocation code not used during 
memory initialization.

There is something that concerns me though. Using rte_memseg_walk_thread_unsafe 
could be a problem under some situations although those situations being 
unlikely.

Usually, calling rte_eal_check_dma_mask happens during initialization. Then it 
is safe to use the unsafe function for walking memsegs, but with device hotplug 
and dynamic memory allocation, there exists a potential race condition when the 
primary process is allocating more memory and concurrently a device is 
hotplugged and a secondary process does the device initialization. By now, this 
is just a problem with the NFP, and the potential race condition window really 
unlikely, but I will work on this asap.

BRs
Lei

> Interestingly, the problem looks like a compiler one. Calling
> rte_memseg_walk does not return when calling inside rt_eal_dma_mask, but if
> you modify the call like this:
>
> -   if (rte_memseg_walk(check_iova, &mask))
> +   if (!rte_memseg_walk(check_iova, &mask))
>
> it works, although the value returned to the invoker changes, of course.
> But the p

[dpdk-dev] [PATCH v2] doc/qos_meter: update application information

2018-10-29 Thread Vipin Varghese
The change adds a note about the previous colour in colour blind mode and
about DROP in the profile table actions. In colour blind mode the only valid
previous colour is GREEN. To drop packets based on the new colour, one needs
to set the action to DROP in the profile table.

Signed-off-by: Vipin Varghese 
---

V2:
changed wording from 0 to GREEN - Cristian Dumitrescu
reword the note section - Vipin Varghese
---
 doc/guides/sample_app_ug/qos_metering.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/sample_app_ug/qos_metering.rst 
b/doc/guides/sample_app_ug/qos_metering.rst
index 6391841c6..2e8e0175f 100644
--- a/doc/guides/sample_app_ug/qos_metering.rst
+++ b/doc/guides/sample_app_ug/qos_metering.rst
@@ -149,3 +149,7 @@ In this particular case:
 *   Every packet which color has improved is dropped (this particular case 
can't happen, so these values will not be used).
 
 *   For the rest of the cases, the color is changed to red.
+
+.. note::
+* In color blind mode, only the first row (GREEN colour) is valid.
+* To drop the packet, the policer_table action has to be set to DROP.
-- 
2.17.1



[dpdk-dev] [PATCH v1] doc: add meson build to contributing guide

2018-10-29 Thread Vipin Varghese
Patches have to be validated for meson builds. Update the documentation
with meson build steps in the Checking Compilation section.

Signed-off-by: Vipin Varghese 
---
 doc/guides/contributing/patches.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/contributing/patches.rst 
b/doc/guides/contributing/patches.rst
index a3d788024..494037778 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -468,6 +468,15 @@ The recommended configurations and options to test 
compilation prior to submitti
export DPDK_DEP_PCAP=y
export DPDK_DEP_SSL=y
 
+Compilation of patches has to be tested using meson::
+
+   ./devtools/test-meson-builds.sh
+
+The default build target is a shared library. The build can be modified for
+a static or cross build:
+
+ * static: use ``--default-library=static``.
+ * arm cross build: use ``--cross-file=`` to override the default path.
 
 Sending Patches
 ---
-- 
2.17.1