[PATCH] usertools: enhance logic to display NUMA

2022-03-26 Thread Vipin Varghese
Enhance the Python logic to accommodate NUMA information. The current
logic maps physical sockets with CPU threads to cores. On new AMD SKU
variants, a NUMA node is no longer the same as a socket: a single
physical socket can be partitioned into 1, 2 or 4 NUMA nodes.

The changes address the new socket and NUMA node to CPU core mapping.

Signed-off-by: Vipin Varghese 
---
 usertools/cpu_layout.py | 76 +++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 47 insertions(+), 29 deletions(-)

diff --git a/usertools/cpu_layout.py b/usertools/cpu_layout.py
index 891b9238fa..295f2c0e9b 100755
--- a/usertools/cpu_layout.py
+++ b/usertools/cpu_layout.py
@@ -3,13 +3,27 @@
 # Copyright(c) 2010-2014 Intel Corporation
 # Copyright(c) 2017 Cavium, Inc. All rights reserved.
 
+import glob
+import os
+
 sockets = []
 cores = []
+numaNodes = []
 core_map = {}
+numa_map = {}
+node_path = "/sys/devices/system/node"
 base_path = "/sys/devices/system/cpu"
-fd = open("{}/kernel_max".format(base_path))
-max_cpus = int(fd.read())
-fd.close()
+max_cpus = 0
+
+if os.path.isdir(base_path):
+    temp_maxCpu = glob.glob(base_path + '/cpu[0-9]*')
+    max_cpus = len(temp_maxCpu)
+
+if os.path.isdir(node_path):
+    temp_numaNodes = glob.glob(node_path + '/node*')
+    for numaId in range(0, int(os.path.basename(temp_numaNodes[-1])[4:]) + 1):
+        numaNodes.append(numaId)
+
 for cpu in range(max_cpus + 1):
     try:
         fd = open("{}/cpu{}/topology/core_id".format(base_path, cpu))
@@ -17,48 +31,52 @@
         continue
     core = int(fd.read())
     fd.close()
+
+    tempGet_cpuNuma = glob.glob("{}/cpu{}/node*".format(base_path, cpu))
+    temp_cpuNuma = tempGet_cpuNuma[-1].split("{}/cpu{}/".format(base_path, cpu))[-1]
+    numa = temp_cpuNuma.split("node")[-1]
+
     fd = open("{}/cpu{}/topology/physical_package_id".format(base_path, cpu))
     socket = int(fd.read())
     fd.close()
+
     if core not in cores:
         cores.append(core)
+
     if socket not in sockets:
         sockets.append(socket)
+
     key = (socket, core)
     if key not in core_map:
         core_map[key] = []
     core_map[key].append(cpu)
 
+    key = (socket, numa)
+    if key not in numa_map:
+        numa_map[key] = []
+
+    if core_map[(socket, core)] not in numa_map[key]:
+        numa_map[key].append(core_map[(socket, core)])
+
 print(format("=" * (47 + len(base_path))))
 print("Core and Socket Information (as reported by '{}')".format(base_path))
 print("{}\n".format("=" * (47 + len(base_path))))
 print("cores = ", cores)
+print("numa nodes per socket = ", numaNodes)
 print("sockets = ", sockets)
 print("")
 
-max_processor_len = len(str(len(cores) * len(sockets) * 2 - 1))
-max_thread_count = len(list(core_map.values())[0])
-max_core_map_len = (max_processor_len * max_thread_count)  \
-                  + len(", ") * (max_thread_count - 1) \
-                  + len('[]') + len('Socket ')
-max_core_id_len = len(str(max(cores)))
-
-output = " ".ljust(max_core_id_len + len('Core '))
-for s in sockets:
-    output += " Socket %s" % str(s).ljust(max_core_map_len - len('Socket '))
-print(output)
-
-output = " ".ljust(max_core_id_len + len('Core '))
-for s in sockets:
-    output += " ".ljust(max_core_map_len)
-    output += " "
-print(output)
-
-for c in cores:
-    output = "Core %s" % str(c).ljust(max_core_id_len)
-    for s in sockets:
-        if (s, c) in core_map:
-            output += " " + str(core_map[(s, c)]).ljust(max_core_map_len)
-        else:
-            output += " " * (max_core_map_len + 1)
-    print(output)
+for keys in numa_map:
+    print("")
+    socket, numa = keys
+
+    output = " Socket " + str(socket).ljust(3, ' ') + " Numa " + str(numa).zfill(1) + " "
+    print(output)
+    print(format("-" * len(output)))
+
+    for index, coreSibling in enumerate(numa_map[keys]):
+        print("Core " + str(index).ljust(3, ' ') + str(coreSibling))
+print("")
+
-- 
2.25.1
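For reference, the same per-CPU NUMA lookup the patch performs by globbing
cpu*/node* can be done from C by scanning sysfs. A minimal sketch, assuming
the usual cpuN/nodeM sysfs layout; cpu_numa_node() is a name invented for
illustration, not an existing API:

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the NUMA node of a CPU by scanning for a "nodeN" entry under
 * /sys/devices/system/cpu/cpuN, or -1 if none is found. */
static int cpu_numa_node(unsigned int cpu)
{
	char path[64];
	struct dirent *de;
	DIR *dir;
	int node = -1;

	snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%u", cpu);
	dir = opendir(path);
	if (dir == NULL)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		if (strncmp(de->d_name, "node", 4) == 0 &&
				isdigit((unsigned char)de->d_name[4]))
			node = atoi(de->d_name + 4);
	}
	closedir(dir);
	return node;
}

int main(void)
{
	printf("cpu0 is on NUMA node %d\n", cpu_numa_node(0));
	return 0;
}
```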



Re: [PATCH] meson: update doc logic for Windows

2022-03-26 Thread Thomas Monjalon
Hello,

Thank you for looking at this problem.

26/03/2022 03:59, Vipin Varghese:
> Support for shell scripts doxy-html-custom, generate_doxygen and
> generate_examples are absent. The current patch address the same by
> disabling document build notifying the user.

It should not prevent generating guides with sphinx.

> Steps to reproduce the error:
>  - Install the dependencies doxygen & sphinx on Windows Server 2019.
>  - Build DPDK with option enable_docs=true for the API or User Guide.
> 
> This produces error
> ```
> FAILED: doc/api/examples.dox
> sh -e dpdk/doc/api/generate_examples.sh dpdk/examples doc/api/examples.dox
> ```

I suppose we could replace the shell scripts with Python equivalents.





Re: [PATCH] usertools: enhance logic to display NUMA

2022-03-26 Thread Thomas Monjalon
26/03/2022 08:32, Vipin Varghese:
> Enhance the Python logic to accommodate NUMA information. The current
> logic maps physical sockets with CPU threads to cores. On new AMD SKU
> variants, a NUMA node is no longer the same as a socket: a single
> physical socket can be partitioned into 1, 2 or 4 NUMA nodes.
> 
> The changes address the new socket and NUMA node to CPU core mapping.
> 
> Signed-off-by: Vipin Varghese 
> ---
>  usertools/cpu_layout.py | 76 +++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 47 insertions(+), 29 deletions(-)

Honestly, I'm not sure it is a good idea to keep this script in the DPDK repo.
Can it be replaced with hwloc usage?
What is the output on the new AMD SKU for this command?
lstopo-no-graphics --merge





RE: [PATCH v4] net/ixgbe: retry SFP ID read to handle misbehaving SFPs

2022-03-26 Thread Wang, Haiyue
> -Original Message-
> From: je...@silicom-usa.com 
> Sent: Friday, March 25, 2022 17:54
> To: dev@dpdk.org
> Cc: Stephen Douthit ; Daly, Jeff 
> ; Wang, Haiyue
> 
> Subject: [PATCH v4] net/ixgbe: retry SFP ID read to handle misbehaving SFPs
> 
> From: Stephen Douthit 
> 
> Some XGS-PON SFPs have been observed ACKing I2C reads and returning
> uninitialized garbage while their uC boots.  This can lead to the SFP ID
> code marking an otherwise working SFP module as unsupported if a bogus
> ID value is read while its internal PHY/microcontroller is still
> booting.
> 
> Retry the ID read several times, looking not just for a NAK, but also
> for a valid ID field.
> 
> Since the device isn't NAKing the transaction, the existing longer retry
> code in ixgbe_read_i2c_byte_generic_int() doesn't apply here.
> 
> Signed-off-by: Stephen Douthit 
> Signed-off-by: Jeff Daly 
> ---
> 
> Notes:
> v4:
> * Fixed git summary
> 

No need for v4, it has already been merged by Qi ;-)

https://git.dpdk.org/next/dpdk-next-net-intel/commit/?id=2f010a904cb06277d8710bad16ebfe9a38f61b62

> v3:
> * Removed extra braces around single statement if
> 
> v2:
> * Removed superfluous DEBUGOUT
> * Renamed id_reads to retries
> * Don't assume status == 0 means IXGBE_SUCCESS
> 
>  drivers/net/ixgbe/base/ixgbe_phy.c | 27 ++++++++++++++++++++++++---
>  1 file changed, 24 insertions(+), 3 deletions(-)
> 



> --
> 2.25.1
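The pattern the patch describes (treating an ACKed read that returns a
bogus ID the same as a NAK, and retrying) can be sketched standalone as
below. The stubbed read_id_byte() and the retry count are illustrative
assumptions, not the ixgbe base-code API:

```c
#include <stdint.h>
#include <stdio.h>

#define SFF_IDENTIFIER_SFP 0x03 /* SFP/SFP+ identifier per SFF-8024 */
#define ID_READ_RETRIES 5

/* Stub emulating a misbehaving SFP: the device ACKs reads but returns
 * garbage for the first few transactions while its uC boots. */
static int read_id_byte(uint8_t *val)
{
	static int calls;

	*val = (++calls < 3) ? 0xFF : SFF_IDENTIFIER_SFP;
	return 0; /* 0 = ACK; non-zero would mean NAK */
}

/* Retry the ID read, looking not just for a NAK but also for a valid
 * identifier value. */
static int read_sfp_id(uint8_t *id)
{
	int i;

	for (i = 0; i < ID_READ_RETRIES; i++) {
		if (read_id_byte(id) == 0 && *id == SFF_IDENTIFIER_SFP)
			return 0;
		/* a real driver would delay here before retrying */
	}
	return -1;
}

int main(void)
{
	uint8_t id;

	if (read_sfp_id(&id) == 0)
		printf("got valid SFP id 0x%02x\n", id);
	return 0;
}
```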



RE: [PATCH v5] net/bonding: another fix to LACP mempool size

2022-03-26 Thread Wang, Haiyue
> -Original Message-
> From: Morten Brørup 
> Sent: Friday, March 25, 2022 21:14
> To: Gaoxiang Liu ; ch...@att.com; humi...@huawei.com
> Cc: dev@dpdk.org; liugaoxi...@huawei.com; olivier.m...@6wind.com; 
> andrew.rybche...@oktetlabs.ru
> Subject: RE: [PATCH v5] net/bonding: another fix to LACP mempool size
> 
> +CC mempool maintainers
> 
> > From: Gaoxiang Liu [mailto:gaoxiangl...@163.com]
> > Sent: Friday, 25 March 2022 14.02
> >
> > The following log message may appear after a slave is idle (or nearly
> > idle) for a few minutes:
> > "PMD: Failed to allocate LACP packet from pool".
> > And bond mode 4 negotiation may fail.
> >
> > Problem: When bond mode 4 has been chosen and a dedicated queue has
> > not been enabled, all mbufs from a slave's private pool (used
> > exclusively for transmitting LACPDUs) have been allocated in the
> > interrupt thread, and are still sitting in the device's tx
> > descriptor ring and other cores' mempool caches in the fwd thread.
> > Thus the interrupt thread cannot allocate a LACP packet from the pool.
> >
> > Solution: Ensure that each slave's tx (LACPDU) mempool owns more than
> > n-tx-queues * n-tx-descriptors + fwd_core_num *
> > per-core-mempool-flush-threshold mbufs.
> >
> > Note that the LACP tx machine function is the only code that allocates
> > from a slave's private pool. It runs in the context of the interrupt
> > thread, and thus has no mempool cache of its own.
> >
> > Signed-off-by: Gaoxiang Liu 
> >
> > ---
> > v2:
> > * Fixed compile issues.
> >
> > v3:
> > * delete duplicate code.
> >
> > v4:
> > * Fixed some issues.
> > 1. total_tx_desc should use +=
> > 2. add detailed logs
> >
> > v5:
> > * Fixed some issues.
> > 1. move CACHE_FLUSHTHRESH_MULTIPLIER to rte_eth_bond_8023ad.c
> > 2. use RTE_MIN
> > ---
> >  drivers/net/bonding/rte_eth_bond_8023ad.c | 11 ---


> >
> > +/* BONDING_8023AD_CACHE_FLUSHTHRESH_MULTIPLIER is the same as
> > + * CACHE_FLUSHTHRESH_MULTIPLIER already defined in rte_mempool.c */
> > +#define BONDING_8023AD_CACHE_FLUSHTHRESH_MULTIPLIER 1.5
> 
> Very important comment. Thank you!
> 
> May I suggest that a similar comment is added to the rte_mempool.c file, so
> if CACHE_FLUSHTHRESH_MULTIPLIER is changed there, we don't forget to change
> the copy-pasted code in the rte_eth_bond_8023ad.c file too. It has
> previously been discussed changing it from 1.5 to 2 for symmetry reasons.

Then, introduce some kind of public API macro, like
RTE_MEMPOOL_CACHE_MAX_FLUSHTHRESH_MULTIPLIER, as RTE_MEMPOOL_CACHE_MAX_SIZE
does? So that when calling the mempool create API, it can do other kinds of
calculation, like RTE_MIN(user's new flush multiplier,
RTE_MEMPOOL_CACHE_MAX_FLUSHTHRESH_MULTIPLIER).

Just a suggestion, so that there is no need to add strange comments like
BONDING_8023AD in a lib.

> 
> > +
> > +	cache_size = RTE_MIN(RTE_MEMPOOL_CACHE_MAX_SIZE, 32);
> > +	total_tx_desc += rte_lcore_count() * cache_size *
> > +		BONDING_8023AD_CACHE_FLUSHTHRESH_MULTIPLIER;
> >  	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
> >  	port->mbuf_pool = rte_pktmbuf_pool_create(mem_name, total_tx_desc,
> > -		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
> > -		32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
> > -		0, element_size, socket_id);
> > +		cache_size, 0, element_size, socket_id);
> >
> > /* Any memory allocation failure in initialization is critical
> > because
> >  * resources can't be free, so reinitialization is impossible. */
> > --
> > 2.32.0
> >
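As a worked example of the sizing rule quoted above, with numbers assumed
purely for illustration (4 TX queues of 512 descriptors, 8 forwarding
lcores, 32-mbuf per-lcore caches, and the 1.5 flush-threshold multiplier
mirrored from rte_mempool.c):

```c
#include <stdio.h>

/* mirrors CACHE_FLUSHTHRESH_MULTIPLIER in rte_mempool.c */
#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5

int main(void)
{
	unsigned int n_tx_queues = 4;   /* assumed */
	unsigned int n_tx_desc = 512;   /* descriptors per TX queue, assumed */
	unsigned int lcore_count = 8;   /* forwarding cores, assumed */
	unsigned int cache_size = 32;   /* RTE_MIN(RTE_MEMPOOL_CACHE_MAX_SIZE, 32) */

	/* mbufs that can sit in TX rings plus per-lcore mempool caches,
	 * i.e. be unavailable to the interrupt thread at the same time */
	unsigned int total_tx_desc = n_tx_queues * n_tx_desc +
		(unsigned int)(lcore_count * cache_size *
			CACHE_FLUSHTHRESH_MULTIPLIER);

	printf("LACPDU pool needs > %u mbufs\n", total_tx_desc); /* 2432 */
	return 0;
}
```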



Re: [RFC] eal: add seqlock

2022-03-26 Thread Mattias Rönnblom
On 2022-03-25 22:10, Stephen Hemminger wrote:
> On Fri, 25 Mar 2022 21:24:28 +0100
> Mattias Rönnblom  wrote:
>
>> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
>> new file mode 100644
>> index 00..b975ca848a
>> --- /dev/null
>> +++ b/lib/eal/include/rte_seqlock.h
>> @@ -0,0 +1,84 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2022 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_SEQLOCK_H_
>> +#define _RTE_SEQLOCK_H_
>> +
>> +#include <stdbool.h>
>> +#include <stdint.h>
>> +
>> +#include <rte_atomic.h>
>> +#include <rte_branch_prediction.h>
>> +#include <rte_spinlock.h>
>> +
>> +struct rte_seqlock {
>> +	uint64_t sn;
>> +	rte_spinlock_t lock;
>> +};
>> +
>> +typedef struct rte_seqlock rte_seqlock_t;
>> +
>
> Add a reference to Wikipedia and/or Linux since not every DPDK
> user may be familiar with this.

OK, will do.

>> +
>> +	sn = seqlock->sn + 1;
>> +
>> +	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
>> +
>> +	/* __ATOMIC_RELEASE to prevent stores after (in program order)
>> +	 * from happening before the sn store.
>> +	 */
>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> Could this just be __atomic_fetch_add() with __ATOMIC_RELEASE?

If I understood C11 correctly, an __atomic_fetch_add() with
__ATOMIC_RELEASE only prevents stores that precede it (in program
order) from being moved ahead of it. Thus, stores that follow it may be
reordered across the __atomic_fetch_add(), and seen by a reader before
the sn change.

Also, __atomic_fetch_add() would generate an atomic add machine
instruction, which, at least in my experience (on x86_64), is slower
than a mov+add+mov sequence, which is what the above code will generate
(plus it prevents certain compiler optimizations). That's with TSO.
What would happen on weakly ordered machines, I don't know in detail.
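A standalone sketch of the writer-side pairing under discussion, as a
simplified illustration of the RFC's approach rather than the proposed
rte_seqlock API (the writer-serializing spinlock is omitted):

```c
#include <stdint.h>

struct seqlock {
	uint64_t sn;   /* odd while a write is in progress */
	uint64_t data; /* protected field, for illustration */
};

static void seqlock_write_begin(struct seqlock *sl)
{
	uint64_t sn = sl->sn + 1; /* writers already serialized by a lock */

	__atomic_store_n(&sl->sn, sn, __ATOMIC_RELAXED);

	/* Release fence: orders the sn store above before the data stores
	 * that follow. A release fetch_add() on sn would only order
	 * *earlier* stores before it; the data stores below could still
	 * move up across it and be seen by a reader under the old (even)
	 * sequence number. */
	__atomic_thread_fence(__ATOMIC_RELEASE);
}

static void seqlock_write_end(struct seqlock *sl)
{
	uint64_t sn = sl->sn + 1;

	/* Release store: all data stores above are ordered before this. */
	__atomic_store_n(&sl->sn, sn, __ATOMIC_RELEASE);
}

int main(void)
{
	struct seqlock sl = { 0, 0 };

	seqlock_write_begin(&sl);
	sl.data = 42;
	seqlock_write_end(&sl);
	return 0;
}
```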