Re: [OMPI users] Binding to thread 0

2023-09-11 Thread Luis Cebamanos via users
…mapping policy are causing the policy to be redefined: New policy: RANK_FILE; Prior policy: BYCORE. Regards, L. From the looks of it your process is both binding and mapping by hwthread now. -Nathan. On Sep 11, 2023, at 10:20 AM, Luis Cebamanos via users wrote: @Gilles @Jeff Sorry, I th…

Re: [OMPI users] Binding to thread 0

2023-09-11 Thread Luis Cebamanos via users
…the BIOS, and then each of your MPI processes can use the full core, not just a single hardware thread. *From:* users on behalf of Luis Cebamanos via users *Sent:* Friday, September 8, 2023 7:10 AM *To:* Ralph Castain via users…
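
If the goal is one MPI process per full core, a quick sanity check before touching the BIOS is to see whether SMT is currently exposed to the OS. These are generic Linux commands, not something quoted from the thread:

    # "Thread(s) per core: 2" means SMT/hyper-threading is currently on
    lscpu | grep -i 'thread(s) per core'
    # on recent kernels the SMT state is also visible (and switchable) here
    cat /sys/devices/system/cpu/smt/active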

Re: [OMPI users] Binding to thread 0

2023-09-11 Thread Luis Cebamanos via users
…thread. *From:* users on behalf of Luis Cebamanos via users *Sent:* Friday, September 8, 2023 7:10 AM *To:* Ralph Castain via users *Cc:* Luis Cebamanos *Subject:* [OMPI users] Binding to thread 0 Hello, Up to now, I have been using numerous ways of binding…

[OMPI users] Binding to thread 0

2023-09-08 Thread Luis Cebamanos via users
Hello, Up to now, I have been using numerous ways of binding with wrappers (numactl, taskset) whenever I wanted to play with core placement. Another way I have been using is via -rankfile; however, I notice that some ranks jump from thread 0 to thread 1 on SMT chips. I can control this with numactl f…
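
One hedged way to keep every rank on a fixed core is to spell the placement out in a rankfile; the hostnames, rank count, and slot numbers below are placeholders, not taken from the thread:

    # myrankfile (placeholder hostnames/slots): one rank per core
    rank 0=node01 slot=0
    rank 1=node01 slot=1
    rank 2=node02 slot=0
    rank 3=node02 slot=1

    # --rankfile selects the file; --report-bindings shows where each rank landed
    mpirun -np 4 --rankfile myrankfile --report-bindings ./app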

Re: [OMPI users] libnuma.so error

2023-07-20 Thread Luis Cebamanos via users
…libnuma.so is not found, so it appears it is being treated as a warning, not an error. *From:* users on behalf of Luis Cebamanos via users *Sent:* Wednesday, July 19, 2023 10:09 AM *To:*…

[OMPI users] libnuma.so error

2023-07-19 Thread Luis Cebamanos via users
Hello, I was wondering if anyone has ever seen the following runtime error: mpirun -np 32 ./hello … [LOG_CAT_SBGP] libnuma.so: cannot open shared object file: No such file or directory [LOG_CAT_SBGP] Failed to dlopen libnuma.so. Fallback to GROUP_BY_SOCKET manual. … The funny thing is…
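
For what it's worth, a minimal sketch for checking whether the dynamic loader can find libnuma at all; package names and the install path are illustrative assumptions, not taken from the thread:

    # is libnuma known to the runtime linker?
    ldconfig -p | grep libnuma
    # if not, it normally ships with the numactl packages
    # (e.g. libnuma1/libnuma-dev on Debian/Ubuntu, numactl-libs on RHEL),
    # or point the loader at a non-standard install location:
    export LD_LIBRARY_PATH=/opt/numactl/lib:$LD_LIBRARY_PATH   # placeholder path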

[OMPI users] HWLOC icc error

2021-03-23 Thread Luis Cebamanos via users
Hello, while compiling Open MPI 4.0.5 with Intel 2020 I came across this error. Has anyone seen it before? I have tried with internal and external HWLOC with the same outcome. CC net.lo In file included from ../../opal/mca/hwloc/hwloc201/hwloc201.h(26), from ../../opal/mca/hwlo…
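
For comparison, a hedged configure sketch for building Open MPI 4.0.5 against an external hwloc rather than the bundled hwloc201 tree; the compiler choices and install paths are placeholders:

    ./configure CC=icc CXX=icpc FC=ifort \
                --prefix=$HOME/sw/openmpi-4.0.5 \
                --with-hwloc=/path/to/hwloc-install   # external hwloc prefix
    make -j 8 && make install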

Re: [OMPI users] Mapping, binding and ranking

2021-03-01 Thread Luis Cebamanos via users
…Probably rankfile is my only option too. >> On 28/02/2021 22:44, Ralph Castain via users wrote: >>> The only way I know of to do what you want is >>> --map-by ppr:32:socket --bind-to core --cpu-list 0,2,4,6,... …

Re: [OMPI users] Mapping, binding and ranking

2021-03-01 Thread Luis Cebamanos via users
…The only way I know of to do what you want is > --map-by ppr:32:socket --bind-to core --cpu-list 0,2,4,6,... > where you list out the exact cpus you want to use. >> On Feb 28, 2021, at 9:58 AM, Luis Cebamanos via users wrote: >> …

Re: [OMPI users] Mapping, binding and ranking

2021-03-01 Thread Luis Cebamanos via users
…users wrote: >> The only way I know of to do what you want is >> --map-by ppr:32:socket --bind-to core --cpu-list 0,2,4,6,... >> where you list out the exact cpus you want to use. >>> On Feb 28, 2021, at 9:58 AM, Luis Cebamanos via users…
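
Putting Ralph's suggestion into a complete launch line, as a sketch for a single 2-socket node: the executable name is a placeholder, and the cpu list assumes the OS numbers the 128 physical cores 0-127.

    # 32 ranks per socket, each bound to an even-numbered core;
    # seq generates the list 0,2,4,...,126
    mpirun -np 64 --map-by ppr:32:socket --bind-to core \
           --cpu-list $(seq -s, 0 2 126) ./app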

Re: [OMPI users] Mapping, binding and ranking

2021-02-28 Thread Luis Cebamanos via users
…not every machine has two HWTs/core. On Feb 28, 2021, at 7:43 AM, Luis Cebamanos via users wrote: Hi Ralph, Thanks for this; however, --map-by ppr:32:socket:PE=2 --bind-to core reports the same binding as --map-by ppr:32:socket:PE=4 --bind-to hwthread…
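
One way to compare the two variants directly is to have mpirun print the resulting binding masks; the executable is a placeholder, and on a machine with 2 hwthreads/core both lines would be expected to report the same hardware coverage:

    mpirun -np 64 --map-by ppr:32:socket:PE=2 --bind-to core     --report-bindings ./app
    mpirun -np 64 --map-by ppr:32:socket:PE=4 --bind-to hwthread --report-bindings ./app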

Re: [OMPI users] Mapping, binding and ranking

2021-02-28 Thread Luis Cebamanos via users
…rank 0 slots=0-1 > rank 1 slots=2-3 > etc. > Hence the difference. I was simply correcting your mpirun cmd line as you said you wanted two CORES, and that isn't guaranteed if you are stipulating things in terms of HWTs as not every machine has two HWTs/core. > On F…

Re: [OMPI users] Mapping, binding and ranking

2021-02-28 Thread Luis Cebamanos via users
…ect: > --map-by ppr:32:socket:PE=4 --bind-to hwthread > should be > --map-by ppr:32:socket:PE=2 --bind-to core >> On Feb 28, 2021, at 5:57 AM, Luis Cebamanos via users wrote: >> I should have said, …

Re: [OMPI users] Mapping, binding and ranking

2021-02-28 Thread Luis Cebamanos via users
I should have said, "I would like to run 128 MPI processes on 2 nodes" and not 64 like I initially said... On Sat, 27 Feb 2021, 15:03 Luis Cebamanos wrote: > Hello OMPI users, > On 128-core nodes, 2 sockets x 64 cores/socket (2 hwthreads/core), I am > trying to match the behavior of running…

[OMPI users] Mapping, binding and ranking

2021-02-27 Thread Luis Cebamanos via users
Hello OMPI users, On 128-core nodes, 2 sockets x 64 cores/socket (2 hwthreads/core), I am trying to match the behavior of running with a rankfile by using manual mapping/ranking/binding options. I would like to run 64 MPI processes on 2 nodes, 1 MPI process every 2 cores. That is, I want to run 32 MPI…
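
For reference, a sketch of the launch line the thread converges on for the "1 rank every 2 cores" layout, using the corrected count of 128 processes from the follow-up; the hostfile and executable names are placeholders:

    # 2 nodes x 2 sockets x 32 ranks/socket = 128 ranks, 2 cores per rank
    mpirun -np 128 --hostfile hosts --map-by ppr:32:socket:PE=2 --bind-to core \
           --report-bindings ./app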

Re: [OMPI users] Binding blocks of processes in round-robin fashion

2021-01-29 Thread Luis Cebamanos via users
Hi Ralph, It would be great to have it for load-balancing reasons. Ideally one could do something like --bind-to:N where N is the block size, 4 in this case: mpirun -np 40 --map-by ppr:40:node --bind-to core:4. I think it would be interesting to have it. Of course, I can always use srun but not…

Re: [OMPI users] Binding blocks of processes in round-robin fashion

2021-01-28 Thread Luis Cebamanos via users
…ranks 12-15 to be mapped onto node2, etc. Correct? On Jan 28, 2021, at 3:00 PM, Luis Cebamanos via users wrote: Hello all, What are the options for binding MPI tasks in blocks of cores per node/socket/NUMA in a round-robin fashion? Say I want to fully populate 40-core sockets on dual-socket…

[OMPI users] Binding blocks of processes in round-robin fashion

2021-01-28 Thread Luis Cebamanos via users
Hello all, What are the options for binding MPI tasks in blocks of cores per node/socket/NUMA in a round-robin fashion? Say I want to fully populate 40-core sockets on dual-socket nodes, but in a round-robin fashion: binding 4 cores on the first node, then 4 cores on the next, and so on. Would…
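
If no single built-in option covers this, one hedged workaround is a rankfile that lays the blocks out explicitly; the hostnames, slot numbers, and 4-rank block size below are placeholders, and only the first two blocks are shown:

    # myrankfile: blocks of 4 ranks placed round-robin across two nodes
    rank 0=node1 slot=0
    rank 1=node1 slot=1
    rank 2=node1 slot=2
    rank 3=node1 slot=3
    rank 4=node2 slot=0
    rank 5=node2 slot=1
    rank 6=node2 slot=2
    rank 7=node2 slot=3
    # ranks 8-11 would wrap back to node1 on slots 4-7, and so on

    mpirun -np 8 --rankfile myrankfile ./app   # extend the file for a full-size job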