Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
Thanks - please go ahead and release that allocation as I’m not going to get to this immediately. I’ve got several hot irons in the fire right now, and I’m not sure when I’ll get a chance to track this down. Gilles or anyone else who might have time - feel free to take a gander and see if somet

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread marcin.krotkiewski
Done. I have compiled 1.10.0 and 1.10.1rc1 with --enable-debug and executed mpirun --mca rmaps_base_verbose 10 --hetero-nodes --report-bindings --bind-to core -np 32 ./affinity. In the case of 1.10.1rc1 I have also added :overload-allowed - the output is in a separate file. This option did not make much d
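
The ./affinity test program itself is not shown in the thread; a minimal sketch of what such a per-rank binding reporter could look like, assuming it simply prints each rank's Linux CPU mask via sched_getaffinity, is:

#define _GNU_SOURCE            /* for sched_getaffinity and the CPU_* macros */
#include <mpi.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char host[256];
    gethostname(host, sizeof(host));

    /* the mask reflects both the cgroup cpuset and any binding applied by mpirun */
    cpu_set_t mask;
    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);

    char cpus[8192] = "";
    for (int c = 0; c < CPU_SETSIZE; c++) {
        if (CPU_ISSET(c, &mask)) {
            char buf[16];
            snprintf(buf, sizeof(buf), "%d ", c);
            strncat(cpus, buf, sizeof(cpus) - strlen(cpus) - 1);
        }
    }

    printf("rank %d on %s: allowed cpus %s\n", rank, host, cpus);

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with the mpirun line above, each rank prints the cpus in its current mask, which makes it easy to see whether it was given a full core or a single hardware thread.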

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
Rats - just realized I have no way to test this as none of the machines I can access are set up for cgroup-based multi-tenant use. Is this a debug version of OMPI? If not, can you rebuild OMPI with --enable-debug? Then please run it with --mca rmaps_base_verbose 10 and pass along the output. Thanks Ra

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
What version of slurm is this? I might try to debug it here. I’m not sure where the problem lies just yet. > On Oct 3, 2015, at 8:59 AM, marcin.krotkiewski wrote: > Here is the output of lstopo. In short, (0,16) are core 0, (1,17) - core 1 etc. > Machine (64GB) NUMANode L#0 (P#0

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread marcin.krotkiewski
Here is the output of lstopo. In short, (0,16) are core 0, (1,17) - core 1 etc.

Machine (64GB)
  NUMANode L#0 (P#0 32GB)
    Socket L#0 + L3 L#0 (20MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#16)
      L2 L#1 (256KB) + L1d L#1 (32K
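
A hypothetical hwloc sketch (not part of the thread) that walks the cores and prints the P# (os_index) of each PU they contain would reproduce the (0,16), (1,17), ... hyperthread pairing visible in the lstopo output above:

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    for (int i = 0; i < ncores; i++) {
        hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
        printf("core L#%u: PUs", core->logical_index);

        /* walk the PUs (hardware threads) contained in this core's cpuset */
        hwloc_obj_t pu = NULL;
        while ((pu = hwloc_get_next_obj_inside_cpuset_by_type(
                         topo, core->cpuset, HWLOC_OBJ_PU, pu)) != NULL)
            printf(" P#%u", pu->os_index);
        printf("\n");
    }

    hwloc_topology_destroy(topo);
    return 0;
}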

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
Maybe I’m just misreading your HT map - that slurm nodelist syntax is a new one to me, but they tend to change things around. Could you run lstopo on one of those compute nodes and send the output? I’m just suspicious because I’m not seeing a clear pairing of HT numbers in your output, but HT n

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread marcin.krotkiewski
On 10/03/2015 04:38 PM, Ralph Castain wrote: > If mpirun isn’t trying to do any binding, then you will of course get the right mapping as we’ll just inherit whatever we received. Yes. I meant that whatever you received (what SLURM gives) is a correct cpu map and assigns _whole_ CPUs, not a singl

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
If mpirun isn’t trying to do any binding, then you will of course get the right mapping as we’ll just inherit whatever we received. Looking at your output, it’s pretty clear that you are getting independent HTs assigned and not full cores. My guess is that something in slurm has changed such tha

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread marcin.krotkiewski
On 10/03/2015 01:06 PM, Ralph Castain wrote: > Thanks Marcin. Looking at this, I’m guessing that Slurm may be treating HTs as “cores” - i.e., as independent cpus. Any chance that is true? Not to the best of my knowledge, and at least not intentionally. SLURM starts as many processes as there are

Re: [OMPI users] [Open MPI Announce] Open MPI v1.10.1rc1 release

2015-10-03 Thread Dimitar Pashov
Hi, I have a pet bug causing silent data corruption here: https://github.com/open-mpi/ompi/issues/965 which seems to have had a fix committed some time later. I've tested v1.10.1rc1 now and it still has the issue. I hope the fix makes it into the release. Cheers!

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Gilles Gouaillardet
Marcin, could you give v1.10.1rc1, which was released today, a try? It fixes a bug where hwloc was trying to bind outside the cpuset. Ralph and all, imho, there are several issues here: - if slurm allocates threads instead of cores, then the --oversubscribe mpirun option could be mandatory - with
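
For illustration only, a rough sketch of the general idea behind such a fix (one reading of it, not the actual OMPI patch): intersect whatever cpuset the mapper wants to bind to with the cpuset the cgroup actually allows before calling hwloc_set_cpubind, and refuse to bind if nothing is left.

#include <hwloc.h>
#include <stdio.h>

/* 'target' is a hypothetical cpuset computed by the mapper, e.g. both PUs of a core */
static int bind_within_allowed(hwloc_topology_t topo, hwloc_const_cpuset_t target)
{
    hwloc_const_cpuset_t allowed = hwloc_topology_get_allowed_cpuset(topo);

    hwloc_cpuset_t clamped = hwloc_bitmap_alloc();
    hwloc_bitmap_and(clamped, target, allowed);   /* keep only cpus we actually own */

    int rc = -1;
    if (!hwloc_bitmap_iszero(clamped))
        rc = hwloc_set_cpubind(topo, clamped, HWLOC_CPUBIND_PROCESS);
    else
        fprintf(stderr, "requested binding lies entirely outside the cpuset\n");

    hwloc_bitmap_free(clamped);
    return rc;
}

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* hypothetical target: both hardware threads of the first core */
    hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
    if (core)
        bind_within_allowed(topo, core->cpuset);

    hwloc_topology_destroy(topo);
    return 0;
}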

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread Ralph Castain
Thanks Marcin. Looking at this, I’m guessing that Slurm may be treating HTs as “cores” - i.e., as independent cpus. Any chance that is true? I’m wondering because bind-to core will attempt to bind your proc to both HTs on the core. For some reason, we thought that 8,24 were HTs on the same core,

[OMPI users] Open MPI v1.10.1rc1 release

2015-10-03 Thread Jeff Squyres (jsquyres)
Open MPI users -- We have just posted the first release candidate for the upcoming v1.10.1 bug fix release. We'd appreciate any testing and/or feedback that you may have on this release candidate: http://www.open-mpi.org/software/ompi/v1.10/ Thank you! Changes since v1.10.0: - Fix segv when invo

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-03 Thread marcin.krotkiewski
Hi, Ralph, I submit my slurm job as follows: salloc --ntasks=64 --mem-per-cpu=2G --time=1:0:0 Effectively, the allocated CPU cores are spread among many cluster nodes. SLURM uses cgroups to limit the CPU cores available to the MPI processes running on a given cluster node. Compute nodes are 2-so