[slurm-dev] question within SLURM

2017-10-24 Thread Rajiv Nishtala
Hi, This is the first LARGE project I'm dealing with; it is exciting but also a bit complex to browse through, even with cscope and ctags. I'm trying to play with the part of the code that is responsible for killing a job if it exceeds a memory limit, for instance via cgroups or so. If I und

[slurm-dev] Re: question within SLURM

2017-10-24 Thread Chris Samuel
On Tuesday, 24 October 2017 7:51:23 PM AEDT Rajiv Nishtala wrote: > I'm trying to play with the part of the code that is responsible for killing > a job if it exceeds a memory limit, for instance via cgroups or so. With cgroups it is the Linux kernel, not Slurm, that is responsible for killing

[slurm-dev] SLURM 17.02.8 not optimally scheduling jobs/utilizing resources

2017-10-24 Thread Sean Caron
Hi all, I recently performed a forklift upgrade of SLURM at my site from version 14.10.X to the most recent available, 17.02.8. Since there were so many versions skipped, we didn't try to do a phased series of upgrades, rather, we just used sacctmgr show commands to dump users, accounts, etc, then

[slurm-dev] Selecting a network interface with srun

2017-10-24 Thread Sebastian Eastham
Dear Slurm Developers mailing list, When calling the “srun” command, is there any way to specify the desired network interface? Our network is a mix of Ethernet and InfiniBand, such that only a subset of the nodes have an InfiniBand interface. When using “mpirun” we can specify “-iface ib0”, bu

[slurm-dev] Slurm version 17.11.0-0rc1 is now available

2017-10-24 Thread Moe Jette
We are pleased to announce the availability of Slurm version 17.11.0-0rc1 (release candidate 1). Production release of version 17.11 is expected in November. Interested parties are invited to test this pre-release. Slurm can be downloaded from https://www.schedmd.com/downloads.php Major changes

[slurm-dev] Re: Selecting a network interface with srun

2017-10-24 Thread Doug Meyer
Hi, I believe that if you are using OpenMPI you can declare the interface. By default it should select the fastest interface on the system. This FAQ may be of help. https://www.open-mpi.org/faq/?category=tcp If I have misunderstood the problem I apologize. best of luck! Doug On Tue, Oct 24,

[slurm-dev] about openmpi under slurm

2017-10-24 Thread 黄旸
Dear sir: I recently built a parallel computing server in my lab. I use Slurm as the resource manager and Open MPI to run my own parallel software, but problems keep arising. I can't launch my jobs on the compute nodes. I'm sure there is no problem with my Open MPI build, because I can

[slurm-dev] Re: Selecting a network interface with srun

2017-10-24 Thread r...@open-mpi.org
“ibface” isn’t an OpenMPI cmd line option, so I suspect you are using something other than OpenMPI. For OMPI, you could specify the interface via MCA param in the environment or default MCA parameter file. Most MPI implementations have a similar mechanism - you might check your documentation.

[slurm-dev] Re: Selecting a network interface with srun

2017-10-24 Thread Paul Hargrove
The "-iface ib0" syntax is used by the hydra launcher of MPICH and its derivatives such as MVAPICH. I suggest http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager as a starting point. -Paul On Tue, Oct 24, 2017 at 8:19 PM, r...@open-mpi.org wrote: > “ibface” isn’t an OpenMPI

[slurm-dev] Re: Selecting a network interface with srun

2017-10-24 Thread Gilles Gouaillardet
fwiw, with Open MPI, ib0 can be selected with export OMPI_MCA_btl_openib_if_include=ib0 assuming slurm was not configured not to export this environment variable Gilles On 10/25/2017 12:55 PM, Paul Hargrove wrote: Re: [slurm-dev] Re: Selecting a network interface with srun The "-iface ib0"