[slurm-dev] RFC Perl 6 DRMAA bindings

2017-10-16 Thread Vittore Scolari
Hello fellow coders, you might be interested in the DRMAA bindings for Perl 6, which you can find here: https://github.com/scovit/Scheduler-DRMAA. It provides the full C-language API, and on top of that I implemented a quite expressive, easy, and experimental high-level API that you might want to use.
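For readers unfamiliar with the interface these bindings wrap, here is a minimal sketch of the standard DRMAA 1.0 C API underneath; the submitted command is a made-up placeholder and nothing here is taken from the Scheduler-DRMAA repository itself.

    /* Minimal sketch of the DRMAA 1.0 C API that the Perl 6 bindings wrap.
     * The remote command below is a hypothetical placeholder.
     * Build (roughly): cc drmaa_demo.c -ldrmaa */
    #include <stdio.h>
    #include "drmaa.h"

    int main(void)
    {
        char err[DRMAA_ERROR_STRING_BUFFER];
        char jobid[DRMAA_JOBNAME_BUFFER];
        drmaa_job_template_t *jt = NULL;

        if (drmaa_init(NULL, err, sizeof(err)) != DRMAA_ERRNO_SUCCESS) {
            fprintf(stderr, "drmaa_init: %s\n", err);
            return 1;
        }

        drmaa_allocate_job_template(&jt, err, sizeof(err));
        /* Placeholder command; a real job template would set more attributes. */
        drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "/bin/hostname",
                            err, sizeof(err));

        if (drmaa_run_job(jobid, sizeof(jobid), jt, err, sizeof(err))
            == DRMAA_ERRNO_SUCCESS)
            printf("submitted job %s\n", jobid);

        drmaa_delete_job_template(jt, err, sizeof(err));
        drmaa_exit(err, sizeof(err));
        return 0;
    }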

[slurm-dev] Running a process from a Slurm SPANK plugin

2017-10-16 Thread Jordi A. Gómez
Hello, I am developing a SPANK plugin which starts a process per node. This is a privileged process that performs some irrelevant things. We don't want the process to be closed at the end of the Slurm job's execution, but it seems that Slurm is closing this process after job completion. Do you kn
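A hedged sketch of the general shape of such a plugin follows, assuming the helper is launched from slurm_spank_init() in the remote (slurmstepd) context; the helper path is a placeholder, and note that with proctrack/cgroup the daemonized child may additionally need to be moved out of the job's cgroup to survive job completion, which this sketch does not attempt.

    /* Sketch of a SPANK plugin that launches a per-node helper process.
     * Assumptions: /usr/local/sbin/node-helper is a placeholder path, and
     * no attempt is made to escape the job's cgroup (with proctrack/cgroup
     * slurmstepd may still clean up anything left inside it at job end). */
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <slurm/spank.h>

    SPANK_PLUGIN(node_helper, 1);

    int slurm_spank_init(spank_t sp, int ac, char **av)
    {
        /* Only act on the compute-node side, not inside srun/sbatch. */
        if (!spank_remote(sp))
            return ESPANK_SUCCESS;

        pid_t pid = fork();
        if (pid < 0)
            return ESPANK_ERROR;

        if (pid == 0) {
            /* Child: detach from the step's session so it is not signalled
             * with the step; double-fork so init adopts the helper. */
            setsid();
            if (fork() != 0)
                _exit(0);
            execl("/usr/local/sbin/node-helper", "node-helper", (char *)NULL);
            _exit(127);   /* exec failed */
        }

        /* Reap the intermediate child, which exits immediately. */
        waitpid(pid, NULL, 0);
        slurm_info("node_helper: launched helper on this node");
        return ESPANK_SUCCESS;
    }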

[slurm-dev] Re: Finding job command after fails

2017-10-16 Thread Merlin Hartley
You could also use a simple epilog script to save the output of ‘scontrol show job’ to a file/database. M -- Merlin Hartley Computer Officer MRC Mitochondrial Biology Unit Cambridge, CB2 0XY United Kingdom > On 15 Oct 2017, at 20:49, Ryan Richholt wrote: > > Is there any way to get the job c

[slurm-dev] Re: Tasks distribution

2017-10-16 Thread Sysadmin CAOS
Hi, after reading your message about whether OMPI 1.8 supports PMIx, I have compiled OMPI 3.0.0. To summarize again, here is everything I have done: 1. First of all, I compiled the contrib "pmi2" package located inside the SLURM 17.02.7 folder: cd contribs/pmi2 && make && make install --

[slurm-dev] Re: Slurm API thread safety and concurrency

2017-10-16 Thread Moe Jette
All of the calls are thread safe. On October 15, 2017 11:37:55 AM MDT, "Frank Ramirez, Alvaro" wrote: >Hi all, > > >New user but long-time user and plugin developer here. > > >I am currently porting an automated set of virtualization triggers for >slurm to C/C++. > > >As virtualization steps can
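As an illustration of concurrent use (not taken from the original thread), a small pthread sketch that queries the job list from several threads at once might look like the following; it assumes libslurm headers are installed and the program is linked with -lslurm -lpthread.

    /* Sketch: several threads calling slurm_load_jobs() concurrently.
     * Build (roughly): cc jobs_mt.c -lslurm -lpthread */
    #include <stdio.h>
    #include <pthread.h>
    #include <slurm/slurm.h>
    #include <slurm/slurm_errno.h>

    static void *query_jobs(void *arg)
    {
        (void)arg;
        job_info_msg_t *jobs = NULL;

        /* Ask slurmctld for all jobs; each thread gets its own message buffer. */
        if (slurm_load_jobs((time_t)0, &jobs, SHOW_ALL) == SLURM_SUCCESS) {
            printf("thread %lu saw %u jobs\n",
                   (unsigned long)pthread_self(), jobs->record_count);
            slurm_free_job_info_msg(jobs);
        } else {
            slurm_perror("slurm_load_jobs");
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[4];

        for (int i = 0; i < 4; i++)
            pthread_create(&tid[i], NULL, query_jobs, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }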

[slurm-dev] Re: Tasks distribution

2017-10-16 Thread Jeffrey T Frey
> If, now, I submit with "sbatch --distribution=cyclic -N 2 -n 12 > ./test-new.sh", what I get is: > Process 0 on clus01.hpc.local out of 12 > Process 1 on clus02.hpc.local out of 12 > Process 2 on clus01.hpc.local out of 12 > Process 3 on clus01.hpc.local out of 12 > Process 4 on clus01.hpc.loca
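The quoted output is the classic MPI "hello" pattern; assuming test-new.sh wraps an MPI program along these lines (a guess, not the poster's actual code), each rank prints its hostname and the total task count, so the hostnames show how ranks were placed across the two nodes.

    /* A guess at the kind of MPI program behind test-new.sh: each rank
     * reports its hostname and the total number of ranks. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &len);

        printf("Process %d on %s out of %d\n", rank, name, size);

        MPI_Finalize();
        return 0;
    }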

[slurm-dev] Re: Finding job command after fails

2017-10-16 Thread Ryan Richholt
Thanks, that sounds like a good idea. A prolog script could also handle this, right? That way, if the node crashes while the job is running, it would still be saved. On Mon, Oct 16, 2017 at 3:20 AM Merlin Hartley <merlin-sl...@mrc-mbu.cam.ac.uk> wrote: > You could also use a simple epilog script t

[slurm-dev] spank plugin to redirect execution to a different node

2017-10-16 Thread Nicolas Bock
Hi, I have a compute node (CN) that is not running slurmd and is not directly managed by Slurm, but that I would like to use via Slurm. Very crudely speaking, I imagine that I could use an existing slurmd and a SPANK plugin to rewrite the IP address of an allocation such that a task is exe

[slurm-dev] Re: Tasks distribution

2017-10-16 Thread sysadmin.caos
If I run with "--ntasks-per-node=6", the result is: Process 0 on clus01.hpc.local out of 12 Process 1 on clus02.hpc.local out of 12 Process 2 on clus01.hpc.local out of 12 Process 3 on clus02.hpc.local out of 12 Process 4 on clus01.hpc.local out of 12 Process 5 on clu