lly you get the f. out of here twice a day so that
my jobs can start running.
Hahaha!!!
From: slurm-users on behalf of John Hearns
Sent: Monday, July 2, 2018 12:37:13 PM
To: Slurm User Community List
Subject: Re: [slurm-users] All user's jobs killed at the same time on all nodes
S  21:46 0:00 -bash
> moha 194080 0.0 0.0 151060 1820 pts/4 R+ 21:52 0:00 ps aux
> moha 194081 0.0 0.0 112664  972 pts/4 S+ 21:52 0:00 grep --color=auto moha
>
>
> ________________________
> From: slurm-users on behalf of Thomas M. Payerle
From: slurm-users on behalf of Thomas M. Payerle
Sent: Friday, June 29, 2018 7:34:09 PM
To: Slurm User Community List
Subject: Re: [slurm-users] All user's jobs killed at the same time on all nodes
A couple of comments/possible suggestions.
First,
/exec/gnu_M/charmm < newphcnl99a0.inp > newphcnl99a0.out
>>
>> so they are all independent mpiruns... if one of them is killed, why
>> would all others go down as well?
>>
>>
>> That would make sense if a single mpirun is running 36 tasks... but the
>> user is not doing this.
>
> ________________
> From: slurm-users on behalf of John Hearns
> Sent: Friday, June 29, 2018
> To: Slurm User Community List
> Subject: Re: [slurm-users] All user's jobs killed at the same time on all nodes
> Matteo, a stupid question, but if these are single CPU jobs why is mpirun
> being used?
> Is your user using these 36 jobs to construct a parallel job to run charmm?
> If the mpirun is killed, yes all the other processes which are started by it
> on the other compute nodes will be killed.
Hi Matteo,
On Fri, Jun 29, 2018 at 10:13:33AM +, Matteo Guglielmi wrote:
> Dear community,
>
> I have a user who usually submits 36 (identical) jobs at a time using a
> simple for loop, so the jobs are all sbatched at the same time.
>
> Each job requests a single core and all jobs are independent from one another
> (they read different input files and write to different output files).
Matteo, a stupid question but if these are single CPU jobs why is mpirun
being used?
Is your user using these 36 jobs to construct a parallel job to run charmm?
If the mpirun is killed, yes all the other processes which are started by
it on the other compute nodes will be killed.
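To make that concrete, one of 36 genuinely independent single-core jobs might
look something like this (the charmm path and file names are only placeholders,
not taken from the user's actual scripts): charmm is started directly, with no
mpirun, so killing this one job cannot take any of the other 35 down with it.

    #!/bin/bash
    #SBATCH --job-name=charmm_00
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1

    # Placeholder path; only ".../exec/gnu_M/charmm" is visible in this thread.
    CHARMM_BIN=/path/to/exec/gnu_M/charmm

    # charmm runs as a direct child of this job's shell, not of a shared
    # mpirun, so no other job's processes depend on it.
    "$CHARMM_BIN" < newphcnl99a0.inp > newphcnl99a0.out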
I suspect your u
Dear community,
I have a user who usually submits 36 (identical) jobs at a time using a simple
for loop, so the jobs are all sbatched at the same time.
Each job requests a single core and all jobs are independent from one another
(they read different input files and write to different output files).
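Roughly, the loop is of this form (a sketch only: the charmm path and the
newphcnl99aN.inp naming are placeholders, based on the one file name quoted
elsewhere in this thread):

    #!/bin/bash
    # Submit 36 independent single-core jobs, each with its own input/output.
    CHARMM_BIN=/path/to/exec/gnu_M/charmm   # placeholder path

    for i in $(seq 0 35); do
        sbatch --job-name="charmm_${i}" \
               --ntasks=1 --cpus-per-task=1 \
               --wrap="${CHARMM_BIN} < newphcnl99a${i}.inp > newphcnl99a${i}.out"
    done

Each sbatch call creates a separate job, so Slurm treats the 36 runs as
unrelated allocations.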
Jobs