On 21/03/18 19:09, Daniel Grimwood wrote:
Hi Chris,
Hiya!
Thanks for that. I had overlooked SBATCH_EXPORT as I interpreted the
man page literally, as "Same as --export". It's actually "Same as
--export without setting SLURM_EXPORT_ENV=NONE". That's great!
My pleasure. :-)
Now all we need for completeness is a SRUN_EXPORT that works the same...
On 22/03/18 01:43, sysadmin.caos wrote:
I'm trying to compile SLURM-17.02.7 with "lua" support by executing
"./configure && make && make contribs && make install", but make does
nothing in src/plugins/job_submit/lua and I don't know why...
How do I have to compile that plugin? The rest of the plugins compile with no problems.
On 22/03/18 00:09, Ole Holm Nielsen wrote:
Chris, I don't understand what you refer to as "that"? Someone must
have created /etc/pam.d/slurm.* files, and it doesn't seem to be the
Slurm RPMs.
Sorry Ole, just meant that PAM automates reading those files for you
if you create them (and the code
On Wednesday, 21 March 2018, at 20:14:22 (+0100),
Ole Holm Nielsen wrote:
> Thanks for your friendly advice! I keep forgetting about Systemd
> details, and your suggestions are really detailed and useful for
> others! Do you mind if I add your advice to my Slurm Wiki page?
Of course not! Espec
Hi Michael,
Thanks for your friendly advice! I keep forgetting about Systemd
details, and your suggestions are really detailed and useful for others!
Do you mind if I add your advice to my Slurm Wiki page?
/Ole
On 21-03-2018 16:29, Michael Jennings wrote:
On Wednesday, 21 March 2018, at 1
PS: we're using Slurm 17.11.5
On 21.03.2018 at 16:18, Henkel, Andreas <hen...@uni-mainz.de> wrote:
Hi,
recently, while trying a new configuration I came across a problem. In
principle, we have one big partition containing all nodes with PriorityTier=2.
Each account got GrpTRESRunMin=
On Wednesday, 21 March 2018, at 12:08:00 (+0100),
Ole Holm Nielsen wrote:
> One working solution is to modify the slurmd Systemd service file
> /usr/lib/systemd/system/slurmd.service to add a line:
> LimitCORE=0
This is a bit off-topic, but I see this a lot, so I thought I'd
provide a friendly
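For reference, a minimal sketch of the systemd drop-in way of applying the same LimitCORE=0 setting without editing the packaged unit under /usr/lib (assumes slurmd is managed by systemd; paths and the override file name are illustrative):
  # systemctl edit slurmd
  # creates e.g. /etc/systemd/system/slurmd.service.d/override.conf containing:
  [Service]
  LimitCORE=0
  # then pick up the change:
  systemctl daemon-reload
  systemctl restart slurmd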
On Wednesday, 21 March 2018, at 08:40:32 (-0600),
Ryan Cox wrote:
> UsePAM has to do with how jobs are launched when controlled by
> Slurm. Basically, it sends jobs launched under Slurm through the
> PAM stack. UsePAM is not required by pam_slurm_adopt because it is
> *sshd* and not *slurmd or slurmstepd* that is involved with pam_slurm_adopt.
Hi,
recently, while trying a new configuration I came across a problem. In
principle, we have one big partition containing all nodes with PriorityTier=2.
Each account got GrpTRESRunMin=cpu=<#somelimit> set. Every now and then we have
the situation that part of the nodes are idling. For this we
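For context, a limit of that kind is normally set per association with sacctmgr; a hypothetical example (the account name and the value are placeholders, not taken from the thread):
  sacctmgr modify account name=somegroup set GrpTRESRunMin=cpu=5000000
  sacctmgr show assoc account=somegroup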
On Wednesday, 21 March 2018, at 12:05:49 (+0100),
Alexis Huxley wrote:
> > >Depending on the load on the scheduler, this can be slow. Is there
> > >faster way? Perhaps one that doesn't involve communicating with
> > >the scheduler node? Thanks!
>
> Thanks for the suggestion Ole, but we have something in place that we don't want to change at this time.
I'm trying to compile SLURM-17.02.7 with "lua" support by executing
"./configure && make && make contribs && make install", but make does
nothing in src/plugins/job_submit/lua and I don't know why...
How do I have to compile that plugin? The rest of the plugins compile
with no problems (defaults,
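In case it helps, the job_submit/lua plugin is normally built only if ./configure finds the lua development files, so that is worth checking first; a rough sketch (the package name varies by distribution):
  # e.g. on RHEL/CentOS: yum install lua-devel   (Debian/Ubuntu: liblua5.x-dev)
  ./configure 2>&1 | grep -i lua       # should report that lua was found
  grep -i lua config.log               # same information, after the fact
  make -C src/plugins/job_submit/lua   # should now actually build the plugin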
Ole,
UsePAM has to do with how jobs are launched when controlled by Slurm.
Basically, it sends jobs launched under Slurm through the PAM stack.
UsePAM is not required by pam_slurm_adopt because it is *sshd* and not
*slurmd or slurmstepd* that is involved with pam_slurm_adopt. That's
what I
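To make that concrete, a minimal sketch of where pam_slurm_adopt usually sits, i.e. in the sshd PAM stack rather than in any slurm.* file (its placement relative to the other account modules is site-specific, so treat this as illustrative only):
  # /etc/pam.d/sshd (fragment; typically placed after the other account entries)
  account    required      pam_slurm_adopt.so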
On 03/21/2018 02:03 PM, Bill Barth wrote:
I don’t think we had to do anything special since we have UsePAM = 1 in our
slurm.conf. I didn’t do the install personally, but our pam.d/slurm* files are
written by us and installed by our configuration management system. Not sure
which one UsePAM loo
On 03/21/2018 01:57 PM, Chris Samuel wrote:
On Wednesday, 21 March 2018 11:49:53 PM AEDT Ole Holm Nielsen wrote:
However, there are no /etc/pam.d/slurm.* files on our system (running
Slurm 17.02). Did TACC create a special Slurm PAM configuration file,
and is this documented in the public domain?
Ole,
I don’t think we had to do anything special since we have UsePAM = 1 in our
slurm.conf. I didn’t do the install personally, but our pam.d/slurm* files are
written by us and installed by our configuration management system. Not sure
which one UsePAM looks for, but here are ours:
c501-101[s
On Wednesday, 21 March 2018 11:49:53 PM AEDT Ole Holm Nielsen wrote:
> However, there are no /etc/pam.d/slurm.* files on our system (running
> Slurm 17.02). Did TACC create a special Slurm PAM configuration file,
> and is this documented in the public domain?
I think that's just how PAM works.
On 03/21/2018 01:08 PM, Bill Barth wrote:
You could set /etc/security/limits.conf on every node to contain something like
(check my syntax):
* soft core 0
* hard core 0
Nice suggestion; however, processes spawned by slurmd don't read the
/etc/security/limits.conf file.
And make sure th
Hi Gareth,
Thanks for the suggestion. This does seem like a good way forward. I
will look into it.
Regards,
Emyr
On 21/03/2018 13:34, gareth.willi...@csiro.au wrote:
Hi Emyr,
Perhaps you could be more explicit about the i/o boundedness and have jobs
request an io gres as well as compute
You could set /etc/security/limits.conf on every node to contain something like
(check my syntax):
* soft core 0
* hard core 0
And make sure that /etc/pam.d/slurm.* and /etc/pam.d/system-auth* contain:
session required pam_limits.so
session required pam_limits.so
…so that li
> */2 * * * * scontrol --oneliner show node > /cluster/var/node-info.new
> 2>/dev/null && mv -f /cluster/var/node-info.new /cluster/var/node-info
> 2>/dev/null
>
> So, every 2nd minute, the /cluster/var/node-info file is updated (if the
> scontrol command succeeds), and the nodes simply grep in that file.
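A hypothetical example of the lookup a node health script could then do against that cached file (assumes the Slurm NodeName matches `uname -n`, as in the sinfo example earlier in the thread):
  grep "NodeName=$(uname -n) " /cluster/var/node-info | grep -o 'State=[^ ]*'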
Alexis Huxley writes:
>> >Depending on the load on the scheduler, this can be slow. Is there
>> >faster way? Perhaps one that doesn't involve communicating with
>> >the scheduler node? Thanks!
>
> Thanks for the suggestion Ole, but we have something in place that
> we don't want to change at this
On Wednesday, 21 March 2018 10:08:00 PM AEDT Ole Holm Nielsen wrote:
> Thanks for sharing any experiences.
Would:
echo "ulimit -c unlimited"
in the taskprolog work?
I guess it assumes a Bourne shell..
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
We experience problems with MPI jobs dumping lots (1 per MPI task) of
multi-GB core dump files, causing problems for file servers and compute
nodes.
The user has "ulimit -c 0" in his .bashrc file, but that's ignored when
slurmd starts the job, and the slurmd process limits are employed instead.
> >Depending on the load on the scheduler, this can be slow. Is there
> >faster way? Perhaps one that doesn't involve communicating with
> >the scheduler node? Thanks!
Thanks for the suggestion Ole, but we have something in place that
we don't want to change at this time. We just need a faster way
On Wednesday, 21 March 2018 9:07:08 PM AEDT sysadmin.caos wrote:
> What I want to get is a batch partition that doesn't allow "srun" commands
> from the command line and an interactive partition only for "srun" commands.
You might well be able to do this with a lua submit filter, testing for the
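For illustration only, a rough, untested sketch of such a job_submit.lua test; the idea is that interactive srun submissions carry no batch script, so checking job_desc.script is one way to tell the two apart (the partition names are taken from the original question, everything else is an assumption):
  -- job_submit.lua (sketch)
  function slurm_job_submit(job_desc, part_list, submit_uid)
     local interactive = (job_desc.script == nil or job_desc.script == '')
     if interactive and job_desc.partition == "batch.q" then
        slurm.log_user("interactive (srun) jobs are not allowed in batch.q")
        return slurm.ERROR
     end
     if (not interactive) and job_desc.partition == "interactive.q" then
        slurm.log_user("batch jobs are not allowed in interactive.q")
        return slurm.ERROR
     end
     return slurm.SUCCESS
  end

  function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
     return slurm.SUCCESS
  end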
On 03/21/2018 11:18 AM, Alexis Huxley wrote:
I'm running a node health script that needs to know the state of
the node on which it is running. Currently, I'm getting the
state with this:
sinfo -N ... | grep `uname -n`
Depending on the load on the scheduler, this can be slow. Is there a
faster way?
Hi Emyr,
Perhaps you could be more explicit about the i/o boundedness and have jobs
request an io gres as well as compute and memory resource. You could then set
the amount of io resource per node (and maybe globally - possibly separate
iolocal and ioglobal). Then you could avoid io contention
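A hypothetical sketch of what a count-only GRES along those lines could look like (the name "io" and the counts are made up for illustration; depending on the Slurm version a matching gres.conf entry may also be needed):
  # slurm.conf (fragment)
  GresTypes=io
  NodeName=node[001-100] Gres=io:8 ...
  # job submission requesting part of the io resource
  sbatch --gres=io:2 job.sh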
I'm running a node health script that needs to know the state of
the node on which it is running. Currently, I'm getting the
state with this:
sinfo -N ... | grep `uname -n`
Depending on the load on the scheduler, this can be slow. Is there a
faster way? Perhaps one that doesn't involve communicating with the scheduler node? Thanks!
Hello,
I would like to configure SLURM with two partitions:
one called "batch.q" only for batchs jobs
one called "interactive.q" only for batch jobs
What I want to get is a batch partition that doesn't allow "srun"
commands from the command line and
Hi Chris,
Thanks for that. I had overlooked SBATCH_EXPORT as I interpreted the man
page literally, as "Same as --export". It's actually "Same as --export
without setting SLURM_EXPORT_ENV=NONE". That's great!
Now all we need for completeness is a SRUN_EXPORT that works the same,
although SBATCH
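To make the difference concrete, a small illustration based on the man page wording quoted above (job.sh is just a placeholder):
  # Using the command-line flag: the job also gets SLURM_EXPORT_ENV=NONE,
  # so srun steps inside the script start from a restricted environment too.
  sbatch --export=NONE job.sh

  # Using the input environment variable instead: the submission environment is
  # filtered the same way, but SLURM_EXPORT_ENV=NONE is not set in the job.
  export SBATCH_EXPORT=NONE
  sbatch job.sh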