On 10/24/2017 09:30 PM, Sean Caron wrote:
(...)
>
> Here's some config relating to the scheduler from our slurm.conf that
> might be germane:
> -
>
> SchedulerType=sched/backfill
On 10/25/2017 01:52 PM, Holger Naundorf wrote:
I'd really appreciate any help the SLURM wizards can provide! We suspect
it's something to do with how we've set up QoS, or maybe we need to
tweak the scheduler configuration in 17.02.8; however, there's no single
clear path forward. Just let me know
Hello,
I would like to apply a calendar in SLURM, something like the SGE calendar
for enabling and disabling queues automatically. I know I could create
some "cron" jobs, but I'm looking for something "smart".
Thanks.
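Lacking a built-in calendar, one alternative to cron that stays inside Slurm is a recurring reservation, which walls jobs out of a partition during a repeating window. A sketch only; the reservation name, times, and partition below are all hypothetical:

```
# Reserve the "gpu" partition every day from 20:00 for 10 hours, which
# effectively disables that queue overnight; only root may use the window.
scontrol create reservation ReservationName=nightly_close \
    StartTime=20:00:00 Duration=10:00:00 Flags=DAILY,MAINT \
    Users=root PartitionName=gpu
```

Deleting the reservation (scontrol delete reservation ReservationName=nightly_close) reopens the queue.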
Hi Sebastian,
Another solution could be to change the configuration of nodes in slurm.conf,
making use of NodeName and NodeHostname (and NodeAddr if needed) :
“
NodeName
Name that Slurm uses to refer to a node[...]. Typically this would be the
string that "/bin/hostname -s" returns.[...]It may
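A sketch of what that looks like in slurm.conf; every name and address below is made up. Slurm's name for the node, the OS hostname, and the address used for communication can all differ:

```
# Hypothetical entry: Slurm calls the node "node001", the OS hostname is
# "n001", and slurmctld contacts it via the IPoIB address 10.10.0.1.
NodeName=node001 NodeHostname=n001 NodeAddr=10.10.0.1 CPUs=16 State=UNKNOWN
```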
On 10/25/2017 06:58 AM, Ole Holm Nielsen wrote:
I agree that the backfill scheduler requires configuration beyond the
default settings! This surprised me as well. I wrote some notes in my
Wiki which could be used as a starting point:
https://wiki.fysik.dtu.dk/niflheim/Slurm_scheduler#backf
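For orientation only (the values are illustrative, not recommendations; check `man slurm.conf` for your version), the knobs in question live in SchedulerParameters:

```
SchedulerType=sched/backfill
# bf_window: how far ahead (minutes) backfill plans; bf_resolution: planning
# granularity (seconds); bf_continue: resume scanning after releasing locks.
SchedulerParameters=bf_window=11520,bf_resolution=300,bf_max_job_user=50,bf_continue
```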
When using “mpirun” we can specify “-iface ib0”. This is true, and the
exact syntax depends on your MPI of choice, as noted above.
However, don't get confused between IPoIB and InfiniBand itself. IPoIB
is, of course, sending IP traffic over InfiniBand.
An Infiniband network can perfectly happil
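To make the earlier point concrete, the flag differs per MPI implementation; assuming ib0 is the IPoIB interface and my_app is a placeholder binary:

```shell
# MPICH (Hydra launcher): force TCP traffic onto ib0
mpirun -iface ib0 ./my_app

# Open MPI: restrict the TCP BTL to ib0
mpirun --mca btl_tcp_if_include ib0 ./my_app
```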
Good points. I would also caution against giving nodes Slurm names that differ
from their actual hostnames. This frequently causes failures in third-party
software packages that compare the return value of “hostname” to the list of
allocated nodes for optimization or placement purposes - e.g., mpirun! A quick
grep of the mailing l
Ralph, indeed.
As I have said before, here is my one piece of advice to everyone
managing batch systems: it is a name resolution problem. No, really, it is.
Even if your cluster catches fire, the real reason that your jobs are not
being submitted is that the DNS resolver is burning and the sch
I think this is probably the best advice. I've personally never run
into a set of circumstances where MPI didn't pick the right interface
where there wasn't a major misconfiguration (you'd of course want to
test that on your system), but users might w
Thanks for the suggestions, everyone. I will certainly be looking at
SchedulerParameters to see if I can optimize that a little further with
contemporary guidance on best practices.
It looks like what got us going well enough, for now, was just to bump the
Priority associated with each QoS class t
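For anyone following along, bumping a QoS priority is a one-liner with sacctmgr; the QoS name "high" and the value here are examples only:

```
# Pick a value consistent with the rest of your QoS priority ladder.
sacctmgr modify qos high set Priority=100
```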
Hi All,
I've now been asked twice in two days if there is any way to intelligently
name Slurm output files.
Sometimes our users will do something like
for fasta_file in `ls /path/to/fasta_files`; do sbatch myscript.sbatch
$fasta_file; done
They would like their output and error files to be like
Hi Lachlan,
On Thu, Oct 26, 2017 at 10:10 AM, Lachlan Musicman wrote:
> for fasta_file in `ls /path/to/fasta_files`; do sbatch myscript.sbatch
> $fasta_file; done
>
> They would like their output and error files to be like:
>
> --output=$fasta_file.out
> --error=$fasta_file.err
I analyze a lo
Why can't you just do
for fasta_file in `ls /path/to/fasta_files`; do sbatch
--output=$fasta_file.out --error=$fasta_file.err myscript.sbatch
$fasta_file; done
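A dry-run sketch of that loop, with two small changes: a glob instead of parsing `ls` (safer with odd filenames), and basename so the .out/.err files don't carry the directory path. `sbatch` is stubbed with a shell function here so the sketch runs anywhere; drop the stub on a real cluster. The temp directory and sample file names are made up.

```shell
dir=$(mktemp -d)
touch "$dir/a.fasta" "$dir/b.fasta"   # stand-ins for real FASTA files
sbatch() { echo "sbatch $*"; }        # stub: print the command instead of submitting
submitted=""
for fasta_file in "$dir"/*.fasta; do
    name=$(basename "$fasta_file")
    submitted="$submitted$(sbatch --output="$name.out" --error="$name.err" myscript.sbatch "$fasta_file")
"
done
printf '%s' "$submitted"
rm -rf "$dir"
```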
On Wed, Oct 25, 2017 at 7:08 PM, Lachlan Musicman wrote:
> Hi All,
>
> I've now been asked twice in two days if there is any way to int
Hi,
We are running slurm 17.02.7
For some reason, we are observing that the preferred CPUs defined in gres.conf
for GPU devices are being ignored when running jobs. That is, in our gres.conf
we have gpu resource lines, such as:
Name=gpu Type=kepler File=/dev/nvidia0
CPUs=0,1,2,3,4,5,6,7,16,1
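One thing worth checking (an assumption, not a confirmed diagnosis for this report): the CPU bindings in gres.conf can only influence placement when the node selector is doing core-level scheduling, along the lines of:

```
# Illustrative slurm.conf excerpt: core-aware selection is a prerequisite
# for gres.conf CPU affinities to matter.
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
```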
Hello.
I have a remote license configured in slurm and I'd like to limit the
number of licenses each user can use.
The license shows up correctly in both "scontrol show lic", "sacctmgr
show tres" and "sacctmgr show resource", but attempting to use QoS to
set a limit on the number of license
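Since licenses are tracked as TRES, the per-user cap is usually expressed on the QoS via MaxTRESPerUser; assuming a license named "foo" and a cap of 2 (whether 17.02 honors this is exactly the question here):

```
sacctmgr modify qos normal set MaxTRESPerUser=license/foo=2
```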
On 26 October 2017 at 13:27, Alex Chekholko wrote:
> Why can't you just do
>
> for fasta_file in `ls /path/to/fasta_files`; do sbatch
> --output=$fasta_file.out --error=$fasta_file.err myscript.sbatch
> $fasta_file; done
>
Because it was staring me in the face and I ignored it. Thank you.
che