I got this email to my Gmail account, and I don't understand why my Gmail
account would register any bounces at all. Am I still unsubscribed?
cheers
L.
---------- Forwarded message ---------
From:
Date: Wed, 30 Jan 2019 at 02:44
Subject: confirm e3c10e8d4f2f35ab689c7a4a88e5e2b57931da79
To:
Your m
On Wed, 24 Oct 2018 at 22:56, Zohar Roe MLM wrote:
> Hello,
>
> I have a node that for some reason changes state to "Down" every few
> minutes.
>
> When I change it with scontrol to "resume" it is OK until it goes Down again.
>
> In the Slurm server log I can see the error:
>
> "agent/is_node_resp: node:myName
On Mon, 15 Oct 2018 at 17:59, Bjørn-Helge Mevik
wrote:
> Lachlan Musicman writes:
>
> > There's one thing that no one seems to have mentioned - I think you will
> > need to list it as an AllocNode in the Partition that you want it to be
> > able to allocate jobs to.
On Fri, 12 Oct 2018 at 17:02, Aravindh Sampathkumar
wrote:
> @Chris and @Lachlan,
> Thanks for your responses.
>
> I resolved the issue based on a hint from Jeffrey in an earlier email. I
> tweaked the location of the PID files in the Slurm config files, but missed
> changing them in the systemd service defi
There's one thing that no one seems to have mentioned - I think you will
need to list it as an AllocNode in the Partition that you want it to be
able to allocate jobs to.
https://slurm.schedmd.com/slurm.conf.html#OPT_AllocNodes
E.g. in my conf we have one partition that looks like
PartitionName=re
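The conf excerpt above is cut off in the archive; purely as an illustration (the partition name, node list and submit hosts here are hypothetical, not the real conf), a partition line using AllocNodes can look like:
# hypothetical slurm.conf line; AllocNodes lists the hosts allowed to allocate jobs in this partition
PartitionName=restricted Nodes=node[01-04] AllocNodes=login01,login02 MaxTime=7-00:00:00 State=UP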
1. After systemctl restart slurmdbd , what does journalctl -xe say?
2. Your email is very hard to read, because it was posted in HTML with
terminal colours and so on. Could you send the next email in plain text, please?
Cheers
L.
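A minimal sketch of the commands meant in point 1 (the -u filter simply narrows the journal to the slurmdbd unit):
systemctl restart slurmdbd
journalctl -xe -u slurmdbd    # recent journal entries for the unit, with explanatory context
systemctl status slurmdbd     # current state plus the last few log lines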
On Fri, 12 Oct 2018 at 08:02, Aravindh Sampathkumar
wrote:
> Hello.
>
> I'
On 29 July 2018 at 04:32, Felix Wolfheimer
wrote:
> I'm experimenting with SLURM Elastic Compute on a cloud platform. I'm
> facing the following situation: Let's say, SLURM requests that a compute
> instance is started. The ResumeProgram tries to create the instance, but
> doesn't succeed because
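For context, the power-saving hooks being discussed live in slurm.conf; a minimal sketch, with hypothetical script paths and timings (not the poster's actual setup):
# ResumeProgram/SuspendProgram are site-provided scripts that start and stop cloud instances
ResumeProgram=/usr/local/sbin/start_cloud_node.sh
SuspendProgram=/usr/local/sbin/stop_cloud_node.sh
# if a resumed node is not up within ResumeTimeout seconds, Slurm marks it DOWN
ResumeTimeout=600
SuspendTime=300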
On 27 July 2018 at 03:13, Michael Robbert wrote:
> The line that you list from your slurm.conf shows the "course" partition
> being set as the default partition, but on our system the sinfo command
> shows our default partition with a * at the end and your output doesn't
> show that so I'm wonder
On 31 May 2018 at 17:00, Ole Holm Nielsen
wrote:
> Hi Lachlan,
>
> Slurm upgrades on CentOS 7.5 should run without problems. It seems to me
> that your problems are unrelated to the Slurm RPMs. FWIW, I documented the
> Munge and Slurm installation as well as upgrade process in my Wiki page
> ht
After last night's announcement, I decided to start the upgrade process.
Build went fine - once I worked out where munge went - and installation
also seemed fine.
slurmctld won't restart though.
In the logs I'm seeing:
[2018-05-31T15:20:50.810] debug: Munge encode failed: Failed to access
"xxx
On 31 May 2018 at 10:23, Lachlan Musicman wrote:
> Hola
>
> According to the documentation, the slurm munge rpm will be built if the
> munge libraries are installed.
>
> In CentOS 7.4 I have munge, munge-devel and munge-libs installed via yum.
> The libs are in /usr/lib64,
Hola
According to the documentation, the slurm munge rpm will be built if the
munge libraries are installed.
In CentOS 7.4 I have munge, munge-devel and munge-libs installed via yum.
The libs are in /usr/lib64, the bins are in /usr/bin, the daemon is in
/usr/sbin.
Neither machine on which I run `
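For reference, the documented route to those RPMs is rpmbuild against the release tarball; a hedged sketch (the exact package set may vary by distro):
yum install rpm-build munge munge-devel munge-libs
# the munge-dependent pieces are only built if the munge development files are found
rpmbuild -ta slurm-*.tar.bz2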
On 28 May 2018 at 20:24, Loris Bennett wrote:
> Hi Ronan,
>
> "Buckley, Ronan" writes:
>
> > Hi All,
> >
> > I am unable to get confirmation from the SLURM documentation that
> > there is no impact to active SLURM jobs when “scontrol reconfigure” is
> > run to enforce new SLURM configuration fro
Search for those terms here
https://slurm.schedmd.com/slurm.conf.html
Accounting system is using FairShare/Fair Tree
https://slurm.schedmd.com/fair_tree.html
PDF of presentation -> https://slurm.schedmd.com/SC14/BYU_Fair_Tree.pdf
Cheers
L.
> Thanks, Nadav
>
>
> On 27/05/2018 11:34,
On 27 May 2018 at 18:23, Nadav Toledo wrote:
> Hello forum,
>
> I have been trying to deal with idle sessions for some time, and haven't found a
> solution I am happy with.
> The scenario is as follows: users use srun for jupyter-lab (which is fine
> and even encouraged by me) on an image processing cluster
On 26 May 2018 at 12:19, 程迪 wrote:
> Hi, everyone
>
> I just found that sbatch will copy the original sbatch script to a new
> place, and I cannot get the path to the original sbatch script. Is there any
> way to solve this?
>
> I am using the path to copy related files. I need to populate a scratch
>
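One common workaround, sketched here rather than taken from the thread: use the submission directory, which Slurm does export to the job, instead of the script's own path. SLURM_SUBMIT_DIR and SLURM_JOB_ID are standard sbatch environment variables; the input file name and scratch location are hypothetical:
#!/bin/bash
#SBATCH --job-name=copy_example
# sbatch runs a spooled copy of this script, but SLURM_SUBMIT_DIR still points
# at the directory sbatch was invoked from, so files that sit next to the
# original script can be copied from there into scratch.
mkdir -p /scratch/$SLURM_JOB_ID
cp "$SLURM_SUBMIT_DIR"/related_input.dat /scratch/$SLURM_JOB_ID/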
On 21 May 2018 at 11:36, Lachlan Musicman wrote:
> On 21 May 2018 at 11:29, 程迪 wrote:
>
>> Hi everyone,
>>
>> I am using SLURM as a normal user. I want to find the usage limit of my
>> user. I can access Slurm's config via `scontrol show config`. But it
On 21 May 2018 at 11:29, 程迪 wrote:
> Hi everyone,
>
> I am using SLURM as a normal user. I want to find the usage limit of my
> user. I can access Slurm's config via `scontrol show config`. But it is
> my user's limit.
>
> I can find the account of my user by `sacctmgr show user di`. But I ca
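A hedged sketch of the sacctmgr query that shows per-user association limits (the format fields listed are just an example set):
sacctmgr show assoc user=di format=Account,User,Partition,Fairshare,MaxJobs,MaxSubmitJobs,QOS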
On 11 May 2018 at 01:35, Eric F. Alemany wrote:
> Hi All,
>
> I know this might sounds as a very basic question: where in the cluster
> should I install Python and R?
> Headnode?
> Execute nodes ?
>
> And is there a particular directory (path) I need to install Python and R.
>
> Background:
> SLU
On 12 April 2018 at 01:22, Matt Hohmeister wrote:
>
> Thanks; I just set StateSaveLocation=/var/spool/slurm.state, and that
> went away. Of course, another error popped up:
>
>
>
> Apr 11 11:19:24 psy-slurm slurmctld[1772]: fatal: Invalid node names in
> partition slurm
>
>
>
> Here’s the relevan
On 14 March 2018 at 14:53, Christopher Samuel wrote:
> On 14/03/18 14:50, Lachlan Musicman wrote:
>
> As per subject, recently I've been shuffling nodes around into new
>> partitions. In that time somehow the default partition switched from prod
>> to dev. Not the end o
As per subject, recently I've been shuffling nodes around into new
partitions. In that time somehow the default partition switched from prod
to dev. Not the end of the world - desirable in fact. But I'd like to know
what happened to cause it?
cheers
L.
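For reference, the default partition is simply whichever PartitionName line carries Default=YES, and sinfo marks it with a trailing *; a sketch with hypothetical partition and node names:
PartitionName=prod Nodes=node[01-10] Default=YES State=UP
PartitionName=dev  Nodes=node[11-12] Default=NO  State=UP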
On 19 January 2018 at 07:29, Ryan Novosielski wrote:
> Hi all,
>
> Looked back at the mailing list to see if there was a question about this
> already. There was some mention of /using/ Nagios, but no real mention of
> specifics. What do people monitor with Nagios? We monitor, so far,
> slurmctld
Hi all,
For both Munge and SLURM, time-synchronised servers are necessary.
I keep finding chrony installed and running and ntpd stopped. I turn chrony
off and restart/enable ntpd, but every CentOS point update it seems to flip.
From what I've read, ntpd is better for always-on devices, and
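Concretely, on CentOS 7 that switch amounts to something like the following sketch (removing the chrony package is optional, and only one way to stop a point update flipping it back):
systemctl stop chronyd
systemctl disable chronyd
systemctl enable ntpd
systemctl start ntpd
# optionally remove chrony entirely
yum remove chrony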
I'd imagine so. As long as slurmd is running on the ARM nodes, I can't
see why not. It should be transparent to the underlying hardware.
L.
Hola,
Apparently (I was on holiday - of course) we experienced a mini email
server melt because of a confluence of two events.
Triggered by a user making a spelling mistake in their own email address,
this was compounded by the fact that Slurm creates a From address for the
outgoing email in the
On 8 December 2017 at 18:07, Loris Bennett
wrote:
> Lachlan Musicman writes:
> >>
> >> Running sshare -l only shows the root user:
> >> Account User RawShares NormShares RawUsage NormUsage EffectvUsage
> FairShar
On 1 December 2017 at 20:48, Bruno Santos wrote:
> Loris, I think you hit the nail on the head.
>
> Running sshare -l only shows the root user:
> Account User RawShares NormShares RawUsage
> NormUsage EffectvUsage FairShare LevelFS
> GrpTRESMins TR
b64/slurm/accounting_storage_*
> /usr/lib64/slurm/accounting_storage_filetxt.so
> /usr/lib64/slurm/accounting_storage_none.so
> /usr/lib64/slurm/accounting_storage_slurmdbd.so
>
> However, I did install the slurm-sql rpm package.
> Any idea about what's failing?
>
> Thanks
> On 20/1
On 20 November 2017 at 20:50, Juan A. Cordero Varelaq <
bioinformatica-i...@us.es> wrote:
> $ systemctl start slurmdbd
> Job for slurmdbd.service failed because the control process exited
> with error code. See "systemctl status slurmdbd.service" and "journalctl
> -xe" for details.
> $
Works fine on CentOS 7.4
Some of my users are getting > 100% efficiency? That seems weird, tbh, but
I've not done a thorough analysis of their work/sbatch files.
L.
On 9 November 2017 at 10:54, Elisabetta Falivene
wrote:
> I am the admin and I have no documentation :D I'll try the third option.
> Thank you very much
>
Ah. Yes. Well, you will need some sort of drive shared between all the
nodes so that they can read and write from a common space.
Also, I re
The IT team sent an email saying "complete network wide network outage
tomorrow night from 10pm across the whole institute".
Our plan is to put all queued jobs on hold, suspend all running jobs, and
turn off the login node.
I've just discovered that the partitions have a state, and it can be s
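A hedged sketch of the scontrol side of that plan (partition name hypothetical; in practice there would be one line per partition):
# stop the partition accepting new work
scontrol update PartitionName=prod State=DOWN
# hold everything still pending
squeue -h -t PENDING -o %i | xargs -r -n1 scontrol hold
# suspend everything currently running
squeue -h -t RUNNING -o %i | xargs -r -n1 scontrol suspend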
> On Wednesday 8 November 2017, Lachlan Musicman
> wrote:
>
>> On 9 November 2017 at 09:19, Elisabetta Falivene
On 9 November 2017 at 09:19, Elisabetta Falivene
wrote:
> I'm getting this message anytime I try to execute any job on my cluster.
> (node01 is the name of my first of eight nodes and is up and running)
>
> Trying a simple Python script:
> root@mycluster:/tmp# srun python test.py
> slurmd[nod
I use
alias sn='sinfo -Nle -o "%.20n %.15C %.8O %.7t" | uniq'
and then it's just
[root@machine]# sn
cheers
L.
Likewise - cheers!
L.