your needs (IB? SlingShot?
Nvidia? etc.).
--
Bas van der Vlies
| High Performance Computing & Visualization | SURF | Science Park 140 |
1098 XG Amsterdam
| T +31 (0) 20 800 1300 | bas.vandervl...@surf.nl | www.surf.nl |
--
derstanding what constitutes a subaccount? When I run `sacctmgr
show assoc tree`, I see that mystuff is under stuff.
Or am I misreading the documentation that says that subaccounts are
included in who is allowed to use the partition?
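For illustration, the hierarchy itself can be checked like this (using the account names from above; the exact tree layout will differ per site):
```
# subaccounts show up indented under their parent in the tree view
sacctmgr show assoc tree format=account,user
#   root
#    stuff
#     mystuff      <- listed under "stuff", i.e. a subaccount of it
```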
Thanks,
Rob
--
Hi,
I have a question: can I submit jobs as the root user to the slurmrestd server
and change the user id, like `sbatch --uid`?
Or can this be accomplished by:
* (root) scontrol token username=
* (user) use this token to submit jobs as
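For what it is worth, a sketch of that second route (user name, host, and API version path are placeholders; assumes slurmrestd is running with auth/jwt):
```
# (root) mint a token for the target user
scontrol token username=alice lifespan=3600
# (user / submitting service) pass that token to slurmrestd
export SLURM_JWT=eyJ...   # value printed by the command above
curl -s -X POST "http://slurmrestd.example:6820/slurm/v0.0.37/job/submit" \
     -H "X-SLURM-USER-NAME: alice" \
     -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
     -H "Content-Type: application/json" \
     -d @job.json
```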
Regards
Bas
--
point in querying in a loop every 30 seconds when we're
talking about large numbers of jobs.
Ward
--
is located adjacent to
slurm.conf. Any additional config files will need to be shared a
different way or added to the parent config.
```
We also used include statements before we switched to configless. Same
arguments as Bjorn.
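For reference, the include-style layout looked roughly like this (paths are examples):
```
# /etc/slurm/slurm.conf
Include /etc/slurm/nodes.conf
Include /etc/slurm/partitions.conf
# With configless these extra files are not pushed to the nodes, so they have
# to be merged into the parent config or distributed another way (see the
# quoted documentation above).
```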
--
sites also have this problem? Did I miss an option?
Regards
--
Just released a new version of the plugin. Our cluster has been upgraded to
21.08.6 and the cgroups structure is different. Fixed in latest release:
* Tested on 21.08 and 20.11
Regards
> On 4 Apr 2022, at 09:20, Bas van der Vlies wrote:
>
> We have the exact same request for our
Regards
--
ne the
type for non-MIG GPUs and also set the UniqueId for MIG instances would
be the perfect solution:
- slurm.conf Node Def: MIG + non-MIG GRES types
-> problem: it doesn't work
-> Parsing error at unrecognized key: UniqueId
Thanks for reading this far. Am I missing something? How can
or: CONN:9 No error
Does anyone know what that means?
I also found out that all *tres* fields in taurus_assoc_table are empty but
not NULL.
Kind regards,
Danny Rotscher
--
I have no other solution; this was the solution at our site.
> On 22 Oct 2021, at 03:19, pankajd wrote:
>
> thanks, but after setting PMIX_MCA_psec=native, now mpirun hangs and does not
> produce any output.
>
> On October 21, 2021 at 9:21 PM Bas van der Vlies
> wrote
--
Hi Quirin, maybe you are hitting this GRES issue:
https://bugs.schedmd.com/show_bug.cgi?id=12642#c27
--
Bas van der Vlies
> On 17 Oct 2021, at 16:32, Quirin Lohr wrote:
>
> Hi,
>
> I just upgraded from 20.11 to 21.08.2.
>
> Now it seems the slurmd cannot handle my custom GRES.
goal is to pass information from a user's SBATCH job to the
PrologSlurmctld script to provision the node correctly before running
the job.
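One pattern that can work (a sketch; the --comment field and the key=value convention are assumptions, not something from this thread):
```
# user side: attach the information to the job
sbatch --comment="image=rocky8-gpu" job.sh

# inside PrologSlurmctld (runs on the slurmctld host; SLURM_JOB_ID is set there):
comment=$(scontrol show job "$SLURM_JOB_ID" | sed -n 's/.*Comment=\([^ ]*\).*/\1/p')
# ... provision the node based on "$comment" before the job starts ...
```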
--
solution, see:
* https://bugs.schedmd.com/show_bug.cgi?id=12350
Regards
> On 7 Apr 2021, at 13:57, Bas van der Vlies wrote:
>
>
>
> Still have this question. Sometimes we have free nodes, and users that are
> allowed to run in the MAGNETIC reservation are first scheduled on
I know, but see the script: we only do this for uid > 1000.
On 20/05/2021 17:29, Timo Rothenpieler wrote:
You shouldn't need this script and pam_exec.
You can set those limits directly in the systemd config to match every
user.
On 20.05.2021 16:28, Bas van der Vlies wrote:
same here we
trash
up the login nodes, which so far worked fine.
They occasionally compile stuff on the login nodes in preparation for
runs, so I don't want to limit them too much.
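For reference, the systemd route mentioned above could look roughly like this on a reasonably recent systemd (limit values are made up; note it applies to every user slice, so the uid > 1000 filtering from the pam_exec script is lost):
```
# drop-in for the user-.slice template, i.e. every logged-in user's slice
mkdir -p /etc/systemd/system/user-.slice.d
cat > /etc/systemd/system/user-.slice.d/99-limits.conf <<'EOF'
[Slice]
MemoryMax=16G
CPUQuota=400%
EOF
systemctl daemon-reload
```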
--
provide the patch or
request the feature? And how would I do it either way?
Thanks,
Alexander
--
Still have this question. Sometimes we have free nodes, and users that are
allowed to run in the MAGNETIC reservation are first scheduled on the
free nodes instead of the reservation nodes. Did I forget an option, or is
this the expected behavior?
On 25/09/2020 16:47, Bas van der Vlies wrote
lightuserdata(L, job_desc);
lua_setfield(L, -2, "_job_desc");
lua_setmetatable(L, -2);
}
```
--
For those who are interested:
* https://bugs.schedmd.com/show_bug.cgi?id=11044
On 09/03/2021 14:21, Bas van der Vlies wrote:
I have found the problem and will submit a patch. If we find a partition
where a job can run but all nodes are busy, save this state and return
this when all partitions
pe Type=e5_2650_v2
Prentice
On 3/8/21 11:29 AM, Bas van der Vlies wrote:
Hi,
On this cluster I have version 20.02.6 installed. We have different
partitions for CPU types and GPU types. We want to make it easy for
the user who does not care where their job runs, and for the experienced
user they
ion is not available)
[2021-03-08T19:46:09.378] _slurm_rpc_allocate_resources: Requested node
configuration is not available
```
On 08/03/2021 17:29, Bas van der Vlies wrote:
Hi,
On this cluster I have version 20.02.6 installed. We have different
partitions for CPU types and GPU types. We want to make i
--
Just a question: do you use AllowGroups for partition access? We had similar
problems where squeue and other Slurm commands hang when Slurm fetches the groups
from LDAP for
each partition, even the ones it already fetched. So we configured `nscd` to
prevent this.
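For anyone hitting the same thing, the relevant part of such an nscd setup looks roughly like this (values are examples):
```
# /etc/nscd.conf excerpt: cache group lookups so the AllowGroups checks
# do not go to LDAP for every partition
enable-cache            group   yes
positive-time-to-live   group   3600
negative-time-to-live   group   60
```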
--
Thanks Troy,
That is also our intention, for course/training purposes when there are course
days.
regards
--
Are people using the MAGNETIC reservation flag? My question would be how,
because to me it would be more useful if the reservation is tried first and
then the free nodes.
That is what I expected with the MAGNETIC flag.
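For anyone who has not used the flag, such a reservation is created roughly like this (names, nodes and times are made up):
```
scontrol create reservation ReservationName=course \
    StartTime=now Duration=7-00:00:00 \
    Nodes=node[001-004] Users=alice,bob Flags=MAGNETIC
# With MAGNETIC, eligible jobs may land in the reservation without the user
# passing --reservation=course explicitly.
```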
--
That is why we switched to tarball installations with version directories, as
suggested by SchedMD. No deb/rpm installations anymore.
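Roughly what that layout looks like (paths and versions are examples):
```
# one install prefix per release, switch a symlink at upgrade time
./configure --prefix=/opt/slurm/21.08.8 && make && make install
ln -sfn /opt/slurm/21.08.8 /opt/slurm/current   # daemons and PATH use "current"
# rolling back is just pointing the symlink at another version tree
```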
--
has been allocated resources
```
From this I see that the "magnetic" reservation is considered last.
regards
--
n a later version? Or is there another solution for
this?
Regards,
--
Where is this "2048" and "2" factor coming from in those two calculations?
>
A job using 8G --> 0.25 * 8192 = 2048
With 0.25G --> 0.25/1024 per MB --> 8192 * 0.25/1024 = 2
Memory is always a factor of 1024.
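Put differently, a hedged illustration of that unit handling (the weights are made up):
```
# TRESBillingWeights counts memory in MB unless a suffix is given:
#   Mem=0.25   -> 0.25 per MB : an 8G (8192 MB) job bills 8192 * 0.25        = 2048
#   Mem=0.25G  -> 0.25 per GB : the same job bills        8192 * 0.25 / 1024 = 2
PartitionName=cheap TRESBillingWeights="CPU=1.0,Mem=0.25G"
```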
> On Thu, 6 Aug 2020 6:46am, Bas van der Vlies wrote:
>
> > Il
On 06/08/20 10:00, Bas van der Vlies wrote:
Thanks for the answer.
>> We also have nodes with GPUs (different types) and some cost more than others.
> The partitions always have the same type of nodes, not mixed, e.g.:
> *
> TRESBillingWeights=CPU=3801.0,Mem=502246.0T,G
Hi Diego,
Yes, this can be tricky; we also use this feature. The billing is on partition
level, so you can set different
schemas.
We have nodes with 16 cores and 96GB of RAM; these are the cheapest nodes and they
cost, in our model,
1 SBU (System Billing Unit). For this node we have the following
I'd known that two years ago, might've saved me some setting up
> (if it was around two years ago). My SLURM configuration is also
> CFEngine3 controlled. So I'm quite interested in sharing.
>
> Having a look at it in a minute...
>
> Tina
>
>
> On 13/
/basvandervlies/cf_surfsara_lib/blob/master/doc/services.md
--
t of SLURM configuration can be used to enforce this ratio?
Thank you,
Durai Arasan
Zentrum für Datenverarbeitung
Tübingen
--
We have a Debian Stretch and a Debian Buster cluster, both using MariaDB, with no
problems so far. Version 19.05.5, and we are planning to upgrade to 20.02.
--
fi
```
Note: This was tested on my laptop using a set of Docker containers
with this configuration: https://github.com/SciDAS/slurm-in-docker .
--
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
--
test-s5 wayne.he R 0:03 1 ucs480
--
Thanks Chris, I also found the above link. I had read the RPC documentation
wrong and now have the correct procedure for upgrading.
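For reference, a rough sketch of that procedure under the usual version-compatibility rules (packaging steps omitted):
```
# check versions along the way
scontrol version     # controller / client side
slurmd -V            # on the compute nodes
# order: slurmdbd first, then slurmctld, then slurmd on the nodes;
# slurmd may lag slurmctld by up to two major releases but must never be newer.
```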
--
= commands
Thanks a lot Ole. This helps a lot.
Regards
--
413] error: slurm_unpack_received_msg: Incompatible
versions of client and server code
}}}
I have read about the RPC protocol:
* https://slurm.schedmd.com/rpc.html
Can an old `slurmctld` not communicate with a newer `slurmd`? Or is this
setup supported and something else goes wrong?
Regards
--
--
rk displays.
Regards,
Mahmood
--
OK, if we change:
* TaskPlugin=task/affinity,task/cgroup
to:
* TaskPlugin=task/affinity
the pmi2 interface works. Investigating this further.
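For completeness, roughly how the PMI2 path gets exercised here (the binary name is a placeholder):
```
srun --mpi=list                   # shows the PMI plugins this build supports
srun --mpi=pmi2 -n 4 ./mpi_hello
```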
On 31/10/2018 08:26, Bas van der Vlies wrote:
I am busy migrating from Torque/Moab to SLURM.
I have installed slurm 18.03 and am trying to run an
r/lib/x86_64-linux-gnu/slurm//task_cgroup.so
#7 0x2b9cea2a5977 in task_p_pre_setuid () from
/usr/lib/x86_64-linux-gnu/slurm//task_cgroup.so
#8 0x5631c2a04216 in task_g_pre_setuid ()
#9 0x5631c29e713d in ?? ()
#10 0x5631c29ec3f4 in job_manager ()
#11 0x5631c29e9374 in main