Re: [slurm-users] [EXT] Jobs Immediately Fail for Certain Users

2020-07-07 Thread Christopher Samuel
On 7/7/20 5:57 pm, Jason Simms wrote:
> Failed to look up user weissp: No such process

That looks like the user isn't known to the node. What do these say:

    id weissp
    getent passwd weissp

Which version of Slurm is this?

All the best,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Ber...
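A minimal way to run those checks on the node itself (the node name cn01 below is only a placeholder; weissp is the affected user from the thread):

    # On the compute node: both commands should print the user's uid/gid and
    # passwd entry; "no such user" here confirms the node cannot resolve weissp.
    ssh cn01 'id weissp; getent passwd weissp'

    # Same checks on the login node, for comparison:
    id weissp
    getent passwd weissp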

Re: [slurm-users] [EXT] Jobs Immediately Fail for Certain Users

2020-07-07 Thread Jason Simms
Now that is interesting. If I do:

    loginctl enable-linger weissp

then I get the following error:

    Failed to look up user weissp: No such process

This is one of the users that always fails. But if I run it for myself with:

    loginctl enable-linger simmsj

everything works (as expected). Any though...
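That "Failed to look up user" message comes from the host's name-service lookup, so a few NSS checks narrow it down. A rough sketch, assuming the site resolves users through sssd (that part is an assumption, not stated in the thread):

    getent passwd weissp                # does NSS resolve the user at all?
    id weissp                           # same question via the uid/gid lookup path
    systemctl status sssd               # is the lookup daemon running on this host?
    grep '^passwd' /etc/nsswitch.conf   # is sss (or ldap) listed for passwd lookups?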

Re: [slurm-users] [EXT] Jobs Immediately Fail for Certain Users

2020-07-07 Thread Sean Crosby
Hi Jason,

What happens when you try to run that command on the node? Is the exit status of the command 0? E.g. for my servers, where lingering is masked, I get:

    [root@thespian-gpgpu001 ~]# loginctl enable-linger scrosby
    Could not enable linger: Unit is masked.
    [root@thespian-gpgpu001 ~]# echo $?
    ...
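For reference, a quick way to capture the exit status being asked about (run on the node, with the affected user from the thread):

    loginctl enable-linger weissp
    echo $?    # 0 = success; anything else (e.g. masked unit, unknown user) = failure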

Re: [slurm-users] [EXT] Weird issues with slurm's Priority

2020-07-07 Thread Sean Crosby
On Wed, 8 Jul 2020 at 00:47, zaxs84 wrote:
> Hi Sean,
> thank you very much for your reply.
>
> > If a lower priority job can start AND finish before the resources a ...

Re: [slurm-users] Allow certain users to run over partition limit

2020-07-07 Thread Sebastian T Smith
Hi,

We use Job QOS and Resource Reservations for this purpose. QOS is a good option for a "permanent" change to a user's resource limits. We use reservations similarly to how you're currently using partitions: to "temporarily" provide a resource boost without the complexities of re-partitioning...
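A rough sketch of both approaches; the names and limits below are placeholders, and the PartitionTimeLimit flag is what lets the QOS wall time override the partition's limit (check the sacctmgr man page for your Slurm version):

    # 1) "Permanent" bump via a QOS:
    sacctmgr add qos longrun
    sacctmgr modify qos where name=longrun set MaxWall=72:00:00 Flags=PartitionTimeLimit
    sacctmgr modify user where name=bob set qos+=longrun
    # users then submit with:  sbatch --qos=longrun --time=72:00:00 job.sh

    # 2) "Temporary" boost via a reservation on a few nodes:
    scontrol create reservation reservationname=boost users=bob \
        starttime=now duration=7-00:00:00 nodes=node[01-04]
    # users then submit with:  sbatch --reservation=boost job.sh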

[slurm-users] Allow certain users to run over partition limit

2020-07-07 Thread Matthew BETTINGER
Hello, We have a Slurm system with partitions set for a max runtime of 24 hours. What would be the proper way to allow a certain set of users to run jobs on the current partitions beyond the partition limits? In the past we would isolate some nodes based on their job requirements, make a new pa...

[slurm-users] Automatically stop low priority jobs when submitting high priority jobs

2020-07-07 Thread zaxs84
Hi all. Is there a scheduler option that allows low-priority jobs to be immediately paused (or even stopped) when jobs with higher priority are submitted? Related to this, I am also a bit confused about how "scontrol suspend" works; my understanding is that a job that gets suspended rece...
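Slurm's answer to the first question is preemption; a minimal slurm.conf sketch (values are illustrative, not from the poster's config), plus the manual commands for a single job:

    # slurm.conf (illustrative):
    PreemptType=preempt/qos        # or preempt/partition_prio
    PreemptMode=SUSPEND,GANG       # suspend preempted jobs; GANG lets them resume later

    # Manual equivalent for one job (the job id is a placeholder):
    scontrol suspend 12345
    scontrol resume 12345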

[slurm-users] Jobs Immediately Fail for Certain Users

2020-07-07 Thread Jason Simms
Hello all,

Two users on my system experience job failures every time they submit a job via sbatch. When I run their exact submission script, or when I create a local system user and launch from there, the jobs run fine. Here is an example of what I see in the slurmd log:

    [2020-07-06T15:02:41.284] ...
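The full slurmd message on the node usually names the failing lookup; the log path below is only a common default, so confirm it first:

    scontrol show config | grep -i SlurmdLogFile    # where this node actually logs
    sudo grep -i weissp /var/log/slurmd.log         # path is an assumption; use the value above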

Re: [slurm-users] [EXT] Weird issues with slurm's Priority

2020-07-07 Thread zaxs84
Hi Sean,

thank you very much for your reply.

> If a lower priority job can start AND finish before the resources a higher priority job requires are available, the backfill scheduler will start the lower priority job.

That's very interesting, but how can the scheduler predict how long a low-priori...
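For what it's worth, backfill planning is based on each job's requested time limit rather than any prediction, so the accuracy of --time matters. For example:

    # The scheduler plans around the limit requested at submit time:
    sbatch --time=00:30:00 --wrap="srun ./short_task"   # ./short_task is a placeholder

    # Show each job's requested time limit and remaining time:
    squeue -o "%.10i %.9P %.8u %.10l %.10L %R"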

Re: [slurm-users] [EXT] Weird issues with slurm's Priority

2020-07-07 Thread Sean Crosby
Hi,

What you have described is how the backfill scheduler works. If a lower priority job can start AND finish before the resources a higher priority job requires are available, the backfill scheduler will start the lower priority job. Your high priority job requires 24 cores, whereas the lower pr...
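To see (and tune) how backfill is configured on a given cluster, the relevant settings are SchedulerType and SchedulerParameters; a read-only check plus an illustrative config line (the parameter values are examples, not recommendations):

    scontrol show config | grep -Ei 'schedulertype|schedulerparameters'

    # slurm.conf (illustrative):
    SchedulerType=sched/backfill
    SchedulerParameters=bf_window=1440,bf_interval=30,bf_continue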

[slurm-users] Weird issues with slurm's Priority

2020-07-07 Thread zaxs84
Hi all. We want to achieve a simple thing with Slurm: launch "normal" jobs, and be able to launch "high priority" jobs that run as soon as possible; that's all. However, we cannot achieve this reliably: our current config sometimes works, sometimes not, and this is driving us c...
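One common pattern for this (a sketch under assumptions, not the poster's actual setup) is a dedicated high-priority QOS that is allowed to preempt the normal one:

    sacctmgr add qos high
    sacctmgr modify qos where name=high set priority=1000 preempt=normal
    # with PreemptType=preempt/qos in slurm.conf, urgent work is then submitted as:
    sbatch --qos=high job.sh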