...and turning schedd_job_info on for a bit didn't really help either. It gives me 'cannot run in PE "smp" because it only offers 0 slots', but it doesn't tell me why it thinks there aren't any free slots (I think there are). I don't see the queue instances it could run on in the 'cannot run in queue' stanzas, so unfortunately I'm not much wiser. (qstat -F for a queue instance that would fit does give me 'qc:slots=8'.)

Tina

On 07/07/14 17:24, Tina Friedrich wrote:
Okay, I checked. All jobs in the queue have the same priority. They all
request the same resources. There aren't any ARs. There are no resource
quota sets defined.

I'll turn the scheduler info on for a bit tomorrow (have a maintenance
window tomorrow).

Tina

On 07/07/14 17:10, Tina Friedrich wrote:
Hi William,

On 07/07/14 15:22, William Hay wrote:
On Fri, 4 Jul 2014 10:37:56 +0000
Tina Friedrich <tina.friedr...@diamond.ac.uk> wrote:

Hello list,

I have a couple of jobs sitting in the queue (been there for ages)
that never seem to start (they're in qw).

qalter -w p #JOBNO says "verification: found possible assignment with
8 slots"

Are there higher priority jobs queued? Possibly with reservations?
AIUI -w p does not take account of such reservations but in essence
works as if the job in question were the only one waiting.

No reservations; I'll need to check the priorities thing. There might be
higher priority jobs not running due to resource mismatch that
effectively block this one... thanks for that, didn't think of that.

Possibly it is failing to transfer to an assigned node. You might be able to
identify which node from the schedule file if you have that enabled.

Do you have schedd_job_info enabled?  qstat -j will provide more info
on why a job wasn't scheduled last go around if you do but it can cause
memory leaks and other problems.  Possibly the job_list variant or just
turning it on for a single scheduling cycle might help avoid those
problems.
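
A minimal sketch of the job_list variant mentioned above, assuming the scheduler configuration printed by qconf -ssconf contains a single line starting with schedd_job_info and that qconf -Msconf accepts the edited dump. The helper names set_schedd_job_info and apply_schedd_job_info are hypothetical, and 12345 is a placeholder job ID.

```shell
# Replace the schedd_job_info line in a scheduler configuration dump.
# Reads the configuration on stdin, writes the modified copy to stdout.
# $1 is the new value, e.g. "false" or "job_list 12345".
set_schedd_job_info() {
    sed "s/^schedd_job_info[[:space:]].*/schedd_job_info $1/"
}

# Hypothetical helper: apply the change to a live cluster.
# Assumes qconf is on PATH and the caller has manager rights.
apply_schedd_job_info() {
    tmp=$(mktemp) || return 1
    # Dump the current config, rewrite one line, and load it back.
    # The -s check refuses to load an empty dump if qconf -ssconf failed.
    qconf -ssconf | set_schedd_job_info "$1" > "$tmp" &&
        [ -s "$tmp" ] &&
        qconf -Msconf "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (12345 is a placeholder job ID):
#   apply_schedd_job_info "job_list 12345"
#   qstat -j 12345        # after the next scheduling run
#   apply_schedd_job_info "false"
```

Restricting the setting to one job and switching it back to false afterwards keeps the window short, which is the point of the suggestion above about avoiding the memory problems.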

I don't have schedd_job_info enabled, no. I used to, but it started killing my
qmaster process(es). I could try that.


Not sure if qalter -w p  takes account of resource quotas either...

Haven't got resource quotas defined - shouldn't be one of those.


William



_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users







--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442




