Hi Reuti,
>> I have a bit of a problem with our job submission.
>> We have a setup with four different 'priority' queues - very low,
>> low, medium, and high - with subordination. The setup actually works
>> quite well for our usage pattern, with the highest-priority queue
>> being reserved for automatic data reduction procedures (which are
>> supposed to run whenever triggered).
>> Of late, we have noticed that we quite often see jobs in the
>> highest priority queue suspend jobs running in the medium priority
>> queue, although there are nodes that do not have any medium priority
>> jobs on them. It's not a big problem, but it is annoying. So, I'm
>> after ideas as to how to make the number of jobs running in the
>> medium priority queue a factor in the scheduling decision.
>> One of the problems is (I suspect) that there are always jobs
>> running in the very low priority queue - i.e. there is always load
>> on the nodes. I suppose that might skew the scheduling decision a
>> bit (as it makes load average misleading). From our point of view,
>> we'd like the scheduler to basically disregard the load average and
>> focus on how many jobs are already running on a host when making a
>> scheduling decision.
>> I have tried a load sensor - basically counting the number of jobs
>> in the queue on a machine
> Only the load in the "medium" queue is counted, and "queue_sort_order
> load" is set? In principle it should work.
I'm pretty sure that's what we did, yes. (Currently, queue_sort_order is
back to seq_no - we have always had the sequencing modified so as to
fill high.q from one end and medium.q from the other, so to speak.)
>> - but that didn't seem to make a difference, which might be due to
>> the weighting, I suppose.
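For reference, a load sensor of the kind described might look roughly like the sketch below. This is an assumption-laden illustration, not our actual script: the complex name `medium_jobs`, the queue name `medium.q`, and the qstat parsing are all placeholders, and the complex would first have to be defined via `qconf -mc` (type INT) and referenced in the sort order.

```
#!/bin/sh
# Sketch of an SGE load sensor reporting the number of running jobs in
# medium.q on this host as a custom complex "medium_jobs" (hypothetical
# name). Load sensors read a line from stdin per scheduling interval and
# reply with a begin/end-delimited report; "quit" terminates the sensor.
HOST=$(hostname)
while read -r line; do
    [ "$line" = "quit" ] && exit 0
    # Count running jobs in medium.q on this host; the qstat header is
    # two lines, and column layout may differ between SGE versions.
    COUNT=$(qstat -u '*' -s r -q "medium.q@$HOST" | awk 'NR>2' | wc -l)
    echo "begin"
    echo "$HOST:medium_jobs:$COUNT"
    echo "end"
done
```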
> For serial and parallel SMP jobs there is
> http://wiki.gridengine.info/wiki/index.php/StephansBlog as an option.
> Maybe instead of using "slots" a custom complex is necessary, called
> "medium", which all jobs of this type have to request. This is
> similar to your setup with the load sensor.
It is, yes. Thanks.
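For anyone following along, Reuti's consumable-complex suggestion would look roughly like this. The complex name "medium", the per-host capacity of 1, and the host name are all illustrative, not taken from our configuration:

```
# 1) Define the complex (qconf -mc); one line per complex attribute:
#
#    #name    shortcut  type  relop  requestable  consumable  default  urgency
#    medium   med       INT   <=     YES          YES         0        0
#
# 2) Give each exec host a capacity, e.g. one "token" per host
#    (qconf -me node01):
#
#    complex_values   medium=1
#
# 3) Medium jobs then have to request it explicitly:
#
#    qsub -q medium.q -l medium=1 job.sh
```

With consumable=YES, the scheduler decrements the per-host count for each running medium job, so "how many medium jobs are on this host" becomes visible to scheduling decisions without a load sensor.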
> There is of course no way to have different sorting algorithms for
> different queues (i.e. different load entries). You can give a
> different seq_no to the nodes in different queues, but this doesn't
> take the load into account (only for the ones with the same seq_no).
As already mentioned, yes, we do have different seq_no for the different
queues.
> And with the "medium" load sensor or complex: I assume you would
> like to have a different behavior for the medium jobs (fill the
> nodes first) vs. the priority jobs (use a free node).
And this makes me remember something I ought to have mentioned in the
original post. Most of the jobs run in these two queues more or less
need/request a full node (they expect to use all available CPUs). We do
this by requesting all slots using the smp PE. Allocation rules are a
bit different for PEs, aren't they?
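They are: each parallel environment carries its own allocation_rule, which governs how a job's slots are spread across hosts. For whole-node SMP jobs, a PE along these lines (an illustrative definition, not necessarily the actual smp PE here) keeps all of a job's slots on a single host:

```
# qconf -sp smp   (illustrative output)
pe_name            smp
slots              9999
allocation_rule    $pe_slots   # all requested slots on one host
control_slaves     FALSE
job_is_first_task  TRUE
```

Note that the allocation rule only constrains how slots are grouped; which qualifying host is picked still follows the scheduler's queue_sort_order, so the seq_no or load-sensor discussion above applies unchanged.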
==
> A completely different approach: using a seq_no, fill the cluster
> with medium jobs from one side (ascending entries for each exechost)
> and with priority jobs from the other side (descending entries for
> each exechost).
Which is what we attempted to do. For some reason, the jobs still seem
to have a preference for colliding on one host.
I'll sit down and review the full configuration, I think. Just to make
sure I haven't got an obvious bug somewhere.
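For the record, the fill-from-both-ends scheme should look roughly like the following. Host names and seq_no values are placeholders, and queue_sort_order must be set to seq_no in the scheduler configuration:

```
# Scheduler configuration (qconf -msconf):
queue_sort_order   seq_no

# medium.q (qconf -mq medium.q): ascending - fills node01 first
seq_no   100,[node01=1],[node02=2],[node03=3]

# high.q (qconf -mq high.q): descending - fills node03 first
seq_no   100,[node01=3],[node02=2],[node03=1]
```

One possible explanation for the collisions we still see: hosts with equal seq_no are tie-broken by load, and the ever-present very-low-priority jobs make that load signal misleading, so a default seq_no left equal across some hosts could send both queues to the same node.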
Tina
--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
--
This e-mail and any attachments may contain confidential, copyright and or
privileged material, and are for the use of the intended addressee only. If you
are not the intended addressee or an authorised recipient of the addressee
please notify us of receipt by returning the e-mail and do not use, copy,
retain, distribute or disclose the information in or attached to the e-mail.
Any opinions expressed within this e-mail are those of the individual and not
necessarily of Diamond Light Source Ltd. Diamond Light Source Ltd. cannot
guarantee that this e-mail or any attachments are free from viruses and we
cannot accept liability for any damage which you may sustain as a result of
software viruses which may be transmitted in or with the message.
Diamond Light Source Limited (company no. 4375679). Registered in England and
Wales with its registered office at Diamond House, Harwell Science and
Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users