On 7 March 2012 11:36, Reuti <[email protected]> wrote:
> On 07.03.2012 at 11:18, William Hay wrote:
>
>> On 7 March 2012 10:11, Mazouzi <[email protected]> wrote:
>>> I remember Reuti proposed a solution using RQS:
>>>
>>> {
>>>    name         noverload
>>>    description  Make sure host will not take more than 1 process per processor
>>>    enabled      TRUE
>>>    limit        hosts {*} to slots=$num_proc
>>> }
>>>
>> While I could switch to using RQS rather than host consumables to
>> control slot usage I'd rather understand why a solution that AFAICT
>> worked perfectly up to now has stopped working for these two
>> jobs/hosts. Without that understanding I have no guarantee that the
>> RQS solution won't have the same issue. Is there any reason to
>> believe the RQS solution will be more reliable than the host
>> consumable solution (which has worked pretty well up to now)?
>
> Yes, it could be set up in an RQS too, it's mainly a matter of taste.
> Attaching it to a node makes the output of `qquota` shorter to show the real
> limits but it must be done for each machine by hand or script.
Mostly scripted here.
>
> To the real issue:
>
> There was no change and it happened out of the blue?
>
> Do you request a load value in addition during submission?
>
No load value requested. We have another identical (apart from being
submitted a little later) job running that has been correctly scheduled
to two nodes.

> https://arc.liv.ac.uk/trac/SGE/ticket/1316

Fritz from Univa has offered an explanation that fits the facts (despite
us not paying them a penny for support), so it may not be worth pursuing
this further.
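For anyone wondering what "scripted" per-host slot limits could look like, here is a minimal sketch. It is not the script used here. The helper name `gen_slot_cmds` and the "hostname num_proc" input format are assumptions made for illustration. It only prints the `qconf -mattr` commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper: reads "hostname num_proc" pairs on stdin and prints
# one qconf command per host that would set the slots consumable in that
# host's complex_values. It only prints; nothing is applied to the cluster.
gen_slot_cmds() {
    while read -r host nproc; do
        # Skip blank lines and entries without a purely numeric count.
        case "$nproc" in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'qconf -mattr exechost complex_values slots=%s %s\n' "$nproc" "$host"
    done
}
```

Feeding it something like `printf 'node01 8\nnode02 12\n' | gen_slot_cmds` prints one command per host; piping that output to `sh` on an admin host would apply it. The host names and counts in that example are made up.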
