Hi Reuti,

thank you for your quick reply.
Indeed, when I specify the host as you suggest, via "-q *@<hostname>", the 
job runs fine. Much obliged for this hint!

And no, there are no reservations/backfilling/...

I'd still like to learn why the syntax I was following does not work, but at 
least I do have a good alternative now.

Regards,
Manfred

-----Original Message-----
From: Reuti [mailto:re...@staff.uni-marburg.de]
Sent: Thursday, 5 January 2017 09:52
To: Manfred Selz
Cc: users@gridengine.org
Subject: Re: [gridengine users] Issue with hostname specification and parallel 
environment - jobs do not start

Hi,

On 05.01.2017 at 07:54, Manfred Selz wrote:

> Hi,
>
> in my SGE 6.2u5 environment, I am seeing a strange issue when submitting jobs 
> to a parallel environment while also providing a hard hostname resource 
> requirement.
> This is not a standard situation, but sometimes certain benchmarks need to be 
> run on one specific host only.
>
> When submitting a job with either a parallel environment or a hard 
> hostname resource specification, the job starts without delay.
> However, the combination of both sometimes keeps jobs waiting for an extended 
> period of time, and I have not been able to get a clear message from the 
> "qstat -j <jobID>" report.
>
> The parallel environment settings are:
> $  qconf -sp local
> pe_name            local
> slots              1000
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $pe_slots
> control_slaves     FALSE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary TRUE
>
> The specific host being targeted has 32 slots configured for the queue being 
> used, and all of them are unused at this time.
> Is anybody aware of specific issues with the combination of parallel 
> environments and a hard hostname resource request?
>
> I have already tested this:
> * Removed the parallel environment request - works
> * Removed the hostname request - works
> * Removed all resource limits ("qconf -mrqs") - no change
> * Increased the "slots" limit in the PE setting - no change
> * Changed the PE allocation_rule to "round_robin" - no change

I have only seen problems when requesting a queue and a host at the same time, 
i.e. "-q" and "-l h=" together. The workaround may also work in your case: 
request the host through a queue request:

-q "*@node123"
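
For comparison, the two ways of pinning a PE job to one host can be written side by side. This is only a sketch: the slot count and the job script name are hypothetical, and the helper merely prints the command lines instead of submitting anything. Only the PE name "local" and the host "node123" come from this thread.

```shell
#!/bin/sh
# Hypothetical values for illustration (SLOTS and JOB are assumptions).
HOST=node123
PE=local
SLOTS=4
JOB=./benchmark.sh

# Print a qsub command line for the given host selector (dry run only).
build_qsub() {
    # $1 = host selector option(s), printed verbatim
    printf 'qsub -pe %s %s %s %s\n' "$PE" "$SLOTS" "$1" "$JOB"
}

# Form that was reported to stay pending: PE plus hard hostname request.
build_qsub "-l h=$HOST"

# Suggested form: select the host through a wildcard queue request.
build_qsub "-q '*@$HOST'"
```

The quotes around the queue pattern keep the shell from expanding the "*" before qsub sees it.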


> After all, the final message in the "qstat -j <jobID>" report is always:
> cannot run in PE "local" because it only offers 0 slots
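
To spot this message quickly among the other scheduling notes in a long "qstat -j" report, a small filter can help. A minimal sketch, assuming the message wording is exactly as quoted above; the function name is made up for illustration.

```shell
#!/bin/sh
# Reads "qstat -j <jobID>" output on stdin and prints the name of every PE
# mentioned in an "only offers 0 slots" line (one name per matching line).
pe_with_zero_slots() {
    sed -n 's/.*cannot run in PE "\([^"]*\)" because it only offers 0 slots.*/\1/p'
}
```

It would be used as `qstat -j <jobID> | pe_with_zero_slots`; no output means the report does not contain that message.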

I assume the node is free and you have no backfilling issue where slots are 
reserved.

-- Reuti


>
> I have seen many older reports of the "only offers 0 slots" message, but none 
> specifically for the combination with just a hostname specification.
>
> Regards,
> Manfred
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
