Hi folks,

I've got a bewildering situation I've never seen before with simple SMP/threaded PE techniques.

I made a brand new PE called threaded:

$ qconf -sp threaded
pe_name            threaded
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
qsort_args         NONE


I attached that to all.q on an IDLE grid and submitted a job with a '-pe threaded 1' argument.
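
For reference, the attach-and-submit step can be sketched roughly as below. This is an illustration, not a transcript of what was typed: the job script name sleep.sh and the helper pe_in_list are made up for the example, and the helper only handles a pe_list that fits on a single space-separated line.

```shell
#!/bin/sh
# Sketch only: sleep.sh is a placeholder job script, pe_in_list is a
# hypothetical helper written for this example.

# Return 0 if PE name $1 appears on the pe_list line of `qconf -sq` output
# read from stdin. Handles only the simple one-line, space-separated form
# (no host-group overrides, no backslash continuation lines).
pe_in_list() {
    awk -v pe="$1" '
        $1 == "pe_list" {
            for (i = 2; i <= NF; i++) if ($i == pe) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

if command -v qconf >/dev/null 2>&1; then
    # Append the PE to the cluster queue's pe_list.
    qconf -aattr queue pe_list threaded all.q
    # Confirm the PE now shows up in the queue configuration.
    if qconf -sq all.q | pe_in_list threaded; then
        echo "threaded is attached to all.q"
    fi
    # Submit a one-slot job in the new PE.
    qsub -pe threaded 1 sleep.sh
else
    echo "qconf not found; skipping"
fi
```

Editing the queue with "qconf -mq all.q" and adding the PE to pe_list by hand should have the same effect as the -aattr call.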

However, all "qstat -j" data shows this scheduler decision line:

cannot run in PE "threaded" because it only offers 0 slots


I'm sort of lost on how to debug this because I can't figure out how to probe where SGE keeps track of PE-specific slots. With other resources I can look at the complex_values reported by execution hosts, or I can use the "-F" argument to qstat to dump the live state and status of a requestable resource. But I don't have any debugging or troubleshooting ideas for figuring out why SGE thinks there are 0 slots when the static PE on an idle cluster has been set to contain 999 slots.

Anyone seen something like this before?  I don't think I've ever seen this particular issue with an SGE parallel environment before ...


Chris
