On 11.01.2013 at 23:16, [email protected] wrote:

> 
> I recently reconfigured our SGE (6.2u5) environment to better handle MPI jobs
> on a heterogeneous cluster. This seems to have caused a problem with the
> "threaded" (SMP) PE.
> 
> Our PEs are:
> 
>       qconf -spl
>               make                    (unused)
>               openmpi-AMD
>               openmpi-Intel
>               threaded
> 
> 
> I'm using a JSV to allow users to request "-pe openmpi" and alter that
> to "-pe openmpi-*". The two "openmpi-*" PEs are both assigned to the
> "all.q", but only given a hostgroup with the appropriate servers. This
> works fine for OpenMPI jobs.
> 
> The PE "threaded" is also assigned to the "all.q". That PE should consist of
> all hosts in the queue.
> 
>       qconf -sq all.q | grep pe_list
>               pe_list  threaded make,[@mpi-AMD=openmpi-AMD],[@mpi-Intel=openmpi-Intel]

pe_list  make,[@mpi-AMD=openmpi-AMD threaded],[@mpi-Intel=openmpi-Intel threaded]

should do it - Reuti
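
The reason this should work, as far as I understand the queue configuration format: a bracketed [hostgroup=value] entry replaces the default value for the hosts in that group rather than adding to it. In the original pe_list, "threaded" was only part of the default, so every host in @mpi-AMD or @mpi-Intel lost it. Below is a minimal Python sketch of that lookup. It is a toy model, not SGE code: the function name effective_pe_list and the hostgroup membership of c5-10, c5-11 and c5-12 are assumptions made up for illustration.

```python
# Toy model of per-host resolution of a cluster-queue attribute such as pe_list.
# Not SGE code: effective_pe_list and the hostgroup membership are hypothetical.

def effective_pe_list(host, default, overrides, hostgroups):
    """Return the PE list that applies to `host`.

    A hostgroup-specific entry replaces the default instead of extending it.
    """
    for group, pes in overrides.items():
        if host in hostgroups.get(group, ()):
            return list(pes)
    return list(default)


# Assumed membership, for illustration only.
hostgroups = {
    "@mpi-AMD": {"c5-10", "c5-11"},
    "@mpi-Intel": {"c5-12"},
}

# Original setting: "threaded" appears only in the default part.
original = {
    "default": ["threaded", "make"],
    "overrides": {
        "@mpi-AMD": ["openmpi-AMD"],
        "@mpi-Intel": ["openmpi-Intel"],
    },
}

# Suggested setting: "threaded" is repeated in every hostgroup entry.
suggested = {
    "default": ["make"],
    "overrides": {
        "@mpi-AMD": ["openmpi-AMD", "threaded"],
        "@mpi-Intel": ["openmpi-Intel", "threaded"],
    },
}

for name, cfg in (("original", original), ("suggested", suggested)):
    for host in ("c5-10", "c5-12"):
        pes = effective_pe_list(host, cfg["default"], cfg["overrides"], hostgroups)
        print(name, host, pes)
```

With the original setting the model yields no "threaded" entry for hosts inside either hostgroup, which matches the "not in pe list" messages in the qstat output.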


> However, jobs submitted with a request for "-pe threaded" are not run. SGE
> claims that the PE is not assigned to any queue:
> 
>       qstat -j 5170487
>               parallel environment:  threaded range: 4
>               cannot run in queue "all.q@c5-10" because PE "threaded" is not in pe list
>               cannot run in queue "all.q@c5-11" because PE "threaded" is not in pe list
>               cannot run in queue "all.q@c5-12" because PE "threaded" is not in pe list
> 
> 
> I've tried assigning a hostgroup (@batch, the same as the hostgroup
> assigned to the all.q) to the "threaded" PE, but that puts the nodes
> into the c(onfiguration ambiguous) state.
> 
> Any suggestions?
> 
> Thanks,
> 
> Mark
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

