I followed the instructions, but it does not work. The qmaster messages file
says that the scripts are not executable, even though I ran chmod +x on both
scripts.

On my cluster, /opt/sge is owned by the user gridengine. Do I have to specify
gridengine@/path/to/scripts in the queue configuration for prolog and epilog,
or is something else needed?
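
One way to narrow down a "not executable" message is to look at the script the
way the account that runs it would see it. The sketch below is only a
diagnostic aid and not part of SGE: check_script is a hypothetical helper that
reports a missing execute bit, a missing '#!' line, or a parent directory that
the calling user cannot search. It assumes it is run on the execution host as
the account the prolog runs under. As far as I know, queue_conf(5) describes an
optional user@ prefix for that purpose and otherwise uses the job owner.

```python
import os
import stat
import sys


def check_script(path):
    """Return a list of problems that could make `path` look 'not executable'."""
    path = os.path.abspath(path)
    if not os.path.isfile(path):
        return [f"{path}: not found, or not reachable, as a regular file"]
    problems = []
    uid = os.getuid()

    # The calling user needs execute permission on the file itself.
    if not os.access(path, os.X_OK):
        mode = stat.filemode(os.stat(path).st_mode)
        problems.append(f"{path}: not executable by uid {uid} (mode {mode})")

    # Without a '#!' line the kernel cannot start the script directly.
    try:
        with open(path, "rb") as fh:
            if fh.read(2) != b"#!":
                problems.append(f"{path}: no '#!' interpreter line")
    except OSError as exc:
        problems.append(f"{path}: cannot be read ({exc.strerror})")

    # Every directory above the script must be searchable (x bit) by the user.
    parent = os.path.dirname(path)
    while True:
        if not os.access(parent, os.X_OK):
            problems.append(f"{parent}: not searchable by uid {uid}")
        above = os.path.dirname(parent)
        if above == parent:  # reached the filesystem root
            break
        parent = above
    return problems


if __name__ == "__main__":
    for script in sys.argv[1:]:
        for line in check_script(script) or [f"{script}: no problems found"]:
            print(line)
```

Saved under a hypothetical name such as check_script.py, it could be run on the
GPU node as: python3 check_script.py /path/to/scripts/prolog.sh (the file name
prolog.sh is also just an example).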

Best regards,
Juan Jimenez
System Administrator, BIH HPC Cluster
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800


From: Kamel Mazouzi <[email protected]>
Date: Thursday, 18. May 2017 at 13:07
To: "Jimenez, Juan Esteban" <[email protected]>
Cc: William Hay <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: [SGE-discuss] GPUs as a resource

Hi,
For GPU integration we are using a solution like this one:

https://github.com/kyamagu/sge-gpuprolog
Regards

On Thu, May 18, 2017 at 12:37 PM, [email protected] wrote:
OK, so I created a new queue, gpu.q, that contains only that node, with the
complex value for the GPU. I removed the node from @allhosts so that all.q and
interactive.q don't use the node. I also modified the user list so that only
users authorized to use the GPU can use the node.

But now I am told that this is not recommended. Is that right?

Best regards,
Juan Jimenez
System Administrator, BIH HPC Cluster
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800

On 17.05.17, 09:44, "William Hay" <[email protected]> wrote:

    On Tue, May 16, 2017 at 08:07:15PM +0000, [email protected] wrote:
    > In our cluster we have one node with two Nvidia GPUs. I have been trying
    > to figure out how to set them up as consumable resources tied to an ACL,
    > but I can't get SGE to handle them correctly. It always says the resource
    > is not available.
    >
    > Can someone walk me through the steps required to set this up correctly?
    > The docs I have found are rather cryptic.
    Assuming you want other people to be able to use the node but not the GPUs,
    I would think the process would be:
    1) Define the resource in the complex_values of a queue that exists only on
       the node in question.
    2) Add the Grid Engine ACL to the queue.
    3) Ensure all resources shared between GPU and non-GPU jobs (including
       slots/CPUs) are defined on the host rather than the queue.
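
The three steps above can be sketched as a handful of qconf calls. This is a
minimal illustration under stated assumptions, not a tested recipe: the complex
name gpu, the ACL gpuusers, the host gpunode01, the user alice and the slot
count are all made up, and only gpu.q comes from this thread. The helper
gpu_setup_commands is hypothetical and merely builds the command lines so they
can be reviewed before anything is run. The -aattr, -au and -mc options are the
forms I believe qconf(1) documents; please verify them on your installation.

```python
import shlex

# Line to add in the editor opened by `qconf -mc`; the columns are
# name, shortcut, type, relop, requestable, consumable, default, urgency.
GPU_COMPLEX = "gpu gpu INT <= YES YES 0 0"


def gpu_setup_commands(queue, host, acl, users, gpus, slots):
    """Build, but do not run, one qconf call per configuration change."""
    return [
        # Step 1: offer the consumable on the queue that exists only on the GPU node.
        ["qconf", "-aattr", "queue", "complex_values", f"gpu={gpus}", queue],
        # Step 2: create or extend the ACL, then restrict the queue to it.
        ["qconf", "-au", ",".join(users), acl],
        ["qconf", "-aattr", "queue", "user_lists", acl, queue],
        # Step 3: define slots on the host so that all queues share one pool.
        ["qconf", "-aattr", "exechost", "complex_values", f"slots={slots}", host],
    ]


if __name__ == "__main__":
    # Hypothetical values: two GPUs, a 16-core node and a single authorised user.
    for cmd in gpu_setup_commands("gpu.q", "gpunode01", "gpuusers", ["alice"], 2, 16):
        print(shlex.join(cmd))
```

A job would then ask for the resource with something like qsub -l gpu=1, again
assuming the complex is named gpu.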

    You might want to set up the prolog and epilog to twiddle the permissions
    on the /dev/ files representing the GPUs so that only the job can access
    them, which enforces the restriction.
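
A rough sketch of what such a prolog/epilog pair might do is shown below. It is
not the script from the sge-gpuprolog repository. It assumes the GPUs appear as
/dev/nvidia0 and /dev/nvidia1, that the script runs as root, and that the
caller passes the numeric uid explicitly (how the prolog learns the job owner
is left open here). The helper assign_devices is hypothetical. It also hands
over both GPUs at once, so it does not cover two jobs from different users
sharing the node.

```python
import os
import stat
import sys

# Assumed device nodes for the two GPUs; /dev/nvidiactl is deliberately left alone.
GPU_DEVICES = ["/dev/nvidia0", "/dev/nvidia1"]
OWNER_ONLY = stat.S_IRUSR | stat.S_IWUSR  # mode 0600


def assign_devices(paths, uid):
    """Make `uid` the owner of each device file and remove access for everyone else."""
    for path in paths:
        os.chown(path, uid, -1)  # -1 keeps the current group
        os.chmod(path, OWNER_ONLY)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Prolog: pass the job owner's numeric uid. Epilog: pass 0 to hand back to root.
    assign_devices(GPU_DEVICES, int(sys.argv[1]))
```

Using one function for both directions keeps the epilog symmetric with the
prolog: between jobs the devices belong to root with mode 0600, so nobody
outside a job can open them.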



    William


_______________________________________________
SGE-discuss mailing list
[email protected]
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
