Re: [SGE-discuss] GPUs as a resource

juanesteban.jime...@mdc-berlin.de Thu, 18 May 2017 05:15:57 -0700

I tried it according to the instructions, but it does not work. The qmaster 
messages file says that the scripts are not executable, even though I ran 
chmod +x on both scripts.

On my cluster, /opt/sge is owned by the user gridengine. Do I have to specify 
gridengine@/path/to/scripts for prolog and epilog in the queue configuration?

Best regards,
Juan Jimenez
System Administrator, BIH HPC Cluster
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800

From: Kamel Mazouzi <mazo...@gmail.com>
Date: Thursday, 18. May 2017 at 13:07
To: "Jimenez, Juan Esteban" <juanesteban.jime...@mdc-berlin.de>
Cc: William Hay <w....@ucl.ac.uk>, "SGE-discuss@liv.ac.uk" 
<sge-disc...@liverpool.ac.uk>
Subject: Re: [SGE-discuss] GPUs as a resource

Hi,
For GPU integration we are using a solution like this one:

https://github.com/kyamagu/sge-gpuprolog
Regards

On Thu, May 18, 2017 at 12:37 PM, juanesteban.jime...@mdc-berlin.de wrote:
OK, so I created a new queue, gpu.q, that contains only that node and carries 
the complex value for the GPU. I removed the node from @allhosts so that all.q 
and interactive.q don't use the node. I also modified the user list so that 
only users authorized to use the GPU can use the node.

But now I am told that this is not recommended?

Best regards,
Juan Jimenez
System Administrator, BIH HPC Cluster
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800

On 17.05.17, 09:44, "William Hay" <w....@ucl.ac.uk> wrote:

    On Tue, May 16, 2017 at 08:07:15PM +0000, juanesteban.jime...@mdc-berlin.de wrote:
    > In our cluster we have one node with two Nvidia GPUs. I have been trying
    > to figure out how to set them up as consumable resources tied to an ACL,
    > but I can't get SGE to handle them correctly. It always says the resource
    > is not available.
    >
    > Can someone walk me through the steps required to set this up correctly?
    > The docs I have found are rather cryptic.
    Assuming you want other people to be able to use the node but not the GPUs,
    I would think the process would be:
    1) Define the resource in the complex_values of a queue that exists only on
       the node in question.
    2) Add the Grid Engine ACL to the queue.
    3) Ensure all resources shared between GPU and non-GPU jobs (including
       slots/CPUs) are defined on the host rather than the queue.
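
The three steps above can be sketched as a handful of qconf calls. The Python snippet below only builds the command lines so they can be reviewed before anything is run. The complex name `gpu`, the ACL name `gpu_users`, the host name `gpunode01`, the user names and the slot count are all hypothetical placeholders; only `gpu.q` and the count of two GPUs come from this thread. It also assumes the queue and the complex do not yet carry these entries, which is why `-aattr` is used.

```python
# Hypothetical names; adjust to the local site before using any of this.
COMPLEX = "gpu"        # consumable complex name (assumed)
QUEUE = "gpu.q"        # queue name mentioned elsewhere in this thread
ACL = "gpu_users"      # access list name (assumed)
HOST = "gpunode01"     # execution host name (assumed)


def complex_line(name, default=0):
    """Return one row for the complex table edited with `qconf -mc`.

    Columns: name shortcut type relop requestable consumable default urgency
    """
    return f"{name} {name} INT <= YES YES {default} 0"


def setup_commands(gpus, slots, users):
    """Build the qconf calls for the three steps as argument lists."""
    return [
        # Step 1: the GPU count is attached to the queue, not to the host.
        ["qconf", "-aattr", "queue", "complex_values", f"{COMPLEX}={gpus}", QUEUE],
        # Step 2: put the users into an ACL, then restrict the queue to it.
        ["qconf", "-au", ",".join(users), ACL],
        ["qconf", "-aattr", "queue", "user_lists", ACL, QUEUE],
        # Step 3: slots shared by every queue on the node are capped on the host.
        ["qconf", "-aattr", "exechost", "complex_values", f"slots={slots}", HOST],
    ]


if __name__ == "__main__":
    print(complex_line(COMPLEX))
    for cmd in setup_commands(gpus=2, slots=16, users=["alice", "bob"]):
        print(" ".join(cmd))
```

With a setup along these lines, a job would request a GPU with something like `qsub -l gpu=1 job.sh`. If the scheduler still reports the resource as unavailable, a first thing to check is that the complex is marked consumable and requestable.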

    You might want to set up the prolog and epilog to adjust the permissions
    on the /dev files representing the GPUs so that only the job can access
    them, which enforces the restriction.
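
A minimal sketch of that permission handling, written as two Python helpers that a prolog and an epilog script could call. The device paths `/dev/nvidia0` and `/dev/nvidia1` are assumptions, as is the choice to restore the nodes to root with mode 0666 afterwards. Changing ownership requires root, so the prolog and epilog entries in the queue configuration would need the optional `root@` prefix that queue_conf(5) describes. How the job owner's name reaches the script is left out here.

```python
import os

# Assumed device nodes for the two GPUs; check `ls -l /dev/nvidia*` on the node.
GPU_DEVICES = ["/dev/nvidia0", "/dev/nvidia1"]


def grant(devices, uid):
    """Prolog side: hand the device nodes to the job owner only."""
    for dev in devices:
        os.chown(dev, uid, -1)   # -1 leaves the group unchanged
        os.chmod(dev, 0o600)     # owner read/write, nobody else


def release(devices, uid=0, mode=0o666):
    """Epilog side: give the device nodes back (root, world read/write here)."""
    for dev in devices:
        os.chown(dev, uid, -1)
        os.chmod(dev, mode)
```

A prolog wrapper would look up the job owner's numeric uid (for example with `pwd.getpwnam`) and call `grant(GPU_DEVICES, uid)`; the epilog would call `release(GPU_DEVICES)`. This sketch hands both GPUs to a single job, so it only fits a setup where one GPU job runs on the node at a time.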

    William

_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss

