Title: Re: [slurm-dev] defaults, passwd and data

Hello,


On 09/24/2017 08:35 AM, Nadav Toledo wrote:
Hey all,

We are trying to set up a Slurm cluster with both CPU and GPU partitions for research and education (courses) in the computer science faculty at my university.
Everything seems to work fine and we have managed to accomplish almost everything needed, except a few things:

A. Is it possible to set up global default values for srun/sbatch (i.e. number of cores, email, etc.)? If so, how can it be done?
You might create a job submit plugin (Lua), or else use template sbatch scripts together with a wrapper script that fills out the templates according to the user's input.
A2. Is it possible to make some srun/sbatch parameters required (i.e. a user cannot run a job via srun unless specifying an email address)? If so, how?
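Both A and A2 can be handled client-side with a small shell shim placed ahead of the real sbatch in $PATH. This is only a sketch: the default flags, the required flag, and the error message are assumptions, not site policy, and for anything users must not be able to bypass, a job_submit Lua plugin on the controller is the cleaner equivalent.

```shell
#!/bin/sh
# Hypothetical sbatch wrapper: inject site defaults and refuse
# submissions that do not specify --mail-user.
submit() {
    case "$*" in
        *--mail-user=*) ;;  # required flag present, fall through
        *) echo "error: --mail-user=<address> is required" >&2
           return 1 ;;
    esac
    # Defaults go first; flags the user passes later override them.
    # In production, replace the echo with:
    #   exec sbatch --ntasks=1 --cpus-per-task=1 "$@"
    echo "sbatch --ntasks=1 --cpus-per-task=1 $*"
}
```

The echo stands in for the real sbatch call here so the logic can be tried without a cluster; the same pattern works as a wrapper for srun.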

B. We have Active Directory (AD) in our faculty, and we would prefer to manage users/groups from there. Is it possible? Is a guide available somewhere?
Search this mailing list; this question pops up every now and again, and there is no built-in solution.
You should consider using accounting, but if you decide to incorporate AD into Slurm accounting, you will have to decide how to map users and groups onto accounts (and create the corresponding rules).
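One workable pattern is to let sssd/getent expose the AD groups on the controller and periodically mirror a group into a Slurm account. A sketch, with made-up group/account names, that prints the sacctmgr calls rather than executing them:

```shell
#!/bin/sh
# Hypothetical sync sketch: mirror a POSIX-visible AD group
# (via getent, e.g. through sssd) into a Slurm account.
# Prints the sacctmgr commands instead of running them.
sync_group_to_account() {
    group="$1"; account="$2"
    members=$(getent group "$group" | cut -d: -f4 | tr ',' ' ')
    echo "sacctmgr -i add account $account"
    for u in $members; do
        echo "sacctmgr -i add user $u Account=$account"
    done
}
```

Run from cron, pipe the output to `sh` once you trust it. Removing users who left the group is the harder half and is deliberately omitted here.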

C. What is the recommended way to handle data files? That is, a user wants his data/code files (for example, a data set of pictures for GPU deep learning) to be accessible on the nodes allocated to him, and wants to get the results back easily without sshing into those nodes (I want to close the nodes to ssh if possible). So far we have investigated NFS (low performance versus files stored locally on the server) and Nextcloud (file syncing back and forth). Is there a better way we overlooked?
Some form of shared storage, with an HTTP file server, and a post-run script (an epilog, in Slurm speak) that would automatically send a URI to the user's email? This will mean each job must create its own path for the server to publish.
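A minimal epilog along those lines might look like the sketch below; the file-server URL, the results path layout, and the mail invocation are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical Slurm epilog sketch: build a URL for the job's
# result directory on a shared-storage HTTP file server and
# mail it to the submitting user. slurmd exports SLURM_JOB_ID
# and SLURM_JOB_USER in the epilog environment.
result_url() {
    jobid="$1"; user="$2"
    base="http://files.example.edu/results"   # assumed file server
    echo "$base/$user/$jobid/"
}
# In the real epilog:
#   result_url "$SLURM_JOB_ID" "$SLURM_JOB_USER" \
#       | mail -s "results for job $SLURM_JOB_ID" "$SLURM_JOB_USER"
```

The job (or a prolog) would create `results/$USER/$JOBID/` on the shared filesystem and write its output there, so there is nothing to sync and no ssh access needed.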


D. We need to give a specific, known user the ability to run his jobs on specific nodes at specific hours, while no other jobs are allowed to run concurrently (exclusive access).
We saw that there are reservations, but a reservation holds the resources even if the user never actually uses it. Another solution was to create a partition with higher priority than all the others, put this partition in the DOWN state, give only that user the right to submit jobs to it, and then use a cron job to change the partition state during the required time window.
What do you think? Is there a more elegant way?
Be evil: use fair share. Once that user's fair-share credit goes down, they will pay more attention and either cancel the reservation or actually use it.
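For completeness, the partition/cron approach described above is only two crontab lines on the slurmctld host; the partition name and the Monday 18:00-22:00 window below are placeholders. (A recurring reservation with a flag such as DAILY is an alternative, with the unused-resources caveat already noted.)

```shell
# root crontab on the slurmctld host (hypothetical partition/window)
# m h dom mon dow  command
0 18 * * 1  /usr/bin/scontrol update PartitionName=prof_only State=UP
0 22 * * 1  /usr/bin/scontrol update PartitionName=prof_only State=DOWN
```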


Our most common OS is Ubuntu, and we are using Slurm 17.02.7.

Thanks in advance for your time and effort, Nadav
