Hi folks,

Not directly Slurm-related, but... We have a couple of research groups that 
have large data sets they are processing via Slurm jobs (deep-learning 
applications) and are presently consuming the data via NFS mounts (both groups 
have 10G Ethernet interconnects between the Slurm nodes and the NFS servers). 
They are both now complaining of "too-long loading times" for the data, and are 
casting about for a way to bring the needed data onto the processing node, onto 
fast single SSDs (or even SSD arrays). These local drives would be 
considered "scratch space" -- not for long-term data storage, but for use over 
the lifetime of a job, or perhaps a few sequential jobs (given the nature 
of the work). "Permanent" storage would remain on the existing NFS servers. We 
don't really have the funding for 25-100G networks and/or all-flash commercial 
data storage appliances (NetApp, Pure, etc.)
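For what it's worth, the simplest version of this is stage-in/stage-out done 
inside the job script itself. A minimal runnable sketch -- all paths here are 
illustrative (mktemp stands in for your node-local SSD mount, and the "dataset" 
is a dummy file), not your actual layout:

```shell
#!/bin/bash
# Sketch of a stage-in / stage-out job-script pattern.
# On a real node, SCRATCH would point at the node-local SSD, e.g.:
#   SCRATCH="/local/scratch/${SLURM_JOB_ID}"
# Here mktemp stands in so the sketch runs anywhere.
SCRATCH="$(mktemp -d)"
SRC="$(mktemp -d)"                    # stand-in for the NFS-mounted dataset
trap 'rm -rf "$SCRATCH" "$SRC"' EXIT  # scrub scratch even if the job dies

echo "sample" > "$SRC/dataset.bin"    # pretend this is the big dataset

# Stage in: copy the dataset from NFS to local scratch once, so every
# subsequent epoch reads from the SSD instead of over the 10G network.
# In production, rsync -a would make re-staging for a follow-up job
# incremental rather than a full copy.
mkdir -p "$SCRATCH/data"
cp -r "$SRC/." "$SCRATCH/data/"

# ... training reads from "$SCRATCH/data" here ...

# Stage out: copy only the results that must survive back to NFS, e.g.:
#   cp -r "$SCRATCH/results/." /nfs/project/results/
```

If you'd rather not rely on users remembering the cleanup, the same idea can 
live in a Slurm prolog/epilog pair, and recent Slurm versions ship a 
job_container/tmpfs plugin that gives each job a private node-local directory 
that is removed automatically when the job ends.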

Any good patterns that I might be able to learn about implementing here? We 
have a few ideas floating about, but I figured this already may be a solved 
problem in this community...

Thanks!
Will
