On 20 June 2012 10:40, Jake Carroll <[email protected]> wrote:
> Hi all.
>
> A probably not-so-uncommon question today, with what is probably a simple
> answer forthcoming.
>
> I've currently got a situation where one of the storage arrays I'm using
> to share "big" NFS to my compute nodes is under a significant amount of
> 10GbE I/O strain. The array can't handle the concurrency I'm currently
> throwing at it.
>
> To that end – I started contemplating forcing queues to "transfer" the
> working sets or resources requested of the storage to local /scratch on
> each node. Each node has some decently speedy 15K SAS spindles inside it.
> I thought it would be nice to see if we could reduce latency and
> contention on the 10GbE-connected array a little by doing this.
>
> We found this:
>
> https://www.nbcr.net/pub/wiki/index.php?title=Reduce_I/O_bottleneck_by_using_compute_node_local_scratch_disks
>
> But I am sure there is a lot more to it.
>
> I know of a configuration item I've seen called the "transfer" queue, but
> I've got a feeling it has nothing to do with this, and is instead a
> mechanism for programmatically forwarding jobs to other SGE queues.
>
> Looking for some guidance on how we might programmatically enforce, at job
> "wire-up" time, the transfer of working sets to node-local /scratch to
> increase efficiency (perhaps?).
>
> Thanks, all!
>
> --JC

I've read through all this and I'm still not sure what you are trying to
achieve. Is it one of the following?

- Have a data set used by a lot of jobs available locally on all nodes?
- Transfer copies of a data set used by a parallel job to each node in the
  job without involving the central NFS filestore?
- Some sort of P2P transfer set-up, so a job could obtain the data set it
  needs from a node (or nodes) where it already exists?
- Prevent people who could use local scratch (i.e. single-node jobs) from
  using NFS?
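If it is the simple per-job case, the usual pattern is to do the staging in
the job script itself: pull the working set from NFS to node-local scratch
(SGE already exports a per-job $TMPDIR on the execution host), run against
the local copy, then push results back and clean up. A minimal sketch along
those lines — the helper name, directory layout, and copy-everything-back
behaviour are my assumptions, not SGE features:

```shell
#!/bin/bash
# Hypothetical staging helper, not an SGE built-in. Copies a data set from
# shared NFS into node-local scratch, runs the job command against the local
# copy, ships the outputs back to shared storage, and frees the scratch.

stage_and_run() {
    local src=$1 dest=$2 scratch=$3
    shift 3
    mkdir -p "$scratch"
    cp -a "$src/." "$scratch/"    # one sequential pull from the NFS array
    ( cd "$scratch" && "$@" )     # job does all its I/O on local spindles
    mkdir -p "$dest"
    cp -a "$scratch/." "$dest/"   # ship results back to shared storage
    rm -rf "$scratch"             # leave local scratch clean for the next job
}
```

In a job script this might be invoked as
`stage_and_run /nfs/data/myset "$SGE_O_WORKDIR/results" "$TMPDIR/work" ./myprog`
(paths and `./myprog` are placeholders), so the array sees one sequential
copy per job instead of the job's full random-access workload.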
William

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
