Hello,

I'm not sure whether this question belongs on this mailing list, but I'll explain my problem and you can answer however you see fit ;) I manage a SLURM cluster with three networks:
When I submit an MPI job, the SLURM scheduler allocates "n" nodes named, for example, clus01 and clus02, and my application runs perfectly there, using the second network for SLURM connectivity and the first network for NFS (and NIS) shares. Because SLURM connectivity is on the second network, my nodelist contains nodes named "clus0x" by default. Now, however, I have a new problem: I want to use the third network (Infiniband), but since SLURM gives me "clus0x" names (second network), my MPI application runs fine but over the second network. The same problem occurs with, for example, NAMD (Charmrun). So, my questions are:
Thanks a lot!!
- [slurm-users] Question about networks and connectivity sysadmin.caos