Hi - I have compiled VASP 4.6.34 using the Intel Fortran compiler 11.1
with Open MPI 1.3.3 on a 104-node cluster running Rocks 5.2. Each node has
two quad-core Opterons, and the nodes are connected by Gbit Ethernet. A
parallel run on one node (8 cores) performs very well, faster than on any other cluster I have
run it on. However, running on 2 nodes in parallel only improves the
performance by 10% over the one node case while running on 4 and 8 nodes
yields no improvement over the two node case. Furthermore, when running
multiple (3-4) jobs simultaneously, the performance decreases by around
50% compared to running only a single job on the entire cluster. The
nodes are connected by a Dell PowerConnect 6248 managed switch. I get
the same performance with MPICH2, so I don't think it is a problem
specific to Open MPI. Other VASP users have reported very good scaling
up to 4 nodes on a similar cluster, so I don't think the problem is VASP
either. Could something be wrong with the way MPI is configured to work
with the switch? Is the operating system not configured to work with
the switch properly? Or does the switch itself need to be configured? Thanks!
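
To put numbers on the scaling described above, here is a minimal sketch that converts the reported improvements into speedup and parallel efficiency. It assumes that "improves the performance by 10%" means a speedup of 1.10 relative to the one-node run, and that the 4- and 8-node runs match the 2-node speedup exactly. The function name and the table of speedups are illustrative, not measured timings.

```python
def parallel_efficiency(speedup, nodes):
    """Return speedup relative to one node divided by the node count."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup / nodes


# Assumed speedups relative to the one-node (8-core) run, read off the
# description in the post: +10% on 2 nodes, no further gain on 4 or 8.
reported_speedup = {1: 1.00, 2: 1.10, 4: 1.10, 8: 1.10}

for nodes, speedup in reported_speedup.items():
    eff = parallel_efficiency(speedup, nodes)
    print(f"{nodes} node(s): speedup {speedup:.2f}, efficiency {eff:.1%}")
```

Under those assumptions the efficiency drops to about 55% on 2 nodes, 27.5% on 4, and under 14% on 8, which is the pattern one would expect if time spent in inter-node communication dominates once the job leaves a single node.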