Hi Mark and Alexey,

Thank you for taking the time to write the responses. Here is the information about the cluster:
[chou...@hpc-login2 ~]$ uname -a
Linux hpc-login2 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

[chou...@hpc2000 ~]$ mpiexec --version
Version 0.82, configure options: '--host=x86_64-redhat-linux-gnu' '--build=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-pbs=/usr/lib64/torque' '--disable-p4-shmem' 'CFLAGS=-O2 -g' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux'

[chou...@hpc2000 ~]$ ifort --version
ifort (IFORT) 10.0 20070426
Copyright (C) 1985-2007 Intel Corporation.  All rights reserved.

[chou...@hpc2000 ~]$ icc --version
icc (ICC) 10.0 20070426
Copyright (C) 1985-2007 Intel Corporation.  All rights reserved.

Thanks for all the help.

Amit

2010/3/3 Alexey Shvetsov <alexx...@gmail.com>

> Hi
>
> It looks like your system simply runs out of memory, so power cycling the
> nodes shouldn't be needed. If your cluster runs Linux, it already has the
> OOM killer, which will kill processes that run out of memory. Having swap
> on the nodes is also a good idea, even with a huge amount of memory.
> Memory usage of the MPI processes will depend strongly on the MPI
> implementation, because some of them usually cache slave-process memory
> (as mvapich2 usually does).
>
> So can you provide some information about your cluster setup?
> OS version (including kernel version): uname -a
> MPI version: mpirun --version or mpiexec --version
> Also the compiler version that was used to compile GROMACS.
>
> On Thursday, 4 March 2010 03:15:53 Amit Choubey wrote:
> > Hi Roland,
> >
> > I was using 32 nodes with 8 cores each, each node with 16 GB of memory.
> > The system was about 154 M particles. This should be feasible according
> > to the numbers. Assuming it takes 50 bytes per atom of global data on the
> > master node and 1.76 KB per atom per core (with 154 M atoms spread over
> > 256 cores, about 1.06 GB per core), then:
> >
> > Master node -> 50 bytes * 154 M + 8 * 1.06 GB ~ 16 GB (there is no headroom here)
> > All other nodes -> 8 * 1.06 GB ~ 8.5 GB
> >
> > I am planning to try the same run on 64 nodes with 8 cores each, but not
> > until I am a little more confident. The problem is that if GROMACS
> > crashes due to memory, the nodes hang and people have to cycle the power
> > supply.
> >
> > Thank you,
>
> --
> Best Regards,
> Alexey 'Alexxy' Shvetsov
> Petersburg Nuclear Physics Institute, Russia
> Department of Molecular and Radiation Biophysics
> Gentoo Team Ru
> Gentoo Linux Dev
> mailto:alexx...@gmail.com
> mailto:ale...@gentoo.org
> mailto:ale...@omrb.pnpi.spb.ru
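For anyone who wants to redo the memory estimate quoted above, here is a minimal sketch of the arithmetic in Python. The per-atom constants are just the back-of-the-envelope numbers used in this thread (50 bytes per atom of global data on the master rank, 1.76 KB per atom per core of local data), not values taken from GROMACS itself:

# Rough per-node memory estimate for the run discussed above.
# The per-atom byte counts are the thread's assumptions, not GROMACS figures.
N_ATOMS = 154e6              # total particles in the system
CORES_PER_NODE = 8
TOTAL_CORES = 32 * 8         # 32 nodes x 8 cores

GLOBAL_BYTES_PER_ATOM = 50       # global arrays held only by the master rank (assumed)
LOCAL_BYTES_PER_ATOM = 1.76e3    # per-core working data (assumed)

GB = 1e9

per_core = LOCAL_BYTES_PER_ATOM * N_ATOMS / TOTAL_CORES      # ~1.06 GB
worker_node = CORES_PER_NODE * per_core                      # ~8.5 GB per node
master_node = GLOBAL_BYTES_PER_ATOM * N_ATOMS + worker_node  # ~16 GB, i.e. no headroom

print(f"per core:    {per_core / GB:.2f} GB")
print(f"worker node: {worker_node / GB:.2f} GB")
print(f"master node: {master_node / GB:.2f} GB")

With these assumptions the master node lands right at the 16 GB physical memory of a node, which is consistent with the out-of-memory behaviour described earlier in the thread.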