Hello,

I have a very peculiar problem: I have a micro cluster with three nodes (18
cores total); the nodes are clones of each other and connected to a
frontend via Ethernet and Debian squeeze as the OS for all nodes. When I
run parallel jobs I can used up “-np 10” if I go further the job crashes, I
have primarily done tests with GROMACS (because that is what I will be
running) but have also used OSU Micro-Benchmarks 3.5.2.

For a simple parallel job I use: “path/mpirun –hostfile path/hostfile –np
XX –d –display-map path/mdrun_mpi –s path/topol.tpr –o path/output.trr”

(path is global) For –np XX being smaller than or 10 it works, however as
soon as I make use of 11 or larger the whole thing crashes. The terminal
dump is attached to this mail: when_working.txt is for “–np 10”,
when_crash.txt is for “–np 12”, and OpenMPI_info.txt is output from
“path/mpirun --bynode --hostfile path/hostfile --tag-output ompi_info -v
ompi full –parsable”

I have tried OpenMPI v.1.4.2 all the way up to beta v1.5.5, and all yield
the same result.

The output files are from a new install I did today: I formatted all nodes
and started from a fresh minimal install of Squeeze and used "apt-get
install gromacs gromacs-openmpi" and installed all dependencies. Then I ran
two jobs using the parameters described above, I also did one with OSU
bench (data is not included) it also crashed with “-np” larger than 10.

I hope somebody can help figure out what is wrong and how I can fix it.

Best regards,
Mohtadin


*****************************************************************************
**                                                                         **
** WARNING:  This email contains an attachment of a very suspicious type.  **
** You are urged NOT to open this attachment unless you are absolutely     **
** sure it is legitimate.  Opening this attachment may cause irreparable   **
** damage to your computer and your files.  If you have any questions      **
** about the validity of this message, PLEASE SEEK HELP BEFORE OPENING IT. **
**                                                                         **
** This warning was added by the IU Computer Science Dept. mail scanner.   **
*****************************************************************************


<<attachment: Archive.zip>>

Reply via email to