Hi all,

First, excuse my English, it isn't good :)

Well, I have two machines: a Xeon with 2 CPUs (64-bit) and a Pentium 4 with only one CPU. On both machines I have installed Ubuntu 8 Server and all the packages for open-mpi and gromacs. I use gromacs for my work.

OK, on both machines, in my user folder, I have a hostfile like this:

machine1 cpu=2
machine2

Machine1 is the Xeon (192.168.0.10) and Machine2 is the Pentium 4 (192.168.0.11). My /etc/hosts file is configured too.
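The entries for the two machines in /etc/hosts are more or less like this (I am writing them from memory, so they may not be exact):

192.168.0.10    machine1
192.168.0.11    machine2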
When I run mpiexec on machine2, I get this:

mariojose@machine2:~/lam-mpi$ mpiexec -n 3 hostname
machine1
machine2
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
machine1
mpirun failed with exit status 252

When I run it on machine1, I get this:

mariojose@machine1:~/lam-mpi$ mpiexec -n 3 hostname
machine1
machine1
machine2
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
mpirun failed with exit status 252

I don't know why I get this message. I think it is an error.
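Maybe the message only appears because hostname is not an MPI program, since the message itself says to use lamexec for non-MPI programs. So perhaps this test should be run like this instead:

mariojose@machine2:~/lam-mpi$ lamexec -n 3 hostname

But I am not sure if that is all it means.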
I tried to run it with gromacs; if anybody uses gromacs and can help me, I would appreciate it very much :)

mariojose@machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr
mariojose@machine1:~/lam-mpi$ mpiexec -n 3 mdrun -v -deffnm run

That works OK. I can see that the CPUs of both machines work at 100%, so it looks fine to me. But I get an error when I run mdrun_mpi, which is the binary built to work on a cluster:

mariojose@machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr -np 3 -sort -shuffle
mariojose@machine1:~/lam-mpi$ mpiexec -n 3 mdrun_mpi -v -deffnm run
NNODES=3, MYRANK=0, HOSTNAME=machine1
NNODES=3, MYRANK=2, HOSTNAME=machine1
NODEID=0 argc=4
NODEID=2 argc=4
NNODES=3, MYRANK=1, HOSTNAME=machine2
NODEID=1 argc=4

                  :-)  G  R  O  M  A  C  S  (-:

            Gyas ROwers Mature At Cryogenic Speed

                     :-)  VERSION 3.3.3  (-:

 Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
 Copyright (c) 1991-2000, University of Groningen, The Netherlands.
 Copyright (c) 2001-2008, The GROMACS development team,
 check out http://www.gromacs.org for more information.

 This program is free software; you can redistribute it and/or
 modify it under the terms of the GNU General Public License
 as published by the Free Software Foundation; either version 2
 of the License, or (at your option) any later version.

                       :-)  mdrun_mpi  (-:

Option     Filename  Type          Description
------------------------------------------------------------
  -s        run.tpr  Input         Generic run input: tpr tpb tpa xml
  -o        run.trr  Output        Full precision trajectory: trr trj
  -x        run.xtc  Output, Opt.  Compressed trajectory (portable xdr format)
  -c        run.gro  Output        Generic structure: gro g96 pdb xml
  -e        run.edr  Output        Generic energy: edr ene
  -g        run.log  Output        Log file
 -dgdl      run.xvg  Output, Opt.  xvgr/xmgr file
 -field     run.xvg  Output, Opt.  xvgr/xmgr file
 -table     run.xvg  Input, Opt.   xvgr/xmgr file
 -tablep    run.xvg  Input, Opt.   xvgr/xmgr file
 -rerun     run.xtc  Input, Opt.   Generic trajectory: xtc trr trj gro g96 pdb
 -tpi       run.xvg  Output, Opt.  xvgr/xmgr file
 -ei        run.edi  Input, Opt.   ED sampling input
 -eo        run.edo  Output, Opt.  ED sampling output
 -j         run.gct  Input, Opt.   General coupling stuff
 -jo        run.gct  Output, Opt.  General coupling stuff
 -ffout     run.xvg  Output, Opt.  xvgr/xmgr file
 -devout    run.xvg  Output, Opt.  xvgr/xmgr file
 -runav     run.xvg  Output, Opt.  xvgr/xmgr file
 -pi        run.ppa  Input, Opt.   Pull parameters
 -po        run.ppa  Output, Opt.  Pull parameters
 -pd        run.pdo  Output, Opt.  Pull data output
 -pn        run.ndx  Input, Opt.   Index file
 -mtx       run.mtx  Output, Opt.  Hessian matrix
 -dn        run.ndx  Output, Opt.  Index file

      Option   Type  Value  Description
------------------------------------------------------
      -[no]h   bool     no  Print help info and quit
       -nice    int     19  Set the nicelevel
     -deffnm string    run  Set the default filename for all file options
   -[no]xvgr   bool    yes  Add specific codes (legends etc.) in the output
                            xvg files for the xmgrace program
         -np    int      1  Number of nodes, must be the same as used for
                            grompp
         -nt    int      1  Number of threads to start on each node
      -[no]v   bool    yes  Be loud and noisy
-[no]compact   bool    yes  Write a compact log file
-[no]sepdvdl   bool     no  Write separate V and dVdl terms for each
                            interaction type and node to the log file(s)
  -[no]multi   bool     no  Do multiple simulations in parallel
                            (only with -np > 1)
     -replex    int      0  Attempt replica exchange every # steps
     -reseed    int     -1  Seed for replica exchange, -1 is generate a seed
   -[no]glas   bool     no  Do glass simulation with special long range
                            corrections
 -[no]ionize   bool     no  Do a simulation including the effect of an X-Ray
                            bombardment on your system

Back Off! I just backed up run2.log to ./#run2.log.5#
Back Off! I just backed up run0.log to ./#run0.log.12#
Getting Loaded...
Reading file run.tpr, VERSION 3.3.3 (single precision)

Back Off! I just backed up run1.log to ./#run1.log.12#
-------------------------------------------------------
Program mdrun_mpi, VERSION 3.3.3
Source code file: ../../../../src/gmxlib/block_tx.c, line: 74

Fatal error:
0: size=672, len=840, rx_count=0
-------------------------------------------------------

"They're Red Hot" (Red Hot Chili Peppers)

Error on node 1, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 1 out of 3

gcq#220: "They're Red Hot" (Red Hot Chili Peppers)

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 15964 failed on node n0 (192.168.0.10) with exit status 1.
-----------------------------------------------------------------------------
mpirun failed with exit status 1

I don't know what the problem is.
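One thing I notice in the mdrun_mpi help output above is that -np is 1 by default and it says it "must be the same as used for grompp", but I used -np 3 with grompp. Maybe I also have to pass -np to mdrun_mpi, something like this:

mariojose@machine1:~/lam-mpi$ mpiexec -n 3 mdrun_mpi -np 3 -v -deffnm run

But I am not sure if that is related to the error.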
Can anybody help me?

Thanks,
Mario Jose

/* WE ARE FREE */
Hack to learn, don't learn to hack.

/* Free Software Foundation */
"Free software" is a matter of liberty, not price.
GNU's Not UNIX. Be free, use GNU/Linux.
www.gnu.org
www.fsf.org

/* Free Culture */
free-culture.org
creativecommons.org

/* ...
   Hoarders may get piles of money,
   That is true, hackers, that is true.
   But they cannot help their neighbors;
   That's not good, hackers, that's not good ...
   Richard Stallman (www.stallman.org) */

/* Human knowledge belongs to the world */