Re: [OMPI users] config error
On Apr 24, 2006, at 12:32 PM, sdamjad wrote:

Brian, sorry, I am enclosing my config.log tar file here. I cannot reach the make step, hence cannot include it.

It looks like you are trying to use the IBM XLF compiler for your Fortran compiler on OS X 10.4. There are some special things you have to do in order to make that combination work (I don't know the details - a web search should find them). The problem is that the Fortran compiler can't produce executables:

configure:20852: f77 conftestf.f conftest.o -o conftest
** fsize === End of Compilation 1 ===
1501-510 Compilation successful for file conftestf.f.
/usr/bin/ld: can't open: -lSystemStubs (No such file or directory, errno = 2)
configure:20859: $? = 1

Once you get your Fortran compiler working, you should be able to configure and build Open MPI without any issues. If you don't need the Fortran bindings for MPI, you can configure Open MPI with the --disable-mpi-f77 option, which should avoid the whole mess. But if you ever intend to use MPI with Fortran, you'll need to fix your Fortran compiler first.

Hope this helps,

Brian

--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/
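A minimal sketch of the two routes described above (only --disable-mpi-f77 is named in the reply, and -lSystemStubs is the workaround that appears later in this thread with g95; treat the exact command lines as assumptions, not a tested recipe):

    # Route 1: keep the Fortran bindings; make sure the 10.4 stubs library gets linked
    # (configure already found the XLF front end as "f77" in the log above)
    ./configure LDFLAGS=-lSystemStubs

    # Route 2: skip the Fortran 77 bindings entirely
    ./configure --disable-mpi-f77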
[OMPI users] Developer Workshop : Wednesday and Thursday slides?
Just wondering if/when the slides from Wednesday and Thursday of the "Open MPI Developer's Workshop" will be posted.

Thanks,
-DON
Re: [OMPI users] Developer Workshop : Wednesday and Thursday slides?
Blast! I could have sworn that I posted the Wednesday slides already; I'll go do that right now.

I have pinged Mellanox and Myricom for their Thursday slides; they both indicated that they needed to get some final approvals before they could post them.
[OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704
Open MPI 1.2a1r9704

Summary: configure with --with-mpi-f90-size=large and then make.

/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory

I doubt this one is system specific.

My details: building Open MPI 1.2a1r9704 with g95 (Apr 23 2006) on OS X 10.4.6 using

./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=large

Configures fine, but make gives the error listed above. However, there is no error if I don't specify f90-size=large.

./scripts/mpi_allgather_f90.f90.sh /Users/mkluskens/Public/MPI/OpenMPI/openmpi-1.2a1r9704/ompi/mpi/f90 > mpi_allgather_f90.f90
/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory
make[4]: *** [mpi_allgather_f90.f90] Error 127
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

mpi_allgather_f90.f90.sh does not exist in my configured and built Open MPI 1.1a3r9704, so I can't compare between the two. I assume it should be generated into ompi/mpi/f90/scripts.

Config log attached: config.log.gz
[OMPI users] which is better: 64x1 or 32x2
Hi,

I want to build an HPC cluster for running the mm5 and wien2k scientific applications for my physics college. Both of them use MPI.

Interconnection between nodes: GigE (Cisco 24-port GigE).

It seems I have two choices for nodes:
* 32 dual-core Opteron processors (1 GB RAM for each node)
* 64 single-core Opteron processors (2 GB RAM for each node)

Which is better (performance & price)?
Re: [OMPI users] which is better: 64x1 or 32x2
You might want to take this question over to the Beowulf list -- they talk a lot more about cluster configurations than we do -- and/or the mm5 and wien2k support lists, since they know the details of those applications. If you're going to have a cluster for a specific set of applications, it can be best to get input from the developers who know the applications best and what their communication characteristics are.
[OMPI users] Make and config error
Brian, I changed lstubs in the gcc compiler. I am enclosing a tar file that has the output of config.log, config.out, make.out, and makeinstall.out.

Attachment: ask.tar (Unix tar archive)
Re: [OMPI users] which is better: 64x1 or 32x2
A thing to look at is how much bandwidth the models require compared to the CPU load. You can redline gigabit ethernet with a 1 GHz PIII and a 64-bit PCI bus; Opterons on a decent motherboard will definitely keep a gigabit line chock full.

With dual-core you get the advantage of very fast processor-to-processor communication, but you'll run the risk of choking on the ethernet connection. You might be OK if you can get dual ethernet connections on the motherboard and run channel bonding to increase the bandwidth, but your switch has to be able to handle it.

Damien
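For reference, channel bonding on a Linux compute node looks roughly like the sketch below; the interface names, address, and bonding mode are assumptions, and the matching trunk/EtherChannel setup on the Cisco switch is a separate step that depends on the switch model:

    # Load the Linux bonding driver in round-robin mode (raw throughput)
    modprobe bonding mode=balance-rr miimon=100
    # Create the bonded interface and enslave both on-board NICs to it
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1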
Re: [OMPI users] Spawn and Disconnect
A correction on this: the problem only occurs (with Open MPI 1.2) when I don't use mpirun to launch my process. I know it seems strange to most MPI users, but it turns out that when using Open MPI and only needing one process (because I spawn everything else I need), I had found it quicker just to launch the executable directly (the two launch methods are sketched below).

I have only confirmed that my test code works with Open MPI 1.2 (if I have trouble I'll test 1.1). Below is the proper output for my test of spawning, disconnecting, and respawning:

>mpirun -np 1 parent2
parent: 0 of 1
parent: How many processes total? 2
parent: Calling MPI_Comm_spawn to start 1 subprocesses.
child starting
parent returned from Comm_Spawn call
parent: Calling MPI_BCAST with btest = 17 . child = 3
child 0 of 1: Parent 3
parent: Calling MPI_Comm_spawn to start 1 subprocesses.
child 0 of 1: Receiving 17 from parent
child calling COMM_FREE
child calling FINALIZE
child exiting
Maximum user memory allocated: 0
child starting
parent: Calling MPI_BCAST with btest = 17 . child = 3
child 0 of 1: Parent 3
child 0 of 1: Receiving 17 from parent
child calling COMM_FREE
child calling FINALIZE

Michael

On Apr 25, 2006, at 2:57 PM, Michael Kluskens wrote:

I'm running Open MPI 1.1 (v9704), and when a spawned process exits the parent does not die (see previous discussions about 1.0.1/1.0.2); however, the next time the parent tries to spawn a process, MPI_Comm_spawn does not return. My test output below:

parent: 0 of 1
parent: How many processes total? 2
parent: Calling MPI_Comm_spawn to start 1 subprocesses.
child starting
parent returned from Comm_Spawn call
parent: Calling MPI_BCAST with btest = 17 . child = 3
child 0 of 1: Parent 3
parent: Calling MPI_Comm_spawn to start 1 subprocesses.
child 0 of 1: Receiving 17 from parent
child calling COMM_FREE
child calling FINALIZE
child exiting

Notice there is no message saying "parent returned from Comm_Spawn"; the parent just sits there, and obviously the second set of processes doesn't get launched.

Quick note on code fixes: my child process now calls MPI_COMM_FREE(parent, ierr) to free the communicator to the parent before exiting; in an earlier version of 1.1 this crashed the code. I'm guessing this is the right thing to do; the Complete Reference book has an example without it and the Using MPI-2 book has a more detailed example with it. In either case, I get the same results regardless.

Background from previous discussion on this follows. It will cost me less to test new versions of Open MPI handling this than to work around this issue in my project.

Michael

On Mar 2, 2006, at 1:55 PM, Ralph Castain wrote:

We expect to have much better support for the entire comm_spawn process in the next incarnation of the RTE. I don't expect that to be included in a release, however, until 1.1 (Jeff may be able to give you an estimate for when that will happen). Jeff et al. may be able to give you access to an early non-release version sooner, if better comm_spawn support is a critical issue and you don't mind being patient with the inevitable bugs in such versions.

Ralph

Edgar Gabriel wrote:

Open MPI currently does not fully support a proper disconnection of parent and child processes. Thus, if a child dies/aborts, the parents will abort as well, despite calling MPI_Comm_disconnect. (The new RTE will have better support for these operations; Ralph/Jeff can probably give a better estimate of when this will be available.)
However, what should not happen is that if the child calls MPI_Finalize (so not a violent death but a proper shutdown), the parent goes down at the same time. Let me check that as well...

Brignone, Sergio wrote:

Hi everybody, I am trying to run a master/slave set. Because of the nature of the problem I need to start and stop (kill) some slaves. The problem is that as soon as one of the slaves dies, the master dies also.
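A sketch of the two launch methods compared at the top of this message; the program name parent2 is taken from the test output above, and the comments reflect only the behaviour reported in this thread:

    # Singleton launch (no mpirun): the launch style where the respawn problem still shows up with 1.2
    ./parent2

    # Launch through mpirun: the invocation that produced the working 1.2 output above
    mpirun -np 1 parent2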
Re: [OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704
I made another test and the problem does not occur with --with-mpi-f90-size=medium.

Michael
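Putting the two configurations side by side (the "large" line is the command reported earlier in this thread; the "medium" line assumes the same g95 flags, which the follow-up does not spell out):

    # make stops with "mpi_allgather_f90.f90.sh: No such file or directory"
    ./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=large

    # the missing-script error does not occur
    ./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=medium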
Re: [OMPI users] Make and config error
On Apr 26, 2006, at 2:49 PM, sdamjad wrote:

Brian, I changed lstubs in the gcc compiler. I am enclosing a tar file that has the output of config.log, config.out, make.out, and makeinstall.out.

Are you trying to report a problem? From your logs, everything looked OK.

Brian

--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/
[OMPI users] Xgrid and Open MPI
I'm having trouble running apps with multiple inputs using the Xgrid backend to mpirun. I can't find any options to send files to the nodes, as I would be able to do via simple xgrid command-line options. In addition, the output files don't show up: e.g., when I run LAMMPS locally, I get a dump file, but when I run it on the cluster, no dump file. Is there a return mechanism in the current Xgrid interface?

This is unrelated to Open MPI: I've been looking at setting up NFS to deal with the input problems and make installation of MPI apps easier, but Mac OS X 10.4 seems to have NFS disabled somehow. Is there another protocol I can set up on 20 nodes without using a GUI on each node?
Re: [OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704
Ok, I am investigating -- I think I know what the problem is, but the guy who did the bulk of the F90 work in OMPI is out traveling for a few days (making these fixes take a little while).