Re: [OMPI users] config error

2006-04-26 Thread Brian Barrett

On Apr 24, 2006, at 12:32 PM, sdamjad wrote:


Brian,
Sorry, I am enclosing my config.log file as a tar file here.
I cannot reach the make step, hence I cannot include it.
It looks like you are trying to use the IBM XLF compiler as your
Fortran compiler on OS X 10.4.  There are some special things you have
to do in order to make that combination work (I don't know the
details; a web search should find them).  The problem is that the
Fortran compiler can't produce executables:


configure:20852: f77  conftestf.f conftest.o -o conftest
** fsize   === End of Compilation 1 ===
1501-510  Compilation successful for file conftestf.f.
/usr/bin/ld: can't open: lSystemStubs (No such file or directory, errno = 2)

configure:20859: $? = 1

Once you get your Fortran compiler working, you should be able to
configure and build Open MPI without any issues.  If you don't need
the Fortran bindings for MPI, you can configure Open MPI with the
--disable-mpi-f77 option, which should avoid the whole mess.  But if
you ever intend to use MPI with Fortran, you'll need to fix your
Fortran compiler first.
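
For anyone hitting the same linker error, a rough sketch of the two routes
described above (the compiler names and flags are assumptions -- adjust them
to your actual XLF or g95 installation; later messages in this digest use the
g95 variant with -lSystemStubs):

   # Route 1: make sure the OS X 10.4 stubs library gets linked when the
   # Fortran compiler builds executables (compiler names are illustrative):
   ./configure F77=xlf FC=xlf90 LDFLAGS=-lSystemStubs

   # Route 2: skip the Fortran 77 MPI bindings entirely:
   ./configure --disable-mpi-f77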



Hope this helps,

Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




[OMPI users] Developer Workshop : Wednesday and Thursday slides?

2006-04-26 Thread Donald Kerr


Just wondering if/when the slides from Wednesday and Thursday of the 
"Open MPI Developer's Workshop" will be posted.


Thanks
-DON


Re: [OMPI users] Developer Workshop : Wednesday and Thursday slides?

2006-04-26 Thread Jeff Squyres (jsquyres)
Blast!  I could have sworn that I posted the Wednesday slides already;
I'll go do that right now.

I have pinged Mellanox and Myricom for their Thursday slides; they both
indicated that they needed to get some final approvals before they
could post them.


> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Donald Kerr
> Sent: Wednesday, April 26, 2006 9:56 AM
> To: us...@open-mpi.org
> Subject: [OMPI users] Developer Workshop : Wednesday and 
> Thursday slides?
> 
> 
> Just wondering if/when the slides from Wednesday and Thursday of the 
> "Open MPI Developer's Workshop" will be posted.
> 
> Thanks
> -DON



[OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704

2006-04-26 Thread Michael Kluskens

Open MPI 1.2a1r9704
Summary: configure with --with-mpi-f90-size=large and then make.

/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory


I doubt this one is system specific.
---
my details:

Building OpenMPI 1.2a1r9704 with g95 (Apr 23 2006) on OS X 10.4.6 using

./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=large


Configure runs fine, but make gives the error listed above.  However, there
is no error if I don't specify f90-size=large.


./scripts/mpi_allgather_f90.f90.sh /Users/mkluskens/Public/MPI/OpenMPI/openmpi-1.2a1r9704/ompi/mpi/f90 > mpi_allgather_f90.f90
/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory

make[4]: *** [mpi_allgather_f90.f90] Error 127
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

mpi_allgather_f90.f90.sh does not exist in my configured and built Open MPI
1.1a3r9704, so I can't compare between the two.


I assume it should be generated into ompi/mpi/f90/scripts.
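
For anyone trying to reproduce this, a quick check sketch (the paths are
assumptions based on my tree layout; run from the top of the build tree):

   # See whether the generator script was created anywhere under the f90
   # directory before make tries to run it, and what did get generated:
   find ompi/mpi/f90 -name 'mpi_allgather_f90.f90.sh'
   ls ompi/mpi/f90/scripts/ | head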

Config log attached



config.log.gz
Description: GNU Zip compressed data


[OMPI users] which is better: 64x1 or 32x2

2006-04-26 Thread hpc
Hi,

I want to build an HPC cluster for running the mm5 and wien2k
scientific applications for my physics college. Both of them
use MPI.

Interconnection between nodes: GigEth (Cisco 24 port GigEth)

It seems I have two choices for nodes:
 * 32 dual-core Opteron processors (1 GB RAM for each node)
 * 64 single-core Opteron processors (2 GB RAM for each node)

Which is better (performance & price)?



Re: [OMPI users] which is better: 64x1 or 32x2

2006-04-26 Thread Jeff Squyres (jsquyres)
You might want to take this question over to the Beowulf list -- they
talk a lot more about cluster configurations than we do -- and/or the
mm5 and wien2k support lists, since they know the details of those
applications.  If you're going to have a cluster for a specific set of
applications, it can be best to get input from the developers who know
the applications best and what their communication characteristics
are.



> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of h...@gurban.org
> Sent: Wednesday, April 26, 2006 12:23 PM
> To: us...@open-mpi.org
> Subject: [OMPI users] which is better: 64x1 or 32x2
> 
> Hi,
> 
> I want to build an hpc cluster for running mm5 and wien2k
> scientific applications for my physics college. both of them
> use MPI.
> 
> Interconnection between nodes: GigEth (Cisco 24 port GigEth)
> 
> It seems I have two choices for nodes:
>  * 32 dual core opteron processors (1 GB ram for each node)
>  * 64 single core opteron processors (2 GB ram for each node)
> 
> Which is better (performance & price)?
> 



[OMPI users] Make and config error

2006-04-26 Thread sdamjad
Brian,
I changed the lSystemStubs linking for the gcc compiler.  I am enclosing a
tar file that has the output of config.log, config.out, make.out,
and makeinstall.out.


ask.tar
Description: Unix tar archive


Re: [OMPI users] which is better: 64x1 or 32x2

2006-04-26 Thread damien
A thing to look at is how much bandwidth the models require compared to
the CPU load.  You can redline Gigabit Ethernet with a 1 GHz PIII and a
64-bit PCI bus.  Opterons on a decent motherboard will definitely keep a
gigabit line chock full.  With dual-core you get the advantage of very
fast processor-to-processor communication, but you'll run the risk of
choking on the Ethernet connection.  You might be OK if you can get
dual Ethernet connections on the motherboard and run channel bonding to
increase the bandwidth, but your switch has to be able to handle it.
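
A minimal sketch of what a bonded pair could look like on a Linux compute
node (everything here -- interface names, address, and the round-robin
mode -- is an assumption; the right mode depends on what the Cisco switch
supports):

   # Load the bonding driver (options would typically go in /etc/modprobe.conf):
   #   alias bond0 bonding
   #   options bond0 mode=balance-rr miimon=100
   modprobe bonding

   # Bring up the bond and enslave both onboard NICs (names are placeholders):
   ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
   ifenslave bond0 eth0 eth1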

Damien

> You might want to take this question over to the Beowulf list -- they
> talk a lot more about cluster configurations than we do -- and/or the
> mm5 and wien2k support lists (since they know the details of those
> applications -- if you're going to have a cluster for a specific set of
> applications, it can be best to get input from the developers who know
> the applications best, and what their communication characteristics
> are).
>
>
>
>> -Original Message-
>> From: users-boun...@open-mpi.org
>> [mailto:users-boun...@open-mpi.org] On Behalf Of h...@gurban.org
>> Sent: Wednesday, April 26, 2006 12:23 PM
>> To: us...@open-mpi.org
>> Subject: [OMPI users] which is better: 64x1 or 32x2
>>
>> Hi,
>>
>> I want to build an hpc cluster for running mm5 and wien2k
>> scientific applications for my physics college. both of them
>> use MPI.
>>
>> Interconnection between nodes: GigEth (Cisco 24 port GigEth)
>>
>> It seems I have two choices for nodes:
>>  * 32 dual core opteron processors (1 GB ram for each node)
>>  * 64 single core opteron processors (2 GB ram for each node)
>>
>> Which is better (performance & price)?
>>




Re: [OMPI users] Spawn and Disconnect

2006-04-26 Thread Michael Kluskens
Correction on this: this problem only occurs (with Open MPI 1.2) when
I don't use mpirun to launch my process.


I know this seems strange to most MPI users, but it turns out that when
using Open MPI and only needing one initial process (because I spawn
everything else I need), I had found it quicker just to launch the
executable directly.


I have only confirmed that my test code works with Open MPI 1.2 (if I
have trouble I'll test 1.1).  Below is the proper output for my test of
spawning, disconnecting, and respawning:


>mpirun -np 1 parent2
parent:  0  of  1
parent: How many processes total?
2
parent: Calling MPI_Comm_spawn to start  1  subprocesses.
child starting
parent returned from Comm_Spawn call
parent: Calling MPI_BCAST with btest =  17 .  child =  3
child 0 of 1:  Parent 3
parent: Calling MPI_Comm_spawn to start  1  subprocesses.
child 0 of 1:  Receiving   17 from parent
child calling COMM_FREE
child calling FINALIZE
child exiting
Maximum user memory allocated: 0
child starting
parent: Calling MPI_BCAST with btest =  17 .  child =  3
child 0 of 1:  Parent 3
child 0 of 1:  Receiving   17 from parent
child calling COMM_FREE
child calling FINALIZE

Michael

On Apr 25, 2006, at 2:57 PM, Michael Kluskens wrote:

I'm running Open MPI 1.1 (v9704), and when a spawned process exits
the parent does not die (see previous discussions about
1.0.1/1.0.2); however, the next time the parent tries to spawn a
process, MPI_Comm_spawn does not return.


My test output below:

 parent:  0  of  1
parent: How many processes total?
2
parent: Calling MPI_Comm_spawn to start  1  subprocesses.
child starting
parent returned from Comm_Spawn call
parent: Calling MPI_BCAST with btest =  17 .  child =  3
child 0 of 1:  Parent 3
parent: Calling MPI_Comm_spawn to start  1  subprocesses.
child 0 of 1:  Receiving   17 from parent
child calling COMM_FREE
child calling FINALIZE
child exiting

Notice there is no message saying "parent returned from Comm_Spawn";
the parent just sits there, and obviously the second set of
processes doesn't get launched.


Quick note on code fixes: my child process now calls
MPI_COMM_FREE(parent, ierr) to free the communicator to the parent
before exiting; in an earlier version of 1.1 this crashed the code.
I'm guessing this is the right thing to do: the Complete Reference book
has an example without it, and the Using MPI-2 book has a more
detailed example that includes it.  In either case, I get the same
results regardless.


Background from the previous discussion on this follows.  It will cost
me less to test new versions of Open MPI against this than to work
around this issue in my project.


Michael

On Mar 2, 2006, at 1:55 PM, Ralph Castain wrote:

We expect to have much better support for the entire comm_spawn  
process in the next incarnation of the RTE. I don't expect that to  
be included in a release, however, until 1.1 (Jeff may be able to  
give you an estimate for when that will happen).


Jeff et al may be able to give you access to an early non-release  
version sooner, if better comm_spawn support is a critical issue  
and you don't mind being patient with the inevitable bugs in such  
versions.


Ralph


Edgar Gabriel wrote:
Open MPI currently does not fully support a proper disconnection
of parent and child processes.  Thus, if a child dies/aborts, the
parents will abort as well, despite calling
MPI_Comm_disconnect.  (The new RTE will have better support for
these operations; Ralph/Jeff can probably give a better estimate of
when this will be available.)  However, what should not happen is
that if the child calls MPI_Finalize (so not a violent death but
a proper shutdown), the parent goes down at the same time.  Let me
check that as well...

Brignone, Sergio wrote:
Hi everybody, I am trying to run a master/slave set.  Because of
the nature of the problem I need to start and stop (kill) some
slaves.  The problem is that as soon as one of the slaves dies,
the master dies also.








Re: [OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704

2006-04-26 Thread Michael Kluskens
I made another test and the problem does not occur with
--with-mpi-f90-size=medium.
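
For reference, the full workaround sequence (a sketch; it assumes a clean
tree and reuses the flags from my earlier configure line):

   make distclean
   ./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=medium
   make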


Michael

On Apr 26, 2006, at 11:50 AM, Michael Kluskens wrote:


Open MPI 1.2a1r9704
Summary: configure with --with-mpi-f90-size=large and then make.

/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory


I doubt this one is system specific
---
my details:

Building OpenMPI 1.2a1r9704 with g95 (Apr 23 2006) on OS X 10.4.6  
using


./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=large


Configures fine but make gives the error listed above.  However no  
error if I don't specify f90-size=large.


./scripts/mpi_allgather_f90.f90.sh /Users/mkluskens/Public/MPI/OpenMPI/openmpi-1.2a1r9704/ompi/mpi/f90 > mpi_allgather_f90.f90
/bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory

make[4]: *** [mpi_allgather_f90.f90] Error 127
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

mpi_allgather_f90.f90.sh does not exist in my configured and built Open MPI
1.1a3r9704 so I can't compare between the two.


I assume it should be generated into ompi/mpi/f90/scripts.




Re: [OMPI users] Make and config error

2006-04-26 Thread Brian Barrett

On Apr 26, 2006, at 2:49 PM, sdamjad wrote:


Brian,
I changed the lSystemStubs linking for the gcc compiler.  I am enclosing a
tar file that has the output of config.log, config.out, make.out,
and makeinstall.out.


Are you trying to report a problem?  From your logs, everything  
looked ok.


Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




[OMPI users] Xgrid and Open MPI

2006-04-26 Thread knapper
I'm having trouble running apps with multiple inputs using the xgrid backend to
mpirun.  I can't find any options to send files to the nodes as I would be able
to do via simple xgrid command-line options.  In addition, the output files
don't show up.  E.g., when I run LAMMPS locally, I get a dump file, but when I
run it on the cluster, there is no dump file.  Is there a return mechanism in
the current xgrid interface?

This is unrelated to Open MPI.  I've been looking at setting up NFS to deal
with the input problems and make installation of MPI apps easier, but Mac OS X
10.4 seems to have NFS disabled somehow.  Is there another protocol I can set
up on 20 nodes without using a GUI on each node?
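
For what it's worth, a rough sketch of the kind of shared-filesystem setup
described above, assuming the export lives on a generic Linux/Unix file server
rather than on the Tiger nodes themselves (hostnames, paths, and the address
range are all placeholders):

   # On the file server (Linux-style /etc/exports; exact syntax varies by OS):
   #   /export/mpi  192.168.1.0/24(rw,sync)
   exportfs -ra

   # On each Mac OS X 10.4 node, over ssh, no GUI required:
   sudo mkdir -p /Network/mpi
   sudo mount -t nfs fileserver:/export/mpi /Network/mpi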



Re: [OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704

2006-04-26 Thread Jeff Squyres (jsquyres)
Ok, I am investigating -- I think I know what the problem is, but the
guy who did the bulk of the F90 work in OMPI is out traveling for a few
days (making these fixes take a little while). 

> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Michael Kluskens
> Sent: Wednesday, April 26, 2006 3:38 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] missing mpi_allgather_f90.f90.sh in openmpi-1.2a1r9704
> 
> I made another test and the problem does not occur with --with-mpi-f90-size=medium.
> 
> Michael
> 
> On Apr 26, 2006, at 11:50 AM, Michael Kluskens wrote:
> 
> > Open MPI 1.2a1r9704
> > Summary: configure with --with-mpi-f90-size=large and then make.
> >
> > /bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory
> >
> > I doubt this one is system specific
> > ---
> > my details:
> >
> > Building OpenMPI 1.2a1r9704 with g95 (Apr 23 2006) on OS X 10.4.6  
> > using
> >
> > ./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs --with-mpi-f90-size=large
> >
> > Configures fine but make gives the error listed above.  However no  
> > error if I don't specify f90-size=large.
> >
> > ./scripts/mpi_allgather_f90.f90.sh /Users/mkluskens/Public/MPI/OpenMPI/openmpi-1.2a1r9704/ompi/mpi/f90 > mpi_allgather_f90.f90
> > /bin/sh: line 1: ./scripts/mpi_allgather_f90.f90.sh: No such file or directory
> > make[4]: *** [mpi_allgather_f90.f90] Error 127
> > make[3]: *** [all-recursive] Error 1
> > make[2]: *** [all-recursive] Error 1
> > make[1]: *** [all-recursive] Error 1
> > make: *** [all-recursive] Error 1
> >
> > mpi_allgather_f90.f90.sh does not exist in my configured and built
> > Open MPI 1.1a3r9704 so I can't compare between the two.
> >
> > I assume it should be generated into ompi/mpi/f90/scripts.
> 