Re: [OMPI users] Communication between OpenMPI and ClusterTools

2008-07-30 Thread Terry Dontje
One last note to close this out.  After some discussion on the 
developers list it was pointed out that this problem was fixed with new 
code in the trunk and 1.3 branch.  So my statement below, that the trunk, 
1.3 and CT8 EA2 support nodes on different subnets, can be made 
stronger: we really do expect this to work.


--td
Terry Dontje wrote:

Terry Dontje wrote:


Date: Tue, 29 Jul 2008 14:19:14 -0400
From: "Alexander Shabarshin" 
Subject: Re: [OMPI users] Communication between OpenMPI and
ClusterTools
To: 
Message-ID: <00b701c8f1a7$9c24f7c0$c8afcea7@Shabarshin>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
reply-type=response

Hello

 
>>> One idea comes to mind is whether the two nodes are on the
>>> same subnet? If they are not on the same subnet I think
>>> there is a bug in which the TCP BTL will recuse itself
>>> from communications between the two nodes.

>> you are right - subnets are different, but routes set up
>> correctly and everything like ping, ssh etc. are working OK
>> between them

> But it isn't a routing problem but how the tcp btl in Open MPI
> decides which interface the nodes can communicate with
> (completely out of the hands of the TCP stack and lower).



Do you know when it can be fixed in official OpenMPI?
Is a patch available or something?
  
Well, this problem is captured in ticket 972 
(https://svn.open-mpi.org/trac/ompi/ticket/972).  There is a question 
as to whether this ticket has been fixed or not (that is, whether the code 
was actually put back).  Sun's experience with the trunk, 1.3 branch and 
CT8 EA2 release seems to be that you now can run jobs across subnets, 
but we (Sun) are not completely



I guess I should have ended with "mumble..mumble" :-)
Now for the rest of the sentence:

... sure whether the support is truly in there or we just got lucky in 
how our setup was configured.


--td
FWIW, it looks like that code has had a lot of changes in it between 
1.2 and 1.3.


--td








Re: [OMPI users] TCP Latency

2008-07-30 Thread Andy Georgi

Thanks again for all the answers. It seems that there was a bug in the driver in 
combination with
SUSE Linux Enterprise Server 10. It was fixed with version 1.0.146. Now we have 
12us with NPtcp and
22us with NPmpi. This is still not fast enough, but acceptable for the time being. I 
will check the
alternatives as soon as possible and look forward to OpenMPI 1.3. Then we will 
see what iWARP brings
;-).

Best regards,

Andy

Kozin, I (Igor) wrote:

Thanks for the fast answer. So is this latency normal for TCP
communications over MPI!? Could RDMA maybe reduce the latency? It
should work with those cards but there are still problems with OFED.
iWARP is also one of the features they offer but if it works...


Hi Andy,
Yes, ~40us TCP latency is normal (it can be worse too).
If you need lower MPI latency you need to look elsewhere (but it's not
going to be TCP). Check SCore, OpenMX and Gamma. SCore is the most mature of
the three, but OpenMX looks promising too. We get less than 15 us using
SCore MPI and Intel NICs (IMB PingPong). Of course, commercial MPI
libraries offer low latency too, e.g. Scali MPI.

Best,
Igor


--

Dresden University of Technology
Center for Information Services
and High Performance Computing (ZIH)
D-01062 Dresden
Germany

e-mail: andy.geo...@zih.tu-dresden.de
WWW:http://www.tu-dresden.de/zih

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


I. Kozin  (i.kozin at dl.ac.uk)
Computational Science and Engineering Dept.
STFC Daresbury Laboratory 
Daresbury Science and Innovation Centre
Daresbury, Warrington, WA4 4AD 
skype: in_kozin

tel: +44 (0) 1925 603308
fax: +44 (0) 1925 603634
http://www.cse.clrc.ac.uk/disco


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--


Dresden University of Technology
Center for Information Services
and High Performance Computing (ZIH)
D-01062 Dresden
Germany

Phone:(+49) 351/463-38783
Fax:  (+49) 351/463-38245

e-mail: andy.geo...@zih.tu-dresden.de
WWW:http://www.tu-dresden.de/zih


Re: [OMPI users] Segmentation fault: Address not mapped

2008-07-30 Thread James Philbin
Hi,

OK, to answer my own question, I recompiled OpenMPI appending
'--with-memory-manager=none' to configure and now things seem to run
fine. I'm not sure how this might affect performance, but at least
it's working now. Maybe this can be put in the FAQ?

James

On Wed, Jul 30, 2008 at 2:02 AM, James Philbin  wrote:
> Hi,
>
> I'm running an mpi module in python (pypar), but I believe (after
> googling) that this might be a problem with openmpi.
> When I run: 'python -c "import pypar"', I get:
> [titus:21965] *** Process received signal ***
> [titus:21965] Signal: Segmentation fault (11)
> [titus:21965] Signal code: Address not mapped (1)
> [titus:21965] Failing at address: 0x837a004
> [titus:21965] [ 0] /lib/i686/libpthread.so.0 [0x40035f93]
> [titus:21965] [ 1] python [0x42029180]
> [titus:21965] [ 2] /users/james/lib/libopen-pal.so.0(free+0xbc) [0x40e112fc]
> [titus:21965] [ 3]
> /users/james/lib/libopen-pal.so.0(mca_base_components_open+0x83)
> [0x40dff9b3]
> [titus:21965] [ 4]
> /users/james/lib/libmpi.so.0(mca_allocator_base_open+0x46)
> [0x40cb03b6]
> [titus:21965] [ 5] /users/james/lib/libmpi.so.0(ompi_mpi_init+0x3dd)
> [0x40c7b7dd]
> [titus:21965] [ 6] /users/james/lib/libmpi.so.0(MPI_Init+0xef) [0x40c9fb1f]
> [titus:21965] [ 7]
> /users/james/lib/python2.5/site-packages/pypar/mpiext.so [0x40576613]
> [titus:21965] [ 8] python(PyCFunction_Call+0x5a) [0x810c9ea]
> [titus:21965] [ 9] python [0x80bb2fb]
> [titus:21965] [10] python(PyEval_EvalFrameEx+0x22d2) [0x80b97a2]
> [titus:21965] [11] python(PyEval_EvalCodeEx+0x376) [0x80ba0b6]
> [titus:21965] [12] python(PyEval_EvalCode+0x57) [0x80bcfe7]
> [titus:21965] [13] python(PyImport_ExecCodeModuleEx+0x13a) [0x80d0b9a]
> [titus:21965] [14] python [0x80d3eeb]
> [titus:21965] [15] python [0x80d180e]
> [titus:21965] [16] python [0x80d27b6]
> [titus:21965] [17] python [0x80d2309]
> [titus:21965] [18] python [0x80d45bf]
> [titus:21965] [19] python(PyImport_ImportModuleLevel+0x90) [0x80d3a40]
> [titus:21965] [20] python [0x80b3dda]
> [titus:21965] [21] python(PyCFunction_Call+0xce) [0x810ca5e]
> [titus:21965] [22] python(PyObject_Call+0x29) [0x805eca9]
> [titus:21965] [23] python(PyEval_CallObjectWithKeywords+0x75) [0x80bae95]
> [titus:21965] [24] python(PyEval_EvalFrameEx+0x2041) [0x80b9511]
> [titus:21965] [25] python(PyEval_EvalCodeEx+0x376) [0x80ba0b6]
> [titus:21965] [26] python(PyEval_EvalCode+0x57) [0x80bcfe7]
> [titus:21965] [27] python(PyImport_ExecCodeModuleEx+0x13a) [0x80d0b9a]
> [titus:21965] [28] python [0x80d3eeb]
> [titus:21965] [29] python [0x80d180e]
>
> I've built openmpi from the 1.2.6 sources with the following configure
> flag: './configure --disable-dlopen --prefix=/users/james'.
>
> pypar seems to work fine on my ubuntu system also with openmpi
> (installed from repositories). I'm tearing my hair out trying to solve
> this, so any advice would be very welcome.
>
> Thanks,
> James
>


Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Mark Borgerding
I keep checking my email in hopes that someone will come up with 
something that Matt or I might've missed. 

I'm just having a hard time accepting that something so fundamental 
would be so broken.
The MPI_Comm_spawn command is essentially useless without the ability to 
spawn processes on other nodes.


If this is true, then my personal scorecard reads:
   # Days spent using openmpi: 4 (off and on)
   # identified bugs in openmpi :2
   # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to find 
out where I've been stupid and what documentation I should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have not
tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info object to
set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
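
For reference, here is a minimal sketch of the Info-object approach Matt 
describes, assuming a hypothetical child binary called "worker" and the 
op2-2 host name used in this thread (error checking omitted):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Info info;
    MPI_Comm children;

    MPI_Init(&argc, &argv);

    /* reserved "host" key: ask for the child to be placed on op2-2 */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "op2-2");

    /* spawn one copy of ./worker; 'children' is the resulting intercomm */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info,
                   0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

    MPI_Info_free(&info);
    /* ... exchange data with the child over 'children' ... */
    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}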


Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Ralph Castain
As your own tests have shown, it works fine if you just "mpirun -n 1 
./spawner". It is only singleton comm_spawn that appears to be having a  
problem in the latest 1.2 release. So I don't think comm_spawn is  
"useless". ;-)


I'm checking this morning to ensure that singletons properly spawn on  
other nodes in the 1.3 release. I sincerely doubt we will backport a  
fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with  
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental  
would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
  # Days spent using openmpi: 4 (off and on)
  # identified bugs in openmpi :2
  # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to  
find out where I've been stupid and what documentation I should've  
read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have  
not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info object  
to

set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Ralph Castain
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving to  
that version. You can get a pre-release version off of the www.open-mpi.org 
 web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun -n  
1 ./spawner". It is only singleton comm_spawn that appears to be  
having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly spawns  
on other nodes in the 1.3 release. I sincerely doubt we will  
backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with  
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental  
would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
 # Days spent using openmpi: 4 (off and on)
 # identified bugs in openmpi :2
 # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to  
find out where I've been stupid and what documentation I should've  
read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have  
not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info  
object to

set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Mark Borgerding
Just to clarify: the test code I wrote does *not* use MPI_Comm_spawn in 
the mpirun case.  The problem may or may not exist under miprun.




Ralph Castain wrote:
As your own tests have shown, it works fine if you just "mpirun -n 1 
./spawner". It is only singleton comm_spawn that appears to be having 
a problem in the latest 1.2 release. So I don't think comm_spawn is 
"useless". ;-)


I'm checking this morning to ensure that singletons properly spawns on 
other nodes in the 1.3 release. I sincerely doubt we will backport a 
fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with 
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental 
would be so broken.
The MPI_Comm_spawn command is essentially useless without the ability 
to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
  # Days spent using openmpi: 4 (off and on)
  # identified bugs in openmpi :2
  # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to find 
out where I've been stupid and what documentation I should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have not
tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info object to
set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Mark Borgerding

I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 (which uses openmpi 1.2.6).

I will try to find some sort of workaround. Any suggestions on how to 
"fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no mpirun).
I want to start remote processes to crunch the data.
The shared object marshalls the I/O between the framework and the remote 
processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if 
singleton comm_spawn is critical to your plans, I suggest moving to 
that version. You can get a pre-release version off of the 
www.open-mpi.org web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun -n 1 
./spawner". It is only singleton comm_spawn that appears to be having 
a problem in the latest 1.2 release. So I don't think comm_spawn is 
"useless". ;-)


I'm checking this morning to ensure that singletons properly spawns 
on other nodes in the 1.3 release. I sincerely doubt we will backport 
a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with 
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental 
would be so broken.
The MPI_Comm_spawn command is essentially useless without the 
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
 # Days spent using openmpi: 4 (off and on)
 # identified bugs in openmpi :2
 # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to 
find out where I've been stupid and what documentation I should've 
read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have not
tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info object to
set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Robert Kubrick
Mark, if you can run a server process on the remote machine, you  
could send a request from your local MPI app to your server, then use  
an Intercomm to link the local process to the new remote process?


On Jul 30, 2008, at 9:55 AM, Mark Borgerding wrote:


I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how  
to "fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no mpirun).
I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the  
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving  
to that version. You can get a pre-release version off of the  
www.open-mpi.org web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun - 
n 1 ./spawner". It is only singleton comm_spawn that appears to  
be having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly  
spawns on other nodes in the 1.3 release. I sincerely doubt we  
will backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with  
something that Matt or I might've missed.
I'm just having a hard time accepting that something so  
fundamental would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
 # Days spent using openmpi: 4 (off and on)
 # identified bugs in openmpi :2
 # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to  
find out where I've been stupid and what documentation I  
should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to  
give
OMPI a hosts file!  It seems the singleton functionality is  
lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I  
have not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn  
call
should start the second on op2-2.  If you don't use the Info  
object to

set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Communication between OpenMPI and ClusterTools

2008-07-30 Thread Alexander Shabarshin
OK, thanks! 


Is it possible to fix it somehow directly in the 1.2.x codebase?

- Original Message - 
From: "Terry Dontje" 

To: 
Sent: Wednesday, July 30, 2008 7:15 AM
Subject: Re: [OMPI users] Communication between OpenMPI and ClusterTools


One last note to close this out.  After some discussion on the 
developers list it was pointed out that this problem was fixed with new 
code in the trunk and 1.3 branch.  So my statement below, that the trunk, 
1.3 and CT8 EA2 support nodes on different subnets, can be made 
stronger: we really do expect this to work.


--td
Terry Dontje wrote:

Terry Dontje wrote:


Date: Tue, 29 Jul 2008 14:19:14 -0400
From: "Alexander Shabarshin" 
Subject: Re: [OMPI users] Communication between OpenMPI and
ClusterTools
To: 
Message-ID: <00b701c8f1a7$9c24f7c0$c8afcea7@Shabarshin>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
reply-type=response

Hello

 
>>> One idea comes to mind is whether the two nodes are on the
>>> same subnet? If they are not on the same subnet I think
>>> there is a bug in which the TCP BTL will recuse itself
>>> from communications between the two nodes.

>> you are right - subnets are different, but routes set up
>> correctly and everything like ping, ssh etc. are working OK
>> between them

> But it isn't a routing problem but how the tcp btl in Open MPI
> decides which interface the nodes can communicate with
> (completely out of the hands of the TCP stack and lower).



Do you know when it can be fixed in official OpenMPI?
Is a patch available or something?
  
Well, this problem is captured in ticket 972 
(https://svn.open-mpi.org/trac/ompi/ticket/972).  There is a question 
as to whether this ticket has been fixed or not (that is, whether the code 
was actually put back).  Sun's experience with the trunk, 1.3 branch and 
CT8 EA2 release seems to be that you now can run jobs across subnets, 
but we (Sun) are not completely



I guess I should have ended with "mumble..mumble" :-)
Now for the rest of the sentence:

... sure whether the support is truly in there or we just got lucky in 
how our setup was configured.


--td
FWIW, it looks like that code has had a lot of changes in it between 
1.2 and 1.3.


--td






___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Ralph Castain
The problem would be finding a way to tell all the MPI apps how to  
contact each other, as the Intercomm procedure needs that info to  
complete. I don't recall if the MPI_Publish_name/MPI_Lookup_name functions  
worked in 1.2 - I'm building the code now to see.


If it does, then you could use it to get the required contact info and  
wire up the Intercomm...it's a lot of what goes on under the  
comm_spawn covers anyway. Only diff is the necessity for the server...
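
A rough sketch of that wiring, assuming both jobs can reach the same name 
server (e.g. the mpirun or persistent orted they share); the service name 
"my-service" is made up and error checking is omitted:

#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm other;

    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* "server" side: open a port and publish it under a service name */
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("my-service", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &other);
        MPI_Unpublish_name("my-service", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        /* "client" side: look the port up and connect to it */
        MPI_Lookup_name("my-service", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &other);
    }

    /* 'other' is now an intercommunicator between the two jobs */
    MPI_Comm_disconnect(&other);
    MPI_Finalize();
    return 0;
}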


On Jul 30, 2008, at 8:24 AM, Robert Kubrick wrote:

Mark, if you can run a server process on the remote machine, you  
could send a request from your local MPI app to your server, then  
use an Intercomm to link the local process to the new remote process?


On Jul 30, 2008, at 9:55 AM, Mark Borgerding wrote:


I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how  
to "fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no  
mpirun).

I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the  
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving  
to that version. You can get a pre-release version off of the www.open-mpi.org 
 web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun - 
n 1 ./spawner". It is only singleton comm_spawn that appears to  
be having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly  
spawns on other nodes in the 1.3 release. I sincerely doubt we  
will backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with  
something that Matt or I might've missed.
I'm just having a hard time accepting that something so  
fundamental would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
# Days spent using openmpi: 4 (off and on)
# identified bugs in openmpi :2
# useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to  
find out where I've been stupid and what documentation I  
should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to  
give
OMPI a hosts file!  It seems the singleton functionality is  
lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I  
have not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn  
call
should start the second on op2-2.  If you don't use the Info  
object to
set the hostname specifically, then on 1.2.x it will  
automatically

start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Ralph Castain
Okay, I tested it and MPI_Publish_name and MPI_Lookup_name work on  
1.2.6, so this may provide an avenue (albeit cumbersome) for you to  
get this to work. It may require a server, though, to make it work -  
your first MPI proc may be able to play that role if you pass its  
contact info to the others, but I'd have to play with it for a while to  
be sure. Haven't really tried that before.


Otherwise, even if we devised a fix for the singleton comm_spawn in  
1.2, it would still require an upgrade by the customer as it wouldn't  
be in 1.2.6 - best that could happen is for it to appear in 1.2.7,  
assuming we created the fix for that impending release (far from  
certain).


So if this doesn't work, and the customer cannot or will not upgrade  
from 1.2.6, I fear you probably cannot do this with OMPI under the  
constraints you describe.



On Jul 30, 2008, at 8:36 AM, Ralph Castain wrote:

The problem would be finding a way to tell all the MPI apps how to  
contact each other, as the Intercomm procedure needs that info to  
complete. I don't recall if the MPI_Publish_name/MPI_Lookup_name functions  
worked in 1.2 - I'm building the code now to see.


If it does, then you could use it to get the required contact info  
and wire up the Intercomm...it's a lot of what goes on under the  
comm_spawn covers anyway. Only diff is the necessity for the server...


On Jul 30, 2008, at 8:24 AM, Robert Kubrick wrote:

Mark, if you can run a server process on the remote machine, you  
could send a request from your local MPI app to your server, then  
use an Intercomm to link the local process to the new remote process?


On Jul 30, 2008, at 9:55 AM, Mark Borgerding wrote:


I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how  
to "fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no  
mpirun).

I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the  
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving  
to that version. You can get a pre-release version off of the www.open-mpi.org 
 web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun - 
n 1 ./spawner". It is only singleton comm_spawn that appears to  
be having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly  
spawns on other nodes in the 1.3 release. I sincerely doubt we  
will backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up  
with something that Matt or I might've missed.
I'm just having a hard time accepting that something so  
fundamental would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
# Days spent using openmpi: 4 (off and on)
# identified bugs in openmpi :2
# useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance --  
to find out where I've been stupid and what documentation I  
should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to  
give
OMPI a hosts file!  It seems the singleton functionality is  
lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I  
have not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn  
call
should start the second on op2-2.  If you don't use the Info  
object to
set the hostname specifically, then on 1.2.x it will  
automatically

start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailm

Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Mark Borgerding
I appreciate the suggestion about running a daemon on each of the remote 
nodes, but wouldn't I kind of be reinventing the wheel there? Process 
management is one of the things I'd like to be able to count on ORTE for.  

Would the following work to give the parent process an intercomm with 
each child?


The parent (i.e. my non-mpirun-started process) calls MPI_Init, then 
MPI_Open_port.
The parent spawns an mpirun command via system/exec to create the remote 
children. The name from MPI_Open_port is placed in the environment.

The parent calls MPI_Comm_accept (once for each child?).
All children call MPI_Comm_connect to that name.

I think this would give one intercommunicator back to the parent for 
each remote process (not ideal, but I can worry about broadcast data later).

The remote processes can communicate with each other through MPI_COMM_WORLD.


Actually when I think through the details, much of this is pretty 
similar to the daemon MPI_Publish_name+MPI_Lookup_name approach.  The 
main difference being which processes come first.
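
A rough parent-side sketch of that flow; the child count, the "./worker" 
binary, the PARENT_PORT variable name and the -H host list are all made up 
for illustration, and error checking is omitted. Each child would 
getenv("PARENT_PORT") and call MPI_Comm_connect over MPI_COMM_SELF, so the 
parent gets one intercommunicator per accept:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NCHILDREN 2

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    char cmd[2 * MPI_MAX_PORT_NAME];
    MPI_Comm child[NCHILDREN];
    int i;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port);

    /* hand the port name to the children via the environment; the '&'
       keeps system() from blocking until mpirun exits */
    snprintf(cmd, sizeof(cmd),
             "PARENT_PORT='%s' mpirun -x PARENT_PORT -np %d "
             "-H op2-1,op2-2 ./worker &", port, NCHILDREN);
    system(cmd);

    /* one accept per child -> one intercommunicator per child */
    for (i = 0; i < NCHILDREN; i++)
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &child[i]);

    /* ... marshall data between the framework and the children ... */

    for (i = 0; i < NCHILDREN; i++)
        MPI_Comm_disconnect(&child[i]);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}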





Mark Borgerding wrote:

I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how to 
"fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no mpirun).
I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the 
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if 
singleton comm_spawn is critical to your plans, I suggest moving to 
that version. You can get a pre-release version off of the 
www.open-mpi.org web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun -n 1 
./spawner". It is only singleton comm_spawn that appears to be 
having a problem in the latest 1.2 release. So I don't think 
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly spawns 
on other nodes in the 1.3 release. I sincerely doubt we will 
backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up with 
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental 
would be so broken.
The MPI_Comm_spawn command is essentially useless without the 
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
 # Days spent using openmpi: 4 (off and on)
 # identified bugs in openmpi :2
 # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance -- to 
find out where I've been stupid and what documentation I should've 
read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to give
OMPI a hosts file!  It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I have 
not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2.  If you don't use the Info 
object to

set the hostname specifically, then on 1.2.x it will automatically
start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Jeff Squyres

On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:

I appreciate the suggestion about running a daemon on each of the  
remote nodes, but wouldn't I kind of be reinventing the wheel there?  
Process management is one of the things I'd like to be able to count  
on ORTE for.


Keep in mind that the daemons here are not for process management --  
they're for name service.


Would the following work to give the parent process an intercomm  
with each child?


parent i.e. my non-mpirun-started process  calls  MPI_Init then  
MPI_Open_port
parent spawns mpirun command via system/exec to create the remote  
children . The name from MPI_Open_port is placed in the environment.

parent calls MPI_Comm_accept (once for each child?)
all children call MPI_connect to the name


It may be problematic to call system/exec in some environments (e.g.,  
if using OpenFabrics networks).  Bad Things can happen.


I think this would give one intercommunicator back to the parent for  
each remote process (not ideal, but I can worry about broadcast data  
later)
The remote processes can communicate to each other through  
MPI_COMM_WORLD.



Actually when I think through the details, much of this is pretty  
similar to the daemon MPI_Publish_name+MPI_Lookup_name approach.   
The main difference being which processes come first.


Instead of having the framework call MPI_Init in your plugin, can your  
plugin system/exec "mpirun -np 1 my_parent_app"?  And perhaps use a  
pipe (or socket or some other IPC) to communicate between the  
framework process and my_parent_app?  I realize it's a kludgey  
workaround, but it looks like we clearly have a bug in the 1.2 series  
with singletons in this area...
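
A rough sketch of that workaround, with the plugin forking mpirun and 
keeping a pipe to it ("./my_parent_app" is a made-up binary name; error 
checking omitted). my_parent_app, having been started by mpirun, can then 
call MPI_Comm_spawn normally and relay data back over the pipe:

#include <sys/types.h>
#include <unistd.h>

/* called from the framework plugin instead of MPI_Init */
pid_t launch_mpi_side(int *to_mpi_fd)
{
    int fds[2];
    pid_t pid;

    pipe(fds);                       /* fds[0] = read end, fds[1] = write end */
    pid = fork();
    if (pid == 0) {                  /* child: becomes mpirun */
        dup2(fds[0], STDIN_FILENO);  /* my_parent_app reads requests on stdin */
        close(fds[0]);
        close(fds[1]);
        execlp("mpirun", "mpirun", "-np", "1", "./my_parent_app", (char *)NULL);
        _exit(127);                  /* only reached if exec failed */
    }
    close(fds[0]);                   /* plugin keeps only the write end */
    *to_mpi_fd = fds[1];
    return pid;
}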


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Robert Kubrick


On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:

I appreciate the suggestion about running a daemon on each of the  
remote nodes, but wouldn't I kind of be reinventing the wheel  
there? Process management is one of the things I'd like to be able  
to count on ORTE for.
Would the following work to give the parent process an intercomm  
with each child?


parent i.e. my non-mpirun-started process  calls  MPI_Init then  
MPI_Open_port
parent spawns mpirun command via system/exec to create the remote  
children . The name from MPI_Open_port is placed in the environment.

parent calls MPI_Comm_accept (once for each child?)


I think you have to create a separate thread to run the accept, in  
order to accept multiple client connections. This should be supported  
by OpenMPI as it was the original idea of the API design to handle  
multiple client connections. There is an MPI_Comm_accept multi-thread  
example in the book Using MPI-2.



all children call MPI_connect to the name

I think this would give one intercommunicator back to the parent  
for each remote process (not ideal, but I can worry about broadcast  
data later)
The remote processes can communicate to each other through  
MPI_COMM_WORLD.


You should be able to merge each child communicator from each accept  
thread into a global comm anyway.
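
A rough sketch of that merging step, assuming the "accept over the growing 
communicator" pattern: every process already in 'current' makes the same two 
calls, since both operations are collective, and the newly connecting job 
does the matching MPI_Comm_connect plus merge on its side:

#include <mpi.h>

/* grow an intracommunicator by one connecting job */
MPI_Comm accept_and_merge(char *port, MPI_Comm current)
{
    MPI_Comm inter, merged;

    /* collective over 'current': everyone already connected participates */
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, current, &inter);

    /* high = 0: existing processes are ordered before the newcomers */
    MPI_Intercomm_merge(inter, 0, &merged);
    MPI_Comm_free(&inter);
    return merged;
}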





Actually when I think through the details, much of this is pretty  
similar to the daemon MPI_Publish_name+MPI_Lookup_name approach.   
The main difference being which processes come first.


You can run a daemon through system/exec the same way you run  
mpiexec. Just use ssh or rsh on the system/exec call.







Mark Borgerding wrote:

I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how  
to "fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no  
mpirun).

I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the  
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving  
to that version. You can get a pre-release version off of the  
www.open-mpi.org web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun - 
n 1 ./spawner". It is only singleton comm_spawn that appears to  
be having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly  
spawns on other nodes in the 1.3 release. I sincerely doubt we  
will backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up  
with something that Matt or I might've missed.
I'm just having a hard time accepting that something so  
fundamental would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
 # Days spent using openmpi: 4 (off and on)
 # identified bugs in openmpi :2
 # useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance --  
to find out where I've been stupid and what documentation I  
should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to  
give
OMPI a hosts file!  It seems the singleton functionality is  
lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I  
have not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn  
call
should start the second on op2-2.  If you don't use the Info  
object to
set the hostname specifically, then on 1.2.x it will  
automatically

start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/lis

Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

2008-07-30 Thread Ralph Castain
Just to be clear: you do not require a daemon on every node. You just  
need one daemon - sitting somewhere - that can act as the data server  
for MPI_Publish_name/MPI_Lookup_name. You then tell each app where to find it.


Normally, mpirun fills that function. But if you don't have it, you  
can kickoff a persistent orted (perhaps just have the parent  
application fork/exec it) for that purpose.



On Jul 30, 2008, at 9:50 AM, Robert Kubrick wrote:



On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:

I appreciate the suggestion about running a daemon on each of the  
remote nodes, but wouldn't I kind of be reinventing the wheel  
there? Process management is one of the things I'd like to be able  
to count on ORTE for.
Would the following work to give the parent process an intercomm  
with each child?


parent i.e. my non-mpirun-started process  calls  MPI_Init then  
MPI_Open_port
parent spawns mpirun command via system/exec to create the remote  
children . The name from MPI_Open_port is placed in the environment.

parent calls MPI_Comm_accept (once for each child?)


I think you have to create a separate thread to run the accept, in  
order to accept multiple client connections. This should be  
supported by OpenMPI as it was the original idea of the API design  
to handle multiple client connections. There is an MPI_Comm_accept  
multi-thread example in the book Using MPI-2.



all children call MPI_connect to the name

I think this would give one intercommunicator back to the parent  
for each remote process (not ideal, but I can worry about broadcast  
data later)
The remote processes can communicate to each other through  
MPI_COMM_WORLD.


You should be able to merge each child communicator from each accept  
thread into a global comm anyway.





Actually when I think through the details, much of this is pretty  
similar to the daemon MPI_Publish_name+MPI_Lookup_name approach.   
The main difference being which processes come first.


You can run a daemon through system/exec the same way you run  
mpiexec. Just use ssh or rsh on the system/exec call.







Mark Borgerding wrote:

I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )

I will try to find some sort of workaround. Any suggestions on how  
to "fake" the functionality of MPI_Comm_spawn are welcome.


To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no  
mpirun).

I want to start remote processes to crunch the data.
The shared object marshall the I/O between the framework and the  
remote processes.


-- Mark


Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if  
singleton comm_spawn is critical to your plans, I suggest moving  
to that version. You can get a pre-release version off of the www.open-mpi.org 
 web site.



On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

As your own tests have shown, it works fine if you just "mpirun - 
n 1 ./spawner". It is only singleton comm_spawn that appears to  
be having a problem in the latest 1.2 release. So I don't think  
comm_spawn is "useless". ;-)


I'm checking this morning to ensure that singletons properly  
spawns on other nodes in the 1.3 release. I sincerely doubt we  
will backport a fix to 1.2.



On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:

I keep checking my email in hopes that someone will come up  
with something that Matt or I might've missed.
I'm just having a hard time accepting that something so  
fundamental would be so broken.
The MPI_Comm_spawn command is essentially useless without the  
ability to spawn processes on other nodes.


If this is true, then my personal scorecard reads:
# Days spent using openmpi: 4 (off and on)
# identified bugs in openmpi :2
# useful programs built: 0

Please prove me wrong.  I'm eager to be shown my ignorance --  
to find out where I've been stupid and what documentation I  
should've read.



Matt Hughes wrote:

I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having:  the need to  
give
OMPI a hosts file!  It seems the singleton functionality is  
lacking
somehow... it won't allow you to spawn on arbitrary hosts.  I  
have not

tested if this is fixed in the 1.3 series.

Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2

mpiexec should start the first process on op2-1, and the spawn  
call
should start the second on op2-2.  If you don't use the Info  
object to
set the hostname specifically, then on 1.2.x it will  
automatically

start on op2-2.  With 1.3, the spawn call will start processes
starting with the first item in the host list.

mch


[snip]
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://

[OMPI users] Missing F90 modules

2008-07-30 Thread Scott Beardsley
I'm attempting to move to OpenMPI from another MPICH-derived 
implementation. I compiled openmpi 1.2.6 using the following configure:


./configure --build=x86_64-redhat-linux-gnu 
--host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu 
--program-prefix= --prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--exec-prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--bindir=/usr/mpi/pathscale/openmpi-1.2.6/bin 
--sbindir=/usr/mpi/pathscale/openmpi-1.2.6/sbin 
--sysconfdir=/usr/mpi/pathscale/openmpi-1.2.6/etc 
--datadir=/usr/mpi/pathscale/openmpi-1.2.6/share 
--includedir=/usr/mpi/pathscale/openmpi-1.2.6/include 
--libdir=/usr/mpi/pathscale/openmpi-1.2.6/lib64 
--libexecdir=/usr/mpi/pathscale/openmpi-1.2.6/libexec 
--localstatedir=/var 
--sharedstatedir=/usr/mpi/pathscale/openmpi-1.2.6/com 
--mandir=/usr/mpi/pathscale/openmpi-1.2.6/share/man 
--infodir=/usr/share/info --with-openib=/usr 
--with-openib-libdir=/usr/lib64 CC=pathcc CXX=pathCC F77=pathf90 
FC=pathf90 --with-psm-dir=/usr --enable-mpirun-prefix-by-default 
--with-mpi-f90-size=large


It looks like there is a single MPI.mod generated upon compilation and 
installation. Is this normal? I have a user complaining that MPI1.mod, 
MPI2.mod, and the f90base directory among others are missing (and thus 
the installation is incomplete). Are these modules provided by OpenMPI?


I see in the configure help that the f90 bindings are enabled by default 
so I didn't add the "--enable-mpi-f90" option.


Scot


Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Brock Palen

On all MPIs I have used there was always only MPI

use mpi;

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Jul 30, 2008, at 1:45 PM, Scott Beardsley wrote:
I'm attempting to move to OpenMPI from another MPICH-derived  
implementation. I compiled openmpi 1.2.6 using the following  
configure:


./configure --build=x86_64-redhat-linux-gnu 
--host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu 
--program-prefix= --prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--exec-prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--bindir=/usr/mpi/pathscale/openmpi-1.2.6/bin 
--sbindir=/usr/mpi/pathscale/openmpi-1.2.6/sbin 
--sysconfdir=/usr/mpi/pathscale/openmpi-1.2.6/etc 
--datadir=/usr/mpi/pathscale/openmpi-1.2.6/share 
--includedir=/usr/mpi/pathscale/openmpi-1.2.6/include 
--libdir=/usr/mpi/pathscale/openmpi-1.2.6/lib64 
--libexecdir=/usr/mpi/pathscale/openmpi-1.2.6/libexec 
--localstatedir=/var 
--sharedstatedir=/usr/mpi/pathscale/openmpi-1.2.6/com 
--mandir=/usr/mpi/pathscale/openmpi-1.2.6/share/man 
--infodir=/usr/share/info --with-openib=/usr 
--with-openib-libdir=/usr/lib64 CC=pathcc CXX=pathCC F77=pathf90 
FC=pathf90 --with-psm-dir=/usr --enable-mpirun-prefix-by-default 
--with-mpi-f90-size=large


It looks like there is a single MPI.mod generated upon compilation  
and installation. Is this normal? I have a user complaining that  
MPI1.mod, MPI2.mod, and the f90base directory among others are  
missing (and thus the installation is incomplete). Are these  
modules provided by OpenMPI?


I see in the configure help that the f90 bindings are enabled by  
default so I didn't add the "--enable-mpi-f90" option.


Scot
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Scott Beardsley

Brock Palen wrote:

On all MPI's I have always used there was only MPI

use mpi;


Please excuse my admittedly gross ignorance of all things Fortran but 
why does "include 'mpif.h'" work but "use mpi" does not? When I try the 
"use mpi" method I get errors like:


$ mpif90 -c cart.f

  call mpi_cart_get(   igcomm,2,ivdimx,lvperx, mygrid,   ierr)
   ^
pathf95-389 pathf90: ERROR CART, File = cart.f, Line = 34, Column = 12
  No specific match can be found for the generic subprogram call 
"MPI_CART_GET"


$ mpif90 -c cartfoo.f
$ diff cart.f cartfoo.f
3,4c3,4
< C include 'mpif.h'
<   use mpi;
---
>   include 'mpif.h'
> C  use mpi;
$

From the googling I've done it seems like "use mpi" is preferred[1]. 
I've made sure that my $LD_LIBRARY_PATH has the directory that MPI.mod 
is in.


Scott

[1] http://www.mpi-forum.org/docs/mpi-20-html/node243.htm


Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Jeff Squyres
This is correct; Open MPI only generates MPI.mod so that you can "use  
mpi" in your Fortran app.


I'm not sure what MPI1.mod, MPI2.mod and f90base are -- perhaps  
those are somehow specific artifacts of the other MPI implementation,  
and/or artifacts of the Fortran compiler...?



On Jul 30, 2008, at 1:45 PM, Scott Beardsley wrote:

I'm attempting to move to OpenMPI from another MPICH-derived  
implementation. I compiled openmpi 1.2.6 using the following  
configure:


./configure --build=x86_64-redhat-linux-gnu 
--host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu 
--program-prefix= --prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--exec-prefix=/usr/mpi/pathscale/openmpi-1.2.6 
--bindir=/usr/mpi/pathscale/openmpi-1.2.6/bin 
--sbindir=/usr/mpi/pathscale/openmpi-1.2.6/sbin 
--sysconfdir=/usr/mpi/pathscale/openmpi-1.2.6/etc 
--datadir=/usr/mpi/pathscale/openmpi-1.2.6/share 
--includedir=/usr/mpi/pathscale/openmpi-1.2.6/include 
--libdir=/usr/mpi/pathscale/openmpi-1.2.6/lib64 
--libexecdir=/usr/mpi/pathscale/openmpi-1.2.6/libexec 
--localstatedir=/var 
--sharedstatedir=/usr/mpi/pathscale/openmpi-1.2.6/com 
--mandir=/usr/mpi/pathscale/openmpi-1.2.6/share/man 
--infodir=/usr/share/info --with-openib=/usr 
--with-openib-libdir=/usr/lib64 CC=pathcc CXX=pathCC F77=pathf90 
FC=pathf90 --with-psm-dir=/usr --enable-mpirun-prefix-by-default 
--with-mpi-f90-size=large


It looks like there is a single MPI.mod generated upon compilation  
and installation. Is this normal? I have a user complaining that  
MPI1.mod, MPI2.mod, and the f90base directory among others are  
missing (and thus the installation is incomplete). Are these modules  
provided by OpenMPI?


I see in the configure help that the f90 bindings are enabled by  
default so I didn't add the "--enable-mpi-f90" option.


Scot
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Brock Palen
I have seen strange things with Fortran compilers and file suffixes.


use mpi

is a Fortran 90 thing, not 77; many compilers want Fortran 90 code  
to end in .f90 or .F90.


Try renaming cartfoo.f to cartfoo.f90 and try again.

I have attached a helloworld.f90 that uses use mpi that works on our  
openmpi installs.




helloworld.f90
Description: Binary data



Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Jul 30, 2008, at 4:15 PM, Scott Beardsley wrote:

Brock Palen wrote:

On all MPI's I have always used there was only MPI
use mpi;


Please excuse my admittedly gross ignorance of all things Fortran  
but why does "include 'mpif.h'" work but "use mpi" does not? When I  
try the "use mpi" method I get errors like:


$ mpif90 -c cart.f

  call mpi_cart_get(   igcomm,2,ivdimx,lvperx, mygrid,   ierr)
   ^
pathf95-389 pathf90: ERROR CART, File = cart.f, Line = 34, Column = 12
  No specific match can be found for the generic subprogram call  
"MPI_CART_GET"


$ mpif90 -c cartfoo.f
$ diff cart.f cartfoo.f
3,4c3,4
< C include 'mpif.h'
<   use mpi;
---
>   include 'mpif.h'
> C  use mpi;
$

From the googling I've done it seems like "use mpi" is preferred 
[1]. I've made sure that my $LD_LIBRARY_PATH has the directory that  
MPI.mod is in.


Scott

[1] http://www.mpi-forum.org/docs/mpi-20-html/node243.htm
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Joe Griffin
Scott,

include brings in a file

use brings in a module .. kind of like an object file.

Joe


> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
On
> Behalf Of Scott Beardsley
> Sent: Wednesday, July 30, 2008 1:16 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Missing F90 modules
> 
> Brock Palen wrote:
> > On all MPI's I have always used there was only MPI
> >
> > use mpi;
> 
> Please excuse my admittedly gross ignorance of all things Fortran but
> why does "include 'mpif.h'" work but "use mpi" does not? When I try
the
> "use mpi" method I get errors like:
> 
> $ mpif90 -c cart.f
> 
>call mpi_cart_get(   igcomm,2,ivdimx,lvperx, mygrid,
ierr)
> ^
> pathf95-389 pathf90: ERROR CART, File = cart.f, Line = 34, Column = 12
>No specific match can be found for the generic subprogram call
> "MPI_CART_GET"
> 
> $ mpif90 -c cartfoo.f
> $ diff cart.f cartfoo.f
> 3,4c3,4
> < C include 'mpif.h'
> <   use mpi;
> ---
>  >   include 'mpif.h'
>  > C  use mpi;
> $
> 
>  From the googling I've done it seems like "use mpi" is preferred[1].
> I've made sure that my $LD_LIBRARY_PATH has the directory that MPI.mod
> is in.
> 
> Scott
> 
> [1] http://www.mpi-forum.org/docs/mpi-20-html/node243.htm
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Edmund Sumbar
On Wed, Jul 30, 2008 at 01:15:54PM -0700, Scott Beardsley wrote:
> Brock Palen wrote:
> > On all MPI's I have always used there was only MPI
> > 
> > use mpi;
> 
> Please excuse my admittedly gross ignorance of all things Fortran but 
> why does "include 'mpif.h'" work but "use mpi" does not? When I try the 
> "use mpi" method I get errors like:
> 
> $ mpif90 -c cart.f
> 
>call mpi_cart_get(   igcomm,2,ivdimx,lvperx, mygrid,   ierr)
> ^
> pathf95-389 pathf90: ERROR CART, File = cart.f, Line = 34, Column = 12
>No specific match can be found for the generic subprogram call 
> "MPI_CART_GET"
> 
> $ mpif90 -c cartfoo.f
> $ diff cart.f cartfoo.f
> 3,4c3,4
> < C include 'mpif.h'
> <   use mpi;
> ---
>  >   include 'mpif.h'
>  > C  use mpi;
> $
> 
>  From the googling I've done it seems like "use mpi" is preferred[1]. 
> I've made sure that my $LD_LIBRARY_PATH has the directory that MPI.mod 
> is in.

Try adding the path to MPI.mod to the include path (e.g.,
-I/usr/local/openmpi/mod).

-- 
Ed[mund [Sumbar]]
AICT Research Support Group
esum...@ualberta.ca
780.492.9360


Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Jeff Squyres
"use mpi" basically gives you stronger type checking in Fortran 90  
that you don't get with Fortran 77.  So the error you're seeing is  
basically a compiler error telling you that you have the wrong types  
for MPI_CART_GET and that it doesn't match any of the functions  
provided by Open MPI.


FWIW, the official declaration of the Fortran binding for MPI_CART_GET  
is:


MPI_CART_GET(COMM, MAXDIMS, DIMS, PERIODS, COORDS, IERROR)
INTEGER COMM, MAXDIMS, DIMS(*), COORDS(*), IERROR
LOGICAL PERIODS(*)

The real problem is that it looks like we have a bug in our F90  
bindings.  :-(  We have the "periods" argument typed as an integer  
array, when it really should be a logical array.  Doh!


The patch below fixes the problem in the v1.2 series; I'll get it  
included in v1.2.7 and the upcoming v1.3 series.


Index: ompi/mpi/f90/scripts/mpi-f90-interfaces.h.sh
===
--- ompi/mpi/f90/scripts/mpi-f90-interfaces.h.sh(revision 19099)
+++ ompi/mpi/f90/scripts/mpi-f90-interfaces.h.sh(working copy)
@@ -1120,7 +1120,7 @@
   integer, intent(in) :: comm
   integer, intent(in) :: maxdims
   integer, dimension(*), intent(out) :: dims
-  integer, dimension(*), intent(out) :: periods
+  logical, dimension(*), intent(out) :: periods
   integer, dimension(*), intent(out) :: coords
   integer, intent(out) :: ierr
 end subroutine ${procedure}





On Jul 30, 2008, at 4:15 PM, Scott Beardsley wrote:


Brock Palen wrote:

On all MPI's I have always used there was only MPI
use mpi;


Please excuse my admittedly gross ignorance of all things Fortran  
but why does "include 'mpif.h'" work but "use mpi" does not? When I  
try the "use mpi" method I get errors like:


$ mpif90 -c cart.f

 call mpi_cart_get(   igcomm,2,ivdimx,lvperx, mygrid,   ierr)
  ^
pathf95-389 pathf90: ERROR CART, File = cart.f, Line = 34, Column = 12
 No specific match can be found for the generic subprogram call  
"MPI_CART_GET"


$ mpif90 -c cartfoo.f
$ diff cart.f cartfoo.f
3,4c3,4
< C include 'mpif.h'
<   use mpi;
---
>   include 'mpif.h'
> C  use mpi;
$

From the googling I've done it seems like "use mpi" is preferred[1].  
I've made sure that my $LD_LIBRARY_PATH has the directory that  
MPI.mod is in.


Scott

[1] http://www.mpi-forum.org/docs/mpi-20-html/node243.htm
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Missing F90 modules

2008-07-30 Thread Scott Beardsley


The real problem is that it looks like we have a bug in our F90 
bindings.  :-(  We have the "periods" argument typed as an integer 
array, when it really should be a logical array.  Doh!


Ahhh ha! I checked the manpage vs the user's code but I didn't check the 
OpenMPI code. I can confirm that the patch you sent fixes the problem 
for me (v1.2.6). Thanks all!


Scott