Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread Amit Kumar Saha
Hi Richard,

On 9/12/07, Richard Friedman  wrote:
>
>  Amit:
>  Well, as far as I know a documentation community within OpenMPI has not yet
> been formed, but maybe it is time to send out a general call to the OpenMPI
> members to see about creating one.
>  I'm new to the OpenMPI community myself, so I'm not yet sure how this can
> be done. But we can find out.
>  Thanks for the interest.

Well, someone has to take the initiative, and it would be ideal to have
an experienced Open MPI programmer take the lead role, with members like
me as contributors.


Regards,
Amit
-- 
Amit Kumar Saha
[URL]:http://amitsaha.in.googlepages.com


Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread Jeff Squyres
I would be very happy to help set up a documentation community --
goodness knows we need more/better documentation for Open MPI!


Who else would be interested?


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread jody
Hi
I would like to contribute something as well.

I have about half a year of experience with OpenMPI,
and I used LAM/MPI for a bit more than half a year before that.

Jody


[OMPI users] connect failed with errno=111

2007-09-13 Thread Tim Campbell

Greetings,

I am using OpenMPI v1.2.3 via SGE on a network of amd64
workstations.  When mpirun tries to start the processes on certain
nodes I get the following error output.

[sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
[sr71][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111

Using perl -e 'die$!=111' I see that the error message is "Connection
refused".  I am able to connect to both nodes in question via ssh and/or
rsh.  I changed btl_base_debug to 2, but that did not provide
additional information.
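
The errno lookup done with the perl one-liner can also be sketched in Python. This is only an illustration: the numeric value 111 maps to ECONNREFUSED on Linux, but errno numbers differ between operating systems, so the helper looks up the symbolic name on the local platform instead of assuming it.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 111 is the value reported in the btl_tcp_endpoint.c output.
    name, message = describe_errno(111)
    print(f"errno=111 -> {name}: {message}")
```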


What are some possible issues that might be causing this?  What can I  
do to get more information?


Thanks,
~Tim




Re: [OMPI users] connect failed with errno=111

2007-09-13 Thread Pak Lui

Hi Tim,

You could try setting -mca pls_gridengine_verbose 1 to show whether SGE
is able to start the ORTE daemons on the remote nodes successfully.

It seems you are hitting a problem that another user asked about
previously. You may want to follow this thread and check your ifconfig
settings to see if anything looks suspicious:

http://www.open-mpi.org/community/lists/users/2007/02/2669.php
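
One thing that can be checked alongside the ifconfig output is whether a node's own hostname resolves to a loopback address. That this is the cause here is an assumption, not something the thread confirms. A minimal Python sketch of such a check, using only the standard library:

```python
import ipaddress
import socket


def is_loopback(address):
    """True if the textual IP address is a loopback address (127.0.0.0/8, ::1)."""
    return ipaddress.ip_address(address).is_loopback


def report():
    """Print the interface names and the address this host's name resolves to."""
    for _index, name in socket.if_nameindex():
        print("interface:", name)
    hostname = socket.gethostname()
    try:
        address = socket.gethostbyname(hostname)
    except OSError as exc:
        print(f"{hostname} does not resolve: {exc}")
        return
    note = " (loopback, not reachable from other nodes)" if is_loopback(address) else ""
    print(f"{hostname} resolves to {address}{note}")


if __name__ == "__main__":
    report()
```

Running it on each node that shows the error and comparing the output with a node that works may narrow things down.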

My 2 cents...




--

- Pak Lui
pak@sun.com


Re: [OMPI users] connect failed with errno=111

2007-09-13 Thread Tim Campbell

Thanks.

I think I figured out the problem.  I found that in my .ssh/known_hosts
there were several "bad" keys associated with some of the machines in
the gridengine pool.  My hypothesis is that when mpirun was establishing
the connection topology of the processes, some process pairs failed to
complete the connection due to the bad ssh keys.  I don't have explicit
evidence for this since no ssh error output was generated.


I generated new keys for all the amd64 machines in the gridengine  
pool for which there was an offending key.  Now my job runs with a  
set of machines that includes ones that had previously failed.  I  
will assume for now that the problem is fixed.
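
A minimal sketch of how entries for particular hosts could be located in a known_hosts file, assuming unhashed entries. The host names sr70 and sr71 come from the error output; the addresses and key data in the sample are made up. Once an offending host is identified, `ssh-keygen -R <host>` removes its entries.

```python
def entries_for(known_hosts_text, hosts):
    """Map each requested host to the line numbers of its known_hosts entries.

    Assumes unhashed entries of the form "host1,host2 keytype base64key".
    """
    wanted = set(hosts)
    found = {host: [] for host in hosts}
    for lineno, line in enumerate(known_hosts_text.splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        if fields[0].startswith("@"):
            fields = fields[1:]  # skip a marker such as @revoked
        if not fields:
            continue
        for name in fields[0].split(","):
            if name in wanted:
                found[name].append(lineno)
    return found


if __name__ == "__main__":
    # Made-up sample content; real files live in ~/.ssh/known_hosts.
    sample = (
        "sr70,10.0.0.70 ssh-rsa AAAAexample1\n"
        "# a comment\n"
        "sr71 ssh-rsa AAAAexample2\n"
        "sr70 ssh-rsa AAAAexample3\n"
    )
    print(entries_for(sample, ["sr70", "sr71"]))
```

A host that appears on more than one line is a candidate for a stale key.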


~Tim







Re: [OMPI users] connect failed with errno=111

2007-09-13 Thread Adrian Knoth
On Thu, Sep 13, 2007 at 11:15:47AM -0500, Tim Campbell wrote:

> workstations.  When mpirun tries to start the processes on certain  
> nodes I get the following error output.
> 
> [sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
> [sr71][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
> 
> Using perl -e 'die$!=111' I see that the error message is "Connection  
> refused".  I am able to connect to both nodes in question via ssh and/ 

This sounds pretty much like an IP setup issue. Perhaps some nodes have
more than one interface, i.e. internal and external network,
IP-over-FireWire, ppp-Devices or something else. Exporting these
addresses would clearly cause other nodes to be unable to connect.

If so, use btl_tcp_if_exclude (or _include) to specify the right
interface.
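
As a sketch of what this looks like on the command line, the following Python helper builds an mpirun invocation that sets one of the two MCA parameters. The interface name eth0, the process count, and ./my_app are placeholders, and the -mca spelling follows the usage elsewhere in this thread. The two parameters are treated as alternatives here, so the helper accepts only one of them.

```python
import shlex


def mpirun_command(program, nprocs, include=None, exclude=None):
    """Build an mpirun argument list restricting the TCP BTL's interfaces."""
    if (include is None) == (exclude is None):
        raise ValueError("give exactly one of include or exclude")
    if include is not None:
        param, interfaces = "btl_tcp_if_include", include
    else:
        param, interfaces = "btl_tcp_if_exclude", exclude
    return ["mpirun", "-mca", param, ",".join(interfaces),
            "-np", str(nprocs), program]


if __name__ == "__main__":
    # eth0 and ./my_app are placeholders for the cluster-facing interface
    # and the MPI program.
    args = mpirun_command("./my_app", 4, include=["eth0"])
    print(" ".join(shlex.quote(arg) for arg in args))
```

When using the exclude form, it is probably wise to keep the loopback interface in the list as well, since an explicit value replaces whatever default was in effect.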

Second problem: local firewalls. Although ssh connections might be
allowed, the sysadmin could be blocking almost any other (destination)
port, which would cause the same error messages (if the firewall rejects
with icmp-port-unreachable).
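
A minimal sketch of how to tell "ssh works" apart from "other ports work", assuming Python is available on the nodes. The host name sr70 is taken from the error output; port 5000 is a placeholder, since the port the failing MPI process listens on is not given in the thread. Note that connect_ex() reports connection errors as a return value but can still raise if the host name does not resolve.

```python
import errno
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connect and return 0 on success or the errno on failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port))


if __name__ == "__main__":
    # Port 22 is ssh; 5000 stands in for the port that was refused.
    for port in (22, 5000):
        code = probe("sr70", port)
        print(port, "open" if code == 0 else errno.errorcode.get(code, code))
```

A refused connection only shows that nothing accepted it on that port: either no process is listening or a firewall is rejecting the packets.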

> What are some possible issues that might be causing this?  What can I  
> do to get more information?

I agree that you surely need more information. Can you recompile with
--enable-debug and change 

#define WANT_PEER_DUMP 0

in file ompi/mca/btl/tcp/btl_tcp_endpoint.c from "0" to "1" before
recompiling?

This should give you detailed information.


HTH

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread pat . o'bryant
Jeff,
 I would also be interested. I am getting questions from my customers
about the location of documentation.
 Thanks,
  Pat









Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread Jeff Squyres
So there are at least a few people who are interested in this effort  
(keep chiming in if you are interested so that we can get a tally of  
who would like to be involved).


What kind of resources / organization would be useful for this  
group?  Indiana University graciously hosts all of Open MPI's  
electronic resources (Subversion, web site, bug tracking, DNS,  
mailing lists, ...) and I certainly can't speak for them, but if we  
ask nicely, I'd be willing to bet that they would add some hosting  
services for a documentation project (if such additional resources  
would be helpful, of course).


I would also be happy to host a teleconference if talking about all
this start/admin stuff for an hour would save 1-2 weeks' worth of
detailed e-mails.


-

The only current documentation we have is:

- the web FAQ
- the README in the tarball

What is conspicuously missing is a nice PDF and/or HTML tarball with  
comprehensive documentation.  But I think that FAQ/README also fit  
into the general category of documentation, so it might make sense to  
put all 3 of these items under the control of one group.  The obvious  
rationale here is that all three could stay in tighter sync if  
there's one group monitoring all 3.


One point worth mentioning: Open MPI is all about community
consensus, but "s/he who implements usually wins".  :-)  So if we get
an active group working on documentation, the FAQ could be totally
redone if the group so decides (for example).


All this being said, the OMPI developers *have* talked about
documentation a bit over time.  Here are some of the points from prior
discussions, in no particular order:


- It is highly desirable to have documentation that can be output in
multiple different forms (PDF, HTML, ...whatever).  If possible, the
docs should be shipped in distribution tarballs and hosted on the
OMPI web site.


- LAM/MPI had two great docs: one for installing LAM/MPI and one
for using LAM/MPI.  These might be good example documents for what
Open MPI might want to do (see http://www.lam-mpi.org/using/docs/),
regardless of the back-end technology used to generate the docs.
Source LaTeX for these guides is available if it would be helpful (I
wrote most of them).


- It would be most helpful if the documentation is written in a tool  
that has free editors, preferably cross-platform and available in  
multiple POSIX-like environments (Solaris, Linux, OS X).  MS Office  
was explicitly rejected because of its requirement for Windows/OS X  
(other Office clones were not really discussed).  LaTeX was discussed  
but wasn't favored due to the steep learning curve and general lack  
of experience with it outside of academia.


- First documentation should be aimed towards users.  Developer  
documentation might follow.


- Once upon a time, we developers started to use doxygen for  
documentation, but it has proven to be lousy for book-like entities  
(IONSHO).  Doxygen is decent for code documentation, but not documents.


- A few recent discussions about documentation came to the conclusion
that Docbook (www.docbook.org) looked promising, but we didn't get
deep into details / investigating the feasibility.  One obvious Big
Project using Docbook is Subversion (see http://svnbook.red-bean.com/).
Docbook-produced HTML and PDF seem to look both pretty and functional.


- It would also be nice if sub-distributions of Open MPI could take
the documentation and -- in some defined automated fashion -- be able
to do the following:
  - insert their own "chapters" or "sections" that are specific to
    that sub-distribution (e.g., Sun ClusterTools have some
    Solaris-specific stuff, OFED have some OpenFabrics-specific stuff,
    etc.)
  - remove/"turn off" specific sections of documentation (e.g.,
    OFED would likely not include any documentation about Myricom
    networks [and vice versa])
This would go a long ways towards being able to keep the community
documentation in sync with docs included in targeted/vendor OMPI
releases.
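
To make the "turn off sections" idea concrete, here is a minimal Python sketch that filters an XML document by a per-element `condition` attribute. DocBook does define a `condition` attribute for this kind of conditional text (it calls the process profiling) and its stylesheets have their own support for it; the sketch does not use those stylesheets, and the sample document and the variant names "ofed" and "myricom" are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample: one common chapter and two vendor-specific ones.
SAMPLE = """<book>
  <chapter><title>Running jobs</title></chapter>
  <chapter condition="ofed"><title>OpenFabrics notes</title></chapter>
  <chapter condition="myricom"><title>Myricom notes</title></chapter>
</book>"""


def filter_sections(xml_text, enabled):
    """Drop elements whose 'condition' attribute names a disabled variant."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            condition = child.get("condition")
            if condition is not None and condition not in enabled:
                parent.remove(child)
    return root


if __name__ == "__main__":
    book = filter_sections(SAMPLE, enabled={"ofed"})
    for title in book.iter("title"):
        print(title.text)
```

Elements without the attribute are always kept, so the community text stays shared and each sub-distribution only chooses which tagged sections to enable.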


- The OMPI web site is almost entirely written in PHP and is mirrored
around the world.  It would be *strongly* preferred if the web-site
hosting of the docs is fully mirror-able (because presumably docs are
one of the things that users would want to browse the most).  Hence,
requiring a new kind of server other than HTML/PHP would require
very, very strong rationale.  :-)


- The technology of choice for displaying on the web site is PHP.
But that still leaves open a wide variety of choices for serving docs
via the web site, including (but not limited to):
  - just posting PDFs (although having HTML-based docs would
    certainly be nice)
  - a PHP-based package or home-grown PHP
  - generating HTML offline (via cron or whatever) and putting the
    results in the web site
  - ...etc.





Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread Jeff Pummill

Jeff,

Count us in at the UofA. My initial impressions of Open MPI are very 
good and I would be open to contributing to this effort as time allows.


Thanks!

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575 - 4590
http://hpc.uark.edu

"A supercomputer is a device for turning compute-bound
problems into I/O-bound problems." -Seymour Cray



Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread richard.fried...@sun.com
As more people start chiming in wanting to help with OpenMPI
documentation (a good thing!), maybe we should think about starting
a forum or separate email list just for this discussion.

At least, initially to get the ball rolling.

Do we have the capability of creating a new mail list at open-mpi.org?

There are other alternatives, like Yahoo groups and such.




  

Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread Jeff Squyres

On Sep 13, 2007, at 4:17 PM, richard.fried...@sun.com wrote:

> As more people start chiming in wanting to help with OpenMPI
> documentation (a good thing!), maybe we should think about starting
> a forum or separate email list just for this discussion.
>
> At least, initially to get the ball rolling.
>
> Do we have the capability of creating a new mail list at open-mpi.org?


Yes, we can create whatever we want/need.  I asked Indiana University
offline and they are quite amenable to adding hosting for whatever a
docs sub-group would want/need (see
http://www.open-mpi.org/community/lists/users/2007/09/4002.php).


What name/address do we want?  d...@open-mpi.org?  (or suggest an  
alternative)


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] OpenMPI Documentation?

2007-09-13 Thread richard.fried...@sun.com



Jeff Squyres wrote:


> What name/address do we want?  d...@open-mpi.org?  (or suggest an
> alternative)



Sounds right to me. The only alternative might be docs_t...@open-mpi.org

[OMPI users] Two different compilation of openmpi

2007-09-13 Thread Francesco Pietra
Is it possible to have two different compilations of openmpi on the same
machine (dual-opterons, Debian Linux etch)?

On that parallel computer sander.MPI (Amber9) and openmpi 1.2.3 have both been
compiled with Intel Fortran 9.1.036.

Now I wish to install DOCK6 on this machine, and I am advised that it is
better compiled with the GNU compilers. As for openmpi, I could install the
Debian package, which is GNU compiled. Are conflicts between the two
installations foreseeable? Although I don't have experience with DOCK, I
suspect that certain procedures in DOCK call sander.MPI into play.

I rule out the alternative of compiling Amber9 with the GNU compilers, since
it would run slower.

Thanks

francesco pietra






Re: [OMPI users] Two different compilation of openmpi

2007-09-13 Thread Andrew J Caird

Francesco,

We use modules (http://modules.sourceforge.net/) to manage 14 different
OpenMPI versions on the same cluster, along with their associated
applications.  This is a nice way to establish dependencies between apps
and libs and keep things organized.


Good luck.
--andy

$ module avail openmpi
/home/software/rhel4/Modules/3.2.1/modulefiles

openmpi/1.0.2-gcc           openmpi/1.1.0-pgi616   openmpi/1.1a9-pgi
openmpi/1.0.2-nag           openmpi/1.1.2-intel    openmpi/1.2-pgi
openmpi/1.0.2-pgi(default)  openmpi/1.1.2-pgi      openmpi/1.2.3-gcc
openmpi/1.0.3a1-pgi         openmpi/1.1.4-pgi62    openmpi/1.2.3-pgi
openmpi/1.1.0-pgi           openmpi/1.1a8-nag
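
For readers without the modules package, the core of what such a module does can be sketched in a few lines of Python: put one installation's bin and lib directories at the front of the search paths. This assumes each build was installed under its own prefix; the two /opt prefixes below are hypothetical, not paths from this thread.

```python
import os


def env_for_install(prefix, base_env=None):
    """Return a copy of the environment with one Open MPI install in front."""
    env = dict(os.environ if base_env is None else base_env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        new = os.path.join(prefix, subdir)
        old = env.get(var)
        env[var] = new if not old else new + os.pathsep + old
    return env


if __name__ == "__main__":
    # Hypothetical prefixes: one build per compiler, installed side by side.
    intel = env_for_install("/opt/openmpi-1.2.3-intel")
    gnu = env_for_install("/opt/openmpi-1.2.3-gcc")
    print(intel["PATH"].split(os.pathsep)[0])
    print(gnu["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as the env argument of subprocess.run() so that each application is launched with the mpirun and libraries it was built against.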

