Re: [OMPI users] OpenMPI Documentation?
Hi Richard,

On 9/12/07, Richard Friedman wrote:
> Amit:
> Well, as far as I know a documentation community within OpenMPI has not yet
> been formed, but maybe it is time to send out a general call to the OpenMPI
> members to see about creating one.
> I'm new to the OpenMPI community myself, so I'm not yet sure how this can
> be done. But we can find out.
> Thanks for the interest.

Well, someone has to take the initiative, and it would be ideal to have an experienced Open MPI programmer take the lead role, with members like me as contributors.

Regards,
Amit
--
Amit Kumar Saha
[URL]:http://amitsaha.in.googlepages.com
Re: [OMPI users] OpenMPI Documentation?
I would be very happy to help set up a documentation community -- goodness knows we need more/better documentation for Open MPI!

Who else would be interested?

On Sep 13, 2007, at 5:13 AM, Amit Kumar Saha wrote:
> Well, someone has to take the initiative, and it would be ideal to have
> an experienced Open MPI programmer take the lead role and members like
> me can be contributors.

--
Jeff Squyres
Cisco Systems
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] OpenMPI Documentation?
Hi,

I would like to contribute something as well. I have about half a year of experience with OpenMPI, and I used LAM MPI for a little more than half a year before that.

Jody
[OMPI users] connect failed with errno=111
Greetings,

I am using OpenMPI v1.2.3 via SGE on a network of amd64 workstations. When mpirun tries to start the processes on certain nodes I get the following error output.

  [sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
  [sr71][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111

Using perl -e 'die$!=111' I see that the error message is "Connection refused". I am able to connect to both nodes in question via ssh and/or rsh. I changed btl_base_debug to 2, but that did not provide additional information.

What are some possible issues that might be causing this? What can I do to get more information?

Thanks,
~Tim
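The errno lookup done with the perl one-liner above can also be done with Python's standard library. The following is only a minimal sketch: `describe_errno` is a hypothetical helper name, and the mapping of 111 to ECONNREFUSED is assumed to hold on Linux hosts such as the ones described (other platforms number errno values differently).

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    # errno.errorcode maps numbers to names such as 'ECONNREFUSED';
    # numbers the platform does not define fall back to "UNKNOWN".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # Use the symbolic constant instead of hard-coding 111, because the
    # numeric value is platform-specific.
    code = errno.ECONNREFUSED
    name, message = describe_errno(code)
    print(code, name, message)
```

Going through the symbolic constant keeps the lookup portable, at the cost of not decoding a raw number copied from a log on a different platform.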
Re: [OMPI users] connect failed with errno=111
Hi Tim,

You could try setting -mca pls_gridengine_verbose 1 to show whether SGE is able to start the ORTE daemons on the remote nodes successfully.

It seems you are having a problem previously reported by another user. You may want to follow this thread and check your ifconfig settings to see if anything looks suspicious:

http://www.open-mpi.org/community/lists/users/2007/02/2669.php

My 2 cents...

Tim Campbell wrote:
> What are some possible issues that might be causing this? What can I
> do to get more information?

--
- Pak Lui
pak@sun.com
Re: [OMPI users] connect failed with errno=111
Thanks. I think I figured out the problem. I found that in my .ssh/known_hosts there were several "bad" keys associated with some of the machines in the gridengine pool. My hypothesis is that when mpirun was establishing the connection topology of the processes, there were some process pairs that failed to complete the connection due to the bad ssh keys. I don't have explicit evidence for this since there was no ssh error output generated.

I generated new keys for all the amd64 machines in the gridengine pool for which there was an offending key. Now my job runs with a set of machines that includes ones that had previously failed. I will assume for now that the problem is fixed.

~Tim

On Sep 13, 2007, at 12:06 PM, Pak Lui wrote:
> You could try setting -mca pls_gridengine_verbose 1 to show whether
> SGE is able to start the ORTE daemons on the remote nodes
> successfully.
Re: [OMPI users] connect failed with errno=111
On Thu, Sep 13, 2007 at 11:15:47AM -0500, Tim Campbell wrote:
> workstations. When mpirun tries to start the processes on certain
> nodes I get the following error output.
>
> [sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
> [sr71][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
>
> Using perl -e 'die$!=111' I see that the error message is "Connection
> refused". I am able to connect to both nodes in question via ssh and/

This sounds pretty much like an IP setup issue. Perhaps some nodes have more than one interface, i.e. internal and external network, IP-over-FireWire, ppp-Devices or something else. Exporting these addresses would clearly cause other nodes to be unable to connect. If so, use btl_tcp_if_exclude (or _include) to specify the right interface.

Second problem: local firewalls. Though ssh connections might be allowed, the sysadmin could block almost any other (destination) port, thus causing the same error messages (in case of icmp-port-unreachable).

> What are some possible issues that might be causing this? What can I
> do to get more information?

I agree that you surely need more information. Can you recompile with --enable-debug and change

  #define WANT_PEER_DUMP 0

in file ompi/mca/btl/tcp/btl_tcp_endpoint.c from "0" to "1" before recompiling? This should give you detailed information.

HTH

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
private: http://adi.thur.de
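One way to check the firewall hypothesis above without rebuilding Open MPI is to try a plain TCP connection from one node to another and see how it fails. The sketch below is an illustration only: `probe_tcp` is a hypothetical helper, it assumes IPv4, and the port to test has to be supplied by the reader (working ssh access only shows that the ssh port is reachable).

```python
import errno
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error:<NAME>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.timeout:
        # No answer at all: packets are probably being dropped silently.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection
        # (closed port, or a firewall that actively rejects).
        return "refused"
    except OSError as exc:
        # Anything else, e.g. EHOSTUNREACH or a name-resolution failure.
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
```

Calling, for example, `probe_tcp("sr70", 22)` and then the same host with a high port where a test listener is running would show whether only selected ports get through. A "refused" result for a port that does have a listener points towards a rejecting firewall or towards the listener being bound to a different interface.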
Re: [OMPI users] OpenMPI Documentation?
Jeff,

I would also be interested. I am getting questions from my customers about the location of documentation.

Thanks,
Pat

From: Jeff Squyres
Sent by: users-bounces@open-mpi.org
To: Open MPI Users
cc: rc...@sun.com
Date: 09/13/07 10:33 AM
Subject: Re: [OMPI users] OpenMPI Documentation?
Please respond to: Open MPI Users

> I would be very happy to help setup a documentation community --
> goodness knows we need more/better documentation for Open MPI!
>
> Who else would be interested?
Re: [OMPI users] OpenMPI Documentation?
So there are at least a few people who are interested in this effort (keep chiming in if you are interested so that we can get a tally of who would like to be involved).

What kind of resources / organization would be useful for this group? Indiana University graciously hosts all of Open MPI's electronic resources (Subversion, web site, bug tracking, DNS, mailing lists, ...) and I certainly can't speak for them, but if we ask nicely, I'd be willing to bet that they would add some hosting services for a documentation project (if such additional resources would be helpful, of course).

I would also be happy to host a teleconference if talking about all this start/admin stuff for an hour would save 1-2 weeks worth of detailed e-mails.

The only current documentation we have is:

- the web FAQ
- the README in the tarball

What is conspicuously missing is a nice PDF and/or HTML tarball with comprehensive documentation. But I think that FAQ/README also fit into the general category of documentation, so it might make sense to put all 3 of these items under the control of one group. The obvious rationale here is that all three could stay in tighter sync if there's one group monitoring all 3.

One point worth mentioning: Open MPI is all about community consensus, but "s/he who implements usually wins". :-) So if we get an active group working on documentation, the FAQ could be totally redone if the group so decides (for example).

All this being said, the OMPI developers *have* talked about documentation a bit over time. Here are some of the points from prior discussions, in no particular order:

- It is highly desirable to have documentation that can be output in multiple different forms (PDF, HTML, ...whatever). If possible, the docs should be shipped in distribution tarballs and hosted on the OMPI web site.

- LAM/MPI had two great docs: one for installing LAM/MPI and one for using LAM/MPI. These might be good example documents for what Open MPI might want to do (see http://www.lam-mpi.org/using/docs/), regardless of the back-end technology used to generate the docs. The LaTeX source for these guides is available if it would be helpful (I wrote most of them).

- It would be most helpful if the documentation is written in a tool that has free editors, preferably cross-platform and available in multiple POSIX-like environments (Solaris, Linux, OS X). MS Office was explicitly rejected because of its requirement for Windows/OS X (other Office clones were not really discussed). LaTeX was discussed but wasn't favored due to the steep learning curve and general lack of experience with it outside of academia.

- First documentation should be aimed towards users. Developer documentation might follow.

- Once upon a time, we developers started to use doxygen for documentation, but it has proven to be lousy for book-like entities (IONSHO). Doxygen is decent for code documentation, but not documents.

- A few recent discussions about documentation came to the conclusion that Docbook (www.docbook.org) looked promising, but we didn't get deep into details / investigating the feasibility. One obvious Big Project using Docbook is Subversion (see http://svnbook.red-bean.com/). Docbook-produced HTML and PDF seem to look both pretty and functional.

- It would also be nice if sub-distributions of Open MPI could take the documentation and -- in some defined automated fashion -- be able to do the following:
    - insert their own "chapters" or "sections" that are specific to that sub-distribution (e.g., Sun ClusterTools have some Solaris-specific stuff, OFED have some OpenFabrics-specific stuff, etc.)
    - remove/"turn off" specific sections of documentation (e.g., OFED would likely not include any documentation about Myricom networks [and vice versa])
  This would go a long way towards being able to keep the community documentation in sync with docs included in targeted/vendor OMPI releases.

- The OMPI web site is almost entirely written in PHP and is mirrored around the world. It would be *strongly* preferred if the web-site hosting of the docs is fully mirror-able (because presumably docs are one of the things that users would want to browse the most). Hence, requiring a new kind of server other than HTML/PHP would require very, very strong rationale. :-)

- The technology of choice for displaying on the web site is PHP. But that still leaves open a wide variety of choices for serving docs via the web site, including (but not limited to):
    - just posting PDFs (although having HTML-based docs would certainly be nice)
    - a PHP-based package or home-grown PHP
    - generating HTML offline (via cron or whatever) and putting the results in the web site
    - ...etc.

On Sep 13, 2007, at 1:31 PM, pat.o'bry...@exxonmobil.com wrote:
> Jeff, I would also be interested. I a
Re: [OMPI users] OpenMPI Documentation?
Jeff,

Count us in at the UofA. My initial impressions of Open MPI are very good and I would be open to contributing to this effort as time allows.

Thanks!

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575 - 4590
http://hpc.uark.edu

"A supercomputer is a device for turning compute-bound problems into I/O-bound problems." -Seymour Cray

Jeff Squyres wrote:
> So there are at least a few people who are interested in this effort
> (keep chiming in if you are interested so that we can get a tally of
> who would like to be involved).
Re: [OMPI users] OpenMPI Documentation?
As more people start chiming in wanting to help with OpenMPI documentation (a good thing!), maybe we should think about starting a forum or a separate email list just for this discussion, at least initially, to get the ball rolling.

Do we have the capability of creating a new mail list at open-mpi.org? There are other alternatives, like Yahoo groups and such.
Re: [OMPI users] OpenMPI Documentation?
On Sep 13, 2007, at 4:17 PM, richard.fried...@sun.com wrote:
> As more people start chiming in wanting to help with OpenMPI
> documentation (a good thing!), maybe we should think about starting
> forum or separate email list just for this discussion. At least,
> initially to get the ball rolling.
>
> Do we have the capability of creating a new mail list at open-mpi.org?

Yes, we can create whatever we want/need. I asked Indiana University offline and they are quite amenable to adding hosting for whatever a docs sub-group would want/need (see http://www.open-mpi.org/community/lists/users/2007/09/4002.php).

What name/address do we want? d...@open-mpi.org? (or suggest an alternative)

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] OpenMPI Documentation?
Jeff Squyres wrote:
> What name/address do we want? d...@open-mpi.org? (or suggest an
> alternative)

Sounds right to me. Only alternative might be docs_t...@open-mpi.org
[OMPI users] Two different compilation of openmpi
Is it possible to have two different compilations of openmpi on the same machine (dual-opterons, Debian Linux etch)?

On that parallel computer, sander.MPI (Amber9) and openmpi 1.2.3 have both been compiled with Intel Fortran 9.1.036. Now I wish to install DOCK6 on this machine, and I am advised that it is better compiled with GNU compilers. As for openmpi, I could install the Debian package, which is GNU compiled. Are conflicts between the two installations foreseeable? Although I don't have experience with DOCK, I suspect that certain procedures in DOCK call sander.MPI into play.

I rule out the alternative of compiling Amber9 with GNU compilers, since it would run slower.

Thanks
francesco pietra
Re: [OMPI users] Two different compilation of openmpi
Francesco,

We use modules (http://modules.sourceforge.net/) to manage 14 different OpenMPI versions on the same cluster, along with their associated applications. This is a nice way to establish dependencies between apps and libs and keep things organized.

Good luck.

--andy

$ module avail openmpi
/home/software/rhel4/Modules/3.2.1/modulefiles
openmpi/1.0.2-gcc           openmpi/1.1.0-pgi616   openmpi/1.1a9-pgi
openmpi/1.0.2-nag           openmpi/1.1.2-intel    openmpi/1.2-pgi
openmpi/1.0.2-pgi(default)  openmpi/1.1.2-pgi      openmpi/1.2.3-gcc
openmpi/1.0.3a1-pgi         openmpi/1.1.4-pgi62    openmpi/1.2.3-pgi
openmpi/1.1.0-pgi           openmpi/1.1a8-nag

On Thu, 13 Sep 2007, Francesco Pietra wrote:
> Is it possible to have two different compilations of openmpi on the
> same machine (dual-opterons, Debian Linux etch)?
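What a module file for one of these builds typically does is put that build's directories at the front of the search paths, so that the matching mpirun and libraries are found first. The sketch below illustrates that idea only; it is not how the modules package is implemented. `env_for_prefix` is a hypothetical helper, the install prefixes shown are made up, and it assumes each Open MPI build was installed under its own prefix with the usual bin/ and lib/ subdirectories on a POSIX system.

```python
import os


def env_for_prefix(prefix, base_env=None):
    """Return a copy of base_env with prefix/bin and prefix/lib searched first."""
    env = dict(os.environ if base_env is None else base_env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        entry = os.path.join(prefix, subdir)
        current = env.get(var, "")
        # Prepend so this build wins over any other Open MPI on the path.
        env[var] = entry + os.pathsep + current if current else entry
    return env


# Hypothetical prefixes: one Intel-compiled build, one GNU-compiled build.
intel_env = env_for_prefix("/opt/openmpi/1.2.3-intel", {"PATH": "/usr/bin"})
gnu_env = env_for_prefix("/opt/openmpi/1.2.3-gcc", {"PATH": "/usr/bin"})
```

Each resulting dictionary could then be passed as the `env` argument of `subprocess.run` when launching the application built against that particular Open MPI, which keeps the two installations from shadowing each other.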