Re: [OMPI users] failure to launch MPMD program on win32 w 1.4.3
> -----Original Message-----
> From: Shiqing Fan [mailto:f...@hlrs.de]
> Sent: 30 November 2010 23:39
> To: Open MPI Users
> Cc: Hicham Mouline; Rainer Keller
> Subject: Re: [OMPI users] failure to launch MPMD program on win32 w 1.4.3
>
> Hi,
>
> I don't have Boost on my Windows machine, so I made a very similar
> program using only MPI, and everything works just fine for me:
>
> D:\work\OpenMPI\tests\CXX>more hello.cpp
>
> # include "mpi.h"
> # include <cstdio>   // needed for printf
>
> using namespace std;
>
> int main ( int argc, char *argv[] )
> {
>    int rank, size;
>
>    MPI::Init ( argc, argv );
>    size = MPI::COMM_WORLD.Get_size ( );
>    rank = MPI::COMM_WORLD.Get_rank ( );
>
>    printf("Process # %d \n", rank);
>
>    MPI::Finalize ( );
>    return 0;
> }
>
> D:\work\OpenMPI\tests\CXX>mpirun -np 3 hello.exe : -np 3 hello.exe
> Process # 2
> Process # 4
> Process # 0
> Process # 3
> Process # 5
> Process # 1
>
> Maybe it is something related to Boost?
>
> Regards,
> Shiqing

I've had this issue with "-np 3 : -np 3", but not with "-np 2 : -np 2" or "-np 1 : -np 4" or other combinations.

I've also rebuilt from VS2008 with the libs advapi32.lib Ws2_32.lib shlwapi.lib, as listed in the text file share\openmpi\mpic++.exe-wrapper-data.txt, and the problem seems to have stopped happening. So now it is working.

I assume I will be able to do this on several Windows boxes? Do they all need to be 32-bit or 64-bit, or can I mix?

regards,
[OMPI users] win: mpic++ -showme reports duplicate .libs
Hello,

>mpic++ -showme:link
/TP /EHsc /link /LIBPATH:"C:/Program Files (x86)/openmpi/lib" libmpi.lib libopen-pal.lib libopen-rte.lib libmpi_cxx.lib libmpi.lib libopen-pal.lib libopen-rte.lib advapi32.lib Ws2_32.lib shlwapi.lib

This reports the four MPI libs twice. I've followed the CMake instructions in README.windows. Is this intended, or have I gone wrong somewhere?

rds,
Re: [OMPI users] failure to launch MPMD program on win32 w 1.4.3
Hi Hicham,

> I've had this issue with -np 3 : -np 3, but not with -np 2 : -np 2 or
> -np 1 : -np 4 or other combinations.
>
> I've also rebuilt from VS2008 with the libs advapi32.lib Ws2_32.lib
> shlwapi.lib, as listed in the text file
> share\openmpi\mpic++.exe-wrapper-data.txt, and the problem seemed to
> stop happening. So now it is working.

Great! But I don't see the cause of the problem. If the linking libraries were missing, the compiler should already have complained at link time.

> I assume I will be able to do this on several windows boxes? Do they
> need to be all 32bit or 64bit or can I mix?

Yes, you can mix 32-bit and 64-bit, but you have to take care that the right executables are on each machine. For running on multiple Windows boxes, please refer to the Windows README file.

To simplify the WMI configuration process, you may also use the small tool I attached for configuring users (change the file extension to .exe):

Syntax: wmi-config [] ...

For example:
  wmi-config add LOCAL_COMPUTER\user
  wmi-config add DOMAIN1\user1 DOMAIN2\user2
  wmi-config del DOMAIN1\user1

Regards,
Shiqing

--
Shiqing Fan                  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
Center Stuttgart (HLRS)      Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart

wmi-config.ex_
Description: Binary data
Re: [OMPI users] win: mpic++ -showme reports duplicate .libs
Hi Hicham,

Thanks for noticing it. It's now been fixed on trunk.

Regards,
Shiqing

On 2010-12-1 10:02 AM, Hicham Mouline wrote:
> Hello,
>
> >mpic++ -showme:link
> /TP /EHsc /link /LIBPATH:"C:/Program Files (x86)/openmpi/lib" libmpi.lib
> libopen-pal.lib libopen-rte.lib libmpi_cxx.lib libmpi.lib libopen-pal.lib
> libopen-rte.lib advapi32.lib Ws2_32.lib shlwapi.lib
>
> reports using the 4 mpi libs twice.
>
> I've followed the cmake way in README.windows.
>
> Is this intended or have I wronged somewhere?
>
> rds,

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Shiqing Fan                  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
Center Stuttgart (HLRS)      Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart
Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application
Hi Kalin,

Which version of Open MPI did you use? It seems that the ess component couldn't be selected. Could you please send me the output of ompi_info?

Regards,
Shiqing

On 2010-11-30 12:32 AM, Kalin Kanov wrote:

Hi Shiqing,

I must have missed your response among all the e-mails that get sent to the mailing list. Here are a few more details about the issues that I am having.

My client/server programs seem to run sometimes, but then after a successful run I always seem to get the error that I included in my first post. The way I run the programs is by starting the server application first, which generates the port string, etc. I then proceed to run the client application with a new call to mpirun.

After getting the errors that I e-mailed about, I also tried to run ompi-clean, with the following results:

>ompi-clean
[Lazar:05984] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
..\..\orte\runtime\orte_init.c at line 125
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------

Any help with this issue will be greatly appreciated.

Thank you,
Kalin

On 27.10.2010 05:52, Shiqing Fan wrote:

Hi Kalin,

Sorry for the late reply. I checked the code and got confused (I'm not an MPI expert): I'm just wondering how to start the server and client in the same mpirun command, when the client needs a hand-typed port name that is only given by the server at runtime.

I found a similar program on the Internet (see attached) that works well on my Windows. In this program, the generated port name is sent among the processes by MPI_Send.
Regards,
Shiqing

On 2010-10-13 11:09 PM, Kalin Kanov wrote:

Hi there,

I am trying to create a client/server application with OpenMPI, which has been installed on a Windows machine by following the instructions (with CMake) in the README.WINDOWS file in the OpenMPI distribution (version 1.4.2). I have run other test applications that compile fine under the Visual Studio 2008 Command Prompt. However, I get the following errors on the server side when accepting a new client that is trying to connect:

[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
..\..\orte\mca\grpcomm\base\grpcomm_base_allgather.c at line 222
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
..\..\orte\mca\grpcomm\basic\grpcomm_basic_module.c at line 530
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file
..\..\ompi\mca\dpm\orte\dpm_orte.c at line 363
[Lazar:2716] *** An error occurred in MPI_Comm_accept
[Lazar:2716] *** on communicator MPI_COMM_WORLD
[Lazar:2716] *** MPI_ERR_INTERN: internal error
[Lazar:2716] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 476 on node Lazar
exiting without calling "finalize". This may have caused other
processes in the application to be terminated by signals sent by
mpirun (as reported here).
--------------------------------------------------------------------------

The server and client code is attached. I have struggled with this problem for quite a while, so please let me know what the issue might be. I have looked at the archives and the FAQ, and the only similar thing I have found had to do with different versions of OpenMPI installed, but I only have one version, and I believe it is the one being used.
Thank you,
Kalin

--
Shiqing Fan                  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
Center Stuttgart (HLRS)      Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart
Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?
On Mon, Nov 22, 2010 at 04:40:14PM -0700, James Overfelt wrote:
> Hello,
>
> I have a small test case where a file created with MPI_File_open
> is still open at the time MPI_Finalize is called. In the actual
> program there are lots of open files, and it would be nice to avoid the
> resulting "Your MPI job will now abort." by either having MPI_Finalize
> close the files or honor the error handler and return an error code
> without an abort.
>
> I've tried with OpenMPI 1.4.3 and 1.5 with the same results.
> Attached are the configure, compile and source files, and the whole
> program follows.

Under MPICH2, this simple test program does not abort. You leak a lot of resources (e.g. the allocated info structure is not freed), but it sounds like you are well aware of that.

Under OpenMPI, this test program fails because OpenMPI is trying to help you out. I'm going to need some help from the OpenMPI folks here, but the backtrace makes it look like MPI_Finalize is setting the "no more MPI calls allowed" flag, and then goes and calls some MPI routines to clean up the opened files:

Breakpoint 1, 0xb7f7c346 in PMPI_Barrier () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
(gdb) where
#0  0xb7f7c346 in PMPI_Barrier () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#1  0xb78a4c25 in mca_io_romio_dist_MPI_File_close () from /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
#2  0xb787e8b3 in mca_io_romio_file_close () from /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
#3  0xb7f591b1 in file_destructor () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#4  0xb7f58f28 in ompi_file_finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#5  0xb7f67eb3 in ompi_mpi_finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#6  0xb7f82828 in PMPI_Finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#7  0x0804f9c2 in main (argc=1, argv=0xbfffed94) at file_error.cc:17

Why is there an MPI_Barrier in the close path?
It has to do with our implementation of shared file pointers. If you run this test on a file system that does not support shared file pointers (PVFS, for example), you might get a little further.

So, I think the ball is back in the OpenMPI court: they have to re-jigger the order of the destructors so that closing files comes a little earlier in the shutdown process.

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application
Hi Shiqing,

I am using OpenMPI version 1.4.2. Here is the output of ompi_info:

Package: Open MPI Kalin Kanov@LAZAR Distribution
Open MPI: 1.4.2
Open MPI SVN revision: r23093
Open MPI release date: May 04, 2010
Open RTE: 1.4.2
Open RTE SVN revision: r23093
Open RTE release date: May 04, 2010
OPAL: 1.4.2
OPAL SVN revision: r23093
OPAL release date: May 04, 2010
Ident string: 1.4.2
Prefix: C:/Program Files/openmpi-1.4.2/installed
Configured architecture: x86 Windows-5.2
Configure host: LAZAR
Configured by: Kalin Kanov
Configured on: 18:00 04.10.2010
Configure host: LAZAR
Built by: Kalin Kanov
Built on: 18:00 04.10.2010
Built host: LAZAR
C bindings: yes
C++ bindings: yes
Fortran77 bindings: no
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: cl
C compiler absolute: cl
C++ compiler: cl
C++ compiler absolute: cl
Fortran77 compiler: CMAKE_Fortran_COMPILER-NOTFOUND
Fortran77 compiler abs: none
Fortran90 compiler:
Fortran90 compiler abs: none
C profiling: yes
C++ profiling: yes
Fortran77 profiling: no
Fortran90 profiling: no
C++ exceptions: no
Thread support: no
Sparse Groups: no
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: no
Heterogeneous support: no
mpirun default --prefix: yes
MPI I/O support: yes
MPI_WTIME support: gettimeofday
Symbol visibility support: yes
FT Checkpoint support: yes (checkpoint thread: no)
MCA backtrace: none (MCA v2.0, API v2.0, Component v1.4.2)
MCA paffinity: windows (MCA v2.0, API v2.0, Component v1.4.2)
MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
MCA timer: windows (MCA v2.0, API v2.0, Component v1.4.2)
MCA installdirs: windows (MCA v2.0, API v2.0, Component v1.4.2)
MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
MCA crs: none (MCA v2.0, API v2.0, Component v1.4.2)
MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA odls: process (MCA v2.0, API v2.0, Component v1.4.2)
MCA ras: ccp (MCA v2.0, API v2.0, Component v1.4.2)
MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
MCA rml: ftrm (MCA v2.0, API v2.0, Component v1.4.2)
MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
MCA plm: ccp (MCA v2.0, API v2.0, Component v1.4.2)
MCA plm: process (MCA v2.0, API v2.0, Component v1.4.2)
MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
MCA ess: hnp (MCA
Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?
On Wed, Dec 1, 2010 at 8:28 AM, Rob Latham wrote:
> On Mon, Nov 22, 2010 at 04:40:14PM -0700, James Overfelt wrote:
>> Hello,
>>
>> I have a small test case where a file created with MPI_File_open
>> is still open at the time MPI_Finalize is called. In the actual
>> program there are lots of open files and it would be nice to avoid the
>> resulting "Your MPI job will now abort." by either having MPI_Finalize
>> close the files or honor the error handler and return an error code
>> without an abort.
>>
>> I've tried with OpenMPI 1.4.3 and 1.5 with the same results.
>> Attached are the configure, compile and source files and the whole
>> program follows.
>
> under MPICH2, this simple test program does not abort. You leak a lot
> of resources (e.g. info structure allocated is not freed) but it
> sounds like you are well aware of that.
>
> under openmpi, this test program fails because openmpi is trying to
> help you out. I'm going to need some help from the openmpi folks
> here, but the backtrace makes it look like MPI_Finalize is setting the
> "no more mpi calls allowed" flag, and then goes and calls some mpi
> routines to clean up the opened files:
>
> Breakpoint 1, 0xb7f7c346 in PMPI_Barrier () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> (gdb) where
> #0  0xb7f7c346 in PMPI_Barrier () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #1  0xb78a4c25 in mca_io_romio_dist_MPI_File_close () from /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
> #2  0xb787e8b3 in mca_io_romio_file_close () from /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
> #3  0xb7f591b1 in file_destructor () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #4  0xb7f58f28 in ompi_file_finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #5  0xb7f67eb3 in ompi_mpi_finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #6  0xb7f82828 in PMPI_Finalize () from /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #7  0x0804f9c2 in main (argc=1, argv=0xbfffed94) at file_error.cc:17
>
> Why is there an MPI_Barrier in the close path? It has to do with our
> implementation of shared file pointers. If you run this test on a file
> system that does not support shared file pointers (PVFS, for example),
> you might get a little further.
>
> So, I think the ball is back in the OpenMPI court: they have to
> re-jigger the order of the destructors so that closing files comes a
> little earlier in the shutdown process.
>
> ==rob

Rob,

Thank you, that is the answer I was hoping for: I'm not crazy and it should be an easy fix. I'll look through the OpenMPI source code and maybe suggest a fix.

jro
[OMPI users] Open MPI vs IBM MPI performance help
OpenMPI version: 1.4.3
Platform: IBM P5, 32 processors, 256 GB memory, Symmetric Multi-Threading (SMT) enabled
Application: starts up 48 processes and does MPI using MPI_Barrier, MPI_Get, MPI_Put (lots of transfers, large amounts of data)
Issue: when implemented using Open MPI vs. IBM's MPI ('poe' from the HPC Toolkit), the application runs 3-5 times slower.

I suspect that IBM's MPI implementation must take advantage of some knowledge that it has about data transfers that Open MPI is not taking advantage of. Any suggestions?

Thanks,
Brian Price
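A few generic Open MPI 1.4-era runtime knobs may be worth experimenting with before concluding the gap is inherent. None of these come from a known fix for this workload; the component names (the sm/self BTLs, the pt2pt/rdma osc components, mpi_paffinity_alone) are standard 1.4.x MCA parameters, and `./app` is a placeholder:

```shell
# Pin each process to a processor (often helps on large SMP/SMT nodes):
mpirun --mca mpi_paffinity_alone 1 -np 48 ./app

# Restrict communication to the shared-memory and self BTLs on one node:
mpirun --mca btl sm,self -np 48 ./app

# One-sided traffic (MPI_Get/MPI_Put) goes through an "osc" component;
# try each implementation and compare timings:
mpirun --mca osc pt2pt -np 48 ./app
mpirun --mca osc rdma -np 48 ./app

# Inspect the tunable shared-memory parameters (eager limit, FIFO sizes):
ompi_info --param btl sm
```

Whether any of these closes a 3-5x gap against a vendor MPI tuned for the P5 interconnect and memory system is an open question, but they isolate where the time is going (placement, transport selection, or the one-sided implementation).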
Re: [OMPI users] SIGPIPE handling?
Sorry, one more question: I don't completely understand the version numbering, but can/will this fix go into 1.5.1 at some point? I notice that the trunk is labeled as 1.7.

Thanks again

Jesse Ziser wrote:
> It turned out I was using development version 1.5.0. After going back to
> the release version, I found that there was another problem on my end,
> which had nothing to do with OpenMPI. So thanks for the help; all is
> well. (And sorry for the belated reply.)
>
> Ralph Castain wrote:
>> After digging around a little, I found that you must be using the OMPI
>> devel trunk, as no release version contains this code. I also looked to
>> see why it was done, and found that the concern was with an inadvertent
>> SIGPIPE that can occur internal to OMPI due to a race condition.
>>
>> So I modified the trunk a little. We will ignore the first few SIGPIPE
>> errors we get, but will then abort with an appropriate error.
>>
>> HTH
>> Ralph
>>
>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>> Hello,
>>>
>>> I've noticed that OpenMPI does not seem to detect when something
>>> downstream of it fails. Specifically, I think it does not handle
>>> SIGPIPE or pass it down to its children, but it still prints an error
>>> message every time it occurs.
>>>
>>> For example, running a command like this:
>>>
>>>   mpirun -np 1 ./mpi-cat /dev/null
>>>
>>> (where mpi-cat is just a simple program that initializes MPI and then
>>> copies its input to its output) hangs after the dd quits, and produces
>>> an eternity of repetitions of this error message:
>>>
>>>   [[35845,0],0] reports a SIGPIPE error on fd 13
>>>
>>> I am unsure whether this is the intended behavior, but it certainly
>>> seems unfortunate from my perspective. Is there any way to make it
>>> exit nicely, preferably with a single error, whenever what it's trying
>>> to write to doesn't exist anymore? I think I could even submit a patch
>>> to make it quit on SIGPIPE, if it is agreed that that makes sense.
>>> Here's the source for my mpi-cat example:
>>>
>>> #include <stdio.h>
>>> #include <mpi.h>
>>>
>>> int main (int iArgC, char *apArgV [])
>>> {
>>>   int iRank;
>>>
>>>   MPI_Init (&iArgC, &apArgV);
>>>   MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
>>>
>>>   if (iRank == 0)
>>>   {
>>>     while(1)
>>>       if(putchar(getchar()) < 0)
>>>         break;
>>>   }
>>>
>>>   MPI_Finalize ();
>>>   return (0);
>>> }
>>>
>>> Thank you,
>>> Jesse Ziser
>>> Applied Research Laboratories: The University of Texas at Austin
Re: [OMPI users] SIGPIPE handling?
I can schedule it into the 1.5 series, but I don't think it will make 1.5.1 (too close to release). Have to ask...

On Dec 1, 2010, at 2:12 PM, Jesse Ziser wrote:
> Sorry, one more question: I don't completely understand the version
> numbering, but can/will this fix go into 1.5.1 at some point? I notice
> that the trunk is labeled as 1.7.
>
> Thanks again
>
> Jesse Ziser wrote:
>> It turned out I was using development version 1.5.0. After going back to
>> the release version, I found that there was another problem on my end,
>> which had nothing to do with OpenMPI. So thanks for the help; all is
>> well. (And sorry for the belated reply.)
>> Ralph Castain wrote:
>>> After digging around a little, I found that you must be using the OMPI
>>> devel trunk as no release version contains this code. I also looked to
>>> see why it was done, and found that the concern was with an inadvertent
>>> sigpipe that can occur internal to OMPI due to a race condition.
>>>
>>> So I modified the trunk a little. We will ignore the first few sigpipe
>>> errors we get, but will then abort with an appropriate error.
>>>
>>> HTH
>>> Ralph
>>>
>>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>>> Hello,
>>>>
>>>> I've noticed that OpenMPI does not seem to detect when something
>>>> downstream of it fails. Specifically, I think it does not handle
>>>> SIGPIPE or pass it down to its children, but it still prints an error
>>>> message every time it occurs.
>>>>
>>>> For example, running a command like this:
>>>>
>>>>   mpirun -np 1 ./mpi-cat /dev/null
>>>>
>>>> (where mpi-cat is just a simple program that initializes MPI and then
>>>> copies its input to its output) hangs after the dd quits, and produces
>>>> an eternity of repetitions of this error message:
>>>>
>>>>   [[35845,0],0] reports a SIGPIPE error on fd 13
>>>>
>>>> I am unsure whether this is the intended behavior, but it certainly
>>>> seems unfortunate from my perspective. Is there any way to make it
>>>> exit nicely, preferably with a single error, whenever what it's trying
>>>> to write to doesn't exist anymore? I think I could even submit a patch
>>>> to make it quit on SIGPIPE, if it is agreed that that makes sense.
>>>>
>>>> Here's the source for my mpi-cat example:
>>>>
>>>> #include <stdio.h>
>>>> #include <mpi.h>
>>>>
>>>> int main (int iArgC, char *apArgV [])
>>>> {
>>>>   int iRank;
>>>>
>>>>   MPI_Init (&iArgC, &apArgV);
>>>>   MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
>>>>
>>>>   if (iRank == 0)
>>>>   {
>>>>     while(1)
>>>>       if(putchar(getchar()) < 0)
>>>>         break;
>>>>   }
>>>>
>>>>   MPI_Finalize ();
>>>>   return (0);
>>>> }
>>>>
>>>> Thank you,
>>>> Jesse Ziser
>>>> Applied Research Laboratories: The University of Texas at Austin
Re: [OMPI users] SIGPIPE handling?
On Dec 1, 2010, at 4:12 PM, Jesse Ziser wrote:
> Sorry, one more question: I don't completely understand the version
> numbering, but can/will this fix go into 1.5.1 at some point? I notice
> that the trunk is labeled as 1.7.

Here's an explanation of our version numbering:

    http://www.open-mpi.org/software/ompi/versions/

The short version is:

- v1.4: our "super stable" / mature series. Someday it will be retired.
- v1.5: our "feature" series -- not quite as mature as the v1.4 series. Someday it will transition to be the next "super stable" series: v1.6.
- SVN development trunk / v1.7: what will eventually become the v1.7 series (i.e., our next "feature" series).

So v1.5 is an official release series, but it's still under active development and having features added. v1.4 is only having bug fixes applied to it -- it's in the stable/production portion of its lifespan.

> Thanks again
>
> Jesse Ziser wrote:
>> It turned out I was using development version 1.5.0. After going back to
>> the release version, I found that there was another problem on my end,
>> which had nothing to do with OpenMPI. So thanks for the help; all is
>> well. (And sorry for the belated reply.)
>> Ralph Castain wrote:
>>> After digging around a little, I found that you must be using the OMPI
>>> devel trunk as no release version contains this code. I also looked to
>>> see why it was done, and found that the concern was with an inadvertent
>>> sigpipe that can occur internal to OMPI due to a race condition.
>>>
>>> So I modified the trunk a little. We will ignore the first few sigpipe
>>> errors we get, but will then abort with an appropriate error.
>>>
>>> HTH
>>> Ralph
>>>
>>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>>> Hello,
>>>>
>>>> I've noticed that OpenMPI does not seem to detect when something
>>>> downstream of it fails. Specifically, I think it does not handle
>>>> SIGPIPE or pass it down to its children, but it still prints an error
>>>> message every time it occurs.
>>>>
>>>> For example, running a command like this:
>>>>
>>>>   mpirun -np 1 ./mpi-cat /dev/null
>>>>
>>>> (where mpi-cat is just a simple program that initializes MPI and then
>>>> copies its input to its output) hangs after the dd quits, and produces
>>>> an eternity of repetitions of this error message:
>>>>
>>>>   [[35845,0],0] reports a SIGPIPE error on fd 13
>>>>
>>>> I am unsure whether this is the intended behavior, but it certainly
>>>> seems unfortunate from my perspective. Is there any way to make it
>>>> exit nicely, preferably with a single error, whenever what it's trying
>>>> to write to doesn't exist anymore? I think I could even submit a patch
>>>> to make it quit on SIGPIPE, if it is agreed that that makes sense.
>>>>
>>>> Here's the source for my mpi-cat example:
>>>>
>>>> #include <stdio.h>
>>>> #include <mpi.h>
>>>>
>>>> int main (int iArgC, char *apArgV [])
>>>> {
>>>>   int iRank;
>>>>
>>>>   MPI_Init (&iArgC, &apArgV);
>>>>   MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
>>>>
>>>>   if (iRank == 0)
>>>>   {
>>>>     while(1)
>>>>       if(putchar(getchar()) < 0)
>>>>         break;
>>>>   }
>>>>
>>>>   MPI_Finalize ();
>>>>   return (0);
>>>> }
>>>>
>>>> Thank you,
>>>> Jesse Ziser
>>>> Applied Research Laboratories: The University of Texas at Austin

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?
On Dec 1, 2010, at 10:28 AM, Rob Latham wrote:
> under openmpi, this test program fails because openmpi is trying to
> help you out. I'm going to need some help from the openmpi folks
> here, but the backtrace makes it look like MPI_Finalize is setting the
> "no more mpi calls allowed" flag, and then goes and calls some mpi
> routines to clean up the opened files:

Rob --

I think you're right. I'll file a ticket, but I don't know exactly when this will be addressed.

James: if you can find a good solution and send a patch, that would be most appreciated.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] win: mpic++ -showme reports duplicate .libs
> -----Original Message-----
> From: Shiqing Fan [mailto:f...@hlrs.de]
> Sent: 01 December 2010 11:29
> To: Open MPI Users
> Cc: Hicham Mouline
> Subject: Re: [OMPI users] win: mpic++ -showme reports duplicate .libs
>
> Hi Hicham,
>
> Thanks for noticing it. It's now been fixed on trunk.
>
> Regards,
> Shiqing
>
> On 2010-12-1 10:02 AM, Hicham Mouline wrote:
> > Hello,
> >
> >> mpic++ -showme:link
> > /TP /EHsc /link /LIBPATH:"C:/Program Files (x86)/openmpi/lib" libmpi.lib
> > libopen-pal.lib libopen-rte.lib libmpi_cxx.lib libmpi.lib libopen-pal.lib
> > libopen-rte.lib advapi32.lib Ws2_32.lib shlwapi.lib
> >
> > reports using the 4 mpi libs twice.
> >
> > I've followed the cmake way in README.windows.
> >
> > Is this intended or have I wronged somewhere?
> >
> > rds,

That was fast; I'm glad these get sorted quickly. This output is used by the FindMPI module in CMake, which the maintainer is hopefully extending, after some emails, to work on Windows as well.

regards,
[OMPI users] win: cmake: release+debug
Hi,

Following the instructions in README.windows, I've used CMake and four build directories to generate release and debug builds for win32 and x64.

When it came to installing, I wondered: there are four directories involved (bin, lib, share and include). Are include and share identical across the four configurations? If so, it would be good to have a CMake way to share those directories. Since the debug libraries have a "d" appended to their names, they could also coexist in the same lib directory as the release libs.

On a win64 box, I could see:

  \Program Files\openmpi\bin and bin\debug: 64-bit release and debug mpic++ and co. (though I don't see the benefit of a debug mpic++)
  \Program Files\openmpi\lib: debug and release 64-bit libs
  \Program Files\openmpi\include: common? include
  \Program Files\openmpi\share: common? share
  \Program Files (x86)\openmpi: same as above, but for 32-bit

On a win32 box:

  \Program Files (x86)\openmpi: same as above, but _only_ for 32-bit

Is it doable easily like this already?

rds,
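The layout described above can roughly be achieved from the command line by pointing all four build trees at a common install prefix per architecture. A sketch, with illustrative paths and generator names; CMAKE_DEBUG_POSTFIX is a standard CMake variable, but whether Open MPI 1.4's CMake scripts honor it (rather than hard-coding their own "d" suffix) is an assumption:

```shell
# Four out-of-source build trees, one per configuration.
# A common CMAKE_INSTALL_PREFIX per architecture means include/ and share/
# are installed once and shared; the debug postfix keeps debug and release
# libraries apart in the same lib/ directory.

mkdir build-win32-release && cd build-win32-release
cmake -G "Visual Studio 9 2008" -DCMAKE_INSTALL_PREFIX="C:/Program Files (x86)/openmpi" -DCMAKE_BUILD_TYPE=Release ..
cd ..

mkdir build-win32-debug && cd build-win32-debug
cmake -G "Visual Studio 9 2008" -DCMAKE_INSTALL_PREFIX="C:/Program Files (x86)/openmpi" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_DEBUG_POSTFIX=d ..
cd ..

# x64: repeat both invocations with -G "Visual Studio 9 2008 Win64"
# and the prefix "C:/Program Files/openmpi".
```

If the install steps for the debug and release trees ever overwrite each other's headers, that would confirm the include/ trees are not configuration-independent after all.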