[OMPI users] btl_tcp_use_nagle is negated in openmpi-1.7.4rc1
Hi,

I found that btl_tcp_use_nagle is negated in openmpi-1.7.4rc1, which causes a severe slowdown on the TCP network for smaller message sizes (< 1024) in our environment, as shown at the bottom.

This happened in SVN r28719, where the new MCA variable system was added. The flag tcp_not_use_nodelay was newly introduced as the negation of tcp_use_nodelay in r28719 (btl_tcp_component.c).

r28361 (btl_tcp_component.c):
218         mca_btl_tcp_component.tcp_use_nodelay =
219             !mca_btl_tcp_param_register_int ("use_nagle", "Whether to use Nagle's algorithm or not (using Nagle's algorithm may increase short message latency)", 0);

r28719 (btl_tcp_component.c):
242     mca_btl_tcp_param_register_int ("use_nagle", "Whether to use Nagle's algorithm or not (using Nagle's algorithm may increase short message latency)", 0, &mca_btl_tcp_component.tcp_not_use_nodelay);

In spite of this negation, the socket option is still set from tcp_not_use_nodelay in btl_tcp_endpoint.c, just as before. I think line 515 should be:

    optval = !mca_btl_tcp_component.tcp_not_use_nodelay;  /* tmishima */

I have already confirmed that this fix works well with openmpi-1.7.4rc1.

btl_tcp_endpoint.c @ r28719:
514 #if defined(TCP_NODELAY)
515     optval = mca_btl_tcp_component.tcp_not_use_nodelay;
516     if(setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, (char *)&optval, sizeof(optval)) < 0) {
517         BTL_ERROR(("setsockopt(TCP_NODELAY) failed: %s (%d)",
518                    strerror(opal_socket_errno), opal_socket_errno));
519     }
520 #endif

Regards,
Tetsuya Mishima

[mishima@manage OMB-3.1.1]$ mpirun -np 2 -host manage,node05 -mca btl self,tcp osu_bw
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
1                         0.00
2                         0.01
4                         0.01
8                         0.03
16                        0.05
32                        0.10
64                        0.16
128                       0.35
256                       0.74
512                      20.30
1024                    149.89
2048                    182.88
4096                    203.17
8192                    217.08
16384                   228.58
32768                   232.21
65536                   169.81
131072                  232.67
262144                  207.03
524288                  224.22
1048576                 233.30
2097152                 233.51
4194304                 234.64
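To illustrate the logic being proposed above, here is a minimal standalone C sketch. The use_nagle and not_use_nodelay variables are stand-ins for the MCA parameter and component field discussed in the report, not the actual Open MPI internals; the point is only that TCP_NODELAY must be set to the negation of the stored "not use nodelay" flag so that use_nagle=0 really disables Nagle's algorithm.

/* Illustrative sketch only; variable names mirror, but are not, the OMPI code. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int main(void)
{
    int use_nagle = 0;                /* default: Nagle disabled, like btl_tcp_use_nagle */
    int not_use_nodelay = use_nagle;  /* the negated flag introduced by the new MCA system */

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0) {
        perror("socket");
        return 1;
    }

    /* The endpoint code must undo the negation: TCP_NODELAY = !not_use_nodelay. */
    int optval = !not_use_nodelay;
    if (setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval)) < 0) {
        fprintf(stderr, "setsockopt(TCP_NODELAY) failed: %s (%d)\n",
                strerror(errno), errno);
    } else {
        printf("TCP_NODELAY set to %d (use_nagle = %d)\n", optval, use_nagle);
    }

    close(sd);
    return 0;
}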
Re: [OMPI users] Basic question on compiling fortran with windows computer
Hi Jeff,

thanks a lot. I will check that!

Best wishes,
Johanna

On 07.01.2014 00:16, Jeff Squyres (jsquyres) wrote:
> Sorry -- I was offline for the MPI_Festivus(3) break, and just returned to the office today.
>
> If you don't have the mpif90 or mpif77 executables in the same directory as the mpicc executable, then your installation did not build with Fortran support.
>
> Check the output of "ompi_info -a", too -- that will indicate whether your OMPI installation expects to have Fortran support included or not.
>
> On Dec 28, 2013, at 9:18 AM, Johanna Schauer wrote:
>> Hi Jeff,
>>
>> thanks a lot for your response. When I use "mpif90" I get the same response as before ("'mpif90' is not recognized as an internal or external command, operable program or batch file."). Are there any settings I might need to adjust (path variables etc.)?
>>
>> Thanks a lot and best wishes,
>> Johanna
>>
>> On 17.12.2013 21:32, Jeff Squyres (jsquyres) wrote:
>>> In the OMPI 1.6 series, the Fortran wrapper compilers are named "mpif77" and "mpif90". They were consolidated down to "mpifort" starting with OMPI 1.7.
>>>
>>> On Dec 17, 2013, at 2:18 PM, Johanna Schauer wrote:
>>>> Dear List,
>>>>
>>>> I have been looking for an answer everywhere, but I cannot find much on this topic. I have a Fortran code that uses Open MPI. Also, I have a Windows 8 computer. I have gfortran installed on my computer and it compiles just fine by itself.
>>>>
>>>> Now, I have downloaded and installed Open MPI v1.6.2-2 win64. I have tried to compile my file with the command:
>>>>
>>>> mpifort -o test test.f90
>>>>
>>>> All I get back is the following message:
>>>> "'mpifort' is not recognized as an internal or external command, operable program or batch file."
>>>>
>>>> I would be very thankful for any help.
>>>>
>>>> Best wishes,
>>>> Johanna
Re: [OMPI users] btl_tcp_use_nagle is negated in openmpi-1.7.4rc1
You are quite correct -- r28719 did indeed reverse the logic of that param. Thanks for tracking it down!

I pushed a fix to the trunk and scheduled it for 1.7.4.

On Jan 7, 2014, at 9:45 PM, tmish...@jcity.maeda.co.jp wrote:

> Hi,
>
> I found that btl_tcp_use_nagle is negated in openmpi-1.7.4rc1, which
> causes a severe slowdown on the TCP network for smaller message sizes (< 1024)
> in our environment, as shown at the bottom.
>
> This happened in SVN r28719, where the new MCA variable system was added.
> The flag tcp_not_use_nodelay was newly introduced as the negation of
> tcp_use_nodelay in r28719 (btl_tcp_component.c).
>
> r28361 (btl_tcp_component.c):
> 218         mca_btl_tcp_component.tcp_use_nodelay =
> 219             !mca_btl_tcp_param_register_int ("use_nagle", "Whether to use Nagle's algorithm or not (using Nagle's algorithm may increase short message latency)", 0);
>
> r28719 (btl_tcp_component.c):
> 242     mca_btl_tcp_param_register_int ("use_nagle", "Whether to use Nagle's algorithm or not (using Nagle's algorithm may increase short message latency)", 0, &mca_btl_tcp_component.tcp_not_use_nodelay);
>
> In spite of this negation, the socket option is still set from tcp_not_use_nodelay
> in btl_tcp_endpoint.c, just as before. I think line 515 should be:
>
>     optval = !mca_btl_tcp_component.tcp_not_use_nodelay;  /* tmishima */
>
> I have already confirmed that this fix works well with openmpi-1.7.4rc1.
>
> btl_tcp_endpoint.c @ r28719:
> 514 #if defined(TCP_NODELAY)
> 515     optval = mca_btl_tcp_component.tcp_not_use_nodelay;
> 516     if(setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, (char *)&optval, sizeof(optval)) < 0) {
> 517         BTL_ERROR(("setsockopt(TCP_NODELAY) failed: %s (%d)",
> 518                    strerror(opal_socket_errno), opal_socket_errno));
> 519     }
> 520 #endif
>
> Regards,
> Tetsuya Mishima
>
> [mishima@manage OMB-3.1.1]$ mpirun -np 2 -host manage,node05 -mca btl self,tcp osu_bw
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         0.00
> 2                         0.01
> 4                         0.01
> 8                         0.03
> 16                        0.05
> 32                        0.10
> 64                        0.16
> 128                       0.35
> 256                       0.74
> 512                      20.30
> 1024                    149.89
> 2048                    182.88
> 4096                    203.17
> 8192                    217.08
> 16384                   228.58
> 32768                   232.21
> 65536                   169.81
> 131072                  232.67
> 262144                  207.03
> 524288                  224.22
> 1048576                 233.30
> 2097152                 233.51
> 4194304                 234.64
[OMPI users] Failing to MPI run on my linux cluster
Hey all,

My cluster consists of two workstations with hyper-threaded Intel Xeon processors and an old Dell dual-core computer to control them. I am failing to mpirun on the cluster.

1. When executing as user:

[prufa@master]$ mpirun -np 16 --hostfile /home/prufa/prufa.mpi_hostfile fds_mpi SST1SV20.fds

Process 0 of 15 is running .

forrtl: Permission denied
forrtl: severe (9): permission to access file denied, unit 4, file /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv

2. Executing the same command again:

forrtl: severe (47): write to READONLY file, unit 4, file /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv

3. When I try one of the Open MPI examples:

[prufa@master]$ mpirun -np 18 /share/apps/openmpi-1.6.5/examples/hello_c.c
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /share/apps/openmpi-1.6.5/examples/hello_c.c
Node: w0094.stofa.is

while attempting to start process rank 0.
--------------------------------------------------------------------------
18 total processes failed to start

Could you guys please help me with this problem?

Thanks in advance.

Best regards,
Axel
Re: [OMPI users] Failing to MPI run on my linux cluster
I can't speak to your app as I don't know what it does. However, you *do* have to compile the example first! :-)

A simple "make" in the examples directory will create all the binaries.

On Jan 8, 2014, at 7:29 AM, Axel Viðarsson wrote:

> Hey all,
>
> My cluster consists of two workstations with hyper-threaded Intel Xeon
> processors and an old Dell dual-core computer to control them.
> I am failing to mpirun on the cluster.
>
> 1. When executing as user:
>
> [prufa@master]$ mpirun -np 16 --hostfile /home/prufa/prufa.mpi_hostfile
> fds_mpi SST1SV20.fds
>
> Process 0 of 15 is running .
>
> forrtl: Permission denied
> forrtl: severe (9): permission to access file denied, unit 4, file
> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>
> 2. Executing the same command again:
>
> forrtl: severe (47): write to READONLY file, unit 4, file
> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>
> 3. When I try one of the Open MPI examples:
>
> [prufa@master]$ mpirun -np 18 /share/apps/openmpi-1.6.5/examples/hello_c.c
> --------------------------------------------------------------------------
> mpirun was unable to launch the specified application as it could not
> access or execute an executable:
>
> Executable: /share/apps/openmpi-1.6.5/examples/hello_c.c
> Node: w0094.stofa.is
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
> 18 total processes failed to start
>
> Could you guys please help me with this problem?
>
> Thanks in advance.
>
> Best regards,
> Axel
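For reference, a program along the lines of what the examples directory builds looks like the following. This is a generic hello-world sketch, not the actual contents of examples/hello_c.c; the point is that mpirun launches the compiled binary (e.g. "mpicc hello.c -o hello; mpirun -np 4 ./hello"), never the .c source file itself.

/* Minimal MPI hello-world sketch (illustrative, not the shipped example). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &name_len);

    printf("Hello from rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}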
Re: [OMPI users] Failing to MPI run on my linux cluster
Thanks Ralph, now I can at least run the examples.

My app is called FDS, or Fire Dynamics Simulator. If someone is familiar with it, or just with the errors I am getting, I would appreciate any help.

Thanks,
Axel


2014/1/8 Ralph Castain

> I can't speak to your app as I don't know what it does. However, you *do*
> have to compile the example first! :-)
>
> A simple "make" in the examples directory will create all the binaries.
>
> On Jan 8, 2014, at 7:29 AM, Axel Viðarsson wrote:
>
> Hey all,
>
> My cluster consists of two workstations with hyper-threaded Intel Xeon
> processors and an old Dell dual-core computer to control them.
> I am failing to mpirun on the cluster.
>
> 1. When executing as user:
>
> [prufa@master]$ mpirun -np 16 --hostfile /home/prufa/prufa.mpi_hostfile
> fds_mpi SST1SV20.fds
>
> Process 0 of 15 is running .
>
> forrtl: Permission denied
> forrtl: severe (9): permission to access file denied, unit 4, file
> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>
> 2. Executing the same command again:
>
> forrtl: severe (47): write to READONLY file, unit 4, file
> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>
> 3. When I try one of the Open MPI examples:
>
> [prufa@master]$ mpirun -np 18 /share/apps/openmpi-1.6.5/examples/hello_c.c
> --------------------------------------------------------------------------
> mpirun was unable to launch the specified application as it could not
> access or execute an executable:
>
> Executable: /share/apps/openmpi-1.6.5/examples/hello_c.c
> Node: w0094.stofa.is
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
> 18 total processes failed to start
>
> Could you guys please help me with this problem?
>
> Thanks in advance.
>
> Best regards,
> Axel
Re: [OMPI users] Failing to MPI run on my linux cluster
It sounds like you are having filesystem permission issues -- i.e., your app is trying to write to a file that is not writable (i.e., this doesn't sound like an MPI issue).

On Jan 8, 2014, at 11:31 AM, Axel Viðarsson wrote:

> Thanks Ralph, now I can at least run the examples.
>
> My app is called FDS, or Fire Dynamics Simulator. If someone is familiar
> with it, or just with the errors I am getting, I would appreciate any help.
>
> Thanks,
> Axel
>
>
> 2014/1/8 Ralph Castain
>
> I can't speak to your app as I don't know what it does. However, you *do*
> have to compile the example first! :-)
>
> A simple "make" in the examples directory will create all the binaries.
>
> On Jan 8, 2014, at 7:29 AM, Axel Viðarsson wrote:
>
>> Hey all,
>>
>> My cluster consists of two workstations with hyper-threaded Intel Xeon
>> processors and an old Dell dual-core computer to control them.
>> I am failing to mpirun on the cluster.
>>
>> 1. When executing as user:
>>
>> [prufa@master]$ mpirun -np 16 --hostfile /home/prufa/prufa.mpi_hostfile
>> fds_mpi SST1SV20.fds
>>
>> Process 0 of 15 is running .
>>
>> forrtl: Permission denied
>> forrtl: severe (9): permission to access file denied, unit 4, file
>> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>>
>> 2. Executing the same command again:
>>
>> forrtl: severe (47): write to READONLY file, unit 4, file
>> /share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv
>>
>> 3. When I try one of the Open MPI examples:
>>
>> [prufa@master]$ mpirun -np 18 /share/apps/openmpi-1.6.5/examples/hello_c.c
>> --------------------------------------------------------------------------
>> mpirun was unable to launch the specified application as it could not
>> access or execute an executable:
>>
>> Executable: /share/apps/openmpi-1.6.5/examples/hello_c.c
>> Node: w0094.stofa.is
>>
>> while attempting to start process rank 0.
>> --------------------------------------------------------------------------
>> 18 total processes failed to start
>>
>> Could you guys please help me with this problem?
>>
>> Thanks in advance.
>>
>> Best regards,
>> Axel

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
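A plain filesystem check, independent of MPI, can confirm this kind of diagnosis before launching the job. The small C sketch below simply tests whether the current user can write the file named in the error output; the hard-coded path is copied from the messages above purely as an example and can be overridden on the command line.

/* Quick write-permission check; the default path is illustrative only. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    const char *path = (argc > 1) ? argv[1]
        : "/share/apps/FDS/FDS6/FDS6/Examples/Verkis/FDS6MPI_SST_1STEPVALUES_VEL_20.smv";

    /* access(W_OK) asks the kernel whether this user may write the file. */
    if (access(path, W_OK) == 0) {
        printf("%s is writable by this user\n", path);
        return 0;
    }
    fprintf(stderr, "cannot write %s: %s (%d)\n", path, strerror(errno), errno);
    return 1;
}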
Re: [OMPI users] Regression: Fortran derived types with newer MPI module
On Jan 7, 2014, at 11:23 PM, Jed Brown wrote:

> On page 610, I see text disallowing the explicit interfaces in
> ompi/mpi/fortran/use-mpi-tkr:
>
> In S2 and S3: [snip]
>
> Why did OMPI decide that this (presumably non-normative) text in the
> standard was not worth following? (Rejecting something in the standard
> indicates stronger convictions than would be independently weighing the
> benefits of each approach.)

I thought a lot about this before answering. The short answer probably is: we were (are) wrong, and should probably change the default back to "small" (i.e., no interfaces for MPI subroutines with choice buffers) when ignore TKR is not supported.

As I mentioned, Craig and I debated long and hard to change that default, but, in summary, we apparently missed this clause on p610. I'll change it back.

I'll be happy when gfortran 4.9, which supports ignore TKR, is released and you'll get proper interfaces. :-)

> I don't call MPI from Fortran, but someone on a Fortran project that I
> watch mentioned that the compiler would complain about such and such a
> use (actually relating to types for MPI_Status in MPI_Recv rather than
> buffer types).

Can you provide more details here? Choice buffer issues aside, I'm failing to think of a scenario where you should get a compile mismatch for the MPI status dummy argument in MPI_Recv...

> It's nice to know that after 60 years (when Fortran 201x is released,
> including TS 29113), there will be a Fortran standard with an analogue
> of void*.

It actually took quite a lot of coordination between the MPI Forum and the J3 Fortran standards body to make that happen. :-)

>> - very few codes use the "mpi" module
>
> FWIW, I've noticed a few projects transition to it in the last few years.

Good to know.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Regression: Fortran derived types with newer MPI module
"Jeff Squyres (jsquyres)" writes: > As I mentioned Craig and I debated long and hard to change that > default, but, in summary, we apparently missed this clause on p610. > I'll change it back. Okay, thanks. > I'll be happy when gfortran 4.9 is released that supports ignore TKR > and you'll get proper interfaces. :-) Better for everyone. >> I don't call MPI from Fortran, but someone on a Fortran project that I >> watch mentioned that the compiler would complain about such and such a >> use (actually relating to types for MPI_Status in MPI_Recv rather than >> buffer types). > > Can you provide more details here? Choice buffer issues aside, I'm > failing to think of a scenario where you should get a compile mismatch > for the MPI status dummy argument in MPI_Recv... Totally superficial, just passing "status(1)" instead of "status" or "status(1:MPI_STATUS_SIZE)". I extrapolated: how can they provide an explicit interface to MPI_Recv in "use mpi", given portability constraints/existing language standards? pgpCIfMJ5CYnP.pgp Description: PGP signature
[OMPI users] MPI stats argument in Fortran mpi module (was Regression: Fortran derived types with newer MPI module)
On Jan 8, 2014, at 8:17 PM, Jed Brown wrote:

>>> I don't call MPI from Fortran, but someone on a Fortran project that I
>>> watch mentioned that the compiler would complain about such and such a
>>> use (actually relating to types for MPI_Status in MPI_Recv rather than
>>> buffer types).

(changed subject to reflect a different thread)

>> Can you provide more details here?
>
> Totally superficial, just passing "status(1)" instead of "status" or
> "status(1:MPI_STATUS_SIZE)".

That's a different type (INTEGER scalar vs. INTEGER array). So the compiler complaining about that is actually correct.

Under the covers, Fortran will (most likely) pass both by reference, so they'll both actually (most likely) *work* if you build with an MPI that doesn't provide an interface for MPI_Recv, but passing status(1) is actually incorrect Fortran.

> I extrapolated: how can they provide an
> explicit interface to MPI_Recv in "use mpi", given portability
> constraints/existing language standards?

I think you're saying that you agree with my above statements about the different types, and you're just detailing how you got to asking about WTF we were providing an MPI_Recv interface in the first place. Kumbaya. :-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI stats argument in Fortran mpi module
"Jeff Squyres (jsquyres)" writes: >> Totally superficial, just passing "status(1)" instead of "status" or >> "status(1:MPI_STATUS_SIZE)". > > That's a different type (INTEGER scalar vs. INTEGER array). So the > compiler complaining about that is actually correct. Yes, exactly. > Under the covers, Fortran will (most likely) pass both by reference, > so they'll both actually (most likely) *work* if you build with an MPI > that doesn't provide an interface for MPI_Recv, but passing status(1) > is actually incorrect Fortran. Prior to slice notation, this would be the only way to build an array of statuses. I.e., receives go into status(1:MPI_STATUS_SIZE), status(1+MPI_STATUS_SIZE:2*MPI_STATUS_SIZE), etc. Due to pass-by-reference semantics, I think this will always work, despite not type-checking with explicit interfaces. I don't know what the language standard says about backward-compatibility of such constructs, but presumably we need to know the dialect to understand whether it's acceptable. (I actually don't know if the Fortran 77 standard defines the behavior when passing status(1), status(1+MPI_STATUS_SIZE), etc., or whether it works only as a consequence of the only reasonable implementation. > I think you're saying that you agree with my above statements about > the different types, and you're just detailing how you got to asking > about WTF we were providing an MPI_Recv interface in the first place. > Kumbaya. :-) Indeed. pgpRjssQGNBtq.pgp Description: PGP signature