[OMPI users] Old version openmpi 1.2 support infiniband?
Hi everyone,

Recently I needed to compile the High-Performance Linpack (HPL) code with openmpi 1.2 (a fairly old version). The compilation finishes, but when I try to run I get the following errors:

[test:32058] *** Process received signal ***
[test:32058] Signal: Segmentation fault (11)
[test:32058] Signal code: Address not mapped (1)
[test:32058] Failing at address: 0x14a2b84b6304
[test:32058] [ 0] /lib64/libpthread.so.0(+0xf5e0) [0x14eb116295e0]
[test:32058] [ 1] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x28a) [0x14eaa81258aa]
[test:32058] [ 2] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2b) [0x14eaa853219b]
[test:32058] [ 3] /root/research/lib/openmpi-1.2.9/lib/libopen-pal.so.0(opal_progress+0x4a) [0x14eb128dbaaa]
[test:32058] [ 4] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x1d) [0x14eaf41e6b4d]
[test:32058] [ 5] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x3a5) [0x14eaf41eac45]
[test:32058] [ 6] /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(mca_oob_recv_packed+0x33) [0x14eb12b62223]
[test:32058] [ 7] /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x1f9) [0x14eaf3dd7db9]
[test:32058] [ 8] /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x31d) [0x14eb12b7893d]
[test:32058] [ 9] /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(ompi_mpi_init+0x8d6) [0x14eb13202136]
[test:32058] [10] /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(MPI_Init+0x6a) [0x14eb1322461a]
[test:32058] [11] ./xhpl(main+0x5d) [0x404e7d]
[test:32058] [12] /lib64/libc.so.6(__libc_start_main+0xf5) [0x14eb11278c05]
[test:32058] [13] ./xhpl() [0x4056cb]
[test:32058] *** End of error message ***
mpirun noticed that job rank 0 with PID 31481 on node test.novalocal exited on signal 15 (Terminated).
23 additional processes aborted (not shown)

The machine has infiniband, so I suspect that openmpi 1.2 may not support infiniband by default. I also tried running without infiniband, but then the program can only handle small input sizes; when I increase the input size and grid size, it just gets stuck. The program is a standard benchmark, so I don't think the problem is in its code. Any ideas? Thanks.
Re: [OMPI users] Old version openmpi 1.2 support infiniband?
Hi Jeff,

Thank you for your reply. I switched to another cluster that does not have infiniband and ran HPL with:

mpirun --mca btl tcp,self -np 144 --hostfile /root/research/hostfile ./xhpl

It ran successfully, but if I drop "--mca btl tcp,self" it no longer runs, so I suspect openmpi 1.2 cannot identify the proper network interfaces and set the right parameters for them.

I then went back to the previous cluster with infiniband and typed the same command; it hangs forever. If I change the command to:

mpirun --mca btl_tcp_if_include ib0 --hostfile /root/research/hostfile-ib -np 48 ./xhpl

it launches successfully but gives the following errors when HPL tries to split the communicator:

[node1.novalocal:09562] *** An error occurred in MPI_Comm_split
[node1.novalocal:09562] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09562] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09562] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09583] *** An error occurred in MPI_Comm_split
[node1.novalocal:09583] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09583] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09583] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09637] *** An error occurred in MPI_Comm_split
[node1.novalocal:09637] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09637] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09637] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09994] *** An error occurred in MPI_Comm_split
[node1.novalocal:09994] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09994] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09994] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 46005 on node test-ib exited on signal 15 (Terminated).

Hope you can give me some suggestions. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

On Mon, Mar 19, 2018 at 7:35 PM, Jeff Squyres (jsquyres) wrote:
> That's actually failing in a shared memory section of the code.
>
> But to answer your question, yes, Open MPI 1.2 did have IB support.
>
> That being said, I have no idea what would cause this shared memory segv -- it's quite possible that it's simple bit rot (i.e., v1.2.9 was released 9 years ago -- see https://www.open-mpi.org/software/ompi/versions/timeline.php. Perhaps it does not function correctly on modern glibc/Linux kernel-based platforms).
>
> Can you upgrade to a [much] newer Open MPI?
Re: [OMPI users] Old version openmpi 1.2 support infiniband?
Thank you. I am using the newest version of HPL. I forgot to mention that I can run HPL with openmpi-3.0 over infiniband. The reason I want to use the old version is that I need to compile a library that only supports old openmpi, so I am trying to make this tricky combination work. Anyway, thank you for your reply, Jeff; have a good day.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

On Mon, Mar 19, 2018 at 8:39 PM, Jeff Squyres (jsquyres) wrote:
> I'm sorry; I can't help debug a version from 9 years ago. The best suggestion I have is to use a modern version of Open MPI.
>
> Note, however, your use of "--mca btl ..." is going to have the same meaning for all versions of Open MPI. The problem you showed in the first mail was with the shared memory transport. Using "--mca btl tcp,self" means you're not using the shared memory transport. If you don't specify "--mca btl tcp,self", Open MPI will automatically use the shared memory transport. Hence, you could be running into the same (or similar/related) problem that you mentioned in the first mail -- i.e., something is going wrong with how the v1.2.9 shared memory transport is interacting with your system.
>
> Likewise, "--mca btl_tcp_if_include ib0" tells the TCP BTL plugin to use the "ib0" network. But if you have the openib BTL available (i.e., the IB-native plugin), that will be used instead of the TCP BTL because native verbs over IB performs much better than TCP over IB. Meaning: if you specify btl_tcp_if_include without specifying "--mca btl tcp,self", then (assuming openib is available) the TCP BTL likely isn't used and the btl_tcp_if_include value is therefore ignored.
>
> Also, what version of Linpack are you using? The error you show (the MPI_COMM_SPLIT error) is usually indicative of an MPI application bug. If you're running an old version of xhpl, you should upgrade to the latest.
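[Editorial note] For readers unfamiliar with where MPI_Comm_split appears in HPL: the benchmark arranges its ranks into a P x Q grid and builds row and column communicators by splitting MPI_COMM_WORLD. The sketch below is not HPL's source; the 8-column grid is an assumed example for 48 ranks (HPL reads P and Q from HPL.dat). A failure here can reflect either an application bug (as Jeff notes) or an underlying transport problem, since the split is a collective that communicates internally.

/* Minimal sketch (not HPL's actual code): split MPI_COMM_WORLD into row
 * and column communicators of a process grid.  Q = 8 is an assumed
 * number of grid columns, so 48 ranks form a 6 x 8 grid. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    int Q = 8;                       /* assumed grid width for illustration */
    MPI_Comm row_comm, col_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Ranks sharing a row index join one row communicator; ranks sharing
     * a column index join one column communicator.  These are the kinds
     * of MPI_Comm_split calls that fail in the log above. */
    MPI_Comm_split(MPI_COMM_WORLD, rank / Q, rank % Q, &row_comm);
    MPI_Comm_split(MPI_COMM_WORLD, rank % Q, rank / Q, &col_comm);

    MPI_Comm_free(&row_comm);
    MPI_Comm_free(&col_comm);
    MPI_Finalize();
    return 0;
}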
Re: [OMPI users] Old version openmpi 1.2 support infiniband?
I think the problem is that the library only works with the old framework, because it intercepts MPI calls and does some profiling. Here is the library: https://github.com/LLNL/Adagio

I checked the openmpi changelog. Starting with openmpi 1.3 it switched to a new framework, and openmpi 1.4+ changed it again; this library only works with openmpi 1.2.

Thank you for your advice; I will try it. My current problem is that the library tries to patch the mpi.h file, but the patch fails for newer versions of openmpi. I don't know the reason yet and will check it soon. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

On Tue, Mar 20, 2018 at 4:35 AM, Jeff Squyres (jsquyres) wrote:
> On Mar 19, 2018, at 11:32 PM, Kaiming Ouyang wrote:
> >
> > Thank you.
> > I am using newest version HPL.
> > I forgot to say I can run HPL with openmpi-3.0 under infiniband. The reason I want to use old version is I need to compile a library that only supports old version openmpi, so I am trying to do this tricky job.
>
> Gotcha.
>
> Is there something in particular about the old library that requires Open MPI v1.2.x?
>
> More specifically: is there a particular error you get when you try to use Open MPI v3.0.0 with that library?
>
> I ask because if the app supports the MPI API in Open MPI v1.2.9, then it also supports the MPI API in Open MPI v3.0.0. We *have* changed lots of other things under the covers in that time, such as:
>
> - how those MPI APIs are implemented
> - mpirun (and friends) command line parameters
> - MCA parameters
> - compilation flags
>
> But many of those things might actually be mostly -- if not entirely -- hidden from a library that uses MPI.
>
> My point: it may be easier to get your library to use a newer version of Open MPI than you think. For example, if the library has some hard-coded flags in their configure/Makefile to build with Open MPI, just replace those flags with `mpicc --showme:BLAH` variants (see `mpicc --showme:help` for a full listing). This will have Open MPI tell you exactly what flags it needs to compile, link, etc.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
Re: [OMPI users] Old version openmpi 1.2 support infiniband?
Hi John,

Thank you for your advice, but that is only about the library's functionality; right now my problem is that it does not compile against newer versions of openmpi. The reason probably lies in its patch file, since it needs to intercept MPI calls to profile some data, and newer openmpi versions may have changed the framework enough that this old software no longer fits.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

On Tue, Mar 20, 2018 at 10:46 AM, John Hearns via users <users@lists.open-mpi.org> wrote:
> "It does not handle more recent improvements such as Intel's turbo mode and the processor performance inhomogeneity that comes with it."
> I guess it is easy enough to disable Turbo mode in the BIOS though.
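[Editorial note] For context on what "intercepting MPI calls" involves: the portable mechanism for this is MPI's PMPI profiling interface, which works without patching mpi.h. The sketch below is a generic example of that technique, not Adagio's actual code; the timing logic is purely illustrative. It relies only on public MPI/PMPI symbols, so it builds against old and new Open MPI alike (modulo the const qualifiers MPI-3 added to send buffers).

/* Generic PMPI interposition sketch (not Adagio's code): every MPI
 * implementation also exports each MPI_* function as PMPI_*, so a tool
 * can define its own MPI_Send, record whatever it wants, and forward to
 * PMPI_Send.  Note: with pre-MPI-3 implementations such as Open MPI
 * 1.2, the first argument of MPI_Send is void * rather than const void *. */
#include <mpi.h>
#include <stdio.h>

static double total_send_time = 0.0;   /* illustrative accumulator */

int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = PMPI_Wtime();
    int rc = PMPI_Send(buf, count, type, dest, tag, comm);
    total_send_time += PMPI_Wtime() - t0;
    return rc;
}

int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d spent %.3f s in MPI_Send\n", rank, total_send_time);
    return PMPI_Finalize();
}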
Re: [OMPI users] Old version openmpi 1.2 support infiniband?
Hi Jeff,

Thank you for your advice. I will contact the author for suggestions. I also realize I may be able to port this old library to openmpi 3.0; I will work on that soon. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

On Wed, Mar 21, 2018 at 5:24 AM, Jeff Squyres (jsquyres) wrote:
> You might want to take that library author's advice from their README:
>
> -
> The source code herein was used as the basis of Rountree ICS 2009. It was my first nontrivial MPI tool and was never intended to be released to the wider world. I believe it was tied rather tightly to a subset of a (now) old MPI implementation. I expect a nontrivial amount of work would have to be done to get this to compile and run again, and that effort would probably be better served starting from scratch (using Todd Gamblin's wrap.py PMPI shim generator, for example).
> -
>
> > On Mar 21, 2018, at 2:23 AM, John Hearns via users <users@lists.open-mpi.org> wrote:
> >
> > Kaiming, good luck with your project. I think you should contact Barry Rountree directly. You will probably get good advice!
> >
> > It is worth saying that with Turboboost there is variation between each individual CPU die, even within the same SKU. What Turboboost does is to set a thermal envelope, and the CPU core(s) ramp up in frequency till the thermal limit is reached. So each CPU die is slightly different (*). Indeed in my last job we had a benchmarking exercise where the instruction was to explicitly turn off Turboboost.
> >
> > (*) As I work at ASML I really should understand this better... I really should.
[OMPI users] OMPI sendrecv bugs?
Hi all,

I am trying to test the bandwidth of intra-node MPI send and recv. The code is attached below. When I give the input 2048 (i.e., each process sends and receives 2GB of data), the program reports:

Read 2147479552, expected 2147483648, errno = 95
Read 2147479552, expected 2147483648, errno = 98
Read 2147479552, expected 2147483648, errno = 98
Read 2147479552, expected 2147483648, errno = 98

Does this mean Openmpi does not support send and recv when the data size is larger than 2GB, or is there a bug in my code? Thank you.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int count;
    float voltage;
    int *in;
    int i;
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    long mb = atol(argv[1]);
    count = mb / 4 * 1024 * 1024;        /* number of ints: mb megabytes per rank */
    in = (int *)malloc(count * sizeof(int));
    for (i = 0; i < count; i++) {
        *(in + i) = i;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    float time = MPI_Wtime();

    /* Even ranks send to the next odd rank, then the odd rank sends back. */
    if ((rank & 1) == 0) {
        MPI_Send(in, count, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
    } else {
        MPI_Status status;
        MPI_Recv(in, count, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
    }
    if ((rank & 1) == 1) {
        MPI_Send(in, count, MPI_INT, rank - 1, 0, MPI_COMM_WORLD);
    } else {
        MPI_Status status;
        MPI_Recv(in, count, MPI_INT, rank + 1, 0, MPI_COMM_WORLD, &status);
    }

    time = MPI_Wtime() - time;
    free(in);
    MPI_Finalize();
    return 0;
}
Re: [OMPI users] OMPI sendrecv bugs?
Hi Gilles,

Thank you for your reply. I am using Openmpi 3.0.0. I tried the command you recommended, and it works now. Perhaps this bug is still present in 3.0.0?

On Thu, Apr 12, 2018 at 10:06 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> Which version of Open MPI are you running ?
> This reminds me of a bug in CMA that has already been fixed.
>
> Can you try again with
>
> mpirun --mca btl_vader_single_copy_mechanism none ...
>
> Cheers,
>
> Gilles
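[Editorial note] For readers hitting the same 2 GB limit: besides disabling the single-copy mechanism as Gilles suggests, applications often sidestep very large single messages by transferring them in bounded chunks. The helpers below are a minimal sketch of that pattern; the function names and chunk size are illustrative assumptions, not part of the original test program or this thread's fix.

/* Minimal chunked-transfer sketch: break one large transfer into pieces
 * of at most CHUNK_ELEMS ints so no single MPI_Send/MPI_Recv moves
 * multiple gigabytes at once.  Sender and receiver must agree on the
 * total count and chunk size. */
#include <mpi.h>

#define CHUNK_ELEMS (64L * 1024 * 1024)   /* 64 Mi ints = 256 MB per message */

static void send_chunked(const int *buf, long count, int dest, int tag, MPI_Comm comm)
{
    long done = 0;
    while (done < count) {
        long left = count - done;
        int n = (int)(left < CHUNK_ELEMS ? left : CHUNK_ELEMS);
        MPI_Send(buf + done, n, MPI_INT, dest, tag, comm);
        done += n;
    }
}

static void recv_chunked(int *buf, long count, int src, int tag, MPI_Comm comm)
{
    long done = 0;
    while (done < count) {
        long left = count - done;
        int n = (int)(left < CHUNK_ELEMS ? left : CHUNK_ELEMS);
        MPI_Recv(buf + done, n, MPI_INT, src, tag, comm, MPI_STATUS_IGNORE);
        done += n;
    }
}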
[OMPI users] Infiniband driver update must recompile openmpi?
Hi All,

I have a question about recompiling openmpi. I recently updated the InfiniBand driver for my Mellanox network card, and the existing openmpi build no longer works. Does a driver update mean openmpi must be recompiled, or are there other issues I should consider? Thank you very much.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521