[OMPI users] IB question (slightly off topic)
Hello all

I have a question that, I realize, is somewhat off topic for this list, but I do not know whom else to approach for an answer. Hopefully the community here can help me out.

I know that InfiniBand is a 'standard' interface (standardized by the IETF? IEEE? or some similar body), much like Ethernet. However, I see that adapters come in different 'flavors' (with different feature sets?), such as QLogic PSM or Mellanox ConnectX, that have *user space* "drivers", and even Open MPI treats them differently (preferring QLogic PSM over other IB hardware by default).

For someone very new to the InfiniBand world, what are the differences? How can they be different and yet conform to the (supposed) standard?

Any pointer to appropriate literature is appreciated.

Thanks in advance
Durga

Life is complex. It has real and imaginary parts.
Re: [OMPI users] IB question (slightly off topic)
In my understanding, the standard mainly covers the hardware and wire-level interoperability. For example, Mellanox and QLogic InfiniBand adapters can use the same cables and switches; IIRC, they can also use the same subnet manager and communicate via IPoIB.

When performance matters, Mellanox uses IB verbs and QLogic uses the PSM library. I am not sure what you mean by "OMPI prefers PSM over other IB": assuming QLogic hardware can also work with IB verbs, then yes, PSM is faster on QLogic, so Open MPI will prefer PSM there. Mellanox InfiniBand cannot use PSM, so Open MPI uses IB verbs for it. Note that Mellanox also provides optimized proprietary libraries (hcoll, mxm, ...) that can be used for enhanced performance.

FWIW and IIRC, Intel bought the InfiniBand assets from QLogic a few years ago.

Cheers,

Gilles
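For illustration only: assuming the component names used by the Open MPI 1.x series (pml "cm"/"ob1", mtl "psm", btl "openib" -- these are assumptions about the installed build and may differ across versions), the transport Open MPI picks can also be forced explicitly with MCA parameters, in the same way other MCA options are passed elsewhere in this thread:

    # QLogic/Intel hardware: use the PSM MTL through the cm PML
    mpirun --mca pml cm --mca mtl psm -np 4 ./a.out

    # Mellanox (verbs) hardware: use the openib BTL through the ob1 PML
    mpirun --mca pml ob1 --mca btl openib,self,sm -np 4 ./a.out

This is a sketch of how the preference can be overridden, not a statement of what Open MPI selects by default on any given system.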
Re: [OMPI users] Communication problem (on one node) when network interface is down
Also, the loopback interface is somewhat special: since all nodes have the same IP on it (127.0.0.1), this interface cannot be used for inter-node communication.

On Saturday, March 12, 2016, Jeff Squyres (jsquyres) wrote:
> The loopback interface is listed in btl_tcp_if_exclude by default (because in most cases, you *do* want to exclude it -- it's much slower than the shared-memory transports used in these scenarios). But this value can certainly be overridden:
>
>     mpirun --mca btl_tcp_if_exclude '' ...
>
> > On Mar 11, 2016, at 11:15 AM, dpchoudh . wrote:
> >
> > Hello all
> >
> > From a user standpoint, that does not seem right to me. Why should one need any kind of network at all if one is dealing entirely with a single node? Is there any particular reason Open MPI does not/cannot use the lo (loopback) interface? I'd think it is there for exactly this kind of situation.
> >
> > Thanks
> > Durga
> >
> > Life is complex. It has real and imaginary parts.
> >
> > On Fri, Mar 11, 2016 at 6:08 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> > Spawned tasks cannot use the sm nor the vader BTL, so you need another one (tcp, openib, ...). The self BTL is only for a process to send/recv to itself; it does not work for inter-process communication, even within a node.
> >
> > I am pretty sure the lo interface is always discarded by Open MPI, so I have no solution off the top of my head that involves only Open MPI. Maybe your best bet is to use a "dummy" interface, for example tap or tun, or even a bridge.
> >
> > Cheers,
> >
> > Gilles
> >
> > On Friday, March 11, 2016, Rémy Grünblatt wrote:
> > Hello,
> > I'm having a communication problem between two processes (one being spawned by the other, on the *same* physical machine). Everything works as expected when I have a network interface such as eth0 or wlo1 up, but as soon as they are down, I get errors (such as « At least one pair of MPI processes are unable to reach each other for MPI communications [...] »).
> > I tried to specify a set of MCA parameters, including the btl "self" parameter and including the lo interface in the btl_tcp_if_include list, as advised by https://www.open-mpi.org/faq/?category=tcp, but I didn't reach any working state for this code with the "external" network interfaces down.
> >
> > Got any idea about what I might be doing wrong?
> >
> > Example code that triggers the problem: https://ptpb.pw/YOjr.tar.gz
> > Ompi_info: https://ptpb.pw/Vt_V.txt
> > Full log: https://ptpb.pw/JCXn.txt
> >
> > Rémy
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
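The scenario under discussion is a parent process spawning a child on the same node. A minimal, hypothetical sketch of that pattern (standard MPI calls only; file and program names are made up): because the spawned child belongs to a different MPI job, the sm/vader BTLs cannot connect the two processes, and some other BTL (tcp, openib, ...) must be available even on a single machine.

    /* spawn.c -- sketch: the program spawns one copy of itself and the
       pair exchange a single integer over the intercommunicator. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Comm parent, inter;
        int data = 42;

        MPI_Init(&argc, &argv);
        MPI_Comm_get_parent(&parent);

        if (parent == MPI_COMM_NULL) {
            /* parent: spawn one child running the same executable */
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                           MPI_COMM_SELF, &inter, MPI_ERRCODES_IGNORE);
            MPI_Send(&data, 1, MPI_INT, 0, 0, inter);
        } else {
            /* child: receive from the parent via the parent intercommunicator */
            MPI_Recv(&data, 1, MPI_INT, 0, 0, parent, MPI_STATUS_IGNORE);
            printf("child received %d\n", data);
        }

        MPI_Finalize();
        return 0;
    }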
Re: [OMPI users] Error with MPI_Register_datarep
Hi Gilles,

On 2016-03-10 23:14, Gilles Gouaillardet wrote:
> Eric,
>
> my short answer is no.
>
> the long answer is:
>
> - from MPI_Register_datarep():
>     /* The io framework is only initialized lazily.  If it hasn't
>        already been initialized, do so now (note that MPI_FILE_OPEN
>        and MPI_FILE_DELETE are the only two places that it will be
>        initialized). */
>
> - from mca_io_base_register_datarep():
>     /* Find the maximum additional number of bytes required by all io
>        components for requests and make that the request size */
>     OPAL_LIST_FOREACH(cli, &ompi_io_base_framework.framework_components,
>                       mca_base_component_list_item_t) {
>         ...
>     }
>
> in your case, since neither MPI_File_open nor MPI_File_delete is invoked, the ompio component could be disabled. but that would mean the io component selection also depends on whether MPI_Register_datarep() has been invoked beforehand. i can foresee users complaining about IO performance discrepancies just because of one line (e.g. an MPI_Register_datarep invocation) in their code.

Ok, my situation is that I want a datarep only so that a 32-bit build of the code (long int) stays compatible with files written by the 64-bit build using the "native" datarep... So I want to activate the datarep functionality only in the 32-bit compilation of the code.

Now, I continued my tests with "--mca io ^ompio", but I hit another wall: when I try to use the datarep just to test it, I now get this message:

    ERROR Returned by MPI: 51
    ERROR_string Returned by MPI: MPI_ERR_UNSUPPORTED_DATAREP: data representation not supported

which is pretty similar to the MPICH output... So I am completely stuck on implementing a solution to read/write "native" 64-bit datarep files from a 32-bit architecture... Isn't that in the MPI-2 standard? Does that mean no MPI implementation is standard compliant? >:)

> now if MPI_File_open is invoked first, that means MPI_Register_datarep will fail or succeed depending on the selected io component (and iirc, that could be file(system) dependent within the same application).
>
> i am open to suggestions, but so far, i do not see a better one (other than implementing this in OMPIO)
>
> the patch for v1.10 can be downloaded at https://github.com/ggouaillardet/ompi-release/commit/1589278200d9fb363d61fa20fb39a4c2fa78c942.patch
>
> the application will not crash, but will fail "nicely" on MPI_Register_datarep

In reality I want a solution to read/write files with the same API (MPI collective calls) and produce files that are compatible between 32-bit and 64-bit architectures with respect to the long int issue, without any loss of precision or performance for the "native" 64-bit case.

I saw, about 4 years ago, the example in the "Using MPI-2" book about datareps, and that led me to think I could easily implement a solution to read/write files in a format compatible between architectures, even while choosing the "native" datarep on 64-bit architectures, which are the only ones really used on our clusters and by our users so far. I decided to code once, with all collective I/O calls, knowing I would be able to convert int32 to int64 only when needed. Now I feel I made a bad choice, since no MPI implementation has this working... or maybe there is a simple workaround? Is an "external64" representation available? Is anything written into the file about the datarep? If not, then "native" could still be as fast as "external64" since no conversion is needed, but on 32-bit architectures there would be some I/O performance loss, since more conversions would occur.

Thanks for helping me understand!
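For reference, a minimal, hedged sketch of what registering such a custom representation looks like with the MPI-2 API. The datarep name "int64" mirrors the test program in this thread; the conversion callbacks are placeholders and do not implement the actual 8-byte-to-long translation, and whether registration succeeds at all depends on the io component in use, as discussed above.

    #include <mpi.h>

    /* placeholder: would convert 'count' items at 'position' from the 8-byte
       file layout in filebuf into the native in-memory layout in userbuf */
    static int read_conv(void *userbuf, MPI_Datatype datatype, int count,
                         void *filebuf, MPI_Offset position, void *extra_state)
    {
        return MPI_SUCCESS;
    }

    /* placeholder: inverse conversion, native layout -> 8-byte file layout */
    static int write_conv(void *userbuf, MPI_Datatype datatype, int count,
                          void *filebuf, MPI_Offset position, void *extra_state)
    {
        return MPI_SUCCESS;
    }

    /* simplification: assume every element occupies 8 bytes in the file */
    static int file_extent(MPI_Datatype datatype, MPI_Aint *extent, void *extra_state)
    {
        *extent = 8;
        return MPI_SUCCESS;
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rc = MPI_Register_datarep("int64", read_conv, write_conv,
                                      file_extent, NULL);
        /* rc may be MPI_ERR_UNSUPPORTED_DATAREP if the io component lacks support */
        MPI_Finalize();
        return rc == MPI_SUCCESS ? 0 : 1;
    }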
Eric

> Cheers,
>
> Gilles
>
> On 3/11/2016 12:11 PM, Éric Chamberland wrote:
> > Thanks Gilles! it works... I will continue my tests with that command line...
> >
> > Until OMPIO supports this, is there a way to put a call into the code to disable ompio, the same way --mca io ^ompio does?
> >
> > Thanks,
> >
> > Eric
> >
> > On 2016-03-10 20:13, Gilles Gouaillardet wrote:
> > > Eric,
> > >
> > > I will fix the crash (fwiw, it is already fixed in v2.x and master)
> > >
> > > note this program cannot currently run "as is". by default, there are two frameworks for io: ROMIO and OMPIO. MPI_Register_datarep tries to register the datarep with all frameworks, and succeeds only if the datarep was successfully registered with all of them. OMPIO does not currently support this (and the stub is missing in v1.10, so the app crashes).
> > >
> > > your test is successful if you blacklist ompio:
> > >
> > >     mpirun --mca io ^ompio ./int64
> > >
> > > or
> > >
> > >     OMPI_MCA_io=^ompio ./int64
> > >
> > > and you do not even need a patch for that :-)
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On 3/11/2016 4:47 AM, Éric Chamberland wrote:
> > > > Hi,
> > > >
> > > > I have a segfault while trying to use MPI_Register_datarep with openmpi-1.10.2:
> > > >
> > > >     mpic++ -g -o int64 int64.cc
> > > >     ./int64
> > > >     [melkor:24426] *** Process received signal ***
> > > >     [melkor:24426] Signal: Segmentation fault (11)
> > > >     [melkor:24426] Signal code: Address not mapped (1)
> > > >     [melkor:24426] Failing at address: (nil)
> > > >     [melkor:24426] [
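Regarding the question of disabling ompio from inside the code: one possible approach (a sketch only, not verified against every Open MPI version) is to set the corresponding OMPI_MCA_* environment variable before MPI_Init, since Open MPI reads MCA parameters from the environment at startup, as the OMPI_MCA_io example above already relies on.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        /* must happen before MPI_Init; same effect as OMPI_MCA_io=^ompio */
        setenv("OMPI_MCA_io", "^ompio", 1);

        MPI_Init(&argc, &argv);
        /* ... MPI_Register_datarep(), file I/O, etc. ... */
        MPI_Finalize();
        return 0;
    }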