Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports
Hi Jeff,

> How are you measuring that it hasn't been successful?

A network switch sits between the two machines and I am watching the link activity on the ports.

> One thing to make sure of is that you interfaces are on different subnets.

Oh. I had them all on the same subnet. Now the first port shares the same subnet so I can ssh in and the other ports have their own just as you suggested.

> Bad Things(tm) can happen... :)

How do I now go about setting up /etc/hosts, -hostfile entries and bringing them all together on the mpirun run line?

For example, my 2nd machine is a quad core Dell T3500. Should I create a separate entry in /etc/hosts for each NIC port (T3500-eth1, T3500-eth2, T3500-eth3)? And for the -hostfile, should I also create separate entries for each core?

Cheers,
Bob.
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports
On Dec 22, 2018, at 10:56 AM, Bob Beattie wrote:

> How do I now go about setting up /etc/hosts, -hostfile entries and bringing
> them all together on the mpirun run line ?
> For example, my 2nd machine is a quad core Dell T3500. Should I create a
> separate entry in /etc/hosts for each NIC port ? (T3500-eth1, T3500-eth2,
> T3500-eth3):
> and for the -hostfile should I also create separate entries for each core ?

You can add entries in /etc/hosts for the new IP interfaces if you like, but Open MPI won't care. Open MPI deals with IP addresses, and it'll auto-discover them (by looking at all the IP interfaces exported by the kernel) and use them as it finds them.

--
Jeff Squyres
jsquy...@cisco.com
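To see which IP interfaces the kernel exports on a given machine, something like the sketch below can help. It is only an illustration using the standard `getifaddrs()` call on Linux or BSD; it is an assumption that this resembles what Open MPI does internally, and the function name `list_ipv4_interfaces` is hypothetical. It lists IPv4 interfaces that are up, one per line.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <ifaddrs.h>

/*
 * Hypothetical helper: prints "name address netmask mask" for every
 * IPv4 interface that is up. Returns the number of interfaces printed,
 * or -1 if the kernel's interface list could not be read.
 */
int list_ipv4_interfaces(FILE *out)
{
    struct ifaddrs *head = NULL;
    int count = 0;

    assert(out != NULL);

    if (getifaddrs(&head) == -1) {
        return -1;
    }

    for (struct ifaddrs *ifa = head; ifa != NULL; ifa = ifa->ifa_next) {
        char addr[INET_ADDRSTRLEN];
        char mask[INET_ADDRSTRLEN];

        /* Skip entries without an IPv4 address or netmask, and
         * interfaces that are administratively down. */
        if (ifa->ifa_addr == NULL || ifa->ifa_netmask == NULL) continue;
        if (ifa->ifa_addr->sa_family != AF_INET) continue;
        if (!(ifa->ifa_flags & IFF_UP)) continue;

        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)ifa->ifa_addr;
        const struct sockaddr_in *nm =
            (const struct sockaddr_in *)ifa->ifa_netmask;

        if (inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr)) == NULL) continue;
        if (inet_ntop(AF_INET, &nm->sin_addr, mask, sizeof(mask)) == NULL) continue;

        fprintf(out, "%s %s netmask %s\n", ifa->ifa_name, addr, mask);
        count++;
    }

    freeifaddrs(head);
    return count;
}
```

Calling `list_ipv4_interfaces(stdout)` on each node shows the addresses and netmasks in play, which makes it easy to confirm that every extra port landed on its own subnet.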
Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports
Many, many thanks. Couldn't see the wood for the trees!

I now have the two machines using all their 1Gb ports to talk to each other.

Cheers Jeff, Happy holidays.
Bob. South UK.
[OMPI users] open-mpi.org 3.1.3.tar.gz needs a refresh?
Maybe the distribution tar ball at

https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.3.tar.gz

did not get refreshed after the fix in

https://github.com/bosilca/ompi/commit/b902cd5eb765ada57f06c75048509d0716953549

was implemented? I downloaded the tarball from open-mpi.org today, 22 Dec, and compiled and I get the warnings.

ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0xd82 valid_mask = 0x1)
[bn01][[37143,17005],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx4_0 errno says Invalid argument
ibv_exp_query_device: invalid comp_mask !!! (comp_mask = 0xd810002 valid_mask = 0x1)
[bn01][[37143,17005],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx4_0 errno says Invalid argument
--
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   bn01
  Local device: mlx4_0
--
--
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   bn01
  Local device: mlx4_0
--

It looks like Howard merged the fix on Dec 4, but the date listed for the 3.1.3 tarball on the open-mpi.org site is in Oct.

Relevant lines in opal/mca/btl/openib/btl_openib_component.c from the tar ball are these. Missing the

    memset(&device->ib_exp_dev_attr, 0, sizeof(device->ib_exp_dev_attr));

that should have been inserted at 1667.

1666 #if HAVE_DECL_IBV_EXP_QUERY_DEVICE
1667     device->ib_exp_dev_attr.comp_mask = IBV_EXP_DEVICE_ATTR_RESERVED - 1;
1668     if(ibv_exp_query_device(device->ib_dev_context, &device->ib_exp_dev_attr)){
1669         BTL_ERROR(("error obtaining device attributes for %s errno says %s",
1670                     ibv_get_device_name(device->ib_dev), strerror(errno)));
1671         goto error;
1672     }
1673 #endif

I added a comment to the GitHub issue, but it was closed and I am not sure that will be noticed. Sorry for the double-posting if that was sufficient.

-- bennet
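The missing memset follows a common C pattern: zero an attribute struct before setting the fields you care about, so that any field you do not set explicitly is not left holding stale memory. The sketch below illustrates that pattern with toy types only. `struct toy_attr`, `toy_query`, `toy_attr_init`, and `TOY_VALID_MASK` are all hypothetical stand-ins, not libibverbs or Open MPI definitions, and it is an assumption that a stale nested mask is the mechanism behind the warnings above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: the only bit the toy query accepts in the nested mask. */
#define TOY_VALID_MASK 0x1u

/* Toy stand-in for an extensible attribute struct with a nested mask. */
struct toy_nested {
    uint32_t comp_mask;
};

struct toy_attr {
    uint32_t comp_mask;
    struct toy_nested nested;
};

/*
 * Toy stand-in for a query call: fails with -1 when the nested mask
 * carries bits outside TOY_VALID_MASK, otherwise returns 0.
 */
int toy_query(const struct toy_attr *attr)
{
    assert(attr != NULL);
    return (attr->nested.comp_mask & ~TOY_VALID_MASK) ? -1 : 0;
}

/*
 * Zero the whole struct first, then set the top-level mask. Without
 * the memset, nested.comp_mask keeps whatever bytes were there before.
 */
void toy_attr_init(struct toy_attr *attr, uint32_t comp_mask)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->comp_mask = comp_mask;
}
```

In this toy setup, a struct whose memory was previously filled with garbage fails `toy_query` if only the top-level `comp_mask` is assigned, and passes once `toy_attr_init` has been used instead.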
[OMPI users] open-mpi.org is DOWN
Hello all

Apologies to everyone, but I received an alert this morning that malware has been detected on the www.open-mpi.org site. I have tried to contact the hosting agency and the security scanners, but nobody is around on this pre-holiday weekend.

Accordingly, I have taken the site OFFLINE for the indeterminate future until we can get this resolved. Sadly, with the holidays upon us, I don't know how long it will take to get responses from either company. Until we do, the site will remain offline for safety reasons.

Ralph