Try adding --hetero-nodes to the cmd line and see if that helps resolve the
problem. Of course, if all the machines are identical, then it won’t help.
> On Apr 29, 2015, at 1:43 PM, Brice Goglin wrote:
>
> On 29/04/2015 22:25, Noam Bernstein wrote:
>>> On Apr 29, 2015, at 4:09 PM, Brice Goglin wrote:
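For illustration, such an invocation might look roughly like the following; the hostfile name, process count, and binary are placeholders, and --bind-to core simply mirrors the per-core binding discussed in this thread:

  # ask mpirun not to assume that every node has the same hardware topology
  mpirun --hetero-nodes --bind-to core --hostfile hosts -np 32 ./my_app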
On 29/04/2015 22:25, Noam Bernstein wrote:
>> On Apr 29, 2015, at 4:09 PM, Brice Goglin wrote:
>>
>> Nothing wrong in that XML. I don't see what could be happening besides a
>> node rebooting with hyper-threading enabled for random reasons.
>> Please run "lstopo foo.xml" again on the node next time you get the OMPI
>> failure (assuming you get
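One way to act on this suggestion is to export the topology of a failing node and of a healthy node to XML and compare the two files; the filenames below are illustrative:

  # on a node that works (the .xml extension selects hwloc's XML exporter)
  lstopo good_node.xml
  # on a node that shows the failure
  lstopo bad_node.xml
  # copy both files to one machine and compare them
  diff good_node.xml bad_node.xml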
On 29/04/2015 18:55, Noam Bernstein wrote:
>> On Apr 29, 2015, at 12:47 PM, Brice Goglin wrote:
Received from Rolf vandeVaart on Wed, Apr 29, 2015 at 11:14:15AM EDT:
>
> >-Original Message-
> >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lev Givon
> >Sent: Wednesday, April 29, 2015 10:54 AM
> >To: us...@open-mpi.org
> >Subject: [OMPI users] getting OpenMPI 1.8.4 w/ CU
> On Apr 29, 2015, at 12:47 PM, Brice Goglin wrote:
>
> Thanks. It's indeed normal that OMPI failed to bind to cpuset 0,16 since
> 16 doesn't exist at all.
> Can you run "lstopo foo.xml" on one node where it failed, and send the
> foo.xml that got generated? Just want to make sure we don't have
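A quick way to check whether a processing unit with OS index 16 exists at all on a given node (assuming a standard Linux install plus hwloc's command-line tools):

  grep -c ^processor /proc/cpuinfo   # how many logical CPUs the kernel reports
  lstopo --only pu                   # list every processing unit hwloc sees, with its P# (OS) index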
On 29/04/2015 14:53, Noam Bernstein wrote:
> They’re dual 8-core-processor nodes, so the 16 cores are physical ones. lstopo
> output looks identical on nodes where this does happen, and nodes where it
> never does. My next step is to see if I can reproduce the behavior at will -
> I’m still n
Hi Lev:
Any chance you can try Open MPI 1.8.5rc3 and see if you see the same behavior?
That code has changed a bit from the 1.8.4 series and I am curious if you will
still see the same issue.
http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.5rc3.tar.gz
Thanks,
Rolf
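For completeness, fetching and building the release candidate might look roughly like this; the install prefix, CUDA path, and parallel-make level are arbitrary choices, not taken from this thread:

  wget http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.5rc3.tar.gz
  tar xzf openmpi-1.8.5rc3.tar.gz
  cd openmpi-1.8.5rc3
  ./configure --prefix=$HOME/openmpi-1.8.5rc3 --with-cuda=/usr/local/cuda
  make -j8 all install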
I'm trying to build/package OpenMPI 1.8.4 with CUDA support enabled on Linux
x86_64 so that the compiled software can be downloaded/installed as one of the
dependencies of a project I'm working on with no further user configuration. I
noticed that MPI programs built with the above will try to acce
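After installing such a build, one common way to confirm that CUDA support really was compiled in is to query ompi_info; this is a general check rather than anything specific to the packaging setup described above:

  ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
  # a trailing ":true" indicates the library was built with CUDA-aware support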
> On Apr 28, 2015, at 4:54 PM, Brice Goglin wrote:
>
> Hello,
> Can you build hwloc and run lstopo on these nodes to check that everything
> looks similar?
> You have hyperthreading enabled on all nodes, and you're trying to bind
> processes to entire cores, right?
> Does 0,16 correspond to two hyperthreads of the same core?
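On Linux, whether OS indexes 0 and 16 are the two hardware threads of one core can be checked directly from sysfs (a standard path on any recent kernel):

  cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
  # prints something like "0,16" if CPU 0 and CPU 16 are hyperthread siblings
  # of the same core, or just "0" when hyperthreading is disabled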