That's very interesting, Yevgeny.
Yes, tcp,self ran in 12 seconds;
tcp,self,sm ran in 27 seconds.
Does anyone have any idea how this can be?
About half the data would go to local processes, so SM should pay dividends.
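Presumably the two runs were selected along these lines; the process count and application name below are placeholders, not taken from the original report:

mpirun --mca btl tcp,self    -np 16 ./my_app    # TCP only, no shared-memory BTL
mpirun --mca btl tcp,self,sm -np 16 ./my_app    # shared-memory (sm) BTL added for on-node traffic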
From: Yevgeny Kliteynik
To: Randolph Pullen
The test for SCTP support in libc on FreeBSD only allows it to work on
FreeBSD 7 (or, I suppose, 70 :). The attached patch expands the test to
7 through 19, which should be enough for a while. Hopefully by the time
FreeBSD 19 is out everything will have sctp support in libc or have
dropped it. :)
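The attachment itself is not reproduced in the archive; a minimal sketch of what such a configure-time version test might look like, assuming it keys off the standard $host_os value (the real test in Open MPI's configure machinery may differ):

# Hypothetical sketch only -- not the actual patch.
case "$host_os" in
    freebsd[7-9]* | freebsd1[0-9]*)   # FreeBSD 7 through 19 (and, as joked above, 70-99)
        sctp_in_libc=yes ;;
    *)
        sctp_in_libc=no ;;
esac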
--
We actually include hwloc v1.3.2 in the OMPI v1.6 series.
Can you download and try that on your machines?
http://www.open-mpi.org/software/hwloc/v1.3/
In particular, try the hwloc-bind executable (outside of OMPI), and see if
binding works properly on your machines. I typically run a te
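A minimal standalone check along those lines (exact option names depend on the hwloc version installed):

hwloc-bind core:1 -- hwloc-bind --get    # bind to core 1, then print the resulting binding mask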
We've added "lo" back into the list, but I'm curious as to why 127.0.0.1/8
doesn't work.
If you run ipconfig, what does it say for the localhost entry? I.e., what's
its IP address and netmask?
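For reference, on a Unix-like system the loopback entry would be checked roughly like this (the interface name varies and is an assumption here):

ifconfig lo0    # "lo" on Linux, "lo0" on Solaris/BSD
# expected output includes something like: inet 127.0.0.1 netmask ff000000  (i.e. 127.0.0.1/8)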
On Sep 9, 2012, at 1:27 PM, Siegmar Gross wrote:
> Hi Shiqing,
>
> I disabled IPv6 in my network a
OK, so these are 2 errors.
1. Something in the C++ bindings (which is weird, because this failure is new; I don't
think this code has changed in a long, long time). This actually looks like a
problem in your C++ compiler, however -- can you compile other C++ applications
at all? (A quick check is sketched below.)
2. Same issue in VT. I'll
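A quick way to test that, as a generic sketch (the compiler name is a placeholder; substitute CC, g++, etc. as appropriate):

cat > hello.cc <<'EOF'
#include <iostream>
int main() { std::cout << "hello" << std::endl; return 0; }
EOF
g++ hello.cc -o hello && ./hello    # any failure here points at the compiler setup, not OMPI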
Hmmm...well, let's try to isolate this a little. Would you mind installing a
copy of the current trunk on this machine and trying it?
I ask because I'd like to better understand if the problem is in the actual
binding mechanism (i.e., hwloc), or in the code that computes where to bind the
proce
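One way to see which of the two is misbehaving is to let OMPI report its own binding decisions (1.6-era option names; the application name is a placeholder):

mpirun -np 4 --bind-to-core --report-bindings ./my_app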
I replied a couple of days ago (with OMPI users in CC) but got an error
last night:
Action: failed
Status: 5.0.0 (permanent failure)
Diagnostic-Code: smtp; 5.4.7 - Delivery expired (message too old) 'timeout'
(delivery attempts: 0)
I resent the mail this morning; it looks like it wasn't delivered.
I got no response to this question. Is Open-MX
no longer supported in Open MPI? Or is there someplace else
I should submit this information? I also attached my ompi_info
and omx_info output
--
Doug
> I built open-mpi 1.6.1 using the open-mx libraries.
> This worked previously and now I get the
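A hedged way to confirm from the attached ompi_info output whether the Open-MX (mx) components were built into this installation:

ompi_info | grep -i ": mx"
# expect lines such as "MCA btl: mx ..." and "MCA mtl: mx ..." if Open-MX support is compiled in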
Just following up on this comment about running from a backend node while under
slurm - I just tested this (using the patched 1.6 branch) and found it works
just fine. However, note that you will only be able to execute on that local
node as we cannot detect the full allocation anywhere but on t
Yes, 1.6.2rc1 had a problem - now fixed; it will be in tomorrow's nightly 1.6
tarball.
On Sep 10, 2012, at 9:50 AM, Siegmar Gross
wrote:
> Hi,
>
> thank you very much for your fast answer.
>
>> On 10/09/2012 15:41, Siegmar Gross wrote:
>>> Hi,
>>>
>>> I have built openmpi-1.6.2rc1 and get the
Hi,
thank you very much for your fast answer.
> On 10/09/2012 15:41, Siegmar Gross wrote:
> > Hi,
> >
> > I have built openmpi-1.6.2rc1 and get the following error.
> >
> > tyr small_prog 123 mpicc -showme
> > cc -I/usr/local/openmpi-1.6.2_32_cc/include -mt
> >-L/usr/local/openmpi-1.6.2_32_cc
Wow - okay, I'll have to investigate. Be aware, though, that you just described
a completely different failure. Oracle isn't using slurm, last I heard - you
were using rsh/qrsh. And you aren't running from a backend node, but from the
same frontend - you just have two hosts listed in your -host entr
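For comparison, a run of that shape would look roughly like this (host names are placeholders):

mpiexec -np 2 -host hostA,hostB ./init_finalize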
On 09/10/12 11:37, Ralph Castain wrote:
On Sep 10, 2012, at 8:12 AM, Aleksey Senin wrote:
On 10/09/2012 15:41, Siegmar Gross wrote:
Hi,
I have built openmpi-1.6.2rc1 and get the following error.
tyr small_prog 123 mpicc -showme
cc -I/usr/local/openmpi-1.6.2_32_cc/include -mt
-L/usr/local
On Sep 10, 2012, at 8:12 AM, Aleksey Senin wrote:
> On 10/09/2012 15:41, Siegmar Gross wrote:
>> Hi,
>>
>> I have built openmpi-1.6.2rc1 and get the following error.
>>
>> tyr small_prog 123 mpicc -showme
>> cc -I/usr/local/openmpi-1.6.2_32_cc/include -mt
>> -L/usr/local/openmpi-1.6.2_32_cc/
On 10/09/2012 15:41, Siegmar Gross wrote:
Hi,
I have built openmpi-1.6.2rc1 and get the following error.
tyr small_prog 123 mpicc -showme
cc -I/usr/local/openmpi-1.6.2_32_cc/include -mt
-L/usr/local/openmpi-1.6.2_32_cc/lib -lmpi -lm -lkstat -llgrp
-lsocket -lnsl -lrt -lm
tyr small_prog 12
Hi,
I have built openmpi-1.6.2rc1 and get the following error.
tyr small_prog 123 mpicc -showme
cc -I/usr/local/openmpi-1.6.2_32_cc/include -mt
-L/usr/local/openmpi-1.6.2_32_cc/lib -lmpi -lm -lkstat -llgrp
-lsocket -lnsl -lrt -lm
tyr small_prog 124 mpiexec -np 2 -host tyr init_finalize
Hello
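The init_finalize test program is not shown in the archive; presumably it is a minimal MPI program roughly like the following sketch (the contents are an assumption):

cat > init_finalize.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    printf("Hello\n");
    MPI_Finalize();
    return 0;
}
EOF
mpicc init_finalize.c -o init_finalize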
Hi,
> > are the following outputs helpful to find the error with
> > a rankfile on Solaris?
>
> If you can't bind on the new Solaris machine, then the rankfile
> won't do you any good. It looks like we are getting the incorrect
> number of cores on that machine - is it possible that it has
> hard
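For context, a rankfile and a core-count check along those lines might look like this (OMPI 1.6 rankfile syntax; the slot numbers are assumptions):

cat > my_rankfile <<'EOF'
rank 0=tyr slot=0:0
rank 1=tyr slot=0:1
EOF
mpiexec -np 2 -rf my_rankfile ./init_finalize
lstopo    # hwloc's lstopo shows how many sockets/cores are actually detected on that machine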
Randolph,
So what you are saying, in short, leaving all the numbers aside, is the following:
In your particular application on your particular setup with this particular
OMPI version,
1. openib BTL performs faster than shared memory BTL
2. TCP BTL performs faster than shared memory BTL
IMHO, this indic
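One way to reproduce such a comparison with each BTL isolated (the benchmark binary here is an assumption; any bandwidth/latency benchmark would do):

mpirun -np 2 --mca btl self,sm     ./osu_bw
mpirun -np 2 --mca btl self,tcp    ./osu_bw
mpirun -np 2 --mca btl self,openib ./osu_bw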