On 2/5/2016 2:38 AM, dpchoudh . wrote:
Dear all
This is a slightly off-topic post, and hopefully people won't mind
helping me out.
I have a very simple setup with two PCs, both with identical Chelsio
10GE iWARP adapter connected back-to-back.
With this setup, the TCP channel works fine (with
Hey Jeff, what did you run to generate the memory corruption? Can you
run the same test with --mca btl_openib_memalign_threshold 12288 and see
if you get the same corruption? I'm not hitting any corruption over
iw_cxgb4 with a simple test.
On 6/10/2015 2:39 PM, Jeff Squyres (jsquyres) wrote:
FYI:
I opened:
https://github.com/open-mpi/ompi/issues/638
to track this.
Steve.
On 6/10/2015 4:07 PM, Ralph Castain wrote:
Done
On Jun 10, 2015, at 1:55 PM, Steve Wise <sw...@opengridcomputing.com> wrote:
If you're trying to release 1.8.6, I recommend you revert th
ist
> Cc: Nathan Hjelm; Steve Wise
> Subject: Re: [OMPI users] Default value of btl_openib_memalign_threshold
>
> Nathan / Steve -- you guys are nominally the owners of the openib BTL: can
> you please investigate?
>
>
> > On Jun 10, 2015, at 4:15 PM, Ralph Castain wro
t;
>> On Jun 2, 2015, at 7:10 AM, Steve Wise
> wrote:
>>
>> On 6/1/2015 9:51 PM, Ralph Castain wrote:
>>> I’m wondering if it is also possible that the error message is
simply printing that ID incorrectly. Looking at the code, it
appears
b BTL
bootstrapping). :-)
On Jun 2, 2015, at 10:04 AM, Ralph Castain wrote:
On Jun 2, 2015, at 7:10 AM, Steve Wise wrote:
On 6/1/2015 9:51 PM, Ralph Castain wrote:
I’m wondering if it is also possible that the error message is simply printing
that ID incorrectly. Looking at the code, it ap
On 6/2/2015 10:04 AM, Ralph Castain wrote:
On Jun 2, 2015, at 7:10 AM, Steve Wise <sw...@opengridcomputing.com> wrote:
On 6/1/2015 9:51 PM, Ralph Castain wrote:
I’m wondering if it is also possible that the error message is
simply printing that ID incorrectly. Looking at the
erent
MPI processes specify different receive queue specifications.
You mentioned that the device ID is being incorrectly identified: is that
OMPI's fault, or something wrong with the device itself?
On Jun 1, 2015, at 6:06 PM, Steve Wise wrote:
On 6/1/2015 9:53 AM, Ralph Castain wrote
On 6/1/2015 9:53 AM, Ralph Castain wrote:
Well, I checked and it looks to me like --hetero-apps is a stale option in the
master at least - I don’t see where it gets used.
Looking at the code, I would suspect that something didn’t get configured
correctly - either the --enable-heterogeneous flag
one of the settings that were printed out:
P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64
or
P,65536,64
-Nathan
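
For readers unfamiliar with the two settings quoted above, here is a minimal sketch of how such a string breaks down, assuming only what the quoted values show: entries are separated by colons, each entry starts with a type letter (P or S here), and the remaining fields are comma-separated integers. The function name `parse_receive_queues` is hypothetical and is not part of Open MPI; the meaning of each numeric field is left uninterpreted.

```python
def parse_receive_queues(spec):
    """Split a receive-queue string into (type_letter, [ints]) tuples.

    Hypothetical helper for illustration only; it does not validate
    the values the way Open MPI itself would.
    """
    queues = []
    for entry in spec.split(":"):          # one entry per queue
        fields = entry.split(",")
        kind = fields[0]                   # "P" or "S" in the strings quoted above
        numbers = [int(f) for f in fields[1:]]
        queues.append((kind, numbers))
    return queues


first = parse_receive_queues(
    "P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64"
)
second = parse_receive_queues("P,65536,64")

# Compare the two settings entry by entry.
for kind, numbers in first:
    print(kind, numbers)
print(second)
```

Printing both parsed forms side by side makes it easier to see that the two settings differ in the number of queues as well as in their parameters.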
On Mon, Jun 01, 2015 at 09:28:28AM -0500, Steve Wise wrote:
Hello,
I'm seeing an error trying to run a simple OMPI job on a 2 node cluster where
been so long since someone tried this that I’d have to look to remember what it
does.
On Jun 1, 2015, at 7:28 AM, Steve Wise wrote:
Hello,
I'm seeing an error trying to run a simple OMPI job on a 2 node cluster where
one node is a PPC64 BE byte order and the other is a
X86_64 LE byte
Hello,
I'm seeing an error trying to run a simple OMPI job on a 2 node cluster where
one node is a PPC64 BE byte order and the other is a
X86_64 LE byte order node. OMPI 1.8.4 is configured with
--enable-heterogeneous:
./configure --with-openib=/usr CC=gcc CXX=g++ F77=gfortran FC=gfortran
--e
Hey Open MPI wizards,
I'm trying to debug something in my library that gets loaded into my mpi
processes when they are started via mpirun. With other MPIs, I've been
able to deliver SIGUSR2 to the process and trigger some debug code I
have in my library that sets up a handler for SIGUSR2. Ho
Hi,
I'm trying to use padb 3.0 to get stack traces on open-mpi / IMB1 runs.
While the job is running, I do run this, but get an error:
[ompi@hpc-hn1 ~]$ padb --show-jobs --config-option rmgr=orte
65427
[ompi@hpc-hn1 ~]$ padb --all --proc-summary --config-option rmgr=orte
Warning, failed to l
Andy Georgi wrote:
Steve Wise wrote:
Are you using Chelsio's TOE drivers? Or just a driver from the distro?
We use the Chelsio TOE drivers.
Steve Wise wrote:
Ok. Did you run their perftune.sh script?
Yes, if not we wouldn't get the 1.15 GB/s on the TCP level. We had
~800 M
Jon Mason wrote:
On Mon, Aug 18, 2008 at 10:00:24AM +0200, Andy Georgi wrote:
Steve Wise wrote:
Are you using Chelsio's TOE drivers? Or just a driver from the distro?
We use the Chelsio TOE drivers.
Steve Wise wrote:
Ok. Did you run their perftune.sh s
Andy Georgi wrote:
Hello again ;),
after getting acceptable latency on our Chelsio S320E-CXA adapters we
now want to check if we can also tune the bandwidth. On TCP level
(measured via iperf) we get 1.15 GB/s, on MPI level (measured via
MPI-Ping-Pong) just 930 MB/s. We already set btl_tcp_sndb
With OpenMPI 1.3 / iWARP you should get around 8us latency using mpi
pingpong tests.
Andy Georgi wrote:
Thanks again for all the answers. It seems that there was a bug in the
driver in combination with Suse Linux Enterprise Server 10. It was
fixed with version 1.0.146.
Now we have 12us with NP
On Thu, 2007-05-10 at 20:07 -0400, Jeff Squyres wrote:
> Brian --
>
> Didn't you add something to fix exactly this problem recently? I
> have a dim recollection of seeing a commit go by about this...?
>
> (I advised Steve in IM to use --disable-ipv6 in the meantime)
>
Yes, disabling it worke
I'm trying to run a job specifically over tcp and the eth1 interface.
It seems to be barfing on trying to listen via ipv6. I don't want ipv6.
How can I disable it?
Here's my mpirun line:
[root@vic12-10g ~]# mpirun --n 2 --host vic12,vic20 --mca btl self,tcp -mca
btl_tcp_if_include eth1 /root/IM