Kewl! Let us know if it breaks again.
> On Apr 26, 2015, at 4:29 PM, Andy Riebs wrote:
>
> Yes, it just worked -- I took the old command line, just to ensure that I was
> testing the correct problem, and it worked. Then I remembered that I had set
> OMPI_MCA_plm_rsh_pass_path and OMPI_MCA_plm_
Yes, it just worked -- I took the old command line, just to ensure
that I was testing the correct problem, and it worked. Then I
remembered that I had set OMPI_MCA_plm_rsh_pass_path and
OMPI_MCA_plm_rsh_pass_libpath in my test setup, so I removed those
from my environment
Not intentionally - I did add that new MCA param as we discussed, but don’t
recall making any other changes in this area.
There have been some other build system changes made as a result of more
extensive testing of the 1.8 release candidate - it is possible that something
in that area had an i
Hi Ralph,
Did you solve this problem in a more general way? I finally sat down
this morning to try this with the openmpi-dev-1567-g11e8c20.tar.bz2
nightly kit from last week, and can't reproduce the problem at all.
Andy
On 04/16/2015 12:15 PM, Ralph Cas
Sorry - I had to revert the commit due to a reported MTT problem. I'll
reinsert it after I get home and can debug the problem this weekend.
On Thu, Apr 16, 2015 at 9:41 AM, Andy Riebs wrote:
> Hi Ralph,
>
> If I did this right (NEVER a good bet :-) ), it didn't work...
>
> Using last night's ma
Hi Ralph,
If I did this right (NEVER a good bet :-) ), it didn't work...
Using last night's master nightly,
openmpi-dev-1515-gc869490.tar.bz2, I built with the same script as
yesterday, but removing the LDFLAGS=-Wl, stuff:
$ ./configure --prefix=/home/a
FWIW: I just added (last night) a pair of new MCA params for this purpose:
plm_rsh_pass_pathprepends the designated path to the remote
shell's PATH prior to executing orted
plm_rsh_pass_libpath same thing for LD_LIBRARY_PATH
I believe that will resolve the problem for Andy regardless of com
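For reference, a sketch of how those two params could be set, reusing paths that appear elsewhere in this thread (the exact values are an assumption):
$ export OMPI_MCA_plm_rsh_pass_path=/home/ariebs/mic/mpi-nightly/bin
$ export OMPI_MCA_plm_rsh_pass_libpath=/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic
or, equivalently, on the command line:
$ shmemrun --mca plm_rsh_pass_path /home/ariebs/mic/mpi-nightly/bin \
      --mca plm_rsh_pass_libpath /opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic ...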
Hello,
On Apr 15, 2015, at 02:11 , Gilles Gouaillardet wrote:
what about reconfiguring Open MPI with
LDFLAGS="-Wl,-rpath,/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic" ?
IIRC, another option is: LDFLAGS="-static-intel"
let me first state that I have no experience developing for
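Roughly, that reconfigure, plus a quick check that the rpath actually landed in orted, might look like this (the readelf check is an assumption, not something suggested in the thread):
$ ./configure --prefix=/home/ariebs/mic/mpi-nightly CC="icc -mmic" CXX="icpc -mmic" \
      LDFLAGS="-Wl,-rpath,/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic"
$ make install
$ readelf -d /home/ariebs/mic/mpi-nightly/bin/orted | grep -i rpath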
Ralph,
now i remember this part ...
IIRC, LD_LIBRARY_PATH was never forwarded when remote starting orted.
i simply avoided this issue by using gnu compilers, or gcc/g++/ifort if i need intel fortran
/* you already mentioned this is not officially supported by Intel */
What about adding a new
Gilles and Ralph, thanks!
$ shmemrun -H mic0,mic1 -n 2 -x SHMEM_SYMMETRIC_HEAP_SIZE=1M $PWD/mic.out
[atl1-01-mic0:192474] [[29886,0],0] ORTE_ERROR_LOG: Not found in file base/plm_base_launch_support.c at line 440
Hello World from process 0 of 2
Hello World fr
I think Gilles may be correct here. In reviewing the code, it appears we have
never (going back to the 1.6 series, at least) forwarded the local
LD_LIBRARY_PATH to the remote node when exec’ing the orted. The only thing we
have done is to set the PATH and LD_LIBRARY_PATH to support the OMPI pref
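Worth noting (a reading of -x semantics, not something stated in this thread): -x only exports variables to the application processes once the daemon is already up, so something like
$ shmemrun -H mic1 -n 1 -x LD_LIBRARY_PATH $PWD/mic.out
still leaves orted itself dependent on whatever the remote shell environment (or an rpath) provides for the Intel compiler libraries.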
Andy,
what about reconfiguring Open MPI with
LDFLAGS="-Wl,-rpath,/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic" ?
IIRC, another option is: LDFLAGS="-static-intel"
last but not least, you can always replace orted with a simple script
that sets the LD_LIBRARY_PATH and exec the ori
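A minimal sketch of that wrapper idea, assuming the real daemon is renamed to orted.real (names and layout are assumptions):
#!/bin/sh
# wrapper installed as .../bin/orted: add the MIC Intel runtime libs, then exec the real daemon
LD_LIBRARY_PATH=/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
exec /home/ariebs/mic/mpi-nightly/bin/orted.real "$@"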
Hmmm…certainly looks that way. I’ll investigate.
> On Apr 14, 2015, at 6:06 AM, Andy Riebs wrote:
>
> Hi Ralph,
>
> Still no happiness... It looks like my LD_LIBRARY_PATH just isn't getting
> propagated?
>
> $ ldd /home/ariebs/mic/mpi-nightly/bin/orted
> linux-vdso.so.1 => (0x7ff
Hi Ralph,
Still no happiness... It looks like my LD_LIBRARY_PATH just isn't
getting propagated?
$ ldd /home/ariebs/mic/mpi-nightly/bin/orted
linux-vdso.so.1 => (0x7fffa1d3b000)
libopen-rte.so.0 => /home/ariebs/mic/mpi-nightly/lib/lib
Weird. I’m not sure what to try at that point - IIRC, building static won’t
resolve this problem (but you could try and see). You could add the following
to the cmd line and see if it tells us anything useful:
--leave-session-attached --mca mca_component_show_load_errors 1
You might also do an ld
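Assembled into a full command, that would look something like this (combining the options above with the run line used elsewhere in this thread; the combination itself is an assumption):
$ shmemrun -H mic0,mic1 -n 2 --leave-session-attached \
      --mca mca_component_show_load_errors 1 \
      -x SHMEM_SYMMETRIC_HEAP_SIZE=1M $PWD/mic.out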
Ralph and Nathan,
The problem may be something trivial, as I don't typically use
"shmemrun" to start jobs. With the following, I *think* I've
demonstrated that the problem library is where it belongs on the
remote system:
$ ldd mic.out
linux-vds
For talking between PHIs on the same system I recommend using the scif
BTL NOT tcp.
That said, it looks like the LD_LIBRARY_PATH is wrong on the remote
system. It looks like it can't find the intel compiler libraries.
-Nathan Hjelm
HPC-5, LANL
On Mon, Apr 13, 2015 at 04:06:21PM -0400, Andy Rieb
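For illustration, selecting scif explicitly might look like this (the particular btl list shown is an assumption, not taken from the thread):
$ shmemrun -H mic0,mic1 -n 2 --mca btl scif,sm,self \
      -x SHMEM_SYMMETRIC_HEAP_SIZE=1M $PWD/mic.out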
I don’t see that LD_PRELOAD showing up on the ssh path, Andy
> /usr/bin/ssh mic1 PATH=/home/ariebs/mic/mpi-nightly/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/home/ariebs/mic/mpi-nightly/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; DYLD_LIBRARY_PATH=/home/ariebs/mic/mpi-nightly/lib:
Progress! I can run my trivial program on the local PHI, but not on the other PHI on the system. Here are the interesting parts:
A pretty good recipe with last night's nightly master:
$ ./configure --prefix=/home/ariebs/mic/mpi-nightly CC="icc -mmic"
CXX="icpc -m
Hi Ralph,
Here are the results with last night's "master" nightly,
openmpi-dev-1487-g9c6d452.tar.bz2, and adding the
memheap_base_verbose option (yes, it looks like the "ERROR_LOG"
problem has gone away):
$ cat /proc/sys/kernel/shmmax
33554432
$ cat
My fault, I thought the tar ball name looked funny :-)
Will try again tomorrow
Andy
--
Andy Riebs
andy.ri...@hp.com
Original message
From: Ralph Castain
Date: 04/12/2015 3:10 PM (GMT-05:00)
To: Open MPI Users
Subject: Re: [OMPI users] Problems using Open MPI 1.8.4 OSHMEM on
Sorry about that - I hadn’t brought it over to the 1.8 branch yet. I’ve done so
now, which means the ERROR_LOG shouldn’t show up any more. It won’t fix the
memheap problem, though.
You might try adding “--mca memheap_base_verbose 100” to your cmd line so we
can see why none of the memheap compo
Hi Ralph,
Here's the output with openmpi-v1.8.4-202-gc2da6a5.tar.bz2:
$ shmemrun -H localhost -N 2 --mca sshmem mmap --mca
plm_base_verbose 5 $PWD/mic.out
[atl1-01-mic0:190189] mca:base:select:( plm) Querying component
[rsh]
[atl1-01-mic0:190189] [[INV
Got it - thanks. I fixed that ERROR_LOG issue (I think- please verify). I
suspect the memheap issue relates to something else, but I probably need to let
the OSHMEM folks comment on it
> On Apr 11, 2015, at 9:52 AM, Andy Riebs wrote:
>
> Everything is built on the Xeon side, with the icc "-mm
Everything is built on the Xeon side, with the icc "-mmic" switch. I
then ssh into one of the PHIs, and run shmemrun from there.
On 04/11/2015 12:00 PM, Ralph Castain wrote:
Let me try to understand the setup a little better. Are you r
Let me try to understand the setup a little better. Are you running shmemrun on
the PHI itself? Or is it running on the host processor, and you are trying to
spawn a process onto the Phi?
> On Apr 11, 2015, at 7:55 AM, Andy Riebs wrote:
>
> Hi Ralph,
>
> Yes, this is attempting to get OSHMEM
Hi Ralph,
Yes, this is attempting to get OSHMEM to run on the Phi.
I grabbed openmpi-dev-1484-g033418f.tar.bz2 and configured it with
$ ./configure --prefix=/home/ariebs/mic/mpi-nightly CC=icc -mmic
CXX=icpc -mmic \
--build=x86_64-unknown-linu
Andy - could you please try the current 1.8.5 nightly tarball and see if it
helps? The error log indicates that it is failing to get the topology from some
daemon, I’m assuming the one on the Phi?
You might also add --enable-debug to that configure line and then put -mca plm_base_verbose on the
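That is, something along these lines (the verbosity level is assumed; 5 is the value used elsewhere in the thread):
$ ./configure --prefix=/home/ariebs/mic/mpi-nightly --enable-debug ...
$ shmemrun -H mic0,mic1 -n 2 --mca plm_base_verbose 5 $PWD/mic.out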
Summary: MPI jobs work fine, SHMEM jobs work just often enough to be
tantalizing, on an Intel Xeon Phi/MIC system.
Longer version:
Thanks to the excellent write-up last June (), I have been able to build a version of Open MPI for the Xeon Phi coprocessor