Dear all,
What is the Open MPI trunk?
If there is Open MPI functionality that resides in the Open MPI trunk, what
does that mean?
Is it possible to use functionality that resides in the Open MPI
trunk?
I want to use the ompi-migrate command. According to the Open MPI archives, it
resides in the Open MPI trunk.
Husen,
trunk is an old term coming from SVN.
Nowadays you should read that as Open MPI master, i.e. the "master" branch of
https://github.com/open-mpi/ompi.git
(vs. the v2.x or v1.10 branches of https://github.com/open-mpi/ompi-release.git)
Cheers,
Gilles
On Wed, Mar 23, 2016 at 3:13 PM, Husen R wrote:
>
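For reference, a minimal sketch of getting and building the master branch (install prefix and make parallelism are illustrative, not from this thread):

git clone https://github.com/open-mpi/ompi.git
cd ompi
./autogen.pl                             # required for git checkouts, not for release tarballs
./configure --prefix=$HOME/ompi-master
make -j 8 install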
I get 100 GFLOPS for 16 cores on one node, but 1 GFLOP running 8 cores
on two nodes. It seems that QDR InfiniBand should do better than
this. I built openmpi-1.10.2g with gcc version 6.0.0 20160317. Any
ideas of what to do to get usable performance? Thank you!
ibstatus
Infiniband device 'mlx4_0'
Attached is the output of ompi_info --all.
Note the message:
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler, does not support the following:
array subsections, direct passthru (where possible) to underlying Open
MPI's C functionality
Ronald,
did you try to build Open MPI with a previous gcc release?
If yes, what about the performance?
Did you build Open MPI from a tarball or from git?
If from git and without VPATH, then you need to
configure with --disable-debug.
IIRC, one issue was identified previously
(gcc optimization th
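For reference, a build along those lines might look like this (the prefix is illustrative; building from a git clone defaults to a debug build, which is slow):

./autogen.pl
./configure --prefix=$HOME/openmpi --disable-debug
make -j 8 install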
Thank you! Here are the answers:
I did not try a previous release of gcc.
I built from a tarball.
What should I do about the issue you mentioned (IIRC), and how should I check for it?
Are there any flags I should be using for InfiniBand? Is this a
problem with latency?
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ron
Hi, Ron
Please include the command line you used in your tests. Have you run any
sanity checks, like OSU latency and bandwidth benchmarks between the nodes?
Josh
On Wed, Mar 23, 2016 at 8:47 AM, Ronald Cohen wrote:
> Thank you! Here are the answers:
>
> I did not try a previous release of gcc
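For reference, a minimal sanity check with the OSU micro-benchmarks might look like this (node names and install path are assumptions; no superuser rights are needed):

mpirun -np 2 -host node1,node2 $HOME/osu/osu_latency   # point-to-point latency between two nodes
mpirun -np 2 -host node1,node2 $HOME/osu/osu_bw        # point-to-point bandwidth between two nodes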
Ronald,
the fix I mentioned landed in the v1.10 branch:
https://github.com/open-mpi/ompi-release/commit/c376994b81030cfa380c29d5b8f60c3e53d3df62
Can you please post your configure command line?
You can also try
mpirun --mca btl self,vader,openib ...
to make sure your run will abort instead of silently falling back to TCP.
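For reference, combined with the command line used elsewhere in the thread, that test might look like this (hostfile and process count are taken from Ron's runs):

mpirun --mca btl self,vader,openib -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.out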
Gilles,
I managed to get snapshots of all the /proc/<pid>/status entries for all
liggghts jobs, but the Cpus_allowed is similar no matter whether the system
was cold or warm booted.
Then I looked around in /proc/ and found sched_debug.
This at least shows that the liggghts processes are not spread over
Rainer,
what if you explicitly bind tasks to cores?
mpirun -bind-to core ...
Note this is v1.8 syntax ...
v1.6 is now obsolete (the Debian folks are working on upgrading it...).
Out of curiosity, did you try another distro such as RedHat and the like,
or SUSE, and do you observe the same behavior?
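For reference, a quick way to see where the ranks actually land (v1.8+ syntax; the executable name is a placeholder):

mpirun --bind-to core --report-bindings -np 16 ./your_app
# each rank reports the core(s) it is bound to on stderr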
I have tried:
mpirun --mca btl openib,self -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.out
and
mpirun -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.out
How do I run "sanity checks, like OSU latency and bandwidth benchmarks
between the nodes?" I am not superuser. Thanks,
Ron
---
Ron Cohen
recoh.
The configure line was simply:
./configure --prefix=/home/rcohen
when I run:
mpirun --mca btl self,vader,openib ...
I get the same lousy results: 1.5 GFLOPS
The output of the grep is:
Cpus_allowed_list: 0-7
Cpus_allowed_list: 8-15
Cpus_allowed_list: 0-7
Cpus_allowed_list:
Ronald,
first, can you make sure tm support was built?
The easiest way is to
configure --with-tm ...
It will fail if tm is not found.
If pbs/torque is not installed in a standard location, then you have to
configure --with-tm=<path to your torque install>
Then you can omit -hostfile from your mpirun command line.
hpl is known to sc
Dear Gilles,
--with-tm fails. I have now built with
./configure --prefix=/home/rcohen --with-tm=/opt/torque
make clean
make -j 8
make install
This rebuild greatly improved performance, from 1 GF to 32 GF on 2
nodes for a 2000-size matrix. For size 5000 it went up to 108 GF. So this
sounds pretty good.
Ronald,
out of curiosity, what kind of performance do you get with tcp and two
nodes?
e.g.
mpirun --mca btl tcp,vader,self ...
Before that, you can run
mpirun uptime
to ensure all your nodes are free
(e.g. no process was left running by another job).
You might also want to allocate your nodes exclusively.
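For reference, the two checks above might look like this (process count matches the earlier runs; output file name is illustrative):

mpirun -n 16 uptime                                        # every node should show a low load average
mpirun --mca btl tcp,vader,self -n 16 xhpl > xhpl_tcp.out  # same benchmark forced over tcp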
not sure whether it is relevant in this case, but I spent nearly one
week in January figuring out why the openib component was running very
slowly with the new Open MPI releases (though it was the 2.x series at
that time), and the culprit turned out to be the
btl_openib_flags parameter. I used to
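For reference, one way to inspect that parameter (the value to set is installation-specific and deliberately left as a placeholder here):

ompi_info --param btl openib --level 9 | grep btl_openib_flags
mpirun --mca btl_openib_flags <value> -n 16 xhpl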
I don't have any parameters set other than the defaults--thank you!
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 11:07 AM, Edgar Gabriel wrote:
> not sure whether it is relevant in this case, but I spent in January nearly
> one week to
So I've redone this with openmpi 1.10.2 and another piece of software (lammps
16feb16) and get the same results.
Upon cr_restart I see the openlava_wrapper process and the mpirun process
reappearing, but no orted and no lmp_mpi processes. No obvious error anywhere.
Using the --save-all feature from
I don’t believe checkpoint/restart is supported in OMPI past the 1.6 series.
There was some attempt to restore it, but that person graduated prior to fully
completing the work.
> On Mar 23, 2016, at 9:14 AM, Meij, Henk wrote:
>
> So I've redone this with openmpi 1.10.2 and another piece of so
Both BLCR and Open MPI work just fine. Independently.
Checkpointing and restarting a parallel application is not as simple as
mixing two tools together (especially when we talk about a communication
library, i.e. MPI); they have to cooperate in order to achieve the desired
goal of being able to cont
Thanks for responding.
#1 I am checkpointing the "wrapper" script (for the scheduler), which sets up
the mpirun env, builds the machinefile etc., then launches mpirun, which launches
orted, which launches lmp_mpi ... this gave me an idea to check BLCR; it states
" The '--tree' flag to 'cr_checkpoint'
So I want to thank you so much! My benchmark for my actual application
went from 5052 seconds to 266 seconds with this simple fix!
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 11:00 AM, Ronald Cohen wrote:
> Dear Gilles,
>
> --with-tm f
Hi Durga,
Sorry for the late reply and thanks for reporting that issue. As Rayson
mentioned, CUDA is intrinsically C++ and indeed uses the host C++
compiler. Hence linking MPI + CUDA code may need to use mpic++.
It happens to work with mpicc on various platforms where the libstdc++
is linked
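For reference, a minimal build along those lines might be (file names and the CUDA install path are assumptions):

nvcc -c kernel.cu -o kernel.o
mpic++ main.cpp kernel.o -o app -L/usr/local/cuda/lib64 -lcudart   # mpic++ links the libstdc++ needed by the CUDA objects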
Dear all,
I have been trying, for the last week, to compile a code (SPRKKR). The
compilation went through OK; however, there are problems with the executable
(kkrscf6.3MPI) not finding the MKL library links. I could not fix the
problem. I have tried several things, but in vain. I will post both
Elio,
it seems /opt/intel/composer_xe_2013_sp1/bin/compilervars.sh is only
available on your login/frontend nodes,
but not on your compute nodes.
you might be luckier with
/opt/intel/mkl/bin/mklvars.sh
Another option is to run
ldd /home/emoujaes/Elie/SPRKKR/bin/kkrscf6.3MPI
on your login node, an
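For reference, those two suggestions might look like this (the MKL path and executable come from the thread; 'intel64' is the usual argument but is an assumption here):

source /opt/intel/mkl/bin/mklvars.sh intel64
ldd /home/emoujaes/Elie/SPRKKR/bin/kkrscf6.3MPI | grep "not found"   # lists any libraries the loader cannot resolve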
Dear Gilles,
thanks for your reply and your options. I have tried the first option, which for
me is basically the easiest. I have compiled using "make.inc" but now setting
LIB = -L/opt/intel/mkl/lib/intel64 -lmkl_blas95_lp64 -lmkl_lapack95_lp64
-lmkl_intel_lp64 -lmkl_core -lmkl_sequential
E