Gilles' suggestion is correct; the larger point to make here is that the openib
BTL is obsolete and has been superseded by the UCX PML. UCX is supported by the
vendor (NVIDIA); openib is not.
If you're just starting a new project, I would strongly advocate using UCX
instead of openib.
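A minimal sketch of selecting UCX explicitly at run time, assuming your Open
MPI build includes UCX support:
mpirun --mca pml ucx -np 4 ./a.out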
On Nov 1, 2021
Hi Ben,
have you tried
export OMPI_MCA_common_ucx_opal_mem_hooks=1
Cheers,
Gilles
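Both forms in this exchange set the same kind of knob: any Open MPI MCA
parameter <name> can be given either as an environment variable
OMPI_MCA_<name> or as --mca <name> <value> on the mpirun command line. A
minimal sketch, using a parameter that appears later in this digest:
export OMPI_MCA_btl='^openib'              # environment form
mpirun --mca btl '^openib' -np 4 ./a.out   # equivalent command-line form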
On Mon, Nov 1, 2021 at 9:22 PM bend linux4ms.net via users <
users@lists.open-mpi.org> wrote:
> Ok, I am a newbie supporting an HPC project and learning about MPI.
>
> I have the following portion of a shell script:
Ok, I am a newbie supporting an HPC project and learning about MPI.
I have the following portion of a shell script:
export OMPI_MCA_btl_openib_allow_ib=1
export OMPI_MCA_btl_openib_if_include="mlx5_0:1"
mpirun -machinefile ${hostlist} \
--mca opal_common_ucx_opal_mem_hooks 1 \
-np $NP \
Hello Gilles
Thanks again for your inputs. Since that code snippet works for you, I am
now fairly certain that my 'instrumentation' has broken something; sorry
for troubling the whole community while I climb the learning curve. The
netcat test that you mention does work correctly, and your program works fine
in my environment.
This is typical of a firewall running on your host(s); can you double
check that?
a simple way to do that is to
10.10.10.11# nc -l 1024
and on the other node
echo ahah | nc 10.10.10.11 1024
the first command should print "ahah" unless the host is unreachable
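If the test fails, a quick way to check whether a firewall is the culprit on
RHEL-like hosts (assuming firewalld; adjust for your distribution):
systemctl status firewalld      # is a firewall running?
sudo firewall-cmd --list-all    # what is it currently allowing?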
Hello Gilles
Thanks for your help.
My question was more of a sanity check on myself. That little program I
sent looked correct to me; do you see anything wrong with it?
What I am running on my setup is an instrumented OMPI stack, taken from git
HEAD, in an attempt to understand how some of the i
Hi,
per a previous message, can you give a try to
mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp --mca pml ob1 ./mpitest
if it still hangs, the issue could be that Open MPI thinks some subnets are
reachable when they are not.
for diagnostics:
mpirun --mca btl_base_verbose 100 ...
you can explic
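If the verbose output shows Open MPI trying addresses that are not actually
reachable, one way to pin it down is the btl_tcp_if_include parameter that
also appears later in this digest (eth0 is a hypothetical interface name):
mpirun --mca btl self,tcp --mca btl_tcp_if_include eth0 -np 2 ./mpitest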
Hello all
I don't mean to be competing for the 'silliest question of the year award',
but I can't figure this out on my own:
My 'cluster' has 2 machines, bigMPI and smallMPI. They are connected via
several (types of) networks and the connectivity is OK.
In this setup, the following program hangs
> > On a side note, do you have an RDMA supporting device (
> > Infiniband/RoCE/iWarp) ?
>
> I'm just an engineer trying to get something to work on an AMD dual core
> notebook for the powers-that-be at a small engineering concern (all MEs) in
> Huntsville, AL - i.e., NASA work.
>
If on a uni
> On a side note, do you have an RDMA supporting device (
Infiniband/RoCE/iWarp) ?
I'm just an engineer trying to get something to work on an AMD dual core
notebook for the powers-that-be at a small engineering concern (all MEs) in
Huntsville, AL - i.e., NASA work.
---John
On Sun, Sep 16, 2012 a
Thanks, I'll go to the FAQs. ---John
On Sun, Sep 16, 2012 at 3:21 AM, Jingcha Joba wrote:
> John,
>
> BTL refers to Byte Transfer Layer, a framework to send/receive point to
> point messages on different networks. It has several components
> (implementations) like openib, tcp, mx, shared mem, etc.
John,
BTL refers to Byte Transfer Layer, a framework to send/receive point-to-point
messages on different networks. It has several components (implementations) like
openib, tcp, mx, shared mem, etc.
^openib means "not" to use the openib component for p2p messages.
On a side note, do you have an RDMA supporting device (Infiniband/RoCE/iWarp)?
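To see which BTL components your build actually provides, and the two selection
styles that appear in this thread (exclusive with ^, inclusive as a list):
ompi_info | grep 'MCA btl'                # list available BTL components
mpiexec --mca btl ^openib -n 4 ./a.out    # use everything except openib
mpiexec --mca btl self,tcp -n 4 ./a.out   # use only self and tcp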
BTW, I looked up the -mca option:
-mca|--mca <arg0> <arg1>
Pass context-specific MCA parameters; they are
considered global if --gmca is not used and only
one context is specified (arg0 is the parameter
name; arg1 is the parameter value)
Could you exp
BINGO! That did it. Thanks. ---John
On Sat, Sep 15, 2012 at 9:32 PM, Ralph Castain wrote:
> No - the mca param has to be specified *before* your executable
>
> mpiexec -mca btl ^openib -n 4 ./a.out
>
> Also, note the space between "btl" and "^openib"
>
>
> On Sep 15, 2012, at 5:45 PM, John Chludzinski wrote:
No - the mca param has to be specified *before* your executable
mpiexec -mca btl ^openib -n 4 ./a.out
Also, note the space between "btl" and "^openib"
On Sep 15, 2012, at 5:45 PM, John Chludzinski
wrote:
> Is this what you intended(?):
>
> $ mpiexec -n 4 ./a.out -mca btl^openib
>
> librdmacm: couldn't read ABI version.
Is this what you intended(?):
$ mpiexec -n 4 ./a.out -mca btl^openib
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
--
[[5991,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Try adding "-mca btl ^openib" to your cmd line and see if that cleans it up.
On Sep 15, 2012, at 12:44 PM, John Chludzinski
wrote:
> There was a bug in the code. So now I get this, which is correct but how do
> I get rid of all these ABI, CMA, etc. messages?
>
> $ mpiexec -n 4 ./a.out
> librdmacm: couldn't read ABI version.
There was a bug in the code. So now I get this, which is correct but how
do I get rid of all these ABI, CMA, etc. messages?
$ mpiexec -n 4 ./a.out
librdmacm: couldn't read ABI version.
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
librdmacm: assuming: 4
BTW, here is the example code:
program scatter
include 'mpif.h'
integer, parameter :: SIZE=4
integer :: numtasks, rank, sendcount, recvcount, source, ierr
real :: sendbuf(SIZE,SIZE), recvbuf(SIZE)
! Fortran stores this array in column major order, so the
! scatter will actually scatter columns, not rows
data sendbuf /1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0/
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
source = 0; sendcount = SIZE; recvcount = SIZE
call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf, recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
print *, 'rank =', rank, ' results:', recvbuf
call MPI_FINALIZE(ierr)
end program scatter
# export LD_LIBRARY_PATH
# mpiexec -n 1 printenv | grep PATH
LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
Ah - note that there is no LD_LIBRARY_PATH in the environment. That's the
problem
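The fix is a one-liner in the shell profile or the job script itself (the path
matches the one used in this thread):
export LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
Assigning the variable without export, as in the original message further
down, makes it visible to the current shell only, not to the processes
mpiexec launches.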
On Sep 15, 2012, at 11:19 AM, John Chludzinski
wrote:
> $ which mpiexec
> /usr/lib/openmpi/bin/mpiexec
>
> # mpiexec -n 1 printenv | grep PATH
> PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin
On 15.09.2012 at 19:00, John Chludzinski wrote:
> I installed OpenMPI (I have a simple dual core AMD notebook with Fedora 16)
> via:
>
> # yum install openmpi
> # yum install openmpi-devel
> # mpirun --version
> mpirun (Open MPI) 1.5.4
>
> I added:
>
> $ PATH=PATH=/usr/lib/openmpi/bin/:$PATH
$ which mpiexec
/usr/lib/openmpi/bin/mpiexec
# mpiexec -n 1 printenv | grep PATH
PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
WINDOWPATH=1
O
Couple of things worth checking:
1. verify that you executed the "mpiexec" you think you did - a simple "which
mpiexec" should suffice
2. verify that your environment is correct by "mpiexec -n 1 printenv | grep
PATH". Sometimes the ld_library_path doesn't carry over like you think it should
O
I installed OpenMPI (I have a simple dual core AMD notebook with Fedora 16)
via:
# yum install openmpi
# yum install openmpi-devel
# mpirun --version
mpirun (Open MPI) 1.5.4
I added:
$ PATH=PATH=/usr/lib/openmpi/bin/:$PATH
$ LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
Then:
$ mpif90 ex1.f95
$ mpiexec
On Thu, Jan 13, 2011 at 05:34:48PM -0800, Tena Sakai wrote:
> Hi Gus,
>
> > Did you speak to the Rmpi author about this?
>
> No, I haven't, but here's what the author wrote:
> https://stat.ethz.ch/pipermail/r-sig-hpc/2009-February/000104.html
> in which he states:
> ...The way of spawning R slaves under LAM is not working any more under OpenMPI.
Hi Gus,
> Did you speak to the Rmpi author about this?
No, I haven't, but here's what the author wrote:
https://stat.ethz.ch/pipermail/r-sig-hpc/2009-February/000104.html
in which he states:
...The way of spawning R slaves under LAM is not working
any more under OpenMPI. Under LAM, one just
Tena Sakai wrote:
Fantastic, Gus! Now I think I've got the framework pretty much done.
The rest is to work on the 'problem solving' end with R.
Many thanks for your insight and kindness. I really appreciate it.
Regards,
Tena Sakai
tsa...@gallo.ucsf.edu
Hi Tena
I'm glad that it helped somebody at the
Fantastic, Gus! Now I think I've got the framework pretty much done.
The rest is to work on the 'problem solving' end with R.
Many thanks for your insight and kindness. I really appreciate it.
Regards,
Tena Sakai
tsa...@gallo.ucsf.edu
On 1/13/11 2:40 PM, "Gus Correa" wrote:
> Tena Sakai wrote:
>> Hi,
Tena Sakai wrote:
Hi,
I have a script I call fib.r. It looks like:
#!/usr/bin/env r
fib <- function( n ) {
a <- 0
b <- 1
for ( i in 1:n ) {
t <- b
b <- a
a <- a + t
Hi,
I have a script I call fib.r. It looks like:
#!/usr/bin/env r
fib <- function( n ) {
a <- 0
b <- 1
for ( i in 1:n ) {
t <- b
b <- a
a <- a + t
}
Ralph Castain wrote:
On Jan 12, 2011, at 12:54 PM, Tena Sakai wrote:
Hi Siegmar,
Many thanks for your reply.
I have tried the man pages you mention, but one hurdle I am running into
is the orte_hosts page: I can't find the specification of the fields for
the file. I see an example:
dummy1 slots=4
On Jan 12, 2011, at 12:54 PM, Tena Sakai wrote:
> Hi Siegmar,
>
> Many thanks for your reply.
>
> I have tried the man pages you mention, but one hurdle I am running into
> is the orte_hosts page: I can't find the specification of the fields for
> the file. I see an example:
>
> dummy1 slots=4
> dum
Hi Siegmar,
Many thanks for your reply.
I have tried the man pages you mention, but one hurdle I am running into
is the orte_hosts page: I can't find the specification of the fields for
the file. I see an example:
dummy1 slots=4
dummy2 slots=4
dummy3 slots=4
dummy4 slots=4
dummy5 slots=4
I
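For reference, a hostfile in that format is simply passed to mpirun; a minimal
sketch, where myhosts is a hypothetical file containing lines like the above:
mpirun -np 8 --hostfile myhosts ./a.out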
You might be confusing LAM/MPI with Open MPI -- they're two different software
code bases. They both implement the MPI standard, but they're entirely
different software projects. Indeed, all the LAM/MPI developers (including me)
abandoned LAM/MPI several years ago and went to work on Open MPI
So are you trying to start an MPI job in which one process is one executable
and the other process(es) are something else? If so, you probably want
to use a multiple app context. If you look at FAQ question 7, "How do I
run an MPMD MPI job?", at http://www.open-mpi.org/faq/?category=running,
this sho
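A minimal sketch of such a multiple app context launch (master and worker are
hypothetical executable names; the colon separates the app contexts):
mpirun -np 1 ./master : -np 4 ./worker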
Hi,
> What I want is to spawn a bunch of R slaves to other machines on
> the network. I can spawn R slaves, as many as I like, to the local
> machine, but I don't know how to do this with machines on the
> network. That's what the hosts parameter of mpi.spawn.Rslaves()
> enables me to do, I think. I
Hi,
Thanks for your reply.
I am afraid your terse response doesn’t shed much light. What I need is the
“hosts” parameter I can pass to the mpi.spawn.Rslaves() function. Can you
explain, or better yet give an example of, how I can get this via mpirun?
Looking at mpirun man page, I found an example:
m
You can use mpirun.
On Mon, Jan 10, 2011 at 8:04 PM, Tena Sakai wrote:
> Hi,
>
> I am an mpi newbie. My open MPI is v 1.4.3, which I compiled
> on a linux machine.
>
> I am using a language called R, which has an mpi interface/package.
> It appears that it is happy, on the surface, with the op
Hi,
I am an mpi newbie. My open MPI is v 1.4.3, which I compiled
on a linux machine.
I am using a language called R, which has an mpi interface/package.
It appears that it is happy, on the surface, with the open MPI I installed.
There is an R function called mpi.spawn.Rslaves(). An argument to
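The terse reply above ("You can use mpirun") presumably amounts to starting R
itself under mpirun with a hostfile that lists the remote machines, so the
spawned slaves can land there. A minimal sketch, with myhosts and fib.r as
hypothetical file names (R's --file= option runs a script non-interactively):
mpirun -np 1 --hostfile myhosts R --no-save --file=fib.r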
I fixed the OOB. I also mucked some things up with its interface
that I need to undo :). Anyway, I'll have a look at fixing up the
TCP component in the next day or two.
Brian
On May 10, 2007, at 6:07 PM, Jeff Squyres wrote:
Brian --
Didn't you add something to fix exactly this probl
Good to know. This suggests that VASP, built properly against Open
MPI, should work; perhaps there's some secret sauce in the
Makefile somewhere...? Off list, someone cited the following to me:
-
Also, VASP has a forum for things like this:
http://cms.mpi.univie.ac.at/vasp-for
On Thu, 2007-05-10 at 20:07 -0400, Jeff Squyres wrote:
> Brian --
>
> Didn't you add something to fix exactly this problem recently? I
> have a dim recollection of seeing a commit go by about this...?
>
> (I advised Steve in IM to use --disable-ipv6 in the meantime)
>
Yes, disabling it worked.
Brian --
Didn't you add something to fix exactly this problem recently? I
have a dim recollection of seeing a commit go by about this...?
(I advised Steve in IM to use --disable-ipv6 in the meantime)
On May 10, 2007, at 1:25 PM, Steve Wise wrote:
I'm trying to run a job specifically over
I'm trying to run a job specifically over tcp and the eth1 interface.
It seems to be barfing on trying to listen via ipv6. I don't want ipv6.
How can I disable it?
Here's my mpirun line:
[root@vic12-10g ~]# mpirun --n 2 --host vic12,vic20 --mca btl self,tcp -mca
btl_tcp_if_include eth1 /root/IM
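As the replies above note, there was no runtime switch for this at the time;
the workaround is to rebuild Open MPI without IPv6 support. A minimal sketch
(the prefix is illustrative):
./configure --disable-ipv6 --prefix=/usr/local
make all install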
I have previously been running parallel VASP happily with an old,
prerelease version of OpenMPI:
[terry@nocona Vasp.4.6-OpenMPI]$
head /home/terry/Install_trees/OpenMPI-1.0rc6/config.log
This file contains any messages produced by compilers while
running configure, to aid debugging if configure
Thank you, Jeff, very much for your efforts and help.
On 5/9/07, Jeff Squyres wrote:
I have mailed the VASP maintainer asking for a copy of the code.
Let's see what happens.
On May 9, 2007, at 2:44 PM, Steven Truong wrote:
> Hi, Jeff. Thank you very much for looking into this issue. I am
> afr
I have mailed the VASP maintainer asking for a copy of the code.
Let's see what happens.
On May 9, 2007, at 2:44 PM, Steven Truong wrote:
Hi, Jeff. Thank you very much for looking into this issue. I am
afraid that I cannot give you the application/package because it is
commercial software.
Hi, Jeff. Thank you very much for looking into this issue. I am
afraid that I cannot give you the application/package because it is
commercial software. I believe that a lot of people are using this
VASP software package http://cms.mpi.univie.ac.at/vasp/.
My current environment uses MPICH
Can you send a simple test that reproduces these errors?
I.e., if there's a single, simple package that you can send
instructions on how to build, it would be most helpful if we could
reproduce the error (and therefore figure out how to fix it).
Thanks!
On May 9, 2007, at 2:19 PM, Steven Truong wrote:
Oh, no. I tried with ACML and had the same set of errors.
Steven.
On 5/9/07, Steven Truong wrote:
Hi, Kevin and all. I tried with the following:
./configure --prefix=/usr/local/openmpi-1.2.1 --disable-ipv6
--with-tm=/usr/local/pbs --enable-mpirun-prefix-by-default
--enable-mpi-f90 --with-t
Hi, Kevin and all. I tried with the following:
./configure --prefix=/usr/local/openmpi-1.2.1 --disable-ipv6
--with-tm=/usr/local/pbs --enable-mpirun-prefix-by-default
--enable-mpi-f90 --with-threads=posix --enable-static
and added mpi.o in my VASP makefile, but I still got errors.
I forg
Thanks, Kevin and Brook, for replying to my question. I am going to try
out what Kevin suggested.
Steven.
On 5/9/07, Kevin Radican wrote:
Hi,
We use VASP 4.6 in parallel with Open MPI 1.1.2 without any problems on
x86_64 with openSUSE, compiled with gcc and Intel Fortran, and use
Torque PBS.
Hi,
We use VASP 4.6 in parallel with Open MPI 1.1.2 without any problems on
x86_64 with openSUSE, compiled with gcc and Intel Fortran, and use
Torque PBS.
I used a standard configure to build Open MPI, something like:
./configure --prefix=/usr/local --enable-static --with-threads
--with-tm=/usr/loc
Steven,
We run VASP on both Linux (PGI compilers) and Mac OS X (xlf). I am sad
to report that VASP did not work with Open MPI the last time I tried
(1.1.1); the errors you reported are the same ones I saw. For the time
being, VASP (version 4) works only with LAM and MPICH-1.
If you have insight into th
Hi, all. I am new to Open MPI and after initial setup I tried to run
my app but got the following errors:
[node07.my.com:16673] *** An error occurred in MPI_Comm_rank
[node07.my.com:16673] *** on communicator MPI_COMM_WORLD
[node07.my.com:16673] *** MPI_ERR_COMM: invalid communicator
[node07.my.c