Re: [OMPI users] MPIRUN + Environment Variable
On 09/29/11 20:54, Xin Tong wrote:
> I need to set up some environment variables before I run my application (appA). I am currently using "mpirun -np 1 -host socrates appA" (socrates is another machine). Before appA runs, it expects some environment variables to be set up. How do I do that?

% man mpirun
...
To manage files and runtime environment:
...
   -x      Export the specified environment variables to the remote nodes
           before executing the program. Only one environment variable can be
           specified per -x option. Existing environment variables can be
           specified, or new variable names specified with corresponding
           values. For example:
               % mpirun -x DISPLAY -x OFILE=/tmp/out ...

           The parser for the -x option is not very sophisticated; it does
           not even understand quoted values. Users are advised to set
           variables in the environment, and then use -x to export (not
           define) them.
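(Not part of the original thread: a minimal C sketch of how a variable exported with -x becomes visible to the launched process on the remote node. The variable name MY_APP_CONFIG is hypothetical, chosen only for illustration.)

/* Hypothetical sketch: appA reads a variable that was exported via
 *   mpirun -np 1 -host socrates -x MY_APP_CONFIG appA
 * The variable name MY_APP_CONFIG is made up for this example. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* visible in the remote process because -x exported it */
    const char *cfg = getenv("MY_APP_CONFIG");
    printf("MY_APP_CONFIG = %s\n", cfg ? cfg : "(not set)");

    MPI_Finalize();
    return 0;
}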
Re: [OMPI users] VampirTrace integration with VT_GNU_NMFILE environment variable
Hello,

first, please note that the VampirTrace versions integrated in Open MPI v1.5.x and v1.4.x differ, and so do the names of the environment variable for specifying a pre-created symbol list:

Open MPI v1.4.x: VT_NMFILE
Open MPI v1.5.x: VT_GNU_NMFILE

Furthermore, make sure that the environment variable is exported to *all* MPI tasks. To do so, add the '-x' option to your mpirun command:

mpirun -x VT_GNU_NMFILE ...

Regards,
Matthias

On Monday 26 September 2011 3:19:21 you wrote:
> According to the VampirTrace documentation, it is possible to create a symbol list file in advance and set the name of the file in the environment variable VT_GNU_NMFILE. For example, you might do this:
>
> $ nm hello > hello.nm
> $ export VT_GNU_NMFILE="hello.nm"
>
> I have set up a symbol list file as above (with the full path name, of course), but when I run my VT-instrumented program (via mpirun) it appears to ignore the VT_GNU_NMFILE environment variable and runs "nm" automatically on startup (the default behavior). This can be a time-consuming process, so I would prefer to use the pre-created symbol list file.
>
> Can anyone confirm whether the VT_GNU_NMFILE environment variable is supported with the Open MPI integration?
>
> Thanks,
> Rocky
Re: [OMPI users] Role of ethernet interfaces in startup of an Open MPI job using IB
Thanks for the prompt reply!

On Sep 27, 2011, at 6:35 AM, Salvatore Podda wrote:
> We would like to know if the ethernet interfaces play any role in the startup phase of an Open MPI job using InfiniBand. In this case, where can we find some literature on this topic?

Unfortunately, there's not a lot of documentation about this other than people asking questions on this list.

> For the above reason, does anyone on the list know the order/ranking by which the ethernet interfaces will be queried in the case of multiple ones? And which are the rules?
>
> Regards
> Salvatore Podda

IP is used by default during Open MPI startup. Specifically, it is used as our "out of band" communication channel for things like stdin/stdout/stderr redirection, launch command relaying, process control, etc. The OOB channel is also used by default for bootstrapping IB queue pairs. To clarify, note that these are two different things:

1. the out-of-band (OOB) channel used for process control, std* routing, etc.
2. bootstrapping IB queue pairs

You can change the IB QP bootstrapping to use the OpenFabrics RDMA communications manager (vs. our OOB channel) with the following:

mpirun --mca btl_openib_cpc_include rdmacm ...

See if that helps (although the OF RDMA CM has its own scalability issues, also associated with ARP). If your cluster is large, you might want to check out the section of our FAQ about large clusters:

http://www.open-mpi.org/faq/?category=large-clusters

I don't think there's an entry there yet about this, but it may also be worthwhile to try enabling the "radix" support, a more scalable version of our OOB channel (i.e., the tree across all the support daemons has a much larger radix and is therefore much flatter). Los Alamos recently committed an IB UD OOB channel plugin to our development trunk and is comparing its performance to the radix tree to see if it's worthwhile.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Role of ethernet interfaces in startup of an Open MPI job using IB
On Sep 30, 2011, at 6:29 AM, Salvatore Podda wrote:
> For the above reason, does anyone on the list know the order/ranking by which the ethernet interfaces will be queried in the case of multiple ones? And which are the rules?

They're all used equally.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
On Sep 28, 2011, at 5:02 PM, Blosch, Edwin L wrote:
> ./configure --prefix=/release/cfd/openmpi-intel --without-tm --without-sge --without-lsf --without-psm --without-portals --without-elan --without-slurm --without-loadleveler --without-libnuma --enable-mpirun-prefix-by-default --enable-contrib-no-build=vt --enable-mca-no-build=maffinity --disable-per-user-config-files --disable-io-romio --enable-static --disable-shared --without-openib CXX=/appserv/intel/cce/10.1.021/bin/icpc CC=/appserv/intel/cce/10.1.021/bin/icc 'CFLAGS= -O2' 'CXXFLAGS= -O2' F77=/appserv/intel/fce/10.1.021/bin/ifort 'FFLAGS=-D_GNU_SOURCE -traceback -O2' FC=/appserv/intel/fce/10.1.021/bin/ifort 'FCFLAGS=-D_GNU_SOURCE -traceback -O2' 'LDFLAGS= -static-intel'

The weird thing here is that I am unable to replicate this issue. :-\

I thought that if I tried essentially the same configure line as above, I should see the same issue, because I have libnuma.so and no libnuma.a. But it worked fine (i.e., OMPI built and installed fine, and I'm able to compile/link MPI applications just fine). Huh.

> The error messages upon linking the application are unchanged:
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_alloc_membind':
>> topology-linux.c:(.text+0x1da): undefined reference to `mbind'
>
> Re: NUMA: It appears there is a /usr/lib64/libnuma.so but no static version. There is /usr/include/numa.h and /usr/include/numaif.h.
>
> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?

Sorry, "make V=1" is part of OMPI's build system. If you "make V=1" in the v1.5 (and later) OMPI, it'll show you the whole compile line instead of the abbreviated output.

> Thanks,
>
> Ed
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
> Sent: Wednesday, September 28, 2011 11:05 AM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'
>
> Yowza; that sounds like a configury bug. :-(
>
> What line were you using to configure Open MPI? Do you have libnuma installed? If so, do you have the .h and .so files? Do you have the .a file?
>
> Can you send the last few lines of output from a failed "make V=1" in that tree? (it'll show us the exact commands used to compile/link, etc.)
>
> On Sep 28, 2011, at 11:55 AM, Blosch, Edwin L wrote:
>
>> I am getting some undefined references in building OpenMPI 1.5.4 and I would like to know how to work around it.
>>
>> The errors look like this:
>>
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_alloc_membind':
>> topology-linux.c:(.text+0x1da): undefined reference to `mbind'
>> topology-linux.c:(.text+0x213): undefined reference to `mbind'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_set_area_membind':
>> topology-linux.c:(.text+0x414): undefined reference to `mbind'
>> topology-linux.c:(.text+0x46c): undefined reference to `mbind'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_get_thisthread_membind':
>> topology-linux.c:(.text+0x4ff): undefined reference to `get_mempolicy'
>> topology-linux.c:(.text+0x5ff): undefined reference to `get_mempolicy'
>> /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_set_thisthread_membind':
>> topology-linux.c:(.text+0x7b5): undefined reference to `migrate_pages'
>> topology-linux.c:(.text+0x7e9): undefined reference to `set_mempolicy'
>> topology-linux.c:(.text+0x831): undefined reference to `set_mempolicy'
>> make: *** [main] Error 1
>>
>> Some configure output that is probably relevant:
>>
>> checking numaif.h usability... yes
>> checking numaif.h presence... yes
>> checking for numaif.h... yes
>> checking for set_mempolicy in -lnuma... yes
>> checking for mbind in -lnuma... yes
>> checking for migrate_pages in -lnuma... yes
>>
>> The FAQ says that I should have to give --with-libnuma explicitly, but I did not do that. Is there a problem with configure? Or the FAQ? Or perhaps the system has a configuration peculiarity?
>>
>> On another system, the configure output is different, and there are no unresolved references:
>>
>> checking numaif.h usability... no
>> checking numaif.h presence... no
>> checking for numaif.h... no
>>
>> What is the configure option that will make the unresolved references go away?
>>
>> Thanks,
>>
>> Ed

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
On Sep 29, 2011, at 12:45 PM, Blosch, Edwin L wrote:
> If I add --without-hwloc in addition to --without-libnuma, then it builds. Is that a reasonable thing to do? Is there a better workaround? This 'hwloc' module looks like it might be important.

As a note of explanation: hwloc is effectively our replacement for libnuma. You might want to check out hwloc (the standalone software package) -- it has a CLI and is quite useful for administering servers, even outside of an HPC environment:

http://www.open-mpi.org/projects/hwloc/

hwloc may use libnuma under the covers; that's where this issue is coming from (i.e., OMPI may still use libnuma -- it's just now doing so indirectly instead of directly).

> For what it's worth, if there's something wrong with my configure line, let me know what to improve. Otherwise, as weird as "--enable-mca-no-build=maffinity --disable-io-romio --enable-static --disable-shared" may look, I am not trying to build fully static binaries. I have an unavoidable need to build OpenMPI on certain machines and then transfer the executables to other machines that are compatible but not identical, and over the years these have proven to be the minimal set of configure flags necessary to make that possible. I may revisit these choices at some point, but if they are supposed to work, then I'd rather just keep using them.

Your configure line looks fine to me.

FWIW/heads up: in the 1.7 series, we're going to be ignoring the $F77 and $FFLAGS variables; we'll *only* be using $FC and $FCFLAGS. There's still plenty of time before this hits mainstream, but I figured I'd let you know it's coming. :-)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
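(Not from the original thread: alongside the CLI tools mentioned above, the standalone hwloc package also exposes a C API. A minimal sketch, assuming an hwloc 1.x installation is available and the program is linked with -lhwloc:)

/* Minimal sketch of the standalone hwloc C API (hwloc 1.x assumed),
 * shown for illustration only; the same library underlies Open MPI's
 * processor/memory affinity support. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;

    hwloc_topology_init(&topology);   /* allocate a topology object   */
    hwloc_topology_load(topology);    /* discover the current machine */

    int cores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
    int pus   = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
    printf("cores: %d, hardware threads: %d\n", cores, pus);

    hwloc_topology_destroy(topology);
    return 0;
}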
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
I think the issue here is that it's linking the *MPI application* that is causing the problem. Is that right?

If so, can you send your exact application compile line, and the output of that compile line with "--showme" at the end?

On Sep 29, 2011, at 4:24 PM, Brice Goglin wrote:
> On 28/09/2011 23:02, Blosch, Edwin L wrote:
>> Jeff,
>>
>> I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable.
>
> This function is likely available... in the dynamic version of libnuma (that's why configure is happy), but make is probably trying to link with the static version, which isn't available on your machine. That's my guess, at least.
>
>> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?
>
> Instead of doing
>   ./configure ...
>   make
> do
>   ./configure ...
>   make V=1
>
> It will make the output more verbose. Once you get the failure, please send the last 15 lines or so. We will look at these verbose lines to understand how things are being compiled (which linker flags, which libraries, ...).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'
Thank you for all this information. Your diagnosis is totally right. I actually sent e-mail yesterday but apparently it never got through :<

It IS the MPI application that is failing to link, not OpenMPI itself; my e-mail was not well written; sorry Brice.

The situation is this: I am trying to compile using an OpenMPI 1.5.4 that was built to be rooted in /release, but it is not placed there yet (testing); it is currently under /builds/release. I have set OPAL_PREFIX in the environment, with the intention of helping the compiler wrappers work right. Under /release, I currently have OpenMPI 1.4.3, whereas the OpenMPI under /builds/release is 1.5.4.

What I am getting is this: the mpif90 wrapper (under /builds/release/openmpi/bin) puts -I/release instead of -I/builds/release, but it includes -L/builds/release. So I'm getting the headers from 1.4.3 when compiling, but the libmpi from 1.5.4 when linking.

I did a quick "move 1.4.3 out of the way and put 1.5.4 over to /release where it belongs" test, and my application did link without errors, so I think that confirms the nature of the problem.

Is it a bug that mpif90 didn't pay attention to OPAL_PREFIX in the -I but did use it in the -L?

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, September 30, 2011 7:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'

I think the issue here is that it's linking the *MPI application* that is causing the problem. Is that right?

If so, can you send your exact application compile line, and the output of that compile line with "--showme" at the end?

On Sep 29, 2011, at 4:24 PM, Brice Goglin wrote:
> On 28/09/2011 23:02, Blosch, Edwin L wrote:
>> Jeff,
>>
>> I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable.
>
> This function is likely available... in the dynamic version of libnuma (that's why configure is happy), but make is probably trying to link with the static version, which isn't available on your machine. That's my guess, at least.
>
>> I don't understand about make V=1. What tree? Somewhere in the OpenMPI build, or in the application compilation itself? Is "V=1" something in the OpenMPI makefile structure?
>
> Instead of doing
>   ./configure ...
>   make
> do
>   ./configure ...
>   make V=1
>
> It will make the output more verbose. Once you get the failure, please send the last 15 lines or so. We will look at these verbose lines to understand how things are being compiled (which linker flags, which libraries, ...).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
Hi,

I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.

But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.

What is the possible reason? I did not change anything in the program.

Any help is really appreciated.

Thanks
Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
You can use a debugger (just gdb will do, no TotalView needed) to find out which MPI send & receive calls are hanging the code on the distributed cluster, and see if the send & receive pair is hitting a problem described at:

Deadlock avoidance in your MPI programs:
http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html

Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net

Wikipedia Commons
http://commons.wikimedia.org/wiki/User:Raysonho


On Fri, Sep 30, 2011 at 11:06 AM, Jack Bryan wrote:
> Hi,
>
> I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.
>
> But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.
>
> What is the possible reason? I did not change anything in the program.
>
> Any help is really appreciated.
>
> Thanks

==========================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
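(Not from the original thread: a minimal C sketch of the classic deadlock pattern the linked page describes, and one safe alternative. It assumes exactly two ranks and is for illustration only.)

/* Classic unsafe exchange: if both ranks call a blocking MPI_Send first,
 * each may wait for the other's MPI_Recv and the program hangs once the
 * message is too large for eager buffering.  MPI_Sendrecv avoids this by
 * letting the library pair the send and receive. */
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)
static double sendbuf[N], recvbuf[N];

int main(int argc, char **argv)
{
    int rank, peer;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                       /* assumes exactly 2 ranks */

    /* Unsafe ordering (may deadlock on large messages):
     *   MPI_Send(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
     *   MPI_Recv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
     *
     * Safe alternative: */
    MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, 0,
                 recvbuf, N, MPI_DOUBLE, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d: exchange complete\n", rank);
    MPI_Finalize();
    return 0;
}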
Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
Thanks,

I am using non-blocking MPI_Isend to send out messages and blocking MPI_Recv to get them. Each MPI_Isend uses a distinct buffer to hold its message, and the buffer is not changed until the message is received. Then the sender process waits for the MPI_Isend to finish.

Before this message is sent out, a header message (describing how much data and what data will be sent in the following MPI_Isend) is sent out in the same way; the header messages are received fine. Why can the following message (which has a larger size) not be received?

Any help is really appreciated.

> Date: Fri, 30 Sep 2011 11:33:16 -0400
> From: raysonlo...@gmail.com
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI process cannot do send-receive message correctly on a distributed memory cluster
>
> You can use a debugger (just gdb will do, no TotalView needed) to find out which MPI send & receive calls are hanging the code on the distributed cluster, and see if the send & receive pair is hitting a problem described at:
>
> Deadlock avoidance in your MPI programs:
> http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html
>
> Rayson
>
> =================================
> Grid Engine / Open Grid Scheduler
> http://gridscheduler.sourceforge.net
>
> Wikipedia Commons
> http://commons.wikimedia.org/wiki/User:Raysonho
>
>
> On Fri, Sep 30, 2011 at 11:06 AM, Jack Bryan wrote:
> > Hi,
> >
> > I have an Open MPI program which works well on a Linux shared-memory multicore (2 x 6 cores) machine.
> >
> > But it does not work well on a distributed cluster with Linux Open MPI. I found that a process sends out some messages to other processes, which cannot receive them.
> >
> > What is the possible reason? I did not change anything in the program.
> >
> > Any help is really appreciated.
> >
> > Thanks
>
> ==========================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
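(Not the original poster's code: a hedged C sketch of the pattern described above, i.e. a header message followed by the payload via MPI_Isend, received with blocking MPI_Recv. The buffer sizes and tags are made up. The key points are that every MPI_Isend request must eventually be completed with MPI_Wait/MPI_Waitall, and the send buffers must stay untouched until then.)

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                                   /* sender */
        int header = 1000;                             /* payload element count */
        double *payload = calloc(header, sizeof(double));
        MPI_Request reqs[2];

        MPI_Isend(&header, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(payload, header, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &reqs[1]);
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);     /* buffers reusable only after this */
        free(payload);
    } else if (rank == 1) {                            /* receiver */
        int header;
        MPI_Recv(&header, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double *payload = malloc(header * sizeof(double));
        MPI_Recv(payload, header, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        free(payload);
    }

    MPI_Finalize();
    return 0;
}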
[OMPI users] problem running with RoCE over 10GbE
Encountered a problem when trying to run Open MPI 1.5.4 with RoCE over a 10GbE fabric. Got this run-time error:

An invalid CPC name was specified via the btl_openib_cpc_include MCA parameter.
  Local host: atl3-14
  btl_openib_cpc_include value: rdmacm
  Invalid name: rdmacm
  All possible valid names: oob,xoob
--------------------------------------------------------------------------
[atl3-14:07184] mca: base: components_open: component btl / openib open function failed
[atl3-12:09178] mca: base: components_open: component btl / openib open function failed

Used these options to mpirun: "--mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm -mca btl_openib_if_include mlx4_0:2"

We have a Mellanox LOM with two ports; the first is an IB port, the second is a 10GbE port. Running over the IB port and TCP over the 10GbE port both work fine.

Built Open MPI with the option "--enable-openib-rdmacm". Our system has OFED 1.5.2 with librdmacm-1.0.13-1.

I noticed this output from the configure script:

checking rdma/rdma_cma.h usability... no
checking rdma/rdma_cma.h presence... no
checking for rdma/rdma_cma.h... no
checking whether IBV_LINK_LAYER_ETHERNET is declared... yes
checking if RDMAoE support is enabled... yes
checking for infiniband/driver.h... yes
checking if ConnectX XRC support is enabled... yes
checking if dynamic SL is enabled... no
checking if OpenFabrics RDMACM support is enabled... no

Are we missing a build option or a piece of software? Config.log and output from "ompi_info --all" attached.

% ibv_devinfo
hca_id: mlx4_0
        transport:        InfiniBand (0)
        fw_ver:           2.9.1000
        node_guid:        78e7:d103:0021:4464
        sys_image_guid:   78e7:d103:0021:4467
        vendor_id:        0x02c9
        vendor_part_id:   26438
        hw_ver:           0xB0
        board_id:         HP_020003
        phys_port_cnt:    2
        port: 1
                state:        PORT_ACTIVE (4)
                max_mtu:      2048 (4)
                active_mtu:   2048 (4)
                sm_lid:       34
                port_lid:     11
                port_lmc:     0x00
                link_layer:   IB
        port: 2
                state:        PORT_ACTIVE (4)
                max_mtu:      2048 (4)
                active_mtu:   1024 (3)
                sm_lid:       0
                port_lid:     0
                port_lmc:     0x00
                link_layer:   Ethernet

% /sbin/ifconfig
eth0    Link encap:Ethernet  HWaddr 78:E7:D1:21:44:60
        inet addr:16.113.180.147  Bcast:16.113.183.255  Mask:255.255.252.0
        inet6 addr: fe80::7ae7:d1ff:fe21:4460/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:1861763 errors:0 dropped:0 overruns:0 frame:0
        TX packets:1776402 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:712448939 (679.4 MiB)  TX bytes:994111004 (948.0 MiB)
        Memory:fb9e-fba0

eth2    Link encap:Ethernet  HWaddr 78:E7:D1:21:44:65
        inet addr:10.10.0.147  Bcast:10.10.0.255  Mask:255.255.255.0
        inet6 addr: fe80::78e7:d100:121:4465/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:8519814 errors:0 dropped:0 overruns:0 frame:0
        TX packets:8555715 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes:12370127778 (11.5 GiB)  TX bytes:12372246315 (11.5 GiB)

ib0     Link encap:InfiniBand  HWaddr 80:00:00:4D:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
        inet addr:192.168.0.147  Bcast:192.168.0.255  Mask:255.255.255.0
        inet6 addr: fe80::7ae7:d103:21:4465/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:16384  Metric:1
        RX packets:1989 errors:0 dropped:0 overruns:0 frame:0
        TX packets:208 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:256
        RX bytes:275196 (268.7 KiB)  TX bytes:19202 (18.7 KiB)

lo      Link encap:Local Loopback
        inet addr:127.0.0.1  Mask:255.0.0.0
        inet6 addr: ::1/128 Scope:Host
        UP LOOPBACK RUNNING  MTU:16436  Metric:1
        RX packets:42224 errors:0 dropped:0 overruns:0 frame:0
        TX packets:42224 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:0
        RX bytes:3115668 (2.9 MiB)  TX bytes:3115668 (2.9 MiB)
Thanks,
-Jeff

Jeff Konz  jeffrey.k...
Re: [OMPI users] Proper way to stop MPI process
SIGTERM should work - what version are you using?

Ralph

Sent from my iPad

On Sep 28, 2011, at 1:40 PM, Xin Tong wrote:
> I am wondering what the proper way is to stop an mpirun process and the child processes it created. I tried sending SIGTERM, but it does not respond to it. What kind of signal should I be sending to it?
>
> Thanks
>
> Xin
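(Not from the original thread: when mpirun receives SIGTERM it is expected to terminate the job, relaying the termination to the ranks it launched. If the goal is for the application itself to shut down cleanly at that point, a handler along these lines can be used - a minimal sketch, assuming the real work happens in a loop that can poll a flag.)

#include <mpi.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested = 0;

/* Only async-signal-safe work (setting a flag) is done in the handler. */
static void handle_term(int sig)
{
    (void)sig;
    stop_requested = 1;          /* checked by the main loop below */
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    signal(SIGTERM, handle_term);

    while (!stop_requested) {    /* stand-in for the real work loop */
        sleep(1);
    }

    printf("rank %d: SIGTERM received, shutting down cleanly\n", rank);
    MPI_Finalize();
    return 0;
}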