Dear Mohamed,
Is there any checkpointing software for interconnects other than
TCP, say IB or Myrinet?
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat Road
Pune 411016 (Mah) INDIA
during mpi launch time to select mthca0 or
mthca1?
Any help is appreciated. By the way, I just checked MVAPICH and the
feature is available there.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat
Thanks Ralph,
I found the MCA parameter. It is btl_openib_max_btls, which
controls the available HCAs.
Thanks for helping.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati
.
Since collectives depend heavily on your network architecture and
message size, I would suggest that you first fine-tune your collectives
for your network fabric before running any scientific application.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary
Hi Terry,
I feel hierarchical collectives are slower compared to the tuned ones. I
ran some collective-specific benchmarks in the past, and this is my
impression based on those observations.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary
Hi Craig,
How was the nodefile selected for execution? Was it provided by a
scheduler such as LSF/SGE/PBS, or did you supply it manually?
With WRF, we observed that using sequential nodes (blades in the
same order as in the enclosure) gave us some performance benefit.
Regards
Neeraj
InfiniBand Clos network.
Results were obtained on 12-16 nodes with 8 MPI processes per node.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat Road
Pune 411016 (Mah) INDIA
(O) +91-20
using the supported
thread library on your platform (selected by default during configure, or
use --with-threads).
You can't use the OPAL library, as it is not exported to the outside MPI
programming world.
Regards
Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidia
MPI_Barrier(MPI_COMM_WORLD);
}
MPI_Finalize();
}
Let me know what the error could be. I suspect the error is in MPI
process coordination.
Regards
Neeraj Chourasia
Member of Technical Staff
Computational Research Laboratories Limited
(A wholly Owned Subsidiary of TAT
may
differ from one network topology to another.
In that case, I would suggest that you run benchmark programs with
the second option and fine-tune the MPI collectives to suit your cluster
architecture.
Regards
Neeraj Chourasia
Member of Technical Staff
Computational Research Laboratories
This error goes away as soon as I force mpirun to use TCP for
communication via MCA parameters, but then error a) starts appearing, which
is related to some datatype handling during checkpoint.
Regards
Neeraj Chourasia
Member of Technical Staff
Computational Research Laboratories Limited
(A wholly
Thanks, Pasha, for sharing the IB roadmaps with us. But I am more interested in
latency figures, since they often matter more than bit rate.
Are there rough, if not exact, latency figures being targeted in the
IB world?
Regards
Neeraj Chourasia
Member of Technical Staff
If you are using a scheduler like PBS or SGE with MPI, there are options
called prolog and epilog, where you can supply scripts that perform the copy
operation. These scripts are run before and after job execution, as the
names suggest.
Without a scheduler, I would have to check whether it can be done in MPI itself.
T
Hello Everyone,
I was checking the development version from svn and found that support for
libnbc is going to come in the next release. I tried compiling it, but failed.
Could someone suggest how to get it compiled?
When I made changes to the configure script (basically added some
flags)
,
then how does the compute node get that information during execution?
Does it use OOB for it?
-Neeraj
communication. Any help in this direction would be appreciated.
-Neeraj
done it for
Ethernet/Gigabit Ethernet and IPoIB, though of course at an experimental stage.
Actually, I want to contribute this to OpenMPI and need help with that.
-Neeraj
On Thu, 11 Oct 2007 12:01:39 +0200 Open MPI Users wrote:
Hi Neeraj,
> Could anyone tell me the important tun
Yes, the buffer was being re-used. No, we didn't try to benchmark it with
NetPIPE or other tools, but the program was pretty simple. Do you think I need
to test it with bigger chunks (>8MB) for communication?
We also tried manipulating eager_limit and min_rdma_size, but without success.
Neeraj
On Fri,
having a Makefile created on running the configure script, but a few of them,
like runtime, don't have a Makefile.
Please help me compile it.
-Neeraj
, and to my surprise, the old version performs better in both scenarios.
Could anyone give me the reason for this?
I repeated the above point-to-point tests between all sets of nodes, but the
results were the same :(
-Neeraj
Hi,
Please check that the following things are correct:
1) The array bounds are equal, meaning "my_x" and "size_y" have the same value
on all nodes.
2) The nodes are homogeneous. To check that, you could choose a different node
as root and run the program.
-Neeraj
On Fri, 26 Oct 2007 10:13:15 +0500
(
messages like local
protocol error, flush error, invalid request error, and local length error.
Any help would be appreciated.
-Neeraj
of reference,
the program works fine if we force OpenMPI to select the TCP interconnect
using --mca btl tcp,self.
-Neeraj
#include
#include
#include
#include
#include
#include "time.h"
#include
#define MAX 100
int main(int argc, char *argv[])
{
int required = MPI_THREAD_MULTIP
, at 12:17 AM, Neeraj Chourasia wrote:
> Hi folks,
>
> I have been seeing some nasty behaviour in MPI_Send/Recv
> with large dataset (8 MB), when used with OpenMP and Openmpi
> together with IB Interconnect. Attached is a program.
>
should be considerate about?
-Neeraj
ms of checkpointing. But I am
pretty sure that once v1.3 is released, it will help the HPC community a lot.
I can find the development trunk version, but I am more interested in
a production release version.
-Neeraj
. The problem comes when the data size increases and OpenMPI starts
splitting it.
I think even with bigger sizes, the program works if the interconnect is TCP,
but fails on IB. So on IB, you can run your program if you set the MCA parameter
mpi_leave_pinned to 1.
Cheers
Neeraj
On Thu, 29 Nov 2007 Brock
Hello everyone,
While going through the collective algorithms, I came across the preprocessor
macro MPI_IN_PLACE, which is (void *)1. It is always being compared against the
source buffer (sbuf). My question is: when would the condition
MPI_IN_PLACE == sbuf be true? As far as I understand, sbuf is the address
Neeraj,
MPI_IN_PLACE is defined by the MPI standard in order to allow the
users to specify that the input and output buffers for the collectives
are the same. Moreover, not all collectives support MPI_IN_PLACE and
for those that support it some strict rules apply. Please read the
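
To make the sentinel comparison discussed in this exchange concrete, here is a minimal, MPI-free sketch in C. The names TOY_IN_PLACE and toy_sum_with_peer are hypothetical, invented for illustration only; this is not Open MPI code, and the real value of MPI_IN_PLACE is implementation-defined. The sketch only shows how a routine can tell "the caller passed a real send buffer" apart from "the caller passed the in-place marker".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel standing in for MPI_IN_PLACE: an address that no
 * valid user buffer is expected to have. It is only compared, never read. */
#define TOY_IN_PLACE ((void *)1)

/* Toy stand-in for a two-party sum reduction (no MPI involved).
 * rbuf receives mine[i] + peer[i]. If the caller passes TOY_IN_PLACE as
 * sbuf, the caller's own contribution is taken from rbuf instead, which
 * mirrors the "input and output buffers are the same" case. */
void toy_sum_with_peer(const void *sbuf, int *rbuf,
                       const int *peer, size_t count)
{
    assert(rbuf != NULL && peer != NULL);

    /* This comparison is true only when the caller passed the sentinel. */
    const int *mine = (sbuf == TOY_IN_PLACE) ? rbuf : (const int *)sbuf;

    for (size_t i = 0; i < count; i++) {
        rbuf[i] = mine[i] + peer[i];
    }
}
```

Calling toy_sum_with_peer(TOY_IN_PLACE, data, peer, n) takes the sentinel branch and overwrites data with the sums, while passing a real array as sbuf leaves that array untouched. In real MPI code the analogous situation arises when the application itself passes MPI_IN_PLACE as the send buffer, for example to MPI_Allreduce.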
I am looking at is that, in large clusters,
mpirun takes a lot of time starting orted (by ssh) on remote nodes. If orte is
already running, hopefully we can save considerable time. Any comments are
appreciated.
-Neeraj
When
I ssh to n101, there is no orted or qrsh_starter running. While checking
its spool file, I came across the following
message:
---Execd spool Error Message-----
|execd|n101|E|n
Hello everyone,
I downloaded the openmpi-1.3 version from the nightly tarballs to check RDMA-CM
support. I am able to compile and install it, but I don't know how to run it,
as there is no documentation provided. Has anyone tried running it with OpenMPI?
My other question is: does OpenMPI 1.3 have
progress-t
Hello,
With openmpi-1.3, a new MCA feature has been introduced, namely
--mca routed binomial. This makes out-of-band communication happen in a
binomial fashion, which reduces the number of sockets opened and hence solves
the open-file issues.
-Neeraj
On Thu, 18 Sep 2008 16:46:23 -0700 Open MPI Users wrote:
I'm