To Whom This May Concern:
I am having problems with an OpenMPI application I am writing on the
Solaris/Intel AMD64 platform. I am using OpenMPI 1.3.2, which I compiled
myself using the Solaris C/C++ compiler.
My application was crashing (signal 11) inside a call to malloc() which my
code was running...
Best I can tell, the remote orted never got executed - it looks to me
like there is something that blocks the ssh from working. Can you get
into another window and ssh to the remote node? If so, can you do a ps
and verify that the orted is actually running there?
mpirun is using the same sh...
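One way to run the check Ralph suggests, from a second terminal (the node
name here is hypothetical):

  ssh remotenode
  ps -ef | grep orted

If no orted shows up in the process list, the launch never reached that node.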
As far as I can tell, both the PATH and LD_LIBRARY_PATH are set correctly. I've tried with the full path to the mpirun executable and using the --prefix command line option. Neither works. The debug output seems to contain a lot of system-specific information (IPs, usernames and such), which I'm a...
Okay, that's one small step forward. You can lock that in by setting
the appropriate MCA parameter in one of two ways:
1. add the following to your default MCA parameter file: btl =
tcp,sm,self (I added the shared memory subsystem as this will help
with performance). You can see how to do...
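For reference, the per-user file Ralph mentions normally lives at
$HOME/.openmpi/mca-params.conf, so a minimal version of his suggestion is:

  # $HOME/.openmpi/mca-params.conf
  btl = tcp,sm,self

The system-wide equivalent is <prefix>/etc/openmpi-mca-params.conf.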
Hi Ankush
Glad to hear that your MPI and cluster project were successful.
I don't know if you would call these "mathematical computation"
or "real life applications" of MPI and clusters, but here are a
few samples I am familiar with (Earth Science):
Weather forecast:
http://www.wrf-model.org/in
Hi,
Yes I'm using ethernet connections. Doing as you suggest removes the
errors generated by running the small test program, but still doesn't
allow programs (including the small test program) to execute on any
node other than the one launching mpirun. If I try to do that, the
command hangs...
On Apr 28, 2009, at 1:29 PM, Ankush Kaul wrote:
I would like to know one more thing: what are real life applications
that I can use the cluster for (except mathematical computation)?
Can I use it for my web server? If yes, then how?
Not really. MPI is just about message passing -- it's fre...
In this instance, OMPI is complaining that you are attempting to use
Infiniband, but no suitable devices are found.
I assume you have Ethernet between your nodes? Can you run this with the
following added to your mpirun cmd line:
-mca btl tcp,self
That will cause OMPI to ignore the Infiniband subsystem...
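A full invocation might look like this (the executable name and process
count are placeholders):

  mpirun -np 4 -mca btl tcp,self ./my_mpi_app

Once that works, the shared memory component can be added back with
-mca btl tcp,sm,self for better on-node performance.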
For what it's worth, the genome assembly software ABySS uses exactly
this system that Jody is describing to represent a directed graph.
http://www.bcgsc.ca/platform/bioinfo/software/abyss
Cheers,
Shaun
jody wrote:
Hi Barnabas
As far as I know, Open-MPI is not a shared memory system.
Using Open-MPI...
Many thanks for your help nonetheless.
Hugh
On 28 Apr 2009, at 17:23, jody wrote:
Hi Hugh
I'm sorry, but I must admit that I have never encountered these
messages,
and I don't know exactly what their cause is.
Perhaps one of the developers can give an explanation?
Jody
On Tue, Apr 28, 2009 at 5:52 PM, Hugh Dickinson wrote:
Hi Eugene and Jody,
thanks for the ideas and elaborate answers. I will look into SysV and
mmap, and find out something. I am not tied to PGAPack, there may be
other PGA libs too... But I guess MPI and SysV/mmap do not cancel each
other out; I just have to know what is running locally and what...
Thanks everyone (esp. Gus and Jeff) for the support and guidance. We are
on the verge of completing our project, which would not have been
possible without all of you.
I would like to know one more thing: what are real life applications that I
can use the cluster for (except mathematical computation)?
Barnabas Debreczeni wrote:
I am using PGAPack as a GA library, and it uses MPI to parallelize
optimization runs. This is how I got to Open MPI.
Let me see if I understand the underlying premise. You want to
parallelize, but there are some large shared tables. There are many
different para...
Hi Hugh
I'm sorry, but I must admit that I have never encountered these messages,
and I don't know exactly what their cause is.
Perhaps one of the developers can give an explanation?
Jody
On Tue, Apr 28, 2009 at 5:52 PM, Hugh Dickinson
wrote:
> Hi again,
>
> I tried a simple mpi c++ program:
>
Hi again,
I tried a simple mpi c++ program:
--
#include <mpi.h>
#include <iostream>
using namespace MPI;
using namespace std;

int main(int argc, char* argv[]) {
    int rank, size;
    Init(argc, argv);
    rank = COMM_WORLD.Get_rank();
    size = COMM_WORLD.Get_size();
    cout << "P:" << rank << " out of " << size << endl;
    Finalize();
    return 0;
}
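For completeness, a typical way to build and launch the test with Open
MPI's wrapper compiler (file and hostfile names are placeholders):

  mpic++ test.cpp -o test
  mpirun -np 4 --hostfile myhosts ./test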
Hi Jody,
I can passwordlessly ssh between all nodes (to and from).
Almost none of these mpirun commands work. The only working case is
if nodenameX is the node from which you are running the command. I
don't know if this gives you extra diagnostic information, but if I
explicitly set the wrong...
Hi Hugh
You're right, there is no initialization command (like lamboot) you
have to call.
I don't really know why your setup doesn't work, so I'm making some
more "blind shots":
can you do passwordless ssh between any two of your nodes?
does
mpirun -np 1 --host nodenameX uptime
work for every nodenameX?
Hi Barnabas
As far as I know, Open-MPI is not a shared memory system.
Using Open-MPI to attack your problem on N processors, I would proceed
as follows:
- processor 0 reads the table and then splits it into N parts
(Table_0, ..., Table_{N-1})
- processor 0 sends Table_i to processor i (for all i > 0) using...
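A minimal sketch of the split-and-send pattern jody describes, using
plain MPI point-to-point calls (the table's element type, its size, and
its even divisibility by the process count are all assumptions):

#include <mpi.h>
#include <algorithm>
#include <vector>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int TOTAL = 1000000;        // assumed table size, divisible by size
    const int chunk = TOTAL / size;
    std::vector<double> part(chunk);  // this process's slice of the table

    if (rank == 0) {
        std::vector<double> table(TOTAL);
        // ... processor 0 reads the whole table here ...
        for (int i = 1; i < size; ++i)  // send Table_i to processor i
            MPI_Send(&table[i * chunk], chunk, MPI_DOUBLE, i, 0,
                     MPI_COMM_WORLD);
        // keep Table_0 for processor 0 itself
        std::copy(table.begin(), table.begin() + chunk, part.begin());
    } else {
        MPI_Recv(&part[0], chunk, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
    // ... every rank now works on its own part ...
    MPI_Finalize();
    return 0;
}

When the parts are equal-sized, MPI_Scatter would do the same
distribution in a single collective call.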
Hi Jody,
The node names are exactly the same. I wanted to avoid updating the
version because I'm not the system administrator, and it could take
some time before it gets done. If it's likely to fix the problem
though I'll try it. I'm assuming that I don't have to do something
analogous to...
Hi Hugh
Again, just to make sure, are the hostnames in your host file well-known?
I.e. when you say you can do
ssh nodename uptime
do you use exactly the same nodename in your host file?
(I'm trying to eliminate all non-Open-MPI error sources,
because with your setup it should basically work.)
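To make jody's point concrete: if "ssh node01 uptime" works, the host
file should use exactly that name (the names and slot counts below are
made up):

  # myhosts
  node01 slots=2
  node02 slots=2

  mpirun -np 4 --hostfile myhosts ./my_mpi_app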
Hi!
I am new to this list and to parallel programming in general. I am
writing a trading simulator for the forex market and I am using
genetic algorithms to breed trading parameters.
I am using PGAPack as a GA library, and it uses MPI to parallelize
optimization runs. This is how I got to Open MPI.
Hi Jody,
Indeed, all the nodes are running the same version of Open MPI. Perhaps I was incorrect to describe the cluster as heterogeneous. In fact, all the nodes run the same operating system (Scientific Linux 5.2); it's only the hardware that's different, and even then they're all i386 or i686. I'm...
Hi Hugh
Just to make sure:
You have installed Open-MPI on all your nodes?
Same version everywhere?
Jody
On Tue, Apr 28, 2009 at 12:57 PM, Hugh Dickinson
wrote:
> Hi all,
>
> First of all let me make it perfectly clear that I'm a complete beginner as
> far as MPI is concerned, so this may well...
On Apr 27, 2009, at 10:22 PM, jan wrote:
Thank You Jeff Squyres.
I have checked out the web page
http://www.open-mpi.org/community/lists/announce/2009/03/0029.php,
then the
page https://svn.open-mpi.org/trac/ompi/ticket/1853, but the web page
svn.open-mpi.org seems to crash.
Try that ticket...
On Apr 28, 2009, at 7:27 AM, nee...@crlindia.com wrote:
Hi Josh,
Thanks for your reply. Actually the reason for the hang was a
missing blcr library in LD_LIBRARY_PATH.
After setting it right, checkpoint was working, but as you
mentioned before, a datatype error is coming along with it...
due to other reasons) -- he has two separate installs of OMPI:
/opt/ompi-1.2
/opt/ompi-1.3
Jeff, that is correct.
He builds his app with /opt/ompi-1.2/bin/mpicc.
But then he sets his LD_LIBRARY_PATH to /opt/ompi-1.3/lib/ and runs his
app with /opt/ompi-1.3/bin/mpirun. This means his app will...
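In shell terms, the mismatched setup being described looks roughly like
this (paths as given above, app name hypothetical):

  /opt/ompi-1.2/bin/mpicc app.c -o app
  export LD_LIBRARY_PATH=/opt/ompi-1.3/lib
  /opt/ompi-1.3/bin/mpirun -np 2 ./app    # 1.2 binary, 1.3 runtime

The safe pattern is to compile, set LD_LIBRARY_PATH, and launch all
against the same install prefix.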
On Apr 28, 2009, at 7:50 AM, Ralph Castain wrote:
I'd be fascinated to understand how this works. There are multiple
function calls in MPI_Init, for example, that simply don't exist in
1.3.x. There are references to fields in structures that are no longer
present, though the structure itself does still exist. Etc.
I'd be fascinated to understand how this works. There are multiple
function calls in MPI_Init, for example, that simply don't exist in
1.3.x. There are references to fields in structures that are no longer
present, though the structure itself does still exist. Etc.
I frankly am stunned that...
Ralph, Brian, and Jeff,
Thank you for your answers.
I want to confirm Brian's words that I am "compiling the application
against one version of Open MPI, linking dynamically, then running
against another version of Open MPI".
The fact that the ABI has stabilized with the release of version 1...
Hi Josh,
Thanks for your reply. Actually the reason for the hang was a
missing blcr library in LD_LIBRARY_PATH.
After setting it right, checkpoint was working, but as you
mentioned before, a datatype error is coming along with it, and hence restart
is not working.
a) The errors coming...
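For anyone hitting the same hang: the fix described amounts to something
like the following, where the BLCR install prefix is an assumption:

  export LD_LIBRARY_PATH=/usr/local/blcr/lib:$LD_LIBRARY_PATH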
Hi all,
First of all let me make it perfectly clear that I'm a complete
beginner as far as MPI is concerned, so this may well be a trivial
problem!
I've tried to set up Open MPI to use SSH to communicate between nodes
on a heterogeneous cluster. I've set up passwordless SSH and it seems...
Hi Axida,
There are two ways you can call MPI collectives:
1) Automatic decision by OpenMPI, which in turn calls the tuned
collectives.
2) Forced decision, where you can override OpenMPI to call a certain
algorithm available for a given collective, say MPI_Bcast.
The logic for 1...
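As an illustration of option 2), the tuned component's choice can be
overridden from the mpirun command line; for example, to force one
particular MPI_Bcast algorithm (which algorithm each number selects
depends on the Open MPI version):

  mpirun -np 8 \
    -mca coll_tuned_use_dynamic_rules 1 \
    -mca coll_tuned_bcast_algorithm 1 \
    ./my_mpi_app

Running "ompi_info --param coll tuned" lists the available parameters
and algorithm choices.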
Hi all,
I think there are several algorithms used in MPI_Bcast.
I am wondering how it is decided which one will be executed.
I mean, how is the algorithm chosen?
Does it depend on the message size or something?
Could someone help me?
Thank you!