Hi
I'm trying to get fault tolerant ompi running on our cluster for my
semester thesis.
Build and compile were successful, and blcr checkpointing works (openmpi 1.5.3,
blcr 0.8.2).
Now I'm trying to set up the SELF checkpointing. The example from
http://osl.iu.edu/research/ft/ompi-cr/examples.php does
Hi Roman,
Did you try to checkpoint and restart with the parameter "-machinefile"? It
may work.
Regards,
Nguyen Toan
On Wed, Apr 6, 2011 at 7:05 PM, Hellmüller Roman wrote:
> Hi
>
> I'm trying to get fault tolerant ompi running on our cluster for my
> semester thesis.
>
> Build & compile were su
Hi,
I need to use Open MPI with g95 on Debian Linux lenny 5.0 (x86_64).
I compiled it with FC=g95 F77=g95 and tested it on my example.c file,
but with g95 mpirun doesn't use process 1, just process 0.
Perhaps my compile options are wrong?
I want mpirun to use both process 0 and process 1.
hostname paola12
mpi
Hi Toan
Thanks for your suggestion. It gives me the following result, which does not tell
me anything more.
hroman@cbl1 ~/checkpoints $ ompi-restart -v -machinefile ../semesterthesis/code/code2_self_example/my-hroman-cr-file.ckpt ompi_global_snapshot_28952.ckpt/
[cbl1:28974] Checking for the exis
Hi Roman,
It seems that you misunderstand the parameter "-machinefile".
This parameter should be followed by a file containing a list of the machines
on which your MPI application will run. For example, if you want to
run your app on 2 nodes, named "node1" and "node2", then this file, let's call
it "MACHIN
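The two-node case described above can be sketched as a short shell session. This is only an illustration: the temporary file stands in for whatever name the machinefile is given, and ./my_app is a hypothetical program name.

```shell
# Write one host name per line; node1 and node2 are the example hosts named above.
machinefile=$(mktemp)
printf '%s\n' node1 node2 > "$machinefile"

# Count the non-empty lines to confirm how many hosts are listed.
node_count=$(grep -c . "$machinefile")
echo "hosts listed: $node_count"

# The file is then passed by path, for example (./my_app is hypothetical):
#   mpirun -machinefile "$machinefile" -np 2 ./my_app
```

The point is that the argument after -machinefile is the path to this host list, not a checkpoint file.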
Hi Toan
No, that didn't change anything. I'm trying to restart the program on the
computer it ran on before, and I execute ompi-restart on that same machine.
machinefile_cbl1 contains just cbl1.
hroman@cbl1 ~/checkpoints $ ompi-restart -v -machinefile machinefile_cbl1 ompi_global_snapshot_28952.ckpt/
[cb
If I read your error messages correctly, it looks like mpirun is crashing - the
daemon is complaining that it lost the socket connection back to mpirun, and
hence will abort.
Are you seeing mpirun still alive?
On Apr 5, 2011, at 4:46 AM, jody wrote:
> Hi
>
> On my workstation and the cluste
Hi Ralph
No, after the above error message mpirun has exited.
But I also noticed that it is not possible to ssh into squid_0 and open an xterm there:
jody@chefli ~/share/neander $ ssh -Y squid_0
Last login: Wed Apr 6 17:14:02 CEST 2011 from chefli.uzh.ch on pts/0
jody@squid_0 ~ $ xterm
xterm Xt error:
Thanks all, I realized that the Sun compilers weren't installed on all the
nodes. It seems to be working now; soon I will test the MCA parameters for IB.
On Mon, Apr 4, 2011 at 7:35 PM, Terry Dontje wrote:
> libfui.so is a library that is part of the Solaris Studio FORTRAN tools. It
> should be located
On Mon, Apr 4, 2011 at 7:35 PM, Terry Dontje wrote:
> libfui.so is a library that is part of the Solaris Studio FORTRAN tools. It
> should be located under lib from where your Solaris Studio compilers are
> installed. So one question is whether you actually have Studio Fortran
> installed on all
Here are some tests I did. I hope this isn't an abuse of the list; please tell
me if it is. Thanks to all those who helped me.
This goes to show that the Sun MPI works with programs not compiled with
Sun’s compilers.
This first test was run as a base case to see if MPI works; the second run
is to see
Also, I'm not sure if I'm reading the results right. According to the last
run, did using the Sun compilers (update 1) result in higher performance
with sunct?
On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres wrote:
> some tests I did. I hope this isn't an abuse of the list. please tell me if
Nehemiah Dacres wrote:
also, I'm not sure if I'm reading the results right.
According to the last run, did using the sun compilers (update 1 )
result in higher performance with sunct?
On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres wrote:
this first test was run as
Something looks fishy about your numbers. The first two sets of numbers
look the same, and the last set does look better for the most part. Your
mpirun command line looks weird to me with the "-mca
orte_base_help_aggregate btl,openib,self,". Did something get chopped off
in the text copy? You
Like I said, I'm not expert. However, a quick "google" of revealed this result:
> When trying to set up x11 forwarding over an ssh session to a remote server
> with the -X switch, I was getting an error like Warning: No xauth data; using
> fake authentication data for X11 forwarding.
>
> When d
Sorry Jody - I should have read your note more carefully to see that you
already tried -Y. :-(
Not sure what to suggest...
On Apr 6, 2011, at 12:29 PM, Ralph Castain wrote:
> Like I said, I'm not expert. However, a quick "google" of revealed this
> result:
>
>
>> When trying to set up x11 f
Here's a little more info - it's for Cygwin, but I don't see anything
Cygwin-specific in the answers:
http://x.cygwin.com/docs/faq/cygwin-x-faq.html#q-ssh-no-x11forwarding
On Apr 6, 2011, at 12:30 PM, Ralph Castain wrote:
> Sorry Jody - I should have read your note more carefully to see that y
We tend to build OMPI for several different architectures. Rather than untar
the archive file each time I'd rather do a "make distclean" in between builds.
However, this always produces the following error:
...
Making distclean in libltdl
make[2]: Entering directory `/user/openmpi-1.4.3/opal/li
Hello,
I'm trying again with the 1.4.3 version to compile openmpi statically
with my program, but I'm running into a more basic problem, similar to one
I previously encountered and solved using LD_LIBRARY_PATH.
The configure script is dying when it tries to run the "simple C++ program".
I am also trying to get netlib's HPL to run via Sun Cluster Tools, so I am
trying to compile it and am having trouble. Which is the proper MPI library
to give?
Naturally this isn't going to work:
MPdir= /opt/SUNWhpc/HPC8.2.1c/sun/
MPinc= -I$(MPdir)/include
*MPlib= $(MPdir)/li
Look at your output from mpicc --showme. It indicates that the OMPI libs were
put in the lib64 directory, not lib.
On Apr 6, 2011, at 1:38 PM, Nehemiah Dacres wrote:
> I am also trying to get netlib's hpl to run via sun cluster tools so i am
> trying to compile it and am having trouble. Which
[jian@therock lib]$ ls lib64/*.a
lib64/libotf.a lib64/libvt.fmpi.a lib64/libvt.omp.a
lib64/libvt.a lib64/libvt.mpi.a lib64/libvt.ompi.a
Last time I linked one of those files, it told me they were in the wrong
format. These are in archive format; what format should they be in?
On Wed, Apr 6,
Sigh...look at the output of mpicc --showme. It tells you where the OMPI libs
were installed:
-I/opt/SUNWhpc/HPC8.2.1c/sun/include/64
-I/opt/SUNWhpc/HPC8.2.1c/sun/include/64/openmpi -R/opt/mx/lib/lib64
-R/opt/SUNWhpc/HPC8.2.1c/sun/lib/lib64 -L/opt/SUNWhpc/HPC8.2.1c/sun/lib/lib64
-lmpi -lopen-rt
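As a rough sketch of how to read that output, the library directory can be picked out of the -L flag with a few lines of shell. The sample string below is copied from the flags quoted above so the snippet runs without Open MPI installed; on a real system the variable would presumably be filled from the wrapper compiler instead.

```shell
# Flags copied from the --showme output quoted above.
# On a real system (assumption): showme_output=$(mpicc --showme)
showme_output='-I/opt/SUNWhpc/HPC8.2.1c/sun/include/64 -I/opt/SUNWhpc/HPC8.2.1c/sun/include/64/openmpi -R/opt/mx/lib/lib64 -R/opt/SUNWhpc/HPC8.2.1c/sun/lib/lib64 -L/opt/SUNWhpc/HPC8.2.1c/sun/lib/lib64 -lmpi -lopen-rt'

# Collect every directory that appears after a -L flag.
libdirs=""
for flag in $showme_output; do
  case "$flag" in
    -L*) libdirs="$libdirs ${flag#-L}" ;;
  esac
done
libdirs="${libdirs# }"   # drop the leading space

echo "$libdirs"
```

For this sample the only -L entry ends in lib/lib64, which is the directory a hand-written link line would need to point at.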
On Apr 6, 2011, at 1:27 PM, Jason Palmer wrote:
> Hello,
>
> I’m trying again with the 1.4.3 version to compile openmpi statically
> with my program … but I’m running into a more basic problem, similar to one I
> previously encountered and solved using LD_LIBRARY_PATH.
>
> The configure
On Apr 6, 2011, at 1:21 PM, David Gunter wrote:
> We tend to build OMPI for several different architectures. Rather than untar
> the archive file each time I'd rather do a "make distclean" in between
> builds. However, this always produces the following error:
>
> ...
> Making distclean in li
Thanks, this seems to be resolved now after sorting out my previous test
installations of gcc.
-Jason
From: Ralph Castain [mailto:rhc.open...@gmail.com] On Behalf Of Ralph
Castain
Sent: Wednesday, April 06, 2011 1:35 PM
To: japal...@ucsd.edu; Open MPI Users
Subject: Re: [OMPI users] problem w
Ralph Castain wrote:
On Apr 6, 2011, at 1:21 PM, David Gunter wrote:
We tend to build OMPI for several different architectures.
Rather than untar the archive file each time I'd rather
do a "make distclean" in between builds.
However, this always produces the following error:
...
Making dis
Hi,
I am having trouble running a batch job in SGE using openmpi. I have read
the faq, which says that openmpi will automatically do the right thing, but
something seems to be wrong.
Previously I used MPICH1 under SGE without any problems. I'm avoiding MPICH2
because it doesn't seem to support st
Btw, I did compile openmpi with the --with-sge flag.
I am able to compile a test program using openf90 with no errors or
warnings. But when I try to run a test program that just calls
MPI_INIT(ierr), then MPI_COMM_RANK(ierr), I get the following, whether
static or linked, and whether run with mpir
Are you able to run non-MPI programs like "hostname"?
I ask because that error message indicates that everything started just fine,
but there is an error in your application.
On Apr 6, 2011, at 6:01 PM, Jason Palmer wrote:
> Btw, I did compile openmpi with the --with-sge flag.
>
> I am able t
Ok, the problem was apparently that I was still including mpif.h instead of
using "use mpi". Seems to be working now.
-----Original Message-----
From: Jason Palmer [mailto:japalme...@gmail.com]
Sent: Wednesday, April 06, 2011 5:01 PM
To: 'Open MPI Users'
Subject: RE: SGE and openmpi
Btw, I did