Hi all,
I am using Condor to run my MPI jobs on a large cluster of nodes. The jobs
run fine, but after some time they automatically get restarted. What could be
the reason?
Cheers,
Asad
--
"A Bayesian is one who, vaguely expecting a horse, and catching a glimpse
of a donkey, strongly believes he
Most of the cluster users are
non-MPI users, and thus they don't have much knowledge about configuring
MPI with Condor.
If you know anyone who uses Condor for running MPI jobs, then please let
me know.
Cheers,
Asad
> On Apr 14, 2011, at 6:37 PM, Asad Ali wrote:
Hi All,
I had been using Open MPI for parallel computing on Fedora and Ubuntu and
everything was going quite fine. But recently I started using other OSes such
as CentOS and Debian and found a strange thing regarding MPI. I found that
running the same source code on these OSes, with the same versions
Hi Fabian,
Hi Asad,
>> I
>> found that running the same source code on these OSes, with the same
>> versions of gcc and Open MPI installed on them, gives different
>> results than Fedora and Ubuntu after a few hundred iterations. The first
>> few hundred iterations are exactly the same as that o
Hi Jodi,
> I once got different results when running on a 64-bit platform instead of
> a 32-bit platform - if I remember correctly, the reason was that on the
> 32-bit platform 80-bit extended-precision floats were used, but on the
> 64-bit platform only 64-bit floats.
Could you please give me an id
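A small self-contained sketch, not taken from the code discussed here, of the effect Jodi describes: with x87 math (e.g. gcc -mfpmath=387, the old 32-bit x86 default) the loop accumulator can be held at 80-bit precision, whereas with SSE math (gcc -mfpmath=sse, the x86-64 default) every step is rounded to 64 bits, so the last digits of the printed sum can differ between builds of the same source.

#include <stdio.h>

int main(void)
{
    double sum = 0.0;
    int i;

    /* The rounding of each partial sum depends on whether the
     * accumulator lives in an 80-bit x87 register or in a 64-bit
     * SSE register / memory slot. */
    for (i = 1; i <= 10000000; i++)
        sum += 1.0 / i;

    printf("sum = %.17g\n", sum);
    return 0;
}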
On Mon, Apr 26, 2010 at 8:01 PM, Ashley Pittman wrote:
>
> On 25 Apr 2010, at 22:27, Asad Ali wrote:
>
> > Yes I use different machines such as
> >
> > machine 1 uses AMD Opterons. (Fedora)
> >
> > machine 2 and 3 use Intel Xeons. (CentOS)
> >
Hi all,
Many, many thanks to all of you for your time, sincere help, useful tips and
advice.
I have solved that problem. I just removed the gcc flag -O3 from my compile
script and the error vanished. However, the speed of my code has also dropped
from 70 iterations/minute to 50 iterations/minute, st
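In case it helps anyone with the same problem, and assuming the compile script simply passes flags to mpicc (the script itself isn't shown here), a possible middle ground is to keep optimization but force intermediates back to 64-bit memory on x87 hardware, e.g.

mpicc -O2 -o myprog myprog.c
mpicc -O3 -ffloat-store -o myprog myprog.c

where myprog.c is a placeholder name. -ffloat-store writes floating-point variables back to memory instead of keeping them in the 80-bit x87 registers; it usually costs some speed, but it removes much of the excess-precision difference between builds.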
Hi Gus,
Thanks for your well-researched and thoughtful reply. It will take a bit of
time to absorb such a big and energetic dose. :)
I took your earlier advice regarding the optimization flags causing errors
in your case.
You wrote in reply to Dave
"The optimization flags were the main cause of c
Hi all,
I am working on a parallel tempering MCMC code using Open MPI. I am a
bit confused about proposing swaps between chains running on different
cores.
I know how to propose swaps, but I am not sure where to do it (i.e. how to
specify an independent node or core for it). If som
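A minimal sketch, not Asad's code (beta, log_post and the fixed even/odd pairing are placeholder choices), of one way to propose a swap between neighbouring chains, one chain per MPI rank. Each rank exchanges its inverse temperature and log-posterior with its partner via MPI_Sendrecv, and only the lower rank of each pair draws the accept/reject decision so that both sides agree:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Propose one swap between this rank and its neighbour in the
 * temperature ladder.  Only the temperature (beta) is exchanged;
 * the chain states stay where they are. */
static void propose_swap(double *beta, double log_post, int rank, int size)
{
    int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;
    if (partner < 0 || partner >= size)
        return;                              /* no partner at the ends */

    double their_lp, their_beta;
    MPI_Sendrecv(&log_post, 1, MPI_DOUBLE, partner, 0,
                 &their_lp, 1, MPI_DOUBLE, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(beta, 1, MPI_DOUBLE, partner, 1,
                 &their_beta, 1, MPI_DOUBLE, partner, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    int accept;
    if (rank < partner) {
        /* the lower rank of each pair draws the decision ... */
        double log_alpha = (*beta - their_beta) * (their_lp - log_post);
        accept = log(drand48()) < log_alpha;
        MPI_Send(&accept, 1, MPI_INT, partner, 2, MPI_COMM_WORLD);
    } else {
        /* ... and tells its partner, so both sides agree */
        MPI_Recv(&accept, 1, MPI_INT, partner, 2, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
    if (accept)
        *beta = their_beta;
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    srand48(1234 + rank);

    double beta = 1.0 / (1.0 + rank);        /* placeholder temperature ladder */
    double log_post = -10.0 * drand48();     /* placeholder log-posterior */
    propose_swap(&beta, log_post, rank, size);
    printf("rank %d now has beta = %g\n", rank, beta);

    MPI_Finalize();
    return 0;
}

One common arrangement is that every rank runs its own chain and all ranks call the swap routine at the same iteration, so no separate node or core has to be reserved for the swaps.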
Hi Jack,
Debugging Open MPI with traditional debuggers is a pain.
From your error message it sounds like you have a memory allocation
problem. Do you use dynamic memory allocation (allocate and then free)?
I use printf() statements together with the MPI rank. It tells me which
process is givin
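For what it's worth, a minimal sketch of that kind of tracing (buf and N are made-up names, not from Jack's code): print the rank from MPI_Comm_rank next to every allocation and flush immediately, so when one process dies you can see which rank got how far.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int N = 1000;                      /* placeholder size */
    double *buf = malloc(N * sizeof *buf);

    /* tag every trace line with the rank and flush right away,
     * so the output survives a crash of that process */
    printf("rank %d: malloc(%zu) -> %p\n", rank, N * sizeof *buf, (void *)buf);
    fflush(stdout);

    if (buf != NULL) {
        int i;
        for (i = 0; i < N; i++)
            buf[i] = rank;
        free(buf);
    }

    printf("rank %d: done\n", rank);
    fflush(stdout);

    MPI_Finalize();
    return 0;
}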
Hi Everybody,
I am running a code in parallel using Open MPI. The code compiles
successfully, but when I run the resulting executable I get the following
error.
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0xc491f018
[0] func:/usr/lib/openmpi/libopal.so.0 [0x
Hi Jeff,
I have changed the position of malloc.h in the header file list; I moved it
up above mpi.h. Now I am getting a different error message, see the following:
[asad@stat74 T]$ mpirun --np 4 nice -10 ./lisa09EMRIT-P
+---[ lisa14.c ]---
| This is proc