Signal 9 more than likely means that some external entity killed your MPI job
(e.g., a resource manager determined that your process took too much time / CPU
/ whatever and killed it). That also makes sense since you say that short jobs
complete with no problem, but (assumedly) longer jobs get killed.
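If a limit really is being hit, one quick way to confirm it is to have every rank report its peak memory use before the job gets big enough to be killed. The sketch below is not from the original thread; it assumes Linux, where getrusage() reports ru_maxrss in kilobytes.

/* report_rss.c - print the largest peak resident set size across MPI ranks.
   Illustrative only.  Build: mpicc report_rss.c -o report_rss
   Run:   mpirun -np 6 ./report_rss                                         */
#include <mpi.h>
#include <stdio.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    int rank, size;
    long rss_kb, max_kb;
    struct rusage ru;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* ... the real computation would go here ... */

    getrusage(RUSAGE_SELF, &ru);
    rss_kb = ru.ru_maxrss;   /* peak resident set size, in kB on Linux */

    /* the largest per-rank footprint is what the OOM killer or the
       resource manager reacts to with a SIGKILL ("signal 9")         */
    MPI_Reduce(&rss_kb, &max_kb, 1, MPI_LONG, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("largest peak RSS across %d ranks: %ld kB\n", size, max_kb);

    MPI_Finalize();
    return 0;
}

Comparing that number with the node's RAM, or with the batch system's per-process limits, usually shows whether the kill came from the OOM killer or from the scheduler.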
It's worth noting that this new component will likely get pulled into 1.5.1
(we're refreshing a bunch of stuff in 1.5.1 -- this new component will be
included in that refresh).
No specific timeline on 1.5.1 yet, though.
On Jul 22, 2010, at 5:53 PM, Ralph Castain wrote:
> Dev trunk looks okay
Dear All:
I run a parallel job on 6 nodes of an OpenMPI cluster.
But I got this error:
rank 0 in job 82 system.cluster_37948 caused collective abort of all ranks
exit status of rank 0: killed by signal 9
It seems that there is a segmentation fault on node 0.
But, if the program is run for a short time, it completes with no problem.
thanks
very clear,
i was not aware that openMPI internally uses shared memory when two
processes reside on the same node,
which is perfect.
very complete explanations,
thanks really
On Thu, Jul 22, 2010 at 7:11 PM, Gus Correa wrote:
> Hi Cristobal
>
> Cristobal Navarro wrote:
>>
>> yes,
>> i
Hi Cristobal
Cristobal Navarro wrote:
> yes,
> i was aware of the big difference hehe.
> now that openMP and openMPI are being discussed, i've always wondered if it's a
> good idea to model a solution in the following way, using both openMP
> and openMPI.
> suppose you have n nodes, each node has a quadcore (so you have n*4 processors)
It's possible, but not a novel idea, hehe.
It's a form of HYBRID programming (distributed + shared-memory programming).
But you need to check whether it is beneficial for your particular
case/problem/code.
On Thu, Jul 22, 2010 at 5:52 PM, Cristobal Navarro wrote:
> yes,
> i was aware of the big difference hehe.
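For readers following the thread, the layout being discussed (one MPI process per node, OpenMP threads across that node's four cores) looks roughly like the sketch below. It is only an illustration, not code from the thread; it assumes a compiler with OpenMP support (e.g. mpicc -fopenmp hybrid.c) and a launcher that places one process per node.

/* hybrid.c - one MPI process per node, OpenMP threads inside the node.
   Illustrative sketch.  Launch with something like
     mpirun -np <n_nodes> --bynode ./hybrid   (option names vary by version) */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, i;
    double local_sum = 0.0, global_sum = 0.0;

    /* MPI_THREAD_FUNNELED: only the main thread will make MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* shared-memory parallelism inside the node: OpenMP threads on the cores */
    #pragma omp parallel for reduction(+:local_sum)
    for (i = 0; i < 1000000; i++)
        local_sum += 1.0 / (i + 1.0);

    /* distributed-memory parallelism across nodes: MPI between processes */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f (rank 0 used up to %d OpenMP threads)\n",
               global_sum, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}

Whether this beats simply running 4 MPI processes per node (where Open MPI already uses shared memory between ranks on the same node) depends on the code, which is exactly the caveat raised above.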
Dev trunk looks okay right now - I think you'll be fine using it. My new
component -might- work with 1.5, but probably not with 1.4. I haven't checked
either of them.
Anything at r23478 or above will have the new module. Let me know how it works
for you. I haven't tested it myself, but am prett
yes,
i was aware of the big difference hehe.
now that openMP and openMPI are being discussed, i've always wondered if it's a
good idea to model a solution in the following way, using both openMP
and openMPI.
suppose you have n nodes, each node has a quadcore (so you have n*4 processors)
launch n processes a
Hi Cristobal,
Note that the pic in http://dl.dropbox.com/u/6380744/clusterLibs.png
shows what Scalapack is based on; it only shows which packages
Scalapack uses, hence no OpenMP is there.
Also be clear about the difference:
"OpenMP" is for shared memory parallel programming, while
"OpenMPI" is an implementation of MPI, i.e. message passing for distributed memory.
Ralph,
Thank you so much!!
I'll give it a try and let you know.
I know it's a tough question, but how stable is the dev trunk? Can I
just grab the latest and run, or am I better off taking your changes
and copying them back into a stable release? (if so, which one? 1.4? 1.5?)
p.
On Thu, Jul 22, 201
Thanks
i'm looking at the manual, seems good.
i think now the picture is more clear.
i have a very custom algorithm, a local research problem,
parallelizable, that's where openMPI enters.
then, at some point in the program, all the computation translates to
numeric (double) matrix operations, eigenvalues
It was easier for me to just construct this module than to explain how to do so
:-)
I will commit it this evening (couple of hours from now) as that is our
standard practice. You'll need to use the developer's trunk, though, to use it.
Here are the envars you'll need to provide:
Each process n
Hi Cristobal
You may want to take a look at PETSc,
which has all the machinery for linear algebra that
you need, can easily attach a variety of Linear Algebra packages,
including those in the diagram you sent and more,
builds on top of MPI, and can even build MPI for you, if you prefer.
It has C and Fortran interfaces, among others.
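To make the suggestion concrete: a minimal PETSc program, assembling a small distributed system and solving it with a Krylov solver, looks roughly like the sketch below. This is an illustration, not Gus's code, and a few calls (KSPSetOperators, MatCreateVecs, the XxxDestroy signatures) have changed slightly across PETSc releases, so check the manual of the version you install.

/* petsc_ksp.c - sketch: assemble a distributed tridiagonal system and solve
   it with PETSc's KSP solvers on top of MPI.  Illustration only.            */
#include <petscksp.h>

int main(int argc, char **argv)
{
    Mat A; Vec x, b; KSP ksp;
    PetscInt i, n = 100, Istart, Iend;

    PetscInitialize(&argc, &argv, NULL, NULL);   /* also initializes MPI */

    /* distributed matrix; PETSc decides the row partition across ranks */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetFromOptions(A);
    MatSetUp(A);
    MatGetOwnershipRange(A, &Istart, &Iend);
    for (i = Istart; i < Iend; i++) {
        if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
        if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
        MatSetValue(A, i, i, 2.0, INSERT_VALUES);
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    /* right-hand side and solution vectors with the same layout */
    MatCreateVecs(A, &x, &b);       /* older releases: MatGetVecs */
    VecSet(b, 1.0);

    /* Krylov solver; solver and preconditioner can be chosen at run time
       with the -ksp_type / -pc_type command-line options                */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);     /* older releases take a 4th argument */
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

    KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); MatDestroy(&A);
    PetscFinalize();
    return 0;
}

SLEPc, which is built on top of PETSc, adds the eigenvalue solvers that come up elsewhere in this thread.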
Hello,
i am designing a solution for one of my programs, which mixes some tree
generation, matrix operations, eigenvalues, among other tasks.
i have to parallelize all of this for a cluster of 4 nodes (32 cores),
and what i first thought was MPI as a blind choice, but after looking
at this picture
Dear Josh,
I hope to see this new API soon. Anyway, I will try these critical section
functions in BLCR. Thank you for the support.
Best Regards,
Nguyen Toan
On Sat, Jul 17, 2010 at 6:34 AM, Josh Hursey wrote:
>
> On Jun 14, 2010, at 5:26 AM, Nguyen Toan wrote:
>
> > Hi all,
> > I have a MPI pr
Dear Josh,
Thank you very much for the reply. I am sorry if my question was unclear, so
please let me restate my question.
Currently I am applying the staging technique with the mca-params.conf
setting as follows:
snapc_base_store_in_place=0 # enable remote file transfer to global storage
c
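For context, those settings live in Open MPI's MCA parameter file (typically $HOME/.openmpi/mca-params.conf). A minimal sketch of such a file is shown below; only the snapc_base_store_in_place line is taken from the message above, and the snapshot-directory parameter is illustrative, so verify the exact names with ompi_info (e.g. ompi_info --param snapc all) on the installed build.

# $HOME/.openmpi/mca-params.conf  -- sketch only, not the poster's actual file

# stage checkpoint files to global storage instead of writing in place
snapc_base_store_in_place = 0

# where the assembled global snapshot should be gathered (illustrative name,
# check with ompi_info before relying on it)
snapc_base_global_snapshot_dir = /global/scratch/ompi_snapshots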
That did it. Thanks.
David
On Wed, 2010-07-21 at 15:29 -0500, Dave Goodell wrote:
> On Jul 21, 2010, at 2:54 PM CDT, Jed Brown wrote:
>
> > On Wed, 21 Jul 2010 15:20:24 -0400, David Ronis
> > wrote:
> >> Hi Jed,
> >>
> >> Thanks for the reply and suggestion. I tried adding -mca
> >> yield_w
On Wed, Jul 21, 2010 at 10:44 AM, Ralph Castain wrote:
>
> On Jul 21, 2010, at 7:44 AM, Philippe wrote:
>
>> Ralph,
>>
>> Sorry for the late reply -- I was away on vacation.
>
> no problem at all!
>
>>
>> regarding your earlier question about how many processes were
>> involved when the memory wa