Thanks.
What if the master has to send and receive a large data package? Does it have to
be split into multiple parts? This may increase communication overhead.
I can use an MPI_Datatype to wrap it up as a specific datatype that can carry
the data. What if the data is very large? 1 KB or 10 KB?
On Wed, 07 Jul 2010 17:34:44 -0600, "Price, Brian M (N-KCI)"
wrote:
> Jed,
>
> The IBM P5 I'm working on is big endian.
Sorry, that didn't register. The displ argument is MPI_Aint which is 8
bytes (at least on LP64, probably also on LLP64), so your use of kind=8
for that is certainly correct.
Does this mean we have to split the MPI_Get into many 2 GB parts?
I have an MPI program which first serializes an object and sends it to another
process. The char array after serialization is just below 2 GB now, but the data
is growing.
One method is to build a large type with MPI_Type_vector, align the char
This error typically occurs when the received message is bigger than the
specified buffer size. You need to narrow your code down to the offending
receive call to see if this is indeed the case.
On Wed, Jul 7, 2010 at 8:42 AM, Jack Bryan wrote:
> Dear All:
>
> I need to transfer some messages f
Dear all,
How do I decide on the configure options? I am greatly confused.
I am using my school's high-performance computer, but the Open MPI there is
version 1.3.2, which is old, so I want to build a newer one.
I am new to Open MPI; I built it, but it doesn't work. I built and installed
it to my
Jed,
The IBM P5 I'm working on is big endian.
The test program I'm using is written in Fortran 90 (as stated in my question).
I imagine this is indeed a library issue, but I still don't understand what
I've done wrong here.
Can anyone tell me how I should be building my OpenMPI libraries and m
On Wed, 07 Jul 2010 15:51:41 -0600, "Price, Brian M (N-KCI)"
wrote:
> Jeff,
>
> I understand what you've said about 32-bit signed INTs, but in my program,
> the displacement variable that I use for the MPI_GET call is a 64-bit INT
> (KIND = 8).
The MPI Fortran bindings expect a standard int,
Jeff,
I understand what you've said about 32-bit signed INTs, but in my program, the
displacement variable that I use for the MPI_GET call is a 64-bit INT (KIND =
8).
In fact, the only thing in my program that isn't a 64-bit INT is the array that
I'm trying to transfer values from.
I would po
+1.
FWIW, Open MPI works pretty well with SLURM; I use it back here at Cisco for
all my testing. That one particular option you're testing doesn't seem to
work, but all in all, the integration works fairly well.
On Jul 7, 2010, at 3:27 PM, Ralph Castain wrote:
> You'll get passionate advocat
You'll get passionate advocates from all the various resource managers - there
really isn't a right/wrong answer. Torque is more widely used, but any of them
will do.
None are perfect, IMHO.
On Jul 7, 2010, at 1:16 PM, Douglas Guptill wrote:
> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Ca
On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
> No, afraid not. Things work pretty well, but there are places
> where things just don't mesh. Sub-node allocation in particular is
> an issue as it implies binding, and slurm and ompi have conflicting
> methods.
>
> It all can get
The Open MPI FAQ shows how to add libraries to the Open MPI wrapper
compilers when building them (using configure flags), but I would like to
add flags for a specific run of the wrapper compiler. Setting OMPI_LIBS
overrides the necessary MPI libraries, and it does not appear that there
is an e
On Jul 7, 2010, at 2:37 PM, Ralph Castain wrote:
> No, afraid not. Things work pretty well, but there are places where things
> just don't mesh. Sub-node allocation in particular is an issue as it implies
> binding, and slurm and ompi have conflicting methods.
>
> It all can get worked out, b
No, afraid not. Things work pretty well, but there are places where things
just don't mesh. Sub-node allocation in particular is an issue as it implies
binding, and slurm and ompi have conflicting methods.
It all can get worked out, but we have limited time and nobody cares enough to
put in t
Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
team would be hand-in-glove with the OMPI team in making sure the
interface between the two is smooth. :(
David
On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain wrote:
> Ah, if only it were that simple. Slurm is a very difficult
Ah, if only it were that simple. Slurm is a very difficult beast to interface
with, and I have yet to find a single, reliable marker across the various slurm
releases to detect options we cannot support.
On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
> On Wed, Jul 7, 2010 at 10:26 AM, Ralph
On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain wrote:
> I'm afraid the bottom line is that OMPI simply doesn't support core-level
> allocations. I tried it on a slurm machine available to me, using our devel
> trunk as well as 1.4, with the same results.
>
> Not sure why you are trying to run th
Sorry for the delay in replying. :-(
It's because, for a 32-bit signed int, the value turns negative at 2 GB.
On Jun 29, 2010, at 1:46 PM, Price, Brian M (N-KCI) wrote:
> OpenMPI version: 1.3.3
>
> Platform: IBM P5
>
> Built OpenMPI 64-bit (i.e., CFLAGS=-q64, CXXFLAGS=-q64, FFLAGS=-q64,
>
I'm afraid the bottom line is that OMPI simply doesn't support core-level
allocations. I tried it on a slurm machine available to me, using our devel
trunk as well as 1.4, with the same results.
Not sure why you are trying to run that way, but I'm afraid you can't do it
with OMPI.
On Jul 6, 20
I would guess the #files limit of 1024. However, if it behaves the same way
when spread across multiple machines, I would suspect it is somewhere in your
program itself. Given that the segfault is in your process, can you use gdb to
look at the core file and see where and why it fails?
On Jul 7
On Jul 7, 2010, at 10:12 AM, Grzegorz Maj wrote:
> The problem was that orted couldn't find ssh or rsh on that machine.
> I've added my installation to PATH and it now works.
> So one question: I will definitely not use MPI_Comm_spawn or any
> related stuff. Do I need this ssh? If not, is there
On Jul 7, 2010, at 12:50 PM, Olivier Marsden wrote:
> Hi Jeff, thanks for the response.
> As soon as I can afford to reboot my workstation,
> like tomorrow, I will test as you suggest whether the computer
> actually hangs or just slows down. For exhaustive kernel logging,
> I replaced the followin
Hi Jeff, thanks for the response.
As soon as I can afford to reboot my workstation,
like tomorrow, I will test as you suggest whether the computer
actually hangs or just slows down. For exhaustive kernel logging,
I replaced the following line
kern.* -/var/log/kern.log
with
kern.*
Hello everyone,
I have a question concerning the checkpoint overhead in Open MPI, i.e., the
difference between the runtime of the application with and without
checkpointing.
I observe that as the data size and the number of processes increase, the
runtime of BLCR is very small compared
2010/7/7 Ralph Castain :
>
> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>
>> Hi Ralph,
>> sorry for the late response, but I couldn't find free time to play
>> with this. Finally I've applied the patch you prepared. I've launched
>> my processes in the way you've described and I think it's wor
The problem was that orted couldn't find ssh or rsh on that machine.
I've added my installation to PATH and it now works.
So, one question: I will definitely not use MPI_Comm_spawn or any
related stuff. Do I need this ssh? If not, is there any way to tell
orted that it shouldn't be looking for ssh b
Dear All:
I need to transfer some messages from worker nodes to the master node on an MPI
cluster with Open MPI.
The number of messages is fixed.
When I increase the number of worker nodes, I get this error:
--
terminate called after throwing an instance of
'boost::exceptio
Hi Jeff,
the patch is working fine in preliminary tests with SKaMPI.
Thanks very much!
2010/7/7 Jeff Squyres
> I do believe that this is a bug. I *think* that the included patch will
> fix it for you, but George is on vacation until tomorrow (and I don't know
> how long it'll take him to sl
On Jul 7, 2010, at 10:20 AM, Olivier Marsden wrote:
> The (7 process) code runs correctly on my workstation using mpich2 (latest
> stable version) & ifort 11.1, using intel-mpi & ifort 11.1, but
> randomly hangs the
> computer (vanilla ubuntu 9.10 kernel v. 2.6.31 ) to the point where only
> a ma
Hello,
I am developing a Fortran MPI code and currently testing it on my workstation,
i.e., in a shared-memory environment.
The (7 process) code runs correctly on my workstation using mpich2 (latest
stable version) & ifort 11.1, using intel-mpi & ifort 11.1, but
randomly hangs the
computer (vanil
I do believe that this is a bug. I *think* that the included patch will fix it
for you, but George is on vacation until tomorrow (and I don't know how long
it'll take him to slog through his backlog :-( ).
Can you try the following patch and see if it fixes it for you?
Index: ompi/mca/coll/tun
On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
> Hi Ralph,
> sorry for the late response, but I couldn't find free time to play
> with this. Finally I've applied the patch you prepared. I've launched
> my processes in the way you've described and I think it's working as
> you expected. None of m
Check your PATH and LD_LIBRARY_PATH; it looks like you are picking up a stale
binary for orted and/or stale libraries (perhaps getting the default OMPI
instead of 1.4.2) on the machine where it fails.
On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:
> Hi,
> I was trying to run some MPI processes
Hi,
I was trying to run some MPI processes as singletons. On some of the
machines they crash in MPI_Init. I use exactly the same binaries of my
application and the same installation of Open MPI 1.4.2 on two machines
and it works on one of them and fails on the other one. This is the
command and it
You might want to look at the Boost.MPI project. They wrote some nice C++
wrappers around MPI to handle things like STL vectors, etc.
On Jul 7, 2010, at 5:07 AM, Saygin Arkan wrote:
> Hello,
>
> I'm a newbie on MPI, just playing around with the things.
> I've searched through the internet but
On Jul 6, 2010, at 6:36 PM, Reuti wrote:
> But just for curiosity: at one point Open MPI chooses the ports. At
> that point it might be possible to start two SSH tunnels per slave
> node, to have both directions, and the daemons would then have to
> contact "localhost" on a specific port whic
Hello,
I'm a newbie on MPI, just playing around with the things.
I've searched through the internet but couldn't find an appropriate code
example for my problem.
I'm making comparisons and correlations on my cluster, and getting results
like this:
vector results;
In every node, they calculate a