Oops, I think I meant gather, not scatter...
I have a number of processes split into senders and receivers.
Senders read large quantities of randomly organised data into buffers for
transmission to receivers.
When a buffer is full it needs to be transmitted to all receivers; this repeats
until all the data has been transmitted.
The problem is that MPI
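A minimal sketch of the pattern described above, assuming (purely for illustration, none of these names are from the post) a communicator "comm" containing one sender at rank 0 plus all of its receivers, and hypothetical fill_buffer()/consume_buffer() helpers:

#include <mpi.h>

#define BUFSZ (1 << 20)                       /* 1 MiB buffer, arbitrary for this sketch */

int  fill_buffer(char *buf, int max);         /* hypothetical: reads data, returns bytes, 0 = done */
void consume_buffer(const char *buf, int n);  /* hypothetical: receiver-side handler */

void pump(MPI_Comm comm, int am_sender)
{
    static char buf[BUFSZ];
    int nbytes = 1;

    while (nbytes > 0) {
        if (am_sender)
            nbytes = fill_buffer(buf, BUFSZ);

        /* every rank learns the size first; 0 ends the loop */
        MPI_Bcast(&nbytes, 1, MPI_INT, 0, comm);

        if (nbytes > 0) {
            /* ship the full buffer to all receivers in one collective */
            MPI_Bcast(buf, nbytes, MPI_BYTE, 0, comm);
            if (!am_sender)
                consume_buffer(buf, nbytes);
        }
    }
}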
Randolph Pullen
2/3/2013 1:41:11 AM
Randolph Pullen
That's very interesting, Yevgeny.
Yes, tcp,self ran in 12 seconds;
tcp,self,sm ran in 27 seconds.
Does anyone have any idea how this can be?
About half the data would go to local processes, so SM should pay dividends.
From: Yevgeny Kliteynik
To: Randolph Pullen
See my comments inline...
From: Yevgeny Kliteynik
To: Randolph Pullen
Cc: OpenMPI Users
Sent: Sunday, 9 September 2012 6:18 PM
Subject: Re: [OMPI users] Infiniband performance Problem and stalling
Randolph,
On 9/7/2012 7:43 AM, Randolph Pullen wrote:
One system is actually an i5-2400 - maybe it's throttling back on 2 cores to
save power?
The other (i7) shows a consistent CPU MHz on all cores.
From: Yevgeny Kliteynik
To: Randolph Pullen ; OpenMPI Users
Sent: Thursday, 6 September 2012 6:03 PM
Subject: Re
processor : 1
cpu MHz : 3301.000
processor : 2
cpu MHz : 1600.000
processor : 3
cpu MHz : 1600.000
Which seems rather weird to me...
From: Yevgeny Kliteynik
To: Randolph Pullen ; OpenMPI Users
Sent: Thursday, 6 September
No RoCE, just native IB with TCP over the top.
No, I haven't used 1.6; I was trying to stick with the standard versions on the
Mellanox disk.
Is there a known problem with 1.4.3?
From: Yevgeny Kliteynik
To: Randolph Pullen ; Open MPI Users
Sent: Sund
(reposted with consolidated information)
I have a test rig comprising 2 i7 systems (8 GB RAM) with Mellanox III
HCA 10G cards,
running CentOS 5.7, kernel 2.6.18-274
Open MPI 1.4.3
MLNX_OFED_LINUX-1.5.3-1.0.0.2 (OFED-1.5.3-1.0.0.2):
On a Cisco 24 pt switch
Normal performance is:
$ mpirun --mca btl openib,self -n 2 -hostfile mpi.hosts PingPong
- On occasions it seems to stall indefinitely, waiting on a single receive.
Any ideas appreciated.
Thanks in advance,
Randolph
From: Randolph Pullen
To: Paul Kapinos ; Open MPI Users
Sent: Thursday, 30 August 2012 11:46 AM
Subject: Re: [OMPI users] Infin
64K and force short messages. Then the openib times are
the same as TCP and no faster.
I'm still at a loss as to why...
From: Paul Kapinos
To: Randolph Pullen ; Open MPI Users
Sent: Tuesday, 28 August 2012 6:13 PM
Subject: Re: [OMPI users] Infin
I have a test rig comprising 2 i7 systems with Mellanox III HCA 10G cards
running CentOS 5.7, kernel 2.6.18-274
Open MPI 1.4.3
MLNX_OFED_LINUX-1.5.3-1.0.0.2 (OFED-1.5.3-1.0.0.2):
On a Cisco 24 pt switch
Normal performance is:
$ mpirun --mca btl openib,self -n 2 -hostfile mpi.hosts PingPong
results
when started manually?
Doh!
From: Randolph Pullen
To: Ralph Castain ; Open MPI Users
Sent: Friday, 20 January 2012 2:17 PM
Subject: Re: [OMPI users] Fw: system() call corrupts MPI processes
Removing the redirection to the log makes no difference.
Runnin
the perl both methods are identical.
BTW, the perl program is the server and the Open MPI program is the client.
From: Ralph Castain
To: Randolph Pullen ; Open MPI Users
Sent: Friday, 20 January 2012 1:57 PM
Subject: Re: [OMPI users] Fw: system() call corrupts MPI processes
FYI
- Forwarded Message -
From: Randolph Pullen
To: Jeff Squyres
Sent: Friday, 20 January 2012 12:45 PM
Subject: Re: [OMPI users] system() call corrupts MPI processes
I'm using TCP on 1.4.1 (it's actually IPoIB).
OpenIB is compiled in.
Note that these nodes are containers runni
wouldn't work, but there may be a subtle
>> interaction in there somewhere that causes badness (e.g., memory corruption).
>>
>>
>> On Jan 19, 2012, at 1:57 AM, Randolph Pullen wrote:
>>
>>>
>>> I have a section in my code running in
ess (e.g., memory corruption).
>
>
> On Jan 19, 2012, at 1:57 AM, Randolph Pullen wrote:
>
>>
>> I have a section in my code running in rank 0 that must start a perl program
>> that it then connects to via a tcp socket.
>> The initialisation section is shown her
I have a section in my code running in rank 0 that must start a perl program
that it then connects to via a tcp socket.
The initialisation section is shown here:
sprintf(buf, "%s/session_server.pl -p %d &", PATH, port);  /* build the command line */
int i = system(buf);                  /* launch the perl server in the background */
printf("system returned %d\n", i);    /* system() gives back the shell's exit status */
Some ti
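The connect step that follows (not shown in the post) might look roughly like this. This is only a sketch under some assumptions: the perl server listens on 127.0.0.1 at the same port passed above, and the helper name and retry loop are mine, added because system() returns before the listener is necessarily ready:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect to the perl session server started above on 127.0.0.1:port.
 * Retries a few times because system() returns before the listener is
 * necessarily accepting connections. Returns the socket fd, or -1. */
static int connect_to_session_server(int port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(port);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    for (int attempt = 0; attempt < 10; attempt++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd >= 0 && connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;
        if (fd >= 0)
            close(fd);
        sleep(1);                 /* give the perl process time to bind */
    }
    return -1;
}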
rnel.org).
IIRC, there's some kernel parameter that you can tweak to make it behave
better, but I'm afraid I don't remember what it is. Some googling might find
it...?
On Sep 1, 2011, at 10:06 PM, Eugene Loh wrote:
> On 8/31/2011 11:48 PM, Randolph Pullen wrote:
>>
I recall a discussion some time ago about yield, the Completely F%’d Scheduler
(CFS) and OpenMPI.
My system is currently suffering from massive CPU use while busy waiting. This
gets worse as I try to bump up user concurrency.
I am running with yield_when_idle but it's not enough. Is there anyth
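For what it's worth, mpi_yield_when_idle only makes Open MPI's progress loop yield the CPU, which helps little under CFS. One common workaround (not from this thread, just a pattern) is to stop blocking inside MPI and instead poll with MPI_Iprobe, sleeping between polls, roughly like this:

#include <mpi.h>
#include <time.h>

/* Receive one message without spinning at 100% CPU: poll with MPI_Iprobe
 * and nanosleep between polls instead of blocking in MPI_Recv (which
 * busy-waits inside the progress engine). Sketch only. */
static void recv_politely(void *buf, int count, MPI_Datatype type,
                          int src, int tag, MPI_Comm comm, MPI_Status *st)
{
    int flag = 0;
    struct timespec nap = { 0, 1000000 };   /* 1 ms between polls */

    while (!flag) {
        MPI_Iprobe(src, tag, comm, &flag, st);
        if (!flag)
            nanosleep(&nap, NULL);
    }
    MPI_Recv(buf, count, type, src, tag, comm, st);
}

The trade-off is latency: each message can arrive up to one nap interval late, in exchange for freeing the core for other users.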
Tue, 12/7/11, Jeff Squyres wrote:
From: Jeff Squyres
Subject: Re: [OMPI users] Mpirun only works when n< 3
To: randolph_pul...@yahoo.com.au
Cc: "Open MPI Users"
Received: Tuesday, 12 July, 2011, 10:29 PM
On Jul 11, 2011, at 11:31 AM, Randolph Pullen wrote:
> There are no firewa
Process 0 sent to 1
Running this on either node A or C produces the same result.
Node C runs Open MPI 1.4.1 and is an ordinary dual core on FC10,
not an i5-2400 like the others. All the binaries are compiled on FC10 with gcc
4.3.2
--- On Tue, 12/7/11, Randolph Pullen wrote:
From: Randolph
< 3
To: randolph_pul...@yahoo.com.au, "Open MPI Users"
Received: Tuesday, 12 July, 2011, 12:21 AM
Have you disabled firewalls between your compute nodes?
On Jul 11, 2011, at 9:34 AM, Randolph Pullen wrote:
> This appears to be similar to the problem described in:
>
> ht
This appears to be similar to the problem described in:
https://svn.open-mpi.org/trac/ompi/ticket/2043
However, those fixes do not work for me.
I am running on:
- an i5 Sandy Bridge under Ubuntu 10.10 with 8 GB RAM
- Kernel 2.6.32.14 with OpenVZ tweaks
- Open MPI v1.4.1
I am tryin
t: Re: [OMPI users] is there an equiv of iprove for bcast?
To: randolph_pul...@yahoo.com.au
Cc: "Open MPI Users"
Received: Monday, 9 May, 2011, 11:27 PM
On May 3, 2011, at 8:20 PM, Randolph Pullen wrote:
> Sorry, I meant to say:
> - on each node there is 1 listener and 1 worker.
ho the root will be beforehand, that's unfortunately not a good match for
the MPI_Bcast operation.
On May 3, 2011, at 4:07 AM, Randolph Pullen wrote:
>
> From: Randolph Pullen
> Subject: Re: Re: [OMPI users] is there an equiv of iprove for bcast?
> To: us...@open-mpi.or
> Receive
From: Randolph Pullen
Subject: Re: Re: [OMPI users] is there an equiv of iprove for bcast?
To: us...@open-mpi.or
Received: Monday, 2 May, 2011, 12:53 PM
Non-blocking Bcasts or tests would do it. I currently have the clearing-house
solution working, but it is unsatisfying because of its serial
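For reference, the 1.4.x-era libraries discussed here predate any standard non-blocking broadcast, but MPI-3 later added MPI_Ibcast, which together with MPI_Test gives exactly the testable broadcast asked about. A sketch (helper name and started flag are mine):

#include <mpi.h>

/* Start a broadcast once, then poll it without blocking.
 * Returns 1 once the broadcast has completed on this rank. */
int try_bcast(void *buf, int count, MPI_Datatype type, int root,
              MPI_Comm comm, MPI_Request *req, int *started)
{
    int done = 0;
    if (!*started) {
        MPI_Ibcast(buf, count, type, root, comm, req);
        *started = 1;
    }
    MPI_Test(req, &done, MPI_STATUS_IGNORE);   /* returns immediately */
    return done;
}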
I am having a design issue: my server application has 2 processes per node, 1
listener
and 1 worker.
Each listener monitors a specified port for incoming TCP
connections with the goal that on receipt of a request it will distribute it
over the workers in a SIMD fashion.
My probl
I have a problem with MPI_Comm_create.
My server application has 2 processes per node, 1 listener
and 1 worker.
Each listener monitors a specified port for incoming TCP
connections with the goal that on receipt of a request it will distribute it
over the workers in a
SIMD fashio
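One way to carve the two roles into their own communicators is MPI_Comm_split rather than MPI_Comm_create. A sketch that assumes, purely for illustration, that even world ranks are listeners and odd world ranks are workers:

#include <mpi.h>

/* Split MPI_COMM_WORLD into a listeners' communicator and a workers'
 * communicator. Assumes (hypothetically) even ranks are listeners. */
void split_roles(MPI_Comm *role_comm, int *is_listener)
{
    int wrank;
    MPI_Comm_rank(MPI_COMM_WORLD, &wrank);

    *is_listener = (wrank % 2 == 0);
    /* colour 0 = listeners, colour 1 = workers; key preserves world order */
    MPI_Comm_split(MPI_COMM_WORLD, *is_listener ? 0 : 1, wrank, role_comm);
}

Unlike MPI_Comm_create, every rank simply supplies its own colour, so there is no need to build MPI_Group objects by hand.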
I have successfully used a perl program to start mpirun and record its
PID. The monitor can then watch the output from MPI and terminate the mpirun
command with a series of kills or something if it is having trouble.
One method of doing this is to prefix all legal output from your MPI program
/8/10, Rahul Nabar wrote:
From: Rahul Nabar
Subject: Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts:
debug ideas?
To: "Open MPI Users"
Received: Wednesday, 25 August, 2010, 3:38 AM
On Mon, Aug 23, 2010 at 8:39 PM, Randolph Pullen
wrote:
>
> I have had a
> On Sun, Aug 22, 2010 at 9:57 PM, Randolph Pullen
> wrote:
>
> Its a long shot but could it be related to the total data volume ?
> ie 524288 * 80 = 41943040 bytes active in the cl
It's a long shot, but could it be related to the total data volume?
i.e. 524288 * 80 = 41943040 bytes active in the cluster.
Can you exceed this 41943040 data volume with a smaller message repeated more
often or a larger one less often?
--- On Fri, 20/8/10, Rahul Nabar wrote:
From: Rahul Nabar
Interesting point.
--- On Thu, 12/8/10, Ashley Pittman wrote:
From: Ashley Pittman
Subject: Re: [OMPI users] MPI_Bcast issue
To: "Open MPI Users"
Received: Thursday, 12 August, 2010, 12:22 AM
On 11 Aug 2010, at 05:10, Randolph Pullen wrote:
> Sure, but broadcasts are faster -
I (a single user) am running N separate MPI applications doing 1-to-N
broadcasts over PVM; each MPI application is started on each machine
simultaneously by PVM (the reasons are back in the post history).
The problem is that they somehow collide. Yes, I know this should not happen;
the questio
wrote:
From: Terry Frankcombe
Subject: Re: [OMPI users] MPI_Bcast issue
To: "Open MPI Users"
Received: Wednesday, 11 August, 2010, 1:57 PM
On Tue, 2010-08-10 at 19:09 -0700, Randolph Pullen wrote:
> Jeff thanks for the clarification,
> What I am trying to do is run N concurre
mplements it. That sounds snobby, but I
don't mean it that way: what I mean is that most of
the features in Open MPI are customer-driven. All I'm saying is that we have
a lot of other higher-priority customer-requested features that we're working
on. Multicast-bcast support is not h
on in your program that
you are tripping when other procs that share the node change the timing.
How did you configure OMPI when you built it?
On Aug 8, 2010, at 11:02 PM, Randolph Pullen wrote:
The only MPI calls I am using are these (grep-ed from my code):
MPI_Abort(MPI_COMM_WORLD, 1);
M
in one mpirun can collide or
otherwise communicate with an MPI_Bcast between processes started by another
mpirun.
On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote:
Thanks, although “An intercommunicator cannot be used for collective
communication.” (i.e., bcast calls), I can see how the MP
Tennessee.
>
> Sent from my iPad
>
> On Aug 7, 2010, at 1:05, Randolph Pullen
> wrote:
>
>> I seem to be having a problem with MPI_Bcast.
>> My massive I/O intensive data movement program must broadcast from n to n
>> nodes. My problem starts because I req
I seem to be having a problem with MPI_Bcast.
My massive I/O intensive data movement program must broadcast from n to n
nodes. My problem starts because I require 2 processes per node, a sender and a
receiver, and I have implemented these using MPI processes rather than tackle
the complexities of
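For an n-to-n exchange like this, a single collective such as MPI_Allgatherv (every rank contributes a block, every rank receives all blocks) is sometimes a simpler fit than n separate broadcasts. A sketch, not the poster's code; names and the two-phase size exchange are my own:

#include <mpi.h>
#include <stdlib.h>

/* Every rank contributes 'mylen' bytes from 'mine'; every rank receives all
 * contributions concatenated in '*all'. Error handling omitted. */
void exchange_all(char *mine, int mylen, char **all, int *total, MPI_Comm comm)
{
    int np;
    MPI_Comm_size(comm, &np);

    int *lens  = malloc(np * sizeof(int));
    int *displ = malloc(np * sizeof(int));

    /* first learn how much each rank will send */
    MPI_Allgather(&mylen, 1, MPI_INT, lens, 1, MPI_INT, comm);

    *total = 0;
    for (int i = 0; i < np; i++) {
        displ[i] = *total;
        *total  += lens[i];
    }
    *all = malloc(*total);

    /* then exchange the actual data in one collective */
    MPI_Allgatherv(mine, mylen, MPI_BYTE, *all, lens, displ, MPI_BYTE, comm);

    free(lens);
    free(displ);
}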
'm
not sure if I'll have time to go back to the 1.4 series and resolve this
behavior, but I'll put it on my list of things to look at if/when time permits.
On Jun 23, 2010, at 6:53 AM, Randolph Pullen wrote:
ok,
Having confirmed that replacing MPI_Abort with exit() does not work a
't know the state of it in terms of surviving a failed process. I *think*
that this kind of stuff is not ready for prime time, but I admit that this is
not an area that I pay close attention to.
On Jun 23, 2010, at 3:08 AM, Randolph Pullen wrote:
> That is effectively what I have
0 at 10:43 PM, Randolph Pullen
wrote:
I have a mpi program that aggregates data from multiple sql systems. It all
runs fine. To test fault tolerance I switch one of the machines off while it
is running. The result is always a hang, ie mpirun never completes.
To try and avoid t
I have an MPI program that aggregates data from multiple SQL systems. It all
runs fine. To test fault tolerance I switch one of the machines off while it
is running. The result is always a hang, i.e. mpirun never completes.
To try and avoid this I have replaced the send and receive calls with
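A common replacement here (hedged: the original sentence is cut off, so this is only a guess at the direction) is non-blocking calls paired with a timeout, so a dead peer surfaces as an error instead of an indefinite hang. For example:

#include <mpi.h>

/* Wait for a previously posted MPI_Irecv/MPI_Isend, but give up after
 * 'timeout' seconds so that a dead peer does not hang the whole job.
 * Returns 1 on completion, 0 on timeout. Sketch only. */
static int wait_with_timeout(MPI_Request *req, double timeout)
{
    double start = MPI_Wtime();
    int    done  = 0;

    while (!done) {
        MPI_Test(req, &done, MPI_STATUS_IGNORE);
        if (!done && MPI_Wtime() - start > timeout) {
            MPI_Cancel(req);                    /* stop waiting for the dead peer */
            MPI_Wait(req, MPI_STATUS_IGNORE);   /* collect the cancelled request  */
            return 0;
        }
    }
    return 1;
}

On timeout the caller can log the failure and call MPI_Abort itself, which at least terminates the job deterministically rather than leaving mpirun hanging.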