[OMPI users] Sending an objects vector via MPI C++

2010-07-07 Thread Saygin Arkan
Hello,

I'm a newbie with MPI, just playing around with things.
I've searched the internet but couldn't find an appropriate code
example for my problem.

I'm computing comparisons and correlations on my cluster, and I obtain the results
like this:
vector results;

Each node calculates and creates its results vector in its local storage.
I'd then like to collect these vectors on my server node, rank 0.

I have done this with MPI gather, but only for arrays of doubles, not with objects
or vectors.

I have some guesses about the MPI::Create_contiguous or MPI::Create_vector
functions, but these all ask for an associated MPI type, such as CHAR or INT.
And I don't know whether I should use packing somehow...

Is there a way to collect these vectors on my server node with the Gather
function, or even with send & recv?

Thanks a lot,

-- 
Saygin
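
For reference, the plain-MPI route hinted at above (a derived datatype plus
MPI_Gatherv) can look roughly like the sketch below. The Result struct, its
fields, and the counts/displacements bookkeeping are invented purely for
illustration and are only an assumption about what the real data looks like;
the idea is that each rank may contribute a different number of elements, so
rank 0 first gathers the counts and then the data.

#include <mpi.h>
#include <vector>
#include <cstddef>

/* Hypothetical result type -- stands in for whatever the real object is. */
struct Result {
    int    id;
    double score;
};

/* Describe Result to MPI and make its extent match sizeof(Result). */
static MPI_Datatype make_result_type() {
    int          lengths[2] = { 1, 1 };
    MPI_Aint     offs[2]    = { offsetof(Result, id), offsetof(Result, score) };
    MPI_Datatype types[2]   = { MPI_INT, MPI_DOUBLE };
    MPI_Datatype tmp, t;
    MPI_Type_create_struct(2, lengths, offs, types, &tmp);
    MPI_Type_create_resized(tmp, 0, sizeof(Result), &t);
    MPI_Type_commit(&t);
    MPI_Type_free(&tmp);
    return t;
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Datatype result_type = make_result_type();

    std::vector<Result> results;            /* filled by the local computation */
    int mycount = (int)results.size();

    /* Rank 0 first learns how many results each rank holds... */
    std::vector<int> counts(size), displs(size);
    MPI_Gather(&mycount, 1, MPI_INT, &counts[0], 1, MPI_INT, 0, MPI_COMM_WORLD);

    std::vector<Result> all;
    if (rank == 0) {
        int total = 0;
        for (int r = 0; r < size; ++r) { displs[r] = total; total += counts[r]; }
        all.resize(total > 0 ? total : 1);  /* avoid &all[0] on an empty vector */
    }

    /* ...then gathers the variable-length vectors themselves. */
    MPI_Gatherv(mycount ? &results[0] : NULL, mycount, result_type,
                rank == 0 ? &all[0] : NULL, &counts[0], &displs[0], result_type,
                0, MPI_COMM_WORLD);

    MPI_Type_free(&result_type);
    MPI_Finalize();
    return 0;
}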


Re: [OMPI users] OpenMPI Hangs, No Error

2010-07-07 Thread Jeff Squyres
On Jul 6, 2010, at 6:36 PM, Reuti wrote:

> But just for curiosity: at one point Open MPI chooses the ports. At 
> that point it might possible to implement to start two SSH tunnels per 
> slave node to have both directions and the daemons have to contact 
> then "localhost" on a specific port which will be tunneled to each 
> slave. In principle it should work I think, but it's just not 
> implemented for now.

Agreed.  Patches would be welcome!  :-)

> Maybe it could be an addition to Open MPI for security concerned 
> usage. I wonder about the speed impact, when compression is switched 
> on per se in SSH in such a setup in case you transfer large amounts of 
> data via Open MPI.

For control data (i.e., control messages passed during MPI startup, shutdown, 
etc.), the impact may not matter much.  For MPI data (i.e., having something 
like an "ssh" BTL), I could imagine quite a bit of slowdown.  But then again, 
it depends on what your goals are -- if your local policies demand ssh or 
nothing, having "slow" MPI might be better than nothing.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Sending an objects vector via MPI C++

2010-07-07 Thread Jeff Squyres
You might want to look at the Boost.MPI project.  They wrote some nice C++ 
wrappers around MPI to handle things like STL vectors, etc.
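
As a minimal sketch of that idea (assuming a hypothetical Result struct --
the type and its fields are invented here): Boost.MPI serializes the
std::vector for you, so plain send/recv, which the original post also asked
about, is enough to collect differently sized vectors at rank 0. Link against
boost_mpi and boost_serialization when building.

#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>
#include <vector>

namespace mpi = boost::mpi;

/* Hypothetical result type -- adapt to the real object. */
struct Result {
    int    id;
    double score;

    /* Serialization hook so Boost.MPI knows how to ship the object. */
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & id;
        ar & score;
    }
};

int main(int argc, char* argv[]) {
    mpi::environment  env(argc, argv);
    mpi::communicator world;

    std::vector<Result> results;   /* filled by the local computation */

    if (world.rank() == 0) {
        std::vector< std::vector<Result> > all(world.size());
        all[0] = results;
        for (int r = 1; r < world.size(); ++r)
            world.recv(r, 0, all[r]);        /* vectors may differ in length */
    } else {
        world.send(0, 0, results);           /* serialized automatically */
    }
    return 0;
}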


On Jul 7, 2010, at 5:07 AM, Saygin Arkan wrote:

> Hello,
> 
> I'm a newbie on MPI, just playing around with the things.
> I've searched through the internet but couldn't find an appropriate code 
> example for my problem.
> 
> I'm making comparisons, correlations on my cluster, and gaining the results 
> like this:
> vector results;
> 
> In every node, they calculate and create the results array, in their local 
> storage.
> And then I'd like to collect these vectors in my server node, rank (0).
> 
> I had done this with MPI gather but just for double arrays, not with objects 
> or vectors.
> 
> I have some guess about MPI::Create_contiguous, or MPI::Create_vector 
> functions,
> but all these ask for another associated MPI type, such as CHAR or INT or etc.
> And I don't know if I should use packing somehow...
> 
> is there a way to collect these vectors in my server node with Gather 
> function?
> or even with send & recv?
> 
> Thanks a lot, 
> 
> -- 
> Saygin
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Grzegorz Maj
Hi,
I was trying to run some MPI processes as singletons. On some of the
machines they crash on MPI_Init. I use exactly the same binaries of my
application and the same installation of openmpi 1.4.2 on two machines
and it works on one of them and fails on the other one. This is the
command and its output (test is a simple application calling only
MPI_Init and MPI_Finalize):

LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_plm_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../orte/runtime/orte_init.c at line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../orte/orted/orted_main.c at line 323
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file
../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
381
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file
../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
143
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file ../../orte/runtime/orte_init.c at
line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Unable to start a daemon on the local node (-128)
instead of ORTE_SUCCESS
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Unable to start a daemon on the local node" (-128)
instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[host01:21865] Abort before MPI_INIT completed successfully; not able
to guarantee that all other processes were killed!


Any ideas on this?

Thanks,
Grzegorz Maj


Re: [OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Ralph Castain
Check your PATH and LD_LIBRARY_PATH - it looks like you are picking up a stale 
binary for orted and/or stale libraries (perhaps getting the default OMPI 
instead of 1.4.2) on the machine where it fails.

On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:

> Hi,
> I was trying to run some MPI processes as a singletons. On some of the
> machines they crash on MPI_Init. I use exactly the same binaries of my
> application and the same installation of openmpi 1.4.2 on two machines
> and it works on one of them and fails on the other one. This is the
> command and its output (test is a simple application calling only
> MPI_Init and MPI_Finalize):
> 
> LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>  orte_plm_base_select failed
>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../orte/runtime/orte_init.c at line 132
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>  orte_ess_set_name failed
>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../orte/orted/orted_main.c at line 323
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file
> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
> 381
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file
> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
> 143
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file ../../orte/runtime/orte_init.c at
> line 132
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>  orte_ess_set_name failed
>  --> Returned value Unable to start a daemon on the local node (-128)
> instead of ORTE_SUCCESS
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>  ompi_mpi_init: orte_init failed
>  --> Returned "Unable to start a daemon on the local node" (-128)
> instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [host01:21865] Abort before MPI_INIT completed successfully; not able
> to guarantee that all other processes were killed!
> 
> 
> Any ideas on this?
> 
> Thanks,
> Grzegorz Maj
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-07 Thread Ralph Castain

On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:

> Hi Ralph,
> sorry for the late response, but I couldn't find free time to play
> with this. Finally I've applied the patch you prepared. I've launched
> my processes in the way you've described and I think it's working as
> you expected. None of my processes runs the orted daemon and they can
> perform MPI operations. Unfortunately I'm still hitting the 65
> processes issue :(
> Maybe I'm doing something wrong.
> I attach my source code. If anybody could have a look on this, I would
> be grateful.
> 
> When I run that code with clients_count <= 65 everything works fine:
> all the processes create a common grid, exchange some information and
> disconnect.
> When I set clients_count > 65 the 66th process crashes on
> MPI_Comm_connect (segmentation fault).

I didn't have time to check the code, but my guess is that you are still 
hitting some kind of file descriptor or other limit. Check to see what your 
limits are - usually "ulimit" will tell you.
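
As a small aside, the same limits can also be printed from inside the process
itself, which is sometimes useful when the batch environment differs from the
interactive shell; a sketch using POSIX getrlimit (an illustration, not
something from this thread):

#include <sys/resource.h>
#include <cstdio>

/* Print the soft limit for one resource, e.g. open files or stack size. */
static void show(const char* name, int resource) {
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0)
        return;
    if (rl.rlim_cur == RLIM_INFINITY)
        std::printf("%-12s unlimited\n", name);
    else
        std::printf("%-12s %llu\n", name, (unsigned long long)rl.rlim_cur);
}

int main() {
    show("nofiles", RLIMIT_NOFILE);   /* open file descriptors */
    show("stack",   RLIMIT_STACK);    /* stack size in bytes */
    return 0;
}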

> 
> Another thing I would like to know is if it's normal that any of my
> processes when calling MPI_Comm_connect or MPI_Comm_accept when the
> other side is not ready, is eating up a full CPU available.

Yes - the waiting process is polling in a tight loop waiting for the connection 
to be made.

> 
> Any help would be appreciated,
> Grzegorz Maj
> 
> 
> 2010/4/24 Ralph Castain :
>> Actually, OMPI is distributed with a daemon that does pretty much what you
>> want. Checkout "man ompi-server". I originally wrote that code to support
>> cross-application MPI publish/subscribe operations, but we can utilize it
>> here too. Have to blame me for not making it more publicly known.
>> The attached patch upgrades ompi-server and modifies the singleton startup
>> to provide your desired support. This solution works in the following
>> manner:
>> 1. launch "ompi-server -report-uri ". This starts a persistent
>> daemon called "ompi-server" that acts as a rendezvous point for
>> independently started applications.  The problem with starting different
>> applications and wanting them to MPI connect/accept lies in the need to have
>> the applications find each other. If they can't discover contact info for
>> the other app, then they can't wire up their interconnects. The
>> "ompi-server" tool provides that rendezvous point. I don't like that
>> comm_accept segfaulted - should have just error'd out.
>> 2. set OMPI_MCA_orte_server=file:" in the environment where you
>> will start your processes. This will allow your singleton processes to find
>> the ompi-server. I automatically also set the envar to connect the MPI
>> publish/subscribe system for you.
>> 3. run your processes. As they think they are singletons, they will detect
>> the presence of the above envar and automatically connect themselves to the
>> "ompi-server" daemon. This provides each process with the ability to perform
>> any MPI-2 operation.
>> I tested this on my machines and it worked, so hopefully it will meet your
>> needs. You only need to run one "ompi-server" period, so long as you locate
>> it where all of the processes can find the contact file and can open a TCP
>> socket to the daemon. There is a way to knit multiple ompi-servers into a
>> broader network (e.g., to connect processes that cannot directly access a
>> server due to network segmentation), but it's a tad tricky - let me know if
>> you require it and I'll try to help.
>> If you have trouble wiring them all into a single communicator, you might
>> ask separately about that and see if one of our MPI experts can provide
>> advice (I'm just the RTE grunt).
>> HTH - let me know how this works for you and I'll incorporate it into future
>> OMPI releases.
>> Ralph
>> 
>> 
>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>> 
>> Hi Ralph,
>> I'm Krzysztof and I'm working with Grzegorz Maj on this our small
>> project/experiment.
>> We definitely would like to give your patch a try. But could you please
>> explain your solution a little more?
>> You still would like to start one mpirun per mpi grid, and then have
>> processes started by us to join the MPI comm?
>> It is a good solution of course.
>> But it would be especially preferable to have one daemon running
>> persistently on our "entry" machine that can handle several mpi grid starts.
>> Can your patch help us this way too?
>> Thanks for your help!
>> Krzysztof
>> 
>> On 24 April 2010 03:51, Ralph Castain  wrote:
>>> 
>>> In thinking about this, my proposed solution won't entirely fix the
>>> problem - you'll still wind up with all those daemons. I believe I can
>>> resolve that one as well, but it would require a patch.
>>> 
>>> Would you like me to send you something you could try? Might take a couple
>>> of iterations to get it right...
>>> 
>>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>> 
 HmmmI -think- this will work, but I cannot guarantee it:
 
 1. launch one process (can just be a spinner) using 

Re: [OMPI users] Dynamic algorithms problem

2010-07-07 Thread Jeff Squyres
I do believe that this is a bug.  I *think* that the included patch will fix it 
for you, but George is on vacation until tomorrow (and I don't know how long 
it'll take him to slog through his backlog :-( ).

Can you try the following patch and see if it fixes it for you?

Index: ompi/mca/coll/tuned/coll_tuned_module.c
===================================================================
--- ompi/mca/coll/tuned/coll_tuned_module.c (revision 23360)
+++ ompi/mca/coll/tuned/coll_tuned_module.c (working copy)
@@ -165,6 +165,7 @@
 {   \
 int need_dynamic_decision = 0;  \
 ompi_coll_tuned_forced_getvalues( (TYPE), &((DATA)->user_forced[(TYPE)]) ); \
+(DATA)->com_rules[(TYPE)] = NULL;   \
 if( 0 != (DATA)->user_forced[(TYPE)].algorithm ) {  \
 need_dynamic_decision = 1;  \
 EXECUTE;\





On Jul 4, 2010, at 8:12 AM, Gabriele Fatigati wrote:

> Dear OpenMPI user,
> 
> i'm trying to use collective dynamic rules with OpenMPi 1.4.2:
> 
> export OMPI_MCA_coll_tuned_use_dynamic_rules=1
> export OMPI_MCA_coll_tuned_bcast_algorithm=1
> 
> My target is to test Bcast peformances using SKaMPI benchmark changing 
> dynamic rules. But at runtime i get the follow error:
> 
> 
> [node003:05871] *** Process received signal ***
> [node003:05871] Signal: Segmentation fault (11)
> [node003:05871] Signal code: Address not mapped (1)
> [node003:05871] Failing at address: 0xcc
> [node003:05872] *** Process received signal ***
> [node003:05872] Signal: Segmentation fault (11)
> [node003:05872] Signal code: Address not mapped (1)
> [node003:05872] Failing at address: 0xcc
> [node003:05871] [ 0] /lib64/libpthread.so.0 [0x3be160e4c0]
> [node003:05871] [ 1] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2accf7210145]
> [node003:05871] [ 2] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2accf720ef16]
> [node003:05871] [ 3] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2accf721fec9]
> [node003:05871] [ 4] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0(MPI_Bcast+0x171)
>  [0x2accf71b81e1]
> [node003:05871] [ 5] ./skampi [0x409566]
> [node003:05871] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3be0e1d974]
> [node003:05871] [ 7] ./skampi [0x404e19]
> [node003:05871] *** End of error message ***
> [node003:05872] [ 0] /lib64/libpthread.so.0 [0x3be160e4c0]
> [node003:05872] [ 1] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2b1959eb3145]
> [node003:05872] [ 2] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2b1959eb1f16]
> [node003:05872] [ 3] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
>  [0x2b1959ec2ec9]
> [node003:05872] [ 4] 
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0(MPI_Bcast+0x171)
>  [0x2b1959e5b1e1]
> [node003:05872] [ 5] ./skampi [0x409566]
> [node003:05872] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3be0e1d974]
> [node003:05872] [ 7] ./skampi [0x404e19]
> [node003:05872] *** End of error message ***
> --
> mpirun noticed that process rank 9 with PID 5872 on node node003ib0 exited on 
> signal 11 (Segmentation fault).
> --
> 
> 
> The same using other Bcast algorithm. Disabling dynamic rules, it works well. 
> Maybe i'm using some wrong parameter setup?
> 
> Thanks in advance.
> 
> 
> 
> 
> 
> -- 
> Ing. Gabriele Fatigati
> 
> Parallel programmer
> 
> CINECA Systems & Tecnologies Department
> 
> Supercomputing Group
> 
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
> 
> www.cineca.itTel:   +39 051 6171722
> 
> g.fatigati [AT] cineca.it   
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] perhaps an openmpi bug, how best to identify?

2010-07-07 Thread Olivier Marsden

Hello,
I am developing a Fortran MPI code and am currently testing it on my
workstation, so in a shared-memory environment.
The (7-process) code runs correctly on my workstation using mpich2 (latest
stable version) & ifort 11.1, and using intel-mpi & ifort 11.1, but
randomly hangs the computer (vanilla ubuntu 9.10, kernel v. 2.6.31) to the
point where only a magic sysrq combination can "save" me (i.e. reboot), when using
- openmpi 1.4.2 compiled from source with gcc, ifort for mpif90
- clustertools v. 8.2.1c distribution from sun/oracle, also based on
  openmpi 1.4.2, using sun f90 for mpif90

I am prepared to do some testing if that can help, but don't know the 
best way to identify what's going on.

I have found no useful information in the syslog files.

Regards, & thanks for the work on a great open source tool,


Olivier Marsden


Re: [OMPI users] perhaps an openmpi bug, how best to identify?

2010-07-07 Thread Jeff Squyres
On Jul 7, 2010, at 10:20 AM, Olivier Marsden wrote:

> The (7 process) code runs correctly on my workstation using mpich2 (latest
> stable version) & ifort 11.1, using intel-mpi & ifort 11.1, but 
> randomly hangs the
> computer (vanilla ubuntu 9.10 kernel v. 2.6.31 ) to the point where only
> a magic
> sysrq combination can "save" me (i.e. reboot), when using
> - openmpi 1.4.2 compiled from source with gcc, ifort for mpif90
> - clustertools v. 8.2.1c distribution from sun/oracle, also based on
> openmpi 1.4.2, using sun f90
>   for mpif90

Yowza.  Open MPI is user space code, so it should never be able to hang the 
entire computer.  Open MPI and MPICH2 do implement things in very different 
ways, so it's quite possible that we trip entirely different code paths in the 
same linux kernel.

Never say "never" -- it could well be an Open MPI bug.  But it smells like a 
kernel bug...

> I am prepared to do some testing if that can help, but don't know the
> best way to identify what's going on.
> I have found no useful information in the syslog files.

Is the machine totally hung?  Or is it just running really, really slowly?  Try 
leaving some kind of slowly-monitoring process running in the background and 
see if it keeps running (perhaps even more slowly than before) when the machine 
hangs.  E.g., something like a shell script that loops over sleeping for a 
second and then appending the output of "date" to a file.  Or something like 
that.

My point: see if Open MPI went into some hyper-aggressive mode where it's 
(literally) stealing every available cycle and making the machine look hung.  
You might even want to try running the OMPI procs at a low priority to see if 
it can help alleviate the "steal all cycles" mode (if that is, indeed, what is 
happening).

If the machine is truly hung, then something else might be going on.  Do any 
kernel logs report anything?  Can you crank up your syslog to report *all* 
events, for example?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Dynamic algorithms problem

2010-07-07 Thread Gabriele Fatigati
Hi Jeff,

the patch is working fine in preliminary tests with SKaMPI.

Thanks very much!

2010/7/7 Jeff Squyres 

> I do believe that this is a bug.  I *think* that the included patch will
> fix it for you, but George is on vacation until tomorrow (and I don't know
> how long it'll take him to slog through his backlog :-( ).
>
> Can you try the following patch and see if it fixes it for you?
>
> Index: ompi/mca/coll/tuned/coll_tuned_module.c
> ===
> --- ompi/mca/coll/tuned/coll_tuned_module.c (revision 23360)
> +++ ompi/mca/coll/tuned/coll_tuned_module.c (working copy)
> @@ -165,6 +165,7 @@
> {   \
> int need_dynamic_decision = 0;  \
> ompi_coll_tuned_forced_getvalues( (TYPE),
> &((DATA)->user_forced[(TYPE)]) ); \
> +(DATA)->com_rules[(TYPE)] = NULL;   \
> if( 0 != (DATA)->user_forced[(TYPE)].algorithm ) {  \
> need_dynamic_decision = 1;  \
> EXECUTE;\
>
>
>
>
>
> On Jul 4, 2010, at 8:12 AM, Gabriele Fatigati wrote:
>
> > Dear OpenMPI user,
> >
> > i'm trying to use collective dynamic rules with OpenMPi 1.4.2:
> >
> > export OMPI_MCA_coll_tuned_use_dynamic_rules=1
> > export OMPI_MCA_coll_tuned_bcast_algorithm=1
> >
> > My target is to test Bcast peformances using SKaMPI benchmark changing
> dynamic rules. But at runtime i get the follow error:
> >
> >
> > [node003:05871] *** Process received signal ***
> > [node003:05871] Signal: Segmentation fault (11)
> > [node003:05871] Signal code: Address not mapped (1)
> > [node003:05871] Failing at address: 0xcc
> > [node003:05872] *** Process received signal ***
> > [node003:05872] Signal: Segmentation fault (11)
> > [node003:05872] Signal code: Address not mapped (1)
> > [node003:05872] Failing at address: 0xcc
> > [node003:05871] [ 0] /lib64/libpthread.so.0 [0x3be160e4c0]
> > [node003:05871] [ 1]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2accf7210145]
> > [node003:05871] [ 2]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2accf720ef16]
> > [node003:05871] [ 3]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2accf721fec9]
> > [node003:05871] [ 4]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0(MPI_Bcast+0x171)
> [0x2accf71b81e1]
> > [node003:05871] [ 5] ./skampi [0x409566]
> > [node003:05871] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x3be0e1d974]
> > [node003:05871] [ 7] ./skampi [0x404e19]
> > [node003:05871] *** End of error message ***
> > [node003:05872] [ 0] /lib64/libpthread.so.0 [0x3be160e4c0]
> > [node003:05872] [ 1]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2b1959eb3145]
> > [node003:05872] [ 2]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2b1959eb1f16]
> > [node003:05872] [ 3]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0
> [0x2b1959ec2ec9]
> > [node003:05872] [ 4]
> /gpfs/scratch/userinternal/cin0243a/openmpi-1.4.2/openmpi-1.4.2-install/lib/libmpi.so.0(MPI_Bcast+0x171)
> [0x2b1959e5b1e1]
> > [node003:05872] [ 5] ./skampi [0x409566]
> > [node003:05872] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x3be0e1d974]
> > [node003:05872] [ 7] ./skampi [0x404e19]
> > [node003:05872] *** End of error message ***
> >
> --
> > mpirun noticed that process rank 9 with PID 5872 on node node003ib0
> exited on signal 11 (Segmentation fault).
> >
> --
> >
> >
> > The same using other Bcast algorithm. Disabling dynamic rules, it works
> well. Maybe i'm using some wrong parameter setup?
> >
> > Thanks in advance.
> >
> >
> >
> >
> >
> > --
> > Ing. Gabriele Fatigati
> >
> > Parallel programmer
> >
> > CINECA Systems & Tecnologies Department
> >
> > Supercomputing Group
> >
> > Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
> >
> > www.cineca.itTel:   +39 051 6171722
> >
> > g.fatigati [AT] cineca.it
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologie

[OMPI users] Open MPI error MPI_ERR_TRUNCATE: message truncated

2010-07-07 Thread Jack Bryan

Dear All:
I need to transfer some messages from worker nodes to the master node on an MPI
cluster with Open MPI.
The number of messages is fixed.
When I increase the number of worker nodes, I get this error:
--
terminate called after throwing an instance of
'boost::exception_detail::clone_impl >'
  what():  MPI_Unpack: MPI_ERR_TRUNCATE: message truncated
[n231:45873] *** Process received signal ***
[n231:45873] Signal: Aborted (6)
[n231:45873] Signal code:  (-6)
[n231:45873] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n231:45873] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3c50230215]
[n231:45873] [ 2] /lib64/libc.so.6(abort+0x110) [0x3c50231cc0]
--
For 40 workers it works well,
but for 50 workers I get this error.
The largest message size is no more than 72 bytes.
Any help is appreciated.
Thanks,
Jack
July 7, 2010
_
The New Busy is not the too busy. Combine all your e-mail accounts with Hotmail.
http://www.windowslive.com/campaign/thenewbusy?tile=multiaccount&ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_4
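
No diagnosis of this particular program, but MPI_ERR_TRUNCATE generally means
a receive (or unpack) buffer was smaller than the matching message. One
defensive pattern, sketched below with plain MPI calls purely as an
illustration, is to probe for the incoming size and allocate the buffer to
match (run with at least 2 ranks):

#include <mpi.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        char msg[72] = "payload from a worker";       /* hypothetical message */
        MPI_Send(msg, 72, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        MPI_Status st;
        MPI_Probe(1, 0, MPI_COMM_WORLD, &st);         /* wait for the message */
        int nbytes = 0;
        MPI_Get_count(&st, MPI_BYTE, &nbytes);        /* ask how large it really is */
        std::vector<char> buf(nbytes);                /* size the buffer to match */
        MPI_Recv(&buf[0], nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("received %d bytes\n", nbytes);
    }

    MPI_Finalize();
    return 0;
}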

Re: [OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Grzegorz Maj
The problem was that orted couldn't find ssh or rsh on that machine.
I've added my installation to the PATH and it now works.
So one question: I will definitely not use MPI_Comm_spawn or any
related stuff. Do I need this ssh? If not, is there any way to tell
orted that it shouldn't look for ssh, since it won't need it?

Regards,
Grzegorz Maj

2010/7/7 Ralph Castain :
> Check your path and ld_library_path- looks like you are picking up some stale 
> binary for orted and/or stale libraries (perhaps getting the default OMPI 
> instead of 1.4.2) on the machine where it fails.
>
> On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:
>
>> Hi,
>> I was trying to run some MPI processes as a singletons. On some of the
>> machines they crash on MPI_Init. I use exactly the same binaries of my
>> application and the same installation of openmpi 1.4.2 on two machines
>> and it works on one of them and fails on the other one. This is the
>> command and its output (test is a simple application calling only
>> MPI_Init and MPI_Finalize):
>>
>> LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_plm_base_select failed
>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> --
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../orte/runtime/orte_init.c at line 132
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_ess_set_name failed
>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> --
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../orte/orted/orted_main.c at line 323
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file
>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>> 381
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file
>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>> 143
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file ../../orte/runtime/orte_init.c at
>> line 132
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_ess_set_name failed
>>  --> Returned value Unable to start a daemon on the local node (-128)
>> instead of ORTE_SUCCESS
>> --
>> --
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.  This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>>  ompi_mpi_init: orte_init failed
>>  --> Returned "Unable to start a daemon on the local node" (-128)
>> instead of "Success" (0)
>> --
>> *** An error occurred in MPI_Init
>> *** before MPI was initialized
>> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> [host01:21865] Abort before MPI_INIT completed successfully; not able
>> to guarantee that all other processes were killed!
>>
>>
>> Any ideas on this?
>>
>> Thanks,
>> Grzegorz Maj

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-07 Thread Grzegorz Maj
2010/7/7 Ralph Castain :
>
> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>
>> Hi Ralph,
>> sorry for the late response, but I couldn't find free time to play
>> with this. Finally I've applied the patch you prepared. I've launched
>> my processes in the way you've described and I think it's working as
>> you expected. None of my processes runs the orted daemon and they can
>> perform MPI operations. Unfortunately I'm still hitting the 65
>> processes issue :(
>> Maybe I'm doing something wrong.
>> I attach my source code. If anybody could have a look on this, I would
>> be grateful.
>>
>> When I run that code with clients_count <= 65 everything works fine:
>> all the processes create a common grid, exchange some information and
>> disconnect.
>> When I set clients_count > 65 the 66th process crashes on
>> MPI_Comm_connect (segmentation fault).
>
> I didn't have time to check the code, but my guess is that you are still 
> hitting some kind of file descriptor or other limit. Check to see what your 
> limits are - usually "ulimit" will tell you.

My limitations are:
time(seconds)        unlimited
file(blocks)         unlimited
data(kb)             unlimited
stack(kb)            10240
coredump(blocks)     0
memory(kb)           unlimited
locked memory(kb)    64
process              200704
nofiles              1024
vmemory(kb)          unlimited
locks                unlimited

Which one do you think could be responsible for that?

I have tried running all 66 processes on one machine and also spreading them
across several machines, and it always crashes the same way on the 66th
process.

>
>>
>> Another thing I would like to know is if it's normal that any of my
>> processes when calling MPI_Comm_connect or MPI_Comm_accept when the
>> other side is not ready, is eating up a full CPU available.
>
> Yes - the waiting process is polling in a tight loop waiting for the 
> connection to be made.
>
>>
>> Any help would be appreciated,
>> Grzegorz Maj
>>
>>
>> 2010/4/24 Ralph Castain :
>>> Actually, OMPI is distributed with a daemon that does pretty much what you
>>> want. Checkout "man ompi-server". I originally wrote that code to support
>>> cross-application MPI publish/subscribe operations, but we can utilize it
>>> here too. Have to blame me for not making it more publicly known.
>>> The attached patch upgrades ompi-server and modifies the singleton startup
>>> to provide your desired support. This solution works in the following
>>> manner:
>>> 1. launch "ompi-server -report-uri ". This starts a persistent
>>> daemon called "ompi-server" that acts as a rendezvous point for
>>> independently started applications.  The problem with starting different
>>> applications and wanting them to MPI connect/accept lies in the need to have
>>> the applications find each other. If they can't discover contact info for
>>> the other app, then they can't wire up their interconnects. The
>>> "ompi-server" tool provides that rendezvous point. I don't like that
>>> comm_accept segfaulted - should have just error'd out.
>>> 2. set OMPI_MCA_orte_server=file:" in the environment where you
>>> will start your processes. This will allow your singleton processes to find
>>> the ompi-server. I automatically also set the envar to connect the MPI
>>> publish/subscribe system for you.
>>> 3. run your processes. As they think they are singletons, they will detect
>>> the presence of the above envar and automatically connect themselves to the
>>> "ompi-server" daemon. This provides each process with the ability to perform
>>> any MPI-2 operation.
>>> I tested this on my machines and it worked, so hopefully it will meet your
>>> needs. You only need to run one "ompi-server" period, so long as you locate
>>> it where all of the processes can find the contact file and can open a TCP
>>> socket to the daemon. There is a way to knit multiple ompi-servers into a
>>> broader network (e.g., to connect processes that cannot directly access a
>>> server due to network segmentation), but it's a tad tricky - let me know if
>>> you require it and I'll try to help.
>>> If you have trouble wiring them all into a single communicator, you might
>>> ask separately about that and see if one of our MPI experts can provide
>>> advice (I'm just the RTE grunt).
>>> HTH - let me know how this works for you and I'll incorporate it into future
>>> OMPI releases.
>>> Ralph
>>>
>>>
>>> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>>>
>>> Hi Ralph,
>>> I'm Krzysztof and I'm working with Grzegorz Maj on this our small
>>> project/experiment.
>>> We definitely would like to give your patch a try. But could you please
>>> explain your solution a little more?
>>> You still would like to start one mpirun per mpi grid, and then have
>>> processes started by us to join the MPI comm?
>>> It is a good solution of course.
>>> But it would be especially preferable to have one daemon running
>>> persistently on our "entry" machine that can handle several mpi grid starts.
>>> 

[OMPI users] Question on checkpoint overhead in Open MPI

2010-07-07 Thread Nguyen Toan
Hello everyone,
I have a question concerning the checkpoint overhead in Open MPI, which I take
to be the difference between the application's runtime with and without
checkpointing.
I observe that as the data size and the number of processes increase, the BLCR
runtime becomes very small compared to the overall checkpoint overhead in
Open MPI. Is this because of increased coordination time for the checkpoint?
And what is included in the overall checkpoint overhead besides BLCR's
checkpoint overhead and the coordination time?
Thank you.

Best Regards,
Nguyen Toan
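
As a side note, the per-run overhead as defined above can be measured simply
by bracketing the run with MPI_Wtime and comparing the timing with and without
checkpointing enabled; a tiny sketch (an illustration only, not part of the
original question):

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    /* ... application work goes here ... */
    MPI_Barrier(MPI_COMM_WORLD);          /* let every rank finish before timing */
    double t1 = MPI_Wtime();

    if (rank == 0)
        std::printf("runtime = %.3f s\n", t1 - t0);

    MPI_Finalize();
    return 0;
}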


Re: [OMPI users] perhaps an openmpi bug, how best to identify?

2010-07-07 Thread Olivier Marsden

Hi Jeff, thanks for the response.
As soon as I can afford to reboot my workstation,
like tomorrow, I will test as you suggest whether the computer
actually hangs or just slows down. For exhaustive kernel logging,
I replaced the following line
kern.*   -/var/log/kern.log
with
kern.*   /var/log/kern.log
in my /etc/rsyslog.d/50-default.conf file, does that look about right?

Regards,

Olivier Marsden

Jeff Squyres wrote:

On Jul 7, 2010, at 10:20 AM, Olivier Marsden wrote:

  

The (7 process) code runs correctly on my workstation using mpich2 (latest
stable version) & ifort 11.1, using intel-mpi & ifort 11.1, but 
randomly hangs the

computer (vanilla ubuntu 9.10 kernel v. 2.6.31 ) to the point where only
a magic
sysrq combination can "save" me (i.e. reboot), when using
- openmpi 1.4.2 compiled from source with gcc, ifort for mpif90
- clustertools v. 8.2.1c distribution from sun/oracle, also based on
openmpi 1.4.2, using sun f90
  for mpif90



Yowza.  Open MPI is user space code, so it should never be able to hang the 
entire computer.  Open MPI and MPICH2 do implement things in very different 
ways, so it's quite possible that we trip entirely different code paths in the 
same linux kernel.

Never say "never" -- it could well be an Open MPI bug.  But it smells like a 
kernel bug...

  

I am prepared to do some testing if that can help, but don't know the
best way to identify what's going on.
I have found no useful information in the syslog files.



Is the machine totally hung?  Or is it just running really, really slowly?  Try leaving 
some kind of slowly-monitoring process running in the background and see if it keeps 
running (perhaps even more slowly than before) when the machine hangs.  E.g., something 
like a shell script that loops over sleeping for a second and then appending the output 
of "date" to a file.  Or something like that.

My point: see if Open MPI went into some hyper-aggressive mode where it's (literally) 
stealing every available cycle and making the machine look hung.  You might even want to 
try running the OMPI procs at a low priority to see if it can help alleviate the 
"steal all cycles" mode (if that is, indeed, what is happening).

If the machine is truly hung, then something else might be going on.  Do any 
kernel logs report anything?  Can you crank up your syslog to report *all* 
events, for example?

  




Re: [OMPI users] perhaps an openmpi bug, how best to identify?

2010-07-07 Thread Jeff Squyres
On Jul 7, 2010, at 12:50 PM, Olivier Marsden wrote:

> Hi Jeff, thanks for the response.
> As soon as I can afford to reboot my workstation,
> like tomorrow, I will test as you suggest whether the computer
> actually hangs or just slows down. For exhaustive kernel logging,
> I replaced the following line
> kern.*   -/var/log/kern.log
> with
> kern.*/var/log/kern.log
> in my /etc/rsyslog.d/50-default.conf file, does that look about right?

I'd add another:

*.*   -/var/log/everything.log

...because who knows what the actual problem is?

And/or, if you have another machine that can listen for syslog, remote syslog 
to that machine so that you might be able to see the results more or less 
immediately.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Ralph Castain

On Jul 7, 2010, at 10:12 AM, Grzegorz Maj wrote:

> The problem was that orted couldn't find ssh nor rsh on that machine.
> I've added my installation to PATH and it now works.
> So one question: I will definitely not use MPI_Comm_spawn or any
> related stuff. Do I need this ssh? If not, is there any way to say
> orted that it shouldn't be looking for ssh because it won't need it?

That's an interesting question - I've never faced that situation before. At the 
moment, the answer is "no". However, I could conjure up a patch that lets the 
orted not select a plm module.

> 
> Regards,
> Grzegorz Maj
> 
> 2010/7/7 Ralph Castain :
>> Check your path and ld_library_path- looks like you are picking up some 
>> stale binary for orted and/or stale libraries (perhaps getting the default 
>> OMPI instead of 1.4.2) on the machine where it fails.
>> 
>> On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:
>> 
>>> Hi,
>>> I was trying to run some MPI processes as a singletons. On some of the
>>> machines they crash on MPI_Init. I use exactly the same binaries of my
>>> application and the same installation of openmpi 1.4.2 on two machines
>>> and it works on one of them and fails on the other one. This is the
>>> command and its output (test is a simple application calling only
>>> MPI_Init and MPI_Finalize):
>>> 
>>> LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
>>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>>> ../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
>>> --
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>> 
>>>  orte_plm_base_select failed
>>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>> --
>>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>>> ../../orte/runtime/orte_init.c at line 132
>>> --
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>> 
>>>  orte_ess_set_name failed
>>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>>> --
>>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>>> ../../orte/orted/orted_main.c at line 323
>>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>>> daemon on the local node in file
>>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>>> 381
>>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>>> daemon on the local node in file
>>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>>> 143
>>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>>> daemon on the local node in file ../../orte/runtime/orte_init.c at
>>> line 132
>>> --
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>> 
>>>  orte_ess_set_name failed
>>>  --> Returned value Unable to start a daemon on the local node (-128)
>>> instead of ORTE_SUCCESS
>>> --
>>> --
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or environment
>>> problems.  This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>> 
>>>  ompi_mpi_init: orte_init failed
>>>  --> Returned "Unable to start a daemon on the local node" (-128)
>>> instead of "Success" (0)
>>> --
>>> *** An error occurred in MPI_Init

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-07 Thread Ralph Castain
I would guess the #files limit of 1024. However, if it behaves the same way 
when spread across multiple machines, I would suspect it is somewhere in your 
program itself. Given that the segfault is in your process, can you use gdb to 
look at the core file and see where and why it fails?
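
For readers following the thread, the accept/connect handshake being exercised
here is built from the standard MPI-2 calls sketched below; this is only an
outline (the poster's attached code is not reproduced in the digest), and with
singleton processes the name lookup is what ompi-server mediates.

#include <mpi.h>
#include <cstring>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    char port[MPI_MAX_PORT_NAME];

    if (argc > 1 && std::strcmp(argv[1], "server") == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        /* Publishing needs a name service, e.g. the ompi-server daemon. */
        MPI_Publish_name("demo_service", MPI_INFO_NULL, port);
        MPI_Comm client;
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
        std::printf("server: client connected\n");
        MPI_Comm_disconnect(&client);
        MPI_Unpublish_name("demo_service", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        MPI_Lookup_name("demo_service", MPI_INFO_NULL, port);
        MPI_Comm server;
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &server);
        std::printf("client: connected to server\n");
        MPI_Comm_disconnect(&server);
    }

    MPI_Finalize();
    return 0;
}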

On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:

> 2010/7/7 Ralph Castain :
>> 
>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>> 
>>> Hi Ralph,
>>> sorry for the late response, but I couldn't find free time to play
>>> with this. Finally I've applied the patch you prepared. I've launched
>>> my processes in the way you've described and I think it's working as
>>> you expected. None of my processes runs the orted daemon and they can
>>> perform MPI operations. Unfortunately I'm still hitting the 65
>>> processes issue :(
>>> Maybe I'm doing something wrong.
>>> I attach my source code. If anybody could have a look on this, I would
>>> be grateful.
>>> 
>>> When I run that code with clients_count <= 65 everything works fine:
>>> all the processes create a common grid, exchange some information and
>>> disconnect.
>>> When I set clients_count > 65 the 66th process crashes on
>>> MPI_Comm_connect (segmentation fault).
>> 
>> I didn't have time to check the code, but my guess is that you are still 
>> hitting some kind of file descriptor or other limit. Check to see what your 
>> limits are - usually "ulimit" will tell you.
> 
> My limitations are:
> time(seconds)unlimited
> file(blocks) unlimited
> data(kb) unlimited
> stack(kb)10240
> coredump(blocks) 0
> memory(kb)   unlimited
> locked memory(kb)64
> process  200704
> nofiles  1024
> vmemory(kb)  unlimited
> locksunlimited
> 
> Which one do you think could be responsible for that?
> 
> I was trying to run all the 66 processes on one machine or spread them
> across several machines and it always crashes the same way on the 66th
> process.
> 
>> 
>>> 
>>> Another thing I would like to know is if it's normal that any of my
>>> processes when calling MPI_Comm_connect or MPI_Comm_accept when the
>>> other side is not ready, is eating up a full CPU available.
>> 
>> Yes - the waiting process is polling in a tight loop waiting for the 
>> connection to be made.
>> 
>>> 
>>> Any help would be appreciated,
>>> Grzegorz Maj
>>> 
>>> 
>>> 2010/4/24 Ralph Castain :
 Actually, OMPI is distributed with a daemon that does pretty much what you
 want. Checkout "man ompi-server". I originally wrote that code to support
 cross-application MPI publish/subscribe operations, but we can utilize it
 here too. Have to blame me for not making it more publicly known.
 The attached patch upgrades ompi-server and modifies the singleton startup
 to provide your desired support. This solution works in the following
 manner:
 1. launch "ompi-server -report-uri ". This starts a persistent
 daemon called "ompi-server" that acts as a rendezvous point for
 independently started applications.  The problem with starting different
 applications and wanting them to MPI connect/accept lies in the need to 
 have
 the applications find each other. If they can't discover contact info for
 the other app, then they can't wire up their interconnects. The
 "ompi-server" tool provides that rendezvous point. I don't like that
 comm_accept segfaulted - should have just error'd out.
 2. set OMPI_MCA_orte_server=file:" in the environment where you
 will start your processes. This will allow your singleton processes to find
 the ompi-server. I automatically also set the envar to connect the MPI
 publish/subscribe system for you.
 3. run your processes. As they think they are singletons, they will detect
 the presence of the above envar and automatically connect themselves to the
 "ompi-server" daemon. This provides each process with the ability to 
 perform
 any MPI-2 operation.
 I tested this on my machines and it worked, so hopefully it will meet your
 needs. You only need to run one "ompi-server" period, so long as you locate
 it where all of the processes can find the contact file and can open a TCP
 socket to the daemon. There is a way to knit multiple ompi-servers into a
 broader network (e.g., to connect processes that cannot directly access a
 server due to network segmentation), but it's a tad tricky - let me know if
 you require it and I'll try to help.
 If you have trouble wiring them all into a single communicator, you might
 ask separately about that and see if one of our MPI experts can provide
 advice (I'm just the RTE grunt).
 HTH - let me know how this works for you and I'll incorporate it into 
 future
 OMPI releases.
 Ralph
 
 
 On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
 
 Hi Ralph,
 I'm Krzysztof and I'm work

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
I'm afraid the bottom line is that OMPI simply doesn't support core-level 
allocations. I tried it on a slurm machine available to me, using our devel 
trunk as well as 1.4, with the same results.

Not sure why you are trying to run that way, but I'm afraid you can't do it 
with OMPI.

On Jul 6, 2010, at 3:20 PM, David Roundy wrote:

> On Tue, Jul 6, 2010 at 12:31 PM, Ralph Castain  wrote:
>> Thanks - that helps.
>> 
>> As you note, the issue is that OMPI doesn't support the core-level 
>> allocation options of slurm - never has, probably never will. What I found 
>> interesting, though, was that your envars don't anywhere indicate that this 
>> is what you requested. I don't see anything there that would case the daemon 
>> to crash.
>> 
>> So I'm left to guess that this is an issue where slurm doesn't like 
>> something OMPI does because it violates that core-level option. Can you add 
>> --display-devel-map to your mpirun command? It would be interesting to see 
>> where it thinks the daemon should go.
>> 
>> Just to check - the envars you sent in your other note came from the sbatch 
>> -c 2 run, yes?
> 
> Yes indeed.
> 
> Just for good measure, I'm attaching my current test script submit.sh
> and its complete output, also run with sbatch -c 2.  Oddly enough
> adding --display-devel-map doesn't cause mpirun to generate any output
> before crashing.  Does this give you any sort of a hint?  :(  Any
> other suggestions for tracking the source of this down? I'd really
> hoped you'd tell me that one of the env vars told you that my slurm
> config was messed up, since that would seem pretty easy to fix, once I
> knew how it was messed up...
> 
> David
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] MPI_GET beyond 2 GB displacement

2010-07-07 Thread Jeff Squyres
Sorry for the delay in replying.  :-(

It's because, for a 32-bit signed int, the value turns negative at 2 GB.
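
In other words: if the displacement is computed or stored in a default 32-bit
integer it wraps at 2 GB, whereas an address-sized integer (MPI_Aint in C,
INTEGER(KIND=MPI_ADDRESS_KIND) in Fortran) holds it fine. Whether that is what
happened in this particular program is an assumption; the sketch below only
shows the boundary itself.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    long long two_gb = 2147483648LL;       /* 2 GB expressed in bytes */
    int      narrow  = (int)two_gb;        /* does not fit: typically wraps negative */
    MPI_Aint wide    = (MPI_Aint)two_gb;   /* address-sized displacement type */

    std::printf("int: %d   MPI_Aint: %lld\n", narrow, (long long)wide);

    MPI_Finalize();
    return 0;
}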


On Jun 29, 2010, at 1:46 PM, Price, Brian M (N-KCI) wrote:

> OpenMPI version: 1.3.3
>  
> Platform: IBM P5
>  
> Built OpenMPI 64-bit (i.e., CFLAGS=-q64, CXXFLAGS=-q64, -FFLAGS=-q64, 
> -FCFLAGS=-q64)
>  
> FORTRAN 90 test program:
> -  Create a large array (3.6 GB of 32-bit INTs)
> -  Initialize MPI
> -  Create a large window to encompass large array (3.6 GB)
> -  Have PE 0 get 1 32-bit INT from PE1
> o   Lock the window
> o   MPI_GET
> o   Unlock the window
> -  Free the window
> -  Finalize MPI
>  
> Built FORTRAN 90 test program 64-bit using OpenMPI wrapper compiler (mpif90 
> –q64).
>  
> Why would this MPI_GET work fine with displacements all the way up to just 
> under 2 GB, and then fail as soon as the displacement hits 2 GB?
>  
> The MPI_GET succeeds with a displacement of 2147483644 (4 bytes less than 2 
> GB).
>  
> I get a segmentation fault (address not mapped) when the displacement is 
> 2147483648 (2 GB) or larger.
>  
> Thanks.
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread David Roundy
On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain  wrote:
> I'm afraid the bottom line is that OMPI simply doesn't support core-level 
> allocations. I tried it on a slurm machine available to me, using our devel 
> trunk as well as 1.4, with the same results.
>
> Not sure why you are trying to run that way, but I'm afraid you can't do it 
> with OMPI.

Hmmm.  I'm still trying to figure out how to configure slurm properly.
I want it to be able to put one single-process job per core on each
machine.  I just now figured out that there is a slurm "-n" option.  I
had previously only been aware of the "-N" and "-c" options, and the
latter seemed the closer match.  It looks like everything works fine with
the "-n" option.

However, wouldn't it be a good idea to avoid crashing when "-c 2" is
used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
It seems like this would be an important feature to be able to use if
one wanted to run mpi with multiple threads per node (as I've been
known to do in the past).

In my troubleshooting, I came up with the following script, which can
reliably crash mpirun (when run without slurm, but obviously
pretending to be running under slurm).  :(

#!/bin/sh
set -ev
export SLURM_JOBID=137
export SLURM_TASKS_PER_NODE=1
export SLURM_NNODES=1
export SLURM_CPUS_PER_TASK=2
export SLURM_NODELIST=localhost
mpirun --display-devel-map echo hello world
echo it worked

David


Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
Ah, if only it were that simple. Slurm is a very difficult beast to interface 
with, and I have yet to find a single, reliable marker across the various slurm 
releases to detect options we cannot support.


On Jul 7, 2010, at 11:59 AM, David Roundy wrote:

> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain  wrote:
>> I'm afraid the bottom line is that OMPI simply doesn't support core-level 
>> allocations. I tried it on a slurm machine available to me, using our devel 
>> trunk as well as 1.4, with the same results.
>> 
>> Not sure why you are trying to run that way, but I'm afraid you can't do it 
>> with OMPI.
> 
> Hmmm.  I'm still trying to figure out how to configure slurm properly.
> I want it to be able to put one single-process job per core on each
> machine.  I just now figured out that there is a slurm "-n" option.  I
> had previously only been aware of the "-N" and "-c" options, and the
> latter was closer match.  It looks like everything works fine with the
> "-n" option.
> 
> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
> It seems like this would be an important feature to be able to use if
> one wanted to run mpi with multiple threads per node (as I've been
> known to do in the past).
> 
> In my trouble shooting, I came up with the following script, which can
> reliably crash mpirun (when run without slurm, but obviously
> pretending to be running under slurm).  :(
> 
> #!/bin/sh
> set -ev
> export SLURM_JOBID=137
> export SLURM_TASKS_PER_NODE=1
> export SLURM_NNODES=1
> export SLURM_CPUS_PER_TASK=2
> export SLURM_NODELIST=localhost
> mpirun --display-devel-map echo hello world
> echo it worked
> 
> David
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread David Roundy
Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
team would be hand-in-glove with the OMPI team in making sure the
interface between the two is smooth.  :(

David

On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain  wrote:
> Ah, if only it were that simple. Slurm is a very difficult beast to interface 
> with, and I have yet to find a single, reliable marker across the various 
> slurm releases to detect options we cannot support.
>
>
> On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
>
>> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain  wrote:
>>> I'm afraid the bottom line is that OMPI simply doesn't support core-level 
>>> allocations. I tried it on a slurm machine available to me, using our devel 
>>> trunk as well as 1.4, with the same results.
>>>
>>> Not sure why you are trying to run that way, but I'm afraid you can't do it 
>>> with OMPI.
>>
>> Hmmm.  I'm still trying to figure out how to configure slurm properly.
>> I want it to be able to put one single-process job per core on each
>> machine.  I just now figured out that there is a slurm "-n" option.  I
>> had previously only been aware of the "-N" and "-c" options, and the
>> latter was closer match.  It looks like everything works fine with the
>> "-n" option.
>>
>> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
>> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
>> It seems like this would be an important feature to be able to use if
>> one wanted to run mpi with multiple threads per node (as I've been
>> known to do in the past).
>>
>> In my trouble shooting, I came up with the following script, which can
>> reliably crash mpirun (when run without slurm, but obviously
>> pretending to be running under slurm).  :(
>>
>> #!/bin/sh
>> set -ev
>> export SLURM_JOBID=137
>> export SLURM_TASKS_PER_NODE=1
>> export SLURM_NNODES=1
>> export SLURM_CPUS_PER_TASK=2
>> export SLURM_NODELIST=localhost
>> mpirun --display-devel-map echo hello world
>> echo it worked
>>
>> David
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
David Roundy



Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
No...afraid not. Things work pretty well, but there are places where things 
just don't mesh. Sub-node allocation in particular is an issue as it implies 
binding, and slurm and ompi have conflicting methods.

It all can get worked out, but we have limited time and nobody cares enough to 
put in the effort. Slurm just isn't used enough to make it worthwhile (too 
small an audience).

On Jul 7, 2010, at 12:32 PM, David Roundy wrote:

> Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
> team would be hand-in-glove with the OMPI team in making sure the
> interface between the two is smooth.  :(
> 
> David
> 
> On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain  wrote:
>> Ah, if only it were that simple. Slurm is a very difficult beast to 
>> interface with, and I have yet to find a single, reliable marker across the 
>> various slurm releases to detect options we cannot support.
>> 
>> 
>> On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
>> 
>>> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain  wrote:
 I'm afraid the bottom line is that OMPI simply doesn't support core-level 
 allocations. I tried it on a slurm machine available to me, using our 
 devel trunk as well as 1.4, with the same results.
 
 Not sure why you are trying to run that way, but I'm afraid you can't do 
 it with OMPI.
>>> 
>>> Hmmm.  I'm still trying to figure out how to configure slurm properly.
>>> I want it to be able to put one single-process job per core on each
>>> machine.  I just now figured out that there is a slurm "-n" option.  I
>>> had previously only been aware of the "-N" and "-c" options, and the
>>> latter was closer match.  It looks like everything works fine with the
>>> "-n" option.
>>> 
>>> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
>>> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
>>> It seems like this would be an important feature to be able to use if
>>> one wanted to run mpi with multiple threads per node (as I've been
>>> known to do in the past).
>>> 
>>> In my trouble shooting, I came up with the following script, which can
>>> reliably crash mpirun (when run without slurm, but obviously
>>> pretending to be running under slurm).  :(
>>> 
>>> #!/bin/sh
>>> set -ev
>>> export SLURM_JOBID=137
>>> export SLURM_TASKS_PER_NODE=1
>>> export SLURM_NNODES=1
>>> export SLURM_CPUS_PER_TASK=2
>>> export SLURM_NODELIST=localhost
>>> mpirun --display-devel-map echo hello world
>>> echo it worked
>>> 
>>> David
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
> 
> 
> -- 
> David Roundy
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Jeff Squyres
On Jul 7, 2010, at 2:37 PM, Ralph Castain wrote:

> No...afraid not. Things work pretty well, but there are places where things 
> just don't mesh. Sub-node allocation in particular is an issue as it implies 
> binding, and slurm and ompi have conflicting methods.
> 
> It all can get worked out, but we have limited time and nobody cares enough 
> to put in the effort. Slurm just isn't used enough to make it worthwhile (too 
> small an audience).

...that being said, patches would be appreciated.  :-)

But, as Ralph alluded to, there's quite a history -- anyone attempting to make 
such a patch should probably have a phone call with Ralph to get up to speed on 
both the history and the why-things-are-the-way-they-are story about Open MPI's 
SLURM support before starting.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Adding libraries to wrapper compiler at run-time

2010-07-07 Thread Jeremiah Willcock
The Open MPI FAQ shows how to add libraries to the Open MPI wrapper 
compilers when building them (using configure flags), but I would like to 
add flags for a specific run of the wrapper compiler.  Setting OMPI_LIBS 
overrides the necessary MPI libraries, and it does not appear that there 
is an easy way to get just the flags that OMPI_LIBS contains by default 
(either using -showme:link or ompi_info).  Is there a way to add to the 
default set of OMPI_LIBS rather than overriding it entirely?  Thank you 
for your help.


-- Jeremiah Willcock


Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Douglas Guptill
On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:

> No...afraid not. Things work pretty well, but there are places
> where things just don't mesh. Sub-node allocation in particular is
> an issue as it implies binding, and slurm and ompi have conflicting
> methods.
>
> It all can get worked out, but we have limited time and nobody cares
> enough to put in the effort. Slurm just isn't used enough to make it
> worthwhile (too small an audience).

I am about to get my first HPC cluster (128 nodes), and was
considering slurm.  We do use MPI.

Should I be looking at Torque instead for a queue manager?

Suggestions appreciated,
Douglas.
-- 
  Douglas Guptill   voice: 902-461-9749
  Research Assistant, LSC 4640  email: douglas.gupt...@dal.ca
  Oceanography Department   fax:   902-494-3877
  Dalhousie University
  Halifax, NS, B3H 4J1, Canada



Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
You'll get passionate advocates from all the various resource managers - there 
really isn't a right/wrong answer. Torque is more widely used, but any of them 
will do.

None are perfect, IMHO.

On Jul 7, 2010, at 1:16 PM, Douglas Guptill wrote:

> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
> 
>> No...afraid not. Things work pretty well, but there are places
>> where things just don't mesh. Sub-node allocation in particular is
>> an issue as it implies binding, and slurm and ompi have conflicting
>> methods.
>> 
>> It all can get worked out, but we have limited time and nobody cares
>> enough to put in the effort. Slurm just isn't used enough to make it
>> worthwhile (too small an audience).
> 
> I am about to get my first HPC cluster (128 nodes), and was
> considering slurm.  We do use MPI.
> 
> Should I be looking at Torque instead for a queue manager?
> 
> Suggestions appreciated,
> Douglas.
> -- 
>  Douglas Guptill   voice: 902-461-9749
>  Research Assistant, LSC 4640  email: douglas.gupt...@dal.ca
>  Oceanography Department   fax:   902-494-3877
>  Dalhousie University
>  Halifax, NS, B3H 4J1, Canada
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Jeff Squyres
+1.

FWIW, Open MPI works pretty well with SLURM; I use it back here at Cisco for 
all my testing.  That one particular option you're testing doesn't seem to 
work, but all in all, the integration works fairly well.


On Jul 7, 2010, at 3:27 PM, Ralph Castain wrote:

> You'll get passionate advocates from all the various resource managers - 
> there really isn't a right/wrong answer. Torque is more widely used, but any 
> of them will do.
> 
> None are perfect, IMHO.
> 
> On Jul 7, 2010, at 1:16 PM, Douglas Guptill wrote:
> 
>> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
>> 
>>> No...afraid not. Things work pretty well, but there are places
>>> where things just don't mesh. Sub-node allocation in particular is
>>> an issue as it implies binding, and slurm and ompi have conflicting
>>> methods.
>>> 
>>> It all can get worked out, but we have limited time and nobody cares
>>> enough to put in the effort. Slurm just isn't used enough to make it
>>> worthwhile (too small an audience).
>> 
>> I am about to get my first HPC cluster (128 nodes), and was
>> considering slurm.  We do use MPI.
>> 
>> Should I be looking at Torque instead for a queue manager?
>> 
>> Suggestions appreciated,
>> Douglas.
>> -- 
>> Douglas Guptill   voice: 902-461-9749
>> Research Assistant, LSC 4640  email: douglas.gupt...@dal.ca
>> Oceanography Department   fax:   902-494-3877
>> Dalhousie University
>> Halifax, NS, B3H 4J1, Canada
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] EXTERNAL: Re: MPI_GET beyond 2 GB displacement

2010-07-07 Thread Price, Brian M (N-KCI)
Jeff,

I understand what you've said about 32-bit signed INTs, but in my program, the 
displacement variable that I use for the MPI_GET call is a 64-bit INT (KIND = 
8).

In fact, the only thing in my program that isn't a 64-bit INT is the array that 
I'm trying to transfer values from.

I would post my entire test program, but I don't have direct internet access 
from the machine that I'm working on.  Do you need to see the test program?

Am I still missing something?

Thanks.

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Jeff Squyres
Sent: Wednesday, July 07, 2010 10:39 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] MPI_GET beyond 2 GB displacement

Sorry for the delay in replying.  :-(

It's because for a 32 bit signed int, at 2GB, the value turns negative.


On Jun 29, 2010, at 1:46 PM, Price, Brian M (N-KCI) wrote:

> OpenMPI version: 1.3.3
>  
> Platform: IBM P5
>  
> Built OpenMPI 64-bit (i.e., CFLAGS=-q64, CXXFLAGS=-q64, -FFLAGS=-q64, 
> -FCFLAGS=-q64)
>  
> FORTRAN 90 test program:
> -  Create a large array (3.6 GB of 32-bit INTs)
> -  Initialize MPI
> -  Create a large window to encompass large array (3.6 GB)
> -  Have PE 0 get 1 32-bit INT from PE1
> o   Lock the window
> o   MPI_GET
> o   Unlock the window
> -  Free the window
> -  Finalize MPI
>  
> Built FORTRAN 90 test program 64-bit using OpenMPI wrapper compiler (mpif90 
> -q64).
>  
> Why would this MPI_GET work fine with displacements all the way up to just 
> under 2 GB, and then fail as soon as the displacement hits 2 GB?
>  
> The MPI_GET succeeds with a displacement of 2147483644 (4 bytes less than 2 
> GB).
>  
> I get a segmentation fault (address not mapped) when the displacement is 
> 2147483648 (2 GB) or larger.
>  
> Thanks.
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] EXTERNAL: Re: MPI_GET beyond 2 GB displacement

2010-07-07 Thread Jed Brown
On Wed, 07 Jul 2010 15:51:41 -0600, "Price, Brian M (N-KCI)" 
 wrote:
> Jeff,
> 
> I understand what you've said about 32-bit signed INTs, but in my program, 
> the displacement variable that I use for the MPI_GET call is a 64-bit INT 
> (KIND = 8).

The MPI Fortran bindings expect a standard int.  Your program is only
working because your system is little endian, so the first 4 bytes are
the low bytes (correct for numbers less than 2^31); it would be
completely broken on a big-endian system.  This is a library issue: you
can't fix it by using different-sized ints in your program, and you would
see compiler errors due to the type mismatch if you were using Fortran
90 (which is capable of some type checking).

Jed


Re: [OMPI users] EXTERNAL: Re: MPI_GET beyond 2 GB displacement

2010-07-07 Thread Price, Brian M (N-KCI)
Jed,

The IBM P5 I'm working on is big endian.

The test program I'm using is written in Fortran 90 (as stated in my question).

I imagine this is indeed a library issue, but I still don't understand what 
I've done wrong here.

Can anyone tell me how I should be building my OpenMPI libraries and my test 
program so that this test would work correctly?

Thanks.

-Original Message-
From: Jed Brown [mailto:five...@gmail.com] On Behalf Of Jed Brown
Sent: Wednesday, July 07, 2010 3:08 PM
To: Price, Brian M (N-KCI); Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: MPI_GET beyond 2 GB displacement

On Wed, 07 Jul 2010 15:51:41 -0600, "Price, Brian M (N-KCI)" 
 wrote:
> Jeff,
> 
> I understand what you've said about 32-bit signed INTs, but in my program, 
> the displacement variable that I use for the MPI_GET call is a 64-bit INT 
> (KIND = 8).

The MPI Fortran bindings expect a standard int, your program is only working 
because your system is little endian so the first 4 bytes are the low bytes 
(correct for numbers less than 2^31), it would be completely broken on a big 
endian system.  This is a library issue, you can't fix it by using different 
sized ints in your program and you would see compiler errors due to the type 
mismatch if you were using Fortran 90 (which is capable of some type checking).

Jed


[OMPI users] configure options

2010-07-07 Thread Zhigang Wei
Dear all,

How do I decide on the configure options? I am greatly confused.

I am using my school's high-performance computer, but the Open MPI installed 
there is version 1.3.2, which is old, so I want to build a newer one.

I am new to Open MPI. I built it and installed it into my own directory, but it 
doesn't work. I used the following configure options:

./configure --with-sge --prefix=$MY_OWN_DIR --with-psm 

but it does not work and fails with lines like 
..lib/openmpi/mca_ess_hnp: file not found (ignored) 
in the output file.

I guess my configure is wrong. Could you tell me the meaning of --with-psm and 
--with-sge, and do I need to add other options? I believe the compute nodes use 
InfiniBand; how do I build with support for that? If I don't have root (su) 
rights, can I still build it? What should I pay attention to if I want to build 
and use my own Open MPI?

On a personal multicore computer, building is easy and mpirun runs the program 
without any problems. But on the school's HPC system, it fails all the time.

Please help. Thank you all.


Zhigang Wei

NatHaz Modeling Laboratory
University of Notre Dame
112J FitzPatrick Hall
Notre Dame, IN 46556 



2010-07-07


Re: [OMPI users] Open MPI error MPI_ERR_TRUNCATE: message truncated

2010-07-07 Thread David Zhang
This error typically occurs when the received message is bigger than the
specified buffer size.  You need to narrow your code down to the offending
receive command to see if this is indeed the case.
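
A sketch of one way to avoid that on the receive side (the source/tag names are placeholders, not from Jack's code): probe first, size the buffer with MPI_Get_count, then receive.

#include <mpi.h>
#include <stdlib.h>

/* Receive a message of unknown length without risking MPI_ERR_TRUNCATE:
 * probe it, ask how big it is, allocate, then receive. */
void recv_any_size(int source, int tag, MPI_Comm comm)
{
    MPI_Status status;
    int count;

    MPI_Probe(source, tag, comm, &status);       /* wait for a matching message */
    MPI_Get_count(&status, MPI_CHAR, &count);    /* how many chars it carries */

    char *buf = malloc(count);
    MPI_Recv(buf, count, MPI_CHAR, status.MPI_SOURCE, status.MPI_TAG,
             comm, MPI_STATUS_IGNORE);

    /* ... unpack / use buf ... */
    free(buf);
}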

On Wed, Jul 7, 2010 at 8:42 AM, Jack Bryan  wrote:

>  Dear All:
>
> I need to transfer some messages from the workers to the master node on an
> MPI cluster with Open MPI.
>
> The number of messages is fixed.
>
> When I increase the number of worker nodes, I get this error:
>
> --
>
> terminate called after throwing an instance of
> 'boost::exception_detail::clone_impl
> >'
>   what():  MPI_Unpack: MPI_ERR_TRUNCATE: message truncated
> [n231:45873] *** Process received signal ***
> [n231:45873] Signal: Aborted (6)
> [n231:45873] Signal code:  (-6)
> [n231:45873] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
> [n231:45873] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3c50230215]
> [n231:45873] [ 2] /lib64/libc.so.6(abort+0x110) [0x3c50231cc0]
>
>
> --
>
> For 40 workers, it works well.
>
> But for 50 workers, I get this error.
>
> The largest message size is no more than 72 bytes.
>
> Any help is appreciated.
>
> thanks
>
> Jack
>
> July 7 2010
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
David Zhang
University of California, San Diego


Re: [OMPI users] MPI_GET beyond 2 GB displacement

2010-07-07 Thread Changsheng Jiang
Does it mean we have to split the MPI_Get into many 2 GB parts?

I have an MPI program which first serializes an object and sends it to another
process. The char array after serialization is just below 2 GB now, but the
data keeps growing.

One method is to build a large datatype with MPI_Type_vector, align the char
array to its upper bound, and do the Send and Recv with that large type
(sketched below). I think this is better than splitting the send and recv.
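
A possible sketch of that workaround, using MPI_Type_contiguous (a simpler cousin of MPI_Type_vector for this case) and assuming, for brevity, that the total size is an exact multiple of the chunk; a real version would send the remainder as a separate plain MPI_CHAR message:

#include <mpi.h>

#define CHUNK (1 << 20)   /* bytes per derived-type element; an arbitrary choice */

/* Send nbytes of buf as nbytes/CHUNK elements of a CHUNK-byte datatype, so the
 * int count handed to MPI_Send stays far below 2^31. */
void send_big(char *buf, long long nbytes, int dest, int tag, MPI_Comm comm)
{
    MPI_Datatype chunk_type;
    MPI_Type_contiguous(CHUNK, MPI_CHAR, &chunk_type);
    MPI_Type_commit(&chunk_type);

    int nchunks = (int)(nbytes / CHUNK);   /* small enough to fit in an int */
    MPI_Send(buf, nchunks, chunk_type, dest, tag, comm);

    MPI_Type_free(&chunk_type);
}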

Is there any graceful method to avoid the problem? Or would using size_t (or
ssize_t) for the length parameters be more reasonable in a new MPI
implementation?

 Changsheng Jiang


On Thu, Jul 8, 2010 at 01:39, Jeff Squyres  wrote:

> Sorry for the delay in replying.  :-(
>
> It's because for a 32 bit signed int, at 2GB, the value turns negative.
>
>
> On Jun 29, 2010, at 1:46 PM, Price, Brian M (N-KCI) wrote:
>
> > OpenMPI version: 1.3.3
> >
> > Platform: IBM P5
> >
> > Built OpenMPI 64-bit (i.e., CFLAGS=-q64, CXXFLAGS=-q64, -FFLAGS=-q64,
> -FCFLAGS=-q64)
> >
> > FORTRAN 90 test program:
> > -  Create a large array (3.6 GB of 32-bit INTs)
> > -  Initialize MPI
> > -  Create a large window to encompass large array (3.6 GB)
> > -  Have PE 0 get 1 32-bit INT from PE1
> > o   Lock the window
> > o   MPI_GET
> > o   Unlock the window
> > -  Free the window
> > -  Finalize MPI
> >
> > Built FORTRAN 90 test program 64-bit using OpenMPI wrapper compiler
> (mpif90 –q64).
> >
> > Why would this MPI_GET work fine with displacements all the way up to
> just under 2 GB, and then fail as soon as the displacement hits 2 GB?
> >
> > The MPI_GET succeeds with a displacement of 2147483644 (4 bytes less than
> 2 GB).
> >
> > I get a segmentation fault (address not mapped) when the displacement is
> 2147483648 (2 GB) or larger.
> >
> > Thanks.
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] EXTERNAL: Re: MPI_GET beyond 2 GB displacement

2010-07-07 Thread Jed Brown
On Wed, 07 Jul 2010 17:34:44 -0600, "Price, Brian M (N-KCI)" 
 wrote:
> Jed,
> 
> The IBM P5 I'm working on is big endian.

Sorry, that didn't register.  The displ argument is MPI_Aint, which is 8
bytes (at least on LP64, probably also on LLP64), so your use of kind=8
for that is certainly correct.  The count argument is a plain int; I
don't see how your code could be correct if you pass an 8-byte integer
there when it expects a 4-byte int (since the upper 4 bytes would be
used on a big-endian system).
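
In C the widths are explicit; a sketch of the same lock/get/unlock sequence the test performs (variable names are placeholders, and the 2 GB displacement is simply the value under discussion):

#include <mpi.h>

void get_one_int(MPI_Win win, int *origin, int target_rank)
{
    /* target_disp is an MPI_Aint (address-sized, 8 bytes on a 64-bit build),
     * so a 2 GB displacement is representable; the counts and the rank are
     * plain 32-bit ints. */
    MPI_Aint target_disp = 2147483648LL;

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, target_rank, 0, win);
    MPI_Get(origin, 1, MPI_INT, target_rank,
            target_disp, 1, MPI_INT, win);
    MPI_Win_unlock(target_rank, win);
}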

> The test program I'm using is written in Fortran 90 (as stated in my 
> question).

Do you "use mpi" or the F77 interface?

> I imagine this is indeed a library issue, but I still don't understand what 
> I've done wrong here.

I can reproduce this in C on x86-64, even with displ much smaller than
2^31 (e.g. by setting displ_unit=4).  Apparently Open MPI multiplies
displ*displ_unit and stuffs the result in an int (somewhere in the
implementation); MPICH2 works correctly for me with large displacements.

  https://svn.open-mpi.org/trac/ompi/ticket/2472
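
The arithmetic alone shows why even a modest displ fails once displ_unit > 1 if the product lands in an int (a standalone illustration, not Open MPI's actual code; assumes a 64-bit MPI_Aint):

#include <stdio.h>
#include <mpi.h>

int main(void)
{
    MPI_Aint displ = 700000000;   /* well below 2^31 */
    int disp_unit  = 4;           /* e.g. a window of 4-byte ints */

    long long full = (long long)displ * disp_unit;   /* 2800000000 */
    int truncated  = (int)(displ * disp_unit);       /* wraps negative */

    printf("byte offset as 64-bit: %lld\n", full);
    printf("byte offset as int:    %d\n", truncated);
    return 0;
}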

Jed


Re: [OMPI users] Open MPI error MPI_ERR_TRUNCATE: message truncated

2010-07-07 Thread Jack Bryan

Thanks.
What if the master has to send and receive a large data package?
Does it have to be split into multiple parts?
That may increase communication overhead.
I can use an MPI datatype to wrap it up as a specific datatype which carries
the data, but what if the data is very large: 1 KB, 10 KB, or 100 KB?
The master needs to collect the same datatype from all workers.
So the master has to set up a data pool to receive all the data.
The buffer that MPI provides on the master may not be large enough for this.
Are there other ways to do it?
Any help is appreciated. 
thanks
Jack
july 7  2010 
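
For the collection step, one common pattern is to gather the per-worker lengths first and then use MPI_Gatherv, so the master can size its receive buffer before the payloads arrive; a rough sketch with placeholder names:

#include <mpi.h>
#include <stdlib.h>

/* Gather one byte buffer of arbitrary per-rank length onto rank 0. */
void gather_results(char *my_buf, int my_len, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    int *lens = NULL, *displs = NULL;
    char *all = NULL;
    if (rank == 0) {
        lens   = malloc(size * sizeof(int));
        displs = malloc(size * sizeof(int));
    }

    /* Step 1: the master learns how much each worker will send. */
    MPI_Gather(&my_len, 1, MPI_INT, lens, 1, MPI_INT, 0, comm);

    int total = 0;
    if (rank == 0) {
        for (int i = 0; i < size; i++) { displs[i] = total; total += lens[i]; }
        all = malloc(total);
    }

    /* Step 2: gather the variable-length payloads into one buffer on rank 0. */
    MPI_Gatherv(my_buf, my_len, MPI_CHAR,
                all, lens, displs, MPI_CHAR, 0, comm);

    /* ... rank 0 can now unpack each worker's chunk from all[displs[i]] ... */
    if (rank == 0) { free(all); free(lens); free(displs); }
}
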
From: solarbik...@gmail.com
Date: Wed, 7 Jul 2010 17:32:27 -0700
To: us...@open-mpi.org
Subject: Re: [OMPI users] Open MPI error MPI_ERR_TRUNCATE: message truncated

This error typically occurs when the received message is bigger than the 
specified buffer size.  You need to narrow your code down to offending receive 
command to see if this is indeed the case.



On Wed, Jul 7, 2010 at 8:42 AM, Jack Bryan  wrote:







Dear All:
I need to transfer some messages from the workers to the master node on an MPI 
cluster with Open MPI.
The number of messages is fixed. 
When I increase the number of worker nodes, I get this error: 


--
terminate called after throwing an instance of 
'boost::exception_detail::clone_impl
 >'
  what():  MPI_Unpack: MPI_ERR_TRUNCATE: message truncated
[n231:45873] *** Process received signal ***
[n231:45873] Signal: Aborted (6)
[n231:45873] Signal code:  (-6)
[n231:45873] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n231:45873] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3c50230215]
[n231:45873] [ 2] /lib64/libc.so.6(abort+0x110) [0x3c50231cc0]

--


For 40 workers, it works well. 
But for 50 workers, I get this error. 
The largest message size is no more than 72 bytes. 


Any help is appreciated. 
thanks
Jack
July 7 2010   



___

users mailing list

us...@open-mpi.org

http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
David Zhang
University of California, San Diego
  