Re: [OMPI users] Fortran support on Windows Open-MPI

2010-05-07 Thread Trent Creekmore

Compaq Visual Fortran for Windows existed, but HP acquired Compaq. HP later
decided it did not want it, and sold it, along with the Alpha processor technology,
to Intel. So now it's the Intel Visual Fortran Compiler for Windows.
In addition, if you don't want that package, they also sell a plug-in
for Microsoft Visual Studio. There is also an HPC/parallel environment for
Visual Studio, but none of these are cheap.

I don't see why you can't include Open MPI libraries in that environment.

Trent


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Damien
Sent: Thursday, May 06, 2010 10:53 PM
To: us...@open-mpi.org
Subject: [OMPI users] Fortran support on Windows Open-MPI

Hi all,

Can anyone tell me what the plans are for Fortran 90 support on Windows, 
with say the Intel compilers?  I need to get MUMPS built and running 
using Open-MPI, with Visual Studio and Intel 11.1.  I know Fortran isn't 
part of the regular CMake build for Windows.  If someone's working on 
this I'm happy to test or help out.

Damien
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] About the correct use of DIET_Finalize()

2010-05-07 Thread Yves Caniou
Dear All,

My parallel application ends when each process receives a message, sent in an 
async way (but my question would still arise if sync communication were used; see 
the reference to the manpage below). Senders call MPI_Finalize() after a call to 
MPI_Wait() and receivers call MPI_Finalize() after having received the message.

An execution shows that all processes finish as planned, but I obtain the 
following errors (repeated once per processor used):

With openMPI 1.4.2 compiled with gcc-4.5 on a Quad-Core AMD Opteron(tm) 
Processor 8356
*** The MPI_Finalize() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
Abort after MPI_FINALIZE completed successfully; not able to guarantee that 
all other processes were killed!

With openMPI 1.4.1 (debian package), Intel(R) Core(TM)2 Duo CPU 
P9600
*** An error occurred in MPI_Finalize
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
Abort after MPI_FINALIZE completed successfully; not able to guarantee that 
all other processes were killed!
-

I think it comes from the fact that, as mentioned in the man page of 
MPI_Finalize():
   For example, a successful return from a blocking communication operation
   or from MPI_Wait or MPI_Test means that the communication is completed by
   the user and the buffer can be reused, but does not guarantee that the
   local process has no more work to do.

Nonetheless, if MPI_Finalize() is called before the exchange of messages 
really takes place, the receivers wouldn't call their MPI_Finalize(), but would 
just abort, no?

Well, I'm perplexed. How do I know when my processes can make the call to 
MPI_Finalize() and obtain an execution without error messages?

Thank you for any help.

.Yves.

-- 
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
  * in Information Technology Center, The University of Tokyo,
2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
tel: +81-3-5841-0540
  * in National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
tel: +81-3-4212-2412 
http://graal.ens-lyon.fr/~ycaniou/



Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-07 Thread John Hearns
On 7 May 2010 03:17, Jeff Squyres  wrote:

>
> Indeed.  I have seen some people have HT enabled in the bios just so that 
> they can have the software option of turning them off via linux -- then you 
> can run with HT and without it and see what it does to your specific codes.

I may have missed this on the thread, but how do you do that?
The Nehalem systems I have came delivered with HT enabled in the BIOS
- I know it is not a real pain to reboot and configure, but it would
be a lot easier to leave it on and switch it off in software - also if you
wanted to do back-to-back testing of performance with/without HT.



Re: [OMPI users] Fortran support on Windows Open-MPI

2010-05-07 Thread Tim Prince

On 5/6/2010 9:07 PM, Trent Creekmore wrote:

Compaq Visual Fortan for Windows was out, but HP aquired Compaq. HP, later
deciding they did not want it, along with the Alpha processor techonology,
sold them to Intel. So now it's Intel Visual Fortran Compiler for Windows.
In addition, if you don't want that package, instead they do sell a plug-in
for Microsoft Visual Studio. There is also a HPC/Parallel enviroment too for
Visual Studio, but none of these are cheap.

I don't see why you can't include Open MPI libraries in that enviroment.

Trent


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Damien
Sent: Thursday, May 06, 2010 10:53 PM
To: us...@open-mpi.org
Subject: [OMPI users] Fortran support on Windows Open-MPI

Hi all,

Can anyone tell me what the plans are for Fortran 90 support on Windows,
with say the Intel compilers?  I need to get MUMPS built and running
using Open-MPI, with Visual Studio and Intel 11.1.  I know Fortran isn't
part of the regular CMake build for Windows.  If someone's working on
this I'm happy to test or help out.

Damien
___
   
I'm not certain whether the top-post is intended as a reply to the 
original post, but I feel I must protest efforts to add confusion.  
Looking at the instructions for building on Windows, it appears that 
several routes have been taken with reported success, not including 
commercial Fortran.  It seems it should not be a major task to include 
gfortran in the cygwin build.
HP never transferred ownership of Compaq Fortran, not that it's relevant 
to the discussion.
The most popular open source MPI for commercial Windows Fortran has been 
Argonne MPICH2, which offers a pre-built version compatible with Intel 
Fortran.   Intel also offers MPI, derived originally from Argonne 
MPICH2, for both Windows and linux.
I can't imagine OpenMPI libraries being added to the Microsoft HPC 
environment; maybe that's not exactly what the top poster meant.


--
Tim Prince



Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-07 Thread Tim Prince

On 5/6/2010 10:30 PM, John Hearns wrote:

On 7 May 2010 03:17, Jeff Squyres  wrote:

   

Indeed.  I have seen some people have HT enabled in the bios just so that they 
can have the software option of turning them off via linux -- then you can run 
with HT and without it and see what it does to your specific codes.
 

I may have missed this on the thread, but how do you do that?
The Nehalem systems I have came delivered with HT enabled in the BIOS
- I know it is not a real pain to reboot and configure, but it would
be a lot easir to leave it on and switch off in software - also if you
wanted to do back-to-back testing of performance with/without HT.

___
   
I don't know what Jeff meant by that, but we haven't seen a feasible way 
of disabling HT without rebooting and using the BIOS options.  It is 
feasible to place 1 MPI process or thread per core.  With careful 
affinity, performance when using 1 logical per core normally is 
practically the same as with HT disabled.



--
Tim Prince



[OMPI users] MPI_Bsend vs. MPI_Ibsend

2010-05-07 Thread Jovana Knezevic
Thank you very much, now I get it! :-)

Cheers,
Jovana


Re: [OMPI users] Fortran support on Windows Open-MPI

2010-05-07 Thread Shiqing Fan


Hi Damien,

Currently only Fortran 77 bindings are supported in Open MPI on Windows. 
You could set the Intel Fortran compiler via the CMAKE_Fortran_COMPILER 
variable in CMake (the full path to ifort.exe) and enable the 
OMPI_WANT_F77_BINDINGS option for Open MPI; then everything should be 
compiled. I recommend using the Open MPI trunk or the 1.5 branch.


I have successfully compiled and run the NPB benchmarks with the f77 bindings on 
Windows. If you want to compile f90 programs, this should also be 
possible, but it needs a little modification in the config file. Please 
let me know if I can help.



Regards,
Shiqing

On 2010-5-7 5:52 AM, Damien wrote:

Hi all,

Can anyone tell me what the plans are for Fortran 90 support on 
Windows, with say the Intel compilers?  I need to get MUMPS built and 
running using Open-MPI, with Visual Studio and Intel 11.1.  I know 
Fortran isn't part of the regular CMake build for Windows.  If 
someone's working on this I'm happy to test or help out.


Damien
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
--
Shiqing Fan  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)Fax.: +49 711 685 65832
Address:Allmandring 30   email: f...@hlrs.de
70569 Stuttgart



Re: [OMPI users] Fortran derived types

2010-05-07 Thread Cole, Derek E
I don't have any hard numbers for Fortran, but I do for C structures. Using C 
structures with some other C functionality (pointer functions, etc.) can 
yield up to a 3x slowdown at worst and, at best, a 15% slowdown. I have 
seen similar results in Fortran, but don't have the benchmark results for it. 
In either language, nothing beats raw data types for performance. Just my .02, 
I know some out there may not agree.

Derek


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Terry Frankcombe
Sent: Thursday, May 06, 2010 12:24 AM
To: Open MPI Users
Subject: Re: [OMPI users] Fortran derived types

Hi Derek

On Wed, 2010-05-05 at 13:05 -0400, Cole, Derek E wrote:
> In general, even in your serial fortran code, you're already taking a 
> performance hit using a derived type.

Do you have any numbers to back that up?

Ciao
Terry


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-07 Thread Jeff Squyres
On May 7, 2010, at 1:30 AM, John Hearns wrote:

> > Indeed.  I have seen some people have HT enabled in the bios just so that 
> > they can have the software option of turning them off via linux -- then you 
> > can run with HT and without it and see what it does to your specific codes.
> 
> I may have missed this on the thread, but how do you do that?
> The Nehalem systems I have came delivered with HT enabled in the BIOS
> - I know it is not a real pain to reboot and configure, but it would
> be a lot easir to leave it on and switch off in software - also if you
> wanted to do back-to-back testing of performance with/without HT.

What we have done is disable one of the 2 hardware threads as follows:

- download and install hwloc (it's very small/simple to install).  1.0rc5 is 
the current release candidate, but the final release is *very* near; it's very stable.
- run lstopo and look at the physical numbering of the hardware threads in each 
core.  
- here's an example output from v1.0rc5 lstopo (this is not from a Nehalem 
machine, but the same things apply):

-
# lstopo
Machine (3945MB)
  Socket #0
L2 #0 (2048KB) + L1 #0 (16KB) + Core #0
  PU #0 (phys=0)
  PU #1 (phys=4)
L2 #1 (2048KB) + L1 #1 (16KB) + Core #1
  PU #2 (phys=2)
  PU #3 (phys=6)
  Socket #1
L2 #2 (2048KB) + L1 #2 (16KB) + Core #2
  PU #4 (phys=1)
  PU #5 (phys=5)
L2 #3 (2048KB) + L1 #3 (16KB) + Core #3
  PU #6 (phys=3)
  PU #7 (phys=7)
# 
-

- you want to disable the 2nd PU (processing unit) -- i.e., hardware thread -- 
on each core.  
- Do this by echoing 0 to /sys/devices/system/cpu/cpuX/online, where X is each 
*phys* value.  
- For example:

-
# echo 0 > /sys/devices/system/cpu/cpu4/online 
# echo 0 > /sys/devices/system/cpu/cpu5/online 
# echo 0 > /sys/devices/system/cpu/cpu6/online 
# echo 0 > /sys/devices/system/cpu/cpu7/online 
# lstopo 
Machine (3945MB)
  Socket #0
L2 #0 (2048KB) + L1 #0 (16KB) + Core #0 + PU #0 (phys=0)
L2 #1 (2048KB) + L1 #1 (16KB) + Core #1 + PU #1 (phys=2)
  Socket #1
L2 #2 (2048KB) + L1 #2 (16KB) + Core #2 + PU #2 (phys=1)
L2 #3 (2048KB) + L1 #3 (16KB) + Core #3 + PU #3 (phys=3)
#
-

Granted; this doesn't actually disable hyperthreading.  But it does disable 
Linux from using the 2nd hardware thread on each core, which is pretty much the 
same thing for the purposes of this conversation.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Fortran support on Windows Open-MPI

2010-05-07 Thread Damien Hocking
Thanks Shiqing.  I'll try that.  I'm not sure which bindings MUMPS uses; 
I'll post back if I need F90.


My apologies for not asking a clearer question: when I said Fortran 90 
support on Windows, I meant Open MPI, not compilers.


Damien

On 07/05/2010 3:09 AM, Shiqing Fan wrote:


Hi Damien,

Currently only Fortran 77 bindings are supported in Open MPI on 
Windows. You could set the Intel Fortran compiler with 
CMAKE_Fortran_COMPILER variable in CMake (the full path to ifort.exe), 
and enable OMPI_WANT_F77_BINDINGS option for Open MPI, then everything 
should be compiled. I recommend to use Open MPI trunk or 1.5 branch 
version.


I have successfully compiled/ran NPB benchmark with f77 bindings on 
Windows. If you want to compile f90 programs, this should also be 
possible, but it needs a little modification in the config file. 
Please let me know if I can help.



Regards,
Shiqing

On 2010-5-7 5:52 AM, Damien wrote:

Hi all,

Can anyone tell me what the plans are for Fortran 90 support on 
Windows, with say the Intel compilers?  I need to get MUMPS built and 
running using Open-MPI, with Visual Studio and Intel 11.1.  I know 
Fortran isn't part of the regular CMake build for Windows.  If 
someone's working on this I'm happy to test or help out.


Damien
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] opal_mutex_lock(): Resource deadlock avoided

2010-05-07 Thread Jeff Squyres
Could you send a short reproducer of the problem? 

I have not heard of this, but we have not extensively tested the entire OMPI 
code base with threading support enabled.


On May 6, 2010, at 7:03 AM, Ake Sandgren wrote:

> Hi!
> 
> We have a code that trips on this fairly often. I've seen cases where it
> works but mostly it gets stuck here.
> 
> The actual mpi call is call mpi_file_open(...)
> 
> I'm currently just wondering if there has been other reports on/anyone
> have seen deadlock in mpi-io parts of the code or if this most likely
> caused by our setup.
> 
> openmpi version is 1.4.2 (fails with 1.3.3 too)
> Filesystem used is GPFS
> 
> openmpi built with mpi-threads but without progress-threads
> 
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] About the correct use of DIET_Finalize()

2010-05-07 Thread Jeff Squyres
The error message is telling you that you have invoked MPI_Finalize more than 
once in a single process.  The issue is that you can't do that -- you should 
only invoke MPI_Finalize exactly once in a given process.  It's not an issue of 
ongoing communications when you invoke MPI_Finalize.

It's ok for different MPI processes to invoke MPI_Finalize at different times; 
Open MPI should figure that out without problem.

FWIW, you should also be able to invoke the MPI_Finalized function to see if 
MPI_Finalize has already been invoked.
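
A minimal sketch of that guard (the helper name shutdown_mpi_once is made up; 
MPI_Finalized itself is the standard call):

#include <mpi.h>

// Call MPI_Finalize at most once per process, regardless of how many
// shutdown paths reach this point.  Illustrative sketch only.
void shutdown_mpi_once()
{
    int finalized = 0;
    MPI_Finalized(&finalized);   // nonzero if MPI_Finalize was already called
    if (!finalized) {
        MPI_Finalize();
    }
}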


On May 7, 2010, at 12:54 AM, Yves Caniou wrote:

> Dear All,
> 
> My parallel application ends when each process receives a msg, done in an
> async way (but my question still arise if sync comm were used, see the ref to 
> the manpage). Senders call MPI_Finalize() after a call to MPI_Wait() and
> receivers call MPI_Finalize() after having received the message.
> 
> An execution gives me that all proc finish as planned but I obtain the
> following errors (times the number of processor used)
> 
> With openMPI 1.4.2 compiled with gcc-4.5 on a Quad-Core AMD 
> Opteron(tm)
> Processor 8356
> *** The MPI_Finalize() function was called after MPI_FINALIZE was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> Abort after MPI_FINALIZE completed successfully; not able to guarantee that
> all other processes were killed!
> 
> With openMPI 1.4.1 (debian package), Intel(R) Core(TM)2 Duo CPU 
> P9600
> *** An error occurred in MPI_Finalize
> *** after MPI was finalized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> Abort after MPI_FINALIZE completed successfully; not able to guarantee that
> all other processes were killed!
> -
> 
> I think it comes from the fact that, as mentionned in the man of
> MPI_Finalize():
> For  example,  a successful return from a blocking communication 
> opera-
>tion or from MPI_Wait or MPI_Test means that the communication is  com-
>pleted by the user and the buffer can be reused, but does not guarantee
>that the local process has no more work to do.
> 
> Nonetheless, if MPI_Finaliaze() is called before that the exchange of messages
> really takes place, receivers wouldn't call their MPI_Finalize(), but would
> just be issuing an abort thing, no?
> 
> Well, I'm perplex. When do I know when my proc can make the call to
> MPI_Finalize() and obtain an execution without error messages?
> 
> Thank you for any help.
> 
> .Yves.
> 
> --
> Yves Caniou
> Associate Professor at Université Lyon 1,
> Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
> Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
>   * in Information Technology Center, The University of Tokyo,
> 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
> tel: +81-3-5841-0540
>   * in National Institute of Informatics
> 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
> tel: +81-3-4212-2412
> http://graal.ens-lyon.fr/~ycaniou/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Fortran derived types

2010-05-07 Thread Richard Treumann

If someone is deciding whether to use complex datatypes or stick with
contiguous ones, they need to look at their own situation.  There is no
simple answer. The only thing that is fully predictable is that an MPI
operation, measured in isolation, will be no slower with contiguous data
than with discontiguous.

The question for a particular application is:

  "In the application context, how does the performance with this
discontiguous datatype compare with the performance I get with other
solutions?"

The other solutions include anything your application must do to allow it
to use contiguous datatypes. (most often, packing & unpacking)

The water gets even more muddy when you consider that each MPI
implementation has differences in how it processes discontiguous data, and
even a single MPI (like Open MPI) could have different underlying trade-offs,
depending on the capabilities of the underlying hardware.

It should not matter whether the program is written in C or Fortran. The
cost of processing a discontiguous datatype is tied to the layout of the
data in memory and both languages can produce equally simple or complex
memory layouts.
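
To make the trade-off concrete, here is a minimal C++ sketch showing the two 
alternatives side by side for sending one column of a row-major matrix 
(illustrative only; a real code would pick one option and post the matching 
receive):

#include <mpi.h>
#include <vector>

// Send column 'col' of an nrows x ncols row-major matrix of doubles.
void send_column(double *matrix, int nrows, int ncols,
                 int col, int dest, MPI_Comm comm)
{
    // Option 1: describe the discontiguous layout with a derived datatype
    // (one double per row, stride of ncols elements).
    MPI_Datatype column_type;
    MPI_Type_vector(nrows, 1, ncols, MPI_DOUBLE, &column_type);
    MPI_Type_commit(&column_type);
    MPI_Send(&matrix[col], 1, column_type, dest, 0, comm);
    MPI_Type_free(&column_type);

    // Option 2: pack into a contiguous buffer and send raw doubles.
    std::vector<double> packed(nrows);
    for (int i = 0; i < nrows; ++i)
        packed[i] = matrix[i * ncols + col];
    MPI_Send(packed.data(), nrows, MPI_DOUBLE, dest, 1, comm);
}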


Dick Treumann  -  MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363



From: "Cole, Derek E"
To: Open MPI Users
Date: 05/07/2010 08:21 AM
Subject: Re: [OMPI users] Fortran derived types
Sent by: users-boun...@open-mpi.org




I don't have any hard numbers for fortran, but I do for C structures. Using
C structures with some other C functionality (pointer functions, etc etc)
can yield up to a 3x slowdown at worst, and at best, had a 15% slowdown. I
have seen similar results in fortran, but don't have the benchmark results
for it. In either language, nothing beats raw data types for performance.
Just my .02, I know some out there may not agree.

Derek


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Terry Frankcombe
Sent: Thursday, May 06, 2010 12:24 AM
To: Open MPI Users
Subject: Re: [OMPI users] Fortran derived types

Hi Derek

On Wed, 2010-05-05 at 13:05 -0400, Cole, Derek E wrote:
> In general, even in your serial fortran code, you're already taking a
> performance hit using a derived type.

Do you have any numbers to back that up?

Ciao
Terry


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] MPIError:MPI_Recv: MPI_ERR_TRUNCATE:

2010-05-07 Thread Jeff Squyres
I'm afraid I don't know enough about Boost to know.

What the specific error message means is that you have posted an MPI_Recv that 
was too small to handle an incoming message.  It is permissible in MPI to post 
a receive that is *larger* than the corresponding incoming message, but it is 
defined as an error to post a receive with a buffer that is too small.
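
In plain MPI terms, a minimal sketch of that rule (MAX_ELEMS is an assumed 
application-specific upper bound, not something taken from the original post):

#include <mpi.h>
#include <vector>

// Receive a message whose exact length is not known in advance: post a
// buffer at least as large as the biggest message the sender can produce,
// then ask MPI_Get_count how much actually arrived.
void recv_variable_length(int source, int tag, MPI_Comm comm)
{
    const int MAX_ELEMS = 1 << 20;          // assumed worst-case message size
    std::vector<double> buf(MAX_ELEMS);     // buffer may be larger than the message
    MPI_Status status;
    MPI_Recv(buf.data(), MAX_ELEMS, MPI_DOUBLE, source, tag, comm, &status);

    int received = 0;
    MPI_Get_count(&status, MPI_DOUBLE, &received);  // elements actually received
    // ... use buf[0 .. received-1] ...
}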


On May 3, 2010, at 6:18 PM, Pooja Varshneya wrote:

> Hi All,
> 
> I have written a program where MPI master sends and receives large 
> amount of data i.e sending from 1KB to 1MB of data.
> The amount of data to be sent with each call is different
> 
> The program runs well when running with 5 slaves, but when i try to 
> run the same program with 9 slaves, it gives me 
> MPI_Recv:MPI_ERR_TRUNCATE: message truncated error.
> 
> I am using boost MPI and boost serialization libraries for sending data.
> I understand that the internal buffer on the master are overrun in 
> this case. Is there a way i can increase the buffer sizes ?
> 
> Here is the output:
> -bash-3.2$ mpirun -np 9 --hostfile hostfile2 --rankfile rankfile2 
> $BENCHMARKS_ROOT/bin/boost_binomial_LB 10 5000_steps.txt 
> 5000_homo_bytes.txt
> Master: Starting Binomial Option Price calculations for American call 
> option
> Master: Current stock price: 110
> Master: Strike price: 100
> Master: Risk-free rate: 1.05
> Master: Volatility (annualized): 0.15
> Master: Time (years): 1
> Master: Number of calculations: 10
> 
> Slave 1:Going to Received Skeleton: 1
> Slave 1:Received Skeleton: 1
> Slave 1:Gpoing to Received Payload: 1
> Slave 1:Received Payload: 1
> Master: Sent initial message
> Master: Sent initial message
> Master: Sent initial message
> Slave 2:Going to Received Skeleton: 2
> Slave 2:Received Skeleton: 2
> Slave 2:Gpoing to Received Payload: 2
> Slave 2:Received Payload: 2
> Slave 3:Going to Received Skeleton: 3
> Slave 3:Received Skeleton: 3
> Slave 3:Gpoing to Received Payload: 3
> Slave 3:Received Payload: 3
> Slave 4:Going to Received Skeleton: 4
> Slave 4:Received Skeleton: 4
> Slave 4:Gpoing to Received Payload: 4
> Slave 1: Sent Response Skeleton: 1
> Master: Sent initial message
> Slave 4:Received Payload: 4
> Slave 5:Going to Received Skeleton: 5
> terminate called after throwing an instance of 
> 'boost
> ::exception_detail
> ::clone_impl
>  >'
>what():  MPI_Recv: MPI_ERR_TRUNCATE: message truncated
> [rh5x64-u12:26987] *** Process received signal ***
> [rh5x64-u12:26987] Signal: Aborted (6)
> [rh5x64-u12:26987] Signal code:  (-6)
> [rh5x64-u12:26987] [ 0] /lib64/libpthread.so.0 [0x3ba680e7c0]
> [rh5x64-u12:26987] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3ba5c30265]
> [rh5x64-u12:26987] [ 2] /lib64/libc.so.6(abort+0x110) [0x3ba5c31d10]
> [rh5x64-u12:26987] [ 3] /usr/lib64/libstdc++.so.
> 6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114) [0x3bb7abec44]
> [rh5x64-u12:26987] [ 4] /usr/lib64/libstdc++.so.6 [0x3bb7abcdb6]
> [rh5x64-u12:26987] [ 5] /usr/lib64/libstdc++.so.6 [0x3bb7abcde3]
> [rh5x64-u12:26987] [ 6] /usr/lib64/libstdc++.so.6 [0x3bb7abceca]
> [rh5x64-u12:26987] [ 7] /userdata/testing/benchmark_binaries/bin/
> boost_binomial_LB(_ZN5boost15throw_exceptionINS_3mpi9exceptionEEEvRKT_
> +0x172) [0x4216a2]
> [rh5x64-u12:26987] [ 8] /usr/local/lib/libboost_mpi.so.
> 1.42.0
> (_ZN5boost3mpi6detail19packed_archive_recvEP19ompi_communicator_tiiRNS0_15packed_iarchiveER20ompi_status_public_t
> +0x16b) [0x2b0317faa6b3]
> [rh5x64-u12:26987] [ 9] /usr/local/lib/libboost_mpi.so.
> 1.42.0
> (_ZNK5boost3mpi12communicator4recvINS0_15packed_iarchiveEEENS0_6statusEiiRT_
> +0x40) [0x2b0317f9c72a]
> [rh5x64-u12:26987] [10] /usr/local/lib/libboost_mpi.so.
> 1.42.0
> (_ZNK5boost3mpi12communicator4recvINS0_24packed_skeleton_iarchiveEEENS0_6statusEiiRT_
> +0x38) [0x2b0317f9c76c]
> [rh5x64-u12:26987] [11] /userdata/testing/benchmark_binaries/bin/
> boost_binomial_LB
> (_ZNK5boost3mpi12communicator4recvI31Binomial_Option_Pricing_RequestEENS0_6statusEiiRKNS0_14skeleton_proxyIT_EE
> +0x121) [0x4258c1]
> [rh5x64-u12:26987] [12] /userdata/testing/benchmark_binaries/bin/
> boost_binomial_LB(main+0x409) [0x41d369]
> [rh5x64-u12:26987] [13] /lib64/libc.so.6(__libc_start_main+0xf4) 
> [0x3ba5c1d994]
> [rh5x64-u12:26987] [14] /userdata/testing/benchmark_binaries/bin/
> boost_binomial_LB(__gxx_personality_v0+0x399) [0x419e69]
> [rh5x64-u12:26987] *** End of error message ***
> [rh5x64-u11.zlab.local][[47840,1],0][btl_tcp_frag.c:
> 216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: 
> Connection reset by peer (104)
> --
> mpirun noticed that process rank 5 with PID 26987 on node 172.10.0.112 
> exited on signal 6 (Aborted).
> --
> 
> Here is the program code:
> 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> #incl

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-07 Thread Gus Correa

Hi Jeff, John, Tim

I had asked the same question that John and Tim did,
but it got lost 100 emails ago on this thread.
Here I've been disabling/enabling HT on the BIOS,
as per Douglas Guptill's suggestion.

Jeff:  Thank you very much for the wizardry details.
Very helpful for the list subscriber community, I'd guess.
Another reason to install hwloc.

Cheers,
Gus Correa


Jeff Squyres wrote:

On May 7, 2010, at 1:30 AM, John Hearns wrote:


Indeed.  I have seen some people have HT enabled in the bios just so that they 
can have the software option of turning them off via linux -- then you can run 
with HT and without it and see what it does to your specific codes.

I may have missed this on the thread, but how do you do that?
The Nehalem systems I have came delivered with HT enabled in the BIOS
- I know it is not a real pain to reboot and configure, but it would
be a lot easir to leave it on and switch off in software - also if you
wanted to do back-to-back testing of performance with/without HT.


What we have done is disable one of the 2 hardware threads as follows:

- download and install hwloc (it's very small/simple to install).  
1.0rc5 is the current release, but it's *very* near release; 
it's very stable.
- run lstopo and look at the physical numbering of the 
hardware threads in each core.  
- here's an example output from v1.0rc5 lstopo 
(this is not from a Nehalem machine, but the same things apply):


-
# lstopo
Machine (3945MB)
  Socket #0
L2 #0 (2048KB) + L1 #0 (16KB) + Core #0
  PU #0 (phys=0)
  PU #1 (phys=4)
L2 #1 (2048KB) + L1 #1 (16KB) + Core #1
  PU #2 (phys=2)
  PU #3 (phys=6)
  Socket #1
L2 #2 (2048KB) + L1 #2 (16KB) + Core #2
  PU #4 (phys=1)
  PU #5 (phys=5)
L2 #3 (2048KB) + L1 #3 (16KB) + Core #3
  PU #6 (phys=3)
  PU #7 (phys=7)
# 
-


- you want to disable the 2nd PU (processing unit) -- i.e., hardware thread -- on each core.  
- Do this by echoing 0 to /sys/devices/system/cpu/cpuX/online, where X is each *phys* value.  
- For example:


-
# echo 0 > /sys/devices/system/cpu/cpu4/online 
# echo 0 > /sys/devices/system/cpu/cpu5/online 
# echo 0 > /sys/devices/system/cpu/cpu6/online 
# echo 0 > /sys/devices/system/cpu/cpu7/online 
# lstopo 
Machine (3945MB)

  Socket #0
L2 #0 (2048KB) + L1 #0 (16KB) + Core #0 + PU #0 (phys=0)
L2 #1 (2048KB) + L1 #1 (16KB) + Core #1 + PU #1 (phys=2)
  Socket #1
L2 #2 (2048KB) + L1 #2 (16KB) + Core #2 + PU #2 (phys=1)
L2 #3 (2048KB) + L1 #3 (16KB) + Core #3 + PU #3 (phys=3)
#
-

Granted; this doesn't actually disable hyperthreading.  But it does disable Linux from using the 2nd hardware thread on each core, which is pretty much the same thing for the purposes of this conversation.





[OMPI users] Dynamic libraries in OpenMPI

2010-05-07 Thread Miguel Ángel Vázquez
Dear all,

I am trying to run a C++ program which uses dynamic libraries under mpi.

The compilation command looks like:

 mpiCC `pkg-config --cflags itpp`  -o montecarlo  montecarlo.cpp `pkg-config
--libs itpp`

And it works if I executed it in one machine:

mpirun -np 2 -H localhost montecarlo

I tested this both in the "master node" and in the "compute nodes" and it
works. However, when I try to run it with two different machines:

mpirun -np 2 -H localhost,hpcnode1 montecarlo

The program claims that it can't find the shared libraries:

montecarlo: error while loading shared libraries: libitpp.so.6: cannot open
shared object file: No such file or directory

The LD_LIBRARY_PATH is set properly on every machine; any idea where the
problem is? I have attached the config.log and the output of ompi_info
--all.

Thank you in advance,

Miguel


mavazquez.tar.bz2
Description: BZip2 compressed data


Re: [OMPI users] Fortran support on Windows Open-MPI

2010-05-07 Thread Damien

Hi,

I tried the 1.5a1r23092 snapshot and I used CMAKE 2.6.4 and 2.8.1.  In 
the CMake GUI, I checked the OMPI_WANT_F77_BINDINGS option, and added a 
FilePath for CMAKE_Fortran_COMPILER of C:/Program Files 
(x86)/Intel/Compiler/11.1/065/bin/ia32/ifort.exe.  When I re-run the 
Configure, CMake wipes the CMAKE_Fortran_COMPILER variable and complains 
about a missing Fortran compiler.  Any suggestions?


Damien

On 07/05/2010 3:09 AM, Shiqing Fan wrote:


Hi Damien,

Currently only Fortran 77 bindings are supported in Open MPI on 
Windows. You could set the Intel Fortran compiler with 
CMAKE_Fortran_COMPILER variable in CMake (the full path to ifort.exe), 
and enable OMPI_WANT_F77_BINDINGS option for Open MPI, then everything 
should be compiled. I recommend to use Open MPI trunk or 1.5 branch 
version.


I have successfully compiled/ran NPB benchmark with f77 bindings on 
Windows. If you want to compile f90 programs, this should also be 
possible, but it needs a little modification in the config file. 
Please let me know if I can help.



Regards,
Shiqing

On 2010-5-7 5:52 AM, Damien wrote:

Hi all,

Can anyone tell me what the plans are for Fortran 90 support on 
Windows, with say the Intel compilers?  I need to get MUMPS built and 
running using Open-MPI, with Visual Studio and Intel 11.1.  I know 
Fortran isn't part of the regular CMake build for Windows.  If 
someone's working on this I'm happy to test or help out.


Damien
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






[OMPI users] open mpi installation error

2010-05-07 Thread Bharath.K. Chakravarthi
Hello there...
As I've been advised to use Open MPI rather than LAM/MPI, I've tried to
install it.
I could not find any installation guide online,
although I have tried to install it using only the LAM/MPI manual,
but I rushed into errors as follows:

[root@localhost openmpi]# ./configure
..
..

*** C++ compiler and preprocessor
checking for g++... no
checking for c++... no
checking for gpp... no
checking for aCC... no
checking for CC... no
checking for cxx... no
checking for cc++... no
checking for cl.exe... no
checking for FCC... no
checking for KCC... no
checking for RCC... no
checking for xlC_r... no
checking for xlC... no
checking whether we are using the GNU C++ compiler... no
checking whether g++ accepts -g... no
checking dependency style of g++... none
checking how to run the C++ preprocessor... /lib/cpp
configure: error: in `/home/bioinfo/Documents/gromacs/openmpi':
configure: error: C++ preprocessor "/lib/cpp" fails sanity check
See `config.log' for more details.
[root@localhost openmpi]# ./configure --help

Can anyone tell me how I can install it, or provide me with an installation
guide?
Thanks for the help.. :)

-- 
Bharath.K.Chakravarthi
Ph:9535629260


Re: [OMPI users] open mpi installation error

2010-05-07 Thread Jeff Squyres
As I advised you on the LAM/MPI list, please see:

http://www.open-mpi.org/community/help/

:-)


On May 7, 2010, at 1:08 PM, Bharath.K. Chakravarthi wrote:

> 
> hello there...
> as i've been advised to use open mpi rather than lam mpi i've tried to 
> install it.
> i could not got any installation guide online 
> although i have tried to install using lam/mpi manual only 
> but i rushed into errors as follows
> 
> [root@localhost openmpi]# ./configure
> ..
> ..
> 
> *** C++ compiler and preprocessor
> checking for g++... no
> checking for c++... no
> checking for gpp... no
> checking for aCC... no
> checking for CC... no
> checking for cxx... no
> checking for cc++... no
> checking for cl.exe... no
> checking for FCC... no
> checking for KCC... no
> checking for RCC... no
> checking for xlC_r... no
> checking for xlC... no
> checking whether we are using the GNU C++ compiler... no
> checking whether g++ accepts -g... no
> checking dependency style of g++... none
> checking how to run the C++ preprocessor... /lib/cpp
> configure: error: in `/home/bioinfo/Documents/gromacs/openmpi':
> configure: error: C++ preprocessor "/lib/cpp" fails sanity check
> See `config.log' for more details.
> [root@localhost openmpi]# ./configure --help
> 
> can any one tell me how can i install it or provide me any installation guide
> thanks for help.. :)
> 
> -- 
> Bharath.K.Chakravarthi
> Ph:9535629260
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Dynamic libraries in OpenMPI

2010-05-07 Thread Prentice Bisbal


Miguel Ángel Vázquez wrote:
> Dear all,
> 
> I am trying to run a C++ program which uses dynamic libraries under mpi.
> 
> The compilation command looks like:
> 
>  mpiCC `pkg-config --cflags itpp`  -o montecarlo  montecarlo.cpp
> `pkg-config --libs itpp`
> 
> And it works if I executed it in one machine:
> 
> mpirun -np 2 -H localhost montecarlo
> 
> I tested this both in the "master node" and in the "compute nodes" and
> it works. However, when I try to run it with two different machines:
> 
> mpirun -np 2 -H localhost,hpcnode1 montecarlo
> 
> The program claims that it can't find the shared libraries:
> 
> montecarlo: error while loading shared libraries: libitpp.so.6: cannot
> open shared object file: No such file or directory
> 
> The LD_LIBRARY_PATH is set properly at every machine, any idea where the
> problem is? I attached you the config.log and the result of the omp-info
> --all
> 
> Thank you in advance,
> 
> Miguel

Miguel,

Shells behave differently depending on whether it is an interactive
login shell or a non-interactive shell. For example, the bash shell uses
.bash_profile in one case, but .bashrc in the other. Check the documentation
for your shell and see what files it uses in each case, and make sure
the non-login config file has the necessary settings for your MPI jobs.
It sounds like your login shell environment is okay, but your non-login
environment isn't set up correctly. This is a common problem.

I use bash, and to keep it simple, my .bash_profile is just a symbolic
link to .bashrc. That way, both shell types have the same environment.
This isn't always a good idea, but in my case it's fine.

-- 
Prentice


Re: [OMPI users] open mpi installation error

2010-05-07 Thread Prentice Bisbal
You rushed into errors because you rushed through the installation. Open
MPI has very good, very thorough documentation. The FAQ has a whole
section that deals with building Open MPI:

http://www.open-mpi.org/faq/?category=building

--
Prentice


Jeff Squyres wrote:
> As I advised you on the LAM/MPI list, please see:
> 
> http://www.open-mpi.org/community/help/
> 
> :-)
> 
> 
> On May 7, 2010, at 1:08 PM, Bharath.K. Chakravarthi wrote:
> 
>> hello there...
>> as i've been advised to use open mpi rather than lam mpi i've tried to 
>> install it.
>> i could not got any installation guide online 
>> although i have tried to install using lam/mpi manual only 
>> but i rushed into errors as follows
>>
>> [root@localhost openmpi]# ./configure
>> ..
>> ..
>>
>> *** C++ compiler and preprocessor
>> checking for g++... no
>> checking for c++... no
>> checking for gpp... no
>> checking for aCC... no
>> checking for CC... no
>> checking for cxx... no
>> checking for cc++... no
>> checking for cl.exe... no
>> checking for FCC... no
>> checking for KCC... no
>> checking for RCC... no
>> checking for xlC_r... no
>> checking for xlC... no
>> checking whether we are using the GNU C++ compiler... no
>> checking whether g++ accepts -g... no
>> checking dependency style of g++... none
>> checking how to run the C++ preprocessor... /lib/cpp
>> configure: error: in `/home/bioinfo/Documents/gromacs/openmpi':
>> configure: error: C++ preprocessor "/lib/cpp" fails sanity check
>> See `config.log' for more details.
>> [root@localhost openmpi]# ./configure --help
>>
>> can any one tell me how can i install it or provide me any installation guide
>> thanks for help.. :)
>>



[OMPI users] communicate C++ STL strucutures ??

2010-05-07 Thread Cristobal Navarro
Hello,

my question is the following.

Is it possible to send and receive C++ objects or STL structures (for
example, send map myMap) through Open MPI send and receive functions?
At first glance I thought it was possible, but after reading some docs, I'm
not sure.
I don't have my source code at that stage for testing yet.


Cristobal


Re: [OMPI users] communicate C++ STL strucutures ??

2010-05-07 Thread Fernando Lemos
On Fri, May 7, 2010 at 5:33 PM, Cristobal Navarro  wrote:
> Hello,
>
> my question is the following.
>
> is it possible to send and receive C++ objects or STL structures (for
> example, send map myMap) through openMPI SEND and RECEIVE functions?
> at first glance i thought it was possible, but after reading some doc, im
> not sure.
> i dont have my source code at that stage for testing yet

Not normally; you have to serialize it before sending and deserialize
it after receiving. You could use Boost.MPI and Boost.Serialization too;
that would probably be the best way to go.
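
A minimal sketch of what that looks like with Boost.MPI (the map contents are 
made up; link against the boost_mpi and boost_serialization libraries):

#include <boost/mpi.hpp>
#include <boost/serialization/map.hpp>     // teaches Boost how to serialize std::map
#include <boost/serialization/string.hpp>  // ... and std::string
#include <map>
#include <string>

namespace mpi = boost::mpi;

int main(int argc, char *argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    std::map<std::string, int> my_map;
    if (world.rank() == 0) {
        my_map["answer"] = 42;
        world.send(1, 0, my_map);   // serialized behind the scenes
    } else if (world.rank() == 1) {
        world.recv(0, 0, my_map);   // deserialized on arrival
    }
    return 0;
}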


[OMPI users] MPI_FILE_SET_ERRHANDLER returns an error with MPI_FILE_NULL

2010-05-07 Thread Secretan Yves
Hello,

According to my understanding of the documentation, it should be possible to 
set the default error handler for files with MPI_FILE_SET_ERRHANDLER. However, 
the following small Fortran 77 program fails: MPI_FILE_SET_ERRHANDLER returns an 
error.

=
  PROGRAM H2D2_MAIN

  INCLUDE 'mpif.h'

  EXTERNAL HNDLR
C

  CALL MPI_INIT(I_ERR)

  I_HDLR = 0
  CALL MPI_FILE_CREATE_ERRHANDLER(HNDLR1, I_HDLR, I_ERR)
  WRITE(*,*) 'MPI_FILE_CREATE_ERRHANDLER: ', I_ERR
  CALL MPI_FILE_SET_ERRHANDLER   (MPI_FILE_NULL, I_HDLR, I_ERR)
  WRITE(*,*) 'MPI_FILE_SET_ERRHANDLER: ', I_ERR

  END

  SUBROUTINE HNDLR(I_CNTX, I_ERR)
  WRITE(*,*) 'In HNDLR: MPI Error detected'
  RETURN
  END



Did I miss something obvious?
Regards

Yves Secretan
Professeur
yves.secre...@ete.inrs.ca

Before printing, think about the environment



Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Jeff Squyres
Greetings Fred.

After looking at this for more hours than I'd care to admit, I'm wondering if 
this is a bug in gfortran.  I can replicate your problem with a simple program 
on gfortran 4.1 on RHEL 5.4, but it doesn't happen with the Intel Fortran 
compiler (11.1) or the PGI fortran compiler (10.0).

One of the issues appears to be how to determine how Fortran 2D CHARACTER arrays 
are terminated.  I can't figure out how gfortran is terminating them -- but Intel 
and PGI both terminate them by having an empty string at the end.
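
For illustration only (this is not the actual Open MPI code), the termination 
check being described amounts to something like the following, assuming each 
entry in the flattened 2D array is 'len' characters long:

// True if one Fortran CHARACTER entry of length 'len' is entirely blank
// (or NUL), which is how the Intel and PGI compilers appear to mark the
// end of the argv array.
static bool entry_is_blank(const char *s, int len)
{
    for (int i = 0; i < len; ++i)
        if (s[i] != ' ' && s[i] != '\0')
            return false;
    return true;
}

// Count argv entries in a flattened table, stopping at the first blank one.
static int count_entries(const char *table, int len, int max_entries)
{
    int n = 0;
    while (n < max_entries && !entry_is_blank(table + n * len, len))
        ++n;
    return n;
}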

Are you using gfortran 4.1, perchance?




On May 5, 2010, at 2:08 PM, Fred Marquis wrote:

> Hi,
> 
>   I am using mpi_comm_spawn_multiple to spawn multiple commands with argument 
> lists. I am trying to do this in fortran (77) using version openmpi-1.4.1 and 
> the ifort compiler v9.0. The operating system is SuSE Linux 10.1 (x86-64).
> 
> I have put together a simple controlling example program (test_pbload.F) and 
> an example slave program (spray.F) to try and explain my problem.
> 
> In the controlling program mpi_comm_spawn_multiple is used to set 2 copies of 
> the slave running. The first is started with the argument list "1 2 3 4" and 
> the second with "5 6 7 8".
> 
> The slaves are started OK and the slaves print out the argument lists and 
> exit. In addition the slaves print out their rank numbers so I can see which 
> argument list belongs to which slave.
> 
> What I am finding is that the argument lists are not being sent to the slaves 
> correctly, indeed both slaves seem to be getting both arguments lists !!!
> 
> To compile and run the programs I follow the steps below.
> 
> Controlling program "test_pbload.F"
> 
>mpif77 -o test_pbload test_pbload.F
> 
> Slave program "spray.F"
> 
>mpif77 -o spray spray.F
> 
> Run the controller
> 
>mpirun -np 1 test_pbload
> 
> 
> 
> 
> The output of which is from the first slave:
> 
>  nsize, mytid: iargs   2   0 :   2
>  spray:   0 1:1 2 3 4   < FIRST ARGUMENT  
>  spray:   0 2:4 5 6 7   < SECOND ARGUMENT
> 
>  and the second slave:
> 
>  nsize, mytid: iargs   2   1 :   2
>  spray:   1 1:1 2 3 4   < FIRST ARGUMENT
>  spray:   1 2:4 5 6 7   < SECOND ARGUMENT 
> 
> In each case the arguments (2 in both cases) are the same.
> 
> I have written a C version of the controlling program and everthing works as 
> expected so I presume that I have either got the specification of the 
> argument list wrong or I have discovered an error/bug. At the moment I 
> working on the former -- but am at a loss to see what is wrong !!
> 
> Any help, pointers etc really appreciated.
> 
> 
> Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
> 
>   program main
> c
>   implicit none
> #include "mpif.h"
> 
>   integer error
>   integer intercomm
>   CHARACTER*25 commands(2), argvs(2, 2)
>   integer nprocs(2),info(2),ncpus
> c
>   call mpi_init(error)
> c
>ncpus = 2
> c
>commands(1) = ' ./spray '
>nprocs(1) = 1
>info(1) = MPI_INFO_NULL
>argvs(1, 1) = ' 1 2 3 4 '
>argvs(1, 2) = ' '
> c
>commands(2) = ' ./spray '
>nprocs(2) = 1
>info(2) = MPI_INFO_NULL
>argvs(2, 1) = ' 4 5 6 7 '
>argvs(2, 2) = ' '
> c
>   call mpi_comm_spawn_multiple( ncpus,
>  1  commands, argvs, nprocs, info,
>  2  0, MPI_COMM_WORLD, intercomm,
>  3  MPI_ERRCODES_IGNORE, error )
> c
>   call mpi_finalize(error)
> c
>   end
> 
> Slave program (started by the controlling program) spray.F
> 
>   program main
>   integer error
>   integer pid
>   character*20 line(100)
>   call mpi_init(error)
> c
>   CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NSIZE,error)
>   CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYTID,error)
> c
>   iargs=iargc()
>   write(*,*) 'nsize, mytid: iargs', nsize, mytid, ":", iargs
> c
>   if( iargs.gt.0 ) then
>  do i = 1, iargs
> call getarg(i,line(i))
> write(*,'(1x,a,i3,20(i2,1h:,a))')
>  1   'spray: ',mytid,i,line(i)
>  enddo
>   endif
> c
>   call mpi_finalize(error)
> c
>   end
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] communicate C++ STL strucutures ??

2010-05-07 Thread Cristobal Navarro
Thanks for the answer,

ill will look at them when i get to this point, i've heard good comments
about boost.
Cristobal




On Fri, May 7, 2010 at 4:49 PM, Fernando Lemos wrote:

> On Fri, May 7, 2010 at 5:33 PM, Cristobal Navarro 
> wrote:
> > Hello,
> >
> > my question is the following.
> >
> > is it possible to send and receive C++ objects or STL structures (for
> > example, send map myMap) through openMPI SEND and RECEIVE functions?
> > at first glance i thought it was possible, but after reading some doc, im
> > not sure.
> > i dont have my source code at that stage for testing yet
>
> Not normally, you have to serialize it before sending and deserialize
> it after sending. You could use Boost.MPI and Boost.Serialize too,
> that would probably be the best way to go.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Andrew J Marquis
Dear Jeff,

   am afraid not, as I said in my original post I am using the Intel ifort 
compiler version 9.0, i.e.

fred@prandtl:~>  mpif77 -V

Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.0
Build 20060222 Package ID: 
Copyright (C) 1985-2006 Intel Corporation.  All rights reserved.
FOR NON-COMMERCIAL USE ONLY


I have been looking at this myself and have noted a couple of things. Some of 
these need cross-checking (I am using different computers, different setups, 
different compilers and different Open MPI releases!!), but my thoughts 
at the moment are (point number (4) is possibly the most important so far):

1) If I allocate the string array using an allocate statement then I see that 
ALL of the string locations are initialised to "\0" (character 0).

2) If I set part of a location in the string array then all the OTHER 
characters in the same location are set to " " (character 32). 

3) If the character array is defined via a dimension statement then the 
locations in the array seem to be initialised at random.

4) Looking at the output from my test program I noticed an odd pattern in the 
arguments being sent to the slaves (yes, I do need to quantify this better!!). 
However, this caused me to look at the ompi source; in particular I am looking 
at:

   openmpi-1.4.1/ompi/mpi/f77/base/strings.c

In particular, at the bottom (line 156), in the function 
"ompi_fortran_multiple_argvs_f2c", at the end of the for statement there is the 
line:

   current_array += len * i;

The "* i" looks wrong to me I am thinking it should just be:

   current_array += len;

Making this change improves things, BUT, like you suggest in your email, there 
seems to be a problem locating the end of the 2d-array elements.
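
For illustration (again, not the actual Open MPI source), the difference between 
the two increments shows up in the offset used to read each entry of the 
flattened table:

#include <cstdio>

int main()
{
    const int len = 25;        // per-entry length, as in CHARACTER*25
    int off_bug = 0;           // offset with "current_array += len * i;"
    int off_fix = 0;           // offset with "current_array += len;"
    for (int i = 0; i < 4; ++i) {
        std::printf("entry %d read at offset %3d (len*i) vs %3d (len)\n",
                    i, off_bug, off_fix);
        off_bug += len * i;    // increments by 0, 25, 50, 75 -- skewed
        off_fix += len;        // increments by 25 every time
    }
    return 0;
}

Under the "len * i" arithmetic the second entry is read from the same offset as 
the first, which is at least consistent with both slaves reporting the same 
argument lists.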



I will try and look at this more over the w/e.

Fred Marquis.


On Fri, May 07, 2010 at 10:02:48PM +0100, Jeff Squyres wrote:
> Greetings Fred.
> 
> After looking at this for more hours than I'd care to admit, I'm wondering if 
> this is a bug in gfortran.  I can replicate your problem with a simple 
> program on gfortran 4.1 on RHEL 5.4, but it doesn't happen with the Intel 
> Fortran compiler (11.1) or the PGI fortran compiler (10.0).
> 
> One of the issues appears how to determine how Fortran 2d CHARACTER arrays 
> are terminated.  I can't figure out how gfortran is terminating them -- but 
> intel and PGI both terminate them by having an empty string at the end.
> 
> Are you using gfortran 4.1, perchance?
> 
> 
> 
> 
> On May 5, 2010, at 2:08 PM, Fred Marquis wrote:
> 
> > Hi,
> > 
> >   I am using mpi_comm_spawn_multiple to spawn multiple commands with 
> > argument lists. I am trying to do this in fortran (77) using version 
> > openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE 
> > Linux 10.1 (x86-64).
> > 
> > I have put together a simple controlling example program (test_pbload.F) 
> > and an example slave program (spray.F) to try and explain my problem.
> > 
> > In the controlling program mpi_comm_spawn_multiple is used to set 2 copies 
> > of the slave running. The first is started with the argument list "1 2 3 4" 
> > and the second with "5 6 7 8".
> > 
> > The slaves are started OK and the slaves print out the argument lists and 
> > exit. In addition the slaves print out their rank numbers so I can see 
> > which argument list belongs to which slave.
> > 
> > What I am finding is that the argument lists are not being sent to the 
> > slaves correctly, indeed both slaves seem to be getting both arguments 
> > lists !!!
> > 
> > To compile and run the programs I follow the steps below.
> > 
> > Controlling program "test_pbload.F"
> > 
> >mpif77 -o test_pbload test_pbload.F
> > 
> > Slave program "spray.F"
> > 
> >mpif77 -o spray spray.F
> > 
> > Run the controller
> > 
> >mpirun -np 1 test_pbload
> > 
> > 
> > 
> > 
> > The output of which is from the first slave:
> > 
> >  nsize, mytid: iargs   2   0 :   2
> >  spray:   0 1:1 2 3 4   < FIRST ARGUMENT  
> >  spray:   0 2:4 5 6 7   < SECOND ARGUMENT
> > 
> >  and the second slave:
> > 
> >  nsize, mytid: iargs   2   1 :   2
> >  spray:   1 1:1 2 3 4   < FIRST ARGUMENT
> >  spray:   1 2:4 5 6 7   < SECOND ARGUMENT 
> > 
> > In each case the arguments (2 in both cases) are the same.
> > 
> > I have written a C version of the controlling program and everthing works 
> > as expected so I presume that I have either got the specification of the 
> > argument list wrong or I have discovered an error/bug. At the moment I 
> > working on the former -- but am at a loss to see what is wrong !!
> > 
> > Any help, pointers etc really appreciated.
> > 
> > 
> > Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
> > 
> >   program main
> > c
> >   implicit none
> > #include "mpif.h"
> > 
> >   integer error
> >   integer intercomm
> >   CHARACTER*25 commands(2), argvs(2, 2)

Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Jeff Squyres
Yoinks; I missed that -- sorry!

Here's a simple tarball; can you try this with your compiler?  Just untar it and

  make CC=icc FC=ifort
  ./main

Do you see only 6 entries in the array?

(I have icc 9.0, but I'm now running RHEL 5.4, and the gcc version with it is 
too new for icc 9.0 -- so I can't run it)


On May 7, 2010, at 5:44 PM, Andrew J Marquis wrote:

> Dear Jeff,
> 
>am afraid not, as I said in my original post I am using the Intel ifort 
> compiler version 9.0, i.e.
> 
> fred@prandtl:~>  mpif77 -V
> 
> Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.0  
>   Build 20060222 Package ID: 
> Copyright (C) 1985-2006 Intel Corporation.  All rights reserved.
> FOR NON-COMMERCIAL USE ONLY
> 
> 
> I have been looking at this myself and have noted a couple of things, some of 
> these need cross-checking (I am using different computers and different 
> setups and different compilers and different openmpi releases !!)  but my 
> thoughts at the moment are (point number (4) is possibly the most important 
> so far):
> 
> 1) If I allocate the string array using an allocate statement then I see that 
> ALL of the string locations are initialised to "\0" (character 0).
> 
> 2) If I set part of a location in the string array then all the OTHER 
> characters in the same location are set to " " (character 32).
> 
> 3) If the character array is defined via a dimension statement then the 
> locations in the array seem to be initialised at random.
> 
> 4) Looking at the output from my test program I noticed and odd pattern in 
> the arguments being sent to the slaves (yes I do need to quantify this better 
> !!). However this caused me to look at the ompi source, in particular I am 
> looking at:
> 
>openmpi-1.4.1/ompi/mpi/f77/base/strings.c
> 
> In particular at the bottom (line 156( in function 
> "ompi_fortran_multiple_argvs_f2c" at the end of the for statement there is 
> the line:
> 
>current_array += len * i;
> 
> The "* i" looks wrong to me I am thinking it should just be:
> 
>current_array += len;
> 
> making this change improves things BUT like you suggest in your email there 
> seems to be a problem locating the end of the 2d-array elements.
> 
> 
> 
> I will try and look at this more over the w/e.
> 
> Fred Marquis.
> 
> 
> On Fri, May 07, 2010 at 10:02:48PM +0100, Jeff Squyres wrote:
> > Greetings Fred.
> >
> > After looking at this for more hours than I'd care to admit, I'm wondering 
> > if this is a bug in gfortran.  I can replicate your problem with a simple 
> > program on gfortran 4.1 on RHEL 5.4, but it doesn't happen with the Intel 
> > Fortran compiler (11.1) or the PGI fortran compiler (10.0).
> >
> > One of the issues appears how to determine how Fortran 2d CHARACTER arrays 
> > are terminated.  I can't figure out how gfortran is terminating them -- but 
> > intel and PGI both terminate them by having an empty string at the end.
> >
> > Are you using gfortran 4.1, perchance?
> >
> >
> >
> >
> > On May 5, 2010, at 2:08 PM, Fred Marquis wrote:
> >
> > > Hi,
> > >
> > >   I am using mpi_comm_spawn_multiple to spawn multiple commands with 
> > > argument lists. I am trying to do this in fortran (77) using version 
> > > openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE 
> > > Linux 10.1 (x86-64).
> > >
> > > I have put together a simple controlling example program (test_pbload.F) 
> > > and an example slave program (spray.F) to try and explain my problem.
> > >
> > > In the controlling program mpi_comm_spawn_multiple is used to set 2 
> > > copies of the slave running. The first is started with the argument list 
> > > "1 2 3 4" and the second with "5 6 7 8".
> > >
> > > The slaves are started OK and the slaves print out the argument lists and 
> > > exit. In addition the slaves print out their rank numbers so I can see 
> > > which argument list belongs to which slave.
> > >
> > > What I am finding is that the argument lists are not being sent to the 
> > > slaves correctly, indeed both slaves seem to be getting both arguments 
> > > lists !!!
> > >
> > > To compile and run the programs I follow the steps below.
> > >
> > > Controlling program "test_pbload.F"
> > >
> > >mpif77 -o test_pbload test_pbload.F
> > >
> > > Slave program "spray.F"
> > >
> > >mpif77 -o spray spray.F
> > >
> > > Run the controller
> > >
> > >mpirun -np 1 test_pbload
> > >
> > >
> > >
> > >
> > > The output of which is from the first slave:
> > >
> > >  nsize, mytid: iargs   2   0 :   2
> > >  spray:   0 1:1 2 3 4   < FIRST ARGUMENT 
> > >  spray:   0 2:4 5 6 7   < SECOND ARGUMENT   
> > >
> > >  and the second slave:
> > >
> > >  nsize, mytid: iargs   2   1 :   2
> > >  spray:   1 1:1 2 3 4   < FIRST ARGUMENT   
> > >  spray:   1 2:4 5 6 7   < SECOND ARGUMENT
> > >
> > > In each case the arguments (2 in both cases) are the same.
> > >
> > > I have written a C version of t

Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Andrew J Marquis
Dear Jeff,

  that's odd!!

fred@prandtl:~/test/fortran-c-2d-char> make CC=icc FC=ifort
ifort -g  -c -o main.o main.f
icc -g   -c -o c_func.o c_func.c

Error: A license for CComp is not available (-5,357).




I will look into this tomorrow, time for bed I am afraid !!

Fred Marquis.

On Fri, May 07, 2010 at 10:49:40PM +0100, Jeff Squyres wrote:
> Yoinks; I missed that -- sorry!
> 
> Here's a simple tarball; can you try this with your compiler?  Just untar it 
> and
> 
>   make CC=icc FC=ifort
>   ./main
> 
> Do you see only 6 entries in the array?
> 
> (I have icc 9.0, but I'm now running RHEL 5.4, and the gcc version with it is 
> too new for icc 9.0 -- so I can't run it)
> 
> 
> On May 7, 2010, at 5:44 PM, Andrew J Marquis wrote:
> 
> > Dear Jeff,
> > 
> >am afraid not, as I said in my original post I am using the Intel ifort 
> > compiler version 9.0, i.e.
> > 
> > fred@prandtl:~>  mpif77 -V
> > 
> > Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 
> > 9.0Build 20060222 Package ID: 
> > Copyright (C) 1985-2006 Intel Corporation.  All rights reserved.
> > FOR NON-COMMERCIAL USE ONLY
> > 
> > 
> > I have been looking at this myself and have noted a couple of things; some 
> > of these need cross-checking (I am using different computers, setups, 
> > compilers, and openmpi releases !!), but my thoughts at the moment are as 
> > follows (point (4) is possibly the most important so far):
> > 
> > 1) If I allocate the string array using an allocate statement then I see 
> > that ALL of the string locations are initialised to "\0" (character 0).
> > 
> > 2) If I set part of a location in the string array then all the OTHER 
> > characters in the same location are set to " " (character 32).
> > 
> > 3) If the character array is defined via a dimension statement then the 
> > locations in the array seem to be initialised at random.
> > 
> > 4) Looking at the output from my test program I noticed an odd pattern in 
> > the arguments being sent to the slaves (yes I do need to quantify this 
> > better !!). However, this caused me to look at the ompi source, in 
> > particular I am looking at:
> > 
> >openmpi-1.4.1/ompi/mpi/f77/base/strings.c
> > 
> > In particular at the bottom (line 156) in function 
> > "ompi_fortran_multiple_argvs_f2c" at the end of the for statement there is 
> > the line:
> > 
> >current_array += len * i;
> > 
> > The "* i" looks wrong to me I am thinking it should just be:
> > 
> >current_array += len;
> > 
> > making this change improves things, BUT, as you suggest in your email, there 
> > seems to be a problem locating the end of the 2d-array elements.
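
To make the pointer arithmetic concrete, here is a small standalone C sketch --
a model of the layout only, not the actual strings.c code -- that walks a
blank-padded, fixed-width CHARACTER array the way it sits in memory.  Stepping
by len each pass visits every slot exactly once; stepping by len * i does not:

#include <stdio.h>

/* Model of a Fortran CHARACTER(len=8), DIMENSION(2) array as seen from C:
 * fixed-width, blank-padded slots, back to back, with no NUL bytes between
 * them (the implicit '\0' at the end of the C literal is not part of the
 * Fortran data). */
static const char commands[] = "1 2 3 4 5 6 7 8 ";

int main(void)
{
    const int len = 8;          /* declared CHARACTER length */
    const int count = 2;        /* number of array elements  */
    const char *current = commands;
    int i;

    for (i = 0; i < count; ++i) {
        printf("slot %d: \"%.*s\"\n", i, len, current);

        current += len;         /* correct: one fixed-width slot per pass */

        /* The buggy variant, current += len * i, advances by 0 on the
         * first pass (so slot 0 would be read again) and then by ever
         * larger jumps, so later slots get skipped entirely. */
    }
    return 0;
}

This only models the indexing, of course; it says nothing about how the end of
the array is detected, which is the other open question in this thread.
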
> > 
> > 
> > 
> > I will try and look at this more over the w/e.
> > 
> > Fred Marquis.
> > 
> > 
> > On Fri, May 07, 2010 at 10:02:48PM +0100, Jeff Squyres wrote:
> > > Greetings Fred.
> > >
> > > After looking at this for more hours than I'd care to admit, I'm 
> > > wondering if this is a bug in gfortran.  I can replicate your problem 
> > > with a simple program on gfortran 4.1 on RHEL 5.4, but it doesn't happen 
> > > with the Intel Fortran compiler (11.1) or the PGI fortran compiler (10.0).
> > >
> > > One of the issues appears to be how to determine how Fortran 2d CHARACTER 
> > > arrays are terminated.  I can't figure out how gfortran is terminating 
> > > them -- but intel and PGI both terminate them by having an empty string 
> > > at the end.
> > >
> > > Are you using gfortran 4.1, perchance?
> > >
> > >
> > >
> > >
> > > On May 5, 2010, at 2:08 PM, Fred Marquis wrote:
> > >
> > > > Hi,
> > > >
> > > >   I am using mpi_comm_spawn_multiple to spawn multiple commands with 
> > > > argument lists. I am trying to do this in fortran (77) using version 
> > > > openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE 
> > > > Linux 10.1 (x86-64).
> > > >
> > > > I have put together a simple controlling example program 
> > > > (test_pbload.F) and an example slave program (spray.F) to try and 
> > > > explain my problem.
> > > >
> > > > In the controlling program mpi_comm_spawn_multiple is used to set 2 
> > > > copies of the slave running. The first is started with the argument 
> > > > list "1 2 3 4" and the second with "5 6 7 8".
> > > >
> > > > The slaves are started OK and the slaves print out the argument lists 
> > > > and exit. In addition the slaves print out their rank numbers so I can 
> > > > see which argument list belongs to which slave.
> > > >
> > > > What I am finding is that the argument lists are not being sent to the 
> > > > slaves correctly, indeed both slaves seem to be getting both argument 
> > > > lists !!!
> > > >
> > > > To compile and run the programs I follow the steps below.
> > > >
> > > > Controlling program "test_pbload.F"
> > > >
> > > >mpif77 -o test_pbload test_pbload.F
> > > >
> > > > Slave program "spray.F"
> > > >
> > > >mpif77 -o spray spray.F
> > > >
> > > > Run the controller
> > > >
> > > > 

Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Andrew J Marquis
Dear Jeff,

  following the failure I just reported, I changed CC=icc to CC=cc, reran, and 
got this:

fred@prandtl:~/test/fortran-c-2d-char> make CC=cc FC=ifort
cc -g   -c -o c_func.o c_func.c
ifort -g main.o c_func.o -g -o main
fred@prandtl:~/test/fortran-c-2d-char> ./main 
Got leading dimension: 2
Got string len: 14
Found string: 1 2 3 4
Found string: 4 5 6 7
Found string: hello
Found string: goodbye
Found string: helloagain
Found string: goodbyeagain
End of the array -- found 6 entries



Fred Marquis.


On Fri, May 07, 2010 at 10:49:40PM +0100, Jeff Squyres wrote:
> Yoinks; I missed that -- sorry!
> 
> Here's a simple tarball; can you try this with your compiler?  Just untar it 
> and
> 
>   make CC=icc FC=ifort
>   ./main
> 
> Do you see only 6 entries in the array?
> 
> (I have icc 9.0, but I'm now running RHEL 5.4, and the gcc version with it is 
> too new for icc 9.0 -- so I can't run it)
> 
> 
> On May 7, 2010, at 5:44 PM, Andrew J Marquis wrote:
> 
> > Dear Jeff,
> > 
> >    I am afraid not; as I said in my original post, I am using the Intel ifort 
> > compiler version 9.0, i.e.
> > 
> > fred@prandtl:~>  mpif77 -V
> > 
> > Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 
> > 9.0  Build 20060222 Package ID: 
> > Copyright (C) 1985-2006 Intel Corporation.  All rights reserved.
> > FOR NON-COMMERCIAL USE ONLY
> > 
> > 
> > I have been looking at this myself and have noted a couple of things; some 
> > of these need cross-checking (I am using different computers, setups, 
> > compilers, and openmpi releases !!), but my thoughts at the moment are as 
> > follows (point (4) is possibly the most important so far):
> > 
> > 1) If I allocate the string array using an allocate statement then I see 
> > that ALL of the string locations are initialised to "\0" (character 0).
> > 
> > 2) If I set part of a location in the string array then all the OTHER 
> > characters in the same location are set to " " (character 32).
> > 
> > 3) If the character array is defined via a dimension statement then the 
> > locations in the array seem to be initialised at random.
> > 
> > 4) Looking at the output from my test program I noticed an odd pattern in 
> > the arguments being sent to the slaves (yes I do need to quantify this 
> > better !!). However, this caused me to look at the ompi source, in 
> > particular I am looking at:
> > 
> >openmpi-1.4.1/ompi/mpi/f77/base/strings.c
> > 
> > In particular at the bottom (line 156) in function 
> > "ompi_fortran_multiple_argvs_f2c" at the end of the for statement there is 
> > the line:
> > 
> >current_array += len * i;
> > 
> > The "* i" looks wrong to me I am thinking it should just be:
> > 
> >current_array += len;
> > 
> > making this change improves things, BUT, as you suggest in your email, there 
> > seems to be a problem locating the end of the 2d-array elements.
> > 
> > 
> > 
> > I will try and look at this more over the w/e.
> > 
> > Fred Marquis.
> > 
> > 
> > On Fri, May 07, 2010 at 10:02:48PM +0100, Jeff Squyres wrote:
> > > Greetings Fred.
> > >
> > > After looking at this for more hours than I'd care to admit, I'm 
> > > wondering if this is a bug in gfortran.  I can replicate your problem 
> > > with a simple program on gfortran 4.1 on RHEL 5.4, but it doesn't happen 
> > > with the Intel Fortran compiler (11.1) or the PGI fortran compiler (10.0).
> > >
> > > One of the issues appears to be how to determine how Fortran 2d CHARACTER 
> > > arrays are terminated.  I can't figure out how gfortran is terminating 
> > > them -- but intel and PGI both terminate them by having an empty string 
> > > at the end.
> > >
> > > Are you using gfortran 4.1, perchance?
> > >
> > >
> > >
> > >
> > > On May 5, 2010, at 2:08 PM, Fred Marquis wrote:
> > >
> > > > Hi,
> > > >
> > > >   I am using mpi_comm_spawn_multiple to spawn multiple commands with 
> > > > argument lists. I am trying to do this in fortran (77) using version 
> > > > openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE 
> > > > Linux 10.1 (x86-64).
> > > >
> > > > I have put together a simple controlling example program 
> > > > (test_pbload.F) and an example slave program (spray.F) to try and 
> > > > explain my problem.
> > > >
> > > > In the controlling program mpi_comm_spawn_multiple is used to set 2 
> > > > copies of the slave running. The first is started with the argument 
> > > > list "1 2 3 4" and the second with "5 6 7 8".
> > > >
> > > > The slaves are started OK and the slaves print out the argument lists 
> > > > and exit. In addition the slaves print out their rank numbers so I can 
> > > > see which argument list belongs to which slave.
> > > >
> > > > What I am finding is that the argument lists are not being sent to the 
> > > > slaves correctly, indeed both slaves seem to be getting both argument 
> > > > lists !!!
> > > >
> > > > To compile and run the programs I follow the steps below.
> > > >

Re: [OMPI users] Problem with mpi_comm_spawn_multiple

2010-05-07 Thread Noam Bernstein


I haven't been following this whole discussion, but I do know
something about how Fortran allocates and passes string arguments
(the joys of Fortran/C/python inter-language calls), for what
it's worth.

By definition in the Fortran language all strings have a predefined
length, which Fortran magically knows.  Therefore, it doesn't need
termination (like null termination in C).  In practice, we've found
that most compilers, when you pass a string to a routine, also pass
its length, although you don't do so explicitly, so that "character(len=*)
arg" works.  In practice (and this is probably part of the standard),
Fortran blank-pads (' ') all strings.  As usual, there's no default value
in new strings, so strings declared, string arrays declared, and
string arrays allocated may or may not have any particular value
at initialization.  But if you set a string to ' ', or even '',
it'll actually set it all to spaces, because of the padding.  There's
no null termination.  len(string) returns the overall (allocated
or declared) length; len_trim(string) returns the length ignoring
any trailing spaces.  If the underlying routines receiving the
string want to know how long it is, and they're C routines, they
have to have another (integer, by reference) argument.  Unfortunately,
compilers don't always agree on where that hidden argument goes.
Most, I think, put it after the explicit arguments, but some might
put it right after the string being passed.  If the receiving
routines are Fortran from another compiler, you're just hosed, I
think.
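
For illustration, a minimal C sketch of the receiving side, written under
common but compiler-dependent assumptions (trailing-underscore name mangling,
a hidden length appended after the explicit arguments and, here, passed by
value); the names and the convention are illustrative only:

#include <stdio.h>

/* Equivalent of Fortran's len_trim(): the declared length minus trailing
 * blanks.  There is no NUL terminator to search for. */
static int len_trim(const char *s, int len)
{
    while (len > 0 && s[len - 1] == ' ')
        --len;
    return len;
}

/* Hypothetical C routine as it might be called from Fortran with one
 * CHARACTER argument: a pointer to the blank-padded text, plus a hidden
 * length argument supplied by the compiler (its type, placement, and
 * passing convention all vary between compilers). */
void print_string_(const char *s, int len)
{
    int used = len_trim(s, len);
    printf("declared len = %d, used len = %d, text = \"%.*s\"\n",
           len, used, used, s);
}

int main(void)
{
    /* Stand-in for a CHARACTER(len=12) actual argument: the text followed
     * by blank padding out to the declared length. */
    print_string_("goodbye     ", 12);
    return 0;
}
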

As with every array in Fortran, arrays of strings
are contiguous in memory, and presumably the end of string (1,1)
is right before the beginning of string(2,1), etc.

Noam
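
Tying that back to the question of how C code can tell where such an array
ends: a short sketch, with made-up contents, of the contiguous layout, showing
that an explicit element count (or a sentinel the caller agrees to append,
such as a trailing all-blank entry) is what lets the C side know where to
stop:

#include <stdio.h>

/* Layout of a CHARACTER(len=10), DIMENSION(3) array as C sees it:
 * element (1) at offset 0, element (2) at offset 10, element (3) at
 * offset 20, all blank-padded, with nothing marking the end. */
static const char block[] =
    "hello     "
    "goodbye   "
    "again     ";

int main(void)
{
    const int len = 10;
    const int count = 3;    /* must be supplied; there is no terminator */
    int i;

    for (i = 0; i < count; ++i)
        printf("element %d: \"%.*s\"\n", i + 1, len, block + i * len);

    /* Reading past count * len bytes would just pick up whatever happens
     * to follow the array in memory. */
    return 0;
}
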