[OMPI users] maximum size for read buffer in MPI_File_read/write

2011-09-22 Thread German Hoecht
Hello,

The MPI_File_read/write functions use an integer to specify the size of
the buffer, for instance:
int MPI_File_read(MPI_File fh, void *buf, int count, MPI_Datatype
datatype, MPI_Status *status)
with:
count Number of elements in buffer (integer).
datatype  Data type of each buffer element (handle).

However, using the maximum value of a 32-bit signed integer:
count = 2^31-1 = 2147483647 (with datatype = MPI_BYTE),
MPI_File_read only reads 2^31-2^12 = 2147479552 bytes.
This means that 4095 bytes are ignored.

I am not aware of this specific limit for integers in (Open) MPI
function calls. Is this supposed to be correct?

MPI_File_read/write does not return an error (but MPI_Get_count reports
that only 2147479552 bytes were transferred). Find attached a C++ code
example which tries to write and read 2^31-1 bytes.
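
For reference, here is a minimal stand-alone sketch of the kind of check the
attached example performs (the file name "test.dat" and the use of
MPI_COMM_SELF are placeholders, not taken from the attachment):

#include <mpi.h>
#include <limits>
#include <cstdio>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  MPI_File fh;
  MPI_File_open(MPI_COMM_SELF, (char*)"test.dat",
                MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

  // request the maximum number of bytes a 32-bit int can express
  const int count = std::numeric_limits<int>::max();   // 2^31-1
  char* buf = new char[static_cast<size_t>(count)];

  MPI_Status status;
  MPI_File_read(fh, buf, count, MPI_BYTE, &status);

  int nread = 0;
  MPI_Get_count(&status, MPI_BYTE, &nread);   // bytes actually delivered
  std::printf("requested %d bytes, got %d bytes\n", count, nread);

  delete[] buf;
  MPI_File_close(&fh);
  MPI_Finalize();
  return 0;
}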

I am using Open MPI 1.4.2 compiled with the Intel compiler.

Best regards,
German

#include <iostream>
#include <fstream>
#include <cstdlib>
#include <limits>
#ifndef NO_MPI
#include <mpi.h>
#endif
#include <cstring>

using namespace std;

int main( int nargs, char *args[] )
{
  int mpi_rank, mpi_size;
#ifndef NO_MPI
  MPI_Init( &nargs, &args );
  MPI_Comm_rank( MPI_COMM_WORLD, &mpi_rank );
  MPI_Comm_size( MPI_COMM_WORLD, &mpi_size );
#else
  mpi_rank = 0;
  mpi_size = 1;
#endif

  int  nbuf =  std::numeric_limits<int>::max();

  if ( nargs < 2 ){
    if ( mpi_rank == 0 ){
      cerr<<"usage:\n"
          <<"--\n"
          <<
[attachment truncated in the archive]

Re: [OMPI users] Segfault on any MPI communication on head node

2011-09-27 Thread German Hoecht
char* name[20]; declares an array of 20 (uninitialized) pointers to char;
I guess you mean char name[20];

So Brent's suggestion should work as well(?)

To be safe I would also add:
gethostname(name,maxlen);
name[19] = '\0';
printf("Hello, world.  I am %d of %d and host %s \n", rank, ...

Cheers

On 09/27/2011 07:40 PM, Phillip Vassenkov wrote:
> Thanks, but my main concern is the segfault :P I changed it and, as I
> expected, it still segfaults.
> 
> On 9/27/11 9:48 AM, Henderson, Brent wrote:
>> Here is another possibly non-helpful suggestion.  :)  Change:
>>
>>   char* name[20];
>>   int maxlen = 20;
>>
>> To:
>>
>>   char name[256];
>>   int maxlen = 256;
>>
>> gethostname() is supposed to properly truncate the hostname it returns
>> if the actual name is longer than the length provided, but since you
>> have at least one that is longer than 20 characters, I'm curious.
>>
>> Brent
>>
>>
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
>> On Behalf Of Jeff Squyres
>> Sent: Tuesday, September 27, 2011 6:29 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Segfault on any MPI communication on head node
>>
>> Hmm.  It's not immediately clear to me what's going wrong here.
>>
>> I hate to ask, but could you install a debugging version of Open MPI
>> and capture a proper stack trace of the segv?
>>
>> Also, could you try the 1.4.4 rc and see if that magically fixes the
>> problem? (I'm about to post a new 1.4.4 rc later this morning, but
>> either the current one or the one from later today would be a good
>> datapoint)
>>
>>
>> On Sep 26, 2011, at 5:09 PM, Phillip Vassenkov wrote:
>>
>>> Yep, Fedora Core 14 and OpenMPI 1.4.3
>>>
>>> On 9/24/11 7:02 AM, Jeff Squyres wrote:
 Are you running the same OS version and Open MPI version between the
 head node and regular nodes?

 On Sep 23, 2011, at 5:27 PM, Vassenkov, Phillip wrote:

> Hey all,
> I've been racking my brains over this for several days and was
> hoping anyone could enlighten me. I'll describe only the relevant
> parts of the network/computer systems. There is one head node and a
> multitude of regular nodes. The regular nodes are all identical to
> each other. If I run an MPI program from one of the regular nodes
> to any other regular node, everything works. If I include the head
> node in the hosts file, I get segfaults, which I'll paste below
> along with sample code. The machines are all networked via
> InfiniBand and Ethernet. The issue only arises when MPI
> communication occurs. By this I mean, MPI_Init might succeed, but
> the segfault always occurs on MPI_Barrier or MPI_Send/Recv. I found
> a workaround by disabling the openib btl and forcing communication
> over the InfiniBand interface (if I don't force it, traffic goes
> over Ethernet). This command works when the head node is
> included in the hosts file:
> mpirun --hostfile hostfile --mca btl ^openib --mca
> btl_tcp_if_include ib0  -np 2 ./b.out
>
> Sample Code:
> #include "mpi.h"
> #include
> int main(int argc, char *argv[])
> {
> int rank, nprocs;
>  char* name[20];
>  int maxlen = 20;
>  MPI_Init(&argc,&argv);
>  MPI_Comm_size(MPI_COMM_WORLD,&nprocs);
>  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>  MPI_Barrier(MPI_COMM_WORLD);
>  gethostname(name,maxlen);
>  printf("Hello, world.  I am %d of %d and host %s \n", rank,
> nprocs,name);
>  fflush(stdout);
>  MPI_Finalize();
>  return 0;
>
> }
>
> Segfault:
> [pastec:19917] *** Process received signal ***
> [pastec:19917] Signal: Segmentation fault (11)
> [pastec:19917] Signal code: Address not mapped (1)
> [pastec:19917] Failing at address: 0x8
> [pastec:19917] [ 0] /lib64/libpthread.so.0() [0x34a880eeb0]
> [pastec:19917] [ 1] /usr/lib64/libmthca-rdmav2.so(+0x36aa)
> [0x7eff6430b6aa]
> [pastec:19917] [ 2]
> /usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so(+0x133c9)
> [0x7eff66a163c9]
> [pastec:19917] [ 3]
> /usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so(+0x1eb70)
> [0x7eff66a21b70]
> [pastec:19917] [ 4]
> /usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so(+0x1ec89)
> [0x7eff66a21c89]
> [pastec:19917] [ 5]
> /usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so(+0x1403d)
> [0x7eff66a1703d]
> [pastec:19917] [ 6]
> /usr/lib64/openmpi/lib/openmpi/mca_pml_ob1.so(+0x120e6)
> [0x7eff676670e6]
> [pastec:19917] [ 7]
> /usr/lib64/openmpi/lib/openmpi/mca_pml_ob1.so(+0x6273)
> [0x7eff6765b273]
> [pastec:19917] [ 8]
> /usr/lib64/openmpi/lib/openmpi/mca_coll_tuned.so(+0x1b2f)
> [0x7eff65539b2f]
> [pastec:19917] [ 9]
> /usr/lib64/openmpi/lib/openmpi/mca_coll_tuned.so(+0xa5cf)
> [0x7eff655425cf]
> [pastec:19917] [10]
> /usr/lib64/openmpi/lib/libmpi.so.0(MPI_

Re: [OMPI users] maximum size for read buffer in MPI_File_read/write

2011-09-28 Thread German Hoecht
Hi Rob,

thanks for your comments. I understand that it's most probably not worth
the effort to find the actual reason.

Because I have to deal with very large files I preferred using
"std::numeric_limits<int>::max()" rather than a hard-coded value
to split the read in case an I/O request exceeds this amount. (This is
not the usual case, but it can happen.)

So your advice to cap the I/O buffer at 1 GB is very valuable.
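
For illustration, a sketch of that splitting strategy (the function name and
the 1 GB chunk size are my choices, not taken from the actual code):

#include <mpi.h>
#include <algorithm>

// Read total_bytes into buf in chunks of at most 1 GB each.
void read_in_chunks(MPI_File fh, char* buf, MPI_Offset total_bytes)
{
  const MPI_Offset max_chunk = 1024LL * 1024LL * 1024LL;   // 1 GB per request
  MPI_Offset done = 0;
  while (done < total_bytes) {
    int count = static_cast<int>(std::min(max_chunk, total_bytes - done));
    MPI_Status status;
    MPI_File_read(fh, buf + done, count, MPI_BYTE, &status);

    int got = 0;
    MPI_Get_count(&status, MPI_BYTE, &got);   // bytes actually delivered
    if (got <= 0) break;                      // stop on error or end of file
    done += got;                              // advance past what was read
  }
}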

To be honest, I did not do this check until we observed strange
numbers... Usually, the MPI/ROMIO read/write functions are very stable;
the code in question has read several terabytes in the meantime.

Best regards,
German

On 09/27/2011 10:01 PM, Rob Latham wrote:
> On Thu, Sep 22, 2011 at 11:37:10PM +0200, German Hoecht wrote:
>> Hello,
>>
>> The MPI_File_read/write functions use an integer to specify the size of
>> the buffer, for instance:
>> int MPI_File_read(MPI_File fh, void *buf, int count, MPI_Datatype
>> datatype, MPI_Status *status)
>> with:
>> count Number of elements in buffer (integer).
>> datatype  Data type of each buffer element (handle).
>>
>> However, using the maximum value of a 32-bit signed integer:
>> count = 2^31-1 = 2147483647 (with datatype = MPI_BYTE),
>> MPI_File_read only reads 2^31-2^12 = 2147479552 bytes.
>> This means that 4095 bytes are ignored.
>>
>> I am not aware of this specific limit for integers in (Open) MPI
>> function calls. Is this supposed to be correct?
> 
> Hi.  I'm the ROMIO maintainer.  Open MPI more or less rolls ROMIO up
> into Open MPI, so any problems with the MPI_File_* routines are in my
> lap, not Open MPI's.
> 
> I'll be honest with you: I've not given any thought to just how big
> the biggest request could be.  The independent routines, especially
> with a simple type like MPI_BYTE, almost immediately call the
> underlying POSIX read() or write().
> 
> I can confirm the behavior you observe with your test program.
> Thanks much for providing one.  I'll dig around but I cannot think of
> something in ROMIO that would ignore these 4095 bytes.   I do think
> it's legal by the letter of the standard to read or write less than
> requested.   "Upon completion, the amount of data accessed by the
> calling process is returned in a status."   
> 
> Bravo to you for actually checking return values and the status.  I
> don't think many non-library codes do that :>
> 
> I should at least be able to explain the behavior, so I'll dig a bit.
> 
> In general, if you plot I/O performance vs. block size, every file
> system tops out around several tens of megabytes.  So we have given
> the advice to split this nearly 2 GB request into several 1 GB
> requests.
> 
> ==rob
> 



Re: [OMPI users] Sending vector elements of type T using MPI_Ssend, plz help.

2011-11-02 Thread German Hoecht
Hi,

you could try the following (template):

MPI_Send( &vec[first_element],  num_elements*sizeof(T), MPI_BYTE, ..)
MPI_Recv( &vec[first_element],  num_elements*sizeof(T), MPI_BYTE, ..)

As far as I know, STL vectors store their elements in contiguous memory.
However, I didn't test this, and boost.mpi may be the safest solution.
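
For what it's worth, an untested sketch of that idea for a std::vector<T>
(only safe for trivially copyable T; the helper names and the tag are mine,
and MPI_Ssend can be substituted for MPI_Send in the same way):

#include <mpi.h>
#include <vector>

template <typename T>
void send_range(const std::vector<T>& vec, int first, int last, int dest, int tag)
{
  int num = last - first + 1;                 // e.g. elements 4..10 -> 7 elements
  MPI_Send(const_cast<T*>(&vec[first]), static_cast<int>(num * sizeof(T)),
           MPI_BYTE, dest, tag, MPI_COMM_WORLD);
}

template <typename T>
void recv_range(std::vector<T>& vec, int first, int last, int src, int tag)
{
  int num = last - first + 1;
  MPI_Recv(&vec[first], static_cast<int>(num * sizeof(T)),
           MPI_BYTE, src, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}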

Cheers

On 11/02/2011 01:28 PM, Jeff Squyres (jsquyres) wrote:
> You might want to look at boost.mpi. 
> 
> Sent from my phone. No type good. 
> 
> On Nov 1, 2011, at 2:58 PM, "Mudassar Majeed" wrote:
> 
>> Dear MPI people,
>> I have a templated vector class as
>> follows,
>>
>> template <typename T>
>> class Vector
>>
>> It is a wrapper around the STL vector class. The element type T
>> will be replaced by the actual instantiated type at runtime. I
>> have not seen any support in C++ templates for checking the type of T.
>> I need to send elements of type T that are in the Vector<T> v using
>> MPI_Ssend; please help me with how I can do that. How can I send a few
>> elements, say from the 4th element to the 10th element of the
>> vector?
>>
>> regards,
>> Mudassar