On Tue, Sep 18, 2012 at 2:14 PM, Alidoust <phymalido...@gmail.com> wrote:
>
> Dear Madam/Sir,
>
>
> I have a serial Fortran (f90) code built around matrix diagonalization
> subroutines, and I recently wrote a parallel version to speed up the
> parts that are infeasible with the serial program.
> I have been using the following commands to initialize MPI in the code:
> ---------------
>     call MPI_INIT(ierr)
>     call MPI_COMM_SIZE(MPI_COMM_WORLD, p, ierr)
>     call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
>
> CPU requirement >> pmem=1500mb,nodes=5:ppn=8 <<
> -------------------
> Everything looks OK when the matrix dimensions are less than 1000x1000. When I
> increase the matrix dimensions to larger values, the parallel code fails with
> the following error:
> ------------------
> mpirun noticed that process rank 6 with PID 1566 on node node1082 exited on
> signal 11 (Segmentation fault)
> ------------------
> There is no such error with the serial version, even for matrix dimensions
> larger than 2400x2400. I then thought it might be caused by the number of
> nodes and the amount of memory I'm requesting, so I changed the request as
> follows:
>
> pmem=10gb,nodes=20:ppn=2
>
> which is more or less similar to what I'm using for serial jobs
> (mem=10gb,nodes=1:ppn=1). But the problem still persists. Is there a
> limit on the amount of data the MPI subroutines can transfer, or is the
> issue caused by something else?
>
> Best regards,
> Mohammad
>

I believe the send/recv/bcast calls are all limited to about 2 GB of
data per call, since they use a signed 32-bit integer to denote the
size.  If your matrices require a lot of space per element, I suppose
this limit could be reached.
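
A rough way to check whether a given matrix comes anywhere near that
limit is to do the arithmetic directly. The sketch below is only a
back-of-the-envelope check, written in Python for convenience. The 16
bytes per element (double-precision complex) is an assumption, since
the original message does not say what the element type is, and the
function names are made up for illustration.

```python
import math

# Largest value a signed 32-bit integer can hold (2,147,483,647).
INT32_MAX = 2**31 - 1


def matrix_bytes(n, bytes_per_element):
    """Size in bytes of a dense n x n matrix."""
    return n * n * bytes_per_element


def fits_in_one_call(n, bytes_per_element):
    """True if the whole matrix is no larger than INT32_MAX bytes."""
    return matrix_bytes(n, bytes_per_element) <= INT32_MAX


def largest_dimension(bytes_per_element):
    """Largest n for which an n x n matrix stays within INT32_MAX bytes."""
    return math.isqrt(INT32_MAX // bytes_per_element)


if __name__ == "__main__":
    # ASSUMPTION: 16 bytes per element (double-precision complex).
    for n in (1000, 2400):
        print(n, matrix_bytes(n, 16), fits_in_one_call(n, 16))
    print("largest n:", largest_dimension(16))
```

Under that assumption a 2400x2400 matrix is about 92 MB, well below
2 GB, so a single matrix of that size would not reach the limit by
itself. It could still matter if many matrices are packed into one
message.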

  Brian
