On Tue, Sep 18, 2012 at 2:14 PM, Alidoust <phymalido...@gmail.com> wrote:
>
> Dear Madam/Sir,
>
> I have a serial Fortran (F90) code that uses matrix-diagonalization
> subroutines, and I recently made a parallel version to speed up parts
> that are infeasible with the serial program.
> I have been using the following calls to initialize MPI in the code:
> ---------------
> call MPI_INIT(ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD, p, ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
>
> CPU requirement >> pmem=1500mb,nodes=5:ppn=8 <<
> -------------------
> Everything looks OK when the matrix dimensions are less than 1000x1000.
> When I increase the matrix dimensions to larger values, the parallel
> code fails with the following error:
> ------------------
> mpirun noticed that process rank 6 with PID 1566 on node node1082 exited
> on signal 11 (Segmentation fault)
> ------------------
> There is no such error with the serial version, even for matrix
> dimensions larger than 2400x2400. I then thought the problem might be
> caused by the number of nodes and the amount of memory I was requesting,
> so I changed the request to
>
> pmem=10gb,nodes=20:ppn=2
>
> which is more or less similar to what I use for serial jobs
> (mem=10gb,nodes=1:ppn=1). But the problem still persists. Is there a
> limit on the data size MPI subroutines can transfer, or could the issue
> have some other cause?
>
> Best Regards,
> Mohammad
I believe the send/recv/bcast calls are all limited to a count of 2^31 - 1 elements per call, since they use a signed 32-bit integer to denote the size; for byte-sized types that works out to roughly 2 GB of data. If your matrices require a lot of space per element, I suppose this limit could be reached.

Brian