On 10/27/2014 8:30 AM, maxinator333 wrote:
Hello,

I noticed this odd behavior because, after running for more than a
minute, the transfer rates of MPI_Send and MPI_Recv dropped by a
factor of 100 or more. By chance I saw that my program kept allocating
more and more memory. I have the following minimal working example:

    #include <cstdint>   // uint32_t, uint64_t
    #include <cstdlib>   // malloc
    #include <mpi.h>

    const uint32_t MSG_LENGTH = 256;

    int main(int argc, char* argv[]) {
         MPI_Init(NULL, NULL);
         int rank;
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);

         volatile char * msg  = (char*) malloc( sizeof(char) * MSG_LENGTH );

         for (uint64_t i = 0; i < 1e9; i++) {
             if ( rank == 1 ) {
                 MPI_Recv( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
                           rank-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                 MPI_Send( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
                           rank-1, 0, MPI_COMM_WORLD);
             } else if ( rank == 0 ) {
                 MPI_Send( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
                           rank+1, 0, MPI_COMM_WORLD);
                 MPI_Recv( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
                           rank+1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
             }
             MPI_Barrier( MPI_COMM_WORLD );
             for (uint32_t k = 0; k < MSG_LENGTH; k++)
                 msg[k]++;
         }

         MPI_Finalize();
         return 0;
    }


I run this with mpirun -n 2 ./pingpong_memleak.exe

The program does nothing more than send a message from rank 0 to rank 1,
then from rank 1 back to rank 0, and so on, using the standard blocking
calls; nothing in it is asynchronous.

Running the program allocates roughly 30 MB/s (according to Windows Task
Manager) until it stops at around 1,313,180 KB. At that point the
transfer rates (not measured in the snippet above) drop significantly,
to maybe one second per send instead of roughly 1 µs.
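
Since the snippet above does not measure the transfer rates itself, here
is a minimal sketch of how the round-trip time could be tracked. The
RoundTripWindow helper and its member names are hypothetical and not
part of the original example; it only averages samples over fixed-size
windows. The samples are assumed to come from MPI_Wtime() differences
taken around the Send/Recv pair, as shown in the trailing comment.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the original example): averages
// round-trip times over fixed-size windows so that a sudden slowdown
// shows up as a jump in the per-window mean.
struct RoundTripWindow {
    std::size_t window;       // number of samples per window (at least 1)
    std::size_t count = 0;    // samples collected in the current window
    double sum = 0.0;         // seconds accumulated in the current window
    double last_mean = 0.0;   // mean of the most recently completed window

    explicit RoundTripWindow(std::size_t n) : window(n ? n : 1) {}

    // Add one sample in seconds. Returns true when this sample completed
    // a window; last_mean then holds the mean of that window.
    bool add(double seconds) {
        sum += seconds;
        if (++count < window) return false;
        last_mean = sum / static_cast<double>(window);
        sum = 0.0;
        count = 0;
        return true;
    }
};

// Assumed use on rank 0, inside the loop of the example above:
//
//     RoundTripWindow rtt(100000);   // created once, before the loop
//     ...
//     double t0 = MPI_Wtime();       // wall-clock time in seconds
//     MPI_Send(...);
//     MPI_Recv(...);
//     if (rtt.add(MPI_Wtime() - t0))
//         std::printf("%g s per round trip\n", rtt.last_mean);  // <cstdio>
```

Printing only once per window keeps the output itself from distorting
the timings; the window size of 100000 is an arbitrary choice.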

I use Cygwin on Windows 7 with 16 GB of RAM. I haven't tested this
minimal working example on other setups.

Can someone test on other platforms and confirm whether this is a
Cygwin-specific issue?

Regards
Marco
