I have tested v1.4.2 and it does indeed fix the problem. Thanks, Nysal. Thank you also, Terry, for your help. With the fix I no longer need to use a huge value of btl_tcp_eager_limit (I keep the default value), which considerably reduces the memory consumption I had before. Everything works fine now.
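For reference, here is a minimal sketch (not Olivier's actual program) of what "pre-posting" a receive looks like: the receiver calls MPI_Irecv before the matching send arrives, so a large message takes the rendezvous path instead of piling up as unexpected eager data. Buffer size, ranks and the tag are purely illustrative.

/* Minimal sketch, needs at least 2 ranks. The receiver pre-posts the
 * receive with MPI_Irecv so the sender's large message is delivered
 * straight into the user buffer instead of being held as unexpected
 * eager data. Sizes, ranks and the tag are made up. */
#include <mpi.h>
#include <stdlib.h>

#define BUF_BYTES (30 * 1024 * 1024)  /* ~30 MB, the largest size mentioned in the thread */
#define TAG 42                        /* arbitrary tag */

int main(int argc, char **argv)
{
    int rank;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(BUF_BYTES);

    if (rank == 0) {
        /* Receiver: post the receive first, then wait for completion. */
        MPI_Request req;
        MPI_Irecv(buf, BUF_BYTES, MPI_BYTE, 1, TAG, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        /* Sender: with the default eager limit, a message this large
         * waits for the receiver before the bulk of the data moves. */
        MPI_Send(buf, BUF_BYTES, MPI_BYTE, 0, TAG, MPI_COMM_WORLD);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}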
Regards,
Olivier

2010/5/20 Olivier Riff <olir...@googlemail.com>

> 2010/5/20 Nysal Jan <jny...@gmail.com>
>
>> This probably got fixed in https://svn.open-mpi.org/trac/ompi/ticket/2386
>> Can you try 1.4.2, the fix should be in there.
>>
> I will test it soon (it takes some time to install the new version on each
> node). It would be perfect if it fixes the problem.
> I will tell you the result as soon as possible.
>
> Thanks.
>
> Olivier
>
>> Regards
>> --Nysal
>>
>> On Thu, May 20, 2010 at 2:02 PM, Olivier Riff <olir...@googlemail.com> wrote:
>>
>>> Hello,
>>>
>>> I assume this question has already been discussed many times, but I cannot
>>> find a solution to my problem on the Internet.
>>> It concerns the buffer size limit of MPI_Send and MPI_Recv on a heterogeneous
>>> system (32-bit laptop / 64-bit cluster).
>>> My configuration is:
>>> Open MPI 1.4, configured with: --without-openib --enable-heterogeneous
>>> --enable-mpi-threads
>>> The program is launched from a laptop (32-bit Mandriva 2008) which distributes
>>> tasks to a cluster of 70 processors (64-bit Red Hat Enterprise distribution).
>>> I have to send buffers of various sizes, from a few bytes up to 30 MB.
>>>
>>> I tested the following commands:
>>> 1) mpirun -v -machinefile machinefile.txt MyMPIProgram
>>> -> crashes on the client side (64-bit Red Hat Enterprise) when the sent buffer
>>> size is > 65536.
>>> 2) mpirun --mca btl_tcp_eager_limit 30000000 -v -machinefile machinefile.txt MyMPIProgram
>>> -> works, but generates gigantic memory consumption on the 32-bit machine
>>> side after MPI_Recv. Memory consumption goes from 800 MB to 2.1 GB after
>>> receiving about 20 KB from each of the 70 clients (a total of about 1.4 MB).
>>> This makes my program crash later because I have no more memory left to
>>> allocate new structures. I read in an Open MPI forum thread that setting
>>> btl_tcp_eager_limit to a huge value explains this huge memory consumption
>>> when a sent message does not have a pre-posted receive. Also, after all
>>> messages have been received and there is no more traffic activity, the
>>> memory consumed remains at 2.1 GB... and I do not understand why.
>>>
>>> What is the best way to get a working program that also has a small
>>> memory footprint (speed can be lower)?
>>> I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
>>> but without success.
>>>
>>> Thanks in advance for your help.
>>>
>>> Best regards,
>>>
>>> Olivier
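A closing note for anyone who lands on this thread with the original problem: if raising btl_tcp_eager_limit is not an option, one way (a sketch, not something tested in this thread) to avoid unexpected-message buffering on the receiver is to use MPI_Ssend on the sender side, which completes only once the matching receive has been posted. The helper name, destination rank and tag below are illustrative.

/* Sketch only: force a handshake before any data is transferred, so
 * nothing is buffered on the receiver for messages that do not yet
 * have a posted receive. 'send_block', dest and the tag are
 * placeholders, not from the original program. */
#include <mpi.h>

void send_block(const char *data, int nbytes, int dest, MPI_Comm comm)
{
    /* MPI_Ssend completes only after the matching receive has started. */
    MPI_Ssend((void *)data, nbytes, MPI_BYTE, dest, 0 /* tag */, comm);
}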