Sounds like you're freeing memory that does not belong to you, or you have some kind of memory corruption somewhere.
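To illustrate that failure mode, here is a minimal Fortran sketch (my own example, not Oscar's code; the array name is made up). An out-of-bounds write can corrupt the allocator's bookkeeping, and the crash then surfaces later, inside free(), when the array is deallocated:

  program heap_corruption_sketch
    implicit none
    integer, allocatable :: work(:)
    integer :: i

    allocate(work(5))

    ! Out-of-bounds write: the loop runs one element past the end of work.
    ! Small overruns often go unnoticed at first and only crash later,
    ! inside free(), when the corrupted block is deallocated.
    do i = 1, 6
       work(i) = i
    end do

    deallocate(work)   ! the SIGSEGV/abort can happen here, far from the real bug
  end program heap_corruption_sketch

Recompiling with run-time bounds checking (for example gfortran's -fcheck=bounds together with -g and -fbacktrace) usually stops at the offending store instead of at the deallocate.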
On Apr 17, 2014, at 2:01 PM, Oscar Mojica <o_moji...@hotmail.com> wrote:

> Hello guys
>
> I used the command
>
> ulimit -s unlimited
>
> and got
>
> stack size (kbytes, -s) unlimited
>
> but when I ran the program I got the same error. So I used the gdb debugger; I compiled using
>
> mpif90 -g -o mpivfsa_versao2.f exe
>
> I ran the program and then I ran gdb with both the executable and the core file name as arguments, and got the following:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002aaaab59b54c in free () from /lib/x86_64-linux-gnu/libc.so.6
> (gdb) backtrace
> #0  0x00002aaaab59b54c in free () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x0000000000406801 in inv_grav3d_vfsa () at mpivfsa_versao2.f:131
> #2  0x0000000000406b88 in main (argc=1, argv=0x7fffffffe387) at mpivfsa_versao2.f:9
> #3  0x00002aaaab53976d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x0000000000401399 in _start ()
>
> These are the lines:
>
> 9    use mpi
> 131  deallocate(zv,xrec,yrec,xprm,yprm)
>
> I think the problem is not memory; the problem is related to MPI.
>
> What could the error be?
>
> Oscar Fabian Mojica Ladino
> Geologist M.S. in Geophysics
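A hedged reading of that backtrace: the crash happens inside glibc's free(), called from the deallocate at line 131, which is the classic signature of heap corruption earlier in the run rather than an MPI problem. You can make the deallocation itself more defensive, as in the sketch below (a standalone example of mine; the array names are only borrowed from the backtrace), but note that stat= only catches deallocating something that is not currently allocated; it cannot repair a heap that an earlier out-of-bounds write has already corrupted.

  program dealloc_sketch
    implicit none
    real, allocatable :: zv(:), xrec(:), yrec(:), xprm(:), yprm(:)
    integer :: ierr

    allocate(zv(10), xrec(10), yrec(10), xprm(10), yprm(10))
    ierr = 0

    ! allocated() guards against deallocating an unallocated array;
    ! stat= turns a deallocation error into a status code instead of an abort.
    if (allocated(zv))   deallocate(zv,   stat=ierr)
    if (allocated(xrec)) deallocate(xrec, stat=ierr)
    if (allocated(yrec)) deallocate(yrec, stat=ierr)
    if (allocated(xprm)) deallocate(xprm, stat=ierr)
    if (allocated(yprm)) deallocate(yprm, stat=ierr)
    if (ierr /= 0) print *, 'last deallocate returned stat =', ierr
  end program dealloc_sketch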
>
> > From: o_moji...@hotmail.com
> > Date: Wed, 16 Apr 2014 15:17:51 -0300
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Where is the error? (MPI program in fortran)
> >
> > Gus
> > It is a single machine and I have installed Ubuntu 12.04 LTS. I left my computer at the college, but I will try to follow your advice when I can and tell you about it.
> >
> > Thanks
> >
> > Sent from my iPad
> >
> > > On 16/04/2014, at 14:17, "Gus Correa" <g...@ldeo.columbia.edu> wrote:
> > >
> > > Hi Oscar
> > >
> > > This is a long shot, but maybe worth trying.
> > > I am assuming you're using Linux, or some form of Unix, right?
> > >
> > > You may try to increase the stack size.
> > > The default in Linux is often too small for large programs.
> > > Sometimes this may cause a segmentation fault, even if the program is correct.
> > >
> > > You can check what you have with:
> > >
> > > ulimit -a   (bash)
> > >
> > > or
> > >
> > > limit   (csh or tcsh)
> > >
> > > Then set it to a larger number, or perhaps to unlimited, e.g.:
> > >
> > > ulimit -s unlimited
> > >
> > > or
> > >
> > > limit stacksize unlimited
> > >
> > > You didn't say anything about the computer(s) you are using.
> > > Is this a single machine, a cluster, something else?
> > >
> > > Anyway, resetting the stack size may depend a bit on what you have in /etc/security/limits.conf, and whether it allows you to increase the stack size.
> > > If it is a single computer that you have root access to, you may do it yourself.
> > > There are other limits worth increasing (number of open files, max locked memory).
> > > For instance, this could go in limits.conf:
> > >
> > > *  -  memlock  -1
> > > *  -  stack    -1
> > > *  -  nofile   4096
> > >
> > > See 'man limits.conf' for details.
> > >
> > > If it is a cluster, this should be set on all nodes, and you may need to ask your system administrator to do it.
> > >
> > > I hope this helps,
> > > Gus Correa
> > >
> > >> On 04/16/2014 11:24 AM, Gus Correa wrote:
> > >>> On 04/16/2014 08:30 AM, Oscar Mojica wrote:
> > >>> What would the command line be to compile with the -g option? What debugger can I use?
> > >>> Thanks
> > >>>
> > >> Replace any optimization flags (-O2, or similar) by -g.
> > >> Check if your compiler has the -traceback flag or similar (man compiler-name).
> > >>
> > >> The gdb debugger is normally available on Linux (or you can install it with yum, apt-get, etc.). An alternative is ddd, with a GUI (it can also be installed from yum, etc.).
> > >> If you use a commercial compiler you may have a debugger with a GUI.
> > >>
> > >>> Sent from my iPad
> > >>>
> > >>>> On 15/04/2014, at 18:20, "Gus Correa" <g...@ldeo.columbia.edu> wrote:
> > >>>>
> > >>>> Or just compiling with -g or -traceback (depending on the compiler) will give you more information about the point of failure in the error message.
> > >>>>
> > >>>>> On 04/15/2014 04:25 PM, Ralph Castain wrote:
> > >>>>> Have you tried using a debugger to look at the resulting core file? It will probably point you right at the problem. Most likely a case of overrunning some array when #temps > 5.
> > >>>>>
> > >>>>> On Tue, Apr 15, 2014 at 10:46 AM, Oscar Mojica <o_moji...@hotmail.com> wrote:
> > >>>>>
> > >>>>> Hello everybody
> > >>>>>
> > >>>>> I implemented a parallel simulated annealing algorithm in Fortran. The algorithm is described as follows:
> > >>>>>
> > >>>>> 1. The MPI program initially generates P processes that have ranks 0, 1, ..., P-1.
> > >>>>> 2. The MPI program generates a starting point and sends it to all processes; set T = T0.
> > >>>>> 3. At the current temperature T, each process begins to execute its iterative operations.
> > >>>>> 4. At the end of the iterations, the process with rank 0 is responsible for collecting the solution obtained by each process at the current temperature and broadcasting the best of them among all participating processes.
> > >>>>> 5. Each process cools the temperature and goes back to step 3, until the maximum number of temperatures is reached.
> > >>>>>
> > >>>>> I compiled with:  mpif90 -o exe mpivfsa_version2.f
> > >>>>> and ran with:     mpirun -np 4 ./exe   on a single machine
> > >>>>>
> > >>>>> So I have 4 processes, 1 iteration per temperature and, for example, 15 temperatures.
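As a side note on step 4 of that description: a compact way to express "rank 0 collects the results and everyone ends up with the best solution" is an MPI_Allreduce with MPI_MINLOC followed by an MPI_Bcast from the winning rank. The sketch below is my own minimal example (it is not the attached program, and all names are made up); it can be built with mpif90 and run with mpirun -np 4 like the program above.

  program sa_exchange_sketch
    use mpi
    implicit none
    integer, parameter :: n = 8                 ! size of the model vector (illustrative)
    double precision :: model(n), my_cost
    double precision :: inbuf(2), outbuf(2)
    integer :: ierr, myrank, best_rank

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

    ! Pretend each rank finished its iterations at this temperature
    ! with some cost and some candidate model.
    my_cost = 1.0d0 + myrank
    model   = dble(myrank)

    ! MPI_MINLOC on a (value, rank) pair tells every rank who holds the minimum cost.
    inbuf(1) = my_cost
    inbuf(2) = dble(myrank)
    call MPI_Allreduce(inbuf, outbuf, 1, MPI_2DOUBLE_PRECISION, MPI_MINLOC, &
                       MPI_COMM_WORLD, ierr)
    best_rank = int(outbuf(2))

    ! Everybody receives the best model before cooling and continuing.
    call MPI_Bcast(model, n, MPI_DOUBLE_PRECISION, best_rank, MPI_COMM_WORLD, ierr)

    if (myrank == 0) print *, 'best cost', outbuf(1), 'held by rank', best_rank
    call MPI_Finalize(ierr)
  end program sa_exchange_sketch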
> > >>>>> When I run the program with just 5 temperatures it works well, but when the number of temperatures is higher than 5 it doesn't write the output files and I get the following error message:
> > >>>>>
> > >>>>> [oscar-Vostro-3550:06740] *** Process received signal ***
> > >>>>> [oscar-Vostro-3550:06741] *** Process received signal ***
> > >>>>> [oscar-Vostro-3550:06741] Signal: Segmentation fault (11)
> > >>>>> [oscar-Vostro-3550:06741] Signal code: Address not mapped (1)
> > >>>>> [oscar-Vostro-3550:06741] Failing at address: 0xad6af
> > >>>>> [oscar-Vostro-3550:06742] *** Process received signal ***
> > >>>>> [oscar-Vostro-3550:06740] Signal: Segmentation fault (11)
> > >>>>> [oscar-Vostro-3550:06740] Signal code: Address not mapped (1)
> > >>>>> [oscar-Vostro-3550:06740] Failing at address: 0xad6af
> > >>>>> [oscar-Vostro-3550:06742] Signal: Segmentation fault (11)
> > >>>>> [oscar-Vostro-3550:06742] Signal code: Address not mapped (1)
> > >>>>> [oscar-Vostro-3550:06742] Failing at address: 0xad6af
> > >>>>> [oscar-Vostro-3550:06740] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f49ee2224a0]
> > >>>>> [oscar-Vostro-3550:06740] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f49ee26f54c]
> > >>>>> [oscar-Vostro-3550:06740] [ 2] ./exe() [0x406742]
> > >>>>> [oscar-Vostro-3550:06740] [ 3] ./exe(main+0x34) [0x406ac9]
> > >>>>> [oscar-Vostro-3550:06740] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f49ee20d76d]
> > >>>>> [oscar-Vostro-3550:06742] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6877fdc4a0]
> > >>>>> [oscar-Vostro-3550:06742] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f687802954c]
> > >>>>> [oscar-Vostro-3550:06742] [ 2] ./exe() [0x406742]
> > >>>>> [oscar-Vostro-3550:06742] [ 3] ./exe(main+0x34) [0x406ac9]
> > >>>>> [oscar-Vostro-3550:06742] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6877fc776d]
> > >>>>> [oscar-Vostro-3550:06742] [ 5] ./exe() [0x401399]
> > >>>>> [oscar-Vostro-3550:06742] *** End of error message ***
> > >>>>> [oscar-Vostro-3550:06740] [ 5] ./exe() [0x401399]
> > >>>>> [oscar-Vostro-3550:06740] *** End of error message ***
> > >>>>> [oscar-Vostro-3550:06741] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fa6c4c6e4a0]
> > >>>>> [oscar-Vostro-3550:06741] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7fa6c4cbb54c]
> > >>>>> [oscar-Vostro-3550:06741] [ 2] ./exe() [0x406742]
> > >>>>> [oscar-Vostro-3550:06741] [ 3] ./exe(main+0x34) [0x406ac9]
> > >>>>> [oscar-Vostro-3550:06741] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fa6c4c5976d]
> > >>>>> [oscar-Vostro-3550:06741] [ 5] ./exe() [0x401399]
> > >>>>> [oscar-Vostro-3550:06741] *** End of error message ***
> > >>>>>
> > >>>>> --------------------------------------------------------------------------
> > >>>>> mpirun noticed that process rank 0 with PID 6917 on node oscar-Vostro-3550 exited on signal 11 (Segmentation fault).
> > >>>>> --------------------------------------------------------------------------
> > >>>>> 2 total processes killed (some possibly by mpirun during cleanup)
> > >>>>>
> > >>>>> If there is a segmentation fault, it should not work in any case. I checked the program and didn't find the error. Why does the program work with five temperatures?
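One hedged guess that fits "works up to 5 temperatures, fails above 5" (and matches Ralph's suggestion earlier in the thread of overrunning some array when #temps > 5) is an array dimensioned for 5 temperatures but indexed by the temperature counter. The sketch below uses made-up names and is only meant to show the pattern to look for in the attached program:

  program fixed_size_overrun_sketch
    implicit none
    integer, parameter :: ntemp = 15      ! requested number of temperatures
    double precision :: best_per_temp(5)  ! storage sized for only 5 temperatures
    integer :: it

    do it = 1, ntemp
       ! For it = 6..15 this writes past the end of best_per_temp, silently
       ! corrupting neighbouring memory (here the stack; for an allocatable
       ! array it would be the heap, and free()/deallocate is where it blows up).
       best_per_temp(it) = dble(it)
    end do

    print *, best_per_temp(1:5)
  end program fixed_size_overrun_sketch

A first thing to try would be searching the attached code for arrays whose bounds come from a literal 5 (or from the number of temperatures of an earlier test case) rather than from the requested number of temperatures.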
> > >>>>> Could someone help me to find the error and answer my question, please?
> > >>>>>
> > >>>>> The program and the necessary files to run it are attached.
> > >>>>>
> > >>>>> Thanks
> > >>>>>
> > >>>>> Oscar Fabian Mojica Ladino
> > >>>>> Geologist M.S. in Geophysics

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/