Gus,

It is a single machine and I have installed Ubuntu 12.04 LTS. I left my computer at the college, but I will try to follow your advice when I can and let you know how it goes.

Thanks

Sent from my iPad

> On 16/04/2014, at 14:17, "Gus Correa" <g...@ldeo.columbia.edu> wrote:
>
> Hi Oscar
>
> This is a long shot, but maybe worth trying.
> I am assuming you're using Linux, or some form of Unix, right?
>
> You may try to increase the stack size.
> The default in Linux is often too small for large programs,
> and sometimes this causes a segmentation fault even if the
> program is correct.
>
> You can check what you have with:
>
> ulimit -a (bash)
>
> or
>
> limit (csh or tcsh)
>
> Then set it to a larger number, or perhaps to unlimited, e.g.:
>
> ulimit -s unlimited
>
> or
>
> limit stacksize unlimited
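> For instance (a minimal sketch, not taken from your program): many
> Fortran compilers place automatic arrays on the stack, so a correct
> subroutine with a large automatic array can segfault under the default
> 8 MB stack, yet run fine after 'ulimit -s unlimited':
>
> program stack_demo
>   implicit none
>   call work(10000000)   ! illustrative size: ~80 MB automatic array
> contains
>   subroutine work(n)
>     integer, intent(in) :: n
>     ! Automatic array: typically allocated on the stack, so with the
>     ! default 8 MB stack limit this otherwise correct code can crash.
>     double precision :: big(n)
>     big = 0.0d0
>     print *, 'sum =', sum(big)
>   end subroutine work
> end program stack_demo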
> You didn't say anything about the computer(s) you are using.
> Is this a single machine, a cluster, something else?
>
> Anyway, resetting the stack size may depend a bit on what you
> have in /etc/security/limits.conf,
> and on whether it allows you to increase the stack size.
> If it is a single computer to which you have root access, you may
> do it yourself.
> There are other limits worth increasing (number of open files,
> max locked memory).
> For instance, this could go in limits.conf:
>
> * - memlock -1
> * - stack -1
> * - nofile 4096
>
> See 'man limits.conf' for details.
>
> If it is a cluster, this should be set on all nodes,
> and you may need to ask your system administrator to do it.
>
> I hope this helps,
> Gus Correa
>
>> On 04/16/2014 11:24 AM, Gus Correa wrote:
>>> On 04/16/2014 08:30 AM, Oscar Mojica wrote:
>>> What would the command line be to compile with the option -g? What
>>> debugger can I use?
>>> Thanks
>>
>> Replace any optimization flags (-O2, or similar) with -g.
>> Check whether your compiler has the -traceback flag or similar
>> (man compiler-name).
>>
>> The gdb debugger is normally available on Linux (or you can install
>> it with yum, apt-get, etc.). An alternative is ddd, which has a GUI
>> (and can also be installed from yum, etc.).
>> If you use a commercial compiler, you may have a debugger with a GUI.
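>> For example, a typical session (assuming the crash leaves a core file
>> named 'core' in the working directory; you may first need to enable
>> core dumps with 'ulimit -c unlimited', and on Ubuntu the core file
>> name or location can differ):
>>
>> mpif90 -g -o exe mpivfsa_version2.f
>> ulimit -c unlimited
>> mpirun -np 4 ./exe
>> gdb ./exe core
>> (gdb) bt
>>
>> The 'bt' (backtrace) command should point to the source file and
>> line where the segmentation fault occurred.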
>>> Sent from my iPad
>>>
>>>> On 15/04/2014, at 18:20, "Gus Correa" <g...@ldeo.columbia.edu>
>>>> wrote:
>>>>
>>>> Or just compiling with -g or -traceback (depending on the compiler)
>>>> will give you more information about the point of failure
>>>> in the error message.
>>>>
>>>>> On 04/15/2014 04:25 PM, Ralph Castain wrote:
>>>>> Have you tried using a debugger to look at the resulting core
>>>>> file? It will probably point you right at the problem. Most likely
>>>>> a case of overrunning some array when #temps > 5
>>>>>
>>>>> On Tue, Apr 15, 2014 at 10:46 AM, Oscar Mojica
>>>>> <o_moji...@hotmail.com> wrote:
>>>>>
>>>>> Hello everybody
>>>>>
>>>>> I implemented a parallel simulated annealing algorithm in Fortran.
>>>>> The algorithm is described as follows:
>>>>>
>>>>> 1. The MPI program initially generates P processes that have ranks
>>>>> 0, 1, ..., P-1.
>>>>> 2. The MPI program generates a starting point, sends it to all
>>>>> processes, and sets T = T0.
>>>>> 3. At the current temperature T, each process executes its
>>>>> iterative operations.
>>>>> 4. At the end of the iterations, the process with rank 0 collects
>>>>> the solution obtained by each process at the current temperature.
>>>>> 5. The best of these solutions is broadcast among all
>>>>> participating processes.
>>>>> 6. Each process cools the temperature and goes back to step 3,
>>>>> until the maximum number of temperatures is reached.
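>>>>> In outline (a minimal sketch of this structure, with illustrative
>>>>> names and a toy objective; it is not the code from the attachment,
>>>>> and steps 4-5 are condensed into an MPI_MINLOC reduction, which is
>>>>> equivalent to gathering costs at rank 0 and re-broadcasting):
>>>>>
>>>>> program psa_skeleton
>>>>>   use mpi
>>>>>   implicit none
>>>>>   integer, parameter :: nvar = 10
>>>>>   integer :: ierr, rank, itemp, ntemps, winner
>>>>>   double precision :: T, cost, pin(2), pout(2)
>>>>>   double precision :: x(nvar)
>>>>>
>>>>>   call MPI_Init(ierr)                              ! step 1
>>>>>   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
>>>>>   if (rank == 0) call random_number(x)             ! starting point
>>>>>   call MPI_Bcast(x, nvar, MPI_DOUBLE_PRECISION, & ! step 2
>>>>>                  0, MPI_COMM_WORLD, ierr)
>>>>>   ntemps = 15
>>>>>   T = 100.0d0                                      ! T = T0
>>>>>
>>>>>   do itemp = 1, ntemps
>>>>>     call anneal_at_T(x, cost, T)                   ! step 3
>>>>>     ! steps 4-5: find the rank holding the lowest cost and
>>>>>     ! broadcast its solution to every process
>>>>>     pin(1) = cost
>>>>>     pin(2) = dble(rank)
>>>>>     call MPI_Allreduce(pin, pout, 1, MPI_2DOUBLE_PRECISION, &
>>>>>                        MPI_MINLOC, MPI_COMM_WORLD, ierr)
>>>>>     winner = int(pout(2))
>>>>>     call MPI_Bcast(x, nvar, MPI_DOUBLE_PRECISION, winner, &
>>>>>                    MPI_COMM_WORLD, ierr)
>>>>>     T = 0.9d0 * T                                  ! step 6: cool
>>>>>   end do
>>>>>
>>>>>   call MPI_Finalize(ierr)
>>>>> contains
>>>>>   subroutine anneal_at_T(x, cost, T)
>>>>>     double precision, intent(inout) :: x(:)
>>>>>     double precision, intent(out)   :: cost
>>>>>     double precision, intent(in)    :: T
>>>>>     double precision :: dx(size(x))
>>>>>     ! stand-in for the real iterative operations at temperature T
>>>>>     ! (per-rank random seeding omitted for brevity)
>>>>>     call random_number(dx)
>>>>>     x = x + 1.0d-3 * T * (dx - 0.5d0)
>>>>>     cost = sum(x * x)               ! illustrative objective
>>>>>   end subroutine anneal_at_T
>>>>> end program psa_skeleton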
>>>>> I compiled with: mpif90 -o exe mpivfsa_version2.f
>>>>> and ran with: mpirun -np 4 ./exe on a single machine.
>>>>>
>>>>> So I have 4 processes, 1 iteration per temperature, and, for
>>>>> example, 15 temperatures. When I run the program with just 5
>>>>> temperatures it works well, but when the number of temperatures is
>>>>> higher than 5 it doesn't write the output files and I get the
>>>>> following error message:
>>>>>
>>>>> [oscar-Vostro-3550:06740] *** Process received signal ***
>>>>> [oscar-Vostro-3550:06741] *** Process received signal ***
>>>>> [oscar-Vostro-3550:06741] Signal: Segmentation fault (11)
>>>>> [oscar-Vostro-3550:06741] Signal code: Address not mapped (1)
>>>>> [oscar-Vostro-3550:06741] Failing at address: 0xad6af
>>>>> [oscar-Vostro-3550:06742] *** Process received signal ***
>>>>> [oscar-Vostro-3550:06740] Signal: Segmentation fault (11)
>>>>> [oscar-Vostro-3550:06740] Signal code: Address not mapped (1)
>>>>> [oscar-Vostro-3550:06740] Failing at address: 0xad6af
>>>>> [oscar-Vostro-3550:06742] Signal: Segmentation fault (11)
>>>>> [oscar-Vostro-3550:06742] Signal code: Address not mapped (1)
>>>>> [oscar-Vostro-3550:06742] Failing at address: 0xad6af
>>>>> [oscar-Vostro-3550:06740] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f49ee2224a0]
>>>>> [oscar-Vostro-3550:06740] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f49ee26f54c]
>>>>> [oscar-Vostro-3550:06740] [ 2] ./exe() [0x406742]
>>>>> [oscar-Vostro-3550:06740] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>> [oscar-Vostro-3550:06740] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f49ee20d76d]
>>>>> [oscar-Vostro-3550:06742] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6877fdc4a0]
>>>>> [oscar-Vostro-3550:06742] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f687802954c]
>>>>> [oscar-Vostro-3550:06742] [ 2] ./exe() [0x406742]
>>>>> [oscar-Vostro-3550:06742] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>> [oscar-Vostro-3550:06742] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6877fc776d]
>>>>> [oscar-Vostro-3550:06742] [ 5] ./exe() [0x401399]
>>>>> [oscar-Vostro-3550:06742] *** End of error message ***
>>>>> [oscar-Vostro-3550:06740] [ 5] ./exe() [0x401399]
>>>>> [oscar-Vostro-3550:06740] *** End of error message ***
>>>>> [oscar-Vostro-3550:06741] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fa6c4c6e4a0]
>>>>> [oscar-Vostro-3550:06741] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7fa6c4cbb54c]
>>>>> [oscar-Vostro-3550:06741] [ 2] ./exe() [0x406742]
>>>>> [oscar-Vostro-3550:06741] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>> [oscar-Vostro-3550:06741] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fa6c4c5976d]
>>>>> [oscar-Vostro-3550:06741] [ 5] ./exe() [0x401399]
>>>>> [oscar-Vostro-3550:06741] *** End of error message ***
>>>>> --------------------------------------------------------------------------
>>>>> mpirun noticed that process rank 0 with PID 6917 on node
>>>>> oscar-Vostro-3550 exited on signal 11 (Segmentation fault).
>>>>> --------------------------------------------------------------------------
>>>>> 2 total processes killed (some possibly by mpirun during cleanup)
>>>>>
>>>>> If there really were a segmentation fault, it should not work in
>>>>> any case. I checked the program and didn't find the error. Why
>>>>> does the program work with five temperatures?
>>>>> Could someone help me find the error and answer my question,
>>>>> please?
>>>>>
>>>>> The program and the necessary files to run it are attached.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Oscar Fabian Mojica Ladino
>>>>> Geologist, M.S. in Geophysics
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users