Gus,
It is a single machine, and I have installed Ubuntu 12.04 LTS on it. I left my
computer at the college, but I will try your advice as soon as I can and
let you know how it goes.

Thanks 

Sent from my iPad

> On 16/04/2014, at 14:17, "Gus Correa" <g...@ldeo.columbia.edu> wrote:
> 
> Hi Oscar
> 
> This is a long shot, but maybe worth trying.
> I am assuming you're using Linux, or some form of Unix, right?
> 
> You may try to increase the stack size.
> The default in Linux is often too small for large programs.
> Sometimes this may cause a segmentation fault, even if the
> program is correct.
> 
> You can check what you have with:
> 
> ulimit -a        (bash)
> 
> or
> 
> limit             (csh or tcsh)
> 
> Then set it to a larger number or perhaps to unlimited,
> e.g.:
> 
> ulimit -s unlimited
> 
> or
> 
> limit stacksize unlimited
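
A minimal bash sketch of the two steps above (check the limit, then raise it). The function names show_stack_limit and raise_stack_limit are made up for illustration, and falling back to the hard limit when "unlimited" is refused is an assumption, not something stated in the thread:

```shell
# Sketch: report the stack limit and try to raise it for this shell session.

# Print the current soft stack limit (in kB, or "unlimited").
show_stack_limit() {
    ulimit -S -s
}

# Try to set the soft stack limit to unlimited; if that is not permitted,
# fall back to the hard limit, which is the most an ordinary user can get.
raise_stack_limit() {
    ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"
}

echo "stack limit before: $(show_stack_limit)"
raise_stack_limit
echo "stack limit after:  $(show_stack_limit)"

# The new limit applies to this shell and the processes it starts, so the
# MPI job has to be launched from the same shell, e.g.:
#   mpirun -np 4 ./exe
```

The change is lost when the shell exits; a permanent setting would go through limits.conf.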
> 
> You didn't say anything about the computer(s) you are using.
> Is this a single machine, a cluster, something else?
> 
> Anyway, resetting the stack size may depend a bit on what you
> have in /etc/security/limits.conf,
> and whether it allows you to increase the stack size.
> If it is a single computer on which you have root access, you may
> do it yourself.
> There are other limits worth increasing (number of open files,
> max locked memory).
> For instance, this could go in limits.conf:
> 
> *   -   memlock     -1
> *   -   stack       -1
> *   -   nofile      4096
> 
> See 'man limits.conf' for details.
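
Entries in limits.conf take effect at the next login, so the effective values can be checked from a fresh shell. A small sketch; report_limits is a hypothetical helper name, and the units are the ones the shell's ulimit builtin reports:

```shell
# Sketch: print the soft limits that the limits.conf entries above control.
report_limits() {
    echo "memlock (kB): $(ulimit -S -l)"   # max locked memory
    echo "stack (kB):   $(ulimit -S -s)"   # stack size
    echo "nofile:       $(ulimit -S -n)"   # number of open files
}

report_limits
```

With the three example entries in place, the first two lines would be expected to read "unlimited" and the third 4096.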
> 
> If it is a cluster, this should be set on all nodes,
> and you may need to ask your system administrator to do it.
> 
> I hope this helps,
> Gus Correa
> 
>> On 04/16/2014 11:24 AM, Gus Correa wrote:
>>> On 04/16/2014 08:30 AM, Oscar Mojica wrote:
>>> What would the command line be to compile with the -g option? Which
>>> debugger can I use?
>>> Thanks
>>> 
>> 
>> Replace any optimization flags (-O2, or similar) with -g.
>> Check if your compiler has the -traceback flag or similar
>> (man compiler-name).
>> 
>> The gdb debugger is normally available on Linux (or you can install it
>> with yum, apt-get, etc).  An alternative is ddd, with a GUI (can also be
>> installed from yum, etc).
>> If you use a commercial compiler you may have a debugger with a GUI.
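
A rough sketch of how the debug build and a gdb session could look for the program in this thread. It assumes mpif90 wraps gfortran (-fbacktrace and -fcheck=bounds are gfortran flags; other compilers use different names, such as -traceback), and debug_build_cmd is a hypothetical helper that only prints the command:

```shell
# Sketch: rebuild with debug information and inspect a core file with gdb.
# Assumption: mpif90 wraps gfortran.

SRC=${SRC:-mpivfsa_version2.f}   # source file named in the original post
EXE=${EXE:-exe}                  # executable name used in the original post

# Print the debug build command so it can be reviewed before it is run.
debug_build_cmd() {
    echo "mpif90 -g -O0 -fbacktrace -fcheck=bounds -o $EXE $SRC"
}

# Allow processes started from this shell to write core files.
ulimit -c unlimited 2>/dev/null || echo "could not raise the core file size limit"

echo "build:   $(debug_build_cmd)"
echo "run:     mpirun -np 4 ./$EXE"
echo "inspect: gdb ./$EXE core   # then type 'bt' at the (gdb) prompt"
```

The core file name and location vary between systems, so "core" here is a placeholder. With -fcheck=bounds, an out-of-range array index should be reported at run time with the source line where it happens.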
>> 
>>> Sent from my iPad
>>> 
>>>> On 15/04/2014, at 18:20, "Gus Correa" <g...@ldeo.columbia.edu>
>>>> wrote:
>>>> 
>>>> Or just compiling with -g or -traceback (depending on the compiler) will
>>>> give you more information about the point of failure
>>>> in the error message.
>>>> 
>>>>> On 04/15/2014 04:25 PM, Ralph Castain wrote:
>>>>> Have you tried using a debugger to look at the resulting core file? It
>>>>> will probably point you right at the problem. Most likely a case of
>>>>> overrunning some array when #temps > 5
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Apr 15, 2014 at 10:46 AM, Oscar Mojica <o_moji...@hotmail.com>
>>>>> wrote:
>>>>> 
>>>>>    Hello everybody
>>>>> 
>>>>>    I implemented a parallel simulated annealing algorithm in Fortran.
>>>>>    The algorithm is described as follows:
>>>>> 
>>>>>    1. The MPI program initially generates P processes that have rank
>>>>>    0,1,...,P-1.
>>>>>    2. The MPI program generates a starting point, sends it to all
>>>>>    processes, and sets T=T0.
>>>>>    3. At the current temperature T, each process begins to execute
>>>>>    iterative operations.
>>>>>    4. At the end of the iterations, the process with rank 0 is responsible
>>>>>    for collecting the solution obtained by each process at the current
>>>>>    temperature and broadcasting the best of these solutions to all
>>>>>    participating processes.
>>>>>    5. Each process cools the temperature and goes back to step 3, until
>>>>>    the maximum number of temperatures is reached.
>>>>> 
>>>>>    I compiled with: mpif90 -o exe mpivfsa_version2.f
>>>>>    and ran with: mpirun -np 4 ./exe on a single machine
>>>>> 
>>>>>    So I have 4 processes, 1 iteration per temperature and, for example,
>>>>>    15 temperatures. When I run the program
>>>>>    with just 5 temperatures it works well, but when the number of
>>>>>    temperatures is higher than 5 it doesn't write the
>>>>>    output files and I get the following error message:
>>>>> 
>>>>> 
>>>>>    [oscar-Vostro-3550:06740] *** Process received signal ***
>>>>>    [oscar-Vostro-3550:06741] *** Process received signal ***
>>>>>    [oscar-Vostro-3550:06741] Signal: Segmentation fault (11)
>>>>>    [oscar-Vostro-3550:06741] Signal code: Address not mapped (1)
>>>>>    [oscar-Vostro-3550:06741] Failing at address: 0xad6af
>>>>>    [oscar-Vostro-3550:06742] *** Process received signal ***
>>>>>    [oscar-Vostro-3550:06740] Signal: Segmentation fault (11)
>>>>>    [oscar-Vostro-3550:06740] Signal code: Address not mapped (1)
>>>>>    [oscar-Vostro-3550:06740] Failing at address: 0xad6af
>>>>>    [oscar-Vostro-3550:06742] Signal: Segmentation fault (11)
>>>>>    [oscar-Vostro-3550:06742] Signal code: Address not mapped (1)
>>>>>    [oscar-Vostro-3550:06742] Failing at address: 0xad6af
>>>>>    [oscar-Vostro-3550:06740] [ 0]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f49ee2224a0]
>>>>>    [oscar-Vostro-3550:06740] [ 1]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f49ee26f54c]
>>>>>    [oscar-Vostro-3550:06740] [ 2] ./exe() [0x406742]
>>>>>    [oscar-Vostro-3550:06740] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>>    [oscar-Vostro-3550:06740] [ 4]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)
>>>>> [0x7f49ee20d76d]
>>>>>    [oscar-Vostro-3550:06742] [ 0]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6877fdc4a0]
>>>>>    [oscar-Vostro-3550:06742] [ 1]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7f687802954c]
>>>>>    [oscar-Vostro-3550:06742] [ 2] ./exe() [0x406742]
>>>>>    [oscar-Vostro-3550:06742] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>>    [oscar-Vostro-3550:06742] [ 4]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)
>>>>> [0x7f6877fc776d]
>>>>>    [oscar-Vostro-3550:06742] [ 5] ./exe() [0x401399]
>>>>>    [oscar-Vostro-3550:06742] *** End of error message ***
>>>>>    [oscar-Vostro-3550:06740] [ 5] ./exe() [0x401399]
>>>>>    [oscar-Vostro-3550:06740] *** End of error message ***
>>>>>    [oscar-Vostro-3550:06741] [ 0]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fa6c4c6e4a0]
>>>>>    [oscar-Vostro-3550:06741] [ 1]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x1c) [0x7fa6c4cbb54c]
>>>>>    [oscar-Vostro-3550:06741] [ 2] ./exe() [0x406742]
>>>>>    [oscar-Vostro-3550:06741] [ 3] ./exe(main+0x34) [0x406ac9]
>>>>>    [oscar-Vostro-3550:06741] [ 4]
>>>>>    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)
>>>>> [0x7fa6c4c5976d]
>>>>>    [oscar-Vostro-3550:06741] [ 5] ./exe() [0x401399]
>>>>>    [oscar-Vostro-3550:06741] *** End of error message ***
>>>>> 
>>>>> --------------------------------------------------------------------------
>>>>> 
>>>>>    mpirun noticed that process rank 0 with PID 6917 on node
>>>>>    oscar-Vostro-3550 exited on signal 11 (Segmentation fault).
>>>>> 
>>>>> --------------------------------------------------------------------------
>>>>> 
>>>>>    2 total processes killed (some possibly by mpirun during cleanup)
>>>>> 
>>>>>    If the program had a segmentation fault, I would expect it to fail
>>>>>    in every case. I checked the program and didn't find the error. Why
>>>>>    does the program work with five temperatures?
>>>>>    Could someone please help me find the error and answer my question?
>>>>> 
>>>>>    The program and the files needed to run it are attached.
>>>>> 
>>>>>    Thanks
>>>>> 
>>>>> 
>>>>>    _Oscar Fabian Mojica Ladino_
>>>>>    Geologist M.S. in  Geophysics
>>>>> 
>>>>>    _______________________________________________
>>>>>    users mailing list
>>>>>    us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>>>    http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>> 
>>> 
>> 
> 