you need to run the ulimit command before mpirun and on the same node.
if it still does not work, then you can use a wrapper.
instead of
mpirun a.out
you would do
mpirun a.sh
where a.sh is a script containing
ulimit -c unlimited
exec a.out
the core is created in the current directory
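
A minimal sketch of such a wrapper, assuming a POSIX shell. Unlike the a.sh described above, this variant takes the program to run from its arguments (an assumption made here so the same script can wrap any executable), and it only prints a warning if the limit cannot be raised.

```shell
# Create the wrapper script (the name a.sh follows the message above).
cat > a.sh <<'EOF'
#!/bin/sh
# Raise the core file size limit for this process and its children.
# If the hard limit forbids it, warn and carry on.
ulimit -c unlimited 2>/dev/null || echo "a.sh: could not raise core limit" >&2
# Replace the shell with the real program, so the exit status and any
# signal seen by mpirun are those of the program itself.
exec "$@"
EOF
chmod +x a.sh
```

With this variant the invocation would look like "mpirun ./a.sh ./a.out". Because the limit is set inside the wrapper, it applies on whichever node the process is actually started.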
Cheers,
Gilles
On Saturda
OK thanks for the hint. In fact 'ldd' command shows that some libraries
were missing. adding the paths to LD_LIBRARY_PATH solved the problem.
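
The check and the fix described above can be sketched as follows, assuming a POSIX shell. The helper names missing_libs and prepend_ld_path are hypothetical, and the directory passed to prepend_ld_path is whatever directory holds the libraries that ldd reports as missing.

```shell
# Hypothetical helper: list the libraries ldd cannot resolve for a binary.
# Prints nothing (and returns non-zero) when every dependency is found.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH
# so the dynamic linker searches it first.
prepend_ld_path() {
    dir=$1
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        # Avoid a trailing ':' here: an empty entry means "current directory".
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}
```

A typical sequence would be to run missing_libs on the executable on a compute node, call prepend_ld_path for each directory that holds a missing library, then run missing_libs again to confirm that nothing is left unresolved.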
Regards,
Mahmood
Did you, perchance, install Open MPI v2.0.0 in the same directory tree where a
prior version of Open MPI was already installed?
If so, Open MPI may be trying to use plugins from the prior version of Open
MPI, which will be problematic.
Sent from my phone. No type good.
> On Sep 2, 2016, at 11
Note that Open MPI v2.0.0 is not ABI compatible with prior releases of Open
MPI. If you are trying to run an MPI executable created by a prior version of
Open MPI, you will need to recompile your application with Open MPI v2.0.0.
Sent from my phone. No type good.
> On Sep 2, 2016, at 12:48 PM,
Thanks for your help. Please see below
mahmood@compute-0-1:~$ ldd /share/apps/chemistry/siesta-3.2-pl-5/tpar/transiesta
linux-vdso.so.1 => (0x7fffba9a8000)
libmpi_f90.so.1 => /opt/openmpi/lib/libmpi_f90.so.1 (0x2b472b64)
libmpi_f77.so.1 => /opt/openmpi/lib/libm
Thank you. That is helpful.
Could you run 'ldd' on your executable, on one of the compute nodes if
possible?
I will not be able to solve your problem, but at least we now know what the
application is, and can look at the libraries it is using.
On 2 September 2016 at 17:19, Mahmood Naderan w
The application is Siesta-3.2 and the command I use is
/share/apps/computer/openmpi-1.6.5/bin/mpirun -hostfile hosts.txt -np
15 /share/apps/chemistry/siesta-3.2-pl-5/tpar/transiesta <
trans-cc-bt-cc-163-20.fdf
There is one node in the hosts.txt file. I have built transiesta
binary from the sourc
Mahmood,
are you compiling and linking this application?
Or are you using an executable which someone else has prepared?
It would be very useful if we could know the application.
On 2 September 2016 at 16:35, Mahmood Naderan wrote:
> >Did you run
> >ulimit -c unlimited
> >before invoking mpi
Hi,
Using OpenMPI-2.0.0, does anyone have an idea about this error?
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires
>Did you run
>ulimit -c unlimited
>before invoking mpirun ?
Yes, on the node that reports the error. Is the core file created in the
current working directory, or is it somewhere in the system folders?
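
On Linux, where a core file ends up depends on the kernel's core_pattern setting: a plain name such as "core" is written relative to the crashing process's working directory, an absolute path sends it to that path, and a pattern starting with "|" pipes the core to a helper program. A minimal sketch for checking both this setting and the current limit, assuming a Linux node and a POSIX shell (the function name show_core_settings is hypothetical):

```shell
# Hypothetical helper: report the core size limit of this shell and the
# kernel's core file naming pattern (Linux only).
show_core_settings() {
    echo "core size limit: $(ulimit -c)"
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core pattern: unavailable (not Linux, or /proc not mounted)"
    fi
}

show_core_settings
```

A limit of 0 means no core file is written at all, whatever the pattern says.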
As another question, I am trying to use OpenMPI-2.0.0 as a new one.
Problem is that the applicatio
Lachlan mentioned that he has "M Series" hardware, which, to the best of my
knowledge, does not officially support usNIC. It may not be possible to even
configure the relevant usNIC adapter policy in UCSM for M Series
modules/chassis.
Using the TCP BTL may be the only realistic option here.
-
Greetings Lachlan.
Yes, Gilles and John are correct: on Cisco hardware, our usNIC transport is the
lowest latency / best HPC-performance transport. I'm not aware of any MPI
implementation (including Open MPI) that has support for FC types of transports
(including FCoE).
I'll ping you off-list
Also, the error message suggested that TCP is not the issue here -- the TCP
hangups are likely because some other process exited unexpectedly.
Indeed:
-
mpirun noticed that process rank 0 with PID 4989 on node compute-0-1 exited on
signal 4 (Illegal instruction).
-
This might be the re
Mahmood, as Gilles says, start by looking at how that application is compiled
and linked.
Run 'ldd' on the executable and look closely at the libraries. Do this on
a compute node if you can.
There was a discussion on another mailing list recently about how to
fingerprint executables and see which a
Did you run
ulimit -c unlimited
before invoking mpirun ?
if your application can be run with only one task, you can try to run it
under gdb.
you will hopefully be able to see where the illegal instruction occurs.
since you are running on AMD processors, you have to make sure you are not
using an
>Are you running under a batch manager ?
>On which architecture ?
Currently I am not using the job manager (which is actually PBS). I am
running from the terminal.
The machines are AMD Opteron 64 bit
>Hopefully you will get a core file that points you to the illegal instruction
Where is that cor
16 matches