Hi,
I am getting a weird error when running mpiexec as one particular user:
[mboisson@gpu-k20-14 helios_test]$ mpiexec -np 2 mdrunmpi -ntomp 10 -s prod_s6_01kcal_bb_dr -deffnm testout
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.
Host: gpu-k20-14
Framework: filem
Component: rsh
--------------------------------------------------------------------------
[gpu-k20-14:205673] mca: base: components_register: registering filem components
[gpu-k20-14:205673] [[56298,0],0] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 673
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_filem_base_open failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
What is weird is that the same command works for other users on the
same node.
Does anyone know what might be going on here?
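Since the failure depends on the user, a per-user difference in environment or configuration seems the most likely place to look. Below is a minimal sketch of what could be compared between the failing account and a working one. The function name show_user_mca_settings is hypothetical; the sketch assumes a standard Open MPI install where MCA parameters can come from OMPI_MCA_* environment variables or from $HOME/.openmpi/mca-params.conf, and where ompi_info is in PATH. It only prints information and does not confirm that any of these is the actual cause.

```shell
#!/bin/sh
# Hypothetical helper: print per-user settings that can influence which
# MCA components Open MPI tries to load. Run it as the failing user and
# as a working user, then compare the two outputs.
show_user_mca_settings() {
    # MCA parameters set through the environment use the OMPI_MCA_ prefix.
    echo "== OMPI_MCA_* environment variables =="
    env | grep '^OMPI_MCA_' || echo "(none set)"

    # Open MPI also reads a per-user parameter file, if one exists.
    echo "== $HOME/.openmpi/mca-params.conf =="
    if [ -f "$HOME/.openmpi/mca-params.conf" ]; then
        cat "$HOME/.openmpi/mca-params.conf"
    else
        echo "(file not present)"
    fi

    # Which mpiexec is found first, and the library search path in effect.
    echo "== mpiexec and LD_LIBRARY_PATH =="
    command -v mpiexec || echo "(mpiexec not in PATH)"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"

    # List the filem components this installation reports, if ompi_info exists.
    echo "== filem components reported by ompi_info =="
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -i 'filem' || echo "(no filem components listed)"
    else
        echo "(ompi_info not in PATH)"
    fi
}

show_user_mca_settings
```

Saving the output for each user to a file and running diff on the two files would show any setting that differs between the accounts.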
Thanks,
--
---------------------------------
Maxime Boissonneault
Computing Analyst - Calcul Québec, Université Laval
Ph.D. in Physics