That is a good sign; it means orted was started on both nodes.
Strictly speaking, you should confirm each node appears 16 times in
the output before you can draw any firm conclusion.
Cheers,
Gilles
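The count check Gilles describes can be scripted with standard tools. A minimal sketch follows; the mpirun line itself needs the cluster, so a loop simulates its output here, using the node names from this thread:

```shell
# On the cluster you would capture the output of the non-MPI test run:
#   mpirun --hostfile myhostfile -np 32 hostname > out.txt
# Simulate that output for two healthy 16-slot nodes:
for i in $(seq 16); do echo cx1055; echo cx1071; done > out.txt

# Each hostname should appear exactly 16 times:
sort out.txt | uniq -c
```

If a node's count is missing or short, that is the node on which orted failed to start.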
On Monday, August 3, 2015, abhisek Mondal wrote:
> I wrote 2 new node names(which I had not used
You will see those warnings the first time you connect to a new host.
If it printed the hostname from each processor, it should be OK.
On Sun, Aug 2, 2015 at 11:06 AM, abhisek Mondal wrote:
> I wrote 2 new node names(which I had not used before) in "myhostfile".
> And when I run it from login term
I wrote 2 new node names (which I had not used before) in "myhostfile".
And when I run it from the login terminal, it says:
*Warning: Permanently added 'cx1055,10.1.5.35' (RSA) to the list of known
hosts.*
*Warning: Permanently added 'cx1071,10.1.5.51' (RSA) to the list of known
hosts.*
Is it okay?
On Sun, Aug 2, 2015 at 10:47 AM, abhisek Mondal wrote:
Try
mpirun --hostfile myhostfile -np 32 hostname
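For reference, a hostfile for this run could look like the sketch below. The slots=16 values are an assumption based on the 16-processes-per-node setup described in this thread, and /usr/local/bin is only an example install prefix:

```shell
# Write a hostfile naming both nodes; slots=16 assumes 16 cores per node:
cat > myhostfile <<'EOF'
cx1055 slots=16
cx1071 slots=16
EOF
cat myhostfile

# Launch the non-MPI test across all 32 slots (cluster-only, so shown
# commented; use the full path to mpirun so remote nodes can find orted):
#   /usr/local/bin/mpirun --hostfile myhostfile -np 32 hostname
```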
Sorry, but I can't get it.
Would you please provide demo code (in the context of the working code)?
Thanks.
On Sun, Aug 2, 2015 at 7:43 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
> simply replace nwchem with hostname
>
> both hosts should be part of the output...
>
> Cheers,
>
simply replace nwchem with hostname
both hosts should be part of the output...
Cheers,
Gilles
On Sunday, August 2, 2015, abhisek Mondal wrote:
> Jeff, Gilles
>
> Here's my scenario again when I tried something different:
> I've interactively booked 2 nodes(cx1015 and cx1016) and working in
>
Jeff, Gilles
Here's my scenario again when I tried something different:
I've interactively booked 2 nodes (cx1015 and cx1016) and am working on
the "cx1015" node.
Here I ran "module load openmpi" and "module load nwchem" (but I don't
know how to "module load" on the other node).
Using the openmpi command to r
The initial error was that ompi could not find orted on the second node,
and that was fixed by using the full path for mpirun.
if you run under pbs, you should not need the hostfile option.
just ask pbs to allocate 2 nodes and everything should run smoothly.
at first, I recommend you run a non-MPI application.
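Following that advice, a hedged sketch of what such a PBS batch script could look like; the resource line, walltime, and install details are assumptions for illustration (the module names are the ones used in this thread), so check your site's documentation:

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=16        # 2 nodes, 16 processors each (assumed layout)
#PBS -l walltime=01:00:00     # placeholder walltime

cd "$PBS_O_WORKDIR"
module load openmpi           # module names as used in this thread
module load nwchem

# If Open MPI was built with PBS/tm support, it discovers the allocated
# nodes itself; otherwise pass --hostfile "$PBS_NODEFILE" explicitly.
mpirun -np 32 nwchem filename.nw
```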
Abhisek --
You are having two problems:
1. In the first "orted not found" problem, Open MPI was not finding its "orted"
helper executable on the remote nodes in your cluster. When you "module load
..." something, it just loads the relevant PATH / LD_LIBRARY_PATH / etc. on the
local node; it does not change the environment on the remote nodes.
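The local-only effect of "module load" can be demonstrated with plain PATH manipulation, no cluster needed. In this analogy, /opt/fake/bin is a made-up directory standing in for the module's bin directory:

```shell
# Extend PATH in the current shell, as "module load" would:
export PATH="/opt/fake/bin:$PATH"
case "$PATH" in
  */opt/fake/bin*) echo "current shell: /opt/fake/bin on PATH" ;;
esac

# A shell started with a clean environment (analogous to the
# non-interactive shell that launches orted on a remote node)
# does not see it:
env -i /bin/sh -c '
  case "$PATH" in
    */opt/fake/bin*) echo "clean shell: found" ;;
    *)               echo "clean shell: /opt/fake/bin NOT on PATH" ;;
  esac'
```

This is why mpirun works locally but the remote orted is not found: the remote shell never ran the module commands.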
I'm on an HPC cluster, so openmpi-1.6.4 is installed here as a module.
In .pbs script, before executing my code-line, I'm loading both "nwchem"
and "openmpi" module.
It works very nicely when I work on a single node (with 16 processors),
but if I try to switch to multiple nodes with the "hostfile" option, it fails.
Hi,
I have tried using full paths for both of them, but I'm stuck on the same issue.
On Sun, Aug 2, 2015 at 4:39 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
> Is ompi installed on the other node and at the same location ?
> did you configure ompi with --enable-mpirun-prefix-by-def
Is ompi installed on the other node and at the same location ?
did you configure ompi with --enable-mpirun-prefix-by-default ?
(note that should not be necessary if you invoke mpirun with full path )
you can also try
/.../bin/mpirun --mca plm_base_verbose 100 ...
and see if there is something wrong.
Yes, I have tried this and got following error:
*mpirun was unable to launch the specified application as it could not find
an executable:*
*Executable: nwchem*
*Node: cx934*
*while attempting to start process rank 16.*
Given that I have to run my code with the "nwchem filename.nw" command.
While
Can you try invoking mpirun with its full path instead?
e.g. /usr/local/bin/mpirun instead of mpirun
Cheers,
Gilles
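The full path can be pulled out of the shell where the modules are loaded; POSIX `command -v` does the lookup. Since mpirun isn't available everywhere, `sh` stands in for it in the runnable part below:

```shell
# Resolve an executable to its absolute path via the current PATH.
# (command -v is POSIX; `which` also works on most systems.)
resolve() { command -v "$1"; }

resolve sh    # e.g. /bin/sh -- a universally available command for the demo

# In the real session, after "module load openmpi", you would do
# (cluster-only, hence commented):
#   MPIRUN=$(command -v mpirun)
#   "$MPIRUN" --hostfile myhostfile -np 32 nwchem filename.nw
```

Using the resolved absolute path lets Open MPI derive its install prefix on the remote nodes, which is how the full-path invocation fixes the "orted: Command not found" error.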
On Sunday, August 2, 2015, abhisek Mondal wrote:
> Here are the other details,
>
> a. The Openmpi version is 1.6.4
>
> b. The error as being generated is :
> *Warning: Pe
Here are the other details,
a. The Open MPI version is 1.6.4
b. The error being generated is:
*Warning: Permanently added 'cx0937,10.1.4.1' (RSA) to the list of known
hosts.*
*Warning: Permanently added 'cx0934,10.1.3.255' (RSA) to the list of known
hosts.*
*orted: Command not found.*
*orted: Command not found.*
Would you please tell us:
(a) what version of OMPI you are using
(b) what error message you are getting when the job terminates
> On Aug 1, 2015, at 12:22 PM, abhisek Mondal wrote:
>
> I'm working on an openmpi-enabled cluster. I'm trying to run a job with 2
> different nodes and 16 processors