Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
t here on the mailing list. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Jeff Squyres (jsquyres) via users Sent: Thursday, May 5, 2022 3:31 PM To: George Bosilca; Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
2022 3:19 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres); Scott Sayres Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 That is weird, but maybe it is not a deadlock, but a very slow progress. In the child can you print the fdmax and i in the frame do_child. George. On Thu,

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread George Bosilca via users
That is weird, but maybe it is not a deadlock, just very slow progress. In the child, can you print fdmax and i in the do_child frame? George. On Thu, May 5, 2022 at 11:50 AM Scott Sayres via users < users@lists.open-mpi.org> wrote: > Jeff, thanks. > from 1: > > (lldb) process attach --pid 9

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Scott Sayres via users
Jeff, thanks. from 1: (lldb) process attach --pid 95083 Process 95083 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x0001bde25628 libsystem_kernel.dylib`close + 8 libsystem_kernel.dylib`close: -> 0x1bde25628 <+8>: b.lo 0x1bde25648

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
You can use "lldb -p PID" to attach to a running process. -- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Thursday, May 5, 2022 11:22 AM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Scott Sayres via users
Jeff, It does launch two mpirun processes (when hung from another terminal window) scottsayres 95083 99.0 0.0 408918416 1472 s002 R 8:20AM 0:04.48 mpirun -np 4 foo.sh scottsayres 95085 0.0 0.0 408628368 1632 s006 S+8:20AM 0:00.00 egrep mpirun|foo.sh scottsayres

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Bennet Fauber via users
happens immediately after forking the > child process... which is weird). > > -- > Jeff Squyres > jsquy...@cisco.com > > ________ > From: Scott Sayres > Sent: Wednesday, May 4, 2022 4:02 PM > To: Jeff Squyres (jsquyres) > Cc: Open MPI Users > Subject: Re: [OMPI users] mp

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
ild process... which is weird). -- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Wednesday, May 4, 2022 4:02 PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 foo.sh is executabl

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Gilles Gouaillardet via users
it via: >> >> mpirun -np 1 foo.sh >> >> If you start seeing output, good!If it completes, better! >> >> If it hangs, and/or if you don't see any output at all, do this: >> >> ps auxwww | egrep 'mpirun|foo.sh' >> >> It should show mp

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
ut at all, do this: > > ps auxwww | egrep 'mpirun|foo.sh' > > It should show mpirun and 2 copies of foo.sh (and probably a grep). Does > it? > > -- > Jeff Squyres > jsquy...@cisco.com > > ________________ > From: Scott Sayres >

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
uy...@cisco.com From: Scott Sayres Sent: Wednesday, May 4, 2022 2:47 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 Following Jeff's advice, I have rebuilt open-mpi by hand using the -g option. This shows more

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
Following Jeff's advice, I have rebuilt open-mpi by hand using the -g option. This shows more information, as below. I am attempting George's advice on how to track the child, but notice that gdb does not support arm64. Attempting to update lldb. scottsayres@scotts-mbp openmpi-4.1.3 % lldb mpir

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
Sent: Wednesday, May 4, 2022 12:35 PM To: Open MPI Users Cc: George Bosilca Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 I compiled a fresh copy of the 4.1.3 branch on my M1 laptop, and I can run both MPI and non-MPI apps without any issues. Try running `lldb mpirun -- -np 1 hos

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread George Bosilca via users
Scott, This shows the deadlock arrives during the local spawn. Here is how things are supposed to work: the mpirun process (parent) will fork (the child), and these 2 processes are connected through a pipe. The child will then execve the desired command (hostname in your case), and this will close

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
Hi George, Thanks! You have just taught me a new trick. Although I do not yet understand the output, it is below: scottsayres@scotts-mbp ~ % lldb mpirun -- -np 1 hostname (lldb) target create "mpirun" Current executable set to 'mpirun' (arm64). (lldb) settings set -- target.run-args "-np" "1

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread George Bosilca via users
I compiled a fresh copy of the 4.1.3 branch on my M1 laptop, and I can run both MPI and non-MPI apps without any issues. Try running `lldb mpirun -- -np 1 hostname` and once it deadlocks, do a CTRL+C to get back on the debugger and then `backtrace` to see where it is waiting. George. On Wed, Ma
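A sketch of the session George describes (the hang, the interrupt, and the backtrace):
$ lldb mpirun -- -np 1 hostname
(lldb) run
^C                     # once it deadlocks, CTRL+C drops back to the debugger
(lldb) backtrace       # shows where mpirun is waiting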

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
Thanks for looking at this, Jeff. No, I cannot use mpirun to launch a non-MPI application. The command "mpirun -np 2 hostname" also hangs. I get the following output if I add the -d option before it (I've replaced the server name with hashtags): [scotts-mbp.3500.dhcp.###:05469] procdir: /var/fol

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
Are you able to use mpirun to launch a non-MPI application? E.g.: mpirun -np 2 hostname And if that works, can you run the simple example MPI apps in the "examples" directory of the MPI source tarball (the "hello world" and "ring" programs)? E.g.: cd examples make mpirun -np 4 hello_c mpirun
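Collected into one runnable sequence, the sanity checks above look roughly like this (run from the top of the Open MPI source tree; hello_c and ring_c are the C examples in the tarball):
$ mpirun -np 2 hostname     # launcher only, no MPI library involved
$ cd examples
$ make
$ mpirun -np 4 hello_c      # "hello world" MPI test
$ mpirun -np 4 ring_c       # simple message-passing test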

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-11-15 Thread Jorge SILVA via users
Hello, I used Brice's workaround and now mpirun works well on all computers! Thank you all for your help. Jorge On 14/11/2020 at 23:11, Brice Goglin via users wrote: Hello The hwloc/X11 stuff is caused by OpenMPI using a hwloc that was built with the GL backend enabled (in your case, it'

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-11-14 Thread Brice Goglin via users
Hello The hwloc/X11 stuff is caused by OpenMPI using a hwloc that was built with the GL backend enabled (in your case, it's because package libhwloc-plugins is installed). That backend is used for querying the locality of X11 displays running on NVIDIA GPUs (using libxnvctrl). Does running "lstopo

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-11-14 Thread Jorge Silva via users
Sorry, if I execute mpirun in a *really* bare terminal, without an X server running, it works! But with an error message: Invalid MIT-MAGIC-COOKIE-1 key. So the problem is related to X, but I still have no solution. Jorge On 14/11/2020 at 12:33, Jorge Silva via users wrote: Hello, In spite

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-11-14 Thread Jorge Silva via users
Hello, In spite of the delay, I was not able to solve my problem. Thanks to Joseph and Prentice for their interesting suggestions. I uninstalled AppArmor (SELinux is not installed) as suggested by Prentice, but there were no changes; mpirun still hangs. The result of the gdb stack trace is the

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-22 Thread Joseph Schuchart via users
Hi Jorge, Can you try to get a stack trace of mpirun using the following command in a separate terminal? sudo gdb -batch -ex "thread apply all bt" -p $(ps -C mpirun -o pid= | head -n 1) Maybe that will give some insight where mpirun is hanging. Cheers, Joseph On 10/21/20 9:58 PM, Jorge SI

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Jeff Squyres (jsquyres) via users
There's huge differences between Open MPI v2.1.1 and v4.0.3 (i.e., years of development effort); it would be very hard to categorize them all; sorry! What happens if you mpirun -np 1 touch /tmp/foo (Yes, you can run non-MPI apps through mpirun) Is /tmp/foo created? (i.e., did the job run,

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Jorge SILVA via users
Hello Jeff, The program is not executed; it seems to wait for something to connect with (why ctrl-C twice?) jorge@gcp26:~/MPIRUN$ mpirun -np 1 touch /tmp/foo ^C^C jorge@gcp26:~/MPIRUN$ ls -l /tmp/foo ls: cannot access '/tmp/foo': No such file or directory no file is created.

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Gilles Gouaillardet via users
Hi Jorge, If a firewall is running on your nodes, I suggest you disable it and try again Cheers, Gilles On Wed, Oct 21, 2020 at 5:50 AM Jorge SILVA via users wrote: > > Hello, > > I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two different > computers in the standard way. Compiling w
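On distributions that ship firewalld, a quick test with the firewall out of the way could look like the sketch below (host names are placeholders; re-enable the firewall afterwards, or open the needed ports instead as discussed in other threads here):
$ sudo systemctl stop firewalld          # on every node involved
$ mpirun -np 2 -host nodeA,nodeB hostname
$ sudo systemctl start firewalld         # restore the firewall after the test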

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Jorge SILVA via users
Hello Gus, Thank you for your answer. Unfortunately my problem is much more basic. I didn't try to run the program on both computers, but just to run something on one computer. I just installed the new OS and openmpi on two different computers, in the standard way, with the same result.

Re: [OMPI users] mpirun on Kubuntu 20.4.1 hangs

2020-10-21 Thread Gus Correa via users
Hi Jorge, You may have an active firewall protecting either computer or both, preventing mpirun from starting the connection. Your /etc/hosts file may also be missing the computers' IP addresses. You may also want to try the --hostfile option. Likewise, the --verbose option may also help diagnose the pr

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Hà Chi Nguyễn Nhật via users
Dear Patrick and all, Finally I solved the problem. I needed to mount (-t nfs) the home directory of the host onto the nodes' /home, and then I can run on the cluster. Thank you for your time. Best regards Ha Chi On Thu, 4 Jun 2020 at 17:09, Patrick Bégou < patrick.be...@legi.grenoble-inp.fr> wrote: > Ha Chi,
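A sketch of the NFS arrangement Ha Chi describes (host name and export path are placeholders; the frontend must export /home, e.g. via /etc/exports, before the nodes can mount it):
# on each compute node, as root:
$ mount -t nfs frontend:/home /home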

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Patrick Bégou via users
Ha Chi, first, running MPI applications as root is not a good idea. You must create users in your Rocks cluster without admin rights for everything that is not system management. Let me know a little more about how you launch this: 1) Do you run "mpirun" from the Rocks frontend or from a node? 2) Ok fro

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Hà Chi Nguyễn Nhật via users
Dear Patrick, Thanks so much for your reply. Yes, we use ssh to log on to the nodes. From the frontend, we can ssh to the nodes without a password. The mpirun --version on all 3 nodes is identical, openmpi 2.1.1, and in the same place when testing with "whereis mpirun". So is there any problem with mpirun causi

Re: [OMPI users] mpirun only work for 1 processor

2020-06-04 Thread Patrick Bégou via users
Hi Ha Chi do you use a batch scheduler with Rocks Cluster or do you log on the node with ssh ? If ssh, can you check  that you can ssh from one node to the other without password ? Ping just says the network is alive, not that you can connect. Patrick Le 04/06/2020 à 09:06, Hà Chi Nguyễn Nhật vi

Re: [OMPI users] mpirun error only with one node

2020-04-08 Thread Garrett, Charles via users
I hope this replies correctly. I previously had a problem with replies. Anyhow, thank you for the advice. It turns out NUMA was disabled in the BIOS. All other nodes showed 2 NUMA nodes but node125 showed 1 NUMA node. I was able to see this by diffing lscpu on node125 and another node. Afte

Re: [OMPI users] mpirun error only with one node

2020-04-03 Thread John Hearns via users
Are you SURE node125 is identical to the others? Systems can boot up and disable DIMMs, for instance. I would log on there and run free, lscpu, lspci, and dmidecode. Take those outputs and run a diff against the outputs from a known-good node. Also, hwloc/lstopo might show some difference? On Thu, 2 A
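A sketch of the node comparison John suggests (node124 stands in for any known-good node):
$ ssh node125 lscpu > node125.lscpu
$ ssh node124 lscpu > node124.lscpu
$ diff node125.lscpu node124.lscpu     # a disabled-NUMA/DIMM difference shows up here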

Re: [OMPI users] mpirun CLI parsing

2020-03-30 Thread Ralph Castain via users
I'm afraid the short answer is "no" - there is no way to do that today. > On Mar 30, 2020, at 1:45 PM, Jean-Baptiste Skutnik via users > wrote: > > Hello, > > I am writing a wrapper around `mpirun` which requires pre-processing of the > user's program. To achieve this, I need to isolate the

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Nov 1, 2019, at 10:14 AM, Reuti <re...@staff.uni-marburg.de> wrote: For the most part, this whole thing needs to get documented. Especially that the colon is a disallowed character in the directory name. Any suffix :foo will just be removed AFAICS without any error output about foo b

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Reuti via users
> On 01.11.2019 at 14:46, Jeff Squyres (jsquyres) via users wrote: > > On Nov 1, 2019, at 9:34 AM, Jeff Squyres (jsquyres) via users > wrote: >> >>> Point to make: it would be nice to have an option to suppress the output on >>> stdout and/or stderr when output redirection to file is re

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Nov 1, 2019, at 9:34 AM, Jeff Squyres (jsquyres) via users wrote: > >> Point to make: it would be nice to have an option to suppress the output on >> stdout and/or stderr when output redirection to file is requested. In my >> case, having stdout still visible on the terminal is desirable bu

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Jeff Squyres (jsquyres) via users
On Oct 31, 2019, at 6:43 PM, Joseph Schuchart via users wrote: > > Just to throw in my $0.02: I recently found that the output to stdout/stderr > may not be desirable: in an application that writes a lot of log data to > stderr on all ranks, stdout was significantly slower than the files I >

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Gilles GOUAILLARDET via users
Joseph, I had to use the absolute path of the fork agent. I may have misunderstood your request. Now it seems you want each task's stderr redirected to a unique file, but not duplicated to mpirun's stderr. Is that right? If so, instead of the --output-filename option, you can do it "manu

Re: [OMPI users] mpirun --output-filename behavior

2019-11-01 Thread Joseph Schuchart via users
Gilles, Thanks for your suggestions! I just tried both of them, see below: On 11/1/19 1:15 AM, Gilles Gouaillardet via users wrote: Joseph, you can achieve this via an agent (and it works with DDT too) For example, the nostderr script below redirects each MPI task's stderr to /dev/null (so

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Gilles Gouaillardet via users
Joseph, you can achieve this via an agent (and it works with DDT too) For example, the nostderr script below redirects each MPI task's stderr to /dev/null (so it is not forwarded to mpirun) $ cat nostderr #!/bin/sh exec 2> /dev/null exec "$@" and then you can simply $ mpirun --mca or
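The mpirun option is cut off above; a sketch of how such a fork agent is typically hooked in (the MCA parameter name orte_fork_agent is an assumption here, and the agent must be given by its absolute path, as Gilles notes in his follow-up):
$ chmod +x /home/user/nostderr
$ mpirun --mca orte_fork_agent /home/user/nostderr -np 4 ./a.out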

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Joseph Schuchart via users
On 10/30/19 2:06 AM, Jeff Squyres (jsquyres) via users wrote: Oh, did the prior behavior *only* output to the file and not to stdout/stderr?  Huh. I guess a workaround for that would be:     mpirun  ... > /dev/null Just to throw in my $0.02: I recently found that the output to stdout/std

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Kulshrestha, Vipul via users
Thanks Jeff. “:nojobid” worked well for me and helps me remove 1 extra level of hierarchy for log files. Regards Vipul From: Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com] Sent: Thursday, October 31, 2019 6:21 PM To: Kulshrestha, Vipul Cc: Open MPI User's List Subject: Re: [OMPI
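A sketch of the invocation with that suffix (path, process count, and program are placeholders):
$ mpirun --output-filename /path/to/app.log:nojobid -np 4 ./a.out
# drops the extra job-id directory level (the "1" asked about elsewhere in this thread)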

Re: [OMPI users] mpirun --output-filename behavior

2019-10-31 Thread Jeff Squyres (jsquyres) via users
On Oct 30, 2019, at 2:16 PM, Kulshrestha, Vipul <vipul_kulshres...@mentor.com> wrote: Given that this is an intended behavior, I have a couple of follow-up questions: 1. What is the purpose of the directory "1" that gets created currently? (in /app.log/1/rank./stdout) Is this hard

Re: [OMPI users] mpirun --output-filename behavior

2019-10-30 Thread Kulshrestha, Vipul via users
] Sent: Tuesday, October 29, 2019 9:07 PM To: Open MPI User's List Cc: Kulshrestha, Vipul Subject: Re: [OMPI users] mpirun --output-filename behavior On Oct 29, 2019, at 7:30 PM, Kulshrestha, Vipul via users mailto:users@lists.open-mpi.org>> wrote: Hi, We recently shifted from openM

Re: [OMPI users] mpirun --output-filename behavior

2019-10-29 Thread Jeff Squyres (jsquyres) via users
On Oct 29, 2019, at 7:30 PM, Kulshrestha, Vipul via users <users@lists.open-mpi.org> wrote: Hi, We recently shifted from openMPI 2.0.1 to 4.0.1 and are seeing an important behavior change with respect to the above option. We invoke mpirun as % mpirun --output-filename /app.log --np With 2

Re: [OMPI users] mpirun noticed that process rank 5 with PID 0 on node localhost exited on signal 9 (Killed).

2018-09-28 Thread Ralph H Castain
Ummm…looks like you have a problem in your input deck to that application. Not sure what we can say about it… > On Sep 28, 2018, at 9:47 AM, Zeinab Salah wrote: > > Hi everyone, > I use openmpi-3.0.2 and I want to run chimere model with 8 processors, but in > the step of parallel mode, the ru

Re: [OMPI users] mpirun hangs

2018-08-15 Thread Jeff Squyres (jsquyres) via users
There can be lots of reasons that this happens. Can you send all the information listed here? https://www.open-mpi.org/community/help/ > On Aug 15, 2018, at 10:55 AM, Mota, Thyago wrote: > > Hello. > > I have openmpi 2.0.4 installed on a Cent OS 7. When I try to run "mpirun" it > hang

Re: [OMPI users] mpirun issue using more than 64 hosts

2018-02-12 Thread Adam Sylvester
A... thanks Gilles. That makes sense. I was stuck thinking there was an ssh problem on rank 0; it never occurred to me mpirun was doing something clever there and that those ssh errors were from a different instance altogether. It's no problem to put my private key on all instances - I'll go

Re: [OMPI users] mpirun issue using more than 64 hosts

2018-02-12 Thread Gilles Gouaillardet
Adam, by default, when more than 64 hosts are involved, mpirun uses a tree spawn in order to remote launch the orted daemons. That means you have two options here : - allow all compute nodes to ssh each other (e.g. the ssh private key of *all* the nodes should be in *all* the authorized_keys -
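The second option is cut off above; one common way to express it is to disable the tree spawn entirely so that mpirun launches every orted itself (the MCA parameter name is an assumption here, and this costs some launch scalability on large clusters):
$ mpirun --mca plm_rsh_no_tree_spawn 1 -np 128 --hostfile hosts ./a.out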

Re: [OMPI users] mpirun 2.1.1 refuses to start a Torque 6.1.1.1 job if I change the scheduler to Maui 3.3.1

2017-08-10 Thread A M
All solved, and now it works well! The culprit was the missing line in the "maui.cfg" file: JOBNODEMATCHPOLICY EXACTNODE The default value for this variable is EXACTPROC and, with it in effect, Maui completely ignores the "-l nodes=N:ppn=M" PBS instruction and allocates the first M available cores inside
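For reference, the fix and the request it makes Maui honor, side by side (the node/ppn counts are placeholders):
# maui.cfg
JOBNODEMATCHPOLICY EXACTNODE
# PBS/Torque submission that is now allocated across nodes as requested
$ qsub -l nodes=2:ppn=8 job.sh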

Re: [OMPI users] mpirun 2.1.1 refuses to start a Torque 6.1.1.1 job if I change the scheduler to Maui 3.3.1 [SOLVED]

2017-08-10 Thread A M
All solved, and now it works well! The culprit was the missing line in the "maui.cfg" file: JOBNODEMATCHPOLICY EXACTNODE The default value for this variable is EXACTPROC and, with it in effect, Maui completely ignores the "-l nodes=N:ppn=M" PBS instruction and allocates the first M available cores inside

Re: [OMPI users] mpirun 2.1.1 refuses to start a Torque 6.1.1.1 job if I change the scheduler to Maui 3.3.1

2017-08-09 Thread A M
Thanks! In fact there should be a problem with Maui's node allocation setting. I have checked the $PBS_NODEFILE contents (this is also may be seen with "qstat -n1"): while the default Torque scheduler correctly allocates one slot on node1 and another slot on node2, in case of Maui I always see tha

Re: [OMPI users] mpirun 2.1.1 refuses to start a Torque 6.1.1.1 job if I change the scheduler to Maui 3.3.1

2017-08-09 Thread r...@open-mpi.org
sounds to me like your maui scheduler didn’t provide any allocated slots on the nodes - did you check $PBS_NODEFILE? > On Aug 9, 2017, at 12:41 PM, A M wrote: > > > Hello, > > I have just ran into a strange issue with "mpirun". Here is what happened: > > I successfully installed Torque 6.1.1

Re: [OMPI users] mpirun with ssh tunneling

2017-01-01 Thread Adam Sylvester
Thanks Gilles - I appreciate all the detail. Ahh, that's great that Open MPI now supports specifying an ssh port simply through the hostfile. That'll make things a little simpler when I have that use case in the future. Oh of course - that makes sense that Open MPI requires TCP ports too rather

Re: [OMPI users] mpirun with ssh tunneling

2016-12-25 Thread Gilles Gouaillardet
Adam, there are several things here. With an up-to-date master, you can specify an alternate ssh port via a hostfile; see https://github.com/open-mpi/ompi/issues/2224 Open MPI requires more than just ssh: - remote nodes (orted) need to call back to mpirun (oob/tcp) - nodes (MPI tasks) need t
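The hostfile syntax for the alternate port is not shown in the snippet (see the linked issue for the master-branch feature); a version-independent alternative is a per-host entry in the ssh client config on the machine running mpirun (host name and port are placeholders):
# ~/.ssh/config
Host nodeA
    HostName nodeA.example.com
    Port 2222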

Re: [OMPI users] mpirun --map-by-node

2016-11-09 Thread Mahesh Nanavalla
Ok. Thank you all. That has solved it. On Fri, Nov 4, 2016 at 8:24 PM, r...@open-mpi.org wrote: > All true - but I reiterate. The source of the problem is that the > "--map-by node" on the cmd line must come *before* your application. > Otherwise, none of these suggestions will help. > > > On Nov 4

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
All true - but I reiterate. The source of the problem is that the "--map-by node” on the cmd line must come *before* your application. Otherwise, none of these suggestions will help. > On Nov 4, 2016, at 6:52 AM, Jeff Squyres (jsquyres) > wrote: > > In your case, using slots or --npernode or

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Jeff Squyres (jsquyres)
In your case, using slots or --npernode or --map-by node will result in the same distribution of processes because you're only launching 1 process per node (a.k.a. "1ppn"). They have more pronounced differences when you're launching more than 1ppn. Let's take a step back: you should know that O

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Bennet Fauber
Mahesh, Depending on what you are trying to accomplish, might using the mpirun option -pernode (or --pernode) work for you? That requests that only one process be spawned per available node. We generally use this for hybrid codes, where the single process will spawn threads to the remaining proc

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Mahesh Nanavalla
s... Thanks for responding to me. I have solved that as below by limiting *slots in the hostfile*: root@OpenWrt:~# cat myhostfile root@10.73.145.1 slots=1 root@10.74.25.1 slots=1 root@10.74.46.1 slots=1 I want to know the difference between limiting *slots* in myhostfile and running *--map-by node

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
My apologies - the problem is that you list the option _after_ your executable name, and so we think it is an argument for your executable. You need to list the option _before_ your executable on the cmd line > On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla > wrote: > > Thanks for reply, > >

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Mahesh Nanavalla
Thanks for the reply. But with the space it is also not running one process on each node: root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile myhostfile /usr/bin/openmpiWiFiBulb --map-by node And if I use it like this, it's working fine (running one process on each node): */root@OpenWrt:~#/usr/bi
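With the option moved in front of the executable, as the replies in this thread suggest, the corrected ordering of the same command line would be:
root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile myhostfile --map-by node /usr/bin/openmpiWiFiBulb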

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
you mistyped the option - it is “--map-by node”. Note the space between “by” and “node” - you had typed it with a “-“ instead of a “space” > On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla > wrote: > > Hi all, > > I am using openmpi-1.10.3,using quad core processor(node). > > I am running 3 pr

Re: [OMPI users] mpirun works with cmd line call , but not with app context file arg

2016-10-16 Thread MM
On 16 October 2016 at 14:50, Gilles Gouaillardet wrote: > Out of curiosity, why do you specify both --hostfile and -H ? > Do you observe the same behavior without --hostfile ~/.mpihosts ? When I specify only -H like so: mpirun -H localhost -np 1 prog1 : -H A.lan -np 4 prog2 : -H B.lan -np 4 pro
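For comparison, the same MPMD layout expressed as an app context file would look roughly like this (mpirun's --app option reads one app context per line; the third program's name is cut off above, so only the first two lines are shown):
$ cat appfile
-H localhost -np 1 prog1
-H A.lan -np 4 prog2
$ mpirun --app appfile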

Re: [OMPI users] mpirun works with cmd line call , but not with app context file arg

2016-10-16 Thread Gilles Gouaillardet
Out of curiosity, why do you specify both --hostfile and -H ? Do you observe the same behavior without --hostfile ~/.mpihosts ? Also, do you have at least 4 cores on both A.lan and B.lan ? Cheers, Gilles On Sunday, October 16, 2016, MM wrote: > Hi, > > openmpi 1.10.3 > > this call: > > mpirun

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-08-13 Thread Schneider, David A.
thanks! Glad to help. best, David Schneider SLAC/LCLS From: users [users-boun...@lists.open-mpi.org] on behalf of Reuti [re...@staff.uni-marburg.de] Sent: Friday, August 12, 2016 12:00 PM To: Open MPI Users Subject: Re: [OMPI users] mpirun won't

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-08-12 Thread Reuti
defensive practice, but it is more cumbersome, the actually path looks >> >> mpirun -n 1 $PWD/arch/x86_64-rhel7-gcc48-opt/bin/psana >> >> best, >> >> David Schneider >> SLAC/LCLS >> >> From: users

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-08-12 Thread r...@open-mpi.org
na > > best, > > David Schneider > SLAC/LCLS > > From: users [users-boun...@lists.open-mpi.org > <mailto:users-boun...@lists.open-mpi.org>] on behalf of Phil Regier > [preg...@penguincomputing.com <mailto:preg...@penguincomput

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Phil Regier
n -n 1 $PWD/arch/x86_64-rhel7-gcc48-opt/bin/psana > > best, > > David Schneider > SLAC/LCLS > > From: users [users-boun...@lists.open-mpi.org] on behalf of Phil Regier [ > preg...@penguincomputing.com] > Sent: Friday, July 29, 2016 5:12 PM > To: Open MPI Users > Subject

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Ralph Castain
> David Schneider > SLAC/LCLS > > From: users [users-boun...@lists.open-mpi.org] on behalf of Ralph Castain > [r...@open-mpi.org] > Sent: Friday, July 29, 2016 5:19 PM > To: Open MPI Users > Subject: Re: [OMPI users] mpirun won't find programs from the PATH > environ

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Schneider, David A.
Open MPI Users Subject: Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths Typical practice would be to put a ./myprogram in there to avoid any possible confusion with a “myprogram” sitting in your $PATH. We should

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Schneider, David A.
/LCLS From: users [users-boun...@lists.open-mpi.org] on behalf of Phil Regier [preg...@penguincomputing.com] Sent: Friday, July 29, 2016 5:12 PM To: Open MPI Users Subject: Re: [OMPI users] mpirun won't find programs from the PATH environment variable t

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Ralph Castain
Typical practice would be to put a ./myprogram in there to avoid any possible confusion with a “myprogram” sitting in your $PATH. We should search the PATH to find your executable, but the issue might be that it isn’t your PATH on a remote node. So the question is: are you launching strictly lo
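The two unambiguous forms Ralph suggests, using the program path quoted earlier in this thread:
$ mpirun -n 1 ./arch/x86_64-rhel7-gcc48-opt/bin/psana      # explicit relative path
$ mpirun -n 1 $PWD/arch/x86_64-rhel7-gcc48-opt/bin/psana   # absolute path; also unambiguous when launching on remote nodes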

Re: [OMPI users] mpirun won't find programs from the PATH environment variable that are in directories that are relative paths

2016-07-29 Thread Phil Regier
I might be three steps behind you here, but does "mpirun pwd" show that all your launched processes are running in the same directory as the mpirun command? I assume that "mpirun env" would show that your PATH variable is being passed along correctly, since you don't have any problems with absol

Re: [OMPI users] mpirun: Symbol `orte_schizo' has different size in shared object, consider re-linking

2016-07-19 Thread Ralph Castain
Afraid I have no brilliant ideas to offer - I’m not seeing that problem. It usually indicates that the orte_schizo plugin is being pulled from an incorrect location. You might just look in your install directory and ensure that the plugin is there. Also ensure that your install lib is at the fro

Re: [OMPI users] mpirun: Symbol `orte_schizo' has different size in shared object, consider re-linking

2016-07-19 Thread Nathaniel Graham
I've also blown away the install directory and done a complete reinstall in case there was something old left in the directory. -Nathan On Tue, Jul 19, 2016 at 2:21 PM, Nathaniel Graham wrote: > The prefix location has to be there. Otherwise ompi attempts to install > to a read-only directory. >

Re: [OMPI users] mpirun: Symbol `orte_schizo' has different size in shared object, consider re-linking

2016-07-19 Thread Nathaniel Graham
The prefix location has to be there. Otherwise ompi attempts to install to a read only directory. I have the install bin directory added to my path and the lib directory added to the LD_LIBRARY_PATH. When I run: which mpirun it is pointing to the expected place. -Nathan On Tue, Jul 19, 2016 at

Re: [OMPI users] mpirun: Symbol `orte_schizo' has different size in shared object, consider re-linking

2016-07-19 Thread Ralph Castain
Sounds to me like you have a confused build - I’d whack your prefix location and do a “make install” again > On Jul 19, 2016, at 1:04 PM, Nathaniel Graham wrote: > > Hello, > > I am trying to run the OSU tests for some results for a poster, but I am > getting the following error: > > mpi

Re: [OMPI users] mpirun has exited due to process rank N

2016-07-07 Thread Gilles Gouaillardet
Andrea, On top of what Ralph just wrote, you might want to upgrade OpenMPI to the latest stable version (1.10.3); 1.6.5 is pretty antique and is no longer maintained. The message indicates that one process died, and many things could cause a process crash. (since the crash occurs only wi

Re: [OMPI users] mpirun has exited due to process rank N

2016-07-07 Thread Ralph Castain
Try running one of the OMPI example codes and verify that things run correctly if N > 25. I suspect you have an error in your code that causes it to fail if its rank is > 25. > On Jul 7, 2016, at 2:49 PM, Alberti, Andrea wrote: > > Hi, > > my name is Andrea and I am a new openMPI user. > >

Re: [OMPI users] mpirun and Torque

2016-06-07 Thread Ralph Castain
I can confirm that mpirun will not direct-launch the applications under Torque. This is done for wireup support - if/when Torque natively supports PMIx, then we could revisit that design. Gilles: the benefit is two-fold: * Torque has direct visibility of the application procs. When we launch vi

Re: [OMPI users] mpirun and Torque

2016-06-07 Thread Gilles Gouaillardet
Ken, iirc, under Torque, when Open MPI is configure'd with --with-tm (this is the default, so assuming your Torque headers/libs can be found, you do not even have to specify --with-tm), mpirun does tm_spawn the orted daemon on all nodes except the current one. Then mpirun and orted will
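A sketch of the corresponding build configuration (the Torque and install prefixes are placeholders; as noted above, plain --with-tm is enough when the Torque headers/libs are in a default location):
$ ./configure --with-tm=/opt/torque --prefix=/opt/openmpi
$ make all install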

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-06-03 Thread Llolsten Kaonga
to run the tests without having to stop the firewalld daemon. Thank you to Gilles and Jeff for your help. -- Llolsten From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet Sent: Tuesday, May 10, 2016 5:06 PM To: Open MPI Users Subject: Re: [OMPI users] mpirun

Re: [OMPI users] mpirun java

2016-05-23 Thread Howard Pritchard
Hi Ralph, Yep, if you could handle this, that would be great. I guess we'd like a fix in 1.10.x and 2.0.1. Howard 2016-05-23 14:59 GMT-06:00 Ralph Castain : > Looks to me like there is a bug in the orterun parser that is trying to > add java library paths - I can take a

Re: [OMPI users] mpirun java

2016-05-23 Thread Ralph Castain
Looks to me like there is a bug in the orterun parser that is trying to add java library paths - I can take a look at it > On May 23, 2016, at 1:05 PM, Claudio Stamile wrote: > > Hi Howard. > > Thank you for your reply. > > I'm using version 1.10.2 > > I executed the following command: > >

Re: [OMPI users] mpirun java

2016-05-23 Thread Claudio Stamile
Hi Howard. Thank you for your reply. I'm using version 1.10.2 I executed the following command: mpirun -np 2 --mca odls_base_verbose 100 java -cp alot:of:jarfile -Djava.library.path=/Users/stamile/Applications/IBM/ILOG/CPLEX_Studio1263/cplex/bin/x86-64_osx clustering.TensorClusterinCplexMPI t

Re: [OMPI users] mpirun java

2016-05-23 Thread Saliya Ekanayake
I tested with OpenMPI 1.10.1 and it works. See this example, which prints java.library.path mpijavac LibPath.java mpirun -np 2 java -Djava.library.path=path LibPath On Mon, May 23, 2016 at 1:38 PM, Howard Pritchard wrote: > Hello Claudio, > > mpirun should be combining your java.library.path o

Re: [OMPI users] mpirun java

2016-05-23 Thread Howard Pritchard
Hello Claudio, mpirun should be combining your java.library.path option with the one needed to add the Open MPI's java bindings as well. Which version of Open MPI are you using? Could you first try to compile the Ring.java code in ompi/examples and run it with the following additional mpirun par

Re: [OMPI users] Mpirun invocation only works in debug mode, hangs in "normal" mode.

2016-05-16 Thread Jeff Squyres (jsquyres)
I'm afraid I don't know what the difference is in systemd between ssh.socket and ssh.service, or why that would change Open MPI's behavior. One other thing to try is to mpirun non-MPI programs, like "hostname", and see if that works. This will help distinguish between problems with Open MPI's run

Re: [OMPI users] Mpirun invocation only works in debug mode, hangs in "normal" mode.

2016-05-14 Thread Andrew Reid
I think I might have fixed this, but I still don't really understand it. In setting up the RPi machines, I followed a config guide that suggested switching the SSH service in systemd to "ssh.socket" instead of "ssh.service". It's supposed to be lighter weight and get you cleaner shut-downs, and I'
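If the socket-activated unit is indeed the culprit, switching back on each node is two standard systemd commands (a sketch, using the unit names from the message above):
$ sudo systemctl disable --now ssh.socket
$ sudo systemctl enable --now ssh.service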

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-12 Thread Jeff Squyres (jsquyres)
>> internal ports. However, we do allow use of one of the external ports which >> we assign a static address. >> >> I thank you. >> -- >> Llolsten >> >> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles >> Gouaillardet >> Se

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-12 Thread Jeff Squyres (jsquyres)
ers [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles > Gouaillardet > Sent: Wednesday, May 11, 2016 11:03 AM > To: Open MPI Users > Subject: Re: [OMPI users] mpirun command won't run unless the firewalld > daemon is disabled > > I am not sure I understand your l

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-12 Thread Jeff Squyres (jsquyres)
ers [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles > Gouaillardet > Sent: Wednesday, May 11, 2016 11:03 AM > To: Open MPI Users > Subject: Re: [OMPI users] mpirun command won't run unless the firewalld > daemon is disabled > > I am not sure I understand your la

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-11 Thread Llolsten Kaonga
Of Gilles Gouaillardet Sent: Wednesday, May 11, 2016 11:03 AM To: Open MPI Users Subject: Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled I am not sure I understand your last message. if MPI only need the internal port, and there is no firewall prote

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-11 Thread Gilles Gouaillardet
rg > ] *On Behalf > Of *Gilles Gouaillardet > *Sent:* Tuesday, May 10, 2016 5:06 PM > *To:* Open MPI Users > > *Subject:* Re: [OMPI users] mpirun command won't run unless the firewalld > daemon is disabled > > > > I was basically suggesting you open a few po

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-11 Thread Llolsten Kaonga
-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet Sent: Tuesday, May 10, 2016 5:06 PM To: Open MPI Users Subject: Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled I was basically suggesting you open a few ports to anyone (e.g. any IP address), and

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-10 Thread Jeff Squyres (jsquyres)
On May 10, 2016, at 5:05 PM, Gilles Gouaillardet wrote: > > I was basically suggesting you open a few ports to anyone (e.g. any IP > address), and Jeff suggests you open all ports to a few trusted IP addresses. +1 -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: htt
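A sketch of the "all ports to a few trusted addresses" variant with firewalld (the subnet is a placeholder for your cluster's private network):
$ sudo firewall-cmd --permanent --zone=trusted --add-source=192.168.1.0/24
$ sudo firewall-cmd --reload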

Re: [OMPI users] mpirun command won't run unless the firewalld daemon is disabled

2016-05-10 Thread Gilles Gouaillardet
g ] On Behalf > Of Jeff Squyres > (jsquyres) > Sent: Tuesday, May 10, 2016 3:47 PM > To: Open MPI User's List > > Subject: Re: [OMPI users] mpirun command won't run unless the firewalld > daemon is disabled > > Open MPI generally needs to be able to communicate on ran
