Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-10 Thread Jeff Squyres (jsquyres)
Mmmm.  Let's rope in our ARM expert here...

Leif, do you know what the issue is here?


On Jan 3, 2013, at 4:28 AM, Lee Eric  wrote:

> Hi,
> 
> I am trying to compile OpenMPI 1.6.3 on a Raspberry Pi and I get the 
> following errors.
> 
> make[2]: Entering directory `/root/openmpi-1.6.3/opal'
>   CC class/opal_bitmap.lo
>   CC class/opal_free_list.lo
>   CC class/opal_hash_table.lo
>   CC class/opal_list.lo
>   CC class/opal_object.lo
> /tmp/ccniCtj0.s: Assembler messages:
> /tmp/ccniCtj0.s:83: Error: selected processor does not support ARM mode `ldrex r3,[r1]'
> /tmp/ccniCtj0.s:86: Error: selected processor does not support ARM mode `strex r4,r0,[r1]'
> make[2]: *** [class/opal_object.lo] Error 1
> make[2]: Leaving directory `/root/openmpi-1.6.3/opal'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/root/openmpi-1.6.3/opal'
> make: *** [all-recursive] Error 1
> 
> Does anyone have an idea how to fix this issue?
> 
> I'm using Fedora 17 rootfs and kernel version is "Linux fedora-arm 3.6.11+ #1 
> PREEMPT Wed Jan 2 15:14:23 CST 2013 armv6l armv6l armv6l GNU/Linux".
> 
> Thanks.
> 
> Eric Lee
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI Java Bindings on Mac OSX

2013-01-10 Thread Jeff Squyres (jsquyres)
With respect to Java support: frankly, Ralph and I have been somewhat surprised 
by the level of interest for Java+MPI!  We wanted Java for some other reasons, 
but didn't really expect *too* much interest from the community.

Unfortunately, the Java package we imported into Open MPI is both a bit dated 
and is known to be buggy in a few cases.  I'd really like to replace it with:

- interfaces that are closer to a 1:1 mapping to the MPI C bindings
- a full set of MPI interfaces

Is this something that you'd be able to help with, perchance?  Neither Ralph 
nor I are Java experts, and we're a bit pressed for resources -- anyone who can 
help here would be greatly appreciated.

See http://www.open-mpi.org/community/lists/devel/2013/01/11915.php.


On Jan 3, 2013, at 5:12 PM, Chuck Yahoo  wrote:

> It's nice to see that this mailing list has a lot of activity!
> 
> Thanks for the tips. I haven't used modules in quite a few years, having been 
> spoiled by Java ;-)
> 
> Takes me back to the good old days, spending 10x more time configuring than 
> coding!
> 
> Chuck
> 
> On Jan 3, 2013, at 1:05 PM, Ralph Castain  wrote:
> 
>> FWIW: I test it regularly on Mountain Lion, without problem. We know that 
>> some of the bindings aren't quite right, particularly on some of the 
>> collectives, but send/recv is fine
>> 
>> 
>> On Jan 3, 2013, at 10:09 AM, "Beatty, Daniel D CIV NAVAIR, 474300D" 
>>  wrote:
>> 
>>> Greetings Chuck, 
>>> I tend to agree with Doug.  I hope to be able to test OpenMPI under 
>>> Lion/Mountain Lion soon.  If someone has already done so, especially with 
>>> Java, that could be quite handy.
>>> 
>>> V/R,
>>> 
>>> Daniel Beatty, Ph.D.
>>> 
>>> On 1/3/13 9:49 AM, "Ralph Castain"  wrote:
>>> 
>>> Hi Doug
>>> 
>>> What modules software do you use on the Mac? Would be nice to know :-)
>>> 
>>> 
>>> On Jan 3, 2013, at 8:34 AM, Doug Reeder  wrote:
>>> 
>>> Chuck,
>>> 
>>> In step 4 you might want to consider the following
>>> 
>>> --prefix=/usr/local/openmpi-1.7rc5
>>> 
>>> and use the modules software to select which version of openmpi to use. I 
>>> have to have multiple versions of openmpi available on my macs and this 
>>> approach has worked well for me.
>>> 
>>> Doug Reeder
>>> On Jan 3, 2013, at 9:22 AM, Chuck Mosher wrote:
>>> 
>>> Hi,
>>> 
>>> I've been trying to get a working version of the MPI java bindings on Mac 
>>> OSX (10.6.8 with Java 1.6.0_37).
>>> 
>>> I ran into a number of issues along the way that I thought I would record 
>>> here for others who might be foolish enough to try the same ;-)
>>> 
>>> The issues I had to spend time with were:
>>> 
>>> 1. Installing a C compiler that can run from the command line
>>> 2. Finding and installing an appropriate Java JDK for my OS version
>>> 3. Building and installing OpenMPI for the first time on a Mac
>>> 4. Conflicts with the existing OpenMPI version 1.2.8 that was installed 
>>> already on my Mac
>>> 5. Figuring out syntax for using the mpirun command line to run java
>>> 6. Odd behavior when trying to use "localhost" or the output from 
>>> `hostname` on the command line or in a hostfile
>>> 
>>> Resolution for each of these in order:
>>> 
>>> 1. Installing a C compiler for the command line
>>> Found a good resource here:
>>> http://www.macobserver.com/tmo/article/install_the_command_line_c_compilers_in_os_x_lion
>>>  
>>> 
>>>  
>>> The solution is to install XCode, then enable command line compilers from 
>>> the XCode console.
>>> 
>>> 2. Finding and installing an appropriate Java JDK for my OS version
>>> Used this resource to eventually figure out what to do:
>>> http://www.wikihow.com/Install-the-JDK-(Java-Development-Kit)-on-Mac-OS-X 
>>>  
>>> It didn't exactly match my setup, but had enough clues.
>>> The solution is to first find your java version (java -version, 1.6.0_37 in 
>>> my case) and then match that version number to the Apple Java update 
>>> version (11 in my case). 
>>> The key document is:
>>> http://developer.apple.com/library/mac/#technotes/tn2002/tn2110.html
>>> which is a table relating Java version numbers to the appropriate "Java for 
>>> Mac OS X xx.x Update xx".
>>> Once you know the update number, you can download the JDK installer from
>>> https://developer.apple.com/downloads/index.action
>>> (an Apple developer ID is required for access).
>>> Enter "java" in the search bar on the left, find the matching Java 
>>> update, and you're good to go.
>>> 
>>> 3. Building and installing OpenMPI for the first time on a Mac
>>> After the usual false starts with a new installation on a new OS, I man

Re: [OMPI users] problem: help-hostfile.txt: Too many open files in system.

2013-01-10 Thread Jeff Squyres (jsquyres)
That's a weird one -- it looks like having too many open files on your system 
is causing a cascading set of failures.  

Are you saying that your program runs for a while and then, on iteration 32, it 
fails with errors like this?  If so, I'd look for a file descriptor leak in 
your program.
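
One way to look for such a leak is to count the process's open descriptors before and after a batch of iterations. The following is a minimal sketch, not taken from the original program: it assumes a Linux-style /proc/self/fd (falling back to /dev/fd), and `leaky_step` and `clean_step` are hypothetical stand-ins for one iteration of the real code.

```python
import os


def open_fd_count():
    """Return the number of file descriptors currently open in this process."""
    # /proc/self/fd exists on Linux; /dev/fd is the closest equivalent elsewhere.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    # Listing the directory briefly opens one extra descriptor, so the absolute
    # value is slightly inflated; differences between two calls stay meaningful.
    return len(os.listdir(fd_dir))


def leaky_step(leaked):
    """Hypothetical iteration that opens a file and never closes it."""
    leaked.append(open(os.devnull))


def clean_step():
    """Hypothetical iteration that closes everything it opens."""
    with open(os.devnull):
        pass


def fd_growth(step, iterations):
    """Run step() repeatedly and return how many descriptors were gained."""
    before = open_fd_count()
    for _ in range(iterations):
        step()
    return open_fd_count() - before


if __name__ == "__main__":
    leaked = []
    print("clean iterations gained:", fd_growth(clean_step, 32))
    print("leaky iterations gained:", fd_growth(lambda: leaky_step(leaked), 32))
    for f in leaked:
        f.close()
```

A count that grows steadily with the iteration number points at a leak in the iteration body; a flat count suggests the descriptors are being consumed elsewhere.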


On Jan 4, 2013, at 12:48 PM, Mariana Vargas Magana  
wrote:

> Hello Open MPI users:
> 
> I was just running a program that usually works well on the cluster and 
> suddenly, in the 32nd iteration, I got this strange set of errors. I would 
> appreciate it if someone could give me a hint about the problem and how to 
> solve it.
> 
> Thanks!
> 
> Mariana
> 
> 
> /usr/bin/ssh: error while loading shared libraries: libcrypt.so.1: cannot 
> open shared object file: Error 23
> /usr/bin/ssh: error while loading shared libraries: libutil.so.1: cannot open 
> shared object file: Error 23
> /usr/bin/ssh: error while loading shared libraries: libfipscheck.so.1: cannot 
> open shared object file: Error 23
> /usr/bin/ssh: error while loading shared libraries: libkrb5.so.3: cannot open 
> shared object file: Error 23
> --
> A daemon (pid 1486) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> 
> There may be more information reported by the environment (see above).
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> --
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --
> --
> Sorry!  You were supposed to get help about:
>    no-hostfile
> But I couldn't open the help file:
>    /home/mvargas/openmpi/share/openmpi/help-hostfile.txt: Too many open files in system.  Sorry!
> --
> [ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file 
> base/ras_base_allocate.c at line 200
> [ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file 
> base/plm_base_launch_support.c at line 99
> [ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file 
> plm_rsh_module.c at line 1167
> --
> Sorry!  You were supposed to get help about:
>    no-hostfile
> But I couldn't open the help file:
>    /home/mvargas/openmpi/share/openmpi/help-hostfile.txt: Too many open files in system.  Sorry!
> --
> [ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file 
> base/ras_base_allocate.c at line 200
> [ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file 
> base/plm_base_launch_support.c at line 99
> [ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file 
> plm_rsh_module.c at line 1167




Re: [OMPI users] problem: help-hostfile.txt: Too many open files in system.

2013-01-10 Thread Ralph Castain
What is even stranger is that the error occurs when attempting to launch a 
daemon! Does your program do a series of comm_spawns?
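
The "Error 23" in the ssh lines and the "Too many open files in system" text both correspond to ENFILE, which refers to the system-wide file table rather than a per-process limit. A minimal sketch for watching that table between launches, assuming a Linux node where /proc/sys/fs/file-nr is available (the function names are illustrative only):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def parse_file_nr(text):
    """Parse the three fields of Linux's /proc/sys/fs/file-nr.

    The fields are: allocated file handles, allocated-but-unused handles,
    and the system-wide maximum (fs.file-max).
    """
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def system_file_usage(path="/proc/sys/fs/file-nr"):
    """Read the current system-wide handle usage (Linux only)."""
    with open(path) as f:
        return parse_file_nr(f.read())


if __name__ == "__main__":
    print(describe_errno(23))
    if os.path.exists("/proc/sys/fs/file-nr"):
        usage = system_file_usage()
        percent = 100.0 * usage["allocated"] / usage["max"]
        print(usage, "in use: %.1f%%" % percent)
```

If the allocated count climbs toward the maximum as the job iterates, something on the node (not necessarily the MPI program itself) is holding handles open across iterations.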




Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-10 Thread George Bosilca
A little bit of googling shows that this is a known issue. ldrex and strex are 
not included in the default instruction set gcc uses (arm6). One has to add the 
compile flag "-march=armv6k" to compile successfully.

  George.

PS: For more info: 
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=9&t=4256&start=250






Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-10 Thread Jeff Squyres (jsquyres)
Sadly, none of these solutions worked for me on my RPi:

-
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ make CCASFLAGS=-mcpu=arm1176jzf-s
  CPPAS  atomic-asm.lo
atomic-asm.S: Assembler messages:
atomic-asm.S:7: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:15: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:23: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:55: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:70: Error: selected processor does not support ARM mode `dmb'
make: *** [atomic-asm.lo] Error 1
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ make CCASFLAGS=-march=armv6zk
  CPPAS  atomic-asm.lo
atomic-asm.S: Assembler messages:
atomic-asm.S:7: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:15: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:23: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:55: Error: selected processor does not support ARM mode `dmb'
atomic-asm.S:70: Error: selected processor does not support ARM mode `dmb'
make: *** [atomic-asm.lo] Error 1
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ make CCASFLAGS=-march=argv6k
  CPPAS  atomic-asm.lo
cc1: error: bad value (argv6k) for -march switch
make: *** [atomic-asm.lo] Error 1
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ 
-

Note, though, that I'm using a slightly different system than the one the 
original user cited (I'm running the latest Raspbian distro):

-
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ uname -a
Linux raspberrypi 3.2.27+ #250 PREEMPT Thu Oct 18 19:03:02 BST 2012 armv6l GNU/Linux
pi@raspberrypi ~/openmpi-1.6.3/opal/asm $ gcc --version
gcc (Debian 4.6.3-12+rpi1) 4.6.3
-
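
When comparing several flag combinations like this, it can help to reduce each build log to the set of instructions the assembler rejected. The sketch below is an illustration only: the regular expression is modelled on the error lines quoted in this thread, and the function name is hypothetical. Note that the third run above never reaches the assembler, because cc1 rejects `argv6k` as a bad -march value, so that transcript does not show what `armv6k` would have done. As far as I know, `dmb` is an ARMv7 instruction, which would be consistent with the armv6 flags not helping, but treat that as an assumption.

```python
import re

# Matches GNU assembler lines such as:
#   atomic-asm.S:7: Error: selected processor does not support ARM mode `dmb'
UNSUPPORTED = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+): Error: selected processor does not "
    r"support ARM mode `(?P<insn>[^']+)'"
)


def unsupported_instructions(log_text):
    """Map each rejected mnemonic to the (file, line) places it was reported."""
    found = {}
    for raw in log_text.splitlines():
        # Drop mail quote markers ("> ", ">> ") and surrounding whitespace.
        line = raw.lstrip("> ").rstrip()
        match = UNSUPPORTED.match(line)
        if match:
            mnemonic = match.group("insn").split()[0]
            location = (match.group("file"), int(match.group("line")))
            found.setdefault(mnemonic, []).append(location)
    return found


if __name__ == "__main__":
    sample = "\n".join([
        "atomic-asm.S: Assembler messages:",
        "atomic-asm.S:7: Error: selected processor does not support ARM mode `dmb'",
        "atomic-asm.S:15: Error: selected processor does not support ARM mode `dmb'",
        "cc1: error: bad value (argv6k) for -march switch",
    ])
    for mnemonic, places in sorted(unsupported_instructions(sample).items()):
        print(mnemonic, places)
```

The parser expects each error on a single line, so logs that were wrapped by a mail client need to be rejoined first.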
