[OMPI users] [patch] test(1) "==" is not portable, use "="

2012-10-31 Thread Aleksej Saushev
  Hello,

Diff against openmpi-1.7rc5r27536

patch-orte_config_orte__setup__hadoop.m4
Description: test "==" portability fix

-- 
HE CE3OH...


Re: [OMPI users] OpenMPI on Windows when MPI_F77 is used from a C application

2012-10-31 Thread Mathieu Gontier
I do not know either :-/

On Tue, Oct 30, 2012 at 2:37 PM, Jeff Squyres wrote:

> What's errno=108 on your platform?
>
> On Oct 30, 2012, at 9:22 AM, Damien Hocking wrote:
>
> > I've never seen that, but someone else might have.
> >
> > Damien
> >
> > On 30/10/2012 1:43 AM, Mathieu Gontier wrote:
> >> Hi Damien,
> >>
> >> The only message I have is:
> >> [vs2010:09300] [[56007,0],0]-[[56007,1],0] mca_oob_tcp_msg_recv: readv failed: Unknown error (108)
> >> [vs2010:09300] 2 more processes have sent help message help-odls-default.txt / odls-default:could-not-kill
> >>
> >> Does it mean anything to you?
> >>
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/



-- 
Mathieu Gontier
- MSN: mathieu.gont...@gmail.com
- Skype: mathieu_gontier


Re: [OMPI users] [patch] test(1) "==" is not portable, use "="

2012-10-31 Thread Aleksej Saushev
  Hello,

(Once again...)

Diff against openmpi-1.7rc5r27536



patch-orte_config_orte__setup__hadoop.m4
Description: test "==" portability fix
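
For readers without the attachment, here is a minimal sketch of the kind of
change the subject line describes; it is not the patch itself. POSIX test(1)
only specifies "=" for string equality, while "==" is an extension accepted by
bash and some other shells, so a configure fragment using "==" may fail under a
stricter /bin/sh. The helper name same_string and the sample values are
hypothetical.

```shell
#!/bin/sh
# same_string A B
# Hypothetical helper: prints "yes" if the two strings are equal, else "no".
same_string() {
    # "=" is the operator POSIX specifies for test(1).
    # Writing "==" here works in bash but may be rejected by other shells.
    if test "x$1" = "x$2"; then
        echo yes
    else
        echo no
    fi
}

same_string hadoop hadoop   # prints "yes"
same_string hadoop ""       # prints "no"
```

The "x" prefix is an old habit that keeps the comparison safe when an operand
is empty or starts with a dash.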

-- 
HE CE3OH...


[OMPI users] bug (?) opal_path_access incorrect call

2012-10-31 Thread marco atzeri

While looking for a solution to
http://www.open-mpi.org/community/lists/users/2012/10/20495.php

I noticed that the issue disappears on 1.6.2 with the patch:


--- opal/util/path.c~   2012-04-03 16:29:52.0 +0200
+++ opal/util/path.c    2012-10-30 20:31:43.772749400 +0100
@@ -82,7 +82,7 @@

     /* If absolute path is given, return it without searching. */
     if( opal_path_is_absolute(fname) ) {
-        return opal_path_access(fname, "", mode);
+        return opal_path_access(fname, NULL , mode);
     }

     /* Initialize. */



From what I can see in the function body, the test on path
expects path to be a null pointer, not a
pointer to an empty string:


char *opal_path_access(char *fname, char *path, int mode)
{
    char *fullpath = NULL;
    struct stat buf;

    /* Allocate space for the full pathname. */
    if (NULL == path) {
        fullpath = opal_os_path(false, fname, NULL);
    } else {
        fullpath = opal_os_path(false, path, fname, NULL);
    }
    if (NULL == fullpath)
        return NULL;


Regards
Marco


Re: [OMPI users] OpenMPI on Windows when MPI_F77 is used from a C application

2012-10-31 Thread Jeff Squyres
You might want to find errno.h on your machine and see what the #define'd name 
for 108 is.
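
A quick way to do that lookup is to grep the header for the numeric value.
This is only a sketch: errno_name is a hypothetical helper, and the header path
is a placeholder, since the location of errno.h depends on the compiler and
platform (on some systems the definitions live in headers that errno.h
includes, so several files may need to be passed).

```shell
#!/bin/sh
# errno_name NUMBER HEADER...
# Hypothetical helper: print the "#define ENAME NUMBER" lines found in the
# given header files.
errno_name() {
    num=$1
    shift
    # Match "#define E<NAME> <num>" and make sure <num> is not just the
    # prefix of a longer number (108 must not match 1080).
    grep -hE "^[[:space:]]*#[[:space:]]*define[[:space:]]+E[A-Z0-9_]+[[:space:]]+${num}([^0-9]|\$)" "$@"
}

# Example call (the path is a placeholder, adjust it to your toolchain):
#   errno_name 108 /path/to/include/errno.h
```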


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] [patch] test(1) "==" is not portable, use "="

2012-10-31 Thread Ralph Castain
Got it - thanks!




Re: [OMPI users] bug (?) opal_path_access incorrect call

2012-10-31 Thread Ralph Castain
Wow - you are quite correct. Thanks for chasing this down!!




[OMPI users] Java bindings failed to load required libraries

2012-10-31 Thread Georg Ruzicka
Hello again.

The fault is still there and I can't locate it.

It seems the first part of the message comes from the file
ompi/mpi/java/c/mpi_MPI.c   :   NO LT_DLADVISE - CANNOT LOAD LIBOMPI
and the second part from
ompi/mpi/java/java/MPI.java

I ran 'make install' in both directories.
As a result I get an mpi.jar and the libmpi_java libraries installed in
/buildpath/lib
of my Open MPI installation.

I also searched for libompi but can't find anything.
I have a 'libmpi.la' library in my /lib directory but no libompi.


Any ideas?

Thanks Georg





- Original Message -
Subject: Java bindings failed to load required libraries

Hello.

I installed Open MPI and tried to run the examples.
I used the developer trunk.
The C, C++ and Fortran 90 examples compile and run fine.

When I tried to run the compiled Hello.java class,
I got these messages:

georg@ThinkPad-R61:~/ompi-svn/examples$ mpirun java Hello
[ThinkPad-R61:19720] NO LT_DLADVISE - CANNOT LOAD LIBOMPI
JAVA BINDINGS FAILED TO LOAD REQUIRED LIBRARIES
---
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
---
--
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[43465,1],0]
  Exit code:    1
--


I configured with:
./configure --prefix=/home/georg/ompi-install1 --with-platform=optimized \
    --enable-mpi-java --with-jdk-dir=/opt/jdk1.7.0_09

I work with Ubuntu 10.10.

I added to .bashrc:
export PATH=$PATH:/home/georg/tools/installed/bin:/home/georg/ompi-install1/bin:/opt/jdk1.7.0_09/bin:/opt/jdk1.7.0_09
export LD_LIBRARY_PATH=$LB_LIBRARY_PATH:/home/georg/ompi-install1/lib:/home/georg/ompi-install1/lib/openmpi:/home/georg/ompi-install1/lib/pkgconfig

I can compile and run Java programs.

Does anyone know what the fault is?

Thanks.






Re: [OMPI users] Java bindings failed to load required libraries

2012-10-31 Thread Ralph Castain
The "libompi" was just a shorthand way of saying it can't load the OMPI 
libraries - it actually is looking for the libopen-rte in your /lib directory.

But that isn't the problem. The problem is that we require ltdladvise in order 
to correctly load those libraries. Given that you are building from an svn 
checkout, we require that this be installed on your machine. Ubuntu does not 
install this by default, so you have to do it yourself.

You can do this in two ways:

1. you could look for a libtool package and install it. For example, yum shows 
it has "libtool-ltdl" that you could load

2. you could go to the gnu site and download libtool yourself, build it, and 
then ensure it is in your default path

Either way will work.
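
One way to check whether the installed libltdl headers provide what is needed
is to look for lt_dladvise in ltdl.h. A rough sketch, assuming the header
lives at /usr/include/ltdl.h (the path and the helper name has_lt_dladvise are
assumptions, and this is not how Open MPI's own configure test works):

```shell
#!/bin/sh
# has_lt_dladvise HEADER
# Hypothetical helper: succeeds if HEADER is readable and mentions lt_dladvise.
has_lt_dladvise() {
    [ -r "$1" ] && grep -q 'lt_dladvise' "$1"
}

# /usr/include/ltdl.h is an assumed location; adjust it for your system.
if has_lt_dladvise /usr/include/ltdl.h; then
    echo "ltdl.h declares lt_dladvise"
else
    echo "lt_dladvise not found: install the libtool/libltdl development files"
fi
```

If the header is missing or too old, install the package as described above
and rebuild from a fresh configure.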





Re: [OMPI users] Java bindings failed to load required libraries

2012-10-31 Thread Jeff Squyres
On Oct 31, 2012, at 10:23 AM, Ralph Castain wrote:

> 2. you could go to the gnu site and download libtool yourself, build it, and 
> then ensure it is in your default path

If you go this route, read the HACKING document at the top-level Open MPI 
directory.  It has directions on how to build/install the GNU Autotools 
(including Libtool).

Just to be clear: you can't build/install Libtool all by itself.  If you go the 
#2 route, you need to build/install all the GNU Autotools (which takes about 5 
mins -- they're small and easy to build/install).

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Multirail + Open MPI 1.6.1 = very big latency for the first communication

2012-10-31 Thread Paul Kapinos

Hello all,

Open MPI is clever and by default uses multiple IB adapters, if available.
http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup

Open MPI is lazy and establishes connections only if needed.

Both are good.

We have somewhat special nodes: up to 16 sockets, 128 cores, 4 boards, 4 IB cards.
Multirail works!


The crucial thing is that, starting with v1.6.1, the very first PingPong
sample between two nodes takes a really long time: some 100x - 200x the
usual latency. You cannot see this with the usual latency benchmarks(*)
because they tend to omit the first samples as a "warmup phase", but we use a
self-written parallel test which clearly shows it (and left me puzzling for
some days).
If multirail is disabled (-mca btl_openib_max_btls 1), or if v1.5.3 is used, or
if the MPI processes are preconnected
(http://www.open-mpi.org/faq/?category=running#mpi-preconnect), there are no such
huge latency outliers for the first sample.
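
The three cases compared above could be reproduced with command lines along
these lines. This is a sketch only: ./pingpong stands for a hypothetical
two-rank benchmark binary, host placement options are omitted, and
mpi_preconnect_mpi is the parameter name as I understand it from the FAQ entry
linked above, so check it against your version with ompi_info.

```shell
#!/bin/sh
# pingpong_cmd VARIANT
# Print (do not run) the mpirun command line for one of the cases.
# ./pingpong is a hypothetical benchmark binary, not part of Open MPI.
pingpong_cmd() {
    case "$1" in
        default)    echo "mpirun -np 2 ./pingpong" ;;
        singlerail) echo "mpirun -np 2 -mca btl_openib_max_btls 1 ./pingpong" ;;
        preconnect) echo "mpirun -np 2 -mca mpi_preconnect_mpi 1 ./pingpong" ;;
        *)          echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Dry run: list the command for each case.
for v in default singlerail preconnect; do
    pingpong_cmd "$v"
done
```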


Well, we know about the warm-up and lazy connections.

But 200x ?!

Any comments on whether this is OK?

Best,

Paul Kapinos

(*) E.g. HPCC explicitly says in http://icl.cs.utk.edu/hpcc/faq/index.html#132
> Additional startup latencies are masked out by starting the measurement after
> one non-measured ping-pong.

P.S. Sorry for cross-posting to both Users and Developers, but my last questions 
to Users have had no reply yet, so I am trying to broadcast...



--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915





[OMPI users] tester for cygwin openmpi-1.6.3 package

2012-10-31 Thread marco atzeri

Hi,
I built and packaged openmpi-1.6.3 for cygwin.
Before deploying it as an official package, I would
like feedback from testers.

Source and binary here:
http://matzeri.altervista.org/cygwin-1.7/openmpi/

To install using the Cygwin setup program:
setup.exe -X -O -s http://matzeri.altervista.org

Current configuration is:

 LDFLAGS="-Wl,--export-all-symbols -no-undefined" \
 --disable-mca-dso \
 --without-udapl \
 --enable-cxx-exceptions \
 --with-threads=posix \
 --without-cs-fs \
 --enable-heterogeneous \
 --with-mpi-param_check=always \
 --enable-contrib-no-build=vt,libompitrace \
 --enable-mca-nobuild=memory_mallopt,paffinity,installdirs-windows,timer-windows,shmem-sysv


The only additional patch is:
https://svn.open-mpi.org/trac/ompi/changeset/27539

C, C++ and Fortran pass basic tests

$ time mpirun -n 4 ./hello_f90.exe
 Hello, world, I am 0 of 4
 Hello, world, I am 2 of 4
 Hello, world, I am 1 of 4
 Hello, world, I am 3 of 4

real    1m9.607s
user    0m1.542s
sys     0m2.135s

But I guess there is a long delay/timeout on startup.

Regards
Marco