Re: [OMPI users] Error compiling openmpi-1.6.4a1r27766 on Solaris 10
Hello Siegmar,

this problem is already fixed in the OMPI trunk (r27770), but has not yet been moved to the v1.6 branch. To make it work for the v1.6 branch, just copy the following files from the trunk to your v1.6 checkout and re-run autogen.sh:

ompi/contrib/vt/vt/config/m4/acinclude.execwrap.m4
ompi/contrib/vt/vt/vtlib/vt_execwrap.c

Regards,
Matthias Jurenz

> Hi
> I tried to build openmpi-1.6.4a1r27766 on Solaris 10 Sparc and x86_64 with Sun C 5.12 and gcc-4.7.1 and got the following error for all combinations.
> ...
>   CC     vt_execwrap.lo
> "../../../../../../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c", line 187: warning: implicit function declaration: VTTHRD_MALLOC_TRACING_ENABLED
> "../../../../../../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c", line 358: undefined symbol: environ
> "../../../../../../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c", line 358: warning: improper pointer/integer combination: op "="
> "../../../../../../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c", line 410: undefined symbol: environ
> "../../../../../../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c", line 410: warning: improper pointer/integer combination: op "="
> cc: acomp failed for .../openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/vt_execwrap.c
> make[5]: *** [vt_execwrap.lo] Error 1
> make[5]: Leaving directory `.../ompi/contrib/vt/vt/vtlib'
> make[4]: *** [all-recursive] Error 1
> ...
> I would be grateful if you could solve the problem. Thank you very much for your help in advance.
> Kind regards
> Siegmar
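For reference, the copy Matthias describes might look roughly like this, assuming a trunk checkout and the 1.6.4a1r27766 tree sit side by side (the directory names are only illustrative):

    cp ompi-trunk/ompi/contrib/vt/vt/config/m4/acinclude.execwrap.m4 \
       openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/config/m4/
    cp ompi-trunk/ompi/contrib/vt/vt/vtlib/vt_execwrap.c \
       openmpi-1.6.4a1r27766/ompi/contrib/vt/vt/vtlib/
    cd openmpi-1.6.4a1r27766 && ./autogen.sh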
[OMPI users] Error running program : mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor
Hello all.

I want to learn MPI and I've been trying to set up OMPI for the first time on three nodes. My configuration is as follows:

Ubuntu Server - master node: pruebaborja
2x Ubuntu Desktop - slave nodes:
  clienteprueba
  clientepruebados (4 slots)

I'm running NFSv4 to share /home/mpiuser. I want to test a plain "Hello world" but I can't make it work on node clienteprueba. This is the problem:

mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo
[clienteprueba:01993] [[64434,0], 2] -> [[64434,0],0] mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor (9) [sd = 9]
[clienteprueba:01993] [[64434,0], 2] routed:binomial: Connection to lifeline [[64434,0],0] lost

However, with only clientepruebados and pruebaborja in my hostfile, it works:

pruebaborja slots=1
clientepruebados slots=4
#clienteprueba slots=1

mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo
Hola, mundo, soy pruebaborja: 0 de 6
Hola, mundo, soy pruebaborja: 5 de 6
Hola, mundo, soy clientepruebados: 1 de 6
Hola, mundo, soy clientepruebados: 2 de 6
Hola, mundo, soy clientepruebados: 3 de 6
Hola, mundo, soy clientepruebados: 4 de 6

I've checked the OMPI versions on the machines and they are the same. I can't understand why I'm getting this error on clienteprueba; I've done the same configuration on clientepruebados and clienteprueba. Could anyone help me solve this?

Sorry for my English.
Thanks in advance
Re: [OMPI users] Initializing OMPI with invoking the array constructor on Fortran derived types causes the executable to crash
FWIW, I can replicate the behavior with gfortran 4.7.2:

- the program runs "fine" with no MPI_Init/MPI_Finalize
- the program dumps core when MPI_Init/MPI_Finalize are called (at the "conc = [ xx, yy ]" statement)

I notice that even if I disable Open MPI's memory hooking, the core dump still occurs.

Sidenote: there are a few ways to disable OMPI's memory hooking; one of the easiest is to set the environment variable FAKEROOTKEY to any value, because OMPI disables its memory hooking in Debian fakeroot environments. For example:

    % setenv FAKEROOTKEY 0
    % mpifort -g arrays.f90 && ./a.out
    *** glibc detected *** ./a.out: free(): invalid pointer: 0x00369ef9cf48 ***
    ...etc.

Specifically: with OMPI's memory hooking disabled, we don't modify the behavior of malloc/free/memalign/realloc.

I'm not sure what Open MPI is doing to anger the gfortran gods, but I did note that when I run the program without MPI_Init/MPI_Finalize, valgrind complains:

    ==7269== Conditional jump or move depends on uninitialised value(s)
    ==7269==    at 0x4015B0: MAIN__ (arrays.f90:20)
    ==7269==    by 0x401795: main (arrays.f90:26)

Line 20 is the "conc = [ xx, yy ]" statement. I'm not enough of a Fortran guru to know what that means (to my eyes, xx and yy were just initialized above that -- perhaps it's complaining about conc?), but there you go. :-)

On Jan 14, 2013, at 6:08 AM, Stefan Mauerberger wrote:

> Well, I failed to emphasize one thing: it is my intention to exploit
> F2003's lhs-(re)allocate feature. Meaning, it is totally legal in F03 to
> write something like this:
>    integer, allocatable :: array(:)
>    array = [ 1,2,3,4 ]
>    array = [ 1 ]
> where 'array' gets automatically (re)allocated. One more thing I should
> mention: in case 'array' is manually allocated, everything is fine.
>
> Ok, let's do a little case study and make my suggested minimal example a
> little more exhaustive:
>
>    PROGRAM main
>
>       IMPLICIT NONE
>       !INCLUDE 'mpif.h'
>
>       INTEGER :: ierr
>
>       TYPE :: test_typ
>          REAL, ALLOCATABLE :: a(:)
>       END TYPE
>
>       TYPE(test_typ) :: xx, yy
>       TYPE(test_typ), ALLOCATABLE :: conc(:)
>
>       !CALL mpi_init( ierr )
>
>       xx = test_typ( a=[1.0] )
>       yy = test_typ( a=[2.0,1.0] )
>
>       conc = [ xx, yy ]
>
>       WRITE(*,*) SIZE(conc)
>
>       !CALL mpi_finalize( ierr )
>
>    END PROGRAM main
>
> Note: for the beginning, all MPI stuff is commented out; xx and yy are
> initialized and their member variable 'a' is allocated.
>
> For now, assume it is purely serial. That piece of code compiles and
> runs properly with:
>  * gfortran 4.7.1, 4.7.2 and 4.8.0 (experimental)
>  * ifort 12.1 and 13.0 (-assume realloc_lhs)
>  * nagfor 5.3
> On the contrary, it terminates with a segfault with:
>  * pgfortran 12.9
> Well, for the following let's simply drop PGI. In addition, according to
> 'The Fortran 2003 Handbook' published by Springer in 2009, the
> usage of the array constructor [...] is appropriate and valid.
>
> As a second step, let's try to compile and run it invoking OMPI, just
> considering INCLUDE 'mpif.h':
>  * gfortran: all right
>  * ifort: all right
>  * nagfor: all right
>
> Finally, let's initialize MPI by calling MPI_Init() and MPI_Finalize():
>  * gfortran + OMPI: *** glibc detected *** ./a.out: free(): invalid pointer ...
>  * gfortran + Intel-MPI: *** glibc detected *** ./a.out: free(): invalid pointer ...
>  * ifort + OMPI: all right
>  * nagfor + OMPI: all right (-thread_safe)
>
> Well, you are right, this is a very strong indication to blame gfortran
> for that! However, it gets even more confusing. Instead of linking
> against OMPI, the following results are obtained by invoking IBM's MPI
> implementation:
>  * gfortran + IBM-MPI: all right
>  * ifort + IBM-MPI: all right
> Isn't that weird?
>
> Any suggestions? Might it be useful to submit a bug report to GCC
> developers?
>
> Cheers,
> Stefan
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
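For anyone trying to reproduce this, a minimal sketch of a valgrind run like the one described above (arrays.f90 is the test program from the thread; --track-origins=yes is optional but usually makes the "uninitialised value" reports easier to trace):

    % mpifort -g arrays.f90
    % valgrind --track-origins=yes ./a.out                 # serial case, MPI calls commented out
    % mpirun -np 1 valgrind --track-origins=yes ./a.out    # with MPI_Init/MPI_Finalize enabled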
Re: [OMPI users] Error running program : mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor
Try disabling firewalling between your nodes. The easiest way is "sudo service iptables stop". On Jan 16, 2013, at 7:46 AM, borja mf wrote: > Hello all. > I want to learn MPI and I've trying to setting up OMPI for first time on > three nodes. My config above: > Ubuntu server - master node: pruebaborja > 2x Ubuntu Desktop - slaves node: > clienteprueba > clientepruebados 4 slots > > Im running NFSv4 for sharing /home/mpiuser. > I want to test a plain "Hello world"but I can't make it working successfully > on node clienteprueba. This is the problem: > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > [clienteprueba:01993] [[64434,0], 2] -> [[64434,0],0] > mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor (9) [sd = 9] > [clienteprueba:01993] [[64434,0], 2] routed:binomial: Connection to lifeline > [[64434,0],0] lost > > However, with clientepruebados and pruebaborja only on my hostfile, it works: > > pruebaborja slots=1 > clientepruebados slots=4 > #clienteprueba slots=1 > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > Hola, mundo, soy pruebaborja: 0 de 6 > Hola, mundo, soy pruebaborja: 5 de 6 > Hola, mundo, soy clientepruebados: 1 de 6 > Hola, mundo, soy clientepruebados: 2 de 6 > Hola, mundo, soy clientepruebados: 3 de 6 > Hola, mundo, soy clientepruebados: 4 de 6 > > I've checked the OMPI versions on the machines and it's the same. I can't > understand why Im getting this error on clienteprueba; i've done the same > config on clientepruebados and clienteprueba. Anyone could help me to solve > this? > > Sorry for my english. > Thanks in advance > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
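Note that on Ubuntu (which these nodes are) the firewall is usually managed through ufw rather than an iptables init script, so the equivalent checks would presumably be something like:

    sudo ufw status      # is the firewall active at all?
    sudo ufw disable     # temporarily switch it off while testing
    sudo iptables -L -n  # double-check that no filtering rules remain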
Re: [OMPI users] Windows MPI with Fortran calling programs
On Jan 14, 2013, at 8:57 AM, Said Elnoshokaty wrote: > Parallel processing is needed to speed up processing of large-scale master > and sub-problems. 32-bit Microsoft Access 2007 is used to capture data and > then calls a DLL program written in 32-bit Microsoft Fortran 90 for > processing (to be distributed in parallel among master and sub-problems). > Operating system is 64-bit Windows 7. Hardware is PCs core i3 and i5. Network > is Ethernet 5. Please advise on the possibility of having MPI installed on > this platform and how to install, if possible. Open MPI recently lost its Windows developer, and Windows support has been removed from the upcoming v1.7 release. Your best bet is likely to use Microsoft MPI. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] help me understand these error msgs
Hello,

I have a large Fortran code processing data (weather forecast). It runs OK with a smaller dataset, but with a larger dataset I get some errors I've never seen before:

node061:05144] [[55141,0],11]->[[55141,0],0] mca_oob_tcp_msg_send_handler: writev failed: Bad file descriptor (9) [sd = 9]
[node061:05144] [[55141,0],11] routed:binomial: Connection to lifeline [[55141,0],0] lost

and

node084:7.0.Non-fatal temporary exhaustion of send tid dma descriptors (elapsed=43.788s, source LID=0x49/context=11, count=1) (err=0)

I'm using QLogic software version 7.1.0.0.58 (OFED 1.5.4.1, Open MPI 1.4.3). I'm starting this program with mpirun -mca btl openib,sm,self, so I don't really understand what TCP has to do with the first error message.

Also, I traced the second error message to PSM code, but it appears even if I add -mca mtl ^psm to my mpirun arguments. Why?

Any help appreciated.

-- 
Jure Pečar
http://jure.pecar.org
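One suggestion not made in the thread, offered as a hedged sketch: the psm MTL is only used through the cm PML, so explicitly forcing the ob1 PML should keep PSM out of the MPI transfer path altogether (whether that also silences the TID-descriptor warning is a separate question). The executable name below is just a placeholder:

    ompi_info | grep -i mtl                          # is the psm MTL built into this install?
    mpirun -mca pml ob1 -mca btl openib,sm,self ./forecast.exe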
Re: [OMPI users] help me understand these error msgs
On Jan 16, 2013, at 7:41 AM, Jure Pečar wrote: > > Hello, > > I have a large fortran code processing data (weather forecast). It runs ok > with smaller dataset, but on larger dataset I get some errors I've never seen > before: > > node061:05144] [[55141,0],11]->[[55141,0],0] mca_oob_tcp_msg_send_handler: > writev failed: Bad file descriptor (9) [sd = 9] > [node061:05144] [[55141,0],11] routed:binomial: Connection to lifeline > [[55141,0],0] lost This one means that a backend node lost its connection to mpirun. We use a TCP socket between the daemon on a node and mpirun to launch the processes and to detect if/when that node fails for some reason. > > and > > node084:7.0.Non-fatal temporary exhaustion of send tid dma descriptors > (elapsed=43.788s, source LID=0x49/context=11, count=1) (err=0) > > I'm using QLogic software version 7.1.0.0.58 (ofed 1.5.4.1, open-mpi 1.4.3). > > I'm starting this program with mpirun -mca btl openib,sm,self so I don't > really understand what tcp has to do in the first error message. > > Also I traced second error message to psm code, but it appears even if i add > -mca mtl ^psm to my mpirun arguments. Why? > > Any help appreciated. > > > -- > > Jure Pečar > http://jure.pecar.org > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
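If the lifeline drop turns out to be a networking problem rather than the node itself dying, one thing worth trying (a hedged suggestion, not something from this thread) is pinning the out-of-band channel to a known-good interface; eth0 below is only an example, use whichever interface actually reaches the mpirun node:

    mpirun -mca oob_tcp_if_include eth0 -mca btl openib,sm,self ./your_app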
Re: [OMPI users] Error running program : mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor
Getting the same error...

I forgot to say that I have to use Ubuntu and I'm compiling with mpicc. My code is written in C.

Thanks for the answer. I'm going crazy with this problem; there's not much info about it.

2013/1/16 Jeff Squyres (jsquyres)

> Try disabling firewalling between your nodes. The easiest way is "sudo
> service iptables stop".
>
> On Jan 16, 2013, at 7:46 AM, borja mf wrote:
>
> > Hello all.
> > I want to learn MPI and I've trying to setting up OMPI for first time on three nodes. My config above:
> > Ubuntu server - master node: pruebaborja
> > 2x Ubuntu Desktop - slaves node:
> > clienteprueba
> > clientepruebados 4 slots
> >
> > Im running NFSv4 for sharing /home/mpiuser.
> > I want to test a plain "Hello world" but I can't make it working successfully on node clienteprueba. This is the problem:
> >
> > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo
> > [clienteprueba:01993] [[64434,0], 2] -> [[64434,0],0] mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor (9) [sd = 9]
> > [clienteprueba:01993] [[64434,0], 2] routed:binomial: Connection to lifeline [[64434,0],0] lost
> >
> > However, with clientepruebados and pruebaborja only on my hostfile, it works:
> >
> > pruebaborja slots=1
> > clientepruebados slots=4
> > #clienteprueba slots=1
> >
> > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo
> > Hola, mundo, soy pruebaborja: 0 de 6
> > Hola, mundo, soy pruebaborja: 5 de 6
> > Hola, mundo, soy clientepruebados: 1 de 6
> > Hola, mundo, soy clientepruebados: 2 de 6
> > Hola, mundo, soy clientepruebados: 3 de 6
> > Hola, mundo, soy clientepruebados: 4 de 6
> >
> > I've checked the OMPI versions on the machines and it's the same. I can't understand why Im getting this error on clienteprueba; i've done the same config on clientepruebados and clienteprueba. Anyone could help me to solve this?
> >
> > Sorry for my english.
> > Thanks in advance
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
Re: [OMPI users] Error running program : mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor
If you login to eprueba and try to ping pruebaborja, can you do it? What network is it using? Sometimes the problem is that you have multiple ethernet interfaces on the machines and we pick the wrong one - i.e., one that cannot connect to the other machine. There are ways to help resolve the problem if that's the case, but first check to see. Also, if you configure OMPI --enable-debug, there are diagnostics you can enable that will help debug the problem. On Jan 16, 2013, at 7:59 AM, borja mf wrote: > Getting the same error... > I forgot to say that I must to use Ubuntu and Im compiling with mpicc. My > code is written on C. > > Thank for answer. > > Im going crazy with this problem. There's not much info about. > > 2013/1/16 Jeff Squyres (jsquyres) > Try disabling firewalling between your nodes. The easiest way is "sudo > service iptables stop". > > > On Jan 16, 2013, at 7:46 AM, borja mf > wrote: > > > Hello all. > > I want to learn MPI and I've trying to setting up OMPI for first time on > > three nodes. My config above: > > Ubuntu server - master node: pruebaborja > > 2x Ubuntu Desktop - slaves node: > > clienteprueba > > clientepruebados 4 slots > > > > Im running NFSv4 for sharing /home/mpiuser. > > I want to test a plain "Hello world"but I can't make it working > > successfully on node clienteprueba. This is the problem: > > > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > > [clienteprueba:01993] [[64434,0], 2] -> [[64434,0],0] > > mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor (9) [sd = > > 9] > > [clienteprueba:01993] [[64434,0], 2] routed:binomial: Connection to > > lifeline [[64434,0],0] lost > > > > However, with clientepruebados and pruebaborja only on my hostfile, it > > works: > > > > pruebaborja slots=1 > > clientepruebados slots=4 > > #clienteprueba slots=1 > > > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > > Hola, mundo, soy pruebaborja: 0 de 6 > > Hola, mundo, soy pruebaborja: 5 de 6 > > Hola, mundo, soy clientepruebados: 1 de 6 > > Hola, mundo, soy clientepruebados: 2 de 6 > > Hola, mundo, soy clientepruebados: 3 de 6 > > Hola, mundo, soy clientepruebados: 4 de 6 > > > > I've checked the OMPI versions on the machines and it's the same. I can't > > understand why Im getting this error on clienteprueba; i've done the same > > config on clientepruebados and clienteprueba. Anyone could help me to > > solve this? > > > > Sorry for my english. > > Thanks in advance > > > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
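A quick sketch of that check, run from clienteprueba (interface names will differ from machine to machine):

    ping -c 3 pruebaborja     # can the slave reach the master by the name in the hostfile?
    ip addr show              # which interfaces/addresses are configured on this node?
    getent hosts pruebaborja  # does the name resolve to the address you expect?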
Re: [OMPI users] Error running program : mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor
Make sure you disable iptables on all the nodes. Also, check that all your IP interfaces are configured correctly. Do you have IP interfaces for only real ethernet connections and loopback? Or do you have other interfaces (e.g., for virtual machines)? If you have interfaces for virtual machines, you'll need to exclude them from Open MPI -- see http://www.open-mpi.org/faq/?category=tcp#tcp-selection. On Jan 16, 2013, at 10:59 AM, borja mf wrote: > Getting the same error... > I forgot to say that I must to use Ubuntu and Im compiling with mpicc. My > code is written on C. > > Thank for answer. > > Im going crazy with this problem. There's not much info about. > > 2013/1/16 Jeff Squyres (jsquyres) > Try disabling firewalling between your nodes. The easiest way is "sudo > service iptables stop". > > > On Jan 16, 2013, at 7:46 AM, borja mf > wrote: > > > Hello all. > > I want to learn MPI and I've trying to setting up OMPI for first time on > > three nodes. My config above: > > Ubuntu server - master node: pruebaborja > > 2x Ubuntu Desktop - slaves node: > > clienteprueba > > clientepruebados 4 slots > > > > Im running NFSv4 for sharing /home/mpiuser. > > I want to test a plain "Hello world"but I can't make it working > > successfully on node clienteprueba. This is the problem: > > > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > > [clienteprueba:01993] [[64434,0], 2] -> [[64434,0],0] > > mca_oob_tcp_msg_send_handler: writev:failed: Bad file descriptor (9) [sd = > > 9] > > [clienteprueba:01993] [[64434,0], 2] routed:binomial: Connection to > > lifeline [[64434,0],0] lost > > > > However, with clientepruebados and pruebaborja only on my hostfile, it > > works: > > > > pruebaborja slots=1 > > clientepruebados slots=4 > > #clienteprueba slots=1 > > > > mpiuser@pruebaborja:~$ mpirun -np 6 --hostfile .mpi_hostfile ./holamundo > > Hola, mundo, soy pruebaborja: 0 de 6 > > Hola, mundo, soy pruebaborja: 5 de 6 > > Hola, mundo, soy clientepruebados: 1 de 6 > > Hola, mundo, soy clientepruebados: 2 de 6 > > Hola, mundo, soy clientepruebados: 3 de 6 > > Hola, mundo, soy clientepruebados: 4 de 6 > > > > I've checked the OMPI versions on the machines and it's the same. I can't > > understand why Im getting this error on clienteprueba; i've done the same > > config on clientepruebados and clienteprueba. Anyone could help me to > > solve this? > > > > Sorry for my english. > > Thanks in advance > > > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
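For example, something along these lines (virbr0 and vboxnet0 are just the usual names libvirt and VirtualBox create; substitute whatever extra interfaces actually show up on your machines, and note that lo has to be listed explicitly once you override the exclude list):

    mpirun --mca btl_tcp_if_exclude lo,virbr0,vboxnet0 \
           --mca oob_tcp_if_exclude lo,virbr0,vboxnet0 \
           -np 6 --hostfile .mpi_hostfile ./holamundo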
Re: [OMPI users] libmpi_f90 shared lib version number change in 1.6.3
On Jan 12, 2013, at 5:06 AM, Ake Sandgren wrote:

> Was the change for libmpi_f90 in VERSION intentional or a typo?
> This is from openmpi 1.6.3
> libmpi_f90_so_version=4:0:1
> 1.6.1 had
> libmpi_f90_so_version=2:0:1

It was both intentional and a typo. Specifically, it really should have been 4:0:3. :-(

Meaning: we unintentionally broke the F90 ABI for 1.6.3 (specifically: OMPI applications compiled to utilize "use mpi"). :-( :-( :-( This ABI compatibility will be restored in 1.6.4.

See these commit messages for a fuller explanation:

https://svn.open-mpi.org/trac/ompi/changeset/27471
https://svn.open-mpi.org/trac/ompi/changeset/27558

The short explanation is that, in terms of the "use mpi" interface, all Open MPI 1.6.x versions are ABI compatible except 1.6.3.

These work:
- compile a "use mpi" OMPI application with 1.6.x (where x!=3), then change your LD_LIBRARY_PATH to point to a different OMPI 1.6.x installation (where x!=3)
- compile a "use mpi" OMPI application with 1.6.3, use it with an OMPI 1.6.3 installation

These do not:
- compile a "use mpi" OMPI application with 1.6.x (where x!=3), then change your LD_LIBRARY_PATH to point to an OMPI 1.6.3 installation
- compile a "use mpi" OMPI application with 1.6.3, use it with an OMPI 1.6.x installation (where x!=3)

I will make an FAQ item about this so that the result is google-able.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
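A quick way to see which libmpi_f90 an existing binary and a given installation actually provide (the application name and install path are only examples):

    ldd ./my_f90_app | grep libmpi_f90
    ls -l /opt/openmpi-1.6.3/lib/libmpi_f90.so*

If the library version the application was linked against does not match the one in the installation LD_LIBRARY_PATH points at, you are in one of the "these do not" cases above.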
[OMPI users] openmpi 1.4 vs. 1.6 internals
There is a significant improvement in non-blocking MPI calls (over InfiniBand) from version 1.4 to version 1.6. I am comparing two methods to exchange messages between two nodes. The message size varies from 1 MB to 1 GB.

The first method sends using MPI_Isend() and receives using MPI_Irecv(). The same buffers are used repeatedly to exchange messages between the two nodes, and the buffers are allocated using malloc(). In the second method, the buffers are allocated using MPI_Alloc_mem(), the sends and receives are initialized using MPI_Send_init() and MPI_Recv_init(), and the sends and recvs are posted using MPI_Start().

In version 1.4, the first method has a peak bidirectional bandwidth of 5.3 GB/s and the second method has a peak of 6.2 GB/s. In version 1.6, both methods have a peak bandwidth of 6.2 GB/s. The peak bandwidths are pretty close to the numbers reported by the ib_read_bw and ib_write_bw commands for InfiniBand.

1. The first question: why does version 1.6 handle non-blocking Isend/Irecv better than version 1.4? I would assume that in the second method, memory is pinned and registered during MPI_Alloc_mem() and the transfers use RDMA directly. In the first method, where the buffers are allocated using malloc(), I would assume that RDMA pipelining is used. I emphasize that the mpi_leave_pinned parameter has its default value of -1 and is turned off in all the runs. I would expect some overhead due to registering and unregistering memory during each Isend/Irecv, even though pipelining tries to amortize the costs. The numbers for version 1.4 are in line with this expectation. However, in version 1.6 there seems to be no overhead at all due to registering/unregistering memory. What is going on? Do large messages still use RDMA pipelining? How has the RDMA pipeline been improved?

2. To send and receive a large message, Open MPI may choose between RDMA write and RDMA read. If RDMA pipelining is used, it seems advantageous to use RDMA write because some fragments use send/recv semantics. If the memory is registered and the send/recv results in a single RDMA operation, there seems to be nothing to choose between the two. Is that correct? If so, does Open MPI use RDMA write or RDMA read?

Thanks!

Divakar Viswanath
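Since much of this hinges on registration caching, one hedged experiment (not something the poster reports running) is to take the -1 default out of the picture and set mpi_leave_pinned explicitly both ways, then compare the malloc()-buffer benchmark under 1.4 and 1.6; the host names and benchmark name here are placeholders:

    mpirun --mca mpi_leave_pinned 0 -np 2 --host node1,node2 ./bw_isend
    mpirun --mca mpi_leave_pinned 1 -np 2 --host node1,node2 ./bw_isend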
[OMPI users] Problem with mpirun for java codes
Hi,

I am still struggling with the installation problems! I get very strange errors. Everything is fine when I run Open MPI with C codes, but when I try to run a simple Java code I get a very strange error. The code is as simple as the following and I cannot get it running:

import mpi.*;

class JavaMPI {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        System.out.println("Hello world from rank " +
            MPI.COMM_WORLD.Rank() + " of " +
            MPI.COMM_WORLD.Size() );
        MPI.Finalize();
    }
}

Everything is OK with mpijavac, my Java code, etc. When I try to run the code with the following commands:

/usr/local/bin/mpijavac -d classes JavaMPI.java          --> FINE
/usr/local/bin/mpirun -np 2 java -cp ./classes JavaMPI   --> *ERROR*

I get the following error. Could you please help me with this? (As I mentioned, I can run C MPI codes without any problem.) The system specifications are:

JRE version: 6.0_30-b12 (java-sun-6)
OS: Linux 3.0.0-30-generic-pae #47-Ubuntu
CPU: total 4 (2 cores per cpu, 2 threads per core) family 6 model 42 stepping 7, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, ht

##
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x70e1dd12, pid=28616, tid=3063311216
#
 (0xb) at pc=0x70f61d12, pid=28615, tid=3063343984
#
# JRE version: 6.0_30-b12
# JRE version: 6.0_30-b12
# Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 )
# Problematic frame:
# C  [libmpi.so.1+0x20d12]  unsigned __int128+0xa2
#
# An error report file with more information is saved as:
# /home/karos/hs_err_pid28616.log
# Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 )
# Problematic frame:
# C  [libmpi.so.1+0x20d12]  unsigned __int128+0xa2
#
# An error report file with more information is saved as:
# /home/karos/hs_err_pid28615.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
[tulips:28616] *** Process received signal ***
[tulips:28616] Signal: Aborted (6)
[tulips:28616] Signal code: (-6)
[tulips:28616] [ 0] [0xb777840c]
[tulips:28616] [ 1] [0xb7778424]
[tulips:28616] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75e3cff]
[tulips:28616] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75e7325]
[tulips:28616] [ 4] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) [0xb6f6df7f]
[tulips:28616] [ 5] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) [0xb70b5897]
[tulips:28616] [ 6] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) [0xb6f7529c]
[tulips:28616] [ 7] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) [0xb6f70f64]
[tulips:28616] [ 8] [0xb777840c]
[tulips:28616] [ 9] [0xb3891548]
[tulips:28616] *** End of error message ***
[tulips:28615] *** Process received signal ***
[tulips:28615] Signal: Aborted (6)
[tulips:28615] Signal code: (-6)
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
[tulips:28615] [ 0] [0xb778040c]
[tulips:28615] [ 1] [0xb7780424]
[tulips:28615] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75ebcff]
[tulips:28615] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75ef325]
[tulips:28615] [ 4] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) [0xb6f75f7f]
[tulips:28615] [ 5] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) [0xb70bd897]
[tulips:28615] [ 6] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) [0xb6f7d29c]
[tulips:28615] [ 7] /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) [0xb6f78f64]
[tulips:28615] [ 8] [0xb778040c]
[tulips:28615] [ 9] [0xb3899548]
[tulips:28615] *** End of error message ***
--
mpirun noticed that process rank 1 with PID 28616 on node tulips exited on signal 6 (Aborted).
--
##

--
Regards,
Karos Lotfifar
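Since the crash is inside libmpi.so.1 native code, one hedged thing to check (not a suggestion from the thread) is that the JVM loads the same Open MPI library that mpijavac and mpirun come from, and that the Java bindings were actually built into that install; /usr/local/lib is assumed here to match the /usr/local/bin install shown above:

    ompi_info | grep -i java                                 # were the Java bindings built in?
    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
    /usr/local/bin/mpirun -np 1 java -cp ./classes JavaMPI   # try a single rank first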
Re: [OMPI users] Problem with mpirun for java codes
Which version of OMPI are you using? On Jan 16, 2013, at 11:43 AM, Karos Lotfifar wrote: > Hi, > > I am still struggling with the installation problems! I get very strange > errors. everything is fine when I run OpenMPI for C codes, but when I try to > run a simple java code I get very strange error. The code is as simple as the > following and I can not get it running: > > import mpi.*; > > class JavaMPI { > public static void main(String[] args) throws MPIException { > MPI.Init(args); > System.out.println("Hello world from rank " + > MPI.COMM_WORLD.Rank() + " of " + > MPI.COMM_WORLD.Size() ); > MPI.Finalize(); > } > } > > everything is ok with mpijavac, my java code, etc. when I try to run the code > with the following command: > > /usr/local/bin/mpijavac -d classes JavaMPI.java --> FINE > /usr/local/bin/mpirun -np 2 java -cp ./classes JavaMPI --> *ERROR* > > I'll the following error. Could you please help me about this (As I mentioned > the I can run C MPI codes without any problem ). The system specifications > are: > > JRE version: 6.0_30-b12 (java-sun-6) > OS: Linux 3.0.0-30-generic-pae #47-Ubuntu > CPU:total 4 (2 cores per cpu, 2 threads per core) family 6 model 42 stepping > 7, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, ht > > > > > ## > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV# > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x70e1dd12, pid=28616, tid=3063311216 > # > (0xb) at pc=0x70f61d12, pid=28615, tid=3063343984 > # > # JRE version: 6.0_30-b12 > # JRE version: 6.0_30-b12 > # Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 ) > # Problematic frame: > # C [libmpi.so.1+0x20d12] unsigned __int128+0xa2 > # > # An error report file with more information is saved as: > # /home/karos/hs_err_pid28616.log > # Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 ) > # Problematic frame: > # C [libmpi.so.1+0x20d12] unsigned __int128+0xa2 > # > # An error report file with more information is saved as: > # /home/karos/hs_err_pid28615.log > # > # If you would like to submit a bug report, please visit: > # http://java.sun.com/webapps/bugreport/crash.jsp > # The crash happened outside the Java Virtual Machine in native code. > # See problematic frame for where to report the bug. 
> # > [tulips:28616] *** Process received signal *** > [tulips:28616] Signal: Aborted (6) > [tulips:28616] Signal code: (-6) > [tulips:28616] [ 0] [0xb777840c] > [tulips:28616] [ 1] [0xb7778424] > [tulips:28616] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75e3cff] > [tulips:28616] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75e7325] > [tulips:28616] [ 4] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) > [0xb6f6df7f] > [tulips:28616] [ 5] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) > [0xb70b5897] > [tulips:28616] [ 6] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) > [0xb6f7529c] > [tulips:28616] [ 7] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) > [0xb6f70f64] > [tulips:28616] [ 8] [0xb777840c] > [tulips:28616] [ 9] [0xb3891548] > [tulips:28616] *** End of error message *** > [tulips:28615] *** Process received signal *** > [tulips:28615] Signal: Aborted (6) > [tulips:28615] Signal code: (-6) > # > # If you would like to submit a bug report, please visit: > # http://java.sun.com/webapps/bugreport/crash.jsp > # The crash happened outside the Java Virtual Machine in native code. > # See problematic frame for where to report the bug. > # > [tulips:28615] [ 0] [0xb778040c] > [tulips:28615] [ 1] [0xb7780424] > [tulips:28615] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75ebcff] > [tulips:28615] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75ef325] > [tulips:28615] [ 4] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) > [0xb6f75f7f] > [tulips:28615] [ 5] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) > [0xb70bd897] > [tulips:28615] [ 6] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) > [0xb6f7d29c] > [tulips:28615] [ 7] > /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) > [0xb6f78f64] > [tulips:28615] [ 8] [0xb778040c] > [tulips:28615] [ 9] [0xb3899548] > [tulips:28615] *** End of error message *** > -- > mpirun noticed that process rank 1 with PID 28616 on node tulips exited on > signal 6 (Aborted). > -- > >
Re: [OMPI users] Problem with mpirun for java codes
Hi, The version that I am using is 1.7rc6 (pre-release) Regards, Karos On 16 Jan 2013, at 21:07, Ralph Castain wrote: > Which version of OMPI are you using? > > > On Jan 16, 2013, at 11:43 AM, Karos Lotfifar wrote: > >> Hi, >> >> I am still struggling with the installation problems! I get very strange >> errors. everything is fine when I run OpenMPI for C codes, but when I try to >> run a simple java code I get very strange error. The code is as simple as >> the following and I can not get it running: >> >> import mpi.*; >> >> class JavaMPI { >> public static void main(String[] args) throws MPIException { >> MPI.Init(args); >> System.out.println("Hello world from rank " + >> MPI.COMM_WORLD.Rank() + " of " + >> MPI.COMM_WORLD.Size() ); >> MPI.Finalize(); >> } >> } >> >> everything is ok with mpijavac, my java code, etc. when I try to run the >> code with the following command: >> >> /usr/local/bin/mpijavac -d classes JavaMPI.java --> FINE >> /usr/local/bin/mpirun -np 2 java -cp ./classes JavaMPI --> *ERROR* >> >> I'll the following error. Could you please help me about this (As I >> mentioned the I can run C MPI codes without any problem ). The system >> specifications are: >> >> JRE version: 6.0_30-b12 (java-sun-6) >> OS: Linux 3.0.0-30-generic-pae #47-Ubuntu >> CPU:total 4 (2 cores per cpu, 2 threads per core) family 6 model 42 stepping >> 7, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, ht >> >> >> >> >> ## >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # SIGSEGV# >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # SIGSEGV (0xb) at pc=0x70e1dd12, pid=28616, tid=3063311216 >> # >> (0xb) at pc=0x70f61d12, pid=28615, tid=3063343984 >> # >> # JRE version: 6.0_30-b12 >> # JRE version: 6.0_30-b12 >> # Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 ) >> # Problematic frame: >> # C [libmpi.so.1+0x20d12] unsigned __int128+0xa2 >> # >> # An error report file with more information is saved as: >> # /home/karos/hs_err_pid28616.log >> # Java VM: Java HotSpot(TM) Server VM (20.5-b03 mixed mode linux-x86 ) >> # Problematic frame: >> # C [libmpi.so.1+0x20d12] unsigned __int128+0xa2 >> # >> # An error report file with more information is saved as: >> # /home/karos/hs_err_pid28615.log >> # >> # If you would like to submit a bug report, please visit: >> # http://java.sun.com/webapps/bugreport/crash.jsp >> # The crash happened outside the Java Virtual Machine in native code. >> # See problematic frame for where to report the bug. 
>> # >> [tulips:28616] *** Process received signal *** >> [tulips:28616] Signal: Aborted (6) >> [tulips:28616] Signal code: (-6) >> [tulips:28616] [ 0] [0xb777840c] >> [tulips:28616] [ 1] [0xb7778424] >> [tulips:28616] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75e3cff] >> [tulips:28616] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75e7325] >> [tulips:28616] [ 4] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) >> [0xb6f6df7f] >> [tulips:28616] [ 5] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) >> [0xb70b5897] >> [tulips:28616] [ 6] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) >> [0xb6f7529c] >> [tulips:28616] [ 7] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) >> [0xb6f70f64] >> [tulips:28616] [ 8] [0xb777840c] >> [tulips:28616] [ 9] [0xb3891548] >> [tulips:28616] *** End of error message *** >> [tulips:28615] *** Process received signal *** >> [tulips:28615] Signal: Aborted (6) >> [tulips:28615] Signal code: (-6) >> # >> # If you would like to submit a bug report, please visit: >> # http://java.sun.com/webapps/bugreport/crash.jsp >> # The crash happened outside the Java Virtual Machine in native code. >> # See problematic frame for where to report the bug. >> # >> [tulips:28615] [ 0] [0xb778040c] >> [tulips:28615] [ 1] [0xb7780424] >> [tulips:28615] [ 2] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f) [0xb75ebcff] >> [tulips:28615] [ 3] /lib/i386-linux-gnu/libc.so.6(abort+0x175) [0xb75ef325] >> [tulips:28615] [ 4] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dcf7f) >> [0xb6f75f7f] >> [tulips:28615] [ 5] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x724897) >> [0xb70bd897] >> [tulips:28615] [ 6] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(JVM_handle_linux_signal+0x21c) >> [0xb6f7d29c] >> [tulips:28615] [ 7] >> /usr/lib/jvm/java-6-sun-1.6.0.30/jre/lib/i386/server/libjvm.so(+0x5dff64) >> [0xb6f78f64] >> [tulips:28615] [ 8] [0xb778040c] >> [tulips:28615] [ 9] [0xb3899548] >> [tulips:28615] *** End of error message *** >> --