This is really great news!! I'll test the trunk on our cluster.

Thank you,
Saliya

On Fri, Mar 14, 2014 at 4:44 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> We just fixed the segv (see https://svn.open-mpi.org/trac/ompi/changeset/31073, if you care).
>
> The issue was an errant large array on the stack in debug builds, which would cause JVMs to run out of stack space.
>
> The fix is on the SVN trunk now; it will be on the v1.7 branch shortly.
>
>
> On Mar 11, 2014, at 5:06 PM, Saliya Ekanayake <esal...@gmail.com> wrote:
>
> > I just tested with "ml" turned off as you suggested, but unfortunately it didn't solve the issue.
> >
> > However, I found that by explicitly setting --mca btl ^tcp the code worked on up to 4 nodes with each running 8 procs. If I don't specify this it'll simply fail even on one node with 8 procs.
> >
> > Thank you,
> > Saliya
> >
> >
> > On Tue, Mar 11, 2014 at 4:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > Looks like we still have a bug in one of our components -- can you try:
> >
> >     mpirun --mca coll ^ml ...
> >
> > This will deactivate the "ml" collective component. See if that enables you to run (this particular component has nothing to do with Java).
> >
> >
> > On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
> >
> > > Just tested that this happens even with the simple Hello.java program given in OMPI distribution.
> > >
> > > I've made a tarball containing details of the error adhering to http://www.open-mpi.org/community/help/. Please let me know if I have missed any info necessary.
> > >
> > > Thank you,
> > > Saliya
> > >
> > >
> > > On Mon, Mar 10, 2014 at 10:46 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > > Greetings, and thanks for trying out our Java bindings.
> > >
> > > Can you provide some more details? E.g., is there a particular program you're running that incurs these problems? Or is there even a particular MPI function that you're using that results in this segv (e.g., perhaps we have a specific bug somewhere)?
> > >
> > > Can you reduce the segv to a small example that we can reproduce (and therefore fix)?
> > >
> > >
> > > On Mar 10, 2014, at 12:05 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I have 8 nodes each with 2 quad core sockets. Also, the nodes have IB connectivity. I am trying to run OMPI Java binding in OMPI trunk revision 30301 with 8 procs per node totaling 64 procs. This gives a SIGSEGV error as below.
> > > >
> > > > I wonder if you have any suggestion to resolve this?
> > > >
> > > > Thank you,
> > > > Saliya
> > > >
> > > > # A fatal error has been detected by the Java Runtime Environment:
> > > > #
> > > > # SIGSEGV (0xb) at pc=0x000000313867b75b, pid=12229, tid=47864973515072
> > > > #
> > > > # JRE version: Java(TM) SE Runtime Environment (8.0-b118) (build 1.8.0-ea-b118)
> > > > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b60 mixed mode linux-amd64 compressed oops)
> > > > # Problematic frame:
> > > > # C [libc.so.6+0x7b75b] memcpy+0x15b
> > > >
> > > >
> > > > --
> > > > Saliya Ekanayake esal...@gmail.com
> > > > http://saliya.org
> > > > _______________________________________________
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > >
> > > --
> > > Jeff Squyres
> > > jsquy...@cisco.com
> > > For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> > >
> > >
> > > --
> > > Saliya Ekanayake esal...@gmail.com
> > > Cell 812-391-4914 Home 812-961-6383
> > > http://saliya.org
> > > <hellobug.tar.gz>

--
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org