Just checking if there's some solution for this.

Thank you,
Saliya


On Tue, Mar 11, 2014 at 10:54 PM, Saliya Ekanayake <esal...@gmail.com> wrote:

> I forgot to mention that I tried the hello.c version instead of Java and
> it too failed in a similar manner, but
>
> 1. On a single node with --mca btl ^tcp it went up to 24 procs before
> failing
> 2. On 8 nodes with --mca btl ^tcp it could go only up to 16 procs
>
>
> On Tue, Mar 11, 2014 at 5:06 PM, Saliya Ekanayake <esal...@gmail.com> wrote:
>
>> I just tested with "ml" turned off as you suggested, but unfortunately it
>> didn't solve the issue.
>>
>> However, I found that by explicitly setting --mca btl ^tcp the code
>> worked on up to 4 nodes, each running 8 procs. If I don't specify this,
>> it simply fails even on one node with 8 procs.
>>
>> Thank you,
>> Saliya
>>
>>
>> On Tue, Mar 11, 2014 at 4:35 PM, Jeff Squyres (jsquyres) <
>> jsquy...@cisco.com> wrote:
>>
>>> Looks like we still have a bug in one of our components -- can you try:
>>>
>>>     mpirun --mca coll ^ml ...
>>>
>>> This will deactivate the "ml" collective component.  See if that enables
>>> you to run (this particular component has nothing to do with Java).
>>>
>>>
>>> On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
>>>
>>> > Just tested that this happens even with the simple Hello.java program
>>> given in the OMPI distribution.
>>> >
>>> > I've made a tarball containing details of the error adhering to
>>> http://www.open-mpi.org/community/help/. Please let me know if I have
>>> missed any info necessary.
>>> >
>>> > Thank you,
>>> > Saliya
>>> >
>>> >
>>> >
>>> >
>>> > On Mon, Mar 10, 2014 at 10:46 AM, Jeff Squyres (jsquyres) <
>>> jsquy...@cisco.com> wrote:
>>> > Greetings, and thanks for trying out our Java bindings.
>>> >
>>> > Can you provide some more details?  E.g., is there a particular
>>> program you're running that incurs these problems?  Or is there even a
>>> particular MPI function that you're using that results in this segv (e.g.,
>>> perhaps we have a specific bug somewhere)?
>>> >
>>> > Can you reduce the segv to a small example that we can reproduce (and
>>> therefore fix)?
>>> >
>>> >
>>> > On Mar 10, 2014, at 12:05 AM, Saliya Ekanayake <esal...@gmail.com>
>>> wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > > I have 8 nodes, each with 2 quad-core sockets. The nodes also have
>>> IB connectivity. I am trying to run the OMPI Java bindings in OMPI trunk
>>> revision 30301 with 8 procs per node, totaling 64 procs. This gives a
>>> SIGSEGV error as below.
>>> > >
>>> > > I wonder if you have any suggestion to resolve this?
>>> > >
>>> > > Thank you,
>>> > > Saliya
>>> > >
>>> > > # A fatal error has been detected by the Java Runtime Environment:
>>> > > #
>>> > > #  SIGSEGV (0xb) at pc=0x000000313867b75b, pid=12229,
>>> tid=47864973515072
>>> > > #
>>> > > # JRE version: Java(TM) SE Runtime Environment (8.0-b118) (build
>>> 1.8.0-ea-b118)
>>> > > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b60 mixed mode
>>> linux-amd64 compressed oops)
>>> > > # Problematic frame:
>>> > > # C  [libc.so.6+0x7b75b]  memcpy+0x15b
>>> > >
>>> > >
>>> > > --
>>> > > Saliya Ekanayake esal...@gmail.com
>>> > > http://saliya.org
>>> > > _______________________________________________
>>> > > users mailing list
>>> > > us...@open-mpi.org
>>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >
>>> >
>>> > --
>>> > Jeff Squyres
>>> > jsquy...@cisco.com
>>> > For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Saliya Ekanayake esal...@gmail.com
>>> > Cell 812-391-4914 Home 812-961-6383
>>> > http://saliya.org
>>> > <hellobug.tar.gz>
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>
>>
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>>
>
>
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>



-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org
