Dear Paul,

I tried 'mpirun -np N <cmd>' as you suggested, but I get the same
problem.

I suspect it is related to the system I am using, because the same setup
worked correctly on another 32-bit Windows XP system.

I look forward to further advice. Thanks.

Zhangping 




________________________________
From: "users-requ...@open-mpi.org" <users-requ...@open-mpi.org>
To: us...@open-mpi.org
Sent: Thursday, 2011/5/19, 11:00:02 AM
Subject: users Digest, Vol 1910, Issue 2

Send users mailing list submissions to
    us...@open-mpi.org

To subscribe or unsubscribe via the World Wide Web, visit
    http://www.open-mpi.org/mailman/listinfo.cgi/users
or, via email, send a message with subject or body 'help' to
    users-requ...@open-mpi.org

You can reach the person managing the list at
    users-ow...@open-mpi.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."


Today's Topics:

   1. Re: Error: Entry Point Not Found (Paul van der Walt)
   2. Re: Openib with > 32 cores per node (Robert Horton)
   3. Re: Openib with > 32 cores per node (Samuel K. Gutierrez)


----------------------------------------------------------------------

Message: 1
Date: Thu, 19 May 2011 16:14:02 +0100
From: Paul van der Walt <p...@denknerd.nl>
Subject: Re: [OMPI users] Error: Entry Point Not Found
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <banlktinjz0cntchqjczyhfgsnr51jpu...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi,

On 19 May 2011 15:54, Zhangping Wei <zhangping_...@yahoo.com> wrote:
> 4, I use command window to run it in this way: ?mpirun ?n 4 ?**.exe ?,then I

Probably not the problem, but shouldn't that be 'mpirun -np N <cmd>' ?
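(For anyone following along, a minimal launch looks like this; the
executable name below is just a placeholder:)

```shell
# Launch 4 copies of the program under Open MPI.
# -np and -n are accepted synonyms in Open MPI's mpirun.
mpirun -np 4 ./my_program.exe
```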

Paul

-- 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org



------------------------------

Message: 2
Date: Thu, 19 May 2011 16:37:56 +0100
From: Robert Horton <r.hor...@qmul.ac.uk>
Subject: Re: [OMPI users] Openib with > 32 cores per node
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <1305819476.9663.148.camel@moelwyn>
Content-Type: text/plain; charset="UTF-8"

On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
> Hi,
> 
> Try the following QP parameters that only use shared receive queues.
> 
> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
> 

Thanks for that. If I run the job over 2 x 48 cores it now works and the
performance seems reasonable (I need to do some more tuning) but when I
go up to 4 x 48 cores I'm getting the same problem:

[compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one]
 error creating qp errno says Cannot allocate memory
[compute-1-7.local:18106] *** An error occurred in MPI_Isend
[compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
[compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
[compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

Any thoughts?
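[For reference, my reading of the openib BTL's queue specification is
that each S (shared receive queue) entry is
S,<size>,<num_buffers>,<low_watermark>,<max_pending_sends>, so the full
command for a 4 x 48 core job would look roughly like the sketch below;
./app.exe is a placeholder, and the field breakdown is from the Open MPI
FAQ as I understand it, so treat it as approximate:]

```shell
# Shared receive queues only (no per-peer QPs), two buffer sizes:
# 12288-byte and 65536-byte, 128 buffers each.
mpirun -np 192 \
    --mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 \
    ./app.exe
```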

Thanks,
Rob
-- 
Robert Horton
System Administrator (Research Support) - School of Mathematical Sciences
Queen Mary, University of London
r.hor...@qmul.ac.uk  -  +44 (0) 20 7882 7345



------------------------------

Message: 3
Date: Thu, 19 May 2011 09:59:13 -0600
From: "Samuel K. Gutierrez" <sam...@lanl.gov>
Subject: Re: [OMPI users] Openib with > 32 cores per node
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <b3e83138-9af0-48c0-871c-dbbb2e712...@lanl.gov>
Content-Type: text/plain; charset=us-ascii

Hi,

On May 19, 2011, at 9:37 AM, Robert Horton wrote:

> On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
>> Hi,
>> 
>> Try the following QP parameters that only use shared receive queues.
>> 
>> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
>> 
> 
> Thanks for that. If I run the job over 2 x 48 cores it now works and the
> performance seems reasonable (I need to do some more tuning) but when I
> go up to 4 x 48 cores I'm getting the same problem:
> 
> [compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one]
>  error creating qp errno says Cannot allocate memory
> [compute-1-7.local:18106] *** An error occurred in MPI_Isend
> [compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
> [compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
> [compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> 
> Any thoughts?

How much memory does each node have?  Does this happen at startup?

Try adding:

-mca btl_openib_cpc_include rdmacm

I'm not sure whether your version of OFED supports it, but XRC may also
help.  I **think** other tweaks are needed to get that working, but I'm not
familiar with the details.
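[A sketch of combining both suggestions: rdmacm for connection setup plus
shared receive queues to cut per-peer QP memory.  ./app.exe is a
placeholder; if your OFED stack supports XRC, the receive_queues value
would use X,... entries instead of S,..., but I haven't verified the
exact syntax:]

```shell
# rdmacm connection manager + shared receive queues (hypothetical
# combined command line; executable name is a placeholder).
mpirun -np 192 \
    --mca btl_openib_cpc_include rdmacm \
    --mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 \
    ./app.exe
```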

Hope that helps,

Samuel K. Gutierrez
Los Alamos National Laboratory


> 
> Thanks,
> Rob
> -- 
> Robert Horton
> System Administrator (Research Support) - School of Mathematical Sciences
> Queen Mary, University of London
> r.hor...@qmul.ac.uk  -  +44 (0) 20 7882 7345
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users






------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1910, Issue 2
**************************************
