Dear Paul, I checked the 'mpirun -np N <cmd>' form you mentioned, but it gives the same problem.
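For example (my_program.exe below is only a placeholder for my real executable), the command I tried was of this form:

    mpirun -np 4 my_program.exe

and it still fails with the same "Entry Point Not Found" error.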
I guess it may be related to the system I am using, because I have run it successfully on another 32-bit Windows XP system. I look forward to more advice. Thanks.

Zhangping

________________________________
From: "users-requ...@open-mpi.org" <users-requ...@open-mpi.org>
To: us...@open-mpi.org
Sent: 2011/5/19 (Thu) 11:00:02 AM
Subject: users Digest, Vol 1910, Issue 2

Today's Topics:

   1. Re: Error: Entry Point Not Found (Paul van der Walt)
   2. Re: Openib with > 32 cores per node (Robert Horton)
   3. Re: Openib with > 32 cores per node (Samuel K. Gutierrez)

----------------------------------------------------------------------

Message: 1
Date: Thu, 19 May 2011 16:14:02 +0100
From: Paul van der Walt <p...@denknerd.nl>
Subject: Re: [OMPI users] Error: Entry Point Not Found
To: Open MPI Users <us...@open-mpi.org>

Hi,

On 19 May 2011 15:54, Zhangping Wei <zhangping_...@yahoo.com> wrote:
> 4, I use command window to run it in this way: 'mpirun -n 4 **.exe', then I

Probably not the problem, but shouldn't that be 'mpirun -np N <cmd>'?

Paul

--
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

------------------------------

Message: 2
Date: Thu, 19 May 2011 16:37:56 +0100
From: Robert Horton <r.hor...@qmul.ac.uk>
Subject: Re: [OMPI users] Openib with > 32 cores per node
To: Open MPI Users <us...@open-mpi.org>

On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
> Hi,
>
> Try the following QP parameters that only use shared receive queues.
>
> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
>

Thanks for that. If I run the job over 2 x 48 cores it now works and the performance seems reasonable (I need to do some more tuning), but when I go up to 4 x 48 cores I get the same problem:

[compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
[compute-1-7.local:18106] *** An error occurred in MPI_Isend
[compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
[compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
[compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

Any thoughts?

Thanks,
Rob
--
Robert Horton
System Administrator (Research Support) - School of Mathematical Sciences
Queen Mary, University of London
r.hor...@qmul.ac.uk - +44 (0) 20 7882 7345

------------------------------

Message: 3
Date: Thu, 19 May 2011 09:59:13 -0600
From: "Samuel K. Gutierrez" <sam...@lanl.gov>
Subject: Re: [OMPI users] Openib with > 32 cores per node
To: Open MPI Users <us...@open-mpi.org>

Hi,

On May 19, 2011, at 9:37 AM, Robert Horton wrote:

> On Thu, 2011-05-19 at 08:27 -0600, Samuel K. Gutierrez wrote:
>> Hi,
>>
>> Try the following QP parameters that only use shared receive queues.
>>
>> -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
>>
>
> Thanks for that. If I run the job over 2 x 48 cores it now works and the
> performance seems reasonable (I need to do some more tuning), but when I
> go up to 4 x 48 cores I get the same problem:
>
> [compute-1-7.local][[14383,1],86][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
> [compute-1-7.local:18106] *** An error occurred in MPI_Isend
> [compute-1-7.local:18106] *** on communicator MPI_COMM_WORLD
> [compute-1-7.local:18106] *** MPI_ERR_OTHER: known error not in list
> [compute-1-7.local:18106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> Any thoughts?

How much memory does each node have? Does this happen at startup?

Try adding:

-mca btl_openib_cpc_include rdmacm

I'm not sure whether your version of OFED supports it, but using XRC may also help. I **think** other tweaks are needed to get this going, but I'm not familiar with the details.

Hope that helps,

Samuel K. Gutierrez
Los Alamos National Laboratory

> Thanks,
> Rob

------------------------------

End of users Digest, Vol 1910, Issue 2
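For reference, Samuel's two suggestions above can be combined on a single mpirun command line. The sketch below uses placeholder values that are not taken from the thread: the host file name (hosts), the executable (./app), and the rank count (192, i.e. 4 nodes x 48 cores, matching Rob's failing case):

    mpirun -np 192 --hostfile hosts \
        -mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32 \
        -mca btl_openib_cpc_include rdmacm \
        ./app

Whether the rdmacm connection manager is actually available depends on how Open MPI was built and on the installed OFED stack, as Samuel notes above.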