It looks like an Open MPI error. Check this link:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
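Since the libibverbs warning reports an RLIMIT_MEMLOCK of 32768 bytes only under qsub, one quick check is to print the locked-memory limit from inside a job and compare it with what a login shell on the same node reports. The following is only a sketch: the job name memlock-check and the helper kb_to_bytes are made up for illustration, and it assumes bash is available at /bin/bash on the execution hosts.

```shell
#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -N memlock-check

# Hypothetical helper: bash reports "ulimit -l" in units of 1024 bytes
# (or the word "unlimited"); convert it to bytes so it can be compared
# with the number printed in the libibverbs warning.
kb_to_bytes() {
    if [ "$1" = "unlimited" ]; then
        echo "unlimited"
    else
        echo $(( $1 * 1024 ))
    fi
}

# Soft limit on locked memory as seen by this shell.
limit_kb=$(ulimit -l)
echo "RLIMIT_MEMLOCK on ${HOSTNAME:-unknown}: $(kb_to_bytes "$limit_kb") bytes"
```

If the value printed under qsub is much lower than the one from an interactive shell, the limit is probably being inherited from the environment in which the execution daemon was started, which is the kind of situation the FAQ entry above discusses.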



On Thu, Mar 15, 2012 at 1:11 PM, Reuti <[email protected]> wrote:

> On 15.03.2012 at 11:30, Mohamed Adel wrote:
>
> > Dear all,
> >
> > I was trying to run a simple MPI script via qsub, and I received the
> > error below, after which the job ran correctly!
> > I received no error when I ran the same script directly, without qsub.
> > Is there a way to fix this error message?
>
> To me these don't look like error messages from SGE. You could drop
> "#$ -j y" and they should end up in a separate file.
>
> >  Thanks in advance,
> > madel
> >
> > job script:
> > #$ -cwd
> > #$ -j   y
> > #$ -N   hello-mpi
> > #$ -o   $JOB_NAME.o$JOB_ID
> > #$ -pe  impi    16
> > mpirun --rsh=ssh -np 16 ./hello.bin
>
> Are you using a tight integration, or is it an SSH startup outside of
> SGE's control?
>
> -- Reuti
>
>
> > ============================================
> >
> > Error Message:
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> >     This will severely limit memory registrations.
> > comp040.local:5827:  create_cq Cannot allocate memory
> > comp040.local:5826:  create_cq Cannot allocate memory
> > comp040.local:5829:  create_cq Cannot allocate memory
> > comp040.local:5833:  create_cq Cannot allocate memory
> > comp040.local:5828:  create_cq Cannot allocate memory
> > comp040.local:5831:  create_cq Cannot allocate memory
> > comp040.local:5832:  create_cq Cannot allocate memory
> > comp040.local:5830:  create_cq Cannot allocate memory
> > comp040.local:5832:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5832: dapls_ib_open_hca failed 120000
> > comp040.local:5827:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5827: dapls_ib_open_hca failed 120000
> > comp040.local:5832:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5832: dapls_ib_open_hca failed 120000
> > comp040.local:5829:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5829: dapls_ib_open_hca failed 120000
> > comp040.local:5827:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5827: dapls_ib_open_hca failed 120000
> > comp040.local:5832:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5832: dapls_ib_open_hca failed 120000
> > comp040.local:5829:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5829: dapls_ib_open_hca failed 120000
> > comp040.local:5826:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5826: dapls_ib_open_hca failed 120000
> > comp040.local:5827:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5827: dapls_ib_open_hca failed 120000
> > comp040.local:5828:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5828: comp040.local:5832:  open_hca: getaddr_netdev ERROR:
> No such device. Is bond0 configured?
> > comp040.local:5832: dapls_ib_open_hca failed 120000
> > dapls_ib_open_hca failed 120000
> > comp040.local:5829:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5829: dapls_ib_open_hca failed 120000
> > comp040.local:5826:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5826: dapls_ib_open_hca failed 120000
> > comp040.local:5833:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5833: dapls_ib_open_hca failed 120000
> > comp040.local:5831:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5831: dapls_ib_open_hca failed 120000
> > comp040.local:5827:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5827: dapls_ib_open_hca failed 120000
> > comp040.local:5830:  open_hca: getaddr_netdev ERROR: No such file or
> directory. Is ib1 configured?
> > comp040.local:5830: dapls_ib_open_hca failed 120000
> > comp040.local:5833:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5833: dapls_ib_open_hca failed 120000
> > comp040.local:5829:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5829: dapls_ib_open_hca failed 120000
> > comp040.local:5826:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5826: dapls_ib_open_hca failed 120000
> > comp040.local:5828:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5828: dapls_ib_open_hca failed 120000
> > comp040.local:5831:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5831: dapls_ib_open_hca failed 120000
> > comp040.local:5833:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5833: dapls_ib_open_hca failed 120000
> > comp040.local:5830:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib2 configured?
> > comp040.local:5830: dapls_ib_open_hca failed 120000
> > comp040.local:5826:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5826: dapls_ib_open_hca failed 120000
> > comp040.local:5828:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5828: dapls_ib_open_hca failed 120000
> > comp040.local:5831:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5831: dapls_ib_open_hca failed 120000
> > comp040.local:5833:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5833: dapls_ib_open_hca failed 120000
> > comp040.local:5828:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5828: dapls_ib_open_hca failed 120000
> > comp040.local:5831:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5831: dapls_ib_open_hca failed 120000
> > comp040.local:5830:  open_hca: getaddr_netdev ERROR: No such device. Is
> ib3 configured?
> > comp040.local:5830: dapls_ib_open_hca failed 120000
> > comp040.local:5830:  open_hca: getaddr_netdev ERROR: No such device. Is
> bond0 configured?
> > comp040.local:5830: dapls_ib_open_hca failed 120000
> > Hello world: rank 8 of 16 running on comp047.local
> > Hello world: rank 13 of 16 running on comp047.local
> > Hello world: rank 10 of 16 running on comp047.local
> > Hello world: rank 11 of 16 running on comp047.local
> > Hello world: rank 15 of 16 running on comp047.local
> > Hello world: rank 9 of 16 running on comp047.local
> > Hello world: rank 12 of 16 running on comp047.local
> > Hello world: rank 7 of 16 running on comp040.local
> > Hello world: rank 14 of 16 running on comp047.local
> > Hello world: rank 6 of 16 running on comp040.local
> > Hello world: rank 0 of 16 running on comp040.local
> > Hello world: rank 4 of 16 running on comp040.local
> > Hello world: rank 2 of 16 running on comp040.local
> > Hello world: rank 5 of 16 running on comp040.local
> > Hello world: rank 1 of 16 running on comp040.local
> > Hello world: rank 3 of 16 running on comp040.local
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
>
>
>