Artem is investigating with Timur
On Jun 10, 2014, at 12:34 PM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:

> btw, the output comes from ompi's libevent and not from slurm itself (sorry
> about the confusion, and thanks to Yossi for catching this)
>
> opal/mca/event/libevent2021/libevent/epoll.c:
>     event_warn("Epoll %s(%d) on fd %d failed. Old events were %d; read change was %d (%s); write change was %d (%s)",
> opal/mca/event/libevent2021/libevent/epoll.c:
>     event_debug(("Epoll %s(%d) on fd %d okay. [old events were %d; read change was %d; write change was %d]",
>
> On Fri, Jun 6, 2014 at 3:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Possible - honestly don't know
>
> On Jun 6, 2014, at 12:16 AM, Timur Ismagilov <tismagi...@mail.ru> wrote:
>
>> Sometimes, after termination of a program launched with the command
>> "sbatch ... -o myprogram.out .....", no file "myprogram.out" is
>> produced. Could this be due to the above-mentioned problem?
>>
>> Thu, 5 Jun 2014 07:45:01 -0700 from Ralph Castain <r...@open-mpi.org>:
>> FWIW: support for the --resv-ports option was deprecated and removed on the
>> OMPI side a long time ago.
>>
>> I'm not familiar enough with "oshrun" to know if it is doing anything
>> unusual - I believe it is just a renaming of our usual "mpirun". I suspect
>> this is some interaction with sbatch, but I'll take a look. I haven't seen
>> that warning. Mike indicated he thought it is due to both slurm and OMPI
>> trying to control stdin/stdout, in which case it shouldn't be happening, but
>> you can safely ignore it.
>>
>> On Jun 5, 2014, at 3:04 AM, Timur Ismagilov <tismagi...@mail.ru> wrote:
>>
>>> I use the cmd line
>>>
>>> $sbatch -p test --exclusive -N 2 -o hello_oshmem.out -e hello_oshmem.err
>>> shrun_mxm3.0 ./hello_oshmem
>>>
>>> where the script shrun_mxm3.0 is:
>>>
>>> $cat shrun_mxm3.0
>>>
>>> #!/bin/sh
>>>
>>> #srun --resv-ports "$@"
>>> #exit $?
>>>
>>> [ x"$TMPDIR" == x"" ] && TMPDIR=/tmp
>>> HOSTFILE=${TMPDIR}/hostfile.${SLURM_JOB_ID}
>>> srun hostname -s|sort|uniq -c|awk '{print $2" slots="$1}' > $HOSTFILE ||
>>> { rm -f $HOSTFILE; exit 255; }
>>>
>>> LD_PRELOAD=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/lib/libmxm.so
>>> oshrun -x LD_PRELOAD -x MXM_SHM_KCOPY_MODE=off --hostfile $HOSTFILE "$@"
>>>
>>> rc=$?
>>> rm -f $HOSTFILE
>>>
>>> exit $rc
>>>
>>> I configured openmpi using
>>>
>>> ./configure CC=icc CXX=icpc F77=ifort FC=ifort
>>> --prefix=/mnt/data/users/dm2/vol3/semenov/_scratch/openmpi-1.8.1_mxm-3.0
>>> --with-mxm=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/
>>> --with-slurm --with-platform=contrib/platform/mellanox/optimized
>>>
>>> Fri, 30 May 2014 07:09:54 -0700 from Ralph Castain <r...@open-mpi.org>:
>>>
>>> Can you pass along the cmd line that generated that output, and how OMPI
>>> was configured?
>>>
>>> On May 30, 2014, at 5:11 AM, Тимур Исмагилов <tismagi...@mail.ru> wrote:
>>>
>>>> Hello!
>>>>
>>>> I am using Open MPI v1.8.1 and slurm 2.5.6.
>>>>
>>>> I get these messages when I try to run the example (hello_oshmem.cpp) program:
>>>>
>>>> [warn] Epoll ADD(1) on fd 0 failed. Old events were 0; read change was 1 (add); write change was 0 (none): Operation not permitted
>>>> [warn] Epoll ADD(4) on fd 1 failed. Old events were 0; read change was 0 (none); write change was 1 (add): Operation not permitted
>>>> Hello, world, I am 0 of 2
>>>> Hello, world, I am 1 of 2
>>>>
>>>> What do these warnings mean?
>>>>
>>>> I launch this job using sbatch and mpirun with a hostfile (generated with:
>>>> $srun hostname -s|sort|uniq -c|awk '{print $2" slots="$1}' > $HOSTFILE)
>>>>
>>>> Regards,
>>>> Timur
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users