As I said, there seems to be a problem getting the app started on all nodes. I am planning to run some tests with orte-ps; I hope that will give me some information about why the app didn't start.
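Roughly, what I have in mind is something like this (I haven't verified the exact option names on 1.5.3 yet, so treat it as a sketch):

  # launch with extra output from the launch daemons
  mpiexec --debug-daemons --mca plm_base_verbose 5 --mca btl self,openib --mca mpi_leave_pinned 0 ./a.out

  # then, from the node where mpiexec is running, while the job appears stuck:
  orte-ps

If orte-ps lists the orted daemons but no application processes on some of the nodes, that should at least confirm where the launch stops.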
Thorsten

On Saturday, June 25, 2011, Jeff Squyres wrote:
> Did this issue get resolved? You might also want to look at our FAQ
> category for large clusters:
>
> http://www.open-mpi.org/faq/?category=large-clusters
>
> On Jun 22, 2011, at 9:43 AM, Thorsten Schuett wrote:
> > Thanks for the tip. I can't tell yet whether it helped or not. However,
> > with your settings I get the following warning:
> >
> > WARNING: Open MPI will create a shared memory backing file in a
> > directory that appears to be mounted on a network filesystem.
> >
> > I repeated the run with my settings and I noticed that on at least one
> > node my app didn't come up. I can see an orted daemon on this node, but
> > no other process. And this was 30 minutes after the app started.
> >
> > orted -mca ess tm -mca orte_ess_jobid 125894656 -mca orte_ess_vpid 63 -mca orte_ess_num_procs 255 --hnp-uri ...
> >
> > Thorsten
> >
> > On Wednesday, June 22, 2011, Gilbert Grosdidier wrote:
> >> Hello Thorsten,
> >>
> >> I'm not surprised about the cluster type, indeed, but I do not remember
> >> seeing the specific hang-up you mention.
> >>
> >> Anyway, I suspect SGI Altix is a little bit special for Open MPI, and I
> >> usually run with the following setup:
> >> - you need to create a specific tmp area for each job, like
> >>   "/scratch/ggg/uuu/run/tmp/pbs.${PBS_JOBID}"
> >> - then use something like this:
> >>
> >>   setenv TMPDIR "/scratch/ggg/uuu/run/tmp/pbs.${PBS_JOBID}"
> >>   setenv OMPI_PREFIX_ENV "/scratch/ggg/uuu/run/tmp/pbs.${PBS_JOBID}"
> >>   setenv OMPI_MCA_mpi_leave_pinned_pipeline 1
> >>
> >> - then, for running: many of these -mca options are probably useless
> >>   with your app, while others may turn out to be useful. Adapt them to
> >>   your own needs ...
> >>
> >>   mpiexec -mca coll_tuned_use_dynamic_rules 1 -hostfile $PBS_NODEFILE
> >>   -mca rmaps seq -mca btl_openib_rdma_pipeline_send_length 65536
> >>   -mca btl_openib_rdma_pipeline_frag_size 65536
> >>   -mca btl_openib_min_rdma_pipeline_size 65536
> >>   -mca btl_self_rdma_pipeline_send_length 262144
> >>   -mca btl_self_rdma_pipeline_frag_size 262144
> >>   -mca plm_rsh_num_concurrent 4096 -mca mpi_paffinity_alone 1
> >>   -mca mpi_leave_pinned_pipeline 1 -mca btl_sm_max_send_size 128
> >>   -mca coll_tuned_pre_allocate_memory_comm_size_limit 1048576
> >>   -mca btl_openib_cq_size 128 -mca btl_ofud_rd_num 128
> >>   -mca mpi_preconnect_mpi 0 -mca mpool_sm_min_size 131072
> >>   -mca btl sm,openib,self -mca btl_openib_want_fork_support 0
> >>   -mca opal_set_max_sys_limits 1 -mca osc_pt2pt_no_locks 1
> >>   -mca osc_rdma_no_locks 1 YOUR_APP
> >>
> >>   (Careful: on the command line this must all be one single line ...)
> >>
> >> This should be suitable for up to 8k cores.
> >>
> >> HTH, Best, G.
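(As an aside: rather than passing all of those options on the command line every time, they can, if I remember correctly, also be collected in an MCA parameter file such as $HOME/.openmpi/mca-params.conf, one "name = value" per line. A sketch using a few of the parameters from above:

  # $HOME/.openmpi/mca-params.conf -- read by Open MPI at startup
  btl = sm,openib,self
  mpi_paffinity_alone = 1
  mpi_leave_pinned_pipeline = 1
  btl_openib_cq_size = 128
  opal_set_max_sys_limits = 1

Values given with -mca on the mpiexec line should still override the file, as far as I know.)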
> >> On June 22, 2011, at 09:13, Thorsten Schuett wrote:
> >>> Sure. It's an SGI ICE cluster with dual-rail IB. The HCAs are Mellanox
> >>> ConnectX IB DDR.
> >>>
> >>> This is a 2040-core job. I use 255 nodes with one MPI task on each node
> >>> and 8-way OpenMP.
> >>>
> >>> I don't need -np and -machinefile, because mpiexec picks up this
> >>> information from PBS.
> >>>
> >>> Thorsten
> >>>
> >>> On Tuesday, June 21, 2011, Gilbert Grosdidier wrote:
> >>>> Hello Thorsten,
> >>>>
> >>>> Could you please be a little bit more specific about the cluster
> >>>> itself?
> >>>>
> >>>> G.
> >>>>
> >>>> On June 21, 2011, at 17:46, Thorsten Schuett wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I am running Open MPI 1.5.3 on an IB cluster and I have problems
> >>>>> starting jobs on larger node counts. With small numbers of tasks, it
> >>>>> usually works, but now the startup failed three times in a row using
> >>>>> 255 nodes. I am using 255 nodes with one MPI task per node, and the
> >>>>> mpiexec command looks as follows:
> >>>>>
> >>>>> mpiexec --mca btl self,openib --mca mpi_leave_pinned 0 ./a.out
> >>>>>
> >>>>> After ten minutes, I pulled a stack trace on all nodes and killed the
> >>>>> job, because there was no progress. Below you will find the stack
> >>>>> trace generated with gdb (thread apply all bt). The backtrace looks
> >>>>> basically the same on all nodes. It seems to hang in MPI_Init.
> >>>>>
> >>>>> Any help is appreciated,
> >>>>>
> >>>>> Thorsten
> >>>>>
> >>>>> Thread 3 (Thread 46914544122176 (LWP 28979)):
> >>>>> #0  0x00002b6ee912d9a2 in select () from /lib64/libc.so.6
> >>>>> #1  0x00002b6eeabd928d in service_thread_start (context=<value optimized out>) at btl_openib_fd.c:427
> >>>>> #2  0x00002b6ee835e143 in start_thread () from /lib64/libpthread.so.0
> >>>>> #3  0x00002b6ee9133b8d in clone () from /lib64/libc.so.6
> >>>>> #4  0x0000000000000000 in ?? ()
> >>>>>
> >>>>> Thread 2 (Thread 46916594338112 (LWP 28980)):
> >>>>> #0  0x00002b6ee912b8b6 in poll () from /lib64/libc.so.6
> >>>>> #1  0x00002b6eeabd7b8a in btl_openib_async_thread (async=<value optimized out>) at btl_openib_async.c:419
> >>>>> #2  0x00002b6ee835e143 in start_thread () from /lib64/libpthread.so.0
> >>>>> #3  0x00002b6ee9133b8d in clone () from /lib64/libc.so.6
> >>>>> #4  0x0000000000000000 in ?? ()
> >>>>>
> >>>>> Thread 1 (Thread 47755361533088 (LWP 28978)):
> >>>>> #0  0x00002b6ee9133fa8 in epoll_wait () from /lib64/libc.so.6
> >>>>> #1  0x00002b6ee87745db in epoll_dispatch (base=0xb79050, arg=0xb558c0, tv=<value optimized out>) at epoll.c:215
> >>>>> #2  0x00002b6ee8773309 in opal_event_base_loop (base=0xb79050, flags=<value optimized out>) at event.c:838
> >>>>> #3  0x00002b6ee875ee92 in opal_progress () at runtime/opal_progress.c:189
> >>>>> #4  0x0000000039f00001 in ?? ()
> >>>>> #5  0x00002b6ee87979c9 in std::ios_base::Init::~Init () at ../../.././libstdc++-v3/src/ios_init.cc:123
> >>>>> #6  0x00007fffc32c8cc8 in ?? ()
> >>>>> #7  0x00002b6ee9d20955 in orte_grpcomm_bad_get_proc_attr (proc=<value optimized out>, attribute_name=0x2b6ee88e5780 " \020322351n+", val=0x2b6ee875ee92, size=0x7fffc32c8cd0) at grpcomm_bad_module.c:500
> >>>>> #8  0x00002b6ee86dd511 in ompi_modex_recv_key_value (key=<value optimized out>, source_proc=<value optimized out>, value=0xbb3a00, dtype=14 '\016') at runtime/ompi_module_exchange.c:125
> >>>>> #9  0x00002b6ee86d7ea1 in ompi_proc_set_arch () at proc/proc.c:154
> >>>>> #10 0x00002b6ee86db1b0 in ompi_mpi_init (argc=15, argv=0x7fffc32c92f8, requested=<value optimized out>, provided=0x7fffc32c917c) at runtime/ompi_mpi_init.c:699
> >>>>> #11 0x00007fffc32c8e88 in ?? ()
> >>>>> #12 0x00002b6ee77f8348 in ?? ()
> >>>>> #13 0x00007fffc32c8e60 in ?? ()
> >>>>> #14 0x00007fffc32c8e20 in ?? ()
> >>>>> #15 0x0000000009efa994 in ?? ()
> >>>>> #16 0x0000000000000000 in ?? ()