Re: [OMPI users] Multiple mpiexec's within a job (schedule within a scheduled machinefile/job allocation)

2009-07-31 Thread Ralph Castain
On Jul 30, 2009, at 4:36 PM, Adams, Brian M wrote: I found the manual pages for mpirun and orte_hosts, which have a pretty thorough description of these features. Let me know if there's anything else I should check out. My quick impression is that this will meet at least 90% of user nee

[OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Wilko Keegstra
Hi, I have the following problem. I am using Open MPI 1.3.3. Programs (run directly and from scripts) submitted with mpiexec run fine. Programs (run directly and from scripts) submitted through Torque 2.3.7, with Open MPI compiled with --with-tm (and torque-devel installed), give segfaulting of the

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Ralph Castain
Could you send the contents of a PBS_NODEFILE from a Torque 2.3.7 allocation, and the man page for tm_spawn? My only guess would be that something changed in those areas as we don't really use anything else from Torque, and run on Torque-based clusters in production every day. Not sure what

Re: [OMPI users] strange IMB runs

2009-07-31 Thread Michael Di Domenico
mpi_leave_pinned didn't help; still at ~145 MB/sec. Raising btl_sm_eager_limit from 4096 to 8192 pushes me up to ~212 MB/sec, but pushing it past that doesn't change it anymore. Are there any intelligent programs that can go through and test all the different permutations of tunables for Open MPI? Outside of me
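For anyone wanting to script such a sweep by hand, here is a minimal sketch, not an existing tool. It only builds one mpirun command line per combination of MCA parameter values and prints them; nothing is executed. The helper name mca_sweep, the value grids, the process count, and the ./IMB-MPI1 PingPong invocation are all assumptions made for illustration. Only the two parameter names come from the message above.

```python
from itertools import product


def mca_sweep(base_cmd, tunables):
    """Yield one mpirun argument list per combination of MCA parameter values.

    base_cmd -- arguments that follow the MCA options (assumed benchmark call)
    tunables -- dict mapping an MCA parameter name to a list of values to try
    """
    names = sorted(tunables)  # fixed order so the output is reproducible
    for values in product(*(tunables[name] for name in names)):
        mca_args = []
        for name, value in zip(names, values):
            mca_args += ["--mca", name, str(value)]
        yield ["mpirun", *mca_args, *base_cmd]


# Hypothetical grid: parameter names from the thread, values chosen arbitrarily.
tunables = {
    "mpi_leave_pinned": [0, 1],
    "btl_sm_eager_limit": [4096, 8192, 16384],
}

# Assumed benchmark invocation; adjust the path and process count as needed.
commands = list(mca_sweep(["-np", "2", "./IMB-MPI1", "PingPong"], tunables))

for cmd in commands:
    print(" ".join(cmd))
```

Each printed line could then be run and timed separately. The number of combinations grows multiplicatively with every added parameter, so a full sweep is probably only practical for a handful of tunables.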

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Wilko Keegstra
.3.gz, Job output: AlignImages.o34.gz, momlog-20090731 I hope you can help me, kind regards, Wilko Ralph Castain wrote: > Could you send the contents of a PBS_NODEFILE from a Torque 2.3.7 > allocation, and the man page for tm_spawn? > > My only guess would be that something changed in

Re: [OMPI users] strange IMB runs

2009-07-31 Thread Edgar Gabriel
Michael Di Domenico wrote: mpi_leave_pinned didn't help; still at ~145 MB/sec. Raising btl_sm_eager_limit from 4096 to 8192 pushes me up to ~212 MB/sec, but pushing it past that doesn't change it anymore. Are there any intelligent programs that can go through and test all the different permutations of tunable

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Ralph Castain
-psml.img /pcs/pc00/keegstra/work/hm/hemo-mix-psml-ali.img 4 9 14 1 2497 360.000 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 and the job crashed almost immediately. I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you can help me. Kind regards, Wilko

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Wilko Keegstra
>> 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 >> >> and the job crashed almost immediately. I have attached: >> tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731 >> >> I hope you can help me, >> kind regards, >> Wilko >>

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Ralph Castain
360.000 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 and the job crashed almost immediately. I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you can help me. Kind regards, Wilko Ralph Castain wrote: Could you send the contents of a PBS_NODEFILE from

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread W.Keegstra
s/pc00/keegstra/work/hm/hemo-mix-psml.img /pcs/pc00/keegstra/work/hm/hemo-mix-psml-ali.img 4 9 14 1 2497 360.000 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 and the job crashed almost immediately. I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you ca

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Ralph Castain
eegstra/work/hm/hemo-mix-psml.img /pcs/pc00/keegstra/work/hm/hemo-mix-psml-ali.img 4 9 14 1 2497 360.000 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 and the job crashed almost immediately. I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you can h

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread Gus Correa
I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you can help me. Kind regards, Wilko Ralph Castain wrote: Could you send the contents of a PBS_NODEFILE from a Torque 2.3.7 allocation, and the man page for tm_spawn? My only guess would be that something changed in

Re: [OMPI users] programs are segfaulting using Torque & OpenMPI

2009-07-31 Thread W.Keegstra
ork/hm/hemo-mix-psml.img /pcs/pc00/keegstra/work/hm/hemo-mix-psml-ali.img 4 9 14 1 2497 360.000 64.000 /pcs/pc00/keegstra/work/hm/hemo-mix-pref.img 1 7 0 and the job crashed almost immediately. I have attached: tm.3.gz, Job output: AlignImages.o34.gz, momlog-20090731. I hope you can help me. Kind