Hi,
I have figured out why the tmpdir parameter works only for the first
process. However, I now have another problem when I try to run 400
processes (there is no problem with fewer than 400 processes). I get the
error "ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at
line 106". The full output is below:
[ac27:12442] [0,0,0] setting up session dir with
[ac27:12442] tmpdir /jobfs/z07/247752.ac-pbs
[ac27:12442] universe default-universe-12442
[ac27:12442] user kxc565
[ac27:12442] host ac27
[ac27:12442] jobid 0
[ac27:12442] procid 0
[ac27:12442] procdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0/0
[ac27:12442] jobdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0
[ac27:12442] unidir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442
[ac27:12442] top: openmpi-sessions-kxc565@ac27_0
[ac27:12442] tmp: ??
[ac27:12442] [0,0,0] contact_file
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/universe-setup.txt
[ac27:12442] [0,0,0] wrote setup file
[ac27:12447] [0,0,1] setting up session dir with
[ac27:12447] universe default-universe-12442
[ac27:12447] user kxc565
[ac27:12447] host ac27
[ac27:12447] jobid 0
[ac27:12447] procid 1
[ac27:12447] procdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0/1
[ac27:12447] jobdir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0
[ac27:12447] unidir:
/jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442
[ac27:12447] top: openmpi-sessions-kxc565@ac27_0
[ac27:12447] tmp: /jobfs/z07/247752.ac-pbs
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
base/iof_base_setup.c at line 106
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 663
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 1191
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at
line 594
[ac27:12442] spawn: in job_state_callback(jobid = 1, state = 0x80)
mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal
15 (Terminated).
[ac27:12447] sess_dir_finalize: job session dir not empty - leaving
[ac27:12447] sess_dir_finalize: proc session dir not empty - leaving
[ac27:12442] sess_dir_finalize: proc session dir not empty - leaving
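One possible cause, which I have not confirmed, is the per-process limit on open file descriptors, assuming the launcher needs a few descriptors for the standard I/O of every process it starts on the node. The Python sketch below only reports that limit and does a rough arithmetic check. The helper names and the descriptors-per-process figure are hypothetical and are not taken from Open MPI.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def enough_descriptors(nprocs, fds_per_proc, soft_limit):
    """Rough check: do nprocs * fds_per_proc descriptors fit under soft_limit?

    fds_per_proc is an assumption supplied by the caller; the real number
    used by the launcher is not known here.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return nprocs * fds_per_proc < soft_limit


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # 400 is the process count mentioned above; 3 descriptors per process is a guess.
    print("fits:", enough_descriptors(400, 3, soft))
```

If the soft limit turns out to be the bottleneck, it can usually be raised up to the hard limit with "ulimit -n" in the shell before launching mpirun.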
Thanks,
Clement
Clement Kam Man Chu wrote:
Hi,
I am using Open MPI 1.2.3 on an ia64 machine. I ran "mpirun -d --tmpdir
/home/565/kxc565/tmpdir -mca btl sm -np 400 ./testprogram". Only the
first process uses my setting for storing its temporary files; the
second process falls back to the default and stores them in the /tmp
directory. How can I make all processes store their temporary files in
the directory I specify? I have attached the output from Open MPI for
more details.
Thanks for any help.
Cheers,
Clement
[ac27:27928] [0,0,0] setting up session dir with
[ac27:27928] tmpdir /home/565/kxc565/tmpdir
[ac27:27928] universe default-universe-27928
[ac27:27928] user kxc565
[ac27:27928] host ac27
[ac27:27928] jobid 0
[ac27:27928] procid 0
[ac27:27928] procdir:
/home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0/0
[ac27:27928] jobdir:
/home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0
[ac27:27928] unidir:
/home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928
[ac27:27928] top: openmpi-sessions-kxc565@ac27_0
[ac27:27928] tmp: ?
[ac27:27928] [0,0,0] contact_file
/home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/universe-setup.txt
[ac27:27928] [0,0,0] wrote setup file
[ac27:27932] [0,0,1] setting up session dir with
[ac27:27932] universe default-universe-27928
[ac27:27932] user kxc565
[ac27:27932] host ac27
[ac27:27932] jobid 0
[ac27:27932] procid 1
[ac27:27932] procdir:
/tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0/1
[ac27:27932] jobdir:
/tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0
[ac27:27932] unidir:
/tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928
[ac27:27932] top: openmpi-sessions-kxc565@ac27_0
[ac27:27932] tmp: /tmp
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
base/iof_base_setup.c at line 106
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 663
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file
odls_default_module.c at line 1191
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at
line 594
[ac27:27928] spawn: in job_state_callback(jobid = 1, state = 0x80)
mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal
15 (Terminated).
[ac27:27932] sess_dir_finalize: job session dir not empty - leaving
[ac27:27932] sess_dir_finalize: proc session dir not empty - leaving
[ac27:27928] sess_dir_finalize: proc session dir not empty - leaving
--
Clement Kam Man Chu
Research Assistant
Faculty of Information Technology
Monash University, Caulfield Campus
Ph: 61 3 9903 2355