Hi,

I have figured out why the tmpdir parameter worked only for the first process. I now have another problem when I try to run 400 processes (there is no problem with fewer than 400 processes). I get the error "ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at line 106". The output is attached below:

[ac27:12442] [0,0,0] setting up session dir with
[ac27:12442] tmpdir /jobfs/z07/247752.ac-pbs
[ac27:12442] universe default-universe-12442
[ac27:12442] user kxc565
[ac27:12442] host ac27
[ac27:12442] jobid 0
[ac27:12442] procid 0
[ac27:12442] procdir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0/0
[ac27:12442] jobdir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0
[ac27:12442] unidir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442
[ac27:12442] top: openmpi-sessions-kxc565@ac27_0
[ac27:12442] tmp: ??
[ac27:12442] [0,0,0] contact_file /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/universe-setup.txt
[ac27:12442] [0,0,0] wrote setup file
[ac27:12447] [0,0,1] setting up session dir with
[ac27:12447] universe default-universe-12442
[ac27:12447] user kxc565
[ac27:12447] host ac27
[ac27:12447] jobid 0
[ac27:12447] procid 1
[ac27:12447] procdir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0/1
[ac27:12447] jobdir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442/0
[ac27:12447] unidir: /jobfs/z07/247752.ac-pbs/openmpi-sessions-kxc565@ac27_0/default-universe-12442
[ac27:12447] top: openmpi-sessions-kxc565@ac27_0
[ac27:12447] tmp: /jobfs/z07/247752.ac-pbs
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at line 106
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file odls_default_module.c at line 663
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file odls_default_module.c at line 1191
[ac27:12447] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at line 594
[ac27:12442] spawn: in job_state_callback(jobid = 1, state = 0x80)
mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal 15 (Terminated).
[ac27:12447] sess_dir_finalize: job session dir not empty - leaving
[ac27:12447] sess_dir_finalize: proc session dir not empty - leaving
[ac27:12442] sess_dir_finalize: proc session dir not empty - leaving
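
One thing I may check next is the per-process open-file limit on the node. This is only a guess on my part: the log does not say which resource ran out, and the figures of 3 descriptors per launched process and 16 reserved descriptors below are my own assumptions, not values taken from Open MPI. A minimal Python sketch of that check (the function names are hypothetical):

```python
import resource


def estimate_needed(num_procs, fds_per_proc=3, reserved=16):
    """Rough descriptor estimate for a launcher starting num_procs local processes.

    fds_per_proc and reserved are assumed values for illustration only.
    """
    return num_procs * fds_per_proc + reserved


def fits(needed, soft_limit):
    """Return True if `needed` descriptors fit under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return needed <= soft_limit


if __name__ == "__main__":
    # Soft and hard limits on open files for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = estimate_needed(400)
    print("soft limit:", soft, "hard limit:", hard)
    print("estimated descriptors for 400 processes:", needed)
    print("fits under soft limit:", fits(needed, soft))
```

If the estimate exceeds the soft limit on ac27, that would at least be consistent with the failure appearing only at around 400 processes, but it would not prove that this is the cause.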


Thanks,
Clement

Clement Kam Man Chu wrote:
Hi,

I am using Open MPI 1.2.3 on an ia64 machine. I ran "mpirun -d --tmpdir /home/565/kxc565/tmpdir -mca btl sm -np 400 ./testprogram". I found that only the first process uses my tmpdir setting to store its temporary files; the second process uses the default setting and stores its temporary files in the /tmp directory. How can I make all processes store their temporary files in the directory I specify? I have attached the output from Open MPI for more detail. Thanks for any help.

Cheers,
Clement


[ac27:27928] [0,0,0] setting up session dir with
[ac27:27928] tmpdir /home/565/kxc565/tmpdir
[ac27:27928] universe default-universe-27928
[ac27:27928] user kxc565
[ac27:27928] host ac27
[ac27:27928] jobid 0
[ac27:27928] procid 0
[ac27:27928] procdir: /home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0/0
[ac27:27928] jobdir: /home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0
[ac27:27928] unidir: /home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928
[ac27:27928] top: openmpi-sessions-kxc565@ac27_0
[ac27:27928] tmp: ?
[ac27:27928] [0,0,0] contact_file /home/565/kxc565/tmpdir/openmpi-sessions-kxc565@ac27_0/default-universe-27928/universe-setup.txt
[ac27:27928] [0,0,0] wrote setup file
[ac27:27932] [0,0,1] setting up session dir with
[ac27:27932] universe default-universe-27928
[ac27:27932] user kxc565
[ac27:27932] host ac27
[ac27:27932] jobid 0
[ac27:27932] procid 1
[ac27:27932] procdir: /tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0/1
[ac27:27932] jobdir: /tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928/0
[ac27:27932] unidir: /tmp/openmpi-sessions-kxc565@ac27_0/default-universe-27928
[ac27:27932] top: openmpi-sessions-kxc565@ac27_0
[ac27:27932] tmp: /tmp
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at line 106
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file odls_default_module.c at line 663
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file odls_default_module.c at line 1191
[ac27:27932] [0,0,1] ORTE_ERROR_LOG: Out of resource in file orted.c at line 594
[ac27:27928] spawn: in job_state_callback(jobid = 1, state = 0x80)
mpirun noticed that job rank 0 with PID 0 on node ac27 exited on signal 15 (Terminated).
[ac27:27932] sess_dir_finalize: job session dir not empty - leaving
[ac27:27932] sess_dir_finalize: proc session dir not empty - leaving
[ac27:27928] sess_dir_finalize: proc session dir not empty - leaving



--
Clement Kam Man Chu
Research Assistant
Faculty of Information Technology
Monash University, Caulfield Campus
Ph: 61 3 9903 2355
