Good, I will read it.

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Thursday, March 20, 2014 15:00
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI job initializing problem

On 03/20/2014 02:13 PM, Ralph Castain wrote:
>
> On Mar 20, 2014, at 9:48 AM, Beichuan Yan <beichuan....@colorado.edu> wrote:
>
>> Hi,
>>
>> Today I tested OMPI v1.7.5rc5 and surprisingly, it works like a charm!
>>
>> I found discussions related to this issue:
>>
>> 1. http://www.open-mpi.org/community/lists/users/2011/11/17688.php
>> The correct solution here is to get your sys admin to make /tmp local.
>> Making /tmp NFS mounted across multiple nodes is a major "faux pas"
>> in the Linux world - it should never be done, for the reasons stated by Jeff.
>>

Actually, besides the previous discussions on this thread,
this problem is documented in the OMPI FAQ:

http://www.open-mpi.org/faq/?category=sm#poor-sm-btl-performance

>> my comment: for most clusters I have used, /tmp is NOT local.
>> The Open MPI community may not be able to enforce it.
>
> We don't enforce anything, but /tmp being network mounted is a
> VERY unusual situation in the cluster world, and strongly discouraged
>

I agree that it is bad.
Perhaps unusual also, but not unheard of.
If these nodes are diskless,
I guess that the cluster vendor would probably
recommend mounting /tmp as a tmpfs / ramfs (in RAM / shared memory).
That is what is usually done in diskless computers, right?
Why some installations mount /tmp over the network is unclear.
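
For what it's worth, the usual arrangement (just a sketch, not from this
cluster; the size is arbitrary) is an fstab entry on the compute node such as:

   # /etc/fstab: keep /tmp in RAM instead of on the network
   tmpfs   /tmp   tmpfs   rw,size=512m,mode=1777   0 0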

I guess OpenMPI is not alone in using /tmp to store
temporary and readily accessible stuff,
which, given its name, is what /tmp is supposed to hold.
So, it is not a matter of OMPI enforcing it.

However, reducing the dependence on /tmp may be a plus anyway.

>
>>
>> 2. http://www.open-mpi.org/community/lists/users/2011/11/17684.php
>> In the upcoming OMPI v1.7, we revamped the shared memory setup code such 
>> that it'll actually use /dev/shm properly, or use some mechanism other 
>> than an mmap file backed by a real filesystem. So the issue goes away.
>>
>> my comment: up to OMPI v1.7.4, this shmem issue was still there. However, it 
>> is resolved in OMPI v1.7.5rc5. This is surprising.
>>
>> Anyway, OMPI v1.7.5rc5 works well for multi-processes-on-one-node (shmem) 
>> mode on Spirit. There is no need to tune TCP or IB parameters to use it. My 
>> code just runs well:
>>
>> My test data takes 20 minutes to run with OMPI v1.7.4, but needs less than 1 
>> minute with OMPI v1.7.5rc5. I don't know what the magic is. I am wondering 
>> when OMPI v1.7.5 final will be released.
>>
>> I will update performance comparison between Intel MPI and Open MPI.
>>
>> Thanks,
>> Beichuan
>>
>>
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
>> Sent: Friday, March 07, 2014 18:41
>> To: Open MPI Users
>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>
>> On 03/06/2014 04:52 PM, Beichuan Yan wrote:
>>> No, I did all these and none worked.
>>>
>>> I just found that, with exactly the same code, data, and job settings, a job 
>>> may run fine one day but fail the next. It is NOT repeatable. I 
>>> don't know what the problem is: hardware? OpenMPI? PBS Pro?
>>>
>>> Anyway, I may have to give up using OpenMPI on that system and switch to 
>>> Intel MPI, which always works.
>>>
>>> Thanks,
>>> Beichuan
>>
>> Well, this machine may have been set up to run only Intel MPI (DAPL?) and SGI 
>> MPI.
>> It is a pity that it doesn't seem to work with OpenMPI.
>>
>> In any case, good luck with your research project.
>>
>> Gus Correa
>>
>>>
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus
>>> Correa
>>> Sent: Thursday, March 06, 2014 13:51
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>
>>> On 03/06/2014 03:35 PM, Beichuan Yan wrote:
>>>> Gus,
>>>>
>>>> Yes, 10.148.0.0/16 is the IB subnet.
>>>>
>>>> I did try others but none worked:
>>>> #export
>>>> TCP="--mca btl sm,openib"
>>>> No run, no output
>>>
>>> If I remember right, and unless this changed in recent OMPI versions, you 
>>> also need "self":
>>>
>>> -mca btl sm,openib,self
>>>
>>> Alternatively, you could rule out tcp:
>>>
>>> -mca btl ^tcp
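>>>
>>> Concretely, something like this (just a sketch, reusing the mpirun line
>>> from your own job script; adjust the process counts to your run):
>>>
>>> # request IB verbs + shared memory + self explicitly:
>>> mpirun --mca btl sm,openib,self -np 64 -npernode 16 \
>>>        -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
>>>
>>> # or simply rule out tcp and let OMPI pick the rest:
>>> mpirun --mca btl ^tcp -np 64 -npernode 16 \
>>>        -hostfile $PBS_NODEFILE ./paraEllip3d input.txt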
>>>
>>>>
>>>> #export
>>>> TCP="--mca btl sm,openib --mca btl_tcp_if_include 10.148.0.0/16"
>>>> No run, no output
>>>>
>>>> Beichuan
>>>
>>> Likewise, "self" is missing here.
>>>
>>> Also, I don't know if you can ask for openib and also add --mca 
>>> btl_tcp_if_include 10.148.0.0/16.
>>> Note that one turns off tcp (I think), whereas the other requests a
>>> tcp interface (or rather the IB interface via its IPoIB functionality).
>>> That combination sounds weird to me.
>>> The OMPI developers may clarify whether it is a valid combination of options.
>>>
>>> I would try simply -mca btl sm,openib,self, which is likely to give
>>> you the IB transport with verbs, plus shared memory intra-node, plus
>>> the
>>> (mandatory?) self (loopback interface?).
>>> In my experience, this will also help identify any malfunctioning IB HCA in 
>>> the nodes (with a failure/error message).
>>>
>>>
>>> I hope it helps,
>>> Gus Correa
>>>
>>>
>>>>
>>>> -----Original Message-----
>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus
>>>> Correa
>>>> Sent: Thursday, March 06, 2014 13:16
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>>
>>>> Hi Beichuan
>>>>
>>>> So, it looks like the program now runs, albeit with specific 
>>>> settings depending on whether you're using OMPI 1.6.5 or 1.7.4, right?
>>>>
>>>> It looks like the problem now is performance, right?
>>>>
>>>> System load affects performance, but unless the network is overwhelmed, or 
>>>> perhaps the Lustre file system is hanging or too slow, I would think that 
>>>> a walltime increase from 1min to 10min is not related to system load, but 
>>>> something else.
>>>>
>>>> Do you remember the setup that gave you 1min walltime?
>>>> Was it the same that you sent below?
>>>> Do you happen to know which nodes?
>>>> Are you sharing nodes with other jobs, or are you running alone on the 
>>>> nodes?
>>>> Sharing with other processes may slow down your job.
>>>> If you request all cores in the node, PBS should give you a full node 
>>>> (unless they tricked PBS to think the nodes have more cores than they 
>>>> actually do).
>>>> How do you request the nodes in your #PBS directives?
>>>> Do you request nodes and ppn, or do you request procs?
>>>>
>>>> I suggest that you do:
>>>> cat $PBS_NODEFILE
>>>> in your PBS script, just to document which nodes are actually given to you.
>>>>
>>>> Also helpful to document/troubleshoot is to add -v and -tag-output to your 
>>>> mpiexec command line.
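>>>>
>>>> For instance, in the PBS script (a sketch only; the numbers are the ones
>>>> from your earlier messages):
>>>>
>>>> cat $PBS_NODEFILE        # record which nodes PBS actually handed out
>>>> mpiexec -v -tag-output -np 64 -npernode 16 \
>>>>         -hostfile $PBS_NODEFILE ./paraEllip3d input.txt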
>>>>
>>>>
>>>> The difference in walltime could be due to some malfunction of IB HCAs on 
>>>> the nodes, for instance.
>>>> Since you are allowing (if I remember right) the use of TCP, OpenMPI will 
>>>> try to use any interfaces that you did not rule out.
>>>> If your mpiexec command line doesn't make any restriction, it will use 
>>>> anything available, if I remember right.
>>>> (Jeff will correct me in the next second.) If your mpiexec command line 
>>>> has --mca btl_tcp_if_include 10.148.0.0/16, it will use the 10.148.0.0/16 
>>>> subnet with TCP transport, I think.
>>>> (Jeff will cut my list subscription after that one, for spreading
>>>> misinformation.)
>>>>
>>>> In either case my impression is that you may have left a door open to the 
>>>> use of non-IB (and non-IB-verbs) transport.
>>>>
>>>> Is 10.148.0.0/16 an InfiniBand subnet or an Ethernet subnet?
>>>>
>>>> Did you remember Jeff's suggestion from a while ago to avoid TCP (over 
>>>> Ethernet or over IB) and stick to IB verbs?
>>>>
>>>>
>>>> Is 10.148.0.0/16 the IB or the Ethernet subnet?
>>>>
>>>> On 03/02/2014 02:38 PM, Jeff Squyres (jsquyres) wrote:
>>>>>   Both 1.6.x and 1.7.x/1.8.x will need verbs.h to use the native verbs
>>>>>   network stack.
>>>>>
>>>>>   You can use emulated TCP over IB (e.g., using the OMPI TCP BTL), but
>>>>>   it's nowhere near as fast/efficient as the native verbs network stack.
>>>>>
>>>>
>>>>
>>>> You could force the use of IB verbs with
>>>>
>>>> -mca btl ^tcp
>>>>
>>>> or with
>>>>
>>>> -mca btl sm,openib,self
>>>>
>>>> on the mpiexec command line.
>>>>
>>>> In this case, if any of the IB HCAs on the nodes is bad, the job will
>>>> abort with an error message, instead of running too slow (if it is
>>>> using other networks).
>>>>
>>>> There are also ways to tell OMPI to produce more verbose output, which
>>>> may help diagnose the problem.
>>>> ompi_info | grep verbose
>>>> may give some hints (I confess I don't remember them).
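>>>>
>>>> If I recall the parameter names correctly, something along these lines
>>>> (a sketch; the verbosity level is arbitrary):
>>>>
>>>> ompi_info --param btl all | grep verbose
>>>> mpiexec --mca btl_base_verbose 30 -np 64 -npernode 16 \
>>>>         -hostfile $PBS_NODEFILE ./paraEllip3d input.txt
>>>>
>>>> should at least show which BTL (openib, tcp, sm) each process pair
>>>> actually selects.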
>>>>
>>>>
>>>> Believe me, this did happen to me, i.e., running MPI programs on a
>>>> cluster that had all sorts of non-homogeneous nodes, some with faulty
>>>> IB HCAs, some with incomplete OFED installations, some that were not
>>>> mounting shared file systems properly, etc.
>>>> [I didn't administer that one!]
>>>> Hopefully that is not the problem you are facing, but verbose output
>>>> may help anyways.
>>>>
>>>> I hope this helps,
>>>> Gus Correa
>>>>
>>>>
>>>>
>>>> On 03/06/2014 01:49 PM, Beichuan Yan wrote:
>>>>> 1. For $TMPDIR and $TCP, there are four combinations obtained by commenting 
>>>>> these two lines on/off (note the system's default TMPDIR=/work3/yanb):
>>>>> export TMPDIR=/work1/home/yanb/tmp
>>>>> TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>>
>>>>> 2. I tested the 4 combinations for OpenMPI 1.6.5 and OpenMPI 1.7.4 
>>>>> respectively in the pure-MPI mode (no OPENMP threads; 8 nodes, each node 
>>>>> runs 16 processes). The results are weird: of all 8 cases, only TWO of 
>>>>> them run, and those run very slowly:
>>>>>
>>>>> OpenMPI 1.6.5:
>>>>> export TMPDIR=/work1/home/yanb/tmp
>>>>> TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>> Warning: shared-memory file on /work1/home/yanb/tmp/. Runs, takes 10 minutes,
>>>>> slow
>>>>>
>>>>> OpenMPI 1.7.4:
>>>>> #export TMPDIR=/work1/home/yanb/tmp
>>>>> #TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>> Warning: shared-memory file on /work3/yanb/605832.SPIRIT/. Runs, takes 10
>>>>> minutes, slow
>>>>>
>>>>> So you see, a) OpenMPI 1.6.5 and 1.7.4 need different settings to
>>>>> run;
>>>>> b) whether or not I specify TMPDIR, I get the shared-memory warning.
>>>>>
>>>>> 3. But a few days ago, OpenMPI 1.6.5 worked great and took only 1
>>>>> minute (now it takes 10 minutes). I am so confused by the results.
>>>>> Does the system load level, load fluctuation, or PBS Pro affect
>>>>> OpenMPI performance?
>>>>>
>>>>> Thanks,
>>>>> Beichuan
>>>>>
>>>>> -----Original Message-----
>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus
>>>>> Correa
>>>>> Sent: Tuesday, March 04, 2014 08:48
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>>>
>>>>> Hi Beichuan
>>>>>
>>>>> So, from "df" it looks like /home is /work1, right?
>>>>>
>>>>> Also, "mount" shows only /work[1-4], not the other
>>>>> 7 CWFS panfs (Panasas?), which apparently are not available in the 
>>>>> compute nodes/blades.
>>>>>
>>>>> I presume you have access and are using only some of the /work[1-4]
>>>>> (lustre) file systems for all your MPI and other software installation, 
>>>>> right? Not the panfs, right?
>>>>>
>>>>> Awkward that it doesn't work, because lustre is supposed to be a parallel 
>>>>> file system, highly available to all nodes (assuming it is mounted on all 
>>>>> nodes).
>>>>>
>>>>> It also shows a small /tmp with a tmpfs file system, which is volatile, 
>>>>> in memory:
>>>>>
>>>>> http://en.wikipedia.org/wiki/Tmpfs
>>>>>
>>>>> I would guess they don't let you write there, so TMPDIR=/tmp may not be a 
>>>>> possible option, but this is just a wild guess.
>>>>> Or maybe OMPI requires an actual non-volatile file system to write its 
>>>>> shared memory auxiliary files and other stuff that normally goes on /tmp? 
>>>>>  [Jeff, Ralph, help!!] I kind of remember some old discussion on this 
>>>>> list about this, but maybe it was in another list.
>>>>>
>>>>> [You could ask the sys admin about this, and perhaps what he
>>>>> recommends to use to replace /tmp.]
>>>>>
>>>>> Just in case they have some file system mount point mixup, you could 
>>>>> perhaps try TMPDIR=/work1/yanb/tmp (rather than /home). You could also try 
>>>>> TMPDIR=/work3/yanb/tmp, which, if I remember right, is another file 
>>>>> system you have access to (not sure anymore, it may have been in the 
>>>>> previous emails).
>>>>> Either way, you may need to create the tmp directory beforehand.
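>>>>>
>>>>> I.e., something like this in the job script, before mpiexec (a sketch;
>>>>> pick whichever file system you actually have write access to):
>>>>>
>>>>> export TMPDIR=/work1/yanb/tmp    # or /work3/yanb/tmp
>>>>> mkdir -p $TMPDIR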
>>>>>
>>>>> **
>>>>>
>>>>> Any chances that this is an environment mixup?
>>>>>
>>>>> Say, you may be inadvertently using the SGI-MPI mpiexec.
>>>>> Using a /full/path/to/mpiexec in your job may clarify this.
>>>>>
>>>>> "which mpiexec" will tell, but since the environment on the compute nodes 
>>>>> may not be exactly the same as in the login node, it may not be reliable 
>>>>> information.
>>>>>
>>>>> Or perhaps you may not be pointing to the OMPI libraries?
>>>>> Are you exporting PATH and LD_LIBRARY_PATH in .bashrc/.tcshrc, with the 
>>>>> OMPI items (bin and lib) *PREPENDED* (not appended), so as to take 
>>>>> precedence over any pre-existing (e.g. SGI) MPI items?
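>>>>>
>>>>> E.g., in .bashrc (a sketch; substitute the actual prefix where your
>>>>> OMPI is installed):
>>>>>
>>>>> export PATH=/path/to/openmpi/bin:$PATH
>>>>> export LD_LIBRARY_PATH=/path/to/openmpi/lib:$LD_LIBRARY_PATH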
>>>>>
>>>>> Those are pretty (ugly) common problems.
>>>>>
>>>>> **
>>>>>
>>>>> I hope this helps,
>>>>> Gus Correa
>>>>>
>>>>> On 03/03/2014 10:13 PM, Beichuan Yan wrote:
>>>>>> 1. info from a compute node
>>>>>> -bash-4.1$ hostname
>>>>>> r32i1n1
>>>>>> -bash-4.1$ df -h /home
>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>> 10.148.18.45@o2ib:10.148.18.46@o2ib:/fs1
>>>>>>                           1.2P  136T  1.1P  12% /work1
>>>>>> -bash-4.1$ mount
>>>>>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>>>>>> tmpfs on /tmp type tmpfs (rw,size=150m)
>>>>>> none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
>>>>>> cpuset on /dev/cpuset type cpuset (rw)
>>>>>> 10.148.18.45@o2ib:10.148.18.46@o2ib:/fs1 on /work1 type lustre
>>>>>> (rw,flock)
>>>>>> 10.148.18.76@o2ib:10.148.18.164@o2ib:/fs2 on /work2 type lustre
>>>>>> (rw,flock)
>>>>>> 10.148.18.104@o2ib:10.148.18.165@o2ib:/fs3 on /work3 type lustre
>>>>>> (rw,flock)
>>>>>> 10.148.18.132@o2ib:10.148.18.133@o2ib:/fs4 on /work4 type lustre
>>>>>> (rw,flock)
>>>>>>
>>>>>>
>>>>>> 2. For "export TMPDIR=/home/yanb/tmp", I created it beforehand, and I 
>>>>>> did see MPI-related temporary files there when the job started.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus
>>>>>> Correa
>>>>>> Sent: Monday, March 03, 2014 18:23
>>>>>> To: Open MPI Users
>>>>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>>>>
>>>>>> Hi Beichuan
>>>>>>
>>>>>> OK, it says "unclassified.html", so I presume it is not a problem.
>>>>>>
>>>>>> The web site says the computer is an SGI ICE X.
>>>>>> I am not familiar with it, so what follows are guesses.
>>>>>>
>>>>>> The SGI site brochure suggests that the nodes/blades have local disks:
>>>>>> https://www.sgi.com/pdfs/4330.pdf
>>>>>>
>>>>>> The file systems prefixed with IP addresses (work[1-4]) and with panfs 
>>>>>> (cwfs and CWFS[1-6]) and a colon (:) are shared exports (not local), but 
>>>>>> not necessarily NFS (panfs may be Panasas?).
>>>>>>      From this output it is hard to tell where /home is, but I would 
>>>>>> guess it is also shared (not local).
>>>>>> Maybe "df -h /home" will tell.  Or perhaps "mount".
>>>>>>
>>>>>> You may be logged in to a login/service node, so although it does have a 
>>>>>> /tmp (your ls / shows tmp), this doesn't guarantee that the compute 
>>>>>> nodes/blades also do.
>>>>>>
>>>>>> Since your jobs failed when you specified TMPDIR=/tmp, I would guess 
>>>>>> /tmp doesn't exist on the nodes/blades, or is not writable.
>>>>>>
>>>>>> Did you try to submit a job with, say, "mpiexec -np 16 ls -ld /tmp"?
>>>>>> This should tell if /tmp exists on the nodes, if it is writable.
>>>>>>
>>>>>> A stupid question:
>>>>>> When you tried your job with this:
>>>>>>
>>>>>> export TMPDIR=/home/yanb/tmp
>>>>>>
>>>>>> Did you create the directory /home/yanb/tmp beforehand?
>>>>>>
>>>>>> Anyway, you may need to ask the help of a system administrator of this 
>>>>>> machine.
>>>>>>
>>>>>> Gus Correa
>>>>>>
>>>>>> On 03/03/2014 07:43 PM, Beichuan Yan wrote:
>>>>>>> Gus,
>>>>>>>
>>>>>>> I am using this system: 
>>>>>>> http://centers.hpc.mil/systems/unclassified.html#Spirit.
>>>>>>> I don't know the exact configuration of the file system.
>>>>>>> Here is the output of "df -h":
>>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>>> /dev/sda6             919G   16G  857G   2% /
>>>>>>> tmpfs                  32G     0   32G   0% /dev/shm
>>>>>>> /dev/sda5             139M   33M  100M  25% /boot
>>>>>>> adfs3v-s:/adfs3/hafs14
>>>>>>>                            6.5T  678G  5.5T  11% /scratch
>>>>>>> adfs3v-s:/adfs3/hafs16
>>>>>>>                            6.5T  678G  5.5T  11% /var/spool/mail
>>>>>>> 10.148.18.45@o2ib:10.148.18.46@o2ib:/fs1
>>>>>>>                            1.2P  136T  1.1P  12% /work1
>>>>>>> 10.148.18.132@o2ib:10.148.18.133@o2ib:/fs4
>>>>>>>                            1.2P  793T  368T  69% /work4
>>>>>>> 10.148.18.104@o2ib:10.148.18.165@o2ib:/fs3
>>>>>>>                            1.2P  509T  652T  44% /work3
>>>>>>> 10.148.18.76@o2ib:10.148.18.164@o2ib:/fs2
>>>>>>>                            1.2P  521T  640T  45% /work2
>>>>>>> panfs://172.16.0.10/CWFS
>>>>>>>                            728T  286T  443T  40% /p/cwfs
>>>>>>> panfs://172.16.1.61/CWFS1
>>>>>>>                            728T  286T  443T  40% /p/CWFS1
>>>>>>> panfs://172.16.0.210/CWFS2
>>>>>>>                            728T  286T  443T  40% /p/CWFS2
>>>>>>> panfs://172.16.1.125/CWFS3
>>>>>>>                            728T  286T  443T  40% /p/CWFS3
>>>>>>> panfs://172.16.1.224/CWFS4
>>>>>>>                            728T  286T  443T  40% /p/CWFS4
>>>>>>> panfs://172.16.1.224/CWFS5
>>>>>>>                            728T  286T  443T  40% /p/CWFS5
>>>>>>> panfs://172.16.1.224/CWFS6
>>>>>>>                            728T  286T  443T  40% /p/CWFS6
>>>>>>> panfs://172.16.1.224/CWFS7
>>>>>>>                            728T  286T  443T  40% /p/CWFS7
>>>>>>>
>>>>>>> 1. My home directory is /home/yanb.
>>>>>>> My simulation files are located at /work3/yanb.
>>>>>>> The default TMPDIR set by the system is just /work3/yanb.
>>>>>>>
>>>>>>> 2. I did try not setting TMPDIR and letting it default, which corresponds to 
>>>>>>> cases 1 and 2 below.
>>>>>>>        Case1: #export TMPDIR=/home/yanb/tmp
>>>>>>>                  TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>>>>           The job fails for no apparent reason.
>>>>>>>        Case2: #export TMPDIR=/home/yanb/tmp
>>>>>>>                  #TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>>>>           It gives the warning about a shared-memory file on a network file system.
>>>>>>>
>>>>>>> 3. With "export TMPDIR=/tmp", the job fails the same way, for no apparent 
>>>>>>> reason.
>>>>>>>
>>>>>>> 4. FYI, "ls /" gives:
>>>>>>> ELT    apps  cgroup  hafs1   hafs12  hafs2  hafs5  hafs8        home   
>>>>>>> lost+found  mnt  p      root     selinux  tftpboot  var    work3
>>>>>>> admin  bin   dev     hafs10  hafs13  hafs3  hafs6  hafs9        lib    
>>>>>>> media       net  panfs  sbin     srv      tmp       work1  work4
>>>>>>> app    boot  etc     hafs11  hafs15  hafs4  hafs7  hafs_x86_64  lib64  
>>>>>>> misc        opt  proc   scratch  sys      usr       work2  workspace
>>>>>>>
>>>>>>> Beichuan
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus
>>>>>>> Correa
>>>>>>> Sent: Monday, March 03, 2014 17:24
>>>>>>> To: Open MPI Users
>>>>>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>>>>>
>>>>>>> Hi Beichuan
>>>>>>>
>>>>>>> If you are using the university cluster, chances are that /home is not 
>>>>>>> local, but on an NFS share, or perhaps Lustre (which you may have 
>>>>>>> mentioned before, I don't remember).
>>>>>>>
>>>>>>> Maybe "df -h" will show what is local and what is not.
>>>>>>> It works for NFS, which prefixes file systems with the server name, but I 
>>>>>>> don't know about Lustre.
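>>>>>>>
>>>>>>> E.g. (a sketch; "df -T" also prints the file system type, which makes
>>>>>>> lustre/nfs/panfs vs. local ext4/tmpfs easy to spot):
>>>>>>>
>>>>>>> df -hT /home/yanb/tmp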
>>>>>>>
>>>>>>> Did you try just not to set TMPDIR and let it default?
>>>>>>> If the default TMPDIR is on Lustre (did you say this? anyway, I
>>>>>>> don't remember), you could perhaps try to force it to /tmp:
>>>>>>> export TMPDIR=/tmp
>>>>>>> If the cluster nodes are diskful, /tmp is likely to exist and be local 
>>>>>>> to the cluster nodes.
>>>>>>> [But the cluster nodes may be diskless ... :( ]
>>>>>>>
>>>>>>> I hope this helps,
>>>>>>> Gus Correa
>>>>>>>
>>>>>>> On 03/03/2014 07:10 PM, Beichuan Yan wrote:
>>>>>>>> How do I set TMPDIR to a local filesystem? Is /home/yanb/tmp a local 
>>>>>>>> filesystem? I don't know how to tell whether a directory is on a local 
>>>>>>>> file system or a network file system.
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff
>>>>>>>> Squyres (jsquyres)
>>>>>>>> Sent: Monday, March 03, 2014 16:57
>>>>>>>> To: Open MPI Users
>>>>>>>> Subject: Re: [OMPI users] OpenMPI job initializing problem
>>>>>>>>
>>>>>>>> How about setting TMPDIR to a local filesystem?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mar 3, 2014, at 3:43 PM, Beichuan Yan <beichuan....@colorado.edu> wrote:
>>>>>>>>
>>>>>>>>> I agree there are two cases for the pure-MPI mode: 1. the job fails with no 
>>>>>>>>> apparent reason; 2. the job complains about a shared-memory file on a network 
>>>>>>>>> file system, which can be resolved by "export TMPDIR=/home/yanb/tmp" 
>>>>>>>>> (/home/yanb/tmp is my local directory). The default TMPDIR points to a 
>>>>>>>>> Lustre directory.
>>>>>>>>>
>>>>>>>>> There is no other output. I checked my job with "qstat -n" and 
>>>>>>>>> found that processes were actually not started on the compute nodes even 
>>>>>>>>> though PBS Pro had "started" my job.
>>>>>>>>>
>>>>>>>>> Beichuan
>>>>>>>>>
>>>>>>>>>> 3. Then I test pure-MPI mode: OPENMP is turned off, and each compute 
>>>>>>>>>> node runs 16 processes (clearly shared-memory of MPI is used). Four 
>>>>>>>>>> combinations of "TMPDIR" and "TCP" are tested:
>>>>>>>>>> case 1:
>>>>>>>>>> #export TMPDIR=/home/yanb/tmp
>>>>>>>>>> TCP="--mca btl_tcp_if_include 10.148.0.0/16"
>>>>>>>>>> mpirun $TCP -np 64 -npernode 16 -hostfile $PBS_NODEFILE
>>>>>>>>>> ./paraEllip3d input.txt
>>>>>>>>>> output:
>>>>>>>>>> Start Prologue v2.5 Mon Mar  3 15:47:16 EST 2014
>>>>>>>>>> End Prologue v2.5 Mon Mar  3 15:47:16 EST 2014
>>>>>>>>>> -bash: line 1: 448597 Terminated              /var/spool/PBS/mom_priv/jobs/602244.service12.SC
>>>>>>>>>> Start Epilogue v2.5 Mon Mar  3 15:50:51 EST 2014
>>>>>>>>>> Statistics cpupercent=0,cput=00:00:00,mem=7028kb,ncpus=128,vmem=495768kb,walltime=00:03:24
>>>>>>>>>> End Epilogue v2.5 Mon Mar  3 15:50:52 EST 2014
>>>>>>>>>
>>>>>>>>> It looks like you have two general cases:
>>>>>>>>>
>>>>>>>>> 1. The job fails for no apparent reason (like above), or 2. The
>>>>>>>>> job complains that your TMPDIR is on a shared filesystem
>>>>>>>>>
>>>>>>>>> Right?
>>>>>>>>>
>>>>>>>>> I think the real issue, then, is to figure out why your jobs are 
>>>>>>>>> failing with no output.
>>>>>>>>>
>>>>>>>>> Is there anything in the stderr output?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jeff Squyres
>>>>>>>>> jsquy...@cisco.com
>>>>>>>>> For corporate legal information go to:
>>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> jsquy...@cisco.com
>>>>>>>> For corporate legal information go to:
>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
