Hi Bruno,

I seem to have run into a similar problem. How do I turn METIS on? Do I 
need to add something to the PBS job file, or should I contact the person 
who administers the cluster? Thank you very much!
For details, please see my question: 
https://groups.google.com/forum/#!topic/dealii/116cFgS6EiE
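
Before changing the job file, it may help to confirm whether the deal.II build on the cluster was configured with METIS at all. The following is only a minimal sketch: it assumes the installed header include/deal.II/base/config.h is readable and that an enabled feature appears there as an active "#define DEAL_II_WITH_METIS" line. The script name check_metis.py and the function name metis_enabled are hypothetical.

```python
import re
import sys
from pathlib import Path

# Matches an active "#define DEAL_II_WITH_METIS" line. A disabled feature,
# written as "/* #undef DEAL_II_WITH_METIS */", does not match.
_METIS_DEFINE = re.compile(r"^\s*#\s*define\s+DEAL_II_WITH_METIS\b", re.MULTILINE)


def metis_enabled(config_text: str) -> bool:
    """Return True if the given config.h text defines DEAL_II_WITH_METIS."""
    return _METIS_DEFINE.search(config_text) is not None


if __name__ == "__main__":
    # Assumed usage (the path depends on where deal.II was installed):
    #   python check_metis.py <install-prefix>/include/deal.II/base/config.h
    if len(sys.argv) != 2:
        sys.exit("usage: check_metis.py PATH_TO_CONFIG_H")
    text = Path(sys.argv[1]).read_text()
    print("METIS enabled" if metis_enabled(text) else "METIS not enabled")
```

If the script reports that METIS is not enabled, the ExcMETISNotInstalled exception quoted below is the expected result for any run with more than one process, and the library itself would need to be rebuilt with METIS rather than the job file changed.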


On Wednesday, July 15, 2015 at 10:25:50 AM UTC-4, Bruno Turcksin wrote:
> Weixiong,
> the error you get is expected. By default, candi turns METIS off, but 
> step-17 and step-18 require METIS to run in parallel. PETSc is installed 
> correctly, which is why everything works fine on one processor. If 
> you want to use METIS, just turn it ON in candi.cfg.
> Best,
> Bruno
> On Wednesday, July 15, 2015 at 12:58:13 AM UTC-5, Weixiong Zheng wrote:
>> Sorry for the ambiguity, let me clarify here:
>> Those error messages came from running step-18 with mpirun against the 
>> candi-built PETSc (step-17 has the same problem).
>> mpirun -np 1 ./step-18 was fine, but mpirun -np x ./step-18 (x>=2) gave 
>> me the errors.
>> Thanks,
>> Weixiong
>> On Wednesday, July 15, 2015 at 12:51:07 AM UTC-5, Weixiong Zheng wrote:
>>> Dear all, 
>>> After struggling for 4 days to install the parallel packages for deal.II 
>>> as an Ubuntu rookie (see the previous discussion), I settled on a 
>>> compromise: I used candi to build deal.II + Trilinos and built 
>>> deal.II + PETSc myself from source. The reason for building PETSc without 
>>> candi is that with -np>1 the candi build fails with the error shown at 
>>> the end of this post.
>>> This time I didn't install libumfpack, but installed 
>>> libsuitesparse-dev directly. Last time neither the candi nor the 
>>> self-compiled Trilinos worked, yet candi-Trilinos works well this time, so 
>>> I assume the undefined reference errors were caused by libumfpack.
>>> I still have no luck with candi + PETSc, but the self-compiled PETSc 
>>> works well with deal.II and MPI. I didn't change anything in candi.cfg 
>>> except PROC.
>>> Though I now have a working PETSc (self-compiled) and Trilinos (candi), I 
>>> would still like to know what is going on with candi-PETSc (it could well 
>>> be a mistake on my side).
>>> Thanks in advance,
>>> Weixiong
>>>  ERROR: Uncaught exception in MPI_InitFinalize on proc 1. Skipping 
>>> MPI_Finalize() to avoid a deadlock.
>>> ----------------------------------------------------
>>> Exception on processing: 
>>> --------------------------------------------------------
>>> An error occurred in line <70> of file 
>>> </home/weixiong/apps/candi/unpack/deal.II-v8.2.1/source/lac/sparsity_tools.cc>
>>> in function
>>>     void dealii::SparsityTools::partition(const 
>>> dealii::SparsityPattern&, unsigned int, std::vector<unsigned int>&)
>>> The violated condition was: 
>>>     false
>>> The name and call sequence of the exception was:
>>>     ExcMETISNotInstalled()
>>> Additional Information: 
>>> (none)
>>> --------------------------------------------------------
>>> Aborting!
>>> ----------------------------------------------------
>>> ERROR: Uncaught exception in MPI_InitFinalize on proc 0. Skipping 
>>> MPI_Finalize() to avoid a deadlock.
>>> ----------------------------------------------------
>>> Exception on processing: 
>>> --------------------------------------------------------
>>> An error occurred in line <70> of file 
>>> </home/weixiong/apps/candi/unpack/deal.II-v8.2.1/source/lac/sparsity_tools.cc>
>>> in function
>>>     void dealii::SparsityTools::partition(const 
>>> dealii::SparsityPattern&, unsigned int, std::vector<unsigned int>&)
>>> The violated condition was: 
>>>     false
>>> The name and call sequence of the exception was:
>>>     ExcMETISNotInstalled()
>>> Additional Information: 
>>> (none)
>>> --------------------------------------------------------
>>> Aborting!
>>> ----------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun has exited due to process rank 1 with PID 13739 on
>>> node Berserker exiting improperly. There are two reasons this could 
>>> occur:
>>> 1. this process did not call "init" before exiting, but others in
>>> the job did. This can cause a job to hang indefinitely while it waits
>>> for all processes to call "init". By rule, if one process calls "init",
>>> then ALL processes must call "init" prior to termination.
>>> 2. this process called "init", but exited without calling "finalize".
>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>> exiting or it will be considered an "abnormal termination"
>>> This may have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).

