Hi Stefano,

It’s odd that the matmult works for several iterations before crashing. I will 
dig into the code and see if we do something to the matrix that disrupts its 
state along the way. The only thing I can think of is that BNLS attempts 
diagonal shifts when it detects ill-conditioning, and maybe that’s causing a bug 
with MFFD. The other Newton methods don’t do this because they don’t have a 
rigid descent direction requirement like the line search version does.
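
For context, a matrix-free finite-difference (MFFD) Hessian stores no entries; 
each product is approximated from two gradient evaluations around a base point, 
roughly H(x) v ~ (g(x + h v) - g(x)) / h. The Python sketch below is only a toy 
illustration of that idea under this assumption, not PETSc's implementation. 
The class name, method names, and test gradient are all hypothetical. It shows 
why a product cannot be formed if the base point has not been set:

```python
class MatrixFreeHessian:
    """Toy matrix-free finite-difference Hessian (hypothetical, not PETSc's API)."""

    def __init__(self, gradient, h=1e-6):
        self.gradient = gradient      # callable returning the gradient at a point
        self.h = h                    # finite-difference step size
        self.base = None              # base point x; must be set before mult()
        self.grad_at_base = None      # cached gradient g(x)

    def set_base(self, x):
        # Record the point at which the Hessian is approximated.
        self.base = list(x)
        self.grad_at_base = self.gradient(self.base)

    def mult(self, v):
        # H v ~ (g(x + h v) - g(x)) / h; meaningless without a base point.
        if self.base is None:
            raise RuntimeError("base point has not been set")
        shifted = [xi + self.h * vi for xi, vi in zip(self.base, v)]
        g_shifted = self.gradient(shifted)
        return [(gs - gb) / self.h
                for gs, gb in zip(g_shifted, self.grad_at_base)]


def gradient(x):
    # Gradient of f(x, y) = x**2 + 3*x*y + 2*y**2, whose Hessian is [[2, 3], [3, 4]].
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]


hessian = MatrixFreeHessian(gradient)
hessian.set_base([1.0, -1.0])
hv = hessian.mult([1.0, 0.0])  # close to the first Hessian column, [2, 3]
```

In this toy, anything that replaces or resets the operator between solver 
iterations without calling set_base() again would make the next mult() fail.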

Anyway, thanks for pointing this out. And yes, we should definitely add this to 
the tests.

Alp Dener
Argonne National Laboratory
Mathematics and Computer Science
https://alp.dener.me

On Oct 14, 2021, at 10:19 AM, Stefano Zampini <[email protected]> wrote:


Alp

I have found this problem, reproducible with release (see below, using 
src/tao/unconstrained/tutorials/minsurf1).
I was trying the MFFD Hessian with bntr.

Using the assembled Hessian is fine:

(tf-oneapi) [szampini@localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
iter =   1, Function value 1.41775, Residual: 0.000288154
  iter =   0,   Function value 1.41775,   Residual: 0.000288154
  iter =   1,   Function value 1.41775,   Residual: 0.000207788
  iter =   2,   Function value 1.41775,   Residual: 0.000123638
  iter =   3,   Function value 1.41775,   Residual: 0.000117563
iter =   2, Function value 1.41775, Residual: < 1.0e-6

Using the MFFD Hessian fails with an error (using bnls segfaults; should we 
check all Tao solvers that could be affected by this bug?):
(tf-oneapi) [szampini@localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4 -tao_mf_hessian

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: MatMFFDSetBase() has not been called, this is often caused by 
forgetting to call
MatAssemblyBegin/End on the first Mat in the SNES compute function
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.16.0, unknown
[0]PETSC ERROR: ./minsurf1 on a arch-conda-tf-oneapi-single-real named 
localhost.localdomain by szampini Thu Oct 14 18:11:23 2021
[0]PETSC ERROR: Configure options 
--with-blaslapack-dir=/home/szampini/Devel/miniforge/envs/tf-oneapi 
--download-thrust --LDFLAGS=-liomp5 --with-debugging=0 --with-openmp 
--with-precision=single --with-fc=0 --download-h2opus --download-slepc 
--download-slepc-commit=origin/release 
PETSC_ARCH=arch-conda-tf-oneapi-single-real 
PETSC_DIR=/home/szampini/Devel/miniforge/Devel/petsc
[0]PETSC ERROR: #1 MatMult_MFFD() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/mat/impls/mffd/mffd.c:333
[0]PETSC ERROR: #2 MatMult_Shell() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/mat/impls/shell/shell.c:1066
[0]PETSC ERROR: #3 MatMult() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/mat/interface/matrix.c:2439
[0]PETSC ERROR: #4 KSP_MatMult() at 
/home/szampini/Devel/miniforge/Devel/petsc/include/petsc/private/kspimpl.h:346
[0]PETSC ERROR: #5 KSPCGSolve_STCG() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/impls/cg/stcg/stcg.c:183
[0]PETSC ERROR: #6 KSPSolve_Private() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/interface/itfunc.c:914
[0]PETSC ERROR: #7 KSPSolve() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/interface/itfunc.c:1086
[0]PETSC ERROR: #8 TaoBNKComputeStep() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/tao/bound/impls/bnk/bnk.c:477
[0]PETSC ERROR: #9 TaoSolve_BNTR() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/tao/bound/impls/bnk/bntr.c:139
[0]PETSC ERROR: #10 TaoSolve() at 
/home/szampini/Devel/miniforge/Devel/petsc/src/tao/interface/taosolver.c:227
[0]PETSC ERROR: #11 main() at minsurf1.c:110
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -check_pointer_intensity 0
[0]PETSC ERROR: -mx 10
[0]PETSC ERROR: -my 8
[0]PETSC ERROR: -tao_bnk_max_cg_its 3
[0]PETSC ERROR: -tao_gatol 1.e-4
[0]PETSC ERROR: -tao_mf_hessian
[0]PETSC ERROR: -tao_smonitor
[0]PETSC ERROR: -tao_type bntr
[0]PETSC ERROR: ----------------End of Error Message -------send entire error 
message to [email protected]
Abort(73) on node 0 (rank 0 in comm 0): application called 
MPI_Abort(MPI_COMM_WORLD, 73) - process 0

Using the FD Hessian is fine:

(tf-oneapi) [szampini@localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4 -tao_fd_hessian

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
iter =   1, Function value 1.41775, Residual: 0.000342734
  iter =   0,   Function value 1.41775,   Residual: 0.000342734
  iter =   1,   Function value 1.41775,   Residual: 0.000275372
  iter =   2,   Function value 1.41775,   Residual: 0.000148513
  iter =   3,   Function value 1.41775,   Residual: 0.000150663
iter =   2, Function value 1.41775, Residual: < 1.0e-6



--
Stefano
