Re: [petsc-users] sources of floating point randomness in JFNK in serial

Barry Smith Fri, 05 May 2023 07:56:20 -0700

  Mark,

  Thank you.  You do have aggressive optimizations: -O3 -march=native, which 
means instructions may be performed out of order; thus, two runs may have 
a different order of operations and possibly different round-off values.
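
As a side illustration of the round-off point (not taken from Mark's code or from PETSc), the short Python sketch below shows that floating-point addition is not associative, so the same numbers added in a different order can round differently. The function name `ordered_sum` and the sample values are made up for the example. Whether any reordering actually occurs between two runs of the same serial binary is the open question in this thread; the sketch only shows what would follow if it did.

```python
# Illustration only: floating-point addition is not associative, so the
# order of operations can change the rounded result.

left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)        # False
print(abs(left_to_right - right_to_left))    # tiny, but not zero


def ordered_sum(values):
    """Accumulate strictly left to right, as a plain loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two orders. In the first order the 1.0 is absorbed
# by the large value before the cancellation happens.
big_first = ordered_sum([1.0e16, 1.0, -1.0e16])
cancel_first = ordered_sum([1.0e16, -1.0e16, 1.0])
print(big_first, cancel_first)               # 0.0 1.0
```

In an iterative method such as GMRES, a difference of this kind in one dot product or norm feeds into every later iteration, which would be consistent with histories that agree at first and then drift apart in the trailing digits.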


  You could try turning off all of this with -O0 for an experiment and see what 
happens. My guess is that you will see much smaller differences in the 
residuals. 
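
To judge whether the differences really are "much smaller" in such an experiment, it may help to put a number on them. The following is a minimal sketch, assuming the two runs' monitor output has been captured as text; the helper names `ksp_norms` and `relative_differences` are hypothetical, and the sample values are the first few KSP residual norms of the two runs quoted further down in this thread.

```python
import re

# Matches monitor lines such as "    2 KSP Residual norm 2.504664994246e+04",
# with or without leading email quote markers.
NORM_RE = re.compile(r"(\d+) KSP Residual norm (\S+)")


def ksp_norms(log_text):
    """Return the KSP residual norms found in log_text, in order."""
    return [float(m.group(2)) for m in NORM_RE.finditer(log_text)]


def relative_differences(log_a, log_b):
    """Relative difference of residual norms, iteration by iteration."""
    diffs = []
    for a, b in zip(ksp_norms(log_a), ksp_norms(log_b)):
        scale = max(abs(a), abs(b)) or 1.0   # avoid dividing by zero
        diffs.append(abs(a - b) / scale)
    return diffs


run_a = """
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.886124328003e+04
    2 KSP Residual norm 2.504664994246e+04
    3 KSP Residual norm 2.104615835161e+04
"""
run_b = """
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.886124328003e+04
    2 KSP Residual norm 2.504664994221e+04
    3 KSP Residual norm 2.104615835130e+04
"""

diffs = relative_differences(run_a, run_b)
first_changed = next(i for i, d in enumerate(diffs) if d > 0.0)
print("first differing iteration:", first_changed)
print("largest relative difference: %.2e" % max(diffs))
```

Because the monitor prints only 13 significant digits, two histories that print identically can still differ below that precision, so a reported difference of zero here is a statement about the printed values only.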

 Barry


> On May 5, 2023, at 8:11 AM, Mark Lohry wrote:
> 
> 
> 
> On Thu, May 4, 2023 at 9:51 PM Barry Smith wrote:
>> 
>>   Send configure.log
>> 
>> 
>>> On May 4, 2023, at 5:35 PM, Mark Lohry wrote:
>>> 
>>>> Sure, but why only once and why save to disk? Why not just use that 
>>>> computed approximate Jacobian at each Newton step to drive the Newton 
>>>> solves along for a bunch of time steps?
>>> 
>>> Ah, I get what you mean. Okay, I did three Newton steps with the same LHS, 
>>> with a few repeated manual tests. 3 out of 4 times I got exactly the same 
>>> history. Is it in the realm of possibility that a hardware error could 
>>> cause something this subtle, a bad memory bit or something?
>>> 
>>> 2 runs of 3 newton solves below, ever-so-slightly different.
>>> 
>>> 
>>>  0 SNES Function norm 3.424003312857e+04 
>>>     0 KSP Residual norm 3.424003312857e+04 
>>>     1 KSP Residual norm 2.886124328003e+04 
>>>     2 KSP Residual norm 2.504664994246e+04 
>>>     3 KSP Residual norm 2.104615835161e+04 
>>>     4 KSP Residual norm 1.938102896632e+04 
>>>     5 KSP Residual norm 1.793774642408e+04 
>>>     6 KSP Residual norm 1.671392566980e+04 
>>>     7 KSP Residual norm 1.501504103873e+04 
>>>     8 KSP Residual norm 1.366362900747e+04 
>>>     9 KSP Residual norm 1.240398500429e+04 
>>>    10 KSP Residual norm 1.156293733914e+04 
>>>    11 KSP Residual norm 1.066296477958e+04 
>>>    12 KSP Residual norm 9.835601966950e+03 
>>>    13 KSP Residual norm 9.017480191491e+03 
>>>    14 KSP Residual norm 8.415336139780e+03 
>>>    15 KSP Residual norm 7.807497808435e+03 
>>>    16 KSP Residual norm 7.341703768294e+03 
>>>    17 KSP Residual norm 6.979298049282e+03 
>>>    18 KSP Residual norm 6.521277772081e+03 
>>>    19 KSP Residual norm 6.174842408773e+03 
>>>    20 KSP Residual norm 5.889819665003e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 1.000525348433e+04 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>>   0 SNES Function norm 1.000525348433e+04 
>>>     0 KSP Residual norm 1.000525348433e+04 
>>>     1 KSP Residual norm 7.908741564765e+03 
>>>     2 KSP Residual norm 6.825263536686e+03 
>>>     3 KSP Residual norm 6.224930664968e+03 
>>>     4 KSP Residual norm 6.095547180532e+03 
>>>     5 KSP Residual norm 5.952968230430e+03 
>>>     6 KSP Residual norm 5.861251998116e+03 
>>>     7 KSP Residual norm 5.712439327755e+03 
>>>     8 KSP Residual norm 5.583056913266e+03 
>>>     9 KSP Residual norm 5.461768804626e+03 
>>>    10 KSP Residual norm 5.351937611098e+03 
>>>    11 KSP Residual norm 5.224288337578e+03 
>>>    12 KSP Residual norm 5.129863847081e+03 
>>>    13 KSP Residual norm 5.010818237218e+03 
>>>    14 KSP Residual norm 4.907162936199e+03 
>>>    15 KSP Residual norm 4.789564773955e+03 
>>>    16 KSP Residual norm 4.695173370720e+03 
>>>    17 KSP Residual norm 4.584070962171e+03 
>>>    18 KSP Residual norm 4.483061424742e+03 
>>>    19 KSP Residual norm 4.373384070745e+03 
>>>    20 KSP Residual norm 4.260704657592e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 4.662386014882e+03 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>>   0 SNES Function norm 4.662386014882e+03 
>>>     0 KSP Residual norm 4.662386014882e+03 
>>>     1 KSP Residual norm 4.408316259864e+03 
>>>     2 KSP Residual norm 4.184867769829e+03 
>>>     3 KSP Residual norm 4.079091244351e+03 
>>>     4 KSP Residual norm 4.009247390166e+03 
>>>     5 KSP Residual norm 3.928417371428e+03 
>>>     6 KSP Residual norm 3.865152075780e+03 
>>>     7 KSP Residual norm 3.795606446033e+03 
>>>     8 KSP Residual norm 3.735294554158e+03 
>>>     9 KSP Residual norm 3.674393726487e+03 
>>>    10 KSP Residual norm 3.617795166786e+03 
>>>    11 KSP Residual norm 3.563807982274e+03 
>>>    12 KSP Residual norm 3.512269444921e+03 
>>>    13 KSP Residual norm 3.455110223236e+03 
>>>    14 KSP Residual norm 3.407141247372e+03 
>>>    15 KSP Residual norm 3.356562415982e+03 
>>>    16 KSP Residual norm 3.312720047685e+03 
>>>    17 KSP Residual norm 3.263690150810e+03 
>>>    18 KSP Residual norm 3.219359862444e+03 
>>>    19 KSP Residual norm 3.173500955995e+03 
>>>    20 KSP Residual norm 3.127528790155e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 3.186752172556e+03 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>> 
>>> 
>>> 
>>>   0 SNES Function norm 3.424003312857e+04 
>>>     0 KSP Residual norm 3.424003312857e+04 
>>>     1 KSP Residual norm 2.886124328003e+04 
>>>     2 KSP Residual norm 2.504664994221e+04 
>>>     3 KSP Residual norm 2.104615835130e+04 
>>>     4 KSP Residual norm 1.938102896610e+04 
>>>     5 KSP Residual norm 1.793774642406e+04 
>>>     6 KSP Residual norm 1.671392566981e+04 
>>>     7 KSP Residual norm 1.501504103854e+04 
>>>     8 KSP Residual norm 1.366362900726e+04 
>>>     9 KSP Residual norm 1.240398500414e+04 
>>>    10 KSP Residual norm 1.156293733914e+04 
>>>    11 KSP Residual norm 1.066296477972e+04 
>>>    12 KSP Residual norm 9.835601967036e+03 
>>>    13 KSP Residual norm 9.017480191500e+03 
>>>    14 KSP Residual norm 8.415336139732e+03 
>>>    15 KSP Residual norm 7.807497808414e+03 
>>>    16 KSP Residual norm 7.341703768300e+03 
>>>    17 KSP Residual norm 6.979298049244e+03 
>>>    18 KSP Residual norm 6.521277772042e+03 
>>>    19 KSP Residual norm 6.174842408713e+03 
>>>    20 KSP Residual norm 5.889819664983e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 1.000525348435e+04 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>>   0 SNES Function norm 1.000525348435e+04 
>>>     0 KSP Residual norm 1.000525348435e+04 
>>>     1 KSP Residual norm 7.908741565645e+03 
>>>     2 KSP Residual norm 6.825263536988e+03 
>>>     3 KSP Residual norm 6.224930664967e+03 
>>>     4 KSP Residual norm 6.095547180474e+03 
>>>     5 KSP Residual norm 5.952968230397e+03 
>>>     6 KSP Residual norm 5.861251998127e+03 
>>>     7 KSP Residual norm 5.712439327726e+03 
>>>     8 KSP Residual norm 5.583056913167e+03 
>>>     9 KSP Residual norm 5.461768804526e+03 
>>>    10 KSP Residual norm 5.351937611030e+03 
>>>    11 KSP Residual norm 5.224288337536e+03 
>>>    12 KSP Residual norm 5.129863847028e+03 
>>>    13 KSP Residual norm 5.010818237161e+03 
>>>    14 KSP Residual norm 4.907162936143e+03 
>>>    15 KSP Residual norm 4.789564773923e+03 
>>>    16 KSP Residual norm 4.695173370709e+03 
>>>    17 KSP Residual norm 4.584070962145e+03 
>>>    18 KSP Residual norm 4.483061424714e+03 
>>>    19 KSP Residual norm 4.373384070713e+03 
>>>    20 KSP Residual norm 4.260704657576e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 4.662386014874e+03 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>>   0 SNES Function norm 4.662386014874e+03 
>>>     0 KSP Residual norm 4.662386014874e+03 
>>>     1 KSP Residual norm 4.408316259834e+03 
>>>     2 KSP Residual norm 4.184867769891e+03 
>>>     3 KSP Residual norm 4.079091244367e+03 
>>>     4 KSP Residual norm 4.009247390184e+03 
>>>     5 KSP Residual norm 3.928417371457e+03 
>>>     6 KSP Residual norm 3.865152075802e+03 
>>>     7 KSP Residual norm 3.795606446041e+03 
>>>     8 KSP Residual norm 3.735294554160e+03 
>>>     9 KSP Residual norm 3.674393726485e+03 
>>>    10 KSP Residual norm 3.617795166775e+03 
>>>    11 KSP Residual norm 3.563807982249e+03 
>>>    12 KSP Residual norm 3.512269444873e+03 
>>>    13 KSP Residual norm 3.455110223193e+03 
>>>    14 KSP Residual norm 3.407141247334e+03 
>>>    15 KSP Residual norm 3.356562415949e+03 
>>>    16 KSP Residual norm 3.312720047652e+03 
>>>    17 KSP Residual norm 3.263690150782e+03 
>>>    18 KSP Residual norm 3.219359862425e+03 
>>>    19 KSP Residual norm 3.173500955997e+03 
>>>    20 KSP Residual norm 3.127528790156e+03 
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix = precond matrix:
>>>   Mat Object: 1 MPI process
>>>     type: seqbaij
>>>     rows=16384, cols=16384, bs=16
>>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>>     total number of mallocs used during MatSetValues calls=0
>>>         block size is 16
>>>   1 SNES Function norm 3.186752172503e+03 
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=2
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI process
>>>       type: seqbaij
>>>       rows=16384, cols=16384, bs=16
>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>       total number of mallocs used during MatSetValues calls=0
>>>           block size is 16
>>> 
>>> On Thu, May 4, 2023 at 5:22 PM Matthew Knepley wrote:
>>>> On Thu, May 4, 2023 at 5:03 PM Mark Lohry wrote:
>>>>>> Do you get different results (in different runs) without  
>>>>>> -snes_mf_operator? So just using an explicit matrix?
>>>>> 
>>>>> Unfortunately I don't have an explicit matrix available for this, hence 
>>>>> the MFFD/JFNK.
>>>> 
>>>> I don't mean the actual matrix, I mean a representative matrix.
>>>>  
>>>>>> 
>>>>>>   (Note: I am not convinced there is even a problem and think it may be 
>>>>>> simply different order of floating point operations in different runs.)
>>>>> 
>>>>> I'm not convinced either, but running explicit RK for 10,000 iterations I 
>>>>> get exactly the same results every time, so I'm fairly confident it's not 
>>>>> the residual evaluation.
>>>>> How would there be a different order of floating point ops in different 
>>>>> runs in serial?
>>>>> 
>>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running 
>>>>>> that solver with a sparse matrix. This would give me confidence
>>>>>> that nothing in the solver is variable.
>>>>>> 
>>>>> I could do the sparse finite difference jacobian once, save it to disk, 
>>>>> and then use that system each time.
>>>> 
>>>> Yes. That would work.
>>>> 
>>>>   Thanks,
>>>> 
>>>>      Matt
>>>>  
>>>>> On Thu, May 4, 2023 at 4:57 PM Matthew Knepley wrote:
>>>>>> On Thu, May 4, 2023 at 4:44 PM Mark Lohry wrote:
>>>>>>>> Is your code valgrind clean?
>>>>>>> 
>>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not 
>>>>>>> using anything uninitialized. 
>>>>>>> 
>>>>>>>> 
>>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix 
>>>>>>>> and run. Do you see any variability?
>>>>>>> 
>>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and 
>>>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where 
>>>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still 
>>>>>>> with differences but sometimes identical.
>>>>>> 
>>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running 
>>>>>> that solver with a sparse matrix. This would give me confidence
>>>>>> that nothing in the solver is variable.
>>>>>> 
>>>>>>   Thanks,
>>>>>> 
>>>>>>      Matt
>>>>>>  
>>>>>>>   0 SNES Function norm 3.424003312857e+04 
>>>>>>>     0 KSP Residual norm 3.424003312857e+04 
>>>>>>>     1 KSP Residual norm 2.871734444536e+04 
>>>>>>>     2 KSP Residual norm 2.490276930242e+04 
>>>>>>>     3 KSP Residual norm 2.131675872968e+04 
>>>>>>>     4 KSP Residual norm 1.973129814235e+04 
>>>>>>>     5 KSP Residual norm 1.832377856317e+04 
>>>>>>>     6 KSP Residual norm 1.716783617436e+04 
>>>>>>>     7 KSP Residual norm 1.583963149542e+04 
>>>>>>>     8 KSP Residual norm 1.482272170304e+04 
>>>>>>>     9 KSP Residual norm 1.380312106742e+04 
>>>>>>>    10 KSP Residual norm 1.297793480658e+04 
>>>>>>>    11 KSP Residual norm 1.208599123244e+04 
>>>>>>>    12 KSP Residual norm 1.137345655227e+04 
>>>>>>>    13 KSP Residual norm 1.059676909366e+04 
>>>>>>>    14 KSP Residual norm 1.003823862398e+04 
>>>>>>>    15 KSP Residual norm 9.425879221354e+03 
>>>>>>>    16 KSP Residual norm 8.954805890038e+03 
>>>>>>>    17 KSP Residual norm 8.592372470456e+03 
>>>>>>>    18 KSP Residual norm 8.060707175821e+03 
>>>>>>>    19 KSP Residual norm 7.782057728723e+03 
>>>>>>>    20 KSP Residual norm 7.449686095424e+03 
>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>> KSP Object: 1 MPI process
>>>>>>>   type: gmres
>>>>>>>     restart=30, using Classical (unmodified) Gram-Schmidt 
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>     happy breakdown tolerance 1e-30
>>>>>>>   maximum iterations=20, initial guess is zero
>>>>>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>   left preconditioning
>>>>>>>   using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>>   type: none
>>>>>>>   linear system matrix followed by preconditioner matrix:
>>>>>>>   Mat Object: 1 MPI process
>>>>>>>     type: mffd
>>>>>>>     rows=16384, cols=16384
>>>>>>>       Matrix-free approximation:
>>>>>>>         err=1.49012e-08 (relative error in function evaluation)
>>>>>>>         Using wp compute h routine
>>>>>>>             Does not compute normU
>>>>>>>   Mat Object: 1 MPI process
>>>>>>>     type: seqaij
>>>>>>>     rows=16384, cols=16384
>>>>>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>>>>>     total number of mallocs used during MatSetValues calls=0
>>>>>>>       not using I-node routines
>>>>>>>   1 SNES Function norm 1.085015646971e+04 
>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>> SNES Object: 1 MPI process
>>>>>>>   type: newtonls
>>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>   total number of linear solver iterations=20
>>>>>>>   total number of function evaluations=23
>>>>>>>   norm schedule ALWAYS
>>>>>>>   Jacobian is never rebuilt
>>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>>   Preconditioning Jacobian is built using finite differences with 
>>>>>>> coloring
>>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>>     type: basic
>>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>>>>>> lambda=1.000000e-08
>>>>>>>     maximum iterations=40
>>>>>>>   KSP Object: 1 MPI process
>>>>>>>     type: gmres
>>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>       happy breakdown tolerance 1e-30
>>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>     left preconditioning
>>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>>   PC Object: 1 MPI process
>>>>>>>     type: none
>>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: mffd
>>>>>>>       rows=16384, cols=16384
>>>>>>>         Matrix-free approximation:
>>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>>           Using wp compute h routine
>>>>>>>               Does not compute normU
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: seqaij
>>>>>>>       rows=16384, cols=16384
>>>>>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>>         not using I-node routines
>>>>>>> 
>>>>>>>   0 SNES Function norm 3.424003312857e+04 
>>>>>>>     0 KSP Residual norm 3.424003312857e+04 
>>>>>>>     1 KSP Residual norm 2.871734444536e+04 
>>>>>>>     2 KSP Residual norm 2.490276931041e+04 
>>>>>>>     3 KSP Residual norm 2.131675873776e+04 
>>>>>>>     4 KSP Residual norm 1.973129814908e+04 
>>>>>>>     5 KSP Residual norm 1.832377852186e+04 
>>>>>>>     6 KSP Residual norm 1.716783608174e+04 
>>>>>>>     7 KSP Residual norm 1.583963128956e+04 
>>>>>>>     8 KSP Residual norm 1.482272160069e+04 
>>>>>>>     9 KSP Residual norm 1.380312087005e+04 
>>>>>>>    10 KSP Residual norm 1.297793458796e+04 
>>>>>>>    11 KSP Residual norm 1.208599115602e+04 
>>>>>>>    12 KSP Residual norm 1.137345657533e+04 
>>>>>>>    13 KSP Residual norm 1.059676906197e+04 
>>>>>>>    14 KSP Residual norm 1.003823857515e+04 
>>>>>>>    15 KSP Residual norm 9.425879177747e+03 
>>>>>>>    16 KSP Residual norm 8.954805850825e+03 
>>>>>>>    17 KSP Residual norm 8.592372413320e+03 
>>>>>>>    18 KSP Residual norm 8.060706994110e+03 
>>>>>>>    19 KSP Residual norm 7.782057560782e+03 
>>>>>>>    20 KSP Residual norm 7.449686034356e+03 
>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>> KSP Object: 1 MPI process
>>>>>>>   type: gmres
>>>>>>>     restart=30, using Classical (unmodified) Gram-Schmidt 
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>     happy breakdown tolerance 1e-30
>>>>>>>   maximum iterations=20, initial guess is zero
>>>>>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>   left preconditioning
>>>>>>>   using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>>   type: none
>>>>>>>   linear system matrix followed by preconditioner matrix:
>>>>>>>   Mat Object: 1 MPI process
>>>>>>>     type: mffd
>>>>>>>     rows=16384, cols=16384
>>>>>>>       Matrix-free approximation:
>>>>>>>         err=1.49012e-08 (relative error in function evaluation)
>>>>>>>         Using wp compute h routine
>>>>>>>             Does not compute normU
>>>>>>>   Mat Object: 1 MPI process
>>>>>>>     type: seqaij
>>>>>>>     rows=16384, cols=16384
>>>>>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>>>>>     total number of mallocs used during MatSetValues calls=0
>>>>>>>       not using I-node routines
>>>>>>>   1 SNES Function norm 1.085015821006e+04 
>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>> SNES Object: 1 MPI process
>>>>>>>   type: newtonls
>>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>   total number of linear solver iterations=20
>>>>>>>   total number of function evaluations=23
>>>>>>>   norm schedule ALWAYS
>>>>>>>   Jacobian is never rebuilt
>>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>>   Preconditioning Jacobian is built using finite differences with 
>>>>>>> coloring
>>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>>     type: basic
>>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>>>>>> lambda=1.000000e-08
>>>>>>>     maximum iterations=40
>>>>>>>   KSP Object: 1 MPI process
>>>>>>>     type: gmres
>>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>       happy breakdown tolerance 1e-30
>>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>     left preconditioning
>>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>>   PC Object: 1 MPI process
>>>>>>>     type: none
>>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: mffd
>>>>>>>       rows=16384, cols=16384
>>>>>>>         Matrix-free approximation:
>>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>>           Using wp compute h routine
>>>>>>>               Does not compute normU
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: seqaij
>>>>>>>       rows=16384, cols=16384
>>>>>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>>         not using I-node routines
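For context on the mffd lines in the view output above: a matrix-free Jacobian product is a finite difference of the residual function, so any last-bit variation in a residual evaluation is divided by the small step h. The following is a minimal Python sketch, not PETSc's implementation. The residual F, the helper jv_fd, and the test vectors are hypothetical, and the step formula is an assumption modeled on the WP form h = err * sqrt(1 + ||u||) / ||v||. The value err = sqrt(machine epsilon) matches the 1.49012e-08 printed above.

```python
import math
import sys

def F(u):
    # Hypothetical nonlinear residual, for illustration only.
    return [x * x - 2.0 for x in u]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def jv_fd(F, u, v, err):
    # Finite-difference approximation of J(u) v: (F(u + h v) - F(u)) / h.
    # Step size modeled on the WP form (assumption): h = err * sqrt(1 + ||u||) / ||v||.
    h = err * math.sqrt(1.0 + norm(u)) / norm(v)
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(Fp, Fu)], h

err = math.sqrt(sys.float_info.epsilon)  # about 1.49012e-08
u = [1.0, 2.0, 3.0]
v = [1.0, 0.0, 0.0]
jv, h = jv_fd(F, u, v, err)
exact = [2.0 * u[0], 0.0, 0.0]  # analytic J(u) v for this F

# A perturbation of size delta in F(u + h v) changes J v by delta / h.
delta = 1e-15
amplified = delta / h
print(f"err={err:.5e} h={h:.3e} jv[0]={jv[0]:.8f} amplified={amplified:.3e}")
```

With h of order 1e-08, a difference in the last bits of one residual evaluation shows up about eight orders of magnitude larger in the Jacobian-vector product, which is one reason small variations can become visible in the KSP history.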
>>>>>>> 
>>>>>>> On Thu, May 4, 2023 at 10:10 AM Matthew Knepley <[email protected] 
>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>> On Thu, May 4, 2023 at 8:54 AM Mark Lohry <[email protected] 
>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>> Try -pc_type none.
>>>>>>>>> 
>>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But 
>>>>>>>>> *sometimes* it produces exactly the same history and other times the 
>>>>>>>>> history gradually diverges. I'm reasonably confident my residual 
>>>>>>>>> evaluation has no randomness; see the info after the PETSc output.
>>>>>>>> 
>>>>>>>> We can try to test this. Replace your MatMFFD with an actual matrix 
>>>>>>>> and run. Do you see any variability?
>>>>>>>> 
>>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So run 
>>>>>>>> a few with -snes_view, and we can see if the "w" parameter changes.
>>>>>>>> 
>>>>>>>>   Thanks,
>>>>>>>> 
>>>>>>>>      Matt
>>>>>>>>  
>>>>>>>>> solve history 1:
>>>>>>>>> 
>>>>>>>>>   0 SNES Function norm 3.424003312857e+04 
>>>>>>>>>     0 KSP Residual norm 3.424003312857e+04 
>>>>>>>>>     1 KSP Residual norm 2.871734444536e+04 
>>>>>>>>>     2 KSP Residual norm 2.490276931041e+04 
>>>>>>>>> ...
>>>>>>>>>    20 KSP Residual norm 7.449686034356e+03 
>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>   1 SNES Function norm 1.085015821006e+04 
>>>>>>>>> 
>>>>>>>>> solve history 2, identical to 1:
>>>>>>>>> 
>>>>>>>>>   0 SNES Function norm 3.424003312857e+04 
>>>>>>>>>     0 KSP Residual norm 3.424003312857e+04 
>>>>>>>>>     1 KSP Residual norm 2.871734444536e+04 
>>>>>>>>>     2 KSP Residual norm 2.490276931041e+04 
>>>>>>>>> ...
>>>>>>>>>    20 KSP Residual norm 7.449686034356e+03 
>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>   1 SNES Function norm 1.085015821006e+04 
>>>>>>>>> 
>>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, 
>>>>>>>>> growing difference to the end:
>>>>>>>>>   0 SNES Function norm 3.424003312857e+04 
>>>>>>>>>     0 KSP Residual norm 3.424003312857e+04 
>>>>>>>>>     1 KSP Residual norm 2.871734444536e+04 
>>>>>>>>>     2 KSP Residual norm 2.490276930242e+04 
>>>>>>>>> ...
>>>>>>>>>    20 KSP Residual norm 7.449686095424e+03 
>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>   1 SNES Function norm 1.085015646971e+04 
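One way to locate where two monitor histories like the ones above first differ is to parse the norms and compare them entry by entry. A minimal Python sketch follows. The function names are hypothetical, the inline log snippets are copied from histories 1 and 3 above, and the script assumes the -snes_monitor/-ksp_monitor line format shown in this thread.

```python
import re

# Matches lines such as "    2 KSP Residual norm 2.490276931041e+04".
LINE = re.compile(r"^\s*(\d+) (SNES Function|KSP Residual) norm (\S+)")

def parse_history(text):
    """Return a list of (kind, iteration, norm) tuples from monitor output."""
    out = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            kind = "SNES" if m.group(2).startswith("SNES") else "KSP"
            out.append((kind, int(m.group(1)), float(m.group(3))))
    return out

def first_difference(a, b):
    """Return (kind, iteration, relative difference) for the first mismatch."""
    for (ka, ia, na), (kb, ib, nb) in zip(a, b):
        if (ka, ia) != (kb, ib):
            raise ValueError("histories are not aligned")
        if na != nb:
            return ka, ia, abs(na - nb) / max(abs(na), abs(nb))
    return None  # no difference in the printed digits

run1 = """
  0 SNES Function norm 3.424003312857e+04
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.871734444536e+04
    2 KSP Residual norm 2.490276931041e+04
"""
run3 = """
  0 SNES Function norm 3.424003312857e+04
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.871734444536e+04
    2 KSP Residual norm 2.490276930242e+04
"""
diff = first_difference(parse_history(run1), parse_history(run3))
print(diff)
```

The monitors print 13 significant digits, so two runs can already differ in lower-order digits before any difference shows up in the text being compared.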
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This is using a standard explicit 3-stage Runge-Kutta smoother for 10 
>>>>>>>>> iterations, so 30 calls of the same residual evaluation, with 
>>>>>>>>> identical residuals every time.
>>>>>>>>> 
>>>>>>>>> run 1:
>>>>>>>>> 
>>>>>>>>> # iteration    rho                 rhou                rhov                rhoE                 abs_res       rel_res       umin          vmax          vmin          elapsed_time
>>>>>>>>> #
>>>>>>>>>   1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02   2.04353e+02   1.00000e+00  -8.23945e-15  -6.15326e-15  -1.35563e-14   6.34834e-01
>>>>>>>>>   2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02   2.58647e+02   1.26568e+00  -1.02539e-14  -9.35368e-15  -1.69925e-14   6.40063e-01
>>>>>>>>>   3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02   2.36201e+02   1.15585e+00  -1.19370e-14  -1.15216e-14  -1.59733e-14   6.45166e-01
>>>>>>>>>   4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02   2.03612e+02   9.96372e-01  -1.81831e-14  -1.28312e-14  -1.46238e-14   6.50494e-01
>>>>>>>>>   5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02   1.77577e+02   8.68970e-01   3.56633e-14  -8.74089e-15  -1.06666e-14   6.55656e-01
>>>>>>>>>   6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02   1.57913e+02   7.72745e-01  -8.98634e-14  -2.41152e-14  -1.39713e-14   6.60872e-01
>>>>>>>>>   7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02   1.42872e+02   6.99144e-01  -2.95352e-13  -2.48158e-14  -2.39351e-14   6.66041e-01
>>>>>>>>>   8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02   1.30728e+02   6.39715e-01   8.98941e-13   1.09674e-13   3.78905e-14   6.71316e-01
>>>>>>>>>   9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02   1.20444e+02   5.89394e-01   1.70717e-12   1.38762e-14   1.09825e-13   6.76447e-01
>>>>>>>>>   1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02   1.11475e+02   5.45501e-01  -4.22462e-12  -7.15206e-13  -2.28736e-13   6.81716e-01
>>>>>>>>> 
>>>>>>>>> run N:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> #
>>>>>>>>> # iteration    rho                 rhou                rhov                rhoE                 abs_res       rel_res       umin          vmax          vmin          elapsed_time
>>>>>>>>> #
>>>>>>>>>   1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02   2.04353e+02   1.00000e+00  -8.23945e-15  -6.15326e-15  -1.35563e-14   6.23316e-01
>>>>>>>>>   2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02   2.58647e+02   1.26568e+00  -1.02539e-14  -9.35368e-15  -1.69925e-14   6.28510e-01
>>>>>>>>>   3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02   2.36201e+02   1.15585e+00  -1.19370e-14  -1.15216e-14  -1.59733e-14   6.33558e-01
>>>>>>>>>   4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02   2.03612e+02   9.96372e-01  -1.81831e-14  -1.28312e-14  -1.46238e-14   6.38773e-01
>>>>>>>>>   5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02   1.77577e+02   8.68970e-01   3.56633e-14  -8.74089e-15  -1.06666e-14   6.43887e-01
>>>>>>>>>   6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02   1.57913e+02   7.72745e-01  -8.98634e-14  -2.41152e-14  -1.39713e-14   6.49073e-01
>>>>>>>>>   7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02   1.42872e+02   6.99144e-01  -2.95352e-13  -2.48158e-14  -2.39351e-14   6.54167e-01
>>>>>>>>>   8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02   1.30728e+02   6.39715e-01   8.98941e-13   1.09674e-13   3.78905e-14   6.59394e-01
>>>>>>>>>   9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02   1.20444e+02   5.89394e-01   1.70717e-12   1.38762e-14   1.09825e-13   6.64516e-01
>>>>>>>>>   1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02   1.11475e+02   5.45501e-01  -4.22462e-12  -7.15206e-13  -2.28736e-13   6.69677e-01
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Thu, May 4, 2023 at 8:41 AM Mark Adams <[email protected] 
>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more procs 
>>>>>>>>>> unless you use jacobi. (maybe I am missing something).
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 4, 2023 at 8:31 AM Mark Lohry <[email protected] 
>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>  Please send the output of -snes_view. 
>>>>>>>>>>> pasted below. anything stand out?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> SNES Object: 1 MPI process
>>>>>>>>>>>   type: newtonls
>>>>>>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>>>>>   total number of linear solver iterations=20
>>>>>>>>>>>   total number of function evaluations=22
>>>>>>>>>>>   norm schedule ALWAYS
>>>>>>>>>>>   Jacobian is never rebuilt
>>>>>>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>>>>>>   Preconditioning Jacobian is built using finite differences with 
>>>>>>>>>>> coloring
>>>>>>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>>>>>>     type: basic
>>>>>>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>>>>>>>>>> lambda=1.000000e-08
>>>>>>>>>>>     maximum iterations=40
>>>>>>>>>>>   KSP Object: 1 MPI process
>>>>>>>>>>>     type: gmres
>>>>>>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt 
>>>>>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>>>>>       happy breakdown tolerance 1e-30
>>>>>>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>>>>>     left preconditioning
>>>>>>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>>>>>>   PC Object: 1 MPI process
>>>>>>>>>>>     type: asm
>>>>>>>>>>>       total subdomain blocks = 1, amount of overlap = 0
>>>>>>>>>>>       restriction/interpolation type - RESTRICT
>>>>>>>>>>>       Local solver information for first block is in the following 
>>>>>>>>>>> KSP and PC objects on rank 0:
>>>>>>>>>>>       Use -ksp_view ::ascii_info_detail to display information for 
>>>>>>>>>>> all blocks
>>>>>>>>>>>     KSP Object: (sub_) 1 MPI process
>>>>>>>>>>>       type: preonly
>>>>>>>>>>>       maximum iterations=10000, initial guess is zero
>>>>>>>>>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>>>>>>>       left preconditioning
>>>>>>>>>>>       using NONE norm type for convergence test
>>>>>>>>>>>     PC Object: (sub_) 1 MPI process
>>>>>>>>>>>       type: ilu
>>>>>>>>>>>         out-of-place factorization
>>>>>>>>>>>         0 levels of fill
>>>>>>>>>>>         tolerance for zero pivot 2.22045e-14
>>>>>>>>>>>         matrix ordering: natural
>>>>>>>>>>>         factor fill ratio given 1., needed 1.
>>>>>>>>>>>           Factored matrix follows:
>>>>>>>>>>>             Mat Object: (sub_) 1 MPI process
>>>>>>>>>>>               type: seqbaij
>>>>>>>>>>>               rows=16384, cols=16384, bs=16
>>>>>>>>>>>               package used to perform factorization: petsc
>>>>>>>>>>>               total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>>                   block size is 16
>>>>>>>>>>>       linear system matrix = precond matrix:
>>>>>>>>>>>       Mat Object: (sub_) 1 MPI process
>>>>>>>>>>>         type: seqbaij
>>>>>>>>>>>         rows=16384, cols=16384, bs=16
>>>>>>>>>>>         total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>>         total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>>             block size is 16
>>>>>>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>>>>>>     Mat Object: 1 MPI process
>>>>>>>>>>>       type: mffd
>>>>>>>>>>>       rows=16384, cols=16384
>>>>>>>>>>>         Matrix-free approximation:
>>>>>>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>>>>>>           Using wp compute h routine
>>>>>>>>>>>               Does not compute normU
>>>>>>>>>>>     Mat Object: 1 MPI process
>>>>>>>>>>>       type: seqbaij
>>>>>>>>>>>       rows=16384, cols=16384, bs=16
>>>>>>>>>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>>           block size is 16
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, May 4, 2023 at 8:30 AM Mark Adams <[email protected] 
>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>> If you are using MG what is the coarse grid solver?
>>>>>>>>>>>> -snes_view might give you that.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, May 4, 2023 at 8:25 AM Matthew Knepley <[email protected] 
>>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21 AM Mark Lohry <[email protected] 
>>>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further 
>>>>>>>>>>>>>>> apart?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, this. I take it this sounds familiar?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the 
>>>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is 
>>>>>>>>>>>>>> identical to 5 digits), but in the context I'm using it in 
>>>>>>>>>>>>>> (repeated applications to solve a steady state multigrid 
>>>>>>>>>>>>>> problem, though here just one level) the differences add up such 
>>>>>>>>>>>>>> that I might reach global convergence in 35 iterations or 38. 
>>>>>>>>>>>>>> It's not the end of the world, but I was expecting that with -np 
>>>>>>>>>>>>>> 1 these would be identical and I'm not sure where the root cause 
>>>>>>>>>>>>>> would be.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The initial KSP residual is different, so it's the PC. Please send 
>>>>>>>>>>>>> the output of -snes_view. If your ASM is using direct 
>>>>>>>>>>>>> factorization, then it could be randomness in whatever LU you 
>>>>>>>>>>>>> are using.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   Thanks,
>>>>>>>>>>>>> 
>>>>>>>>>>>>>     Matt
>>>>>>>>>>>>>  
>>>>>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04 
>>>>>>>>>>>>>>     0 KSP Residual norm 4.045639499595e+01 
>>>>>>>>>>>>>>     1 KSP Residual norm 1.917999809040e+01 
>>>>>>>>>>>>>>     2 KSP Residual norm 1.616048521958e+01 
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>    19 KSP Residual norm 8.788043518111e-01 
>>>>>>>>>>>>>>    20 KSP Residual norm 6.570851270214e-01 
>>>>>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>>   1 SNES Function norm 1.801309983345e+03 
>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly 
>>>>>>>>>>>>>> different
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04 
>>>>>>>>>>>>>>     0 KSP Residual norm 4.045639473002e+01 
>>>>>>>>>>>>>>     1 KSP Residual norm 1.917999883034e+01 
>>>>>>>>>>>>>>     2 KSP Residual norm 1.616048572016e+01 
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>    19 KSP Residual norm 8.788046348957e-01 
>>>>>>>>>>>>>>    20 KSP Residual norm 6.570859588610e-01 
>>>>>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>>   1 SNES Function norm 1.801311320322e+03 
>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05 PM Barry Smith <[email protected] 
>>>>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>   Do they start very similarly and then slowly drift further 
>>>>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are 
>>>>>>>>>>>>>>> almost identical but then for each iteration get a bit further. 
>>>>>>>>>>>>>>> Similar for the SNES iterations, starting close and then for 
>>>>>>>>>>>>>>> more iterations and more solves they start moving apart. Or do 
>>>>>>>>>>>>>>> they suddenly jump to be very different? You can run with 
>>>>>>>>>>>>>>> -snes_monitor -ksp_monitor 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry <[email protected] 
>>>>>>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring, 
>>>>>>>>>>>>>>>> was just guessing there. But the solutions/residuals are 
>>>>>>>>>>>>>>>> slightly different from run to run.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should 
>>>>>>>>>>>>>>>> expect bitwise identical results?
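On the question of bitwise-identical results: floating-point addition is not associative, so if anything changes the order of operations between two runs, the last bits of a sum or dot product can change, and an iterative method then carries that difference forward. A small self-contained Python illustration follows. It is not tied to PETSc; the values and the helper naive_sum are arbitrary choices for the example.

```python
import math

def naive_sum(xs):
    # Plain left-to-right accumulation in double precision.
    total = 0.0
    for x in xs:
        total += x
    return total

a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6
print(left == right)  # False

# The same four values accumulated in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
forward = naive_sum(values)
reordered = naive_sum(sorted(values))
exact = math.fsum(values)  # correctly rounded sum, independent of order
print(forward, reordered, exact)
```

The two accumulation orders give different answers even though the inputs are identical, which is the kind of effect that makes residual histories start out equal and then drift apart.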
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith <[email protected] 
>>>>>>>>>>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>   No, the coloring should be identical every time. Do you see 
>>>>>>>>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?).
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry <[email protected] 
>>>>>>>>>>>>>>>>> > <mailto:[email protected]>> wrote:
>>>>>>>>>>>>>>>>> > 
>>>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an 
>>>>>>>>>>>>>>>>> > MFFD/JFNK nonlinear solver where I give it the sparsity. PC 
>>>>>>>>>>>>>>>>> > asm, KSP gmres, with SNESSetLagJacobian -2 (compute once 
>>>>>>>>>>>>>>>>> > and then frozen jacobian).
>>>>>>>>>>>>>>>>> > 
>>>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in 
>>>>>>>>>>>>>>>>> > residuals from run to run. I'm wondering where randomness 
>>>>>>>>>>>>>>>>> > might enter here -- does the jacobian coloring use a random 
>>>>>>>>>>>>>>>>> > seed?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -- 
>>>>>>>>>>>>> What most experimenters take for granted before they begin their 
>>>>>>>>>>>>> experiments is infinitely more interesting than any results to 
>>>>>>>>>>>>> which their experiments lead.
>>>>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>>>> 
>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ 
>>>>>>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>> 
> <configure.log>
