Re: [petsc-users] sources of floating point randomness in JFNK in serial

Mark Lohry Fri, 05 May 2023 13:46:01 -0700

Wow. Keeping -O3 but turning off -march=native seems to have made it
repeatable. This is on an AVX2 CPU, if it matters.
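
To check repeatability without eyeballing the logs, the residual histories from two runs can be compared numerically. The sketch below is only an illustration and is not part of PETSc: the helper names `ksp_norms` and `relative_differences` are hypothetical, and it assumes the "KSP Residual norm" line format seen in the output quoted later in this thread. The sample lines are the first three KSP iterations of the two runs quoted below.

```python
import re

# Matches monitor lines such as "    2 KSP Residual norm 2.504664994246e+04".
NORM_LINE = re.compile(r"KSP Residual norm\s+([0-9.eE+-]+)")


def ksp_norms(log_text):
    """Return every KSP residual norm found in a log, in order of appearance."""
    return [float(m.group(1)) for m in NORM_LINE.finditer(log_text)]


def relative_differences(run_a, run_b):
    """Per-iteration relative difference between two residual histories."""
    diffs = []
    for a, b in zip(run_a, run_b):
        scale = max(abs(a), abs(b)) or 1.0  # avoid dividing by zero
        diffs.append(abs(a - b) / scale)
    return diffs


# Excerpts copied from the two runs quoted in this thread.
RUN_A = """
    1 KSP Residual norm 2.886124328003e+04
    2 KSP Residual norm 2.504664994246e+04
    3 KSP Residual norm 2.104615835161e+04
"""
RUN_B = """
    1 KSP Residual norm 2.886124328003e+04
    2 KSP Residual norm 2.504664994221e+04
    3 KSP Residual norm 2.104615835130e+04
"""

diffs = relative_differences(ksp_norms(RUN_A), ksp_norms(RUN_B))
for i, d in enumerate(diffs, start=1):
    print(f"KSP iteration {i}: relative difference {d:.3e}")
```

A relative difference of exactly zero at every iteration means the two histories agree to the printed precision; small nonzero values that appear after the first iteration are the pattern described in this thread.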


> out-of-order instructions may be performed thus, two runs may have
> different order of operations

This is terrifying if true. The source code path is exactly the same every
time, but the CPU does different things?
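
For what it's worth, floating-point addition is not associative, so any change in how a sum is grouped can change the last bits even though the source code is identical. Out-of-order execution by itself should not alter results, since the CPU still produces the defined value for each instruction. One plausible mechanism with vectorized builds (an assumption, not something established in this thread) is that a vectorized reduction handles a few leading elements in scalar code until the data is aligned, and the number of those elements depends on where the allocator placed the array in that run. The Python sketch below imitates this with a hypothetical `lane_sum` helper; `lanes` and `peel` are illustrative parameters, and the input values are contrived to make the effect large.

```python
# Grouping matters: the same three numbers, added in two different orders.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)


def lane_sum(values, lanes=4, peel=0):
    """Imitate a vectorized reduction (hypothetical model, not real SIMD).

    The first `peel` elements are summed one by one, as a compiler might do
    until the data is aligned. The remaining elements are accumulated into
    `lanes` independent partial sums, which are combined at the end.
    """
    head = 0.0
    for v in values[:peel]:
        head += v

    partial = [0.0] * lanes
    for i, v in enumerate(values[peel:]):
        partial[i % lanes] += v

    total = head
    for p in partial:
        total += p
    return total


# Contrived data: the exact sum is 2.0, but 1.0 is far below the rounding
# granularity of 1e17, so the grouping decides whether the ones survive.
VALUES = [1e17, 1.0, -1e17, 1.0]

print(lane_sum(VALUES, lanes=2, peel=0))
print(lane_sum(VALUES, lanes=2, peel=1))
```

In a real solver the effect is far smaller than in this contrived case, typically showing up in the trailing digits of dot products and norms and then growing over the Krylov iterations.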

On Fri, May 5, 2023 at 10:55 AM Barry Smith <[email protected]> wrote:

>
>   Mark,
>
>   Thank you.  You do have aggressive optimizations: -O3 -march=native,
> which means out-of-order instructions may be performed thus, two runs may
> have different order of operations and possibly different round-off values.
>
>   You could try turning off all of this with -O0 for an experiment and see
> what happens. My guess is that you will see much smaller differences in the
> residuals.
>
>  Barry
>
>
> On May 5, 2023, at 8:11 AM, Mark Lohry <[email protected]> wrote:
>
>
>
> On Thu, May 4, 2023 at 9:51 PM Barry Smith <[email protected]> wrote:
>
>>
>>   Send configure.log
>>
>>
>> On May 4, 2023, at 5:35 PM, Mark Lohry <[email protected]> wrote:
>>
>>> Sure, but why only once and why save to disk? Why not just use that
>>> computed approximate Jacobian at each Newton step to drive the Newton
>>> solves along for a bunch of time steps?
>>
>>
>> Ah, I get what you mean. Okay, I did three Newton steps with the same LHS,
>> with a few repeated manual tests. 3 out of 4 times I got exactly the same
>> history. Is it in the realm of possibility that a hardware error could
>> cause something this subtle, a bad memory bit or something?
>>
>> 2 runs of 3 newton solves below, ever-so-slightly different.
>>
>>
>>  0 SNES Function norm 3.424003312857e+04
>>     0 KSP Residual norm 3.424003312857e+04
>>     1 KSP Residual norm 2.886124328003e+04
>>     2 KSP Residual norm 2.504664994246e+04
>>     3 KSP Residual norm 2.104615835161e+04
>>     4 KSP Residual norm 1.938102896632e+04
>>     5 KSP Residual norm 1.793774642408e+04
>>     6 KSP Residual norm 1.671392566980e+04
>>     7 KSP Residual norm 1.501504103873e+04
>>     8 KSP Residual norm 1.366362900747e+04
>>     9 KSP Residual norm 1.240398500429e+04
>>    10 KSP Residual norm 1.156293733914e+04
>>    11 KSP Residual norm 1.066296477958e+04
>>    12 KSP Residual norm 9.835601966950e+03
>>    13 KSP Residual norm 9.017480191491e+03
>>    14 KSP Residual norm 8.415336139780e+03
>>    15 KSP Residual norm 7.807497808435e+03
>>    16 KSP Residual norm 7.341703768294e+03
>>    17 KSP Residual norm 6.979298049282e+03
>>    18 KSP Residual norm 6.521277772081e+03
>>    19 KSP Residual norm 6.174842408773e+03
>>    20 KSP Residual norm 5.889819665003e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 1.000525348433e+04
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>   0 SNES Function norm 1.000525348433e+04
>>     0 KSP Residual norm 1.000525348433e+04
>>     1 KSP Residual norm 7.908741564765e+03
>>     2 KSP Residual norm 6.825263536686e+03
>>     3 KSP Residual norm 6.224930664968e+03
>>     4 KSP Residual norm 6.095547180532e+03
>>     5 KSP Residual norm 5.952968230430e+03
>>     6 KSP Residual norm 5.861251998116e+03
>>     7 KSP Residual norm 5.712439327755e+03
>>     8 KSP Residual norm 5.583056913266e+03
>>     9 KSP Residual norm 5.461768804626e+03
>>    10 KSP Residual norm 5.351937611098e+03
>>    11 KSP Residual norm 5.224288337578e+03
>>    12 KSP Residual norm 5.129863847081e+03
>>    13 KSP Residual norm 5.010818237218e+03
>>    14 KSP Residual norm 4.907162936199e+03
>>    15 KSP Residual norm 4.789564773955e+03
>>    16 KSP Residual norm 4.695173370720e+03
>>    17 KSP Residual norm 4.584070962171e+03
>>    18 KSP Residual norm 4.483061424742e+03
>>    19 KSP Residual norm 4.373384070745e+03
>>    20 KSP Residual norm 4.260704657592e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 4.662386014882e+03
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>   0 SNES Function norm 4.662386014882e+03
>>     0 KSP Residual norm 4.662386014882e+03
>>     1 KSP Residual norm 4.408316259864e+03
>>     2 KSP Residual norm 4.184867769829e+03
>>     3 KSP Residual norm 4.079091244351e+03
>>     4 KSP Residual norm 4.009247390166e+03
>>     5 KSP Residual norm 3.928417371428e+03
>>     6 KSP Residual norm 3.865152075780e+03
>>     7 KSP Residual norm 3.795606446033e+03
>>     8 KSP Residual norm 3.735294554158e+03
>>     9 KSP Residual norm 3.674393726487e+03
>>    10 KSP Residual norm 3.617795166786e+03
>>    11 KSP Residual norm 3.563807982274e+03
>>    12 KSP Residual norm 3.512269444921e+03
>>    13 KSP Residual norm 3.455110223236e+03
>>    14 KSP Residual norm 3.407141247372e+03
>>    15 KSP Residual norm 3.356562415982e+03
>>    16 KSP Residual norm 3.312720047685e+03
>>    17 KSP Residual norm 3.263690150810e+03
>>    18 KSP Residual norm 3.219359862444e+03
>>    19 KSP Residual norm 3.173500955995e+03
>>    20 KSP Residual norm 3.127528790155e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 3.186752172556e+03
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>
>>
>>
>>   0 SNES Function norm 3.424003312857e+04
>>     0 KSP Residual norm 3.424003312857e+04
>>     1 KSP Residual norm 2.886124328003e+04
>>     2 KSP Residual norm 2.504664994221e+04
>>     3 KSP Residual norm 2.104615835130e+04
>>     4 KSP Residual norm 1.938102896610e+04
>>     5 KSP Residual norm 1.793774642406e+04
>>     6 KSP Residual norm 1.671392566981e+04
>>     7 KSP Residual norm 1.501504103854e+04
>>     8 KSP Residual norm 1.366362900726e+04
>>     9 KSP Residual norm 1.240398500414e+04
>>    10 KSP Residual norm 1.156293733914e+04
>>    11 KSP Residual norm 1.066296477972e+04
>>    12 KSP Residual norm 9.835601967036e+03
>>    13 KSP Residual norm 9.017480191500e+03
>>    14 KSP Residual norm 8.415336139732e+03
>>    15 KSP Residual norm 7.807497808414e+03
>>    16 KSP Residual norm 7.341703768300e+03
>>    17 KSP Residual norm 6.979298049244e+03
>>    18 KSP Residual norm 6.521277772042e+03
>>    19 KSP Residual norm 6.174842408713e+03
>>    20 KSP Residual norm 5.889819664983e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 1.000525348435e+04
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>   0 SNES Function norm 1.000525348435e+04
>>     0 KSP Residual norm 1.000525348435e+04
>>     1 KSP Residual norm 7.908741565645e+03
>>     2 KSP Residual norm 6.825263536988e+03
>>     3 KSP Residual norm 6.224930664967e+03
>>     4 KSP Residual norm 6.095547180474e+03
>>     5 KSP Residual norm 5.952968230397e+03
>>     6 KSP Residual norm 5.861251998127e+03
>>     7 KSP Residual norm 5.712439327726e+03
>>     8 KSP Residual norm 5.583056913167e+03
>>     9 KSP Residual norm 5.461768804526e+03
>>    10 KSP Residual norm 5.351937611030e+03
>>    11 KSP Residual norm 5.224288337536e+03
>>    12 KSP Residual norm 5.129863847028e+03
>>    13 KSP Residual norm 5.010818237161e+03
>>    14 KSP Residual norm 4.907162936143e+03
>>    15 KSP Residual norm 4.789564773923e+03
>>    16 KSP Residual norm 4.695173370709e+03
>>    17 KSP Residual norm 4.584070962145e+03
>>    18 KSP Residual norm 4.483061424714e+03
>>    19 KSP Residual norm 4.373384070713e+03
>>    20 KSP Residual norm 4.260704657576e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 4.662386014874e+03
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>   0 SNES Function norm 4.662386014874e+03
>>     0 KSP Residual norm 4.662386014874e+03
>>     1 KSP Residual norm 4.408316259834e+03
>>     2 KSP Residual norm 4.184867769891e+03
>>     3 KSP Residual norm 4.079091244367e+03
>>     4 KSP Residual norm 4.009247390184e+03
>>     5 KSP Residual norm 3.928417371457e+03
>>     6 KSP Residual norm 3.865152075802e+03
>>     7 KSP Residual norm 3.795606446041e+03
>>     8 KSP Residual norm 3.735294554160e+03
>>     9 KSP Residual norm 3.674393726485e+03
>>    10 KSP Residual norm 3.617795166775e+03
>>    11 KSP Residual norm 3.563807982249e+03
>>    12 KSP Residual norm 3.512269444873e+03
>>    13 KSP Residual norm 3.455110223193e+03
>>    14 KSP Residual norm 3.407141247334e+03
>>    15 KSP Residual norm 3.356562415949e+03
>>    16 KSP Residual norm 3.312720047652e+03
>>    17 KSP Residual norm 3.263690150782e+03
>>    18 KSP Residual norm 3.219359862425e+03
>>    19 KSP Residual norm 3.173500955997e+03
>>    20 KSP Residual norm 3.127528790156e+03
>>   Linear solve converged due to CONVERGED_ITS iterations 20
>> KSP Object: 1 MPI process
>>   type: gmres
>>     restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>     happy breakdown tolerance 1e-30
>>   maximum iterations=20, initial guess is zero
>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI process
>>   type: none
>>   linear system matrix = precond matrix:
>>   Mat Object: 1 MPI process
>>     type: seqbaij
>>     rows=16384, cols=16384, bs=16
>>     total: nonzeros=1277952, allocated nonzeros=1277952
>>     total number of mallocs used during MatSetValues calls=0
>>         block size is 16
>>   1 SNES Function norm 3.186752172503e+03
>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>> SNES Object: 1 MPI process
>>   type: newtonls
>>   maximum iterations=1, maximum function evaluations=-1
>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>   total number of linear solver iterations=20
>>   total number of function evaluations=2
>>   norm schedule ALWAYS
>>   Jacobian is never rebuilt
>>   Jacobian is built using finite differences with coloring
>>   SNESLineSearch Object: 1 MPI process
>>     type: basic
>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>> lambda=1.000000e-08
>>     maximum iterations=40
>>   KSP Object: 1 MPI process
>>     type: gmres
>>       restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>       happy breakdown tolerance 1e-30
>>     maximum iterations=20, initial guess is zero
>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>     left preconditioning
>>     using PRECONDITIONED norm type for convergence test
>>   PC Object: 1 MPI process
>>     type: none
>>     linear system matrix = precond matrix:
>>     Mat Object: 1 MPI process
>>       type: seqbaij
>>       rows=16384, cols=16384, bs=16
>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>       total number of mallocs used during MatSetValues calls=0
>>           block size is 16
>>
>> On Thu, May 4, 2023 at 5:22 PM Matthew Knepley <[email protected]> wrote:
>>
>>> On Thu, May 4, 2023 at 5:03 PM Mark Lohry <[email protected]> wrote:
>>>
>>>>> Do you get different results (in different runs) without
>>>>>  -snes_mf_operator? So just using an explicit matrix?
>>>>
>>>>
>>>> Unfortunately I don't have an explicit matrix available for this, hence
>>>> the MFFD/JFNK.
>>>>
>>>
>>> I don't mean the actual matrix, I mean a representative matrix.
>>>
>>>
>>>>
>>>>>   (Note: I am not convinced there is even a problem and think it may
>>>>> be simply different order of floating point operations in different runs.)
>>>>>
>>>>
>>>> I'm not convinced either, but running explicit RK for 10,000 iterations
>>>> I get exactly the same results every time, so I'm fairly confident it's not
>>>> the residual evaluation.
>>>> How would there be a different order of floating point ops in different
>>>> runs in serial?
>>>>
>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>>>>> that solver with a sparse matrix. This would give me confidence
>>>>> that nothing in the solver is variable.
>>>>>
>>>> I could do the sparse finite difference jacobian once, save it to
>>>> disk, and then use that system each time.
>>>>
>>>
>>> Yes. That would work.
>>>
>>>   Thanks,
>>>
>>>      Matt
>>>
>>>
>>>> On Thu, May 4, 2023 at 4:57 PM Matthew Knepley <[email protected]>
>>>> wrote:
>>>>
>>>>> On Thu, May 4, 2023 at 4:44 PM Mark Lohry <[email protected]> wrote:
>>>>>
>>>>>>> Is your code valgrind clean?
>>>>>>>
>>>>>>
>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not
>>>>>> using anything uninitialized.
>>>>>>
>>>>>>
>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix
>>>>>>> and run. Do you see any variability?
>>>>>>>
>>>>>>
>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and
>>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where
>>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still
>>>>>> with differences but sometimes identical.
>>>>>>
>>>>>
>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>>>>> that solver with a sparse matrix. This would give me confidence
>>>>> that nothing in the solver is variable.
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>>
>>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>>     2 KSP Residual norm 2.490276930242e+04
>>>>>>     3 KSP Residual norm 2.131675872968e+04
>>>>>>     4 KSP Residual norm 1.973129814235e+04
>>>>>>     5 KSP Residual norm 1.832377856317e+04
>>>>>>     6 KSP Residual norm 1.716783617436e+04
>>>>>>     7 KSP Residual norm 1.583963149542e+04
>>>>>>     8 KSP Residual norm 1.482272170304e+04
>>>>>>     9 KSP Residual norm 1.380312106742e+04
>>>>>>    10 KSP Residual norm 1.297793480658e+04
>>>>>>    11 KSP Residual norm 1.208599123244e+04
>>>>>>    12 KSP Residual norm 1.137345655227e+04
>>>>>>    13 KSP Residual norm 1.059676909366e+04
>>>>>>    14 KSP Residual norm 1.003823862398e+04
>>>>>>    15 KSP Residual norm 9.425879221354e+03
>>>>>>    16 KSP Residual norm 8.954805890038e+03
>>>>>>    17 KSP Residual norm 8.592372470456e+03
>>>>>>    18 KSP Residual norm 8.060707175821e+03
>>>>>>    19 KSP Residual norm 7.782057728723e+03
>>>>>>    20 KSP Residual norm 7.449686095424e+03
>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>> KSP Object: 1 MPI process
>>>>>>   type: gmres
>>>>>>     restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>> Orthogonalization with no iterative refinement
>>>>>>     happy breakdown tolerance 1e-30
>>>>>>   maximum iterations=20, initial guess is zero
>>>>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>   left preconditioning
>>>>>>   using PRECONDITIONED norm type for convergence test
>>>>>> PC Object: 1 MPI process
>>>>>>   type: none
>>>>>>   linear system matrix followed by preconditioner matrix:
>>>>>>   Mat Object: 1 MPI process
>>>>>>     type: mffd
>>>>>>     rows=16384, cols=16384
>>>>>>       Matrix-free approximation:
>>>>>>         err=1.49012e-08 (relative error in function evaluation)
>>>>>>         Using wp compute h routine
>>>>>>             Does not compute normU
>>>>>>   Mat Object: 1 MPI process
>>>>>>     type: seqaij
>>>>>>     rows=16384, cols=16384
>>>>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>>>>     total number of mallocs used during MatSetValues calls=0
>>>>>>       not using I-node routines
>>>>>>   1 SNES Function norm 1.085015646971e+04
>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>> SNES Object: 1 MPI process
>>>>>>   type: newtonls
>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>   total number of linear solver iterations=20
>>>>>>   total number of function evaluations=23
>>>>>>   norm schedule ALWAYS
>>>>>>   Jacobian is never rebuilt
>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>   Preconditioning Jacobian is built using finite differences with
>>>>>> coloring
>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>     type: basic
>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>> lambda=1.000000e-08
>>>>>>     maximum iterations=40
>>>>>>   KSP Object: 1 MPI process
>>>>>>     type: gmres
>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>> Orthogonalization with no iterative refinement
>>>>>>       happy breakdown tolerance 1e-30
>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>     left preconditioning
>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>   PC Object: 1 MPI process
>>>>>>     type: none
>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>     Mat Object: 1 MPI process
>>>>>>       type: mffd
>>>>>>       rows=16384, cols=16384
>>>>>>         Matrix-free approximation:
>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>           Using wp compute h routine
>>>>>>               Does not compute normU
>>>>>>     Mat Object: 1 MPI process
>>>>>>       type: seqaij
>>>>>>       rows=16384, cols=16384
>>>>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>         not using I-node routines
>>>>>>
>>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>>     2 KSP Residual norm 2.490276931041e+04
>>>>>>     3 KSP Residual norm 2.131675873776e+04
>>>>>>     4 KSP Residual norm 1.973129814908e+04
>>>>>>     5 KSP Residual norm 1.832377852186e+04
>>>>>>     6 KSP Residual norm 1.716783608174e+04
>>>>>>     7 KSP Residual norm 1.583963128956e+04
>>>>>>     8 KSP Residual norm 1.482272160069e+04
>>>>>>     9 KSP Residual norm 1.380312087005e+04
>>>>>>    10 KSP Residual norm 1.297793458796e+04
>>>>>>    11 KSP Residual norm 1.208599115602e+04
>>>>>>    12 KSP Residual norm 1.137345657533e+04
>>>>>>    13 KSP Residual norm 1.059676906197e+04
>>>>>>    14 KSP Residual norm 1.003823857515e+04
>>>>>>    15 KSP Residual norm 9.425879177747e+03
>>>>>>    16 KSP Residual norm 8.954805850825e+03
>>>>>>    17 KSP Residual norm 8.592372413320e+03
>>>>>>    18 KSP Residual norm 8.060706994110e+03
>>>>>>    19 KSP Residual norm 7.782057560782e+03
>>>>>>    20 KSP Residual norm 7.449686034356e+03
>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>> KSP Object: 1 MPI process
>>>>>>   type: gmres
>>>>>>     restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>> Orthogonalization with no iterative refinement
>>>>>>     happy breakdown tolerance 1e-30
>>>>>>   maximum iterations=20, initial guess is zero
>>>>>>   tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>   left preconditioning
>>>>>>   using PRECONDITIONED norm type for convergence test
>>>>>> PC Object: 1 MPI process
>>>>>>   type: none
>>>>>>   linear system matrix followed by preconditioner matrix:
>>>>>>   Mat Object: 1 MPI process
>>>>>>     type: mffd
>>>>>>     rows=16384, cols=16384
>>>>>>       Matrix-free approximation:
>>>>>>         err=1.49012e-08 (relative error in function evaluation)
>>>>>>         Using wp compute h routine
>>>>>>             Does not compute normU
>>>>>>   Mat Object: 1 MPI process
>>>>>>     type: seqaij
>>>>>>     rows=16384, cols=16384
>>>>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>>>>     total number of mallocs used during MatSetValues calls=0
>>>>>>       not using I-node routines
>>>>>>   1 SNES Function norm 1.085015821006e+04
>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>> SNES Object: 1 MPI process
>>>>>>   type: newtonls
>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>   total number of linear solver iterations=20
>>>>>>   total number of function evaluations=23
>>>>>>   norm schedule ALWAYS
>>>>>>   Jacobian is never rebuilt
>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>   Preconditioning Jacobian is built using finite differences with
>>>>>> coloring
>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>     type: basic
>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>> lambda=1.000000e-08
>>>>>>     maximum iterations=40
>>>>>>   KSP Object: 1 MPI process
>>>>>>     type: gmres
>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>> Orthogonalization with no iterative refinement
>>>>>>       happy breakdown tolerance 1e-30
>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>     left preconditioning
>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>   PC Object: 1 MPI process
>>>>>>     type: none
>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>     Mat Object: 1 MPI process
>>>>>>       type: mffd
>>>>>>       rows=16384, cols=16384
>>>>>>         Matrix-free approximation:
>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>           Using wp compute h routine
>>>>>>               Does not compute normU
>>>>>>     Mat Object: 1 MPI process
>>>>>>       type: seqaij
>>>>>>       rows=16384, cols=16384
>>>>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>         not using I-node routines
>>>>>>
>>>>>> On Thu, May 4, 2023 at 10:10 AM Matthew Knepley <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, May 4, 2023 at 8:54 AM Mark Lohry <[email protected]> wrote:
>>>>>>>
>>>>>>>> Try -pc_type none.
>>>>>>>>>
>>>>>>>>
>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But
>>>>>>>> *sometimes* it's producing exactly the same history and others it's
>>>>>>>> gradually changing.  I'm reasonably confident my residual evaluation
>>>>>>>> has no randomness, see info after the petsc output.
>>>>>>>>
>>>>>>>
>>>>>>> We can try and test this. Replace your MatMFFD with an actual matrix
>>>>>>> and run. Do you see any variability?
>>>>>>>
>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So
>>>>>>> run a few with -snes_view, and we can see if the
>>>>>>> "w" parameter changes.
>>>>>>>
>>>>>>>   Thanks,
>>>>>>>
>>>>>>>      Matt
>>>>>>>
>>>>>>>
>>>>>>>> solve history 1:
>>>>>>>>
>>>>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>>>>     2 KSP Residual norm 2.490276931041e+04
>>>>>>>> ...
>>>>>>>>    20 KSP Residual norm 7.449686034356e+03
>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>   1 SNES Function norm 1.085015821006e+04
>>>>>>>>
>>>>>>>> solve history 2, identical to 1:
>>>>>>>>
>>>>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>>>>     2 KSP Residual norm 2.490276931041e+04
>>>>>>>> ...
>>>>>>>>    20 KSP Residual norm 7.449686034356e+03
>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>   1 SNES Function norm 1.085015821006e+04
>>>>>>>>
>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2,
>>>>>>>> growing difference to the end:
>>>>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>>>>     2 KSP Residual norm 2.490276930242e+04
>>>>>>>> ...
>>>>>>>>  20 KSP Residual norm 7.449686095424e+03
>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>   1 SNES Function norm 1.085015646971e+04
>>>>>>>>
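The three histories above can also be compared mechanically rather than by eye. A minimal sketch, assuming the monitor output has been captured as plain text; the helper names `parse_ksp` and `first_divergence` are hypothetical and not part of PETSc, and the data are the lines shown in solve histories 1 and 3 (elided iterations omitted):

```python
import re

# Lines copied from solve histories 1 and 3 above.
HISTORY_1 = """
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.871734444536e+04
    2 KSP Residual norm 2.490276931041e+04
   20 KSP Residual norm 7.449686034356e+03
"""
HISTORY_3 = """
    0 KSP Residual norm 3.424003312857e+04
    1 KSP Residual norm 2.871734444536e+04
    2 KSP Residual norm 2.490276930242e+04
   20 KSP Residual norm 7.449686095424e+03
"""

KSP_LINE = re.compile(r"^\s*(\d+) KSP Residual norm (\S+)\s*$")


def parse_ksp(text):
    """Map iteration number -> residual norm for every KSP monitor line."""
    norms = {}
    for line in text.splitlines():
        match = KSP_LINE.match(line)
        if match:
            norms[int(match.group(1))] = float(match.group(2))
    return norms


def first_divergence(a, b):
    """Return (iteration, relative difference) of the first mismatch, or None."""
    for it in sorted(set(a) & set(b)):
        if a[it] != b[it]:
            return it, abs(a[it] - b[it]) / abs(a[it])
    return None


result = first_divergence(parse_ksp(HISTORY_1), parse_ksp(HISTORY_3))
print(result)  # first mismatching iteration and its relative size
```

Only the printed digits are compared here, so two runs that differ below the monitor's 13 significant digits would still look identical to this check.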
>>>>>>>>
>>>>>>>> This is using a standard explicit 3-stage Runge-Kutta smoother for
>>>>>>>> 10 iterations, so 30 calls of the same residual evaluation, identical
>>>>>>>> residuals every time.
>>>>>>>>
>>>>>>>> run 1:
>>>>>>>>
>>>>>>>> #          iteration                 rho                rhou                rhov                rhoE             abs_res             rel_res                umin                vmax                vmin        elapsed_time
>>>>>>>> #
>>>>>>>>          1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02         2.04353e+02         1.00000e+00        -8.23945e-15        -6.15326e-15        -1.35563e-14         6.34834e-01
>>>>>>>>          2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02         2.58647e+02         1.26568e+00        -1.02539e-14        -9.35368e-15        -1.69925e-14         6.40063e-01
>>>>>>>>          3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02         2.36201e+02         1.15585e+00        -1.19370e-14        -1.15216e-14        -1.59733e-14         6.45166e-01
>>>>>>>>          4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02         2.03612e+02         9.96372e-01        -1.81831e-14        -1.28312e-14        -1.46238e-14         6.50494e-01
>>>>>>>>          5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02         1.77577e+02         8.68970e-01         3.56633e-14        -8.74089e-15        -1.06666e-14         6.55656e-01
>>>>>>>>          6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02         1.57913e+02         7.72745e-01        -8.98634e-14        -2.41152e-14        -1.39713e-14         6.60872e-01
>>>>>>>>          7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02         1.42872e+02         6.99144e-01        -2.95352e-13        -2.48158e-14        -2.39351e-14         6.66041e-01
>>>>>>>>          8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02         1.30728e+02         6.39715e-01         8.98941e-13         1.09674e-13         3.78905e-14         6.71316e-01
>>>>>>>>          9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02         1.20444e+02         5.89394e-01         1.70717e-12         1.38762e-14         1.09825e-13         6.76447e-01
>>>>>>>>          1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02         1.11475e+02         5.45501e-01        -4.22462e-12        -7.15206e-13        -2.28736e-13         6.81716e-01
>>>>>>>>
>>>>>>>> run N:
>>>>>>>>
>>>>>>>>
>>>>>>>> #
>>>>>>>>
>>>>>>>>
>>>>>>>> #          iteration                 rho                rhou                rhov                rhoE             abs_res             rel_res                umin                vmax                vmin        elapsed_time
>>>>>>>> #
>>>>>>>>          1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02         2.04353e+02         1.00000e+00        -8.23945e-15        -6.15326e-15        -1.35563e-14         6.23316e-01
>>>>>>>>          2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02         2.58647e+02         1.26568e+00        -1.02539e-14        -9.35368e-15        -1.69925e-14         6.28510e-01
>>>>>>>>          3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02         2.36201e+02         1.15585e+00        -1.19370e-14        -1.15216e-14        -1.59733e-14         6.33558e-01
>>>>>>>>          4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02         2.03612e+02         9.96372e-01        -1.81831e-14        -1.28312e-14        -1.46238e-14         6.38773e-01
>>>>>>>>          5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02         1.77577e+02         8.68970e-01         3.56633e-14        -8.74089e-15        -1.06666e-14         6.43887e-01
>>>>>>>>          6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02         1.57913e+02         7.72745e-01        -8.98634e-14        -2.41152e-14        -1.39713e-14         6.49073e-01
>>>>>>>>          7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02         1.42872e+02         6.99144e-01        -2.95352e-13        -2.48158e-14        -2.39351e-14         6.54167e-01
>>>>>>>>          8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02         1.30728e+02         6.39715e-01         8.98941e-13         1.09674e-13         3.78905e-14         6.59394e-01
>>>>>>>>          9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02         1.20444e+02         5.89394e-01         1.70717e-12         1.38762e-14         1.09825e-13         6.64516e-01
>>>>>>>>          1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02         1.11475e+02         5.45501e-01        -4.22462e-12        -7.15206e-13        -2.28736e-13         6.69677e-01
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, May 4, 2023 at 8:41 AM Mark Adams <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more
>>>>>>>>> procs unless you use jacobi. (maybe I am missing something).
>>>>>>>>>
>>>>>>>>> On Thu, May 4, 2023 at 8:31 AM Mark Lohry <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>  Please send the output of -snes_view.
>>>>>>>>>>>
>>>>>>>>>> pasted below. anything stand out?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> SNES Object: 1 MPI process
>>>>>>>>>>   type: newtonls
>>>>>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>>>>   total number of linear solver iterations=20
>>>>>>>>>>   total number of function evaluations=22
>>>>>>>>>>   norm schedule ALWAYS
>>>>>>>>>>   Jacobian is never rebuilt
>>>>>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>>>>>   Preconditioning Jacobian is built using finite differences with
>>>>>>>>>> coloring
>>>>>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>>>>>     type: basic
>>>>>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>>>>>> lambda=1.000000e-08
>>>>>>>>>>     maximum iterations=40
>>>>>>>>>>   KSP Object: 1 MPI process
>>>>>>>>>>     type: gmres
>>>>>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>>>>       happy breakdown tolerance 1e-30
>>>>>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>>>>>     tolerances:  relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>>>>     left preconditioning
>>>>>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>>>>>   PC Object: 1 MPI process
>>>>>>>>>>     type: asm
>>>>>>>>>>       total subdomain blocks = 1, amount of overlap = 0
>>>>>>>>>>       restriction/interpolation type - RESTRICT
>>>>>>>>>>       Local solver information for first block is in the
>>>>>>>>>> following KSP and PC objects on rank 0:
>>>>>>>>>>       Use -ksp_view ::ascii_info_detail to display information
>>>>>>>>>> for all blocks
>>>>>>>>>>     KSP Object: (sub_) 1 MPI process
>>>>>>>>>>       type: preonly
>>>>>>>>>>       maximum iterations=10000, initial guess is zero
>>>>>>>>>>       tolerances:  relative=1e-05, absolute=1e-50,
>>>>>>>>>> divergence=10000.
>>>>>>>>>>       left preconditioning
>>>>>>>>>>       using NONE norm type for convergence test
>>>>>>>>>>     PC Object: (sub_) 1 MPI process
>>>>>>>>>>       type: ilu
>>>>>>>>>>         out-of-place factorization
>>>>>>>>>>         0 levels of fill
>>>>>>>>>>         tolerance for zero pivot 2.22045e-14
>>>>>>>>>>         matrix ordering: natural
>>>>>>>>>>         factor fill ratio given 1., needed 1.
>>>>>>>>>>           Factored matrix follows:
>>>>>>>>>>             Mat Object: (sub_) 1 MPI process
>>>>>>>>>>               type: seqbaij
>>>>>>>>>>               rows=16384, cols=16384, bs=16
>>>>>>>>>>               package used to perform factorization: petsc
>>>>>>>>>>               total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>                   block size is 16
>>>>>>>>>>       linear system matrix = precond matrix:
>>>>>>>>>>       Mat Object: (sub_) 1 MPI process
>>>>>>>>>>         type: seqbaij
>>>>>>>>>>         rows=16384, cols=16384, bs=16
>>>>>>>>>>         total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>         total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>             block size is 16
>>>>>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>>>>>     Mat Object: 1 MPI process
>>>>>>>>>>       type: mffd
>>>>>>>>>>       rows=16384, cols=16384
>>>>>>>>>>         Matrix-free approximation:
>>>>>>>>>>           err=1.49012e-08 (relative error in function evaluation)
>>>>>>>>>>           Using wp compute h routine
>>>>>>>>>>               Does not compute normU
>>>>>>>>>>     Mat Object: 1 MPI process
>>>>>>>>>>       type: seqbaij
>>>>>>>>>>       rows=16384, cols=16384, bs=16
>>>>>>>>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>           block size is 16
>>>>>>>>>>
>>>>>>>>>> On Thu, May 4, 2023 at 8:30 AM Mark Adams <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> If you are using MG what is the coarse grid solver?
>>>>>>>>>>> -snes_view might give you that.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 4, 2023 at 8:25 AM Matthew Knepley <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21 AM Mark Lohry <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Do they start very similarly and then slowly drift further
>>>>>>>>>>>>>> apart?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, this. I take it this sounds familiar?
>>>>>>>>>>>>>
>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the
>>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is
>>>>>>>>>>>>> identical to 5 digits), but in the context I'm using it in
>>>>>>>>>>>>> (repeated applications to solve a steady state multigrid problem,
>>>>>>>>>>>>> though here just one level) the differences add up such that I
>>>>>>>>>>>>> might reach global convergence in 35 iterations or 38. It's not
>>>>>>>>>>>>> the end of the world, but I was expecting that with -np 1 these
>>>>>>>>>>>>> would be identical and I'm not sure where the root cause would be.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The initial KSP residual is different, so it's the PC.
>>>>>>>>>>>> Please send the output of -snes_view. If your ASM is using direct
>>>>>>>>>>>> factorization, then it could be randomness in whatever LU you are
>>>>>>>>>>>> using.
>>>>>>>>>>>>
>>>>>>>>>>>>   Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>>     Matt
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>>>>     0 KSP Residual norm 4.045639499595e+01
>>>>>>>>>>>>>     1 KSP Residual norm 1.917999809040e+01
>>>>>>>>>>>>>     2 KSP Residual norm 1.616048521958e+01
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>    19 KSP Residual norm 8.788043518111e-01
>>>>>>>>>>>>>    20 KSP Residual norm 6.570851270214e-01
>>>>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>   1 SNES Function norm 1.801309983345e+03
>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly
>>>>>>>>>>>>> different
>>>>>>>>>>>>>
>>>>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>>>>     0 KSP Residual norm 4.045639473002e+01
>>>>>>>>>>>>>     1 KSP Residual norm 1.917999883034e+01
>>>>>>>>>>>>>     2 KSP Residual norm 1.616048572016e+01
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>    19 KSP Residual norm 8.788046348957e-01
>>>>>>>>>>>>>    20 KSP Residual norm 6.570859588610e-01
>>>>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>   1 SNES Function norm 1.801311320322e+03
>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05 PM Barry Smith <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   Do they start very similarly and then slowly drift further
>>>>>>>>>>>>>> apart? That is, for the first couple of KSP iterations they are
>>>>>>>>>>>>>> almost identical, but then with each iteration they get a bit
>>>>>>>>>>>>>> further apart. Similarly for the SNES iterations: starting close,
>>>>>>>>>>>>>> and then with more iterations and more solves they start moving
>>>>>>>>>>>>>> apart. Or do they suddenly jump to be very different? You can run
>>>>>>>>>>>>>> with -snes_monitor -ksp_monitor
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring,
>>>>>>>>>>>>>> was just guessing there. But the solutions/residuals are slightly
>>>>>>>>>>>>>> different from run to run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should
>>>>>>>>>>>>>> expect bitwise identical results?
>>>>>>>>>>>>>>
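For context on the question above: bitwise-identical results require every floating-point operation to happen in the same order, because floating-point addition is not associative. A small sketch in Python (not PETSc code; the values and the `dot` helper are chosen only for illustration):

```python
# Floating-point addition is not associative: regrouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # rounds after a + b, then again after adding c
right = a + (b + c)   # same three numbers, different grouping
print(left == right)  # False

# A more dramatic case: the small term is absorbed if it meets the large one first.
big, small = 1.0e16, 1.0
absorbed = (big + small) - big   # 0.0
kept = (big - big) + small       # 1.0
print(absorbed, kept)


def dot(x, y, reverse=False):
    """Plain dot product; `reverse` only changes the accumulation order."""
    idx = range(len(x) - 1, -1, -1) if reverse else range(len(x))
    total = 0.0
    for i in idx:
        total += x[i] * y[i]
    return total


# Illustrative data only; real residual vectors would come from the solver.
x = [1.0 / (i + 1) for i in range(16384)]
forward = dot(x, x)
backward = dot(x, x, reverse=True)
rel_diff = abs(forward - backward) / forward
print(rel_diff)  # tiny, and not guaranteed to be zero
```

Norms and Gram-Schmidt inner products in GMRES are reductions of this kind, so anything that changes the accumulation order can change the last bits of a residual norm, and later iterations then build on slightly different numbers.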
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   No, the coloring should be identical every time. Do you
>>>>>>>>>>>>>>> see differences with 1 MPI rank? (Or much smaller ones?).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry <[email protected]> wrote:
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK
>>>>>>>>>>>>>>> > nonlinear solver where I give it the sparsity. PC asm, KSP gmres,
>>>>>>>>>>>>>>> > with SNESSetLagJacobian -2 (compute once and then frozen jacobian).
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals from
>>>>>>>>>>>>>>> > run to run. I'm wondering where randomness might enter here
>>>>>>>>>>>>>>> > -- does the jacobian coloring use a random seed?
>>>>>>>>>>>>>>>
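As background to the setup described above: a matrix-free (MFFD) Jacobian never forms J explicitly. It approximates the product J(u) v by differencing the residual function. A minimal sketch, assuming a Walker-Pernice style step size h = err * sqrt(1 + ||u||) / ||v|| with err = 1.49012e-08 as printed by -snes_view; the function `mffd_apply` and the test function `F` are hypothetical, and PETSc's actual implementation may differ in detail:

```python
import math


def mffd_apply(F, u, v, err=1.49012e-08):
    """Approximate J(u) @ v by (F(u + h v) - F(u)) / h without forming J."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(v)
    # Assumed step-size rule (see lead-in); other rules exist.
    h = err * math.sqrt(1.0 + norm_u) / norm_v
    base = F(u)
    shifted = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(s - b) / h for s, b in zip(shifted, base)]


def F(u):
    """Toy nonlinear residual with exact Jacobian [[2x, 1], [y, x]]."""
    x, y = u
    return [x * x + y, x * y]


u = [1.0, 2.0]
v = [1.0, 0.0]
Jv = mffd_apply(F, u, v)
print(Jv)  # close to the exact product [2.0, 2.0]
```

Because each product is a difference of two residual evaluations divided by a small h, a last-bit change in F is amplified by roughly 1/h in the product, which is one way tiny round-off differences can become visible in the Krylov iteration.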
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> What most experimenters take for granted before they begin
>>>>>>>>>>>> their experiments is infinitely more interesting than any results
>>>>>>>>>>>> to which their experiments lead.
>>>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>>>
>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>> <configure.log>
>
>
>
