Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

Dmitry Pavlov Tue, 27 Mar 2018 09:41:33 -0700

Hi Eduard, thank you for the review.

Hi Ivan,

I'm confused by the PR naming:
https://github.com/apache/ignite/pull/3656

Could you rename it?

Sincerely,
Dmitriy Pavlov

Tue, 27 Mar 2018 at 19:38, Eduard Shangareev <eduard.shangar...@gmail.com>:

> Ivan, I have reviewed your changes; they look good.
>
> On Tue, Mar 27, 2018 at 2:56 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>
> > Igniters,
> >
> > I've completed development of
> > https://issues.apache.org/jira/browse/IGNITE-7754. The TeamCity state is OK.
> > Please review my changes.
> > Please note that it will be possible to track the time of WAL fsync on
> > checkpoint begin via the *walCpRecordFsyncDuration* metric in the
> > "Checkpoint started" message.
> >
> > Also, I've created https://issues.apache.org/jira/browse/IGNITE-8057 with
> > a description of a possible further improvement of WAL fsync on checkpoint
> > begin.
> >
> > Best Regards,
> > Ivan Rakov
> >
> >
> > On 26.03.2018 23:45, Valentin Kulichenko wrote:
> >
> >> Ivan,
> >>
> >> It's all good then :) Thanks!
> >>
> >> -Val
> >>
> >> On Mon, Mar 26, 2018 at 1:50 AM, Ivan Rakov <ivan.glu...@gmail.com>
> >> wrote:
> >>
> >>> Val,
> >>>
> >>> There's no sense in using WalMode.NONE in a production environment; it's
> >>> kept for testing and debugging purposes (including possible user
> >>> activities like capacity planning).
> >>> We already print a warning at node start in case WalMode.NONE is set:
> >>>
> >>> U.quietAndWarn(log, "Started write-ahead log manager in NONE mode, persisted data may be lost in " +
> >>>     "a case of unexpected node failure. Make sure to deactivate the cluster before shutdown.");
> >>>
> >>> Best Regards,
> >>> Ivan Rakov
> >>>
> >>>
> >>> On 24.03.2018 1:40, Valentin Kulichenko wrote:
> >>>
> >>>> Dmitry,
> >>>>
> >>>> Thanks for the clarification. So it sounds like if we fix all other modes
> >>>> as we discuss here, NONE would be the only one allowing corruption. I also
> >>>> don't see much sense in this, and I think we should clearly state this in
> >>>> the doc, as well as print out a warning if NONE mode is used. Eventually,
> >>>> if it's confirmed that there are no reasonable use cases for it, we can
> >>>> deprecate it.
> >>>>
> >>>> -Val
> >>>>
> >>>> On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov <dpavlov....@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Val,
> >>>>>
> >>>>> NONE means that the WAL log is disabled and not written at all. Use of
> >>>>> this mode is at your own risk. It is possible that restoring state after
> >>>>> a crash in the middle of a checkpoint will not succeed. I do not see much
> >>>>> sense in it, especially in production.
> >>>>>
> >>>>> BACKGROUND is a fully functional WAL mode, but it allows some delay
> >>>>> before flush to disk.
> >>>>>
> >>>>> Sincerely,
> >>>>> Dmitriy Pavlov
> >>>>>
> >>>>> Sat, 24 Mar 2018 at 1:07, Valentin Kulichenko <
> >>>>> valentin.kuliche...@gmail.com>:
> >>>>>
> >>>>>> I agree. In my view, any possibility to get a corrupted storage is a
> >>>>>> bug which needs to be fixed.
> >>>>>>
> >>>>>> BTW, can someone explain the semantics of NONE mode? What is the
> >>>>>> difference from BACKGROUND from the user's perspective? Is there any
> >>>>>> particular use case where it can be used?
> >>>>>>
> >>>>>> -Val
> >>>>>>
> >>>>>> On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov <dpavlov....@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Ivan,
> >>>>>>>
> >>>>>>> IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
> >>>>>>>
> >>>>>>> Sincerely,
> >>>>>>> Dmitriy Pavlov
> >>>>>>>
> >>>>>>> Fri, 23 Mar 2018 at 12:23, Ivan Rakov <ivan.glu...@gmail.com>:
> >>>>>>>
> >>>>>>>> Igniters, there's another important question about this matter.
> >>>>>>>>
> >>>>>>>> Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that we
> >>>>>>>> have to do it: it will cause a similar performance drop, but if we
> >>>>>>>> consider LOG_ONLY broken without these fixes, BACKGROUND is broken as
> >>>>>>>> well.
> >>>>>>>>
> >>>>>>>> Best Regards,
> >>>>>>>> Ivan Rakov
> >>>>>>>>
> >>>>>>>> On 23.03.2018 10:27, Ivan Rakov wrote:
> >>>>>>>>
> >>>>>>>>> Fixes are quite simple.
> >>>>>>>>> I expect them to be merged into master within a week in the worst case.
> >>>>>>>>>
> >>>>>>>>> Best Regards,
> >>>>>>>>> Ivan Rakov
> >>>>>>>>>
> >>>>>>>>> On 22.03.2018 17:49, Denis Magda wrote:
> >>>>>>>>>
> >>>>>>>>>> Ivan,
> >>>>>>>>>>
> >>>>>>>>>> How quickly are you going to merge the fix into master? Many
> >>>>>>>>>> persistence-related optimizations have already stacked up. Probably, we
> >>>>>>>>>> can release them sooner if the community agrees.
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Denis
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov <ivan.glu...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks all!
> >>>>>>>>>>>
> >>>>>>>>>>> We seem to have reached a consensus on this issue. I'll just add the
> >>>>>>>>>>> necessary fsyncs under IGNITE-7754.
> >>>>>>>>>>>
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> +1 for fixing LOG_ONLY. If the current implementation doesn't protect
> >>>>>>>>>>>> from data corruption, it doesn't make sense.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <dma...@apache.org>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> +1 for the fix of LOG_ONLY
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> >>>>>>>>>>>>> alexey.goncha...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> +1 for fixing LOG_ONLY to enforce corruption safety given the
> >>>>>>>>>>>>>> provided performance results.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +1 for accepting the drop in LOG_ONLY. 7% is not that much, and not a
> >>>>>>>>>>>>>>> drop at all, provided that we are fixing a bug. I.e. had we implemented
> >>>>>>>>>>>>>>> it correctly in the first place, we would never have noticed any "drop".
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I do not understand why someone would like to use the current broken
> >>>>>>>>>>>>>>> mode.
> >>>>>>>
> >>>>>>>>>>>>>>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov <dpavlov....@gmail.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi, I think option 1 is better. As Val said, any mode that allows
> >>>>>>>>>>>>>>>> corruption does not make much sense.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> What Ivan mentioned here as a drop is, in relation to the old DEFAULT
> >>>>>>>>>>>>>>>> mode (FSYNC now), still a significant performance boost.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Sincerely,
> >>>>>>>>>>>>>>>> Dmitriy Pavlov
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Wed, 21 Mar 2018 at 17:56, Ivan Rakov <ivan.glu...@gmail.com>:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I've attached benchmark results to the JIRA ticket.
> >>>>>>>>>>>>>>>>> We observe a ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of the
> >>>>>>>>>>>>>>>>> WAL compaction enabled flag. It's a pretty significant drop: WAL
> >>>>>>>>>>>>>>>>> compaction itself gives only a ~3% drop.
> >>>>>>>>>>>>>>>>> I see two options here:
> >>>>>>>>>>>>>>>>> 1) Change LOG_ONLY behavior. That implies that we'll be ready to
> >>>>>>>>>>>>>>>>> release AI 2.5 with a 7% drop.
> >>>>>>>>>>>>>>>>> 2) Introduce LOG_ONLY_SAFE, make it the default, and add a release
> >>>>>>>>>>>>>>>>> note to AI 2.5 that we added power loss durability in the default mode,
> >>>>>>>>>>>>>>>>> but the user may fall back to the previous LOG_ONLY in order to retain
> >>>>>>>>>>>>>>>>> performance.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thoughts?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 20.03.2018 16:00, Ivan Rakov wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Val,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> If a storage is in corrupted state, does it mean that it needs to be
> >>>>>>>>>>>>>>>>>>> completely removed and cluster needs to be restarted without data?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Yes, there's a chance that in LOG_ONLY all local data will be lost,
> >>>>>>>>>>>>>>>>>> but only in a *power loss / OS crash* case.
> >>>>>>>>>>>>>>>>>> kill -9, JVM crash, death of a critical system thread and all other
> >>>>>>>>>>>>>>>>>> cases that usually take place are variations of *process crash*. All
> >>>>>>>>>>>>>>>>>> WAL modes (except NONE, of course) ensure corruption safety in case of
> >>>>>>>>>>>>>>>>>> process crash.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> If so, I'm not sure any mode that allows corruption makes much sense
> >>>>>>>>>>>>>>>>>>> to me.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> It depends on the performance impact of enforcing power-loss
> >>>>>>>>>>>>>>>>>> corruption safety. The price of full protection from power loss is
> >>>>>>>>>>>>>>>>>> high: FSYNC is way slower (2-10 times) than other WAL modes. The
> >>>>>>>>>>>>>>>>>> question is whether ensuring weaker guarantees (corruption can't
> >>>>>>>>>>>>>>>>>> happen, but loss of last updates can) will affect performance as badly
> >>>>>>>>>>>>>>>>>> as strong guarantees.
> >>>>>>>>>>>>>>>>>> I'll share benchmark results soon.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 20.03.2018 5:09, Valentin Kulichenko wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> What do we understand under "data corruption" here? If a storage is
> >>>>>>>>>>>>>>>>>>> in corrupted state, does it mean that it needs to be completely
> >>>>>>>>>>>>>>>>>>> removed and cluster needs to be restarted without data? If so, I'm not
> >>>>>>>>>>>>>>>>>>> sure any mode that allows corruption makes much sense to me. How am I
> >>>>>>>>>>>>>>>>>>> supposed to use a database, if virtually any failure can end with
> >>>>>>>>>>>>>>>>>>> complete loss of data?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> In any case, this definitely should not be a default behavior. If
> >>>>>>>>>>>>>>>>>>> user ever switches to corruption-unsafe mode, there should be a clear
> >>>>>>>>>>>>>>>>>>> warning about this.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Val
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glu...@gmail.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Ticket to track changes:
> >>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-7754
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>>>> Ivan Rakov
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glu...@gmail.com>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Vladimir,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write guarantees unless
> >>>>>>>>>>>>>>>>>>>>>> power loss has happened.
> >>>>>>>>>>>>>>>>>>>>>> Seems like we need to measure the performance difference to decide
> >>>>>>>>>>>>>>>>>>>>>> whether we need a separate WAL mode. If it is invisible, we'll just
> >>>>>>>>>>>>>>>>>>>>>> fix these bugs without introducing a new mode; if it is perceptible,
> >>>>>>>>>>>>>>>>>>>>>> we'll continue the discussion about introducing LOG_ONLY_SAFE.
> >>>>>>>>>>>>>>>>>>>>>> Makes sense?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Yes, this sounds like the right approach.
> >
>
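
The distinction this thread keeps returning to, between writing a WAL record and forcing it to the storage device, can be sketched in plain Java. This is not Ignite code and does not reproduce the IGNITE-7754 changes: the class name WalFsyncSketch and the method append are hypothetical, and only the java.nio calls (FileChannel.write, FileChannel.force) are real APIs. The assumption is that a record that was written but not forced may still sit in OS buffers, so it would typically survive a process crash but not a power loss, while a forced record is on the device. The sketch also times the force call, loosely in the spirit of the fsync duration metric mentioned above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WalFsyncSketch {
    /**
     * Appends one record to the log (hypothetical helper, not an Ignite API).
     * If fsync is true, the channel is forced to the storage device before
     * returning. Returns the time spent in force(), in nanoseconds
     * (0 when no fsync was requested).
     */
    static long append(FileChannel ch, String record, boolean fsync) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));

        // A single write() call may be partial, so loop until the buffer is drained.
        while (buf.hasRemaining())
            ch.write(buf);

        if (!fsync)
            return 0L;

        long start = System.nanoTime();
        ch.force(false); // false: file content only, metadata is not forced
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path wal = Files.createTempFile("wal-sketch", ".log");

        try (FileChannel ch = FileChannel.open(wal,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            // Ordinary updates: written, but not forced.
            append(ch, "update-1\n", false);
            append(ch, "update-2\n", false);

            // A record that must be durable: written and forced.
            long nanos = append(ch, "checkpoint-begin\n", true);
            System.out.println("force() took " + nanos + " ns");
        } finally {
            Files.deleteIfExists(wal);
        }
    }
}
```

Forcing on every record corresponds to the strongest and slowest behavior discussed here; forcing only at selected points is one way to trade loss of the last updates for speed.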
