Re: [DISCUSSION] IEP-47 Native persistence defragmentation

Ivan Bessonov Tue, 02 Jun 2020 04:55:08 -0700

Hello Anton,

I'd like to address your last message. First of all, this was already
partially discussed in this thread: [1]. To reiterate: the expected
performance degradation will be significant. We cannot throttle it away,
because the free/reuse lists would have to be kept sorted at all times, and
these are heavily optimized data structures.
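
To illustrate the cost, here is a minimal sketch. It is a toy model, not
Ignite's free-list code; the class and method names are hypothetical. A list
that must always hand out the page closest to the beginning of the file has
to keep its entries ordered, so every release and every allocation pays for
that ordering on the hot path:

```java
import java.util.TreeSet;

// Illustrative model only; this is NOT Ignite's free-list implementation.
class SortedFreeListSketch {
    // Free page indexes, kept in ascending order by the TreeSet.
    private final TreeSet<Long> freePages = new TreeSet<>();

    /** Returns a page to the list. O(log n), because the order must be preserved. */
    synchronized void release(long pageIdx) {
        freePages.add(pageIdx);
    }

    /** Hands out the free page closest to the file start, or -1 if none is free. */
    synchronized long takeLowest() {
        Long idx = freePages.pollFirst();
        return idx == null ? -1L : idx;
    }

    public static void main(String[] args) {
        SortedFreeListSketch list = new SortedFreeListSketch();
        list.release(42L);
        list.release(7L);
        list.release(19L);
        System.out.println(list.takeLowest()); // prints 7
    }
}
```

An unordered list could push and pop in constant time; the ordering
requirement is what adds work to every insert and update.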


More than that, "dummy" updates clash with regular data access, which is a
very dangerous thing to do. Nor do these updates help when the last pages in
the file are not data pages but, for example, tree pages. Those are much
harder to move: not only do you have to update all links to them, you also
have to do it efficiently, without blocking the tree for too long. I can
think of many other examples.
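
A small sketch of the tail-page problem, under simplifying assumptions: the
page types and helper names below are hypothetical, and a partition file is
modelled as a plain array. Dummy updates relocate data pages only, so a
single tree page at the end of the file keeps the file at its full length:

```java
// Illustrative model only; page types and method names are hypothetical.
class TailPinSketch {
    enum PageType { FREE, DATA, TREE }

    /** Moves each DATA page, scanning from the end, into the lowest free slot before it. */
    static void relocateDataPages(PageType[] file) {
        for (int src = file.length - 1; src >= 0; src--) {
            if (file[src] != PageType.DATA)
                continue; // TREE pages are not touched by dummy updates.

            int dst = lowestFree(file, src);
            if (dst < 0)
                continue; // No free slot closer to the file start.

            file[dst] = PageType.DATA;
            file[src] = PageType.FREE;
        }
    }

    /** Index of the first FREE slot below limit, or -1 if there is none. */
    static int lowestFree(PageType[] file, int limit) {
        for (int i = 0; i < limit; i++)
            if (file[i] == PageType.FREE)
                return i;
        return -1;
    }

    /** Number of pages that must be kept: everything up to the last non-free page. */
    static int truncatedLength(PageType[] file) {
        for (int i = file.length - 1; i >= 0; i--)
            if (file[i] != PageType.FREE)
                return i + 1;
        return 0;
    }
}
```

With the layout DATA, FREE, FREE, DATA, FREE, DATA the file shrinks to 3
pages; replace the last page with TREE and it stays at 6.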

*Easy to implement/understand*
 - No, it's not easy at all: defragmentation under load is very challenging
   to implement.

*Why we're going to implement distributed system defragmentation in the old
(offline) way?*
 - Because it's easier and safer, and it won't introduce any performance
   degradation.

[1]
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html

On Tue, Jun 2, 2020 at 2:17 PM Anton Vinogradov <a...@apache.org> wrote:

> Folks,
>
> A modern OS never asks you to schedule defragmentation and turn your PC off;
> it performs it while you're browsing.
> Why we're going to implement distributed system defragmentation in the old
> (offline) way?
>
> All you need is to implement free/reuse-list sorting. The lists should
> provide the pages closest to the beginning of the file.
> So, every insert/update will automatically defragment the entry.
> Also, a special process should iterate over the partitions in reverse order,
> just performing dummy updates.
> The partition file may be safely truncated after the iterator.
>
> Pros:
> - Your system keeps operating (no downtime)
> - Defragmentation can be performed partially
> - Defragmentation can be scheduled to periods of inactivity or performed on
> a regular basis
> - SQL will not be broken (no reason to recalculate the whole index, it will
> be recalculated in a regular way on every entry update)
> - Topology changes allowed
> - Easy to implement/understand
>
> Cons:
> - Performance degradation (solvable by throttling)
>
> On Mon, Jun 1, 2020 at 4:04 PM Sergey Chugunov <sergey.chugu...@gmail.com>
> wrote:
>
> > Hi Ivan,
> >
> > I have an idea about suggested maintenance mode.
> >
> > First of all, I agree with your ideas about discovery restrictions: node
> > should not join topology when performing defragmentation.
> >
> > At the same time I haven't heard about requests for this mode from users,
> > so we don't know much about possible requirements.
> > So I suggest implementing it in a pragmatic way: instead of inventing
> > user scenarios (unknown in reality), let's develop minimal yet
> > well-designed functionality that suits our case. If we constrain our
> > implementation with a reasonable set of restrictions, that's OK.
> >
> > So my idea is the following: to move a node into maintenance mode, the
> > user has to send a special command to the node (e.g. with a new command in
> > the control.sh utility or via the JMX interface). The node saves the
> > maintenance request in local metastorage and waits for a restart. The user
> > has to manually restart that node in order to finish moving it to
> > maintenance mode.
> >
> > When the node restarts and finds a maintenance request, it creates a
> > special type of discovery SPI that will not try to join the topology at
> > all, yet the node is able to start all necessary components and APIs like
> > the REST processor or JMX interface.
> >
> > When in maintenance, we'll be able to do defragmentation safely and remove
> > the maintenance request from metastorage only when it is completed (with
> > all fault-tolerance logic in mind).
> >
> > As we don't have a mechanism (like a watcher) to perform a "safe restart"
> > (by safe I mean an Ignite restart without an OS-level process restart), we
> > cannot finish maintenance mode without another manual restart, but I think
> > it is a reasonable restriction, as maintenance mode shouldn't be an
> > everyday routine and will be used quite rarely.
> >
> > What do you think about it?
> >
> > On Tue, May 26, 2020 at 5:58 PM Ivan Bessonov <bessonov...@gmail.com>
> > wrote:
> >
> > > Hello Igniters,
> > >
> > > I'd like to discuss this new IEP with you: [1]. The main idea is to have
> > > a procedure that relocates pages to the top of the file as compactly as
> > > possible, which allows us to trim the file and increase its fill-factor.
> > > It will be configured manually and executed after the restart, but
> > > before the node joins the topology (which means any load would be
> > > impossible during defragmentation). It is described in detail in the IEP
> > > itself, please read it. This topic was also previously discussed here on
> > > the dev-list in [2].
> > >
> > > Here I would like to list a few points that are not yet clear and
> > > require your opinion.
> > >
> > >  - what are your overall thoughts on the IEP? Any concerns?
> > >
> > >  - maintenance mode - how do we communicate with a node that's not in
> > >    the topology? What are the options? As far as I know, we currently
> > >    have no tools like this.
> > >
> > >  - checkpointer refactoring - these changes will involve intensive
> > >    writing of pages to the storage. If we're going to reuse the offheap
> > >    page model to perform defragmentation, then the checkpointing
> > >    mechanism will have to be adapted in some form. Are you fine with
> > >    this? Or do we need a separate discussion?
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
> > > [2]
> > > http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html
> > >
> > >
> > > --
> > > Sincerely yours,
> > > Ivan Bessonov
> > >
> >
>


-- 
Sincerely yours,
Ivan Bessonov
