Control: forwarded -1 https://github.com/kdave/btrfsmaintenance/issues/138

Hello Andy,

Thank you again for the discussion at this bug, and ouf, sorry I didn't
follow-up with your latest reply after partially addressing it in 0.5-4.
In that release I started patching the config to disable balance as well as
installing the timer in a disabled state, so this bug might have been
solved by then.  Please see reply inline:

andy <[email protected]> writes:

> On Tue, 2023-02-07 at 18:13 -0500, Nicholas D Steeves wrote:
>
> one question would be: should the btrfsmaintenance package provide a
> convenient way to use an (unnecessary?) feature that might cause data
> loss?  are there cases in which it is recommended for the average (or
> even somewhat adventurous) user to regularly schedule a btrfs balance?

I'm inclined to say yes, because Debian provides all sorts of mechanisms
to potentially accidentally erase and/or corrupt one's data.  Users are
responsible for not blindly following non-Debian documentation and
screwing up their system.  Would you agree that it's a good thing to
provide maximum liberty within a clearly-defined responsibility
boundary?

> if so, maybe spell out those cases explicitly in the README?  from this
> conversation it does not sound like something that i would regard as
> "maintenance" in the vein of max-mount-counts/interval-between-checks
> driven fsck's in the ext{2,3,4} world or periodic btrfs scrubs.  otoh,
> see below...

I've updated the README.Debian text to show this:

      systemctl enable btrfs-balance.timer  —  Not recommended    [1]
      systemctl enable btrfs-defrag.timer   —  Avoid if possible  [2]
      systemctl enable btrfs-scrub.timer    —  Highly recommended [3]

  [1] This rewrites all data that is not excluded by a filter.  Periodic
      balancing is not useful, nor recommended, for the vast majority of
      workloads.  If a corner case is found where balancing is useful,
      this case (with steps to reproduce) should be reported upstream.

> from a user's perspective it's really hard to judge.  for instance,
> https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Balancing states:
> "It is _quite_ [their emphasis] useful to balance periodically any
> Btrfs volume subject to updates."

I agree 100%, and that's why I started maintaining our Debian btrfs wiki
page and this package...  Upstream docs have historically been somewhere
between aspirational and obsolete (both cases may be harmful).
Thankfully the page you linked to is finally marked "OBSOLETE CONTENT".

>> > i had read that "Some advocate not running it at all" but to me
>> > that
>> > implies "may be unnecessary" rather than "may be actively harmful."
>> > on the basis of the latter i am certainly considering setting the
>> > balance.timer back to disabled.
>> 
>> Btw, I'm curious to learn why you've enabled balancing!
>
> been a long time, but i _think_ that at one point (related to a disk
> failure/replacement in raid1 that may not have gone 100% smoothly?) i
> ended up in an unbalanced situation where i couldn't write even though
> there was apparently space available.  iirc `balance` was needed to get
> me out of that corner.  and then i found Marc MERLIN's `btrfs-scrub`
> script which included `btrfs balance` commands and he seemed like a
> knowledgeable source.  that in combination w/the btrfs wiki led me to
> believe it was a safe operation.  e.g., in addition to the above,
> https://btrfs.wiki.kernel.org/index.php/Status lists it as "stable"
> with only a note about performance.

Yes, balance is basically the sledgehammer that beats ancient corner
case bugs into uniformity.  I forget when it was (linux-4.4, I think),
but I had to do a full balance (data and metadata) when converting from
single profile data (DUP profile metadata) to raid1 profile (both data
and metadata) twice, with a reboot in between...and things were worse
when Marc MERLIN's guide was published.  For anyone reading this in
2025 (or later), it's almost certainly no longer relevant.

>> The "may be actively harmful" bit is tricky, because Btrfs gets way
>> too
>> complicated way too fast...  My hope to put in place safe defaults
>> that
>> work for most people, that don't bait users/sysadmins into taking on
>> risk, and that are good enough for the general case.
[snip]
>
> yes, it sounds like one viable approach may be to explicitly document
> cases where balance is recommended.

I still haven't found any.  Have you?

>> > alternatively/additionally are there default options that might
>> > make
>> > it safe(r)?
>> 
>> I believe that [metadata] balancing should probably be disabled in
>> upstream btrfsmaintenance.  If I remember correctly, it's enabled by
>> default because upstream targets 10year LTS releases such as SUSE's
>> 11
>> series, which uses linux 3.0.76, where the harm reduction of metadata
>> balancing is significant enough to make the risk worthwhile on
>> server-grade hardware.
>
> based on what has come up in this thread, my feedback would be that the
> debian pkg should diverge from upstream wrt balancing.

I agree, and I've forwarded this bug upstream.  I'm leaning towards
removing Debian's btrfsmaintenance support for balancing well before
forky's release.

>> > from my pov i'd still like to see the values harmonized.  i
>> > originally
>> > noticed the inconsistency because what i believed was the default
>> > setting was creating a seemingly unnecessary systemd override file.
>> 
>> Sorry, what is this "unnecessary systemd override file"?
>
> when i enabled the balance component i initially left
> `BTRFS_BALANCE_PERIOD="weekly"` which is noted as the default in
> `/etc/default/btrfsmaintenance`.  but then it created a systemd
> override file with `OnCalendar=weekly` in `/etc/systemd/system/btrfs-
> balance.timer.d/schedule.conf`.  this surprised me because i wouldn't
> expect this to be necessary when using the default.  then i checked
> `/lib/systemd/system/btrfs-balance.timer` and found that the _actual_
> default is `OnCalendar=monthly` and, well, here we are :)

Yuck, I don't like that behaviour either.  The current (in git, soon to
be uploaded) default is a soft-disabled "OnCalendar=monthly", with
BTRFS_BALANCE_PERIOD="none".

>> > > Thus, if I do anything, I'm inclined to set the period for
>> > > balance to
>> > > "none" everywhere.
>> > 
>> > given that one already needs to manually enable the service via
>> > `systemctl enable btrfs-balance.timer` i don't think it's necessary
>> > to
>> > set the default value to "none."  this would result in the (imho)
>> > counter-intuitive behavior of enabling something only to have it do
>> > nothing.  although an add'l comment re: the potential for harm in
>> > `/etc/default/btrfs` that one would hopefully see when changing the
>> > value from "none" to e.g., "monthly" may be the best way to ensure
>> > that
>> > they are an informed user.  so i could go either way on that.
>> > 
>> 
>> Good point, and yes, I agree that two knobs seems silly; however,
>> there
>> is already precedent for this with '"BTRFS_TRIM_PERIOD="none"'.  As
>> there doesn't seem to be any interest in #887461, I'm wondering if
>> might
>> be time to follow the consensus of the General Resolution in favour
>> of
>> systemd, and soon start shipping systemd timers in an enabled state,
>> and
>> disable everything in the config file, but provide suggested
>> values...
>
> sure, that would eliminate the "two knobs" and makes sense from my pov.

Aha, this is why I took so long to reply.  So, to start off with, the
balance timer should always be installed in a disabled state, because
timers with an enabled state allegedly run on boot!  Beyond that, I'm
honestly not sure how to achieve this with systemd.  Do you have a
solution in mind?  I've now removed the redundant TRIM support from
Btrfsmaintenance for Debian.

git clone https://salsa.debian.org/sten/btrfsmaintenance.git

or you can try this prebuilt prerelease:

  https://people.debian.org/~sten/test-builds/btrfsmaintenance_0.5.2-2_all.deb
  (you can use the changes file signature and checksums to verify)
  https://people.debian.org/~sten/test-builds/

Cheers,
Nicholas

Attachment: signature.asc
Description: PGP signature

Reply via email to