On 07.11.2022 02:57, hw wrote:
Hi,
> Is there no VDO in Debian, and what would be good to use for
> deduplication with Debian? Why isn't VDO in the standard kernel?
> Or is it?
I used VDO on Debian some time ago and don't remember any big
problems. AFAIR I compiled it myself; there were no prebuilt packages.
I switched to btrfs for other reasons, not for performance. The VDO
layer does cost performance, yes, but compared to plain ext4 even btrfs
is slow.
> I'm not looking for deduplication that happens some time after files
> have already been written like btrfs would allow: There is no point in
> deduplicating backups after they're done because I don't need to save
> disk space for them when I can fit them in the first place.
That's only one point, and I don't think it's a really valid one: you
typically don't run out of space with one single action (YMMV). Running
multiple sessions with out-of-band deduplication between them works for
me.
In-band deduplication (that's the one you want) has some drawbacks, too:
high resource usage. You need plenty of RAM (up to several gigabytes
per terabyte of storage), and write completion is delayed (-> slow
direct I/O).
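To give a feeling for where a figure like "several gigabytes per terabyte" can come from, here is a rough back-of-the-envelope sketch. The 4 KiB block size and the 16 bytes per index entry are assumptions picked for illustration, not numbers taken from VDO or any other specific implementation; real in-band deduplicators use different index layouts and may need more or less.

```python
def dedup_index_ram(storage_bytes, block_size=4096, entry_bytes=16):
    """Estimate RAM for an in-memory index with one entry per block.

    block_size and entry_bytes are illustrative assumptions only.
    """
    blocks = storage_bytes // block_size  # number of blocks to track
    return blocks * entry_bytes           # one fixed-size entry each


TIB = 1 << 40
GIB = 1 << 30

# 1 TiB of storage, 4 KiB blocks, 16 bytes per entry
print(dedup_index_ram(TIB) / GIB)  # 4.0
```

Larger blocks shrink the index but find fewer duplicates, so this is a trade-off rather than a fixed cost.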
For out-of-band deduplication there are several different
implementations. File-based dedup on a per-directory basis can be very
fast and light on resources, for example via rdfind or jdupes.
Block-based dedup, like bees for btrfs (that's the one I use), is closer
to in-band deduplication (including the high RAM usage). Bees can be
switched off and on at any time (for example on a small home system
which runs more demanding tasks from time to time), and switching it on
again resumes at the last state (it starts at the last transaction ID
it processed -> btrfs knows its transactions).
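To illustrate why file-based dedup can be so cheap, here is a minimal sketch of the general idea: group files by size first, and only hash those that share a size with another file. This is not how rdfind or jdupes are actually implemented; the function names `file_digest` and `find_duplicates` are made up for this example, and the sketch only reports duplicates instead of replacing them with links.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of regular files under root with equal content."""
    # Pass 1: group by size, which needs only metadata and no file reads.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files whose size is shared with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, cannot be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("/some/backup/dir")` (the path is a placeholder) returns a list of lists of paths. Memory use grows with the number of files rather than the number of blocks, which is why this approach stays light compared to block-based dedup.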
regards
hede