On 2020-05-21 16:57, davidson wrote:
> On Thu, 21 May 2020, David Christensen wrote:
>> On 2020-05-21 08:52, davidson wrote:
>>> On Thu, 14 May 2020, Albretch Mueller wrote:
>>>> The thing is that I have to call, say sha256sum, on millions of files
>>>> Probably debian admin people dealing with packaging have to deal with
>>>> the same kinds of issues.
>>> For checksums, mtree(8) from package mtree-netbsd might be worth a look.
>> Been there, done that; I do not recommend it:
>> https://lists.debian.org/debian-user/2020/01/msg00488.html
> The thread you refer to reports problems with the mtree-à-la-FreeBSD
> ("fmtree(8)" [1]) in debian package freebsd-buildutils.
> mtree-netbsd is a different debian package, providing
> mtree-à-la-NetBSD ("mtree(8)" [2]). It does not seem to suffer from
> the deficiency you encountered with fmtree.
> 1. https://manpages.debian.org/buster/freebsd-buildutils/fmtree.8.en.html
> 2. https://manpages.debian.org/buster/mtree-netbsd/mtree.8.en.html
Thanks for the tip.
[2] is older than [1]. Both are older than the version on FreeBSD:
https://www.freebsd.org/cgi/man.cgi?mtree(8)
I cannot remember if I found them both when I went looking for mtree(8)
on Debian, but I would have picked the newer of the two.
I was trying to validate migration of ~0.9 TB of content from a Debian
Samba server to a FreeBSD Samba server. I know I failed. I seem to
recall it was due to a missing feature in the Debian version of mtree(8).
Also, I do not believe the input/output format of mtree(8) is
compatible with that of sha256sum(1). Using mtree(8) output as
sha256sum(1) input, or vice versa, requires a translation command or
script.
I do seem to recall writing a Perl script to parse mtree(8) output. The
mtree(8) convert option '-C' was the key. The Debian version I tried
lacked it. The other version seems to have it. So, maybe...
I think the simplest answer on Debian is to use find(1), xargs(1) (with
the -P option), and sha256sum(1) to generate an SHA256SUMS file.
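For comparison, the shape of that pipeline can be approximated in Python. This is only a rough sketch, not the find(1)/xargs(1)/sha256sum(1) commands themselves: the names sha256_of and write_sums and the default worker count are hypothetical, and filenames containing newlines or backslashes are not escaped the way sha256sum(1) escapes them.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()


def write_sums(root, out_path, workers=4):
    """Hash every regular file under root and write 'digest  path' lines.

    Returns the number of files hashed. The workers argument plays the
    role of the -P option of xargs(1) (hypothetical default).
    """
    paths = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            p = os.path.join(dirpath, name)
            # Skip symlinks, sockets, FIFOs and other non-regular files.
            if os.path.isfile(p) and not os.path.islink(p):
                paths.append(p)
    paths.sort()
    # hashlib releases the GIL while hashing large buffers, so threads
    # can overlap file reads and digest computation.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(sha256_of, paths))
    # Write from a single place so lines cannot interleave.
    with open(out_path, "w", errors="surrogateescape") as out:
        for digest, p in zip(digests, paths):
            out.write(f"{digest}  {p}\n")
    return len(paths)
```

Calling write_sums(".", "/tmp/SHA256SUMS") from the top of the tree would produce a file that "sha256sum -c" should accept for ordinary filenames, since each line is the digest, two spaces, and the path.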
However, before I learned of mtree(8), I wrote a Perl script to perform
essentially the same function -- compare metadata and checksums of two
directory trees, or the same tree at two different points in time. I
soon discovered how wasteful it is to recompute checksums for 0.9 TB of
files (hours) when only a tiny fraction have changed (seconds or
minutes). So, I added an update feature to the Perl script. This made
the script far more efficient, and therefore usable. AFAIK no version
of mtree(8) has this feature. A find(1), xargs(1), and sha256sum(1)
pipeline would also lack this feature, and an SHA256SUMS file lacks the
metadata fields required to implement it.
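The update idea can be sketched as follows. This is not the Perl script described above, whose details are not given here. It is a minimal Python illustration that assumes a file is unchanged when its size and modification time both match the stored record. The names update_manifest and sha256_of and the dict-based manifest layout are hypothetical.

```python
import hashlib
import os


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()


def update_manifest(root, old):
    """Rescan root, reusing checksums for files that look unchanged.

    old maps a path relative to root to a record with the keys
    "size", "mtime_ns" and "sha256" (assumed layout). Returns
    (new_manifest, rehashed), where rehashed is the sorted list of
    relative paths whose checksum had to be recomputed.
    """
    new, rehashed = {}, []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # Only regular files are tracked in this sketch.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            prev = old.get(rel)
            if (prev is not None
                    and prev["size"] == st.st_size
                    and prev["mtime_ns"] == st.st_mtime_ns):
                # Metadata matches: trust the stored checksum.
                digest = prev["sha256"]
            else:
                # New or modified file: pay for a full read.
                digest = sha256_of(full)
                rehashed.append(rel)
            new[rel] = {"size": st.st_size,
                        "mtime_ns": st.st_mtime_ns,
                        "sha256": digest}
    return new, sorted(rehashed)
```

The first run, with an empty old manifest, hashes everything. Later runs read only the files whose size or mtime differ, and files that have disappeared simply drop out of the new manifest. The trade-off is that an edit which preserves both size and mtime goes unnoticed, so an occasional full rehash is still needed for real verification.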
David