> Then upon reading the release notes, on such a machine, one can simply do:
>
> touch /etc/tmpfiles.d/tmp.conf
>
> And they get no automated cleanups.

This also disables on-boot cleaning of /tmp/.

The root issue here is that deleting not-read-in-a-while
but-maybe-stat'ed-recently-by-make-that-doesn't-count files from
/var/tmp/ by default, particularly when the system never used to,
violates the principle of least surprise. It will cause problems for
end users. And disabling it *properly* requires serious sysadmin
knowledge and is easy to get wrong.
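
To illustrate, here is a sketch of what I believe an override in
/etc/tmpfiles.d/tmp.conf would need to contain to switch off the
age-based deletion while keeping the on-boot cleaning of /tmp/. It
assumes the column layout from tmpfiles.d(5) (type, path, mode, user,
group, age), is untested, and should be checked against the man page:

    # Assumed override, untested.
    # D: create /tmp and empty it when systemd-tmpfiles runs with
    #    --remove, as it does at boot.
    # "-" in the age column: no age-based cleanup.
    D /tmp 1777 root root -
    d /var/tmp 1777 root root -

Getting the type letters and the age column right is exactly the kind
of detail that is easy to get wrong.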

There's an old debugging story: Maclisp on a DEC-10 was glitchy some
days and not others. Someone marked the crashes on a calendar, and
they correlated with the phase of the moon! Turns out the program
printed the lunar phase ("crescent", "waning gibbous", etc.) in its
header, just for fun, and when the string was long it wrote past the
end of a buffer, etc.

The story is still told fifty years later because it's so funny: it's
a total violation of levels. People debugging in a windowless basement
should not need to know the phase of the moon!

Users want their systems to run reliably and reproducibly. Getting
different behaviour depending on how long it's been since some of the
files you're using have been *accessed* adds oodles of extra state
people have to keep track of in their mental model of what some
program needs to run. Sure, it can be administratively disabled on one
particular system; but then you run the same program or perform the
same workflow on another system, and it fails. Something fails
reliably after running for over 30 days ... unless you keep an eye on
it.

As someone who regularly deals with large datasets, and keeps them in
the "approved" don't-back-these-up location /var/tmp/, this just seems
crazy to me. When I tell a grad student to install Debian on their
laptop, do I really need to spend time explaining to them to be sure
to turn off the "delete your old files for no reason" option which is
enabled by default? When they're using someone else's cluster, should
I remind them to run a cron job "find /var/tmp/foo -exec touch ..." so
the 25 TB of data we pushed onto scratch storage on that cluster
doesn't partly disappear? Our priority should be our users. Our users
are not well served by having their files deleted. They put them there
for a reason!

--Barak.
