how to recover? sysupgrade 7.5 -> 7.6, remote machine sshd not restarted, /usr overfull

Amelia A Lewis Fri, 11 Oct 2024 07:46:50 -0700

Heyo,

I performed sysupgrade on two locally-available OBSD systems on 
Tuesday, which went as smoothly as expected, and then initiated 
sysupgrade on a machine in colocation. It came back up (per ping), but 
refused ssh connections, so I couldn't complete the upgrade (pkg_add 
-u). Investigation (port scanning via nc -nv) showed that the machine 
was up and running at least the imap and dns daemons (on both live 
interfaces), but not sshd (on either). The daily email (mine is adjusted 
to restore the disk usage information that was removed a few releases 
back) shows that I _read but failed to heed warnings about the size of 
/usr_, so the filesystem is now too full:


OpenBSD 7.6 (GENERIC) #332: Mon Sep 30 08:45:17 MDT 2024
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC

 1:30AM  up  6:52, 0 users, load averages: 0.23, 0.05, 0.02

Running daily.local:

disks:
Filesystem     Size    Used   Avail Capacity   iused    ifree  %iused  Mounted on
/dev/sd1a     1005M    143M    812M    15%      2160   153742     2%   /
/dev/sd1h      149G    2.8G    139G     2%      1970  9868364     1%   /home
/dev/sd1p      181G    1.8G    170G     2%        72 11959222     1%   /pub
/dev/sd1d      3.9G   10.0K    3.7G     1%         6   545656     1%   /tmp
/dev/sd1f      2.0G    2.0G  -87.0M   105%     18324   267498     7%   /usr
/dev/sd1g     1005M    464M    491M    49%      9628   146274     7%   /usr/X11R6
/dev/sd1l     19.7G    587M   18.1G     4%     13812  2636554     1%   /usr/local
/dev/sd1o      3.9G    2.0K    3.7G     1%         1   545661     1%   /usr/obj
/dev/sd1j      2.0G    527M    1.4G    28%    191282    94540    67%   /usr/ports
/dev/sd1i      2.0G    1.1G    743M    62%    118844   166978    42%   /usr/src
/dev/sd1m      2.0G    653M    1.2G    35%     32969   252853    12%   /usr/xenocara
/dev/sd1n      3.9G    2.0K    3.7G     1%         1   545661     1%   /usr/xobj
/dev/sd1e     58.6G   21.0M   55.7G     1%      1169  7845997     1%   /var

(this one is part of the email from overnight Tue-Wed, which also has 
this at the bottom:
Services that should be running but aren't:
sshd
)
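
For what it's worth, the check I failed to do by eye could be automated. 
This is only a rough sketch in Python, not anything daily(8) does: the 
function name overfull and the threshold parameter are my own invention, 
and the sample rows are a few lines copied from the report above with 
the inode columns left out.

```python
# A few rows from the daily.local report above, inode columns omitted:
# device, size, used, avail, capacity, mount point.
DF_ROWS = """\
/dev/sd1a 1005M 143M 812M 15% /
/dev/sd1f 2.0G 2.0G -87.0M 105% /usr
/dev/sd1g 1005M 464M 491M 49% /usr/X11R6
/dev/sd1l 19.7G 587M 18.1G 4% /usr/local
"""


def overfull(report, threshold=100):
    """Return (mount point, capacity %) for filesystems at or above threshold.

    Hypothetical helper: assumes six whitespace-separated fields per row,
    with the capacity percentage in the fifth field.
    """
    flagged = []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        capacity = int(fields[4].rstrip("%"))
        if capacity >= threshold:
            flagged.append((fields[5], capacity))
    return flagged


print(overfull(DF_ROWS))  # prints [('/usr', 105)]
```

Lowering the threshold (say, to 90) would have flagged /usr before the 
upgrade rather than after it.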

So. I have found the problem, yes? And the solution is ... what? Note: 
there's a short version at the bottom of the email in the last para; I 
prolly talk too much.

Presumably I need console access, since sshd isn't running and I don't 
have an exploit to get a shell by breaking dovecot or nsd. But what 
then? Slice k is available on /dev/sd1, and either there's free space 
(it's prolly 500GB, and slice b is prolly 32GB), or if not then the 
contents of slice p (/pub) can be moved to slice h (/home) and the 
space now allocated to p can be reduced by circa 6G to allow creation 
of a /usr that's more appropriately sized.

But will that work? Is the overfull (-87M) /usr complete? Will copying 
it over to a new location create new, intermittent, can't-be-reproduced 
bugs? 

Is a complete/clean reinstall mandated? Nightly backups to sd2a are 
working, so there's configuration information (and a presumably 
bootable kernel) in multiple locations; that config information (on an 
unmounted drive) should survive a reinstall and provide configs for 
rebuilding. Since /altroot has the same 7.6 kernel, is it possible to 
bring up root or altroot single user, with at most /tmp mounted, and 
tinker with partitions that way? Or does the overfull result indicate 
that a clean install is required?

Is there a way to get access remotely? I'm pretty sure that there 
isn't, because duh. I presume that it's not coming up single user 
(because other daemons appear to be running, including at least cron, 
dovecot, nsd, and (probably) unbound), and all the filesystems 
(including other disks not shown above) are up. My colo guy isn't 
responding to texts or emails currently, but that's a social, not a 
technical problem. It may even be lucky, since I didn't notice the 
overfull problem until my attention was pointed to it this morning by 
discussion in the other thread, so I haven't tried hacking on things 
without an understanding of the problem.
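
For reference, the port check I did by hand with nc -nv amounts to 
something like the following. This is a sketch in Python, not what I 
actually ran: port_open, probe and SERVICES are names made up for 
illustration, the ports are just the standard ones for ssh, DNS and 
IMAP, and it only tests TCP (so it says nothing about DNS over UDP).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical service list: standard TCP ports, not taken from my configs.
SERVICES = {"ssh": 22, "dns": 53, "imap": 143}


def probe(host, services=SERVICES):
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: port_open(host, port) for name, port in services.items()}


# Example call (192.0.2.10 is a documentation address, not the real machine):
# probe("192.0.2.10")
```

On the colo box this would report ssh as closed and the other two as 
open, on both interfaces, which matches what nc showed.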

I'm too verbose. Short version: a) with sshd not working and no console 
access, is there a way to work on it, or is console access required? b) 
assuming (prolly console) access, can the current contents of /usr be 
copied to a new /usr slice, and the rest of the upgrade (pkg_add -u) 
performed, or is a complete/clean install required for stability?

Thanks for your time,

Amy!
-- 
Amelia A. Lewis                    amyzing {at} talsever.com
Confidence: a feeling peculiar to the stage just before full
comprehension of the problem.

how to recover? sysupgrade 7.5 -> 7.6, remote machine sshd not restarted, /usr overfull
