Heyo Theo (and Mark),
On 2021-11-13 4:12 p.m., Theo de Raadt wrote:
If you say that 6.7 works, then the changes in the logs to look at are
the following.
It isn't that many diffs, see below.
I went through every diff to sys/arch/sparc64/stand from release 6.7
(working) until things broke. Note that I was updating the entire tree,
not just /stand (more on this in a little bit).
Things are broken by the time the tree reaches the following patch:
revision 1.10
date: 2020/06/05 09:16:13; author: otto; state: Exp; lines: +11 -3;
commitid: UX6Wf38M2cpPhjSY;
Qemu does not like we load ofwboot on top of the bootblocks and as
miod notes this is actually a regression I introduced in the latest
commit. So stop doing that, even if it works on real hardware.
ok kettenis@
Specifically, the problem starts between the May 26 and June 5 patches.
There are no changes to files in sys/arch/sparc64/stand between those
dates, but there are changes in the larger tree impacting ffs during
that window; perhaps the problem is there.
Random was added to the bootblocks before 6.7. But then code was
added to fchmod the random file, so that it would not be reused.
That fchmod is a write-to-disk operation.
Is this write code triggering a bug in the bootblocks or is it actually
broken on its own?
see loadrandom()
I didn't see the bug manifest by the time the tree got to the patches
you are referring to here. However, since the filesystems are getting
mangled on the -current version, I assume that some garbage is being
written to disk where it doesn't belong, possibly because of this code.
Presumably it wouldn't do that if things were read-only.
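
To make the read-then-mark pattern under discussion concrete, here is a
minimal userland sketch in Python. It is not the bootblock code and does
not reproduce loadrandom(); the function name load_seed_once and the
choice of the sticky bit as the "already used" marker are assumptions
made for illustration only. The point it shows is that the read alone
needs no write access, while the fchmod step changes on-disk metadata.

```python
import os
import stat


def load_seed_once(path, length):
    """Read up to `length` bytes from `path`, then mark the file as used.

    Returns (data, reused), where `reused` is True if the marker bit was
    already set before this call. Hypothetical helper, for illustration.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)
        # Assumption: the sticky bit stands in for "this seed was consumed".
        reused = bool(st.st_mode & stat.S_ISVTX)
        data = os.read(fd, length)
        # This is the only step that writes to the filesystem: it updates
        # the file's mode, so it cannot succeed on a read-only mount.
        os.fchmod(fd, stat.S_IMODE(st.st_mode) | stat.S_ISVTX)
    finally:
        os.close(fd)
    return data, reused
```

Calling it twice on the same file returns the same bytes both times, but
the second call reports reused as True.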
ofwboot/elf64_exec.c
+ * Kernels up to and including OpenBSD 6.7
+ * check for an exact match if the length.
+ * Lie here to make sure we can still boot
+ * older kernels with softraid.
The format of a structure on the disk changed. Are you using softraid?
It might matter.
No I'm not using softraid, all the tests I've been doing have been with
the default install and default partition layout.
...
Regarding the rest of what you wrote, I haven't taken a look yet; so
far I have just been working through the tedious process of updating
and patching, looking for where things break.
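
The update-and-test loop described here could in principle be shortened
by bisecting on checkout dates rather than stepping through every diff.
A rough sketch follows; is_good is a hypothetical callback standing in
for "check out the tree as of that date, rebuild, and see whether the
machine boots", and the sketch assumes the tree never goes from broken
back to working inside the window.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_good):
    """Return the earliest date in (good, bad] at which is_good() fails.

    Assumes is_good(good) is True and is_good(bad) is False.
    """
    while (bad - good).days > 1:
        # Probe the midpoint of the remaining window.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_good(mid):
            good = mid  # breakage is later than mid
        else:
            bad = mid   # breakage is at or before mid
    return bad
```

For a ten-day window such as May 26 to June 5, 2020, this needs at most
four rebuilds to narrow the breakage down to a single day.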
Before I go, I said that I had been updating the entire tree to make
sure changes outside of the /stand path would get pulled in if they
prompted build changes. I should mention that libsa was updated between
May 26 and otto's patch on June 5, 2020, which caused ofwboot to rebuild
as well, because of a patch to some of its files that aren't in /stand:
U src/sys/ufs/ffs/ffs_alloc.c
U src/sys/ufs/ffs/ffs_vfsops.c
U src/sys/ufs/ufs/dinode.h
U src/sys/ufs/ufs/inode.h
It's possible that the problem is in one of those changes, and I haven't
gone through their history yet. I specifically mention this because
I have tested the latest Forth bootblock with an older version of
ofwboot and the system has started fine, so it might not actually be
otto's change that introduced the bug? More testing IS required.
--
Ted Bullock <[email protected]>