Heyo Theo (and Mark),

On 2021-11-13 4:12 p.m., Theo de Raadt wrote:
If you say that 6.7 works, then the changes in the logs to look at are
the following.

It isn't that many diffs, see below.

I went through every diff to sys/arch/sparc64/stand from release 6.7 (working) up to the point where things broke. Note that I was updating the entire tree, not just /stand (more on this in a little bit).

Things are broken by the time the tree reaches the following patch:

revision 1.10
date: 2020/06/05 09:16:13;  author: otto;  state: Exp;  lines: +11 -3;  
commitid: UX6Wf38M2cpPhjSY;
Qemu does not like we load ofwboot on top of the bootblocks and as
miod notes this is actually a regression I introduced in the latest
commit. So stop doing that, even if it works on real hardware.
ok kettenis@

Specifically, the problem starts between the May 26 and June 5 patches. There are no changes to files in sys/arch/sparc64/stand between those dates, but there are changes in the larger tree affecting ffs during that window; perhaps the problem is there.
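
To narrow that window further, a date bisection along the following lines might help. This is only a rough sketch in Python, not something I have run against the tree: is_bad is a hypothetical callback standing in for "check out the tree as of that date, rebuild, and see whether the machine boots", and cvs_update_command only builds a command line (assuming cvs accepts an ISO date for -D); it does not run anything.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad(d) is true.

    Assumes `good` is known to work, `bad` is known to fail, and that
    the breakage persists once it has been introduced.
    """
    while (bad - good).days > 1:
        # Test the date halfway between the last good and first bad dates.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad


def cvs_update_command(day):
    """Build (but do not run) a cvs command that moves the tree to `day`."""
    return ["cvs", "-q", "update", "-Pd", "-D", day.isoformat()]
```

With good = date(2020, 5, 26) and bad = date(2020, 6, 5), this needs at most four rebuild-and-boot tests to pin the breakage down to a single day, after which only the commits from that day need to be read.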

Random was added to the bootblocks before 6.7.  But then code was
added to fchmod the random file, so that it would not be reused.
That fchmod is a write-to-disk operation.

Is this write code triggering a bug in the bootblocks or is it actually
broken on it's own?

see loadrandom()

I didn't see the bug manifest by the time the tree reached the patches you are referring to here. However, since the filesystems are getting mangled on -current, I assume that some garbage is being written to disk where it doesn't belong, possibly because of this code. Presumably it wouldn't do that if things were read-only.

ofwboot/elf64_exec.c

+                        * Kernels up to and including OpenBSD 6.7
+                        * check for an exact match if the length.
+                        * Lie here to make sure we can still boot
+                        * older kernels with softraid.

The format of a structure on the disk changed.  Are you using softraid?
It might matter.

No, I'm not using softraid; all the tests I've been doing have been with the default install and default partition layout.

...

Regarding the rest of what you wrote, I haven't taken a look yet. So far I have just been working through the tedious process of updating and patching, looking for the point where things break.

Before I go: I said that I had been updating the entire tree, so that changes outside of the /stand path would get pulled in if they prompted build changes. I should mention that libsa was updated between May 26 and otto's patch on June 5, 2020, which caused ofwboot to rebuild as well, due to a patch to some of its files that aren't in /stand:

U src/sys/ufs/ffs/ffs_alloc.c
U src/sys/ufs/ffs/ffs_vfsops.c
U src/sys/ufs/ufs/dinode.h
U src/sys/ufs/ufs/inode.h

It's possible that the problem is in one of those changes; I haven't gone through their history yet. I specifically mention this because I have tested the latest Forth bootblock with an older version of ofwboot and the system started fine, so it might not actually be otto's change that introduced the bug. More testing IS required.

--
Ted Bullock <[email protected]>
