hi Julius,

I very much appreciate you taking the time to elaborate on the issues we'd
face trying to pair a "mismatched" RO/RW combination. Thoughtful, well
articulated discussions like this are a big part of what makes our
community a great one IMO.

-Matt

On Wed, Aug 21, 2024 at 6:29 PM Julius Werner <jwer...@chromium.org> wrote:

> [resent with correct sender address]
>
> Hi,
>
> I wanted to pick up the discussion from the leadership meeting again
> here because I think it's easier to show my point with a few examples.
> If you want an old coreboot RO build to work together with a ToT RW
> build, then you don't just care about the vboot workbuffer matching...
> *everything* that is shared in any way between bootblock/verstage and
> romstage/ramstage/payload needs to be kept in sync or risks
> introducing dangerous and not necessarily immediately obvious errors.
>
> One major issue is the pre-RAM memory layout. Take for example this
> simple one-line CL: https://review.coreboot.org/83680 . Would most
> reviewers expect that it makes RW firmware built after that point
> incompatible with RO firmware built before? Well, it does, because it
> increases the size of a memory region here
>
> https://review.coreboot.org/plugins/gitiles/coreboot/+/refs/heads/main/src/arch/x86/car.ld#34
> which then moves the offsets of all subsequent memory regions
> downwards accordingly. Some of these regions contain buffers that were
> passed by the bootblock to later stages (e.g. CBFS_MCACHE,
> FMAP_CACHE), and when the romstage tries to reuse those buffers it
> suddenly finds different data at the same offsets.
>
> We actually had to figure out how to support this for an RW update to
> one of our shipped Chromebooks, and it looks like this:
>
> https://chromium-review.googlesource.com/c/chromiumos/third_party/coreboot/+/5732763/3/src/arch/x86/assembly_entry.S
> . It needs an ugly, complicated assembly hack to copy bytes around
> from the locations where RO firmware left them into the locations
> where RW firmware expects them. Since one of these regions is the
> stack, you can't even do that in C code because you might already be
> trampling over the data you're still trying to save once you enter a C
> function. And this patch is just tailored to this one particular case
> (and platform), it's not a universal solution — if we were resizing a
> different region, or swapping the order of two regions, that would
> require a completely different (maybe impossible) workaround.
>
> Now, we could try to replace our simple memory region management with
> something more complicated that comes with a built-in tagged directory
> (kinda like a second CBMEM in pre-RAM). Then, if every single access
> to one of those regions went through that directory, it would
> automatically find the right location. We would need to write the code
> to parse that directory in assembly for every architecture, since the
> stack setup is one of the things that uses them. But even then, that
> doesn't solve the resize problem: if we e.g. find a rare stack
> overflow in the FSP that requires us to have a bigger romstage stack,
> then we can't just reuse the directory the bootblock has written, we
> need to actually change it to fix that bug and move all those regions
> around to fit the new requirements. And what if migrating from the old
> to the new layout on the fly is not possible? If we flip the order of
> two sections, there might not be enough free scratch space to
> temporarily save the contents of one section while we move the
> contents of the other into its place. (Note that car.ld is only the
> shared pre-RAM layout for Intel platforms. Every Arm and (recent) AMD
> platform has their own separate layout like this, and some of them
> tend to be very tight. We regularly have to land patches like
> https://review.coreboot.org/78970 to make some sections larger or
> smaller so that things still fit on certain platforms. Each of those
> changes would lead to one of these situations where suddenly the
> romstage needs to move everything into a different place from where
> the bootblock left it because it needs space for new things that the
> older code didn't account for.)
>
> There are plenty of other dependencies between RO and RW stages, many
> of them very abstract and platform specific. For example, on Arm SoCs
> we often have a clock tree where a few shared top-level PLLs get
> multiplexed onto many peripheral devices that each have their own
> pre-divisors. The PLLs are usually set up by bootblock code (because
> some of them are needed very early), and the peripheral divisors are
> usually configured by the driver that initializes that peripheral. If
> you e.g. initialize an I2C controller in ramstage to 400KHz, the I2C
> driver can calculate what divider to set based on the target frequency
> and a #define constant of what the top-level PLL was set to. But what
> if for some reason the top-level PLL frequency needs to change? Today
> we'd submit a patch to change that #define and all the peripheral
> driver code will continue to calculate accurate divisor values. But if
> you then try to combine that RW ramstage with an RO bootblock built
> before that change, suddenly your divisor will be calculated with an
> incorrect value and your I2C clock will either be too fast or too
> slow, resulting in maybe the peripheral not working, or maybe your
> boot speed just regressing in a way nobody will ever track down, or
> maybe the bus running slightly too fast so that on 99.9% of devices it
> will still work but on 0.1% of devices you suddenly start having an
> awful heisenbug that will be a royal pain to track down.
>
> There can be a near-infinite number of scenarios like this on various
> platforms where code running in a later stage depends on some hardware
> initialization that code running in an earlier stage did, and I see no
> practical way to track them all even if we did try to implement
> explicit backwards-compatibility for all of them. vboot is an optional
> feature in coreboot and most people aren't even using it. How are we
> going to get every developer/reviewer to care about whether a change
> they're about to make could subtly break compatibility for a feature
> they've never used?
>
> And even if we did find a way to perfectly track every possible
> dependency and wanted to implement backwards-compatibility for all of
> them, what is the cost for doing so (both the effort of writing it,
> and the amount of extra romstage code we need to carry around in our
> binaries)? Not every incompatibility is easy to fix up once you detect
> it. Let's go back to the vboot workbuffer, for example: the workbuffer
> contains among other things a couple of data structures that have been
> read from the TPM and contain information relevant at various points
> in the boot flow. Right now, two of those (secdata_firmware and
> secdata_kernel) get loaded in verstage, while the third (secdata_fwmp)
> only gets loaded in our payload. But we have been discussing that we
> might want to (for various reasons) change this in the future so that
> secdata_fwmp is also loaded in verstage. Let's say we make that
> change, and we want to stay backwards compatible to RO firmware built
> before it. That means that early in romstage, before any new code that
> depends on secdata_fwmp runs, we'll need to run some code that checks
> the workbuffer version and fixes this incompatibility if it is too
> old. But in order to be able to load secdata_fwmp, we'll need to link
> the entire TPM driver stack into romstage which otherwise wouldn't be
> needed there on many platforms. So we are bloating our binary size in
> a potentially space-restricted pre-RAM stage just to be able to fix an
> issue that most people are never going to care about. Multiply that by
> the dozens of possible incompatibilities that will accumulate over the
> years.
>
> I totally understand that it would be great if we could just build a
> system where you can just run any RO and RW together and have them
> work it out, but I really don't think that's feasible in practice.
> vboot was not designed to support that, for good reason. And if we
> tried to just get started on adding compatibility code for specific
> issues that people have on a specific platform right now to "see how
> far we get", I don't think it's going to work out in the long term and
> it will just lead to disruption and extra code complexity right now
> that's eventually not going to pay off for anyone. It may work for the
> one platform you care about for a while but sooner or later there'll
> be another compatibility break that you can't easily mitigate. And if
> people start to rely on this working then they may get angry if
> someone else makes a change that is going to break compatibility on
> their platform in a way that can't be fixed, which I think would be a
> bad thing, because we do need to be able to develop new features and
> fix architectural issues (both in vboot and in other code that can
> cause RO->RW dependencies) unburdened by legacy code from years ago.
>
_______________________________________________
coreboot mailing list -- coreboot@coreboot.org
To unsubscribe send an email to coreboot-le...@coreboot.org

Reply via email to