On 7/5/17 1:45 pm, Warner Losh wrote:
On Sat, May 6, 2017 at 10:03 PM, Julian Elischer <jul...@freebsd.org> wrote:
On 6/5/17 4:01 am, Toomas Soome wrote:

On 5 May 2017, at 22:07, Julian Elischer <jul...@freebsd.org> wrote:

Subject says it all really: is this an option at this time?

We'd like to try to boot the main zfs root partition and then fall back to a
small UFS-based recovery partition.. is that possible?

I know we could use GRUB, but I'd prefer to keep it in the family.




It is, sure, but there is a compromise to be made for it.

Let's start with what I have done in the illumos port, as the idea there is
exactly to have binaries that are as “universal” as possible (just the binaries
are listed below to show the sizes):

-r-xr-xr-x   1 root     sys       171008 Apr 30 19:55 bootia32.efi
-r-xr-xr-x   1 root     sys       148992 Apr 30 19:55 bootx64.efi
-r--r--r--   1 root     sys         1255 Oct 25  2015 cdboot
-r--r--r--   1 root     sys       154112 Apr 30 19:55 gptzfsboot
-r-xr-xr-x   1 root     sys       482293 May  2 21:10 loader32.efi
-r-xr-xr-x   1 root     sys       499218 May  2 21:10 loader64.efi
-r--r--r--   1 root     sys          512 Oct 15  2015 pmbr
-r--r--r--   1 root     sys       377344 May  2 21:10 pxeboot
-r--r--r--   1 root     sys       376832 May  2 21:10 zfsloader

The loader (BIOS/EFI) is built with the full complement: zfs, ufs, dosfs,
cd9660, nfs, tftp + gzipfs. The cdboot starts zfsloader (that's a trivial
string change).

The gptzfsboot in the illumos case is built only with zfs, dosfs and ufs, as
it only has to support disk-based media to read out the loader. Also, I am
building gptzfsboot with libstand and libi386 to get as much shared code as
possible - which has both good and bad sides, as usual ;)

The gptzfsboot size means that with ufs a dedicated boot partition
(freebsd-boot) is needed; with zfs the illumos port always uses the 3.5MB
boot area after the first two labels (as there is no geli, illumos does not
need a dedicated boot partition with zfs).

As freebsd-boot is currently created at 512k, the size is not an issue.
Also, using common code allows the generic partition code to be used, so
GPT/MBR/BSD (VTOC in the illumos case) labels are not a problem.
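
To make that concrete, here is a minimal sketch of such a layout using
FreeBSD's gpart(8); the device name ada0, the partition sizes and the zroot
label are assumptions for illustration:

# GPT disk with a dedicated 512k freebsd-boot partition (device name assumed)
gpart create -s gpt ada0
gpart add -t freebsd-boot -s 512k ada0
gpart add -t freebsd-zfs -l zroot ada0
# install the protective MBR and the gptzfsboot stage into partition 1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0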


So, even just with CD boot (ISO), starting zfsloader (which in FreeBSD has
built-in ufs, zfs, etc.), you can already get rescue capability.
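
For example, a hedged sketch of stamping such a rescue ISO with makefs(8),
along the lines of what the release scripts do; the staging tree and output
names are assumptions:

# assumes ./rescue-tree contains boot/cdboot, boot/zfsloader, a kernel, etc.
makefs -t cd9660 -o rockridge -o label=RESCUE \
    -o bootimage="i386;rescue-tree/boot/cdboot" -o no-emul-boot \
    rescue.iso rescue-tree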

Now, just by adding a ufs reader to gptzfsboot, we can use gpt + freebsd-boot
with a ufs root, but still load zfsloader, on a USB image, so it can be used
for both live/install and rescue, because zfsloader itself has support for
all the file systems + partition types.
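
A sketch of such a USB image layout, assuming a gptzfsboot built with the ufs
reader described above (the da0 device and sizes are hypothetical):

# hypothetical USB stick da0; gptzfsboot then reads /boot/zfsloader
# from the ufs partition
gpart create -s gpt da0
gpart add -t freebsd-boot -s 512k da0
gpart add -t freebsd-ufs da0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
newfs -U /dev/da0p2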

I have kept myself a bit away from FreeBSD's gptzfsboot for a simple reason -
older setups have a smaller freebsd-boot partition, and not everyone is
necessarily happy about size changes :D Also, in the FreeBSD case there is
another factor called geli - it most certainly does contribute some bits, but
it also needs to be properly addressed in the I/O call stack (as we have seen
with the zfsbootcfg bits). But then again, here too the shared code can help
to reduce the complexity.

Yea, the zfsloader/loader*.efi in the listing above is actually built with
framebuffer code and a compiled-in 8x16 default font (lz4-compressed ASCII +
box drawing, basically - because zfs has lz4, the decompressor is always
there), and ficl 4.1, so that's a bit of a difference from the FreeBSD loader.

Also note that we can still build the smaller dedicated blocks like boot2;
it's just that we cannot use those blocks for the more universal cases, and
eventually those special cases will diminish.

thanks for that..

So, here's the exact problem I need to solve:
FreeBSD 10 (or newer) on Amazon EC2.
We need a plan for recovering from the scenario where something goes wrong
(e.g. during an upgrade) and we are left with a system where the default
zpool rootfs points to a dataset that doesn't boot. It is possible that maybe
the entire pool is unbootable into multi-user.. Maybe somehow it filled up?
Who knows? It's hard to predict future problems.
There is no console access at all, so there is no possibility of human
intervention. All recovery paths that start with "enter single-user mode
and...." are therefore unusable.

The customers who own the Amazon account are not crazy about giving us the
keys to the kingdom for all their EC2 instances, so taking the root drive off
a 'sick' VM and grafting it onto a FreeBSD instance to 'repair' it becomes a
task we don't really want to have to ask them to do. They may not have the
in-house expertise to do it confidently.

This leaves us with automatic recovery, or at least automatic methods of
getting access to that drive over the network.
Since the regular root is zfs, my gut feeling is that, to reduce the chances
of confusion during recovery, I'd like the recovery system itself to be
running off a UFS partition, potentially with a memory root filesystem.
As long as it can be reached over the network, we can then take over.
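
A hedged sketch of what the recovery partition's loader configuration might
look like, along the lines of mfsBSD; the file names and the md0 device are
assumptions:

# hypothetical /boot/loader.conf on the UFS recovery partition
mfsroot_load="YES"                 # preload a memory root image
mfsroot_type="mfs_root"            # tag it so the md(4) driver picks it up
mfsroot_name="/boot/mfsroot"       # assumed path of the UFS image
vfs.root.mountfrom="ufs:/dev/md0"  # mount the memory disk as root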

We'd also like to have boot environment support in the bootcode.
So, what would be the minimum set we'd need?

UFS support, ZFS support, BE support, and support for selecting a completely
different boot procedure after some number of boot attempts that fail to get
all the way to multi-user.
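
For the fallback piece, a minimal sketch of the one-shot mechanism that
zfsbootcfg(8) provides (the pool and BE names are hypothetical); the setting
is consumed by the boot blocks on the next boot, so if that boot never
reaches multi-user and never re-arms it, the following boot falls back to the
pool's default:

# try the new BE exactly once; zroot/ROOT/new is an assumed name
zfsbootcfg "zfs:zroot/ROOT/new:"
# a late rc script in the new BE would then make it the default
# (e.g. zpool set bootfs=zroot/ROOT/new zroot) only after a
# successful boot to multi-user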

How does that come out size-wise? And what do I need to configure to get
that?

The current EC2 instances have a 64kB boot partition, but I have a window to
convince management to expand that if I have a good enough argument (we are
doing a repartition on the next upgrade anyway, which is "special": it's our
upgrade from 8.0 to 10.3).
Being able to self-heal, or at least 'get at' a sick instance, might be a
good enough argument, and it would make the EC2 instances the same as all the
other versions of the product..
You should convince them to move to 512k post-haste. I doubt 64k will
suffice, and 512k is enough to get all the features you desire.

Yeah, I know, but sometimes convincing management of things is like banging
one's head against a wall. Don't think I haven't tried, or that I won't keep
trying.


Warner

/me has a thought.. I wonder if the EC2 instance BIOS has enough network
support to allow PXE-like behaviour? Or at least to be able to receive
packets..?

rgds,
toomas
