On Thu, 27 Jun 2024 at 11:46:51 +0200, Helmut Grohne wrote:
> I am concerned about behavioural differences due to the reimplementation
> from first principles aspect though. Jochen and Aurelien will know more
> here, but I think we had a fair number of ftbfs due to such differences.
> None of them was due to the architecture of creating a namespace for
> each command, and most of them were due to not having gotten containers
> right in general. Some were broken packages, such as ones skipping tests
> when detecting schroot.
Right - this is an instance of the more general problem pattern, "if we
don't test a thing regularly, we can't assume it works". We routinely
test sbuild+schroot (on the buildds), and individual developers often try
builds without any particular isolation (on development systems or
expendable test systems), but until recently sbuild's unshare backend was
not something that would be routinely tested with most packages, and
similarly most packages are not routinely built with Podman or Docker or
whatever else.

In packages that, themselves, want to do things with containers during
their build or testing (for example bubblewrap and flatpak), there will
typically be a code path for "no particular isolation" that actually runs
the tests (otherwise upstream would not find the tests useful), and a
code path for sbuild+schroot that skips the tests (otherwise they'd fail
on our historical buildds), but the detection that we are in a
locked-down environment where some tests need to be skipped might not be
100% correct. I know I've had to adjust flatpak's test suite several
times to account for things like detecting whether FUSE works (because on
DSA'd machines it intentionally doesn't, as a security hardening step).

> If we move beyond containers and look into building
> inside a VM (e.g. sbuild-qemu) we are in a difficult spot, because we
> need e.g. systemd for booting, but we may not want it in our build
> environment. So long term, I think sbuild will have to differentiate
> between three contexts:
> * The system it is being run on
> * The containment or virtualisation environment used to perform the
>   build
> * The system where the build is being performed inside the containment
>   or virtualisation environment

Somewhat-related prior art for this: https://salsa.debian.org/smcv/vectis
uses a VM (typically running Debian stable), installs sbuild + schroot
into it, and uses sbuild + schroot for the actual build, in an attempt to
replicate the setup of the production buildds on developer machines. In
this case sbuild is in the middle layer instead of the top layer, though.
Similarly, when asked to test packages under lxc (in an attempt to
replicate the setup of ci.debian.net), vectis installs lxc into a VM, and
runs autopkgtest on the VM rather than on the host system.

Of course, I'd prefer it if Debian's production infrastructure were
something that would be easier to replicate "closely enough" on my
development system (such that packages that pass tests on my development
system are very likely to pass tests on the production infra), without
damaging my development system if I use it to build a malicious,
compromised or accidentally-low-quality package that creates side-effects
outside the build environment.

> I don't quite understand the need for a Dockerfile here. I suspect that
> this is the obvious way that works reliably, but my impression was that
> using podman import would be easier.

Honestly, the need for a Dockerfile here is: I already knew how to build
containers from a Dockerfile, and I didn't read the documentation for the
lower-level `podman import` because `podman build` can already do what I
needed. I see this as the same design principle that leads us to
encourage package maintainers to use dh, even when building trivial "toy"
packages like hello, in preference to implementing debian/rules at a
lower level in trivial cases.
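For the trivial case, the Dockerfile can be little more than a wrapper
around the same rootfs tarball that `podman import` would consume. As a
sketch (the tarball name is hypothetical - the sort of thing mmdebstrap
or debootstrap would produce - and the tag is only an example, chosen to
match the command quoted below):

# Dockerfile: roughly what `podman import` would do, but expressed in
# the form I already knew how to write
FROM scratch
ADD sid-rootfs.tar.xz /
CMD ["/bin/bash"]

built with something like `podman build -t localhost/local-debian:sid .`
in the directory containing the Dockerfile and the tarball.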
To build a non-trivial container with multiple layers, you'll likely need
a Dockerfile (or docker-compose, or some similar thing) *anyway*, so a
typical user expectation will be to have a Dockerfile, and anyone
building a container will likely already have learned the basics of how
to write one; and then we might as well follow the same procedure in the
trivial case, rather than having the trivial case be different and
require different knowledge.

> > $ autopkgtest -U hello*.dsc -- podman localhost/local-debian:sid
>
> This did not work for me. autopkgtest failed to create a user account.

Please report a bug against autopkgtest with steps to reproduce. It
worked for me, on Debian 12 with a local git checkout of autopkgtest, and
it's probably something that ought to work - although it's always going
to be non-optimal, because it will waste a bunch of time doing basic
setup like installing dpkg-dev and configuring the apt proxy before every
test. The reason why we have autopkgtest-build-podman is to do that setup
fewer times, cache the result, and amortize its cost across multiple
runs.

> I am more interested in providing isolation-container though as a
> number of tests require that and I currently tend to resort to
> virt-qemu for that. Sure enough, adding --init=systemd to
> autopkgtest-build-podman just works and a system container can also be
> used as an application container by autopkgtest (so there is no need to
> build both), but running the autopkgtest-virt-qemu --init also fails
> here in non-obvious ways. It appears that user creation was successful,
> but the user creation script is still printed in red.

(I assume you mean a-v-podman --init rather than a-v-qemu --init.
a-v-qemu always needs an init system.)

Please report a (separate) bug against autopkgtest with steps to
reproduce. Unfortunately I haven't been able to spend as much time on
autopkgtest in recent months as I would like, and I haven't done much
with podman system containers (with init) since the -docker/-podman
backend was originally merged.

I remember that at one point, shortly before the -docker/-podman backend
was merged, I did have a-v-podman --init working successfully on a system
with systemd as pid 1 on the host, and each of the three init systems
known to a-b-podman in the container: systemd, sysvinit with sysv-rc, or
sysvinit with openrc (only tested extremely briefly). At the time, I
think I was able to test src:dbus successfully with at least the first
two.

When testing my own packages, I usually have to prioritize -lxc because
it's de facto RC (ci.debian.net uses it when not configured otherwise),
and -qemu because it's the only way some of my packages can have good
test coverage (notably bubblewrap and flatpak, which want to create new
user namespaces during testing in a way that a container manager like
podman will not usually allow). Of course, in an ideal world I should be
re-running the test suite for each package in each of the potentially
interesting autopkgtest-virt-* backends, but that would only give me
fractionally better test coverage, in exchange for making it take even
longer to release a package. I am sorry for not having been optimally
thorough, but one bug that affects many of my package uploads, which
(unusually!) cannot be solved by adding extra QA steps, is "this update
took an unacceptably long time to reach the archive".
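For reference, the amortized autopkgtest-build-podman workflow I have in
mind is roughly the following (I'm writing the option names from memory,
so please double-check them against the man page rather than trusting
me):

# build a cached test image once; by default the result is tagged
# autopkgtest/debian:sid
autopkgtest-build-podman --image debian:sid
# then reuse it for as many test runs as needed
autopkgtest hello*.dsc -- podman autopkgtest/debian:sid

# or, for tests that need isolation-container, build a bootable system
# container, tagged autopkgtest/systemd/debian:sid by default
autopkgtest-build-podman --image debian:sid --init=systemd
autopkgtest hello*.dsc -- podman --init autopkgtest/systemd/debian:sid

Only the autopkgtest command in each pair needs to be repeated per test
run; the image build is done once and its result is cached.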
If ci.debian.net moves away from -lxc, resulting in "tests pass under
lxc" no longer being a de facto requirement for inclusion in testing,
then I would prefer to be using -podman for all of the simpler tests (for
example flatpak's debian/tests/build, which just exercises the -dev
package), because it has a much, much shorter lead time for per-test
setup than -qemu, while also having a useful level of isolation and being
straightforward to replicate on a developer system for interactive
debugging. Less-isolated backends like -schroot seem like a bad place to
invest time and effort, because they have more intrusive system and
privilege requirements while not actually being significantly faster or
more capable.

> Let me pose a possibly stupid suggestion. Much of the time when people
> interact with autopkgtest, there is a very limited set of backends and
> backend options people use frequently. Rather than making the options
> shorter, how about introducing an aliasing mechanism? Say I could have
> some ~/.config/autopkgtest.conf and whenever I run autopkgtest ... --
> $BACKEND such that there is no autopkgtest-virt-$BACKEND, consult that
> configuration file and, if a value is assigned there, expand it to the
> assigned value. Then, I can just record my commonly used backends and
> options there and refer to them by memorable names of my own liking.

That sounds like a reasonable feature request; please open a bug. As with
most reasonable feature requests in projects I maintain, it'll go on my
list, but if it's left to me to implement, please don't assume that I
will ever get far enough through that list within my lifetime.

A crude way to implement this would be to add something like this to
$PATH:

#!/bin/sh
# Save as ~/bin/autopkgtest-virt-sid and make it executable
set -eu
exec autopkgtest-virt-podman "$@" localhost/autopkgtest/debian:sid

and then use e.g. `autopkgtest ... -- sid`.

(But please note that some backends have more than one place where you
might wish to add arbitrary options: e.g. a-v-podman accepts a-v-podman
options, followed by exactly one image, followed by "--" and arbitrary
`podman run` options. It might be better if there were an --image
parameter that could appear first, as an alternative to the positional
parameter.)

> Automatic choice of images makes things more magic, which bears
> negative aspects as well.

The automatic choice of images is intended to be a matter of "have
reasonable defaults" rather than anything deeper. To take the example
from the man page: if you tell autopkgtest-build-podman to convert
debian:sid into a pre-prepared test container image, it'll default to
outputting autopkgtest/debian:sid, because that seems a little more
friendly than forcing the user to choose their own arbitrary name, and
establishing a convention via defaults makes it easier to write examples.
(Or if you use --init=systemd to create a bootable system container,
you'll get autopkgtest/systemd/debian:sid, and so on.)

> Every time I run a podman container (e.g. when I run
> autopkgtest) my ~/.local/share/containers grows. I think autopkgtest
> manages to clean up in the end, but e.g. podman run -it ... seems to
> leave stuff behind.

If you are using e.g. `podman run -it debian:sid`, then that is expected
to leave the container's root filesystem hanging around for future use or
inspection, even after all of its processes have exited. This is vaguely
analogous to using `schroot --begin-session` followed by
`schroot --run-session`, and then leaving the session open indefinitely.
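As an aside, if the main concern is how much disk space is accumulating
under ~/.local/share/containers, something like this should show where it
is going (I believe both forms of the subcommand exist in the podman
versions we ship, but I haven't checked recently):

# overall disk usage, split into images, containers and volumes
podman system df
# per-image and per-container breakdown
podman system df --verbose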
If you want resources used by the container to be cleaned up
automatically on exit, use the `--rm` option, as in
`podman run --rm -it debian:sid`. This is more like
`schroot --automatic-session`.

`podman container list -a` will list all the containers that have been
kept around in this way, and `podman container rm` or
`podman container prune` will delete them. This is analogous to
`schroot --end-session`.

> Of course, when I skip podman's image management and use --rootfs, I
> can sidestep this problem by choosing my root location on a tmpfs, but
> that's not how autopkgtest uses podman.

That seems like a reasonable a-v-podman feature request too. Presumably
it would only allow this when invoked as a-v-podman, and not when invoked
as a-v-docker (I don't think a-v-docker has an equivalent feature).

> > I don't think podman can do this within a single run. It might be
> > feasible to do the setup (installing build-dependencies) with
> > networking enabled; leave the root filesystem of that container
> > intact; and reuse it as the root filesystem of the container in which
> > the actual build runs, this time with --network=none?
>
> Do I understand correctly that in this variant, you intend to use
> podman without its image management capabilities and rather just use
> --rootfs, spawning two podman containers on the same --rootfs (one
> after another), where the first one installs dependencies and the
> second one isolates the network for building?

Maybe that; or maybe use its image management, tell the first podman
command not to delete the container's root filesystem (don't use --rm),
and then there's probably a way to tell podman to reuse the resulting
filesystem with an additional layer in its overlayfs for the
network-isolated run. Please note that I am far from being an expert on
podman or the "containers" family of libraries that it is based on, and I
don't know everything it is capable of.

Because Debian has a lot of pieces of infrastructure we have built for
ourselves from first principles, I've had to spend time on understanding
the finer points of sbuild, schroot, lxc and so on, so that I can
replicate failure modes seen on the buildds and therefore fix
release-critical bugs in the packages that I've taken responsibility for
(and occasionally also try to improve the infrastructure itself, for
example #856877, which recently passed its 7th birthday). That comes with
an opportunity cost: the time I spent learning about schroot is time that
I didn't spend learning about OCI.

One of the reasons I would like to have fewer Debian-specific pieces in
our stack is so that other Debian developers don't have to do what I did,
and can instead spend their time gaining transferable knowledge that will
be equally useful inside and outside the Debian bubble (for example the
best ways to use OCI images, and OCI-based tools like Docker and Podman,
which have a lot of overlap in how they are used even though they are
rather different behind the scenes).

smcv