On Sat, Nov 04, 2006 at 01:33:49AM -0800, Zac Medico wrote:
> Hi everyone,
>
> As many of you probably know already, bug 46223 [1] prevents the
> proper uninstallation of a package when one or more of the eclasses
> that it inherits are missing from the live portage tree.
>
> There are essentially two ways to solve this problem:
>
> Approach #1: Reuse the saved environment which contains a mixture of
> eclasses and parts of the package manager (such as portage's
> ebuild.sh) that performed the installation.

That's not technically true; what is required is stripping the manager *out* of the env, leaving in only things that are part of the ebuild env (eapi crap basically). There are a few exceptions to this, but those exceptions cover unversioned changes in the ebuild API (cue EAPI). One example is the use function going silent after .50-r11: since older ebuilds relied on use echoing the flag to stdout if it was enabled, you have to use the ebuild env's copy of it to avoid breaking older installed ebuilds (something portage ignored, and broke, I might add).

> Approach #2: Save copies of the raw eclasses and use them, together
> with the saved ebuild, to recreate the environment.
>
> A major advantage of approach #1 is that it can potentially provide
> complete restoration of the install-time environment. The major
> disadvantage is that the saved environment needs to be purified so
> that parts of the package manager that did the install do not
> interfere with the package manager that does the uninstall. The
> purification process could potentially be complex or error prone,
> making it difficult to maintain or unreliable.

The complexity comes down to tracking what the manager provides, and knowing the internal vars to strip (DISTDIR is a system-specific setting and should be pulled from the local system every time)...

This is actually a *good* thing:

1) It forces ebuild.sh functionality to be written nonsucky, namely being aware of the side effects of function invocations; specifically, not bleeding vars (use local like a mad man).
2) It actually allows distdir/portdir to change between phases if the user (typically a dev) is running phases manually for testing and happens to change something in between phases (happens, and is annoying).

So... it requires having at least an audited ebuild.sh, which is pretty simple (good thing to do anyways), and a bash side that isn't fragile.
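
To make the two points above concrete, here is a minimal bash sketch. It is not portage's code: MANAGER_VARS, leaky_helper, clean_helper and load_saved_env are hypothetical names, and it assumes the saved env is a file of plain NAME=value assignments. It shows a helper bleeding a variable when it skips local, and a saved env being loaded with the system-specific settings dropped and DISTDIR taken from the current system instead.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; every name here is hypothetical.

# Settings that belong to the local system, not to the saved ebuild env.
MANAGER_VARS=(DISTDIR PORTDIR)

# A helper written without `local` leaks its scratch variable to the caller.
leaky_helper() { tmp="scratch"; }

# The same helper with `local` leaves the caller's environment untouched.
clean_helper() { local tmp="scratch"; }

# Load a saved env (assumed to be plain NAME=value assignments), drop the
# manager-owned variables, and take DISTDIR from the current system.
load_saved_env() {
    local saved_env=$1 current_distdir=$2 name
    source "$saved_env"
    for name in "${MANAGER_VARS[@]}"; do
        unset "$name"
    done
    DISTDIR=$current_distdir
}
```

Calling load_saved_env with an env file and the current distfiles path leaves the ebuild's own variables in place while PORTDIR is unset and DISTDIR reflects the local system.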

The *one* spot I'd define as tricky is stripping functions out of the environment dump, since sans pkgcore's filter-env, no good method existed for it till recently: <bash-3.2 would choke on certain functions (try f-x() { :; }; declare -f f-x on 3.1 sometime), which's --read-function blows, and the grep approach is a total no go (here ops can cause a fair bit of hell for line-based grep). That said, with bash 3.2, doing a selective dump of the environment is now doable completely in pure bash, which makes for a far simpler implementation.

So, error prone? Not really in my experience; any bugs I've hit over the last 2 years of running ebd off and on have always been related to trying to match filter-env to bash parsing rules. With bash 3.2, you can do it in about 10 lines of shell, give or take.

> Simplicity is the major advantage of approach #2. The eclasses that
> are used by a package could be stored in eclasses.tbz2 (much like
> environment.bz2 is currently stored for each installed package), and
> the environment could be recreated from scratch by using the ebuild
> and eclasses in the normal inherit process.

Addressed below...

> This wouldn't
> necessarily allow access to the complete build-time or install-time
> environment, but that feature is not currently available anyway.

Was at one point. Bit curious what happened to those changes, since I went through and fixed all of that crap about a year back (it *may* have stayed only in the portage-2.1_alpha branch).

> If
> we save the raw eclasses for each installed package, we can easily
> gain the ability to remove eclasses from the portage tree and make
> incompatible api changes when necessary, without the complexity of
> environment purification.

That data isn't in the vdb *now*, and there is no way to force folks to upgrade *now*.
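
For the selective function dump discussed earlier in this mail, a rough pure-bash sketch could look like the following. This is an illustration under assumptions, not filter-env or EBD: dump_ebuild_functions is a hypothetical name, and the caller supplies the list of manager-owned functions to leave out. It relies only on the declare -F and declare -f builtins.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the function name and the filter list are
# hypothetical and do not come from portage or pkgcore.

# Print every defined function in re-sourceable form, except the ones
# named in the arguments (the manager's own functions) and this helper.
dump_ebuild_functions() {
    local -a skip=("$@")
    local fn s keep
    # `declare -F` prints one "declare -f NAME" line per defined function.
    while read -r _ _ fn; do
        keep=1
        for s in "${skip[@]}" dump_ebuild_functions; do
            [[ $fn == "$s" ]] && { keep=0; break; }
        done
        # `declare -f NAME` prints the definition so it can be sourced later.
        (( keep )) && declare -f "$fn"
    done < <(declare -F)
    return 0
}
```

Something like `dump_ebuild_functions has use > functions.env` would then write out everything except the listed manager functions, with bash itself doing the parsing rather than a line-based filter.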
This isn't one of the normal "wait six months, and then make them go through a bit of hell" cases; this is "break their ability to use all preexisting binpkgs and make it far easier for folks to have pkgs that cannot be uninstalled". You cannot screw with the existing eclasses in a backwards-incompatible way without running a real risk of breaking vdb/binpkg ebuilds (built ebuilds essentially); re-read glep33 and experiment locally if in doubt.

> The sooner that we start storing eclasses.tbz2 for
> each installed package, the sooner that we will be able to have more
> freedom with the eclasses in the live portage tree.

Reiterating, since this is massively wrong; read glep33. Existing eclasses are pretty much stuck in the tree until it's decided to cause a fair bit of hell for folks lagging behind on their $PKG_MANAGER updates. That is the sole reason that glep33 required moving all new eclasses to eclass2, a location that existing managers do not know about; it keeps folks who have crappy managers from touching the dynamic eclasses and winding up broke, while removing the static restriction on the new eclasses.

> What do people think about these two approaches? Personally, I
> would prefer approach #2 for the sake of simplicity and
> maintainability.

One thing that hasn't yet been mentioned is that ebuilds are written to expect their environment to be accessible through the phases; folks *expect* that if they set something in src_compile, it'll be there in src_install. Re-sourcing the ebuild/eclass for every phase means that it's now possible for the ebuild/eclass to inadvertently corrupt the stored phase state; an extension of this problem is that profile.env and friends get sourced *after* the ebuild's phase dump is restored.

Short version: look at the java-2 eclass, and note that they're using a user-specific hook to restore the environment prior to every single phase execution.
They're forced to do this because portage assumes it can rebuild the environment via re-executing code that is intended to *generate* an initial environment.

You run from the (existing) saved env, it works now for vdb, for binpkgs, enables dynamic eclasses, and wipes out the entire class of issues from the manager/initial env screwing up the ebuild's expected state. Approach #2 doesn't.

Finally, portage already has an implementation of proper environment saving/restoration: portage-2.1_alpha* holds EBD, and pkgcore holds the current version of it. Ripping it out of pkgcore and pushing in the updates is fairly straightforward work (rote), and carries with it a 2x regen speedup while fixing the issues from above. Already got pkgcore's cache, might as well rip out more bits ;)

Not after the regen speedup via daemonizing ebuild.sh? Fair enough, the code for proper env saving/restoration isn't bound to it; can rip that out without issue.

So... basically either pull a NIH, or nudge portage the remaining steps to gain the full support (harsh, but the environment data is there and has been for a long while, it's just a matter of using the work I've already done to finish the job).

~harring