> On 3 Nov 2021, at 15:03, Thomas Deutschmann <whi...@gentoo.org> wrote:
>
> Hi,
>
> it is currently not possible to smoothly run a world upgrade on a 4 months
> old system which doesn't even have a complicated package list:
> [snip]
>
> This is not about finding solution to upgrade the system (in this case it was
> enough to force PYTHON_TARGETS=python3_8 for portage). This is about raising
> awareness that Gentoo is a rolling distribution and that we guarantee users
> to be able to upgrade their system when they do world upgrades just once a
> year (remember: in my case the last world upgrade is just 4 months old!). If
> they cannot upgrade their system without manual intervention, we failed to do
> our job.
>
> Situations like this will disqualify Gentoo for any professional environment
> like this will break automatic upgrades and you cannot roll individual fixes
> for each possible situation via CFM tools like Salt, Ansible, Puppet or Chef.
>
> It would be very appreciated if everyone will pay more attention to this in
> future. We can do better. In most cases we can avoid problems like this by
> keeping older ebuilds around much longer for certain key packages to help
> with upgrades.

I agree wholeheartedly with this and thank you for raising it.

## Remark on some previous discussion

First, let me just mention that I think it's been on some of our minds but we need to go a bit further with formalising matters.

It was brought up at the end of the September 2021 council meeting as a footnote:
```
[21:16:56] <@sam_> I'd like to consider "upgrade lifcycles" at some point but I don't have notes ready for now. Mainly just about formalising efforts to support upgrades for X period and to try document a procedure for e.g. new EAPI versions and bootstrap packages not having new EAPIs for a while, and such.
[21:17:09] <@sam_> So, no, not right now, but I'd welcome any thoughts post-meeting while I consider it more
[21:17:33] <@sam_> The gist is to have a checklist so that we don't "get excited" like with EAPI 8 and end up making upgrades hard for people
[21:17:43] <@sam_> I think the GLEP we recently approved helps with that
```

I started working on some notes too on possible improvements: https://wiki.gentoo.org/wiki/User:Sam/TODO#Improving_upgrades.

(I wanted to mention all of this here because it's easy to lose track of e.g. council meeting references on a topic, so it's easy to find it in the thread now.)

## Summary of the two common cases

Now, in terms of the common issues regarding upgrades, I think we have two (to be clear, not trying to "fix your problem" -- just bring to bear some of the support experience I've had from #gentoo and so on):

1) World upgrades which can't complete due to new EAPIs (one's Portage lacks support for e.g. EAPI 8 and hence cannot read ebuilds)

I'm open to more broad measures about usage of new EAPIs in ~arch / stable (say, e.g. the first Portage supporting EAPI N should sit in ~arch for 4/6/??? months before any ebuilds should use it?), but I think this is a drastic measure we might be able to avoid. Let's keep it in mind in case we do need it though.

My general thinking on this is that it doesn't matter _too much_(?)
as long as one can upgrade Portage without hassle. A lot of our users seem to know to try upgrade Portage if they can't upgrade their system due to new EAPIs, but they then fall down due to cryptic errors (see my next point).

We could also improve the "unknown EAPI" error if necessary to make this more clear.

TL;DR: We might be able to leverage a more drastic option, but my hope is we can avoid any direct action in handling 1) if we deal with the next point I'm about to make (2)).

2) Portage often can't upgrade itself when there's "pending global PYTHON_TARGETS changes" (e.g. when we change the default value of PYTHON_TARGETS in the profiles (like from Python 3.8 to Python 3.9))

This one is far trickier. I've started documenting common hacks/methods at https://wiki.gentoo.org/wiki/User:Sam/Portage_help/Upgrading_Portage#Solution which has been rather useful in #gentoo and on the forums (it's been nice to see links on those and other similar pages pop up on /r/gentoo).

Portage is written in Python and has dependencies in Python. A lot of them are optional (which is why in the wiki page I linked to, I suggest emerge --syncing and then turning off USE=rsync-verify temporarily to reduce dependencies), but I don't think this is particularly comforting to a user who just wants to upgrade Portage. They don't necessarily realise they need to toggle one or *several* flags on Portage to make it work.

dilfridge has been advocating for some time that we try look at some form of a "static Portage" copy (possibly vendoring/bundling all Python dependencies) to completely decouple the Portage ebuilds from the Python eclasses other than needing a (modern) Python 3 interpreter. [I've filed a bug for this here: https://bugs.gentoo.org/821511].

I really feel like this is one of the big things we need to tackle. Upgrading Portage unlocks newer EAPIs and allows us to even discuss world upgrades.
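
To make 2) a bit more concrete, here is a rough Python sketch of why a targets flip bites. This is not how Portage models anything internally: `deps_blocking_target`, the dict layout and the `example-dep-*` package names are all made up for illustration, and the installed targets shown are an assumed old system, not real data.

```python
# Illustrative only: not Portage code. Package names ending in
# "example-dep-*" and the target sets below are hypothetical.
from typing import Dict, List, Set


def deps_blocking_target(
    new_target: str, built_targets: Dict[str, Set[str]]
) -> List[str]:
    """Return the packages that are not currently built for new_target.

    built_targets maps a package name to the set of PYTHON_TARGETS
    values it is installed with.
    """
    return sorted(
        pkg for pkg, targets in built_targets.items() if new_target not in targets
    )


# Assumed system, last updated while the profile default was python3_8.
installed = {
    "sys-apps/portage": {"python3_8"},
    "dev-python/example-dep-a": {"python3_8"},
    "dev-python/example-dep-b": {"python3_8", "python3_9"},
}

# The profile default moves to python3_9: every package printed here has to
# be rebuilt together before Portage can be installed for the new target.
print(deps_blocking_target("python3_9", installed))
```

The fewer Python dependencies Portage has enabled, the shorter that list can get, which is roughly why temporarily dropping optional flags such as USE=rsync-verify tends to help.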

(Using an older Portage to try upgrade world with any non-trivial @world set (chosen, user-specified packages) is likely to be a fool's errand -- folks have already said that if _anything_ is using a new EAPI, it's going to affect some users and result in confusing errors.)

## Solutions

* News item when a new EAPI is released explaining how to upgrade Portage in case of emergency / inability to upgrade Portage. We can describe the steps at https://wiki.gentoo.org/wiki/Project:Portage/Fixing_broken_portage:
  This would also flag to users that they should upgrade Portage sooner-rather-than-later even if they aren't currently willing/able to fully upgrade the rest of their system.

* We may want to include a 'rescue-portage' script on the system which downloads the latest Portage (would need to use a symlink or something to reliably get the latest version).

* Investigate reducing Portage's dependencies.

* Mitigate PYTHON_TARGETS profile change impact:
** I don't love this idea but one possible measure is that we always have two PYTHON_TARGETS set at all times (this would double build times for a fair amount of packages).
** Or we do this just for Portage and its dependencies.
** Or we have a new portage-minimal ebuild (to simplify matters) which always has some/all targets enabled, which will have few/no Python dependencies.

[Note that in the past, we weren't consistent about putting out news items for this change. We're doing that now at least. The matter has got a bit worse because of Python upstream's release cycle changing.]

* Implement at least a 4-6 month(?) delay on using new EAPIs after a new version of Portage supports it (the timer resetting once it hits stable too).

I wasn't sure about this at first, but actually, the PYTHON_TARGETS stuff _should_ be fine for the most part as long as we make sure the tree is mostly/entirely ready before flipping the switch.

[This could actually help with a fair amount of the problems (other than "general upgrade issues" like conflicts) except when a new EAPI comes along with a targets change, and if we're looking to support upgrades over a year or two years, that's.. probably going to coincide.]

## TL;DR

I don't think we can avoid thinking about Portage's entanglement / relationship with PYTHON_TARGETS. Banning use of new EAPIs immediately will not magically make it easy to upgrade Portage itself.

But the combination of a new EAPI + PYTHON_TARGETS changes in profiles is pretty lethal.

I've got a few ideas above and I hope we can discuss some of them, or even better, someone has other proposals.

best,
sam