On 2/10/21 11:11 AM, Rich Freeman wrote:
> On Wed, Feb 10, 2021 at 12:57 PM Andreas K. Hüttel <dilfri...@gentoo.org>
> wrote:
>>
>> * what portage features are still needed or need improvements (e.g. binpkg
>> signing and verification)
>> * how should hosting look like
>
> Some ideas for portage enhancements:
>
> 1. Ability to fetch binary packages from some kind of repo.

The old PORTAGE_BINHOST functionality has been replaced with a
binrepos.conf file that's very similar to repos.conf:

https://bugs.gentoo.org/668334

It doesn't have explicit support for multiple local binary package
repositories yet, but somebody got it working with sync-uri set to a
file:/ URI, as described in the comments on this bug:

https://bugs.gentoo.org/768957

> 2. Ability to have multiple binary packages co-exist in a repo (local
> or remote) with different build attributes (arch, USE, CFLAGS,
> DEPENDS, whatever).

We can enable FEATURES=binpkg-multi-instance by default now that this
bug is fixed:

https://bugs.gentoo.org/571126

> 3. Ability to pick the most appropriate binary packages to use based
> on user preferences (with a mix of hard and soft preferences).

The current package selection logic for binary packages is basically the
same as for ebuilds. These are the notable differences:

1) Binary packages are sorted in descending order by (version, mtime),
   so the most recent builds are preferred when the versions are
   identical.

2) The --binpkg-respect-use option rejects binary packages that would
   need to be rebuilt in order to match local USE settings.

> One idea I've had around how #2-3 might be implemented is:
> 1. Binary packages already contain data on how they were built (USE
> flags, dependencies, etc). Place this in a file using a deterministic
> sorting/etc order so that two builds with the same settings will have
> the same results.

This would only be needed for multi-profile binhosts that provide a
variety of configurations for the same package. Features like this are
not necessary if the binhost only intends to provide packages for a
single profile.

> 2. Generate a hash of the file contents - this can go in the filename
> so that the file can co-exist with other files, and be located
> assuming you have a full matching set of metadata.

For FEATURES=binpkg-multi-instance we currently use an integer BUILD_ID
to ensure that file names are unique.

> 3. Start dropping attributes from the file based on a list of
> priorities and generate additional hashes. Create symlinked files to
> the original file using these hashes (overwriting or not existing
> symlinks based on policy). This allows the binary package to be found
> using either an exact set of attributes or a subset of higher-priority
> attributes. This is analogous to shared object symlinking.
> 4. The package manager will look for a binary package first using the
> user's full config, and then by dropping optional elements of the
> config (so maybe it does the search without CFLAGs, then without USE
> flags). Eventually it aborts based on user prefs (maybe the user only
> wants an exact match, or is willing to accept alternate CFLAGs but not
> USE flags, or maybe anything for the arch is selected).
> 5. As always the final selected binary package still gets evaluated
> like any other binary package to ensure it is usable.
>
> Such a system can identify whether a potentially usable file exists
> using only filename, cutting down on fetching. In the interests of
> avoiding useless fetches we would only carry step 3 reasonably far -
> packages would have to match based on architecture and any dynamic
> linking requirements. So we wouldn't generate hashes that didn't
> include at least those minimums, and the package manager wouldn't
> search for them.
>
> Obviously you could do more (if you have 5 combinations of use flags,
> look for the set that matches most closely). That couldn't be done
> using hashes alone in an efficient way. You could have a small
> manifest file alongside the binary package that could be fetched
> separately if the package manager wants to narrow things down and
> fetch a few of those to narrow it down further.

All of the above is oriented toward multi-profile binhosts, so we'll
have to do a cost/benefit analysis to determine whether the added
complexity of multi-profile binhosts is worth the effort.
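
As a rough illustration of what steps 1-4 of the quoted proposal would
involve, here is a minimal Python sketch. It is not Portage code: the
attribute names, the drop order, the file name, and the helpers
(attr_hash, fallback_hashes, publish, find_binpkg) are all hypothetical,
and a plain dict stands in for the symlinks on the binhost.

```python
import hashlib
import json

# Hypothetical drop order, least important attribute first. ARCH is never
# dropped, mirroring the "at least those minimums" rule in the proposal.
DROP_ORDER = ["CFLAGS", "USE"]


def attr_hash(attrs):
    """Hash build attributes using a deterministic serialization."""
    # Sort list-like values (e.g. USE flags) so ordering does not matter.
    normalized = {
        key: sorted(value) if isinstance(value, (list, set, tuple)) else value
        for key, value in attrs.items()
    }
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def fallback_hashes(attrs):
    """Return the full-attribute hash, then hashes of reduced attribute sets."""
    remaining = dict(attrs)
    hashes = [attr_hash(remaining)]
    for key in DROP_ORDER:
        if key in remaining:
            del remaining[key]
            hashes.append(attr_hash(remaining))
    return hashes


def publish(index, filename, attrs):
    """Register a package under every hash (stands in for the symlinks)."""
    for digest in fallback_hashes(attrs):
        # Assumed policy: a later build overwrites an existing entry.
        index[digest] = filename


def find_binpkg(index, attrs, max_dropped):
    """Look up a package, dropping at most max_dropped attributes."""
    for digest in fallback_hashes(attrs)[: max_dropped + 1]:
        if digest in index:
            return index[digest]
    return None


index = {}
built = {"ARCH": "amd64", "USE": ["ssl", "zlib"], "CFLAGS": "-O2"}
publish(index, "foo-1.0-1.xpak", built)

wanted = {"ARCH": "amd64", "USE": ["zlib", "ssl"], "CFLAGS": "-O3"}
print(find_binpkg(index, wanted, max_dropped=0))  # exact match only
print(find_binpkg(index, wanted, max_dropped=1))  # CFLAGS may differ
```

Here max_dropped plays the role of the user preference in step 4: 0 means
"exact match only", and larger values accept progressively looser matches.
Whatever this lookup returns would still have to pass the normal binary
package checks (step 5).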

> Or you could skip the hash searching and just fetch all the manifests
> for a particular package/arch and just search all of those, but that
> is more data to transfer just to do a query. A metadata cache of some
> kind of might be another solution. Content hashes would probably
> still be useful just to allow co-existence of alternate builds.

This also relates to the centralized Packages file that's currently used
to distribute the package metadata for all packages in a binhost. We
can make it scale better if we split out a separate index per package,
not unlike a PyPI simple index:

https://pypi.org/simple/
--
Thanks,
Zac