On 19/03/2025 02:07, Ionen Wolkens wrote:
On Tue, Mar 18, 2025 at 08:34:43PM -0400, Ionen Wolkens wrote:
On Tue, Mar 18, 2025 at 03:14:13AM -0000, Duncan wrote:
Nowa Ammerlaan posted on Mon, 17 Mar 2025 11:11:06 +0100 as excerpted:
I had really hoped to receive more comments on my earlier RFC. [...]
I really do want to know what others think so I can
make a better judgment on whether or not my idea is really this crazy
and if I should just shut up about it or not (so dear reader if you have
an opinion then please share).
So because I carried over my own already "works for me" kernel maintenance
scripts from Mandrake when I switched in 2004 and have continued
maintaining and using them over the decades since, I normally try to stay
out of Gentoo kernel packaging discussion. But given both the above
explicit invitation and that as I've read the thread a thought occurred to
me...
First, DKMS /is/ a cross-distro standard solution. As such, I believe in
general it should be reasonably supported in Gentoo unless it simply
doesn't make sense (note that "doesn't make sense" can also include the
case of simply no one stepping up to do it, not the case here).
But, the thought that occurred to me reading the thread, was that there
are obvious parallels between this and another very significant and
controversial now "cross distro standard solution" (which I guess I don't
need to name explicitly).
As there, I believe "the Gentoo approach" should (again assuming developer
willingness to do the work, seemingly the case here) make it available as
an additional integrated *option*, while keeping the current Gentoo option
as well.
So I support DKMS integration /as/ /an/ /option/.
If anything, if we go forward with this, I'd rather that it be with the
plan to (eventually) either make it the default after enough testing
and then later drop support for the old way entirely (then merge the
eclasses), or revert if we think it's no good.
As I have already stated elsewhere, DKMS can do things that we cannot
achieve with the package manager, and the package manager can do things
that we cannot achieve with DKMS. Each pathway has its use cases. And
for that reason DKMS is not a replacement for the package manager. Nor
can I think of a possible future package manager based solution that can
fully replace what DKMS does (though who knows, maybe someone will prove
me wrong in 20 years).
This dual approach is not controversial either; other distributions
often offer a "normal" package as well as a DKMS package. Now since we
have USE flags we do not have to make two separate packages, but
nonetheless the core of letting the user choose to use DKMS or not
remains the same.
One of the things I did not like here is the idea to gain more ways
to do the same thing that need to be tested to ensure some quality.
Can't ignore it and leave it all to Nowa given if e.g. nvidia changes
some path or something else and I don't test it on bump, then I push
a broken package for all dkms users until someone reports it. Would
even need to boot with it to be sure.
I'll grant that you'd indeed have to test both USE=dkms and USE=-dkms,
especially if the ebuild does not use a modlist and therefore the
dkms.conf is not constructed fully automatically. Though I do not see
why this would require actually rebooting the system for both cases.
DKMS either builds and installs the module successfully in postinst or
it does not. And regardless of who did the module installing, it either
loads successfully or it does not. Note that we are intentionally using
the exact same commands to actually build the module in DKMS.
I'll also note again that Nvidia is one of the upstreams that supports
DKMS, in contrast to our own linux-mod-r1 solution in portage which I
don't think they care about at all. I'd therefore say that it is far
more likely for Nvidia to change something that breaks the existing
non-dkms pathway in the ebuild, than it is for them to break the dkms
pathway that lots of other distributions rely on.
It's nice to have choices in general, but still need to draw some
lines to keep things maintainable.
This maintainability argument would be a lot stronger if I was
reinventing the wheel and proposing some custom Gentoo specific solution
to the problem at hand. Note though that this is not what I am doing (in
fact one could even turn that around and say that this is what you are
doing). You are of course right that more options means more things to
test. But really, it's not a lot of work; I know because I did the work
for almost all of the kernel module ebuilds we have in ::gentoo and was
finished in half a day. The bulk of the work was designing and writing
the eclass and figuring out all the different cases that should be
supported; that part is done now.
And if picking, in the end do we pick an option that requires to
install sources and (imo) adds very little, or let the PM (that has
access to sources unlike binary distros) handle it (with full control
for handling issues) just like for dist kernels and improve on that
as needed?
Either way, as I said initially, I won't revert if this gets merged
(even if optional forever). Just stating that I don't like it and
probably won't offer real support, not blocking it.
wrt merging eclasses, could add that I wasn't really against the
support for this being in linux-mod-r1 directly, except that it was
confusing that it did not work when not using modlist. In the end I'd
probably just have asked for Nowa to add themselves as maintainer.
The main reason this is in a separate eclass is because we need a
pkg_prerm for dkms that linux-mod-r1 does not have. And as you pointed
out earlier, exporting an extra phase function in an established eclass
is not a good idea.
On a related note about modlist, I've been semi-regretting keeping that
modlist-type idea from the original linux-mod eclass and felt that a
simple emake wrapper (incl. modules args) for all packages "might" have
been better and easier to use for ebuilds and not miss modules on bump
and had been pondering "potential" deprecation in the future (not that
I had really explored that idea yet, would need to check packages).
(this was in my notes of things to consider for EAPI 9, but likely won't
try if there is another eclass built upon linux-mod-r1 that I need to
not break)
Note that none of this hard requires the modlist. The requirement is
that we have one or more dkms.conf files. These may be provided by
upstream (as is the case for nvidia-drivers), or generated by some build
system script (as is the case for zfs-kmod), or included in the
FILESDIR, or they can be generated by the eclass from the modlist.
This auto-generation option is just for convenience. The modlist already
contains all the information we need to define the dkms.conf, so all we
have to do is make the translation. Doing so makes it very easy for the
package maintainer to add dkms compatibility without actually writing a
custom dkms.conf.
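To illustrate the kind of translation meant here, below is a minimal
sketch in Python. It is not the eclass code (the eclass is bash), and
the function name, defaults and example values are hypothetical. It
assumes modlist entries of the form
name[=install-dir[:source-dir[:build-dir]]] with paths relative to the
module source tree, and it emits only a handful of standard dkms.conf
directives. The make_cmd parameter stands in for whatever build command
the ebuild already uses.

```python
def modlist_to_dkms_conf(package, version, modlist, make_cmd="make"):
    """Render dkms.conf text from modlist-style entries (illustrative only).

    Assumed entry format: name[=install-dir[:source-dir[:build-dir]]]
    """
    lines = [
        f'PACKAGE_NAME="{package}"',
        f'PACKAGE_VERSION="{version}"',
        'AUTOINSTALL="yes"',
        # Assumption: a single make command builds every listed module.
        f'MAKE[0]="{make_cmd}"',
    ]
    for i, entry in enumerate(modlist):
        name, _, rest = entry.partition("=")
        fields = rest.split(":") if rest else []
        # Fall back to defaults when a field is missing or empty.
        install_dir = fields[0] if len(fields) > 0 and fields[0] else "extra"
        source_dir = fields[1] if len(fields) > 1 and fields[1] else "."
        build_dir = fields[2] if len(fields) > 2 and fields[2] else source_dir
        lines += [
            f'BUILT_MODULE_NAME[{i}]="{name}"',
            f'BUILT_MODULE_LOCATION[{i}]="{build_dir}"',
            # DKMS expects this to start with /kernel, /updates or /extra.
            f'DEST_MODULE_LOCATION[{i}]="/{install_dir}"',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical package with two modules.
    print(modlist_to_dkms_conf("example", "1.0", ["foo", "bar=updates:src"]))
```

The point of the sketch is only that every field DKMS needs can be
derived from what the modlist already states, so the maintainer does
not have to write anything extra.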
If you wish to drop the modlist method from linux-mod-r1 then you can
still do so. It just requires that when upgrading from EAPI 8 to 9 we
also port the ebuild to some other method of providing the dkms.conf (for
example putting a stub dkms.conf in FILESDIR, sed'ing in the PV, and
then putting it in the proper place). I might then want to adjust the
src_compile phase of the eclass a bit when bumping it to EAPI 9, but
again these are all easily solvable problems, and they are also
hypothetical problems.
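As a rough sketch of the stub approach, again in Python rather than the
bash an ebuild would actually use: the @PV@ placeholder, the stub
contents and both helper names are hypothetical. The only outside fact
relied on is the usual DKMS convention of keeping module sources under
/usr/src/<name>-<version>/; where the eclass actually installs the file
is not shown here.

```python
# Hypothetical stub as it might be kept in FILESDIR, with a version placeholder.
STUB = """PACKAGE_NAME="example-module"
PACKAGE_VERSION="@PV@"
BUILT_MODULE_NAME[0]="example"
DEST_MODULE_LOCATION[0]="/extra"
AUTOINSTALL="yes"
"""


def fill_stub(stub, pv, placeholder="@PV@"):
    """Substitute the package version into the stub (the sed step)."""
    if placeholder not in stub:
        raise ValueError(f"stub does not contain {placeholder}")
    return stub.replace(placeholder, pv)


def dkms_conf_path(name, version):
    """Conventional DKMS source location for a given module and version."""
    return f"/usr/src/{name}-{version}/dkms.conf"


if __name__ == "__main__":
    print(dkms_conf_path("example-module", "1.2.3"))
    print(fill_stub(STUB, "1.2.3"))
```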
In the end this eclass does not really rely on the specifics of
linux-mod-r1 more than a consumer ebuild does. We rely on linux-mod-r1
setting the MODULES_MAKEARGS, we rely on linux-mod-r1 to process the
modlist and set the default values there (I split this into a separate
function to avoid code duplication), and we rely on the
modules_process_dracut.conf.d function (again, just to avoid code
duplication). And that's it. Now I could drop the linux-mod-r1 commits
that split out this processing of the modlist and make the
modules_process_dracut.conf.d function public. But we gain nothing from
this since the ebuilds already rely on linux-mod-r1 doing what it does
in this area in exactly the same way that the eclass does, it only
results in some code duplication.
Best regards,
Nowa