Hi,

On Sun, 02 Jul 2023 at 21:51, Ludovic Courtès <l...@gnu.org> wrote:

> Someone™ has to invest time in studying this specific case, look at what
> others like Debian are doing, and seek consensus on a way forward.

Hum, I am probably not this Someone™, but here is the result of my look. :-)

First, please note that the Debian thread [1] is about

    Concerns to software freedom when packaging deep-learning based
    appications

and not specifically about Leela Zero.  The thread asked the general
question while keeping in mind the packaging of leela-zero [2], which is
now available in Debian [3].  Back in 2019, patch#36071 [4] also
introduced leela-zero in Guix.  The Machine Learning issues that this
package raises have not gone away, although this specific package has
been included, because we still lack guidelines. :-)

The Debian package description [3] reads:

    Recomputing the AlphaGo Zero weights will take about 1700 years on
    commodity hardware.  Upstream is running a public, distributed
    effort to repeat this work.  Working together, and especially when
    starting on a smaller scale, it will take less than 1700 years to
    get a good network (which you can feed into this program, suddenly
    making it strong).  To help with this effort, run the leelaz-autogtp
    binary provided in this package.  The best-known network weights
    file is at http://zero.sjeng.org/best-network

For instance, this message [5] from patch#36071,

    We need to ensure that the software necessary to train the networks
    is included.  Is this the case?  Back to this patch: I think it’s
    fine to accept it as long as the software necessary for training is
    included.

would suggest that the training software should be part of the package
for it to be included, even though recomputing the training would not be
affordable.

Well, applying this “criterion”, GNU Backgammon (package gnubg, included
for a while now) should be rejected, since there is no training software
for its neural network weights.

For the inclusion of leela-zero in Guix, the argument comes from [6],
quoting [7]:

    So this is basically a distributed computing client as well as a Go
    engine that runs with the results of that distributed computing
    effort.  If that's true, there is no separate ‘training software’ to
    worry about.

which draws the line: free client vs. the “database”, as pointed out
in [8].

Somehow, we have to distinguish cases depending on the weights.  If the
weights are clearly separated from the code, as with leela-zero, then
the code (the neural network) itself can be included.  Otherwise, if the
weights are tangled with the code, then we distribute them only if their
license complies with the FSDG, as for any other data; this is the case
of GNU Backgammon, IIUC.

Well, I do not see any difference between pre-trained weights and icons,
sounds or well-fitted parameters (e.g., the package python-scikit-learn
has a lot ;-)).  As I said elsewhere, I do not see the difference
between pre-trained neural network weights and genomic references (e.g.,
the package r-bsgenome-hsapiens-1000genomes-hs37d5).  The only question
for inclusion or not is about the license, IMHO.

For sure, it is far better if we are able to recompute the weights.
However, just as debootstrapping is a strong recommendation, the ability
to recompute the pre-trained neural network weights must be just a
recommendation.

Please note this message [11] from Nicolas about VOSK models and
patch#62443 [12], already merged; the weights are separated, as with the
package leela-zero.

All that said.

Second, please note that the Debian thread dates from 2018, as does the
LWN article [13]; I am not aware of anything new since then.

Third, I have never read anything on this topic produced by GNU or
related projects; and the fact that GNU Backgammon [14] distributes the
weights without a way to reproduce them draws one line.

Fourth, we (at least I) are still waiting for an answer from
licens...@fsf.org.  On the FSF side, I am only aware of this [15] and of
these white papers [16] about the very specific case of Copilot.

On the Debian side, I am only aware of [17,18]:

    Unofficial Policy for Debian & Machine Learning

1: https://lists.debian.org/debian-devel/2018/07/msg00153.html
2: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=903634
3: https://packages.debian.org/trixie/leela-zero
4: https://issues.guix.gnu.org/36071
5: https://issues.guix.gnu.org/36071#5
6: https://issues.guix.gnu.org/36071#6
7: https://lwn.net/Articles/760483/
8: https://issues.guix.gnu.org/36071#4-lineno34
9: https://yhetil.org/guix/87v8gtzvu3....@gmail.com
11: https://yhetil.org/guix/87jzyshpyr....@ngraves.fr
12: https://issues.guix.gnu.org/62443
13: https://lwn.net/Articles/760142/
14: https://www.gnu.org/software/gnubg/
15: https://www.fsf.org/bulletin/2022/spring/unjust-algorithms/
16: https://www.fsf.org/news/publication-of-the-fsf-funded-white-papers-on-questions-around-copilot
17: https://salsa.debian.org/deeplearning-team/ml-policy
18: https://people.debian.org/~lumin/debian-dl.html

Cheers,
simon