Keywords: GPU computing support, AI applications & ML-Policy.

Deep learning is a new area. As our past discussions have already noted, it raises many new questions for Debian. For example, new AI applications may even challenge the definition of free software. In this article I share my latest thoughts on related topics across multiple domains, revisit some of my past forecasts, and offer some relevant development advice.
Note that the whole article conveys only my own personal opinion, and does not represent any official opinion of the Debian Project.

# Debian's GPU computing support -- how much should we do? ####################

The recent success of deep learning partly depends on the development of GPUs, which can compute matrix multiplications hundreds of times faster than a CPU. GPU computation is therefore very valuable, and intuitively, supporting it as much as we can from the Debian side is useful and valuable as well. Because of software license issues with a certain vendor, I have long been looking for the boundary -- how much should we do to support a given type of GPU computation?

Now I have finally figured out my own answer. Debian is merely a _downstream_ when it comes to providing GPU support to end users. As long as the upstream is willing to give us a chance (legally) and is easy to cooperate with, we can support it. Otherwise we will soon reach a dead end, unsurprisingly.

I have had discussions with several fellow developers about suggesting that Debian buy some GPUs to extend its infrastructure for better GPU support. The plan to put those ideas before a larger audience inside Debian has been postponed indefinitely, because we know the requirement for a non-free driver (there is no free alternative) would be a big problem. Although my initial thought was to make Debian useful in more areas such as GPU computing, I finally realized that by accepting new non-free blobs as an organization, we would be further losing the core value written on our homepage -- "a complete free operating system".

My conclusion is: "Users with special demands can take care of themselves, as we are unable to go far on our own."

In terms of GPU computing, Debian provides a great system as a foundation for development and applications. Of course, deep learning frameworks are regular software that we are already familiar enough with.
Their GPU support simply depends on whether the necessary drivers and libraries are maintained in Debian.

# AI Applications & ML-Policy #################################################

I predict that the ML-Policy [1] will work as a warning about potential issues rather than as practical guidance on packaging, because there are (and will be) long-standing, hard-to-overcome issues that make our packages not really useful without external components. Throughout the whole ML-Policy, I think the most valuable warning is the definition of the "ToxicCandy Model", which identifies a software freedom trap for any developer interested in AI software.

Cool and useful stuff keeps emerging -- e.g., Facial Authentication for Linux
https://github.com/boltgolt/howdy
and it depends on some pre-trained models (licence: CC0-1.0):
https://github.com/davisking/dlib-models

People may still remember the past discussions on ML-Policy. When we treat pre-trained models as something like a picture or a song, they may enter our main archive. But when we try to exercise software freedom, things go wrong. For example, we can study a painting or a song and analyze it to learn something, but this does not work for pre-trained models. Without the training data there is not much of a way to study, learn from, or reproduce a pre-trained model. By the definition in ML-Policy, the model mentioned above is a ToxicCandy model.

By my interpretation, this means Debian may have to step aside from the world of AI applications in order to fully exercise software freedom. It is a pity, but Debian's major role in the whole picture is to be a solid system.

Workarounds are possible, for example the past "Debian User Package Repository" idea: distribute only the package building scripts to end users, so that they can build the corresponding packages locally. In this way the license and software freedom issues are bypassed, because the user has decided to accept them.
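The build-scripts-only workaround can be sketched roughly as follows. This is only a minimal illustration under my own assumptions, not how any existing tool works: the `Recipe` class, the `plan_local_build` helper, and the example.org URL are all hypothetical. The sketch merely plans the commands a user would run and refuses to do so until the user has explicitly accepted the known issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical description of a package shipped as build scripts only."""
    name: str
    recipe_url: str    # assumed location of the packaging scripts
    known_issues: str  # e.g. "needs a pre-trained model without training data"


def plan_local_build(recipe: Recipe, user_accepts_issues: bool) -> list[list[str]]:
    """Return the commands a user would run to build the package locally."""
    if not user_accepts_issues:
        # The decision stays with the user; nothing is planned without consent.
        raise PermissionError(
            f"{recipe.name}: user has not accepted: {recipe.known_issues}"
        )
    return [
        # Fetch only the packaging scripts (hypothetical repository).
        ["git", "clone", recipe.recipe_url, recipe.name],
        # Unsigned, binary-only build, to be run inside the cloned directory.
        ["dpkg-buildpackage", "-us", "-uc", "-b"],
    ]


if __name__ == "__main__":
    demo = Recipe(
        name="example-ai-app",
        recipe_url="https://example.org/recipes/example-ai-app.git",
        known_issues="depends on a pre-trained model without training data",
    )
    for command in plan_local_build(demo, user_accepts_issues=True):
        print(" ".join(command))
```

The point of the design is that the distributor ships no binaries and no models, only the recipe, so the choice to accept the license and software freedom issues is made on the user's machine.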
On the other hand, I'd advise people who want to package interesting AI applications to carefully evaluate whether the software is mature enough -- and never to package a purely academic research project. This is largely because our development cycle is much slower than the pace of change in the deep learning field. Something better may appear before the package clears our NEW queue... As for AI applications that require considerable computing power (GPU), the answer is rather distinct.

[1] https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/master/ML-Policy.rst

# Concluding Remarks

We maintain and provide a free operating system, and we value software freedom. My contribution here is to share my understanding of the boundary between what we can and cannot do with respect to a new and interesting area. At the very least, I learned a lot while thinking about this, and gained a deeper understanding of "what Debian is".

Debian is wonderful because it is one of the few places on earth where people will shout when software freedom is potentially infringed. Indeed, Debian must hold a unique place in the mind of every long-term member of the project.

Thank you for the excellent system, fellow developers.