Sam Hartman wrote:
> TL;DR: I think it is important for Debian to consider AI models free
> even if those models are based on models that do not release their
> training data. In the terms of the DFSG, I think that a model itself
> is often a preferred form of modification for creating derived works.

Hello Sam,

as someone not at all familiar with how AI works, could you please
explain in more detail what a "model" is (and what a "model based on a
model" is)?

As far as I understand, a "model" is a binary blob that is the output
of the process of "training" the AI, while the "training dataset" is
the input of that process (and is usually much larger). The model is
then used as input for the process of "using" the AI, e.g., asking
questions and getting answers; the AI isn't functional without a model.

So, with traditional software there are the source and the binary. To
*run* the software one only needs the binary, while to exercise the
"four freedoms" one needs the source.

With AI there are the source, the binary, the training dataset and the
model. To run the software one needs the binary *and* the model, while
to exercise the four freedoms one needs the source and may or may not
need the training dataset -- you're arguing that one does not
necessarily need it.

Is my understanding correct?

> I don't really care whether base models are considered free or
> non-free. I don't think it will be important for Debian to include
> base-models in our archive.

Well -- as I read it, the goal of the proposed GR is to decide whether
an AI without training data is free or non-free (i.e., which section of
the archive it should go in). So your concern seems rather orthogonal
to that.

My concern is that, if an AI is included in Debian (in whichever
section of the archive), I can install the Debian package and have it
work out of the box, without having to download additional stuff from
untrusted websites. That also seems rather orthogonal to the proposed
GR, but I wanted to mention it nonetheless.

Gerardo