Sam Hartman wrote:
> TL;DR: I think it is important for Debian to consider AI models free
> even if those models are based on models that do not release their
> training data. In the terms of the DFSG, I think that a model itself is
> often a preferred form of modification for creating derived works.

Hello Sam,
as someone not at all familiar with how AI works, could you please
explain in more detail what a "model" is (and what a "model based on a
model" is)?

My understanding is that a "model" is a binary blob that is the
output of the process of "training" the AI, while the "training
dataset" is the input of that process (and is usually much larger).
The model then is used as input for the process of "using" the AI,
e.g., asking questions and getting answers; the AI isn't functional
without a model.
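
To make that distinction concrete, here is a minimal toy sketch in
Python. The names `train` and `predict` and the data are made up for
illustration and are not taken from any real AI framework; a real model
has millions or billions of parameters rather than two. It only shows
the shape of the process: training turns a dataset into a set of
numbers (the "model"), and using the model needs only those numbers.

```python
# Toy illustration only: "train" and "predict" are hypothetical names,
# and the dataset is invented. Real training is vastly more complex.

def train(dataset):
    """Fit y = a*x + b by ordinary least squares.

    The returned pair (a, b) plays the role of the "model":
    it is the output of training, and much smaller than the input.
    """
    n = len(dataset)
    mean_x = sum(x for x, _ in dataset) / n
    mean_y = sum(y for _, y in dataset) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in dataset)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in dataset)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return (a, b)


def predict(model, x):
    """"Use" the model: only the model is needed, not the dataset."""
    a, b = model
    return a * x + b


# Training: dataset in, model out.
training_dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
model = train(training_dataset)

# The dataset can be discarded; the model still answers questions.
del training_dataset
answer = predict(model, 10.0)
print(model, answer)
```

In this sketch the model alone is enough to *run* predictions. Whether
the model alone is also enough to modify the system is the question
under discussion.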

So, with traditional software there are the source and the binary. To
*run* the software one only needs the binary, while to exercise the
"four freedoms" one needs the source. With AI there are the source,
the binary, the training dataset and the model. To run the software
one needs the binary *and* the model, while to exercise the four
freedoms one needs the source and may or may not need the training
dataset -- you're arguing that one may not necessarily need it.

Is my understanding correct?

> I don't really care whether base models are
> considered free or non-free. I don't think it will be important for
> Debian to include base-models in our archive.

Well -- as I read it, the goal of the proposed GR is to decide whether
an AI without training data is free or non-free (i.e., in which
section of the archive it should go). So your concern seems rather
orthogonal to that.

My concern is that, if an AI is included in Debian (in whichever
section of the archive), I should be able to install the Debian package
and have it work out of the box, without having to download additional
material from untrusted websites. That also seems rather orthogonal to
the proposed GR, but I wanted to mention it nonetheless.

Gerardo
