[ reordering quoted text ]

Hello Jonathan,

On Tue, Oct 29, 2024 at 05:45:20PM +0200, Jonathan Carter wrote:
> On 2024/10/29 13:03, Stefano Zacchiroli wrote:
> >
> > To make Llama models OSAID-compliant Meta [...] will also have to:
> > [...] (3) release under DFSG-compatible terms their entire training
> > pipeline (currently unreleased).
>
> Again, the OSAID doesn't particularly care about DFSG-compatible, so
> not sure where point 3 comes in here, but if there's something obvious
> I missed, I'm all ears.

"DFSG-compliant" was a Debian-slip of mine. I meant "OSD-compliant" (the
standard OSD, not OSAID). Sorry about that, but the two definitions are
de facto equivalent for the purpose of our discussion here.

Now, about the code of the training pipeline, OSAID [1] has this to say:

> Code: The complete source code used to train and run the system. The
> Code shall represent the full specification of how the data was
> processed and filtered, and how the training was done. Code shall be
> made available under OSI-approved licenses.

Where "OSI-approved licenses" refers to [2] (sure, an explicit link or
mention would be better, but that is what that expression has always
meant in the context of OSD).

[1]: https://opensource.org/ai/open-source-ai-definition
[2]: https://opensource.org/licenses

> > In order to be OSAID compliant, Meta will precisely have to change
> > those licensing terms and make them DFSG-compliant. That would be a
> > *good* thing for the world and would fix the main thing you are
> > upset about.
> 
> Unfortunately that's not the case. Meta won't have to make Llama3 DFSG
> compliant in order to be OSAID compliant, since OSAID is not as robust as
> the OSD.

That's not-not the case :-). Here is what OSAID says about model
parameters (highlight mine):

> Parameters: The model parameters, such as weights or other
> configuration settings. Parameters shall be made available under
> *OSI-approved terms*.
[...]
> The Open Source AI Definition does not require a specific legal
> mechanism for assuring that the model parameters are *freely available
> to all*. They may be free by their nature or a license or other legal
> instrument may be required to ensure their freedom.

AFAIR, in the early days of the OSAID process, the requirements for the
weights were the same as for the training code, i.e., under an
"OSI-approved license". Then it was pointed out by lawyers and legal
scholars that there is not always an applicable *license* for a matrix
of floats. The weights
might not be protectable by "intellectual property" at all (we don't
know yet), or be in the public domain, or any other number of weird
legal cases. Hence it was not appropriate to use the "OSI-approved
license" expression and OSI picked the alternative expression
"OSI-approved terms". But the intent is that, no matter what legal
regime applies to the weights, they should grant to users the
traditional 4 freedoms, which are defined earlier on in the OSAID.

I agree that it could be better written in the definition, or at least
clarified in the FAQ. But there is no doubt whatsoever that a violation
of any OSD point on the licensing terms (or whatever else applies) of
the model weights would disqualify an AI system from being OSAID-compliant.

Hope this clarifies,
Cheers
-- 
Stefano Zacchiroli . z...@upsilon.cc . https://upsilon.cc/zack  _. ^ ._
Full professor of Computer Science              o     o   o     \/|V|\/
Télécom Paris, Polytechnic Institute of Paris     o     o o    </>   <\>
Co-founder & CSO Software Heritage            o o o     o       /\|^|/\
Mastodon: https://mastodon.xyz/@zacchiro                        '" V "'
