Hi Stefano

On 2024/10/29 13:03, Stefano Zacchiroli wrote:
> On Mon, Oct 28, 2024 at 09:53:31PM +0200, Jonathan Carter wrote:
>> The companies [...] want to restrict what you can actually use it
>> for, and call it open source? And then OSI makes a definition that
>> seems carefully crafted to let these kind of licenses slip through?
>
> The licensing terms for the Meta Llama models are indeed horrific, but I
> don't understand your point here. In order to be OSAID compliant, Meta
> will precisely have to change those licensing terms and make them
> DFSG-compliant. That would be a *good* thing for the world and would fix
> the main thing you are upset about.

Unfortunately that's not the case. Meta won't have to make Llama3 DFSG-compliant in order to be OSAID compliant, since the OSAID is not as robust as the OSD.

The OSAID has no provision for explicit free (re)distribution like OSD #1 has, so Meta could continue to require license fees over a certain user count and still claim to be OSAID compliant. They could also keep the clause that makes Llama3's license non-transferable (contrary to OSD #7), which, as far as I understand, is there to prevent forks from happening, and this too would be OSAID compliant.

Llama3's license is particularly dodgy, but it's not unique in the AI space, and I can assure you that even if someone out there is convinced to do the minimum that the OSAID requires, the result might still be a far cry from DFSG-free. For that reason, we as Debian should absolutely not endorse it, imho.

> And Meta is not liking that idea. Meta is, right now, lobbying EU
> regulators to convince them that what should count as "open source AI"
> for the purposes of the EU AI Act is their (Meta's) version, rather than
> OSAID.
>
> I have personally fought (and lost) during the OSAID definition process
> to make access to training data mandatory in the definition. So while
> I'm certainly not against criticizing OSAID, we should do that for the
> right reasons.

What is the OSI's motivation for creating such an incredibly lax definition for open source AI? Meta is already calling their absolutely-not-open-source model Open Source and promoting it as such, without so much as a *peep* from the OSI condemning the abuse of the term. (Although, while doing a quick search to make sure that's true, I found this link from the OSI to an article that keeps insisting that Llama3 is open source: https://opensource.org/press-mentions/meta-inches-toward-open-source-ai-with-new-llama-3-1)

If they're not even going to defend the one definition that they're supposed to be the stewards of, what do you think will happen when they have an additional, significantly looser, much more lax definition that is open to many more kinds of abuse?

I don't need to fast-forward to the next episode or the next season to predict what's going to happen:

* It will be bad for users in terms of what they can do with what they consider to be their own devices
* It will be bad for software developers and people who implement software
* It will result in *more* non-DFSG models being released, not fewer (since the creators of these models can now fall back to licenses which are completely non-free but still squeeze by on the OSAID definition)

> PS To make Llama models OSAID-compliant Meta, in addition to (1)
>    changing the model license, will also have to: (2) provide "a listing
>    of all publicly available training data and where to obtain it", and
>    (3) release under DFSG-compatible terms their entire training
>    pipeline (currently unreleased). I don't think they will ever get
>    there. But if they do, these would also be good things for the world.
>    Not *as good* as having access to the entire training dataset, but
>    good nonetheless.
Again, the OSAID doesn't particularly care about DFSG compatibility, so I'm not sure where point (3) comes in here, but if there's something obvious I missed, I'm all ears.

-Jonathan
