Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)

Simon Tournier Thu, 06 Apr 2023 01:44:29 -0700

Hi,

On Mon, 03 Apr 2023 at 18:07, Ryan Prior <rpr...@protonmail.com> wrote:


> Hi there FSF Licensing! (CC: Guix devel, Nicholas Graves) This morning
> I read through the FSDG to see if it gives any guidance on when
> machine learning model weights are appropriate for inclusion in a free
> system. It does not seem to offer much. 

Years ago, I asked the FSF and Stallman how to deal with that and I
never got an answer back.  Anyway! :-)

Debian folks discussed this topic [1,2] but I do not know if they have
an “official” policy.

I remember we discussed a similar topic on guix-devel or guix-patches
some years ago, but I cannot find the thread.

For what my opinion is worth, I think that machine learning model
weights should be considered as any other data (images, text files,
translated strings, etc.) and thus they are appropriate for inclusion
or not depending on whether their license is compliant.

Since it is computing, we could ask about the bootstrap of such
generated data.  I think it is a slippery slope because re-training is
simply not affordable in many cases: (1) from a practical point of view,
we would not have the hardware resources, and (2) it is almost
impossible to tackle the sources of nondeterminism (the optimization is
too entangled with randomness).  From my point of view, pre-trained
weights should be considered as the output of a (numerical) experiment,
just as we include other experimental data (from genome to astronomy
datasets).

1: https://salsa.debian.org/deeplearning-team/ml-policy
2: https://people.debian.org/~lumin/debian-dl.html


Cheers,
simon
