[Fedora-legal-list] AI/ML Model and Pre-Trained Weight Packaging in Fedora

Tim Flink Mon, 26 Feb 2024 15:33:06 -0800

These questions came up in a FESCo ticket [1] recently and the primary purpose 
of this thread is to have some public record of the conversation around the 
handling of pre-trained weights for AI/ML models as packaged for Fedora.


[1] https://pagure.io/fesco/issue/3175

Intro and Definitions
=====================

Previous conversations have involved a decent amount of confusion around 
terminology. I want to be clear about what I'm asking, so I'm starting with a 
few definitions in the context of my questions.

Artificial Neural Network (ANN) - effectively structured data consisting of 
neurons (nodes containing some value) organized into layers, with connections 
between the neurons which control the flow of data through the entire network. 
The exact values that determine how the connections affect that flow are found 
through the training process and are generally referred to as weights.

Model - A model by itself is a description of a specific ANN - how layers are 
configured, how they interact with each other, how model training is done, how 
data needs to be structured for using a trained model and so on. A model by 
itself is rarely, if ever, useful. Models generally need to be trained on data 
before they can be used, but many models offer a mechanism through which 
weights can be loaded from a model which has already been trained. A model 
with neither pre-trained weights nor local training is pretty much just code.

Pre-Trained Weights - Pre-trained weights are essentially the data contained in 
a model after training the model on some input data. Training modern ANN models 
is a very expensive and time-consuming process; pre-trained weights allow 
people to use models without having to train the model locally or even have 
access to the data needed to train the model.
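
To make the distinction between a model and its weights concrete, here is a 
minimal sketch in Python. Everything in it is an illustrative assumption: the 
tiny two-layer network, the name `forward`, the dictionary keys and the weight 
values are made up and do not come from any real upstream. The function is the 
"model" (code); the dictionary is the "pre-trained weights" (numeric data that 
can be serialized and shipped separately from the code).

```python
import json
import math


def forward(weights, inputs):
    """The 'model': code describing a tiny network with one hidden layer."""
    # Each hidden neuron sums its weighted inputs, adds a bias, applies tanh.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights["hidden_w"], weights["hidden_b"])
    ]
    # The single output neuron is a weighted sum of the hidden values.
    return sum(w * h for w, h in zip(weights["out_w"], hidden)) + weights["out_b"]


# The 'pre-trained weights': plain numbers. These values are invented for the
# example; real weights would be the result of a training run.
pretrained = {
    "hidden_w": [[0.5, -0.25], [0.1, 0.3]],
    "hidden_b": [0.0, 0.1],
    "out_w": [1.0, -1.0],
    "out_b": 0.2,
}

# Weights can be stored and loaded independently of the model code.
serialized = json.dumps(pretrained)
loaded = json.loads(serialized)

result = forward(loaded, [1.0, 2.0])
```

Without the dictionary of weights, `forward` cannot produce a meaningful 
result, which is the sense in which an untrained model is "pretty much code".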



Questions
=========

1. Are pre-trained weights considered to be normal non-code content/data or do 
they require special handling?

2. If an upstream offers pre-trained weights and indicates that those weights 
are available under a license which is acceptable for non-code content in 
Fedora, can those pre-trained weights be included in Fedora packages?

3. Extending question 2, is it considered sufficient for an upstream to have a 
license on pre-trained weights or would a packager/reviewer need to verify that 
the data used to train those weights is acceptable?

4. Is it acceptable to package code which downloads pre-trained weights from a 
non-Fedora source upon first use post-installation by a user, if the download 
is
   a. For a specific model and its associated weights?
   b. For a user-defined model, which may or may not exist at the time of 
packaging?
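
The two cases in question 4 can be sketched as follows. This is a hedged 
illustration only, not how any particular upstream does it: the URLs under 
`example.invalid`, the names `weights_url` and `ensure_weights`, and the cache 
layout are all hypothetical. The network call is replaced by a stand-in 
`fake_fetch` so the sketch runs offline; real code might use something like 
`urllib.request.urlopen` instead.

```python
import tempfile
from pathlib import Path

# Case (a): the packaged code knows exactly one model. Hypothetical URL.
FIXED_WEIGHTS_URL = "https://example.invalid/models/fixed-model/weights.bin"


def weights_url(model_name=None):
    """Return the fixed URL (case a) or one built from a user's choice (case b)."""
    if model_name is None:
        return FIXED_WEIGHTS_URL
    # Case (b): the model is named by the user at run time, so the packager
    # cannot know in advance what will be downloaded.
    return f"https://example.invalid/models/{model_name}/weights.bin"


def ensure_weights(cache_dir, fetch, model_name=None):
    """Download weights on first use only; later calls reuse the cached file."""
    target = Path(cache_dir) / (model_name or "fixed-model") / "weights.bin"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(weights_url(model_name)))
    return target


calls = []


def fake_fetch(url):
    """Stand-in for a real download; records the URL and returns dummy bytes."""
    calls.append(url)
    return b"\x00" * 8


cache = tempfile.mkdtemp()
first = ensure_weights(cache, fake_fetch)                 # case (a), downloads
second = ensure_weights(cache, fake_fetch)                # case (a), cached
custom = ensure_weights(cache, fake_fetch, "user-model")  # case (b), downloads
```

In case (a) the source of the weights is fixed at packaging time and could be 
reviewed; in case (b) it depends entirely on what the user asks for.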



I can provide examples of any of these situations if that would be helpful.

Thanks,

Tim
