On Fri, Mar 1, 2024 at 4:52 PM Tim Flink <tfl...@fedoraproject.org> wrote:
>
> On 2/28/24 19:03, Richard Fontana wrote:
> > On Tue, Feb 27, 2024 at 5:58 PM Tim Flink <tfl...@fedoraproject.org> wrote:
> >>
> >>
> >>
> >> On 2/26/24 19:06, Richard Fontana wrote:
> >>
> >> <snip>
> >>
> >>>> 4. Is it acceptable to package code which downloads pre-trained weights 
> >>>> from a non-Fedora source upon first use post-installation by a user if 
> >>>> that model and its associated weights are
> >>>>       a. For a specific model?
> >
> > What do you mean by "upon first use post-installation"? Does that mean
> > I install the package, and the first time I launch it or whatever, it
> > automatically downloads some set of pre-trained weights, or is this
> > something that would be controlled by the user? The example you gave
> > suggests the latter but I wasn't sure if I was misunderstanding.
>
> Once the package is installed, pre-trained weights would be downloaded if and 
> only if code written to use a specific model with pre-trained weights is run. 
> In the cases I'm aware of, code that would cause the weights to be downloaded 
> is not directly part of the packaged libraries and anything that could 
> trigger the downloading of pre-trained weights would have to be written by a 
> user or contained in a separate package. If a specific model with pre-trained 
> weights is not used and not executed by another library/application, the 
> weights will not be downloaded. With the ViT example, the vitb16 weights 
> would be downloaded when that code (not included in the package) is run but 
> the vitb32 weights would not be downloaded unless the example was changed or 
> something else specified a pre-trained ViT model with the vitb32 weights. 
> Similarly, the weights for other models (googlenet, as an example) would not 
> be downloaded unless code that uses that specific model in its pre-trained 
> form is executed post-installation.
>
> The implementations that I'm familiar with will check for downloaded weights 
> as the code is initialized. When done in this way, the download is 
> transparent to the user, and unless code using these models/weights is written 
> in such a way that the user has a choice, there is not much a user could do to 
> change the download URL or prevent the weights from being downloaded. The 
> only ways I can think of offhand would be to modify the underlying libraries 
> to override the hard-coded URLs or maybe put identically named files in the 
> cache location, but that would end up being dependent on model implementation. 
> For the specific libraries I used as examples, I don't know what the local 
> download folder is off the top of my head, nor do I know if they do any 
> verification of downloads, so putting files into the cache location may not 
> work if they don't match the intended file contents.
>
> This is just my opinion but I doubt that many people writing code that uses 
> pre-trained models are going to go out of their way to help users avoid 
> downloading pre-trained weights. I know that for code that I've written using 
> pre-trained models, it might be able to execute without the pre-trained 
> weights but the output would just be noise in that situation. I would have a 
> hard time justifying the work needed to make those downloads optional since 
> it would make the code useless for what it was intended to do.
>
> It may also be worth noting that some models with pre-trained weights are 
> almost useless without those weights. For some (mostly older) models, it's 
> feasible to train a model from scratch but for many of the recent models, 
> it's just not feasible. As an example, the weights for Meta's Llama 2 took 
> 3.3 million hours of GPU time to train [1], at a cost running into the 
> millions of USD, before counting what it would take to obtain enough data to 
> train a model that large.
>
> Apologies for my verbosity but I hope that I answered your question and the 
> extra bits weren't entirely useless.
>

This sounds like it falls in the same bucket as pip, snapd, gem, and
other similar "package manager" functionality.
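
For concreteness, the check-the-cache-then-download behavior described in the
quoted text could be sketched roughly as below. This is only an illustration
under stated assumptions, not the code of any particular library: the
WEIGHT_URLS registry, the example.invalid URLs, the ".bin" file naming, and
the ensure_weights() helper are all hypothetical, and the fetch callable
stands in for whatever HTTP client a real implementation would use.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: model name -> hard-coded download URL (placeholders).
WEIGHT_URLS: Dict[str, str] = {
    "vitb16": "https://example.invalid/weights/vitb16.bin",
    "vitb32": "https://example.invalid/weights/vitb32.bin",
}


def ensure_weights(name: str, cache_dir: Path,
                   fetch: Callable[[str], bytes]) -> Path:
    """Return the cached weight file for `name`, downloading it only if absent."""
    target = cache_dir / f"{name}.bin"
    if target.exists():
        # Cache hit: no network access happens. A same-named file placed here
        # by hand would be accepted, because this sketch verifies no checksum.
        return target
    # Cache miss: fetch from the hard-coded URL and store the bytes locally.
    cache_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch(WEIGHT_URLS[name]))
    return target
```

In this sketch nothing is fetched at install time. Only code that actually
asks for "vitb16" triggers that one download, and the "vitb32" entry is never
touched unless something requests it by name. An implementation that verified
a checksum on the cached file would reject a hand-placed substitute, which is
the uncertainty noted in the quoted text.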



-- 
真実はいつも一つ!/ Always, there's only one truth!
--
_______________________________________________
legal mailing list -- legal@lists.fedoraproject.org
To unsubscribe send an email to legal-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/legal@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue
