On Fri, Mar 1, 2024 at 4:52 PM Tim Flink <tfl...@fedoraproject.org> wrote:
>
> On 2/28/24 19:03, Richard Fontana wrote:
> > On Tue, Feb 27, 2024 at 5:58 PM Tim Flink <tfl...@fedoraproject.org> wrote:
> >>
> >> On 2/26/24 19:06, Richard Fontana wrote:
> >>
> >> <snip>
> >>
> >>>> 4. Is it acceptable to package code which downloads pre-trained weights
> >>>> from a non-Fedora source upon first use post-installation by a user if
> >>>> that model and its associated weights are
> >>>> a. For a specific model?
> >
> > What do you mean by "upon first use post-installation"? Does that mean
> > I install the package, and the first time I launch it or whatever, it
> > automatically downloads some set of pre-trained weights, or is this
> > something that would be controlled by the user? The example you gave
> > suggests the latter but I wasn't sure if I was misunderstanding.
>
> Once the package is installed, pre-trained weights would be downloaded if and
> only if code written to use a specific model with pre-trained weights is run.
> In the cases I'm aware of, code that would cause the weights to be downloaded
> is not directly part of the packaged libraries and anything that could
> trigger the downloading of pre-trained weights would have to be written by a
> user or contained in a separate package. If a specific model with pre-trained
> weights is not used and not executed by another library/application, the
> weights will not be downloaded. With the ViT example, the vitb16 weights
> would be downloaded when that code (not included in the package) is run but
> the vitb32 weights would not be downloaded unless the example was changed or
> something else specified a pre-trained ViT model with the vitb32 weights.
> Similarly, the weights for other models (googlenet, as an example) would not
> be downloaded unless code that uses that specific model in its pre-trained
> form is executed post-installation.
>
> The implementations that I'm familiar with will check for downloaded weights
> as the code is initialized. When done in this way, the download is
> transparent to the user and unless code using these models/weights is written
> in such a way that the user has a choice, there is not much a user could do to
> change the download URL or prevent the weights from being downloaded. The
> only ways I can think of offhand would be to modify the underlying libraries
> to override the hard-coded URLs or maybe put identically named files in the
> cache location but that would end up being dependent on model implementation.
> For the specific libraries I used as examples, I don't know what the local
> download folder is off the top of my head, nor do I know if they do any
> verification of downloads so putting files into the cached location may not
> work if they don't match the intended file contents.
>
> This is just my opinion but I doubt that many people writing code that uses
> pre-trained models are going to go out of their way to help users avoid
> downloading pre-trained weights. I know that for code that I've written using
> pre-trained models, it might be able to execute without the pre-trained
> weights but the output would just be noise in that situation. I would have a
> hard time justifying the work needed to make those downloads optional since
> it would make the code useless for what it was intended to do.
>
> It may also be worth noting that some models with pre-trained weights are
> almost useless without those weights. For some (mostly older) models, it's
> feasible to train a model from scratch but for many of the recent models,
> it's just not feasible. As an example, the weights for Meta's Llama 2 took
> 3.3 million hours of GPU time to train [1], at a cost in the millions of
> USD, ignoring what it would take to obtain enough data to train a model that
> large.
>
> Apologies for my verbosity but I hope that I answered your question and the
> extra bits weren't entirely useless.
>
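
To check that I'm reading the description above correctly, here is a rough
sketch in Python of the "check the cache at initialization, download only if
missing" pattern. It is not the code of any specific library: the names
ensure_weights and fetch_over_http, the cache layout, the optional checksum
and the placeholder URL are all assumptions made up for illustration.

```python
import hashlib
import tempfile
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def fetch_over_http(url: str) -> bytes:
    """Default fetcher: a plain HTTP(S) GET using the standard library."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def ensure_weights(
    filename: str,
    url: str,
    cache_dir: Path,
    sha256: Optional[str] = None,
    fetch: Callable[[str], bytes] = fetch_over_http,
) -> Path:
    """Return the cached weights file, downloading it only if it is missing.

    Hypothetical helper: real libraries differ in cache location and in
    whether they verify what they download.
    """
    target = cache_dir / filename
    if target.exists():
        # Cache hit: nothing is downloaded and the URL is never contacted.
        return target

    data = fetch(url)
    # Optional integrity check (an assumption, not a documented behaviour).
    if sha256 is not None and hashlib.sha256(data).hexdigest() != sha256:
        raise ValueError(f"checksum mismatch for {filename}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


if __name__ == "__main__":
    # Offline demonstration: a fake fetcher stands in for the network.
    calls = []

    def fake_fetch(url: str) -> bytes:
        calls.append(url)
        return b"not real weights"

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "checkpoints"
        url = "https://example.invalid/vit_b_16.bin"  # placeholder URL
        ensure_weights("vit_b_16.bin", url, cache, fetch=fake_fetch)
        ensure_weights("vit_b_16.bin", url, cache, fetch=fake_fetch)
        print(len(calls))  # the second call is served from the cache
```

In this sketch a file that already sits in the cache directory is accepted
without any check, which corresponds to the "put identically named files in
the cache location" workaround. An implementation that also verified cached
files would reject one whose contents don't match.
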
This sounds like it falls in the same bucket as pip, snapd, gem, and other
similar "package manager" functionality.

-- 
真実はいつも一つ!/ Always, there's only one truth!
_______________________________________________
legal mailing list -- legal@lists.fedoraproject.org