Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)

zamfofex Sun, 28 May 2023 20:58:23 -0700

> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.
> 
> As a project, we don’t have guidelines about this though.  I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.


I feel like it’s important to have a guideline for this, at least if the issue 
keeps recurring.

To me, a sensible *base criterion* is whether the user is able to practically 
produce their own networks (either from scratch, or by using existing 
networks) using free software alone. I feel like this solves the issue of user 
autonomy being at risk.

By “practically produce”, I mean within reasonable time (somewhere between a 
few minutes and a few weeks depending on the scope) and using exclusively 
hardware they physically own (assuming they own reasonably recent hardware to 
run Guix, at least).

The effect is that the user shouldn’t be bound to the provided networks, and 
should be able to train their own for their own purposes if they so choose, 
even if using the existing networks during that training. (And in the context 
of Guix, the neural network needs to be packaged for the user to be able to use 
it that way.)

Regarding Lc0 specifically, that is already possible! The Lc0 project has a 
training client that can use existing networks and a set of configurations to 
train your own special‐purpose network. (And although this client supports 
proprietary software, it is able to run using exclusively free software too.) 
In fact, there are already community‐provided networks for Lc0[1], which 
sometimes can play even more accurately than the official ones (or otherwise 
play differently in various specialised ways).

Of course, this might seem very dissatisfying in the same way as providing 
binary seeds for software programs is, in the sense that you require an 
existing network to further train networks, rather than being able to start a 
network from scratch (in this case). But I feel like (at least under my “base 
criterion”), the effects of this on the user are not as significant, since the 
effects of the networks are limited compared to those of actual programs.

In the sense that, even though you might want to argue that “the network 
affects the behavior of the program using it” in the same way as “a Python 
source file affects the behavior of its interpreter”, the effect of the network 
file for the program is limited compared to that of a Python program. It’s much 
more like how an image would affect the behavior of the program 
displaying it. More concretely, there isn’t a trust issue to be solved, because 
the network doesn’t have as many capabilities (theoretical or practical) as a 
program does.

I say “practical capabilities” in the sense of being able to access user 
resources and data for purposes they don’t want. (E.g. by accessing/modifying 
their files, sharing their data through the Internet without their 
acknowledgement, etc.)

I say “theoretical capabilities” in the sense of doing things the user doesn’t 
want or expect, i.e. thinking about using computations as a tool for some 
purpose. (E.g. even sandboxed/containerised programs can be harmful, because 
the program could behave in a way the user doesn’t want without letting the 
user do something about it.)

The only autonomy‐disrespecting (or perhaps rather freedom‐disrespecting) issue 
is when the user is stuck with the provided network, and doesn’t have any tools 
to (practically) change how the program behaves by creating a different network 
that suits their needs. (Which is what my “base criterion” tries to defend 
against.) This is not the case with Lc0, as I said.

Finally, I will also note that, in addition to the aforementioned[2] fact that 
Stockfish (already packaged) does use pre‐trained neural networks too, the 
latest versions of Stockfish (from 14 onward) use neural networks that have 
themselves been indirectly trained using the networks from the Lc0 project.[3]

[1]: See <https://lczero.org/play/networks/basics/#training-data>
[2]: It was mentioned in <https://issues.guix.gnu.org/63088>
[3]: See <https://stockfishchess.org/blog/2021/stockfish-14/>
