Nathan Dehnel <ncdeh...@gmail.com> writes:
>> a) Bit-identical re-training of ML models is similar to #2; in other
>> words, bit-identical re-training of ML model weights does not protect
>> much against biased training. The only protection against biased
>> training is human expertise.
>
> Yeah, I didn't mean to give the impression that I thought
> bit-reproducibility was the silver bullet for AI backdoors with that
> analogy. I guess my argument is this: if they release the training
> info, either 1) it does not produce the bias/backdoor of the trained
> model, so there's no problem, or 2) it does, in which case an expert
> will be able to look at it and go "wait, that's not right", raise the
> alarm, and it will go public. The expert does not need to be
> affiliated with Guix, but Guix will eventually hear about it. Similar
> to how a normal security vulnerability works.
>
>> b) The resources (human, financial, hardware, etc.) for re-training
>> are, in most cases, not affordable. Not because it would be difficult
>> or because the task is complex; that is covered by point a). It is
>> because the requirements in terms of resources are just too high.
>
> Maybe distributed substitutes could change that equation?

Probably not: that would require distributed *builds*, not distributed
substitutes. Right now Guix can't even use distcc, so it definitely
can't use remote GPUs.
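
To be concrete about the distinction (just a sketch; the host name,
user, and key below are placeholders, not a real setup): Guix can
already offload *whole* derivations to other machines via `guix
offload', configured in /etc/guix/machines.scm along these lines:

  ;; /etc/guix/machines.scm -- example offload configuration
  (list (build-machine
          (name "gpu-box.example.org")          ; placeholder host
          (systems (list "x86_64-linux"))       ; architectures it builds
          (user "builder")                      ; SSH user on that host
          (host-key "ssh-ed25519 AAAA...")      ; placeholder host key
          (speed 2.)))                          ; relative scheduling weight

But offloading only ships an entire build to a single remote machine.
Nothing splits one build across machines the way distcc splits a
compile, which is what re-training a large model on a pool of remote
GPUs would actually require.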