Hi, MSavoritias,

On Friday, 21.06.2024 at 17:15 +0300, MSavoritias wrote:
> But I didnt say that tho did I? the context you are reading as from
> the quote is Guix uploading all code from its packages to SWH.
> Not any private repos. So i have no idea what you are reffering to
> here tbh.

I hate to say that, but you kinda did. It was implicit on the mailing list (at least in the OP), but very explicit in the XMPP room, where you say "it automatically sen[d]s your repo (and all your code) that is reachable through the internet to Software Heritage […] with no way to opt-out at any of the process and no flag with `guix lint` to disable it"

Now, you stand corrected on both counts (the automatic sending of code and the inability to disable it), but I'd like to poke at another tangent. Currently, the StarCoder LLM endorsed by SWH claims to only ingest GitHub and to filter out both commercial and copyleft code, thus training on non-copyleft "open source" software only [1]. So, at the time of writing, you do have an "easy" opt-out by way of using the GPL.

Except that, of course, their script to detect licenses is buggy – what else did you expect? Just search for GNOME using their tool [2]. It will print out repos like the unlicensed releng [3] – although for some reason, being unlicensed appears to be fair game to them anyway [1] – or the GPL'd devhelp [4].

So, in my opinion, the collaboration between SWH and StarCoder should trigger some side-eyeing, if only to exclude the archival lint for the time being. We can still consider SWH as a software mirror if all else fails, and they should probably be quick enough in updating as well. Long term, we might want to look into options that do not openly endorse tools which make such questionable decisions.

On the notion of consent, I do think that "I license my code under the MIT license, because then companies will like me" ought to count as consent here. [3] and [4], on the other hand, very much don't. Also, "sign up with GitHub, so that you can opt out" is not a great consent model either – at the very least accept bleeping email.

As per Doctorow's law of enshittification, there is a good chance that "ethical AI" to SWH will become "any AI" if we do nothing to communicate that this is not what we as Guix expect.

Cheers

[1] https://arxiv.org/abs/2402.19173
[2] https://huggingface.co/spaces/bigcode/in-the-stack
[3] https://github.com/GNOME/releng
[4] https://github.com/GNOME/devhelp