On Wed, 2 Jul 2025 at 22:20, matthew green <m...@eterna23.net> wrote:
>
> > Why would we NOT want to have AI train on our source code?
>
> they abuse services - DDoS sites constantly. many projects have been
> restricting access because otherwise they're not available to humans.
>
> they don't keep licenses on code.
They're just kids, trying to learn. Can you really blame kids for looking at all 5000 links from a single file, when you give them 5000 links to start with? Maybe start by not serving 5000 unique links from a single file, and implement caching / throttling? How could they know there's nothing interesting in there without visiting at least a few of those files first?

The BSD licence is also a very permissive licence; when people compile this code and distribute the binaries, they aren't required to "keep the licence" either, so how is this different? Besides, unlike the GPL and other huge licences, our entire self-sufficient licence is written out in almost every file of the tree; why would we not want these AIs to read the entire text of our BSD licence tens of thousands of times?

When you read something in a book, or even on a website, as a human, do you forever cite the source of that knowledge, or follow the licence? These AIs behave exactly the same way as humans do; they're simply dumber and more persistent.

The way CVSweb is designed, it's easily DoS'able with a plain `wget -r` (a.k.a. `wget --recursive`), which has existed for probably 20 years now. How exactly is it the AI's fault when something as simple as `wget -r` can bring down our website?

Cheers,
Constantine.
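
P.S. On the caching / throttling point: here is a minimal sketch of what I mean by per-client throttling, a simple token bucket that could sit in front of the CGI. It is purely illustrative; the rate, the burst size, and the `allow_request` helper are made-up names and numbers, not anything our CVSweb setup actually uses.

import time
from collections import defaultdict

RATE = 1.0    # tokens refilled per second, per client (illustrative number)
BURST = 10.0  # bucket size: how many back-to-back requests we tolerate

_buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def allow_request(client_ip):
    """Return True if this client may be served, False if it should get a 429."""
    b = _buckets[client_ip]
    now = time.monotonic()
    # Refill the bucket proportionally to the time elapsed since the last request.
    b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
    b["last"] = now
    if b["tokens"] >= 1.0:
        b["tokens"] -= 1.0
        return True
    return False

if __name__ == "__main__":
    # Back-to-back requests from one client: the 11th one gets rejected.
    for i in range(12):
        print(i, allow_request("192.0.2.1"))

A crawler doing `wget -r` against that would start seeing errors after a handful of requests instead of hammering the server, while a human browsing a few files would never notice.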