Gerardo Ballabio <gerardo.balla...@gmail.com> writes:

> As I understand, that is an open legal question. The Affero GPL would be
> such a license *if* the training dataset would be considered part of the
> code. While that does seem to make sense, as AI code is essentially
> non-functional without the training, I am not aware that there has ever
> been a pronouncement by a court of law that affirms or denies it, nor I
> am aware of any free/open source license that contains language that
> deals specifically with that issue, and I'm pretty sure that there's lot
> of room for lawyers to argue their point.

To add to this, I'm fairly sure that the companies that are training AI
models on, say, every piece of text they can find on the Internet, or all
public GitHub repositories, are going to explicitly argue that doing so is
fair use of the training material. If that argument prevails in court, or
in legislatures, it will not be possible to write a free software license
to prevent this, since the point of fair use is that copyright law does
not apply to that usage and therefore no copyright license can prohibit
it.

I don't think we have any idea yet whether that argument will prevail. It
will probably be years before it reaches a high enough level court in the
United States for a definitive ruling, let alone every other relevant
country that will have its own legal judgments.

Consider Google v. Oracle: a suitable case about the copyright status of
library APIs, with litigants willing to appeal all the way to the highest
court, was only filed in 2010, years after this became a common issue. It
then took more than ten years to be decided, and that only in the United
States. I would expect a similar delay. Court systems work very slowly.

It's also entirely possible that court judgments will go different ways in
different countries, adding even more confusion.

The organizations that have every incentive to argue that it's fair use
have very deep pockets, so they have a substantial chance of success on
the prosaic grounds that the best-funded litigant or lobbyist always
stands a reasonable chance of winning.

-- 
Russ Allbery (r...@debian.org)              <https://www.eyrie.org/~eagle/>