Good morning, list,

> The idea that training a model on copyrighted texts is a
> violation of that copyright is highly debatable

Up to this point, I suspect all the lawyers on the list will agree.

> The reasoning is actually quite simple: if learning from a
> text violated its copyright, we would all be criminals.

But since we are human, and what we produce is not (politicians' speeches aside(*)) ontologically identical to the output of non-living technical entities, logic dictates that what applies to us cannot apply to an LLM, just as copyright law does not apply mechanically to the use of human texts to build language models.

This is why every attempt to "protect by copyright" the output of generative software has failed miserably, with the reasons set out in court rulings, which I believe carry far more legal weight than the CC website.

My impression is that the question will keep lawyers, computer scientists, philosophers and society busy for a veeeery long time yet.

SBB

(*) As children of the '80s who played with this hilarious toy know well:
https://www.enricodalbosco.it/giochi/tubolario/

> There is physically no trace of those texts inside the models;
> nothing is copied. The models are a transformative work of those
> texts, not a derivative one.
>
> Creative Commons argues this very well:
> https://creativecommons.org/2023/02/17/fair-use-training-generative-ai/
>
> That said, I quote the words of another author, Jeff Jarvis:
> https://www.facebook.com/jeff.jarvis/posts/pfbid0LMFeqdTYoxnGHQAZwp5HMmeeVqgMSjL2dkcwMcBojkb2cinBpgYTHyc7Fhq1B9NPl
>
> «I, for one, am not complaining about my books being in large
> language model training sets. I write to enter ideas into public
> discourse. I prefer informed over ignorant AI. I believe it is fair
> use for anyone to read & use books for transformative work. In fact,
> I'd probably feel snubbed if my books were not there. I'm happy when
> they are in libraries.
> I'm fine that they're here.»
>
> Fabio
>
> On Fri 29 Sep 2023 at 07:52, Alberto Cammozzo via nexa
> nexa@server-nexa.polito.it wrote:
>
> > https://www.theguardian.com/australia-news/2023/sep/28/australian-books-training-ai-books3-stolen-pirated
> >
> > Thousands of books from some of Australia’s most celebrated authors have
> > potentially been caught up in what Booker prize-winning novelist Richard
> > Flanagan has called “the biggest act of copyright theft in history”.
> >
> > The works have allegedly been pirated by the US-based Books3 dataset and
> > used to train generative AI for corporations such as Meta and Bloomberg.
> >
> > Flanagan, who found 10 of his works, including the multi-international
> > award-winning 2013 novel The Narrow Road to the Deep North, on the Books3
> > dataset, told Guardian Australia he was deeply shocked by the discovery
> > made several days ago.
> >
> > “I felt as if my soul had been strip mined and I was powerless to stop it,”
> > he said in a statement.
> >
> > “This is the biggest act of copyright theft in history.”
> >
> > The Australian Publishers Association confirmed to Guardian Australia on
> > Wednesday that as many as 18,000 fiction and nonfiction titles with
> > Australian ISBNs (unique international standard book numbers) appeared to
> > be affected by the copyright infringement, although it is not yet clear
> > what proportion of these are Australian editions of internationally
> > authored books.
> >
> > “We’re still working through [the data] to work out the impact in terms of
> > Australian authors,” APA spokesperson Stuart Glover said.
> >
> > “This is a massive legal and ethical challenge for the publishing industry
> > and for authors globally.”
> >
> > A search tool published on Monday by US media platform The Atlantic and
> > uploaded by the US Authors Guild on Wednesday revealed the works of Peter
> > Carey, Helen Garner, Kate Grenville, Anna Funder, Christos Tsiolkas and
> > Thomas Keneally, as well as Flanagan and dozens of other high-profile
> > Australian authors, were included in the pirated dataset containing more
> > than 180,000 titles.
> >
> > On Thursday, the Australian Society of Authors issued a statement saying it
> > was “horrified” to learn that the works of Australian writers were being
> > used to train artificial intelligence without permission from the authors.
> >
> > ASA chief executive, Olivia Lanchester, described the Books3 dataset as
> > piracy on an industrial scale.
> >
> > “Authors appropriately feel outraged,” Lanchester said. “The fact is this
> > technology relies upon books, journals, essays written by authors, yet
> > permission was not sought nor compensation granted.”
> >
> > Lanchester said the Australian literary industry, while not objecting per
> > se to emerging technologies such as AI, was deeply concerned about the lack
> > of transparency evident in the development and monetisation of AI by global
> > tech companies.
> >
> > “Turning a blind eye to the legitimate rights of copyright owners threatens
> > to diminish already precarious creative careers,” she said.
> >
> > “The enrichment of a few powerful companies is at the cost of thousands of
> > individual creators. This is not how a fair market functions.”
> >
> > Josephine Johnston, chief executive of Australia’s Copyright Agency,
> > described the Books3 development as “a free kick to big tech” at the
> > expense of Australia’s creative and cultural life.
> >
> > “We’re going to need greater transparency – how these tools have been
> > developed, trained, how they operate – before people can truly understand
> > what their legal rights might be,” she said.
> >
> > “We seem to be in this terrible position now where content owners –
> > remembering that the vast majority of them will be individual authors – may
> > actually have to take out court cases to enforce their rights.”
> >
> > Australian copyright law protects creators of original content from data
> > scraping.
> >
> > Litigation in the US against ChatGPT creator OpenAI over use of allegedly
> > pirated book datasets, Books1 and Books2 (which do not appear to be
> > affiliated with Books3) has already commenced.
> >
> > In July, North American horror/fantasy writers Mona Awad (author of Bunny)
> > and Paul Tremblay (author of The Cabin at the End of the World) filed a
> > lawsuit in a San Francisco federal court, alleging ChatGPT unlawfully
> > digested their books as part of its AI training data.
> >
> > On 28 August, OpenAI filed a motion to dismiss the lawsuit, arguing that
> > the authors “misconceive the scope of copyright, failing to take into
> > account the limitations and exceptions (including fair use) that properly
> > leave room for innovations like the large language models now at the
> > forefront of artificial intelligence”.
> >
> > On 19 September the Writers Guild and 17 of its members, including
> > bestselling novelists John Grisham, George RR Martin and Jodi Picoult,
> > filed a complaint in a New York district court against OpenAI, seeking
> > redress for “flagrant and harmful infringements” of guild members’
> > registered copyrights.
> >
> > In a statement on its website, the guild says while it is aware that
> > companies such as Meta and Bloomberg have used the Books3 dataset to train
> > their LLMs, it is not yet clear whether OpenAI is using Books3 to train its
> > ChatGPT models GPT 3.5 or GPT 4.
> >
> > Guardian Australia has sought comment from OpenAI, which has yet to
> > officially respond to the guild’s complaint, and Meta.
> >
> > On 4 September, US technology magazine Wired reported that a Danish
> > anti-piracy group called Rights Alliance had been told by Bloomberg that
> > the company did not plan to train future versions of its BloombergGPT using
> > Books3.
> >
> > Bloomberg declined to respond to the Guardian’s queries.
> >
> > The APA said the global nature of the issue would present significant
> > challenges in enforcement and prosecution, and has joined the authors’
> > society in calling for AI technologies to be regulated.
> >
> > Consultation closed last month for a Department of Industry, Science and
> > Resources discussion paper on supporting responsible AI.
> >
> > A parliamentary inquiry is under way examining the use of generative
> > artificial intelligence in the Australian education system.
> >
> > Flanagan said it was up to the Australian government to act to protect
> > Australia’s writers.
> >
> > “It has power and we do not,” he said.
> >
> > “If it cares for our culture it must now stand up and fight for it.”
> >
> > _______________________________________________
> > nexa mailing list
> > nexa@server-nexa.polito.it
> > https://server-nexa.polito.it/cgi-bin/mailman/listinfo/nexa