Good morning, list,

> The idea that training a model on copyrighted texts is a
> violation of that copyright is highly debatable

So far, I have the impression that all the lawyers on the list will agree.

> reasoning is actually quite simple: if learning from a
> text violated its copyright, we would all be criminals.

But since we are human and what we produce is not - except for the speeches of
politicians(*) - ontologically identical to the output of non-living technical
beings, logic dictates that what applies to us cannot apply to an LLM, just as
copyright law does not apply slavishly to the use of human texts to create
language models.

This is why all attempts to "protect via copyright" the output of generative
software have failed miserably, and with reasons set out in court rulings,
which I believe carry considerably more weight in law than the CC website.

My impression is that the issue will keep lawyers, computer scientists,
philosophers and society busy for a loooooong time yet.
SBB

(*) As the children of the '80s who played with this hilarious toy know
well: https://www.enricodalbosco.it/giochi/tubolario/


> Of those texts
> there is physically no trace inside the models; nothing is copied.
> The models are a transformative work of those texts, not a
> derivative one.
> 
> Creative Commons argues this very well:
> https://creativecommons.org/2023/02/17/fair-use-training-generative-ai/
> 
> That said, I quote the words of another author, Jeff Jarvis:
> https://www.facebook.com/jeff.jarvis/posts/pfbid0LMFeqdTYoxnGHQAZwp5HMmeeVqgMSjL2dkcwMcBojkb2cinBpgYTHyc7Fhq1B9NPl
> 
> «I, for one, am not complaining about my books being in large
> language model training sets. I write to enter ideas into public
> discourse. I prefer informed over ignorant AI. I believe it is fair
> use for anyone to read & use books for transformative work. In fact,
> I'd probably feel snubbed if my books were not there. I'm happy when
> they are in libraries. I'm fine that they're here.»
> 
> Fabio
> 
> On Fri 29 Sep 2023 at 07:52, Alberto Cammozzo via nexa
> nexa@server-nexa.polito.it wrote:
> 
> > https://www.theguardian.com/australia-news/2023/sep/28/australian-books-training-ai-books3-stolen-pirated
> > 
> > Thousands of books from some of Australia’s most celebrated authors have 
> > potentially been caught up in what Booker prize-winning novelist Richard 
> > Flanagan has called “the biggest act of copyright theft in history”.
> > 
> > The works have allegedly been pirated by the US-based Books3 dataset and 
> > used to train generative AI for corporations such as Meta and Bloomberg.
> > 
> > Flanagan, who found 10 of his works, including the multi-international 
> > award-winning 2013 novel The Narrow Road to the Deep North, on the Books3 
> > dataset, told Guardian Australia he was deeply shocked by the discovery 
> > made several days ago.
> > 
> > “I felt as if my soul had been strip mined and I was powerless to stop it,” 
> > he said in a statement.
> > 
> > “This is the biggest act of copyright theft in history.”
> > 
> > The Australian Publishers Association confirmed to Guardian Australia on 
> > Wednesday that as many as 18,000 fiction and nonfiction titles with 
> > Australian ISBNs (unique international standard book numbers) appeared to 
> > be affected by the copyright infringement, although it is not yet clear 
> > what proportion of these are Australian editions of internationally 
> > authored books.
> > 
> > “We’re still working through [the data] to work out the impact in terms of 
> > Australian authors,” APA spokesperson Stuart Glover said.
> > 
> > “This is a massive legal and ethical challenge for the publishing industry 
> > and for authors globally.”
> > 
> > A search tool published on Monday by US media platform The Atlantic and 
> > uploaded by the US Authors Guild on Wednesday revealed the works of Peter 
> > Carey, Helen Garner, Kate Grenville, Anna Funder, Christos Tsiolkas and 
> > Thomas Keneally, as well as Flanagan and dozens of other high-profile 
> > Australian authors, were included in the pirated dataset containing more 
> > than 180,000 titles.
> > 
> > On Thursday, the Australian Society of Authors issued a statement saying it 
> > was “horrified” to learn that the works of Australian writers were being 
> > used to train artificial intelligence without permission from the authors.
> > 
> > ASA chief executive, Olivia Lanchester, described the Books3 dataset as 
> > piracy on an industrial scale.
> > 
> > “Authors appropriately feel outraged,” Lanchester said. “The fact is this 
> > technology relies upon books, journals, essays written by authors, yet 
> > permission was not sought nor compensation granted.”
> > 
> > Lanchester said the Australian literary industry, while not objecting per 
> > se to emerging technologies such as AI, was deeply concerned about the lack 
> > of transparency evident in the development and monetisation of AI by global 
> > tech companies.
> > 
> > “Turning a blind eye to the legitimate rights of copyright owners threatens 
> > to diminish already precarious creative careers,” she said.
> > 
> > “The enrichment of a few powerful companies is at the cost of thousands of 
> > individual creators. This is not how a fair market functions.”
> > 
> > Josephine Johnston, chief executive of Australia’s Copyright Agency, 
> > described the Books3 development as “a free kick to big tech” at the 
> > expense of Australia’s creative and cultural life.
> > 
> > “We’re going to need greater transparency – how these tools have been 
> > developed, trained, how they operate – before people can truly understand 
> > what their legal rights might be,” she said.
> > 
> > “We seem to be in this terrible position now where content owners – 
> > remembering that the vast majority of them will be individual authors – may 
> > actually have to take out court cases to enforce their rights.”
> > 
> > Australian copyright law protects creators of original content from data 
> > scraping.
> > 
> > Litigation in the US against ChatGPT creator OpenAI over use of allegedly 
> > pirated book datasets, Books1 and Books2 (which do not appear to be 
> > affiliated with Books3) has already commenced.
> > 
> > In July, North American horror/fantasy writers Mona Awad (author of Bunny) 
> > and Paul Tremblay (author of The Cabin at the End of the World) filed a 
> > lawsuit in a San Francisco federal court, alleging ChatGPT unlawfully 
> > digested their books as part of its AI training data.
> > 
> > On 28 August, OpenAI filed a motion to dismiss the lawsuit, arguing that 
> > the authors “misconceive the scope of copyright, failing to take into 
> > account the limitations and exceptions (including fair use) that properly 
> > leave room for innovations like the large language models now at the 
> > forefront of artificial intelligence”.
> > 
> > On 19 September the Authors Guild and 17 of its members, including
> > bestselling novelists John Grisham, George RR Martin and Jodi Picoult, 
> > filed a complaint in a New York district court against OpenAI, seeking 
> > redress for “flagrant and harmful infringements” of guild members’ 
> > registered copyrights.
> > 
> > In a statement on its website, the guild says while it is aware that 
> > companies such as Meta and Bloomberg have used the Books3 dataset to train 
> > their LLMs, it is not yet clear whether OpenAI is using Books3 to train its 
> > ChatGPT models GPT 3.5 or GPT 4.
> > 
> > Guardian Australia has sought comment from OpenAI, which has yet to 
> > officially respond to the guild’s complaint, and Meta.
> > 
> > On 4 September, US technology magazine Wired reported that a Danish 
> > anti-piracy group called Rights Alliance had been told by Bloomberg that 
> > the company did not plan to train future versions of its BloombergGPT using 
> > Books3.
> > 
> > Bloomberg declined to respond to the Guardian’s queries.
> > 
> > The APA said the global nature of the issue would present significant 
> > challenges in enforcement and prosecution, and has joined the authors’ 
> > society in calling for AI technologies to be regulated.
> > 
> > Consultation closed last month for a Department of Industry, Science and 
> > Resources discussion paper on supporting responsible AI.
> > 
> > A parliamentary inquiry is under way examining the use of generative 
> > artificial intelligence in the Australian education system.
> > 
> > Flanagan said it was up to the Australian government to act to protect 
> > Australia’s writers.
> > 
> > “It has power and we do not,” he said.
> > 
> > “If it cares for our culture it must now stand up and fight for it.”
> > 
> > _______________________________________________
> > nexa mailing list
> > nexa@server-nexa.polito.it
> > https://server-nexa.polito.it/cgi-bin/mailman/listinfo/nexa