On Sat, 25 Jan 2025 at 18:42, Ilu <il...@gmx.net> wrote:
> Will Debian accept a GR that requires all training data to be free,
> including training data that belongs to the core of human dignity? That
> would be disturbing. And in fact practically lobotomize good projects.

Mozilla would like a word: https://commonvoice.mozilla.org/en/datasets

"Each entry in the dataset consists of a unique MP3 and corresponding
text file. Many of the 33,151 recorded hours in the dataset also
include demographic metadata like age, sex, and accent that can help
train the accuracy of speech recognition engines. The dataset
currently consists of 22,109 validated hours in 133 languages, but
we’re always adding more voices and languages. Take a look at our
Languages page to request a language or start contributing."

 - samj
