Thanks for the very helpful answers. I will look at the possibilities for uploading (and licensing) the data sets.
Meanwhile, I have another question. Currently I parse nothing other than the words or expressions themselves, so gender and other language-specific information is ignored even when it appears in the translation tables. This is probably a serious problem for large Wiktionaries (e.g. I doubt that enwiktionary would accept French nouns without their gender). Adding this functionality would be very tedious, and probably impossible for languages I can't even read. Should I try it anyway, or can the data be useful without it?

2013/10/9 Federico Leva (Nemo) <[email protected]>

> Judit, Ács, 08/10/2013 12:21:
>
>> Do you think there is a way to contribute this dictionary back to
>> Wiktionary?
>
> Sure! You could first of all upload the dataset with a free license
> somewhere, for instance archive.org. Actually, it's probably better if
> you choose CC-0 as "license", otherwise – being EU-based – you could add
> database rights which would be a nightmare. (Or CC-0 for your work +
> CC-BY-SA for any copyrightable text from Wiktionary, if there is any.)
>
> Then, you can build upon one of our WebAPI clients to contribute it
> directly to Wiktionary:
> https://www.mediawiki.org/wiki/API:Client_code
> I say "you" because you are the ones knowing your own dataset better. You
> need local consensus of course, so you could proceed this way:
> 1) determine what Wiktionary editions have the biggest overlap with your
> entries (i.e. which would require less page creation; adding to existing
> pages is less controversial than adding new ones);
> 2) propose to those editions, or wait for the most interested to ask you,
> and get local green light (ideally a not-so-huge one to start with);
> 3) run on your own a bot on that language and identify what's the kind and
> amount of needed work;
> 4) share the code and information from (3) to let others continue on other
> editions.
>
> Of course someone else could do 1-3 too, but it would be a
> disproportionate effort for them compared to you; peer review of the code
> at (3) should also help make the coding of the bot a shared effort.
>
> Nemo
>
_______________________________________________
Wiktionary-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
