[Wiktionary-l] Wiktionary parsing ; multiple languages

Moutupsi Paul Thu, 04 Apr 2013 14:26:35 -0700

Hi All,



Greetings,



I am a CS grad student at the Data Science Lab, Stony
Brook <https://sites.google.com/site/datascienceslab/>, and I am writing
to request information about parsing multilingual Wiktionary data. Our
lab has been using Wikipedia data for quite a while now, but we are really
interested in taking advantage of the massive Wiktionary content, which we
feel, after proper parsing, can become a rich multi-language corpus.



The big hurdle is finding a parsing tool. We have tried a few Wiktionary
parsing tools:



1.       https://github.com/clbecker/perl-wiktionary-parser/

2.       https://code.google.com/p/wikokit/wiki/GettingStartedWiktionaryParser

3.       https://github.com/benreynwar/wiktionary-parser/tree/master/wiktionary_parser

4.       http://www.ukp.tu-darmstadt.de/software/jwktl/



but none of them is ready to use, or easy to extend, for multiple
languages. (I am currently trying to work with wikokit, parser 2 above.)



I would appreciate any advice, suggestions, or pointers towards the best
available Wiktionary parser. We are mainly looking to extract meanings, parts
of speech (POS), examples, translations, etc. (more can never hurt).



Any help is appreciated. Kindly let me know if further information is needed.



Regards,

Moutupsi

_______________________________________________
Wiktionary-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
