The following module was proposed for inclusion in the Module List: modid: Lingua::En::Tagger DSLIP: adpOg description: Part of speech tagger for English userid: MACIEJ (Maciej Ceglowski) chapterid: 11 (String_Lang_Text_Proc) communities: Part of a toolkit for latent semantic indexing ( LSI ) to be introduced at the O'Reilly Bioinformatics Conference in February
similar: None that I can find - there are some modules under Lingua and Lingua::En that deal with a specific subset of the funcionality ( finding proper names, for example ), but no full POS tagger rationale: Being able to auto-tag English text with part of speech information is very helpful in building all kinds of natural language processing tools, and especially search engines ( where you want noun phrases, and little else ). There exist several tagging algorithms for English text, but none of them seem to be implemented in Perl ( at least not under an open source license, or on the CPAN ). I've been using a POS tagger for my own work on latent semantic indexing, and this seems like a good module to abstract out so that other CPAN users can benefit. For now the module is something of a homebrew - but we intend to extend support to standard algorithms like Brill's tagger and others, in future versions. I think the namespace is appropriate, since this is ipso facto specific to English, but I welcome other suggestions. enteredby: MACIEJ (Maciej Ceglowski) enteredon: Sun Oct 20 21:38:23 2002 GMT The resulting entry would be: Lingua::En:: ::Tagger adpOg Part of speech tagger for English MACIEJ Thanks for registering, The Pause Team PS: The following links are only valid for module list maintainers: Registration form with editing capabilities: https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=53300000_6d9f60bdac7db8ba&SUBMIT_pause99_add_mod_preview=1 Immediate (one click) registration: https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=53300000_6d9f60bdac7db8ba&SUBMIT_pause99_add_mod_insertit=1