Sir, I am working on a classification project, and before that I have to do
feature selection. I would like to apply lemmatization rather than
stemming, so I ran the code below. By definition, lemmatization should
return root words, such as "run" for "running" and "ran", or "think" for
"thought", but my code does not give the correct output. Could you please
help me find out where I went wrong?
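
To make the expected result concrete, here is a minimal sketch of the
mapping I have in mind. It uses a small hand-written lookup table, not
WordNet. The names expected.lemmas and expected_lemma are hypothetical,
and the entries for "children" and "cars" are only my assumption about
what a lemmatizer should return.

```r
# Hypothetical hand-written lookup table (NOT taken from WordNet):
# inflected form -> lemma I would expect a lemmatizer to return.
expected.lemmas <- c(running = "run", ran = "run", thought = "think",
                     children = "child", cars = "car")

# Return the expected lemma for each word; words without an entry
# in the table are returned unchanged (lower-cased).
expected_lemma <- function(words) {
  words <- tolower(words)
  hits  <- expected.lemmas[words]   # NA where the table has no entry
  unname(ifelse(is.na(hits), words, hits))
}

expected_lemma(c("running", "ran", "thought", "playground"))
# [1] "run"        "run"        "think"      "playground"
```

A real solution would of course need a dictionary lookup instead of this
fixed table; the sketch only shows the output I am hoping for.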

CODE:

library("tm")
library("NLP")
library("wordnet")
setDict("C:/Program Files/WordNet/2.1/dict")
vector.documents <- c("The children something to the playground The cars %s
down the avenue")
corpus.documents <- Corpus(VectorSource(vector.documents))

initDict("C:/Program Files/WordNet/2.1/dict")
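lapply(corpus.documents,function(x){
  # split the document into words on whitespace
  sapply(unlist(strsplit(as.character(x),"[[:space:]]+")), function(word) {
    # filter matching WordNet entries that start with `word` (ignoring case)
    x.filter <- getTermFilter("StartsWithFilter", word, TRUE)
    # look up at most one matching noun entry
    terms    <- getIndexTerms("NOUN",1,x.filter)
    if(!is.null(terms)) sapply(terms,getLemma)
  })
})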

OUTPUT:
$`1`
$`1`$The
[1] "the absurd"

$`1`$children
NULL

$`1`$playing
[1] "playing"

$`1`$playground
[1] "playground"

$`1`$The
[1] "the absurd"

$`1`$cars
[1] "carson"

$`1`$landing
[1] "landing"

$`1`$avenue
[1] "avenue"




I also tried other POS values and other filter types such as
"ContainsFilter", but that did not work either. Please help me!


Thanks in advance.

with regards,

BHANUMATHI H M


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
