Hi,

I have been doing some text mining. I created the document-term matrix
(DTM) using the following steps:
corpus1 <- VCorpus(VectorSource(resume1$Dat1))

corpus1 <- tm_map(corpus1, content_transformer(tolower))

dtm <- DocumentTermMatrix(corpus1,
                          control = list(removePunctuation = TRUE,
                                         removeNumbers = TRUE,
                                         removeSparseTerms = TRUE,
                                         stopwords = TRUE))


After running all of this, I am still getting terms like -quotation,
"fun, model", etc.

What can I do about this? I do not need these dashes and extra quotation marks.
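
One possible cause, offered as a sketch rather than a confirmed diagnosis:
the leftover characters may be typographic ("curly") quotes and long dashes,
which ASCII-only punctuation removal does not touch. Assuming a version of tm
whose removePunctuation() accepts a ucp argument, passing list(ucp = TRUE) in
the control list asks it to match Unicode punctuation as well. Separately,
removeSparseTerms is a function applied to a finished matrix rather than a
control option, so it is called on its own below. The sample texts and the
0.99 sparsity threshold are made up for illustration.

```r
library(tm)

# Made-up sample texts containing curly quotes and an en dash (illustration only)
texts <- c("He said \u201cfun, model\u201d \u2013 quotation marks everywhere",
           "Another model, more fun; fewer quotation marks")

corpus1 <- VCorpus(VectorSource(texts))
corpus1 <- tm_map(corpus1, content_transformer(tolower))

dtm <- DocumentTermMatrix(corpus1,
                          control = list(
                            # ucp = TRUE: also strip Unicode punctuation
                            # (assumes the installed tm supports this argument)
                            removePunctuation = list(ucp = TRUE),
                            removeNumbers = TRUE,
                            stopwords = TRUE))

# Drop terms whose sparsity exceeds 0.99 (threshold is an arbitrary example)
dtm <- removeSparseTerms(dtm, sparse = 0.99)

# Inspect the remaining terms to check for stray quotes or dashes
Terms(dtm)
```

If stray characters still appear, it may help to inspect the raw text for
other non-ASCII symbols before building the matrix.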

-- 
Anindya Sankar Dey
