Hi, I have been doing some text mining. I created the DTM matrix using the following steps.
corpus1 <- VCorpus(VectorSource(resume1$Dat1))
corpus1 <- tm_map(corpus1, content_transformer(tolower))
dtm <- DocumentTermMatrix(corpus1, control = list(removePunctuation = TRUE, removeNumbers = TRUE, removeSparseTerms = TRUE, stopwords = TRUE))

After running all of this, I am still getting terms like -quotation, "fun, model", etc. What can I do about it? I do not need these dashes and extra quotation marks.

--
Anindya Sankar Dey

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
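
One possible way to approach the leftover dashes and quotation marks, as a minimal sketch rather than a definitive fix: replace punctuation with spaces in the corpus before the DTM is built, so that tokens such as "fun, model" are split cleanly. The sketch assumes the stray characters may include typographic quotes or dashes, which is a guess since the actual data is not shown. The vector `docs` is a hypothetical stand-in for resume1$Dat1, and `strip_punct` is a helper name introduced only for this illustration. Note also that removeSparseTerms() is a separate function in tm that is applied to a finished DTM; the 0.99 threshold below is an arbitrary choice.

```r
library(tm)

# Hypothetical sample data standing in for resume1$Dat1
docs <- c("The -quotation was \"fun, model\" 2019",
          "A \u201ccurly\u201d quote \u2013 and a dash")

corpus1 <- VCorpus(VectorSource(docs))
corpus1 <- tm_map(corpus1, content_transformer(tolower))

# Illustrative helper: replace any run of punctuation or symbol characters
# (ASCII or Unicode) with a single space, so adjacent words do not fuse
strip_punct <- content_transformer(function(x) {
  gsub("[\\p{P}\\p{S}]+", " ", x, perl = TRUE)
})
corpus1 <- tm_map(corpus1, strip_punct)

# Build the DTM; punctuation has already been handled above
dtm <- DocumentTermMatrix(corpus1,
                          control = list(removeNumbers = TRUE,
                                         stopwords = TRUE))

# Drop very sparse terms as a separate step (threshold is an assumption)
dtm <- removeSparseTerms(dtm, sparse = 0.99)

# Inspect the remaining terms
print(Terms(dtm))
```

Replacing punctuation with a space, instead of deleting it, is a deliberate choice here: deleting would turn "fun,model" into "funmodel" whenever the original text has no space after the comma.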