I am using the "tm" package to text-mine abstracts and would like to use it
to find gene names that may contain white space, for instance "gene
regulatory protein 1". By default, tm parses this into 4 separate words, but
I would like to use the "dictionary" class constructor to define phrases
such as the one just mentioned.
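
To illustrate what I mean, here is a minimal sketch in base R only. It does
not use tm itself, the abstract text is made up, and the names "abstract",
"tokens", "phrases" and "hits" are just placeholders. It shows the kind of
whole-phrase counting I am after, not how tm does or should do it.

```r
# Made-up abstract text, for illustration only
abstract <- "Expression of gene regulatory protein 1 was reduced in the treated group."

# Splitting on white space, roughly what a word-level tokenizer does:
# the gene name ends up spread over 4 separate tokens
tokens <- strsplit(tolower(abstract), "[[:space:]]+")[[1]]
phrase_is_a_token <- "gene regulatory protein 1" %in% tokens

# What I would like instead: a dictionary of whole phrases,
# each counted as a single term (literal, lower-cased matching)
phrases <- c("gene regulatory protein 1")
hits <- vapply(phrases, function(p) {
  m <- gregexpr(p, tolower(abstract), fixed = TRUE)[[1]]
  sum(m > 0)  # gregexpr returns -1 when there is no match
}, integer(1))

print(phrase_is_a_token)
print(hits)
```

Ideally the phrase counts in "hits" would come out of tm directly, as
entries in its term matrix, rather than from a separate pass like this.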

Is this possible? If so, how?

Thanks,
Mark
------------------------------------------------------------
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work, & Mobile & VoiceMail

"The real problem is not whether machines think but whether men do." -- B.
F. Skinner
******************************************************************

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.