Hi Steve,

IMO this problem does not need a classifier but rather a database and a simple query. I would just build a database of all city names together with their geographic information, and then look up exactly whether a given town is in the north or the south.
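
As a rough illustration of the lookup idea, here is a minimal sketch in Python (used only for illustration; the same can be done from R). The table layout, the helper names `build_db` and `region`, the two latitude cut-offs, and the approximate latitudes of the two sample towns are all assumptions made for this example, not part of the original suggestion.

```python
import sqlite3

# Hypothetical cut-offs (degrees of latitude) used to split Italy into
# three bands. These are arbitrary illustrative values, not an official split.
NORTH_MIN_LAT = 44.0
CENTRE_MIN_LAT = 41.0


def build_db(rows):
    """Create an in-memory table of (town name, latitude) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE towns (name TEXT PRIMARY KEY, latitude REAL)")
    conn.executemany("INSERT INTO towns VALUES (?, ?)", rows)
    return conn


def region(conn, name):
    """Look up a town and return its band, or None if the town is unknown."""
    row = conn.execute(
        "SELECT latitude FROM towns WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    latitude = row[0]
    if latitude >= NORTH_MIN_LAT:
        return "north"
    if latitude >= CENTRE_MIN_LAT:
        return "centre"
    return "south"


# Approximate latitudes for the two towns mentioned in the thread.
conn = build_db([("Caronno Pertusella", 45.6), ("Frascati", 41.8)])
print(region(conn, "Caronno Pertusella"))
print(region(conn, "Frascati"))
```

With a complete gazetteer loaded into the table, the answer for any listed town is exact, and only unlisted names come back as unknown.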
If there were such a "rule" (which I doubt), I would expect it to have many exceptions and therefore a bunch of false positives on both sides. Why overcomplicate a simple problem?

HTH,
Ciao,
Giovanni

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Steve Stephenson
Sent: Saturday, January 26, 2013 10:08 PM
To: r-help@r-project.org
Subject: [R] Text mining

Hello everybody,

I would like to perform an analysis, but I don't know how to proceed or whether R packages are available for my purpose. Therefore I'm here to request your support.

*The idea is the following:*
I noticed that the names of towns and villages in northern Italy most of the time sound different from the names of towns in southern Italy. Just to give you an idea, "Caronno Pertusella" is a village in northern Italy, while Frascati is a town in central Italy. Most of the time I am able to recognize where a town is located just by hearing its name, but I cannot say why; that is to say, I haven't found a "rule".

What I would like to do is to find a classification rule/engine that is able to "locate" a town starting from its name.
*I think the classification method should be based on the sequence of letters belonging to the town's name*. But this is just an intuition, not yet formalized!

I know that mine is a strange request and idea; anyway, any advice is very much appreciated and welcome!

Many thanks in advance to all.
Steve

--
View this message in context: http://r.789695.n4.nabble.com/Text-mining-tp4656732.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
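
The intuition in the original question, that the sequence of letters in a name carries the signal, is commonly formalized as character n-grams. The sketch below, in Python for illustration only, shows just the feature-extraction step; the function names, the choice of trigrams, and the `^`/`$` padding characters are assumptions for this example, and no claim is made that such features actually separate northern from southern names.

```python
from collections import Counter


def char_ngrams(name, n=3):
    """Return the overlapping character n-grams of a lower-cased name.

    The name is padded with '^' and '$' so that prefixes and suffixes
    (often the most distinctive parts of a place name) get their own n-grams.
    """
    padded = "^" + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_counts(names, n=3):
    """Count how often each n-gram occurs across a list of names."""
    counts = Counter()
    for name in names:
        counts.update(char_ngrams(name, n))
    return counts


print(char_ngrams("Frascati"))
```

Counts like these, computed separately for names with known locations, could then be fed to any standard classifier; whether the result beats a plain lookup is exactly the point the reply above disputes.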