Hi Giovanni, thanks a lot for your quick reply! I will try to answer in a few points:
1 - A database containing all the towns and the region they belong to (North, South...) is already available on the ISTAT site (www.ISTAT.it);
2 - My goal was just to find a "method" supporting my idea, namely that the names of northern towns "sound" different from the names of southern towns;
3 - To build this method I would use the ISTAT database, partly as a training set and partly as a validation set;
4 - The idea was born just for fun, since I find data mining very interesting and also challenging;
5 - I absolutely agree with you: I will find a lot of exceptions; if the exceptions outnumber the rule (this could happen), that would imply that my initial idea is wrong. In any case I would be satisfied, because it would mean that I have been able to test whether an intuition is right or wrong.
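To make points 3 and 5 a bit more concrete, here is a minimal sketch of one possible approach, not a method I have settled on. It assumes that how a name "sounds" can be approximated by its character trigrams, and it swaps in a simple naive Bayes classifier with add-one smoothing as the "method". The ten towns are only a placeholder for the ISTAT table, the function names (ngrams, train, predict, split, accuracy) are hypothetical, and Python is used purely for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def ngrams(name, n=3):
    """Return the character n-grams of a name, padded to mark start and end."""
    padded = "^" + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(rows, n=3):
    """Count n-grams per label; rows is a list of (name, label) pairs."""
    counts = defaultdict(Counter)
    totals = Counter()
    for name, label in rows:
        counts[label].update(ngrams(name, n))
        totals[label] += 1
    return counts, totals


def predict(model, name, n=3):
    """Pick the label with the highest naive Bayes log-score (add-one smoothing)."""
    counts, totals = model
    vocab = {g for c in counts.values() for g in c}
    best_label, best_score = None, -math.inf
    for label in counts:
        size = sum(counts[label].values())
        score = math.log(totals[label] / sum(totals.values()))
        for g in ngrams(name, n):
            # The extra +1 in the denominator leaves room for unseen n-grams.
            score += math.log((counts[label][g] + 1) / (size + len(vocab) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split(rows, train_fraction=0.7, seed=0):
    """Shuffle reproducibly and split into training and validation rows."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, rows, n=3):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(predict(model, name, n) == label for name, label in rows)
    return hits / len(rows)


# Placeholder data (assumption): the real input would be the ISTAT table.
towns = [
    ("Milano", "North"), ("Torino", "North"), ("Bergamo", "North"),
    ("Brescia", "North"), ("Verona", "North"),
    ("Napoli", "South"), ("Palermo", "South"), ("Catania", "South"),
    ("Bari", "South"), ("Lecce", "South"),
]

train_rows, valid_rows = split(towns)
model = train(train_rows)

# Baseline: always guess the most common label seen in training.
majority = Counter(label for _, label in train_rows).most_common(1)[0][0]
baseline = sum(label == majority for _, label in valid_rows) / len(valid_rows)

print("validation accuracy:", accuracy(model, valid_rows))
print("majority baseline:  ", baseline)
```

With only ten names the two numbers mean nothing; the point is the comparison. On the full ISTAT list, a validation accuracy that is no better than the majority baseline would be the case where "the exceptions are greater than the rule".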
I hope this can clarify my previous post. Many thanks and *sorry for the lack of clarity*. Steve