Hi,

After a bit more searching, I found https://issues.apache.org/jira/browse/MAHOUT-1527

The version of Mahout that I have been working with is Mahout 0.9 (from http://mahout.apache.org/general/downloads.html), which I downloaded in April. Although it is the latest stable release, it doesn't include the patch mentioned in https://issues.apache.org/jira/browse/MAHOUT-1527
Then I realized that, had I cloned the latest Mahout, I would have gotten a script called classify-wiki.sh, and could probably start from there.

Sorry for the spam!

Thanks,
Wei

From: Wei Zhang/Watson/IBM@IBMUS
To: [email protected]
Date: 08/19/2014 06:18 PM
Subject: any pointer to run wikipedia bayes example

Hi,

I have been able to run the bayesian network 20news group example provided at the Mahout website. I am interested in running the Wikipedia bayes example, as it is a much larger dataset.

From several googling attempts, I figured out that its workflow is a bit different from that of the 20news group example: e.g., I would need to provide a categories.txt file, invoke WikipediaXmlSplitter, call wikipediaDataSetCreator, etc.

Is there a document somewhere that describes the process of running the Wikipedia bayes example? https://cwiki.apache.org/MAHOUT/wikipedia-bayes-example.html no longer seems to work.

Greatly appreciated!

Wei
