Thanks a lot!
I was reading about Mahout today.
I'll try that out.
Thanks again
Maria
On Oct 27, 2010, at 20:59, Lance Norskog wrote:
There are tools for this in the Mahout project. These are oriented
toward large-scale work.
http://mahout.apache.org
There is a steep learning curve, and you have to learn at least some Hadoop.
The book 'Programming Collective Intelligence' includes a suite of Python tools
for small-scale experiments.
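For a small-scale experiment along the lines of the category-vector idea
in the original question, something like the following might be a starting
point. It is only a rough sketch in plain Python, not Mahout's or Lucene's
API and not code from either book: the function names, the tiny training
set, and the use of raw term counts with cosine similarity are all
assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only; no stemming or stop words.
    return re.findall(r"[a-z]+", text.lower())


def build_category_vectors(labeled_docs):
    # Sum the term counts of every training document in each category,
    # giving one term-frequency vector (a Counter) per category.
    vectors = {}
    for category, text in labeled_docs:
        vectors.setdefault(category, Counter()).update(tokenize(text))
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(text, category_vectors):
    # Pick the category whose vector is most similar to the document.
    doc = Counter(tokenize(text))
    return max(category_vectors, key=lambda c: cosine(doc, category_vectors[c]))


# Hypothetical hand-labeled examples; real use would need many per category.
training = [
    ("sports", "the team won the game in overtime"),
    ("politics", "the senate passed the budget bill"),
]
vectors = build_category_vectors(training)
label = categorize("the senate will vote on the bill", vectors)
```

With real news articles you would probably want stop-word removal and
TF-IDF weighting rather than raw counts, since common words like "the"
dominate the vectors here.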
On Wed, Oct
I need to auto-categorize a large number of documents, mostly news
articles from major news sources (nytimes, npr, abcnews, etc.).
Any suggestions?
Lucene in Action suggests using a set of documents to build category vectors
and then comparing