Evan Klitzke wrote:
> What frameworks are there available for doing pattern classification?
> I'm generally interested in the problem of mapping some sort of input
> to one or more categories. For example, I want to be able to solve
> problems like taking text and applying one or more tags to it like
> "romance", "horror", "poetry", etc. This isn't really my research
> specialty, but my understanding is that Bayesian classifiers are
> generally used for problems like this.

In fact, a wide variety of classifiers are used in text classification,
including Bayesian approaches, support vector machines, conditional
random fields, etc.

> Are there any other frameworks I should be aware of?

I have used (but not recently) Orange:

    http://www.ailab.si/orange

I haven't used, but have been meaning to try, PyML:

    http://pyml.sourceforge.net/

A more recent addition (whose documentation needs work) is:

    http://montepython.sourceforge.net/

And here's a Summer of Code project to build an ML library:

    http://projects.scipy.org/scipy/scipy/wiki/MachineLearning

These are all general-purpose machine learning frameworks. So they can
be applied to pretty much any classification problem (including the
text classification problems you're looking at). You just need to pick
out a set of relevant features to describe your data, and feed those
features along with your chosen labels to a machine learning algorithm.

STeVe
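
P.S. To make the "features plus labels" idea concrete, here is a rough sketch of a bag-of-words naive Bayes classifier using only the standard library. It does not use any of the frameworks above, and the names (extract_features, NaiveBayes) and the toy training sentences are made up for illustration. It assigns a single label; for several tags per text, one option would be to train one yes/no classifier per tag.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Bag-of-words features: counts of lowercase, whitespace-split words."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()            # training texts seen per label
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()                       # every word seen in training

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word, n in extract_features(text).items():
                self.word_counts[label][word] += n
                self.vocab.add(word)

    def classify(self, text):
        features = extract_features(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.label_counts.items():
            # Work in log space: log prior plus log likelihood of each word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word, n in features.items():
                # Add-one smoothing so unseen words do not zero out the score.
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: (text, label) pairs.
training_data = [
    ("a kiss under the moonlight and a love letter", "romance"),
    ("their love grew with every tender kiss", "romance"),
    ("a scream from the haunted cellar at midnight", "horror"),
    ("the ghost left blood on the cellar door", "horror"),
]

clf = NaiveBayes()
clf.train(training_data)
print(clf.classify("a tender love letter"))
print(clf.classify("the ghost in the haunted house"))
```

With real data, most of the work goes into extract_features: word counts are only the simplest choice, and a framework would typically replace the NaiveBayes class with one of its own learners.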