Hi all, I need to work on an application that categorizes text (groups of sentences) into multiple predefined categories.
From what I have read online, this is theoretically possible with an n-gram-based index: the n-grams of the incoming text are matched against a known n-gram fingerprint for each category. Does Lucene already have a contribution along these lines, either in the contrib directory or as an example I can look at elsewhere?

Saurabh
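
P.S. To make the idea concrete, here is a rough sketch of the n-gram fingerprint matching as I understand it. It is plain Python and does not use Lucene at all. The function names, the sample texts, and the choice of a rank-based "out-of-place" distance are my own assumptions for illustration, not something taken from an existing library.

```python
from collections import Counter


def ngram_profile(text, n_min=1, n_max=3, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first."""
    # Normalize case and whitespace, and pad so word boundaries form n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def categorize(text, category_profiles):
    """Return the category whose fingerprint is closest to the text's profile."""
    doc_profile = ngram_profile(text)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc_profile, category_profiles[name]),
    )


# Hypothetical training samples; real fingerprints would need far more text.
training = {
    "sports": "the team scored a goal in the final match of the season",
    "finance": "the bank raised interest rates and the stock market fell",
}
profiles = {name: ngram_profile(sample) for name, sample in training.items()}

print(categorize("the match ended after the team scored again", profiles))
```

Each category fingerprint is just a ranked list of its most common n-grams, and the incoming text is assigned to the category whose ranking it disturbs the least. With samples this small the result is not reliable; the sketch is only meant to show the shape of the approach.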