Text Categorization with Lucene (N-Gram technique)

Saurabh Gokhale Sun, 24 Jul 2011 09:38:40 -0700

Hi All,

I need to work on an application that categorizes text (groups of
sentences) into multiple predefined categories.


From what I have found searching the internet, this should be possible in
theory with an n-gram based index: build an n-gram fingerprint for each
category, then match the n-grams of the incoming text against those known
fingerprints.
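
To make the idea concrete, here is a rough sketch of fingerprint matching in
plain Java. It assumes one common variant of the technique: each fingerprint
is a list of the most frequent character n-grams (sizes 1 to 3), and two
fingerprints are compared by how far each n-gram's rank differs between them.
The class and method names (NGramCategorizer, profile, distance, classify)
are made up for illustration, and nothing here uses Lucene classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: hypothetical names, plain Java, no Lucene API.
class NGramCategorizer {

    // Build a fingerprint: character n-grams (sizes 1..3) ranked by frequency.
    static List<String> profile(String text, int maxSize) {
        // Lowercase and collapse every run of non-letters into one '_' marker.
        String s = text.toLowerCase().replaceAll("[^\\p{L}]+", "_");
        Map<String, Integer> counts = new HashMap<>();
        for (int n = 1; n <= 3; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
            }
        }
        List<String> grams = new ArrayList<>(counts.keySet());
        // Most frequent first; ties broken alphabetically so ranks are stable.
        grams.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(grams.subList(0, Math.min(maxSize, grams.size())));
    }

    // Sum of rank differences; n-grams missing from the category fingerprint
    // get a fixed maximum penalty.
    static int distance(List<String> doc, List<String> category) {
        int penalty = Math.max(doc.size(), category.size());
        int total = 0;
        for (int i = 0; i < doc.size(); i++) {
            int j = category.indexOf(doc.get(i));
            total += (j < 0) ? penalty : Math.abs(i - j);
        }
        return total;
    }

    // Return the category whose fingerprint is closest to the incoming text.
    static String classify(String text, Map<String, String> trainingText, int maxSize) {
        List<String> doc = profile(text, maxSize);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, String> e : trainingText.entrySet()) {
            int d = distance(doc, profile(e.getValue(), maxSize));
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Calling classify(incomingText, trainingText, 300) with a map from category
name to sample text returns the closest category name. In a real application
the category fingerprints would be computed once and stored rather than
rebuilt on every call, and an index could take the place of the in-memory
lists.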

I wanted to know whether Lucene already has a contribution for this that I
can find in the contrib directory, or whether there is an example elsewhere
that I can look at.

Saurabh
