Andi Vajda wrote:
On Thu, 18 Jun 2009, Neha Gupta wrote:
I was wondering if there is a way to read the index and generate
n-grams of words for a document using PyLucene?
PyLucene just wraps Java Lucene. If there is a way to do this in Java
Lucene, the same approach works with PyLucene.
To find out how to do this in Java Lucene, ask the
java-u...@lucene.apache.org mailing list. To subscribe, see [1].
Andi..
[1] java-user-subscr...@lucene.apache.org
There is an n-gram tokenizer, EdgeNGramTokenizer, that may be what
you're looking for.
- Brian
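
For anyone comparing the two kinds of n-gram mentioned in this thread,
here is a minimal plain-Python sketch. It does not use PyLucene or any
Lucene API; the function names word_ngrams and edge_ngrams are made up
for illustration. It only shows the difference between n-grams of words
(what the question asks about) and the character n-grams taken from the
edge of the input, which is the idea behind a tokenizer like
EdgeNGramTokenizer.

```python
def word_ngrams(text, n):
    """Return every run of n consecutive whitespace-separated words.

    Illustrative helper only, not part of Lucene or PyLucene.
    """
    words = text.split()
    # Slide a window of n words across the text; too-short input yields [].
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def edge_ngrams(text, min_gram, max_gram):
    """Return the leading character prefixes of length min_gram..max_gram.

    Illustrative helper only; it mimics the general idea of edge n-grams
    and makes no claim about how Lucene implements them.
    """
    upper = min(max_gram, len(text))
    return [text[:size] for size in range(min_gram, upper + 1)]


if __name__ == "__main__":
    print(word_ngrams("the quick brown fox", 2))
    print(edge_ngrams("lucene", 1, 3))
```

Run as a script, the first call prints the word bigrams "the quick",
"quick brown" and "brown fox", while the second prints the prefixes
"l", "lu" and "luc". If word n-grams are what you need, it is worth
checking whether an edge n-gram tokenizer really gives that output
before building on it.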