On Tue, Sep 4, 2012 at 12:37 PM, Martin O'Shea wrote:
>
> Does anyone know if this can be used in conjunction with other analyzers to
> return the frequencies of the bigrams or trigrams found, e.g.:
>
> "please divide this please divide sentence into shingles"
>
> Would return 2 for "p
A Lucene ShingleFilter can be used to tokenize a string into shingles, or
n-grams, of different sizes, e.g.:
"please divide this sentence into shingles"
Becomes the shingles:
"please divide", "divide this", "this sentence", "sentence into", and
"into shingles"
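
To make the shingling above concrete, and to show the kind of frequency count being asked about, here is a minimal sketch in plain Java. It deliberately does not use Lucene's ShingleFilter or any analyzer; the class name ShingleCount and the methods shingles and frequencies are hypothetical names chosen for illustration, and splitting on whitespace is an assumption standing in for a real tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: builds word shingles and counts them.
final class ShingleCount {

    // Returns every run of `size` consecutive whitespace-separated words.
    static List<String> shingles(String text, int size) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || size < 1) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + size <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return result;
    }

    // Counts how often each shingle occurs, keeping first-seen order.
    static Map<String, Integer> frequencies(String text, int size) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String shingle : shingles(text, size)) {
            counts.merge(shingle, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(shingles("please divide this sentence into shingles", 2));
        System.out.println(frequencies(
                "please divide this please divide sentence into shingles", 2));
    }
}
```

With the repeated-phrase input, the bigram "please divide" is counted twice and every other bigram once. Passing 3 as the size gives trigram counts in the same way.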
Does anyone know