Do you want to search for shingles?
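Shingles here would be overlapping fixed-length substrings of each sequence (k-mers, in sequence terms). As a rough illustration of the idea only, here is a minimal sketch in plain Python rather than Lucene code. The function names and the choice of k are hypothetical, and it assumes the sequences fit in memory, which tens of millions of words may not:

```python
from collections import Counter


def shingles(seq, k):
    """Yield every overlapping substring of length k from seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def shingle_counts(sequences, k):
    """Count how often each k-length shingle occurs across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(shingles(seq, k))
    return counts


def unique_shingles(sequences, k):
    """Return, sorted, the shingles that occur exactly once in the collection."""
    return sorted(s for s, n in shingle_counts(sequences, k).items() if n == 1)
```

Rare shingles could be picked out the same way by filtering on a small count threshold instead of exactly one.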
On 3/4/2015 9:16 PM, Stephen Rudd wrote:
I have created a slightly hairy document collection that contains tens of
millions of DNA sequence words that I wish to process to find rare and unique
words. Each of the words is between 100 and 1000 characters (nucleotides)
in length.
I have been able to use WildcardQuery and Fuzzy