RE: Word split problems

2008-04-18 Thread Max Metral
It's probably about 100,000 entries per "thing that it would care about at once".

-----Original Message-----
From: Karl Wettin [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 17, 2008 3:17 PM
To: java-user@lucene.apache.org
Subject: Re: Word split problems

Max Metral wrote:

Re: Word split problems

2008-04-17 Thread Karl Wettin
Max Metral wrote:
> Lululemon Athletica
>
> I'd like any of these search terms to work for this:
>
> Lulu lemon
> Lu Lu Lemon
> Lululemon
>
> What strategy would be optimal for this kind of thing (of course keeping

How large is your corpus?

I suggest you look at NGramTokenizer.

karl
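
To illustrate why character n-grams help with the split variants above, here is a minimal plain-Java sketch of the idea. It does not use Lucene's NGramTokenizer or any Lucene API; the class NGramSketch, its methods, the whitespace-stripping normalization, and the gram sizes 2 to 3 are all assumptions made for illustration, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
public class NGramSketch {

    // Lowercase and drop whitespace so "Lu Lu Lemon" and "Lululemon" normalize alike.
    // This normalization step is an assumption of the sketch.
    public static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
    }

    // All character n-grams of the normalized text with minGram <= length <= maxGram.
    public static List<String> ngrams(String text, int minGram, int maxGram) {
        String s = normalize(text);
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                grams.add(s.substring(i, i + n));
            }
        }
        return grams;
    }

    // Fraction of the query's distinct n-grams that also occur in the indexed name.
    public static double overlap(String query, String indexed, int minGram, int maxGram) {
        Set<String> queryGrams = new HashSet<>(ngrams(query, minGram, maxGram));
        if (queryGrams.isEmpty()) {
            return 0.0;
        }
        Set<String> indexedGrams = new HashSet<>(ngrams(indexed, minGram, maxGram));
        int hits = 0;
        for (String gram : queryGrams) {
            if (indexedGrams.contains(gram)) {
                hits++;
            }
        }
        return (double) hits / queryGrams.size();
    }

    public static void main(String[] args) {
        String indexed = "Lululemon Athletica";
        for (String query : new String[] {"Lulu lemon", "Lu Lu Lemon", "Lululemon"}) {
            System.out.println(query + " -> " + overlap(query, indexed, 2, 3));
        }
    }
}
```

In this sketch all three query variants normalize to the same string, so they share every n-gram with the indexed name. In a real index the trade-off is size: each entry expands into many grams, which is presumably why the corpus size matters.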