Hi Wulf,

You should consider decompounding! There are dictionary-based filters that support decompounding German words. It's a TokenFilter that you put into your analysis chain.

There is a simple Lucene rule: whenever you need wildcards, think about your analysis; you probably did something wrong :-) Add stemming, decompounding, synonyms, ...
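To make the idea concrete, here is a minimal sketch in plain Java of what dictionary-based decompounding does at index time. It is not the Lucene filter itself: the class name DecompoundSketch, the method decompound, the minimum subword length of 4, and the tiny dictionary are all assumptions chosen for illustration. It keeps the original token and adds every dictionary word found inside it, so a plain term query on "maschine" can match "Kaffeemaschine" without any wildcards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; names and limits here are hypothetical,
// not part of the Lucene API.
class DecompoundSketch {

    // Assumed lower bound for subwords, to avoid matching tiny fragments.
    static final int MIN_SUBWORD_LENGTH = 4;

    // Returns the lowercased token followed by every dictionary word
    // that occurs inside it, ordered by start position.
    static List<String> decompound(String token, Set<String> dictionary) {
        List<String> terms = new ArrayList<>();
        String lower = token.toLowerCase(Locale.ROOT);
        terms.add(lower); // always keep the original term
        for (int start = 0; start < lower.length(); start++) {
            for (int end = start + MIN_SUBWORD_LENGTH; end <= lower.length(); end++) {
                String part = lower.substring(start, end);
                // Skip the whole token (already added) and non-dictionary parts.
                if (!part.equals(lower) && dictionary.contains(part)) {
                    terms.add(part);
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("kaffee", "maschine", "satz"));
        // prints [kaffeemaschine, kaffee, maschine]
        System.out.println(decompound("Kaffeemaschine", dictionary));
    }
}
```

In a real analysis chain the same expansion would happen inside a TokenFilter, with the subwords emitted at the position of the original token.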
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Wulf Berschin [mailto:bersc...@dosco.de]
> Sent: Wednesday, January 26, 2011 3:56 PM
> To: java-user@lucene.apache.org
> Subject: Re: ****SPAM(5.0)**** Re: Highlight Wildcard Queries: Scores
>
> Hi Erick,
>
> good points, but:
>
> our index is fed with German text. In German (in contrast to English) nouns
> are just appended to create new words. E.g.
>
> Kaffee
> Kaffeemaschine
> Kaffeemaschinensatzbehälter
>
> In our scenario standard fulltext search on "Maschine" shall present all of
> these nouns. That's why we add * before and after each term.
>
> Of course we provide an option "full words only" which finds none of these.
>
> Since we do not wrap * around words shorter than 4 characters we weren't
> yet faced with the too many clauses exception.
>
> Greetings
> Wulf
>
> On 26.01.2011 15:18, Erick Erickson wrote:
> > It is, I think, a legitimate question to ask whether scoring is
> > worthwhile on wildcards. That is, does it really improve the user
> > experience? Because the MaxBooleanClause gets tripped pretty quickly
> > if you add the terms back in, so you'd have to deal with that.
> >
> > Would your users be satisfied with sorting in the case of only
> > searching on wildcard characters by, say, dates or titles or subjects
> > or ...?
> >
> > Then just let the scoring work on fields that aren't wildcarded...
> >
> > This may not be reasonable in your particular case, but it's worth
> > considering before assuming that scoring is necessary on wildcard
> > terms.
> >
> > Best
> > Erick
> >
> > On Wed, Jan 26, 2011 at 9:10 AM, Wulf Berschin <bersc...@dosco.de> wrote:
> >
> >> Now I have the highlighted wildcards but obviously the scoring is
> >> lost. I see that a rewrite of the wildcard query produces a constant
> >> score query. I added
> >>
> >> setMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
> >>
> >> to my QueryParser instance but it had no effect. What's to be done now?
> >>
> >> Wulf
> >>
> >> On 26.01.2011 11:06, Wulf Berschin wrote:
> >>
> >>> Thank you Alexander and Uwe, for your help.
> >>>
> >>> I read Mark's explanation but it seems to me that his changes are not
> >>> contained in Lucene-3.0.3.
> >>>
> >>> So I commented out the rewrite, changed QueryTermScorer back to
> >>> QueryScorer and now I got the wildcard queries highlighted again.
> >>>
> >>> Wulf
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >>> For additional commands, e-mail: java-user-h...@lucene.apache.org