Well, what you're really doing, in your example, is searching
on all the terms that start with cat and are less than 7 characters
long.

So it seems to me that you can pick out the terms yourself and assemble
your own big OR clause rather than rely on Lucene's old behavior.

By that, I mean use a WildcardTermEnum on cat*. As you enumerate
over all the terms, add the term to an OR clause if it's less than
7 characters long.

Really, this is what was happening under the covers before, but
since the behavior has changed (actually been corrected), you probably
can emulate it like this.
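
To make the selection rule concrete, here is a rough, library-free sketch
of the filtering step. It is not Lucene code: a sorted set of strings stands
in for the index's term dictionary, and the class name, method name, and
maxExtra parameter are made up for illustration. In a real index the terms
would come from enumerating a WildcardTermEnum on cat*, and each surviving
term would presumably be wrapped in a TermQuery and added to a BooleanQuery
as an optional (SHOULD) clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class TrailingQuestionMarks {

    /**
     * Hypothetical helper: picks the terms that a pattern like "cat???" used
     * to match, i.e. the prefix itself plus at most maxExtra more characters.
     * The sorted set is only a stand-in for the index's term dictionary.
     */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxExtra) {
        List<String> clauses = new ArrayList<String>();
        // tailSet(prefix) starts at the first term >= prefix, similar to
        // positioning a term enumeration at the prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so no later term can share the prefix
            }
            if (term.length() <= prefix.length() + maxExtra) {
                clauses.add(term); // this term would become one OR clause
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<String>(Arrays.asList(
                "car", "cat", "catalog", "category", "cats", "cattle", "dog"));
        // "cat???" -> prefix "cat" with up to 3 extra characters
        System.out.println(expand(terms, "cat", 3)); // prints [cat, cats, cattle]
    }
}
```

Since the number of query clauses is limited in your setup, the same loop
is also a natural place to stop early once enough terms have been collected.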

Hope this helps
Erick

On 6/7/07, Tim Smith <[EMAIL PROTECTED]> wrote:

Hi!

The situation is the following:

In my native language, stemming is not available for
Lucene AFAIK, and there are plenty of word forms
that need to be stemmed. People usually search with
'*' at the end of the word for this reason, but
because of the nature of our language, in most
cases it expands to many more words than is acceptable.
We've limited the number of query clauses for
performance reasons.
That's where '???' comes into the picture. We were running
on 1.4.x until now, and our Help and FAQ suggest that
people use '???' for stemming, since '*' works
only for long words (fewer expanded variations). (Our
stems are mostly 1, 2, or 3 chars long.)
'???' was almost perfect for this purpose. It returned
the original word and most of the stemmed variations.
Now it is completely broken, so we need to find a
solution.
I know this is a special case for a special language,
but this is a real problem for us now.

Thanks,
Tim

--- Erick Erickson <[EMAIL PROTECTED]> wrote:

> Well, having your application depend upon incorrect behavior
> is...er...fraught.
>
> It looks like what you really want is custom behavior for multiple
> question marks, perhaps only with multiple question marks
> at the end of your query?
>
> If this is the case, I'd think about substituting splat (*) in this
> case at query time. So you simply transform
> cat??? to cat*....
>
> If that doesn't satisfy your requirements, perhaps you could
> post a more detailed explanation of what you're trying to
> accomplish.
>
> Best
> Erick
>
> On 6/6/07, Tim Smith <[EMAIL PROTECTED]> wrote:
> >
> > Hi!
> >
> > How can I restore the behavior of the old
> > WildcardQuery under 2.1?
> > I badly need 'cat???' to match 'cat' again, just like
> > in the older versions.
> >
> > I could modify my instance of Lucene by removing those
> > "new" lines, but I don't want to maintain a custom
> > Lucene package.
> >
> > Please help!
> >
> > Tim
> >
> > Source: LUCENE-306
> > >
> > > ********************************************************************
> > > --- WildcardTermEnum.org      2004-05-11 11:42:10.000000000 -0400
> > > +++ WildcardTermEnum.java     2004-11-08 14:35:14.823610500 -0500
> > > @@ -132,6 +132,10 @@
> > >              }
> > >              else
> > >              {
> > > +           //to prevent "cat" matches "ca??"
> > > +           if(wildchar == WILDCARD_CHAR){
> > > +             return false;
> > > +           }
> > >                // Look at the next character
> > >                wildcardSearchPos++;
> > >              }
> > >
> > > ********************************************************************

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

