On 04 Mar 2014, at 12:09, Guillaume Smet <guillaume.s...@gmail.com> wrote:
> On Tue, Mar 4, 2014 at 11:09 AM, Emmanuel Bernard
> <emman...@hibernate.org> wrote:
>> I would like to separate the notion of autosuggestion from the wildcard
>> problem. To me they are separate and I would love Hibernate Search to
>> offer an autosuggest and spell checker API.
>
> AFAICS from the changelog of each version, autosuggest is still a vast
> work in progress in Lucene/Solr.

So? :)

>> Back to wildcard. If we have an analyzer stack that separates normalizer
>> filters from filters generating additional tokens (see my email [AND]), then
>> it is a piece of cake to apply the right filters, raise an exception if
>> someone tries to wildcard on n-grams, and simply ignore the synonym filter.
>
> In theory, yes.
>
> But for the tokenization, we use WhitespaceTokenizer and
> WordDelimiterFilter, which generates new tokens (for example, depending
> on the options you use, you can index wi-fi as wi and fi, wi-fi and
> wifi).

OK, that poses a problem for the wildcard if wi and fi are separated. But I don't think it's an issue for the AND case, as we would get the expected query:

- hotel AND wi-fi
- hotel AND wi AND fi

And to be fair, how do you plan to make wildcards and wi-fi work together in Lucene (is any solution available)? The solution I can think of is to index the property with an analyzer stack that does not split such words into two tokens.

> The problem with this particular filter is also that we put it after the
> ASCIIFoldingFilter, because we want the input to be as clean as
> possible, but before the LowerCaseFilter, as WordDelimiterFilter can do
> its magic on case changes too.
>
> If you separate the normalizer from the tokenizer, I don't think it's going to
> be easy to order them adequately.

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
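P.S. To make the wi-fi example above concrete: here is a small self-contained sketch (plain Java, not Lucene's actual WordDelimiterFilter) of the token expansion Guillaume describes, i.e. with word-part generation and catenation enabled, a hyphenated token such as wi-fi yields the parts wi and fi plus the catenated wifi. The class and method names are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mimics the effect of WordDelimiterFilter's
// GENERATE_WORD_PARTS + CATENATE_WORDS options on a hyphenated token.
public class WordDelimiterSketch {
    static List<String> expand(String token) {
        List<String> out = new ArrayList<>();
        String[] parts = token.split("-");
        if (parts.length > 1) {
            // word parts: "wi-fi" -> "wi", "fi"
            for (String p : parts) {
                out.add(p.toLowerCase());
            }
            // catenated form: "wi-fi" -> "wifi"
            out.add(String.join("", parts).toLowerCase());
        } else {
            out.add(token.toLowerCase());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("wi-fi")); // [wi, fi, wifi]
        System.out.println(expand("hotel")); // [hotel]
    }
}
```

This also shows why a wildcard query like wi-f* cannot match: the index only contains wi, fi and wifi, never the original wi-fi term, unless the field is indexed with a second analyzer stack that skips this filter.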