It looks like an interesting idea, especially as it keeps the simple use case simple (i.e. simply not defining a queryAnalyzer).
Can you explain to me why you would need a different analyzer for a wildcard query? My brain is still tanning on the beach.

Brainstorming here, we could do the following:

    @AnalyzerDef.target
    enum AnalyzerTarget { ALL, INDEXING, QUERY, WILDCARD }

So you could define the same @AnalyzerDef.name several times, provided that the definitions did not share the same targets. But that would also change the API for the dynamic analyzer, I suppose. It also does not cover the @Analyzer.impl usage.

On Tue 2013-08-13 10:13, Guillaume Smet wrote:
> Hi,
>
> Note: this is just a prospective idea I'd like to discuss. Even if
> it's a good idea, it's definitely 5.0 material.
>
> Those who have used Solr and are familiar with the Solr schema have
> already seen the ability to use different analyzer for indexing and
> querying.
>
> It's usually useful when you use analyzers which returns several
> tokens for a given token: the QueryParser usually can't build the
> correct query with these analyzers.
>
> To take an example from my current work on HSEARCH-917 (soon to come
> \o/), I have the following case. From i-pod, the analyzer builds ipod
> i pod i-pod. ipod and i-pod aren't the issue here but the fact that i
> pod is on two tokens makes the QueryParser build an incorrect query
> (even if I use the Lucene 4.4 version which is a little bit smarter
> about these cases and at least make the i-pod ipod case work
> correctly).
>
> The fact is that if the analyzer used at indexing has correctly
> indexed all the tokens, I don't need to expand the terms at querying
> and it should be sufficient to use a simple analyzer to lowercase the
> string and remove the accents.
>
> Solr introduced this feature a long time ago (it was already there in
> the good old times of 1.3) and I'm wondering if we shouldn't introduce
> it in Hibernate Search too.
>
> As for the implementation, I was thinking about adding an attribute
> queryAnalyzer to the @Field annotation.
> I was also wondering if we
> shouldn't add the ability to define an Analyzer for wildcard queries
> (Lucene introduced recently an AnalyzingQueryParser to do something
> like that).
>
> And maybe, in this case, it would be a good idea to centralize the
> configuration with types as it's done in Solr? Usually, the three
> analyzers definitions would come together.
>
> As for my particular needs, most of my full text fields would be
> analyzed like this:
>
> indexing:
> @AnalyzerDef(name = HibernateSearchAnalyzer.TEXT,
>     tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
>     filters = {
>         @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
>         @TokenFilterDef(factory = WordDelimiterFilterFactory.class, params = {
>             @org.hibernate.search.annotations.Parameter(name = "generateWordParts", value = "1"),
>             @org.hibernate.search.annotations.Parameter(name = "generateNumberParts", value = "1"),
>             @org.hibernate.search.annotations.Parameter(name = "catenateWords", value = "1"),
>             @org.hibernate.search.annotations.Parameter(name = "catenateNumbers", value = "0"),
>             @org.hibernate.search.annotations.Parameter(name = "catenateAll", value = "0"),
>             @org.hibernate.search.annotations.Parameter(name = "splitOnCaseChange", value = "0"),
>             @org.hibernate.search.annotations.Parameter(name = "splitOnNumerics", value = "0"),
>             @org.hibernate.search.annotations.Parameter(name = "preserveOriginal", value = "1")
>             }
>         ),
>         @TokenFilterDef(factory = LowerCaseFilterFactory.class)
>     }
> ),
> querying:
> @AnalyzerDef(name = HibernateSearchAnalyzer.TEXT,
>     tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
>     filters = {
>         @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
>         @TokenFilterDef(factory = LowerCaseFilterFactory.class)
>     }
> ),
> wildcard:
> @AnalyzerDef(name = HibernateSearchAnalyzer.TEXT,
>     tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
>     filters = {
>         @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
>         @TokenFilterDef(factory = LowerCaseFilterFactory.class)
>     }
> ),
>
> I could contribute time to work on this if we can agree on the way to
> pursue this idea.
>
> Thanks for your feedback.
>
> --
> Guillaume
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
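
To make the AnalyzerTarget brainstorm a bit more concrete, here is a rough sketch of the rule "same name several times, as long as the targets do not overlap". Everything in it is an assumption for discussion: AnalyzerRegistry, register() and resolve() are hypothetical names, not existing Hibernate Search API, and analyzers are represented by plain strings so the sketch has no dependencies. It also assumes that ALL conflicts with any other definition under the same name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical: mirrors the enum proposed in this thread; not an existing type.
enum AnalyzerTarget { ALL, INDEXING, QUERY, WILDCARD }

// Hypothetical registry keyed by analyzer name; analyzers are plain strings here.
final class AnalyzerRegistry {

    private static final class Definition {
        final Set<AnalyzerTarget> targets;
        final String analyzer;

        Definition(Set<AnalyzerTarget> targets, String analyzer) {
            this.targets = EnumSet.copyOf(targets);
            this.analyzer = analyzer;
        }

        // A definition applies to a target if it names it, or if it is declared for ALL.
        boolean covers(AnalyzerTarget target) {
            return targets.contains(AnalyzerTarget.ALL) || targets.contains(target);
        }
    }

    private final Map<String, List<Definition>> definitions = new HashMap<>();

    // Registers a definition; rejects it if another one with the same name shares a target.
    void register(String name, Set<AnalyzerTarget> targets, String analyzer) {
        if (targets.isEmpty()) {
            throw new IllegalArgumentException("At least one target is required");
        }
        List<Definition> existing = definitions.computeIfAbsent(name, k -> new ArrayList<>());
        for (Definition def : existing) {
            if (overlaps(def.targets, targets)) {
                throw new IllegalStateException(
                        "Analyzer '" + name + "' is already defined for an overlapping target");
            }
        }
        existing.add(new Definition(targets, analyzer));
    }

    // Assumption: ALL overlaps with every other target set.
    private static boolean overlaps(Set<AnalyzerTarget> a, Set<AnalyzerTarget> b) {
        if (a.contains(AnalyzerTarget.ALL) || b.contains(AnalyzerTarget.ALL)) {
            return true;
        }
        for (AnalyzerTarget target : a) {
            if (b.contains(target)) {
                return true;
            }
        }
        return false;
    }

    // Returns the analyzer registered under this name for the given target.
    String resolve(String name, AnalyzerTarget target) {
        List<Definition> candidates =
                definitions.getOrDefault(name, Collections.<Definition>emptyList());
        for (Definition def : candidates) {
            if (def.covers(target)) {
                return def.analyzer;
            }
        }
        throw new IllegalArgumentException(
                "No analyzer '" + name + "' defined for target " + target);
    }
}
```

With this, Guillaume's three definitions could all be registered under the same name with INDEXING, QUERY and WILDCARD respectively, while the simple use case stays a single definition with ALL. It still says nothing about the dynamic analyzer or @Analyzer.impl.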