Hi, I think a great solution for that would be to leverage "annotation composition" as done e.g. in CDI and Bean Validation.
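To make the idea a bit more concrete, here is a minimal, self-contained sketch of how such composition could be resolved at runtime, using the annotation names proposed in this mail. Everything in it is hypothetical: neither the annotations nor the `docValuesFieldNames` helper exist in Hibernate Search. Instead of an @OverridesAttribute mechanism, the sketch simply assumes that a composed annotation exposes an attribute called `name`, and it falls back to the Java field name when that attribute is left empty.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

public class DocValuesCompositionSketch {

    // Hypothetical base annotation: usable on a field or as a meta-annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.ANNOTATION_TYPE})
    public @interface DocValuesField {
        String name() default "";
    }

    // Hypothetical composed annotation: it is itself marked with @DocValuesField.
    // Assumption: "name" is matched by convention rather than via @OverridesAttribute.
    @DocValuesField
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface SortField {
        String name() default "";
    }

    // Example entity mixing direct and composed usage.
    public static class Book {
        @DocValuesField(name = "isbn_dv")
        private String isbn;

        @SortField(name = "title_sort")
        private String title;

        @SortField
        private String author;

        private String summary; // no doc values requested
    }

    // Maps each Java field that requests doc values to the doc values field name.
    public static Map<String, String> docValuesFieldNames(Class<?> type) {
        Map<String, String> result = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            for (Annotation annotation : field.getDeclaredAnnotations()) {
                Class<? extends Annotation> annotationType = annotation.annotationType();
                boolean direct = annotationType == DocValuesField.class;
                boolean composed = annotationType.isAnnotationPresent(DocValuesField.class);
                if (direct || composed) {
                    String name = readName(annotation);
                    result.put(field.getName(), name.isEmpty() ? field.getName() : name);
                }
            }
        }
        return result;
    }

    // Reads the "name" attribute reflectively; returns "" if there is none.
    private static String readName(Annotation annotation) {
        try {
            Object value = annotation.annotationType().getMethod("name").invoke(annotation);
            return value instanceof String ? (String) value : "";
        } catch (ReflectiveOperationException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        // prints {author=author, isbn=isbn_dv, title=title_sort}
        System.out.println(docValuesFieldNames(Book.class));
    }
}
```

The point of the sketch is only that the processing code needs to know about a single annotation, @DocValuesField; any number of special-purpose annotations can then be layered on top without touching it.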

There would be an annotation @DocValuesField which causes the creation of a
DocValues field and exposes the attributes required for its configuration:

    @DocValuesField(name="foo", ... )
    private String bar;

Besides using @DocValuesField directly, it would be usable as a
meta-annotation on "doc value annotation types":

    @DocValuesField
    public @interface SortField {

        @OverridesAttribute(name="name")
        String name();
    }

And then its usage:

    @SortField
    private String bar;

Of course such doc value annotation types (I think @SortField, @Facet and
@DocumentId could be modelled as that) would only expose those degrees of
freedom needed for a specific use case.

Most users would only use these more abstract, easy-to-use, special-purpose
annotations. But others could use @DocValuesField directly to create custom
doc values, or they could even create their own domain-specific doc value
annotation type.

Cheers,

--Gunnar

2015-08-04 18:00 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:

> Hi Guillaume,
> thanks! great input. Some comments inline:
>
> On 4 August 2015 at 15:11, Guillaume Smet <guillaume.s...@gmail.com> wrote:
> > Hi Sanne,
> >
> > On Wed, Jul 29, 2015 at 1:26 PM, Sanne Grinovero <sa...@hibernate.org> wrote:
> >>
> >> I'm not sure if this should be extending the @Field annotation as
> >> there are special restrictions implied in terms of analysis: are we
> >> going to enforce a specific type of tokenizer, or simply take the
> >> analysis option away?
> >
> > You can't take the analysis option away: it's often used to normalize
> > sorting on strings (lowercase, remove accents, remove special characters
> > and so on).
>
> Right, we made this same example in a recent meeting we had on this same
> subject.
> So that's what makes it tricky: we want to allow Analysis, but while
> Lucene needs a strong guarantee that it will be unique, we can't
> really verify that unless we take away the liberty to use any
> analyzer.
> An alternative would be to wrap the Analyzer to monitor and verify it
> to be "well-behaved", but I'm not sure if that's doable, or if the
> performance impact would be negligible. I guess we'll just put it into the
> user's hands to make a sensible choice.. not that we've done better so far
> on this aspect.
>
> > FWIW, we use specific fields for sorting each time we need to sort on a
> > string as we don't want to tokenize the string (but not for numerics and
> > dates). Maybe @SortFields/@SortField annotations would be in order (I
> > don't like Sortable as I don't think it's a good idea to use these fields
> > for search).
>
> I like that name proposal, and +1 to not encouraging people to try to reuse
> the same field for sorting and indexing.
>
> The next action for us is to verify what the performance impact is of
> the current approach, which is based on the UninvertingReader from
> lucene-misc. Gunnar pointed out that uninverting and loading into a
> FieldCache is not very different from what Lucene has been doing so
> far, so that might be a good strategy to allow migrating to Lucene 5
> incrementally, and provide an incremental improvement in this area
> rather than requiring the new mapping.
>
> I'll soon merge this approach, and as usual I'm lacking real-world
> applications to benchmark, so if you're interested in helping with that,
> that would be awesome; we just need to know that the new code won't be
> significantly slower than the Lucene 4 based strategies for sorted
> queries.
>
> Thanks,
> Sanne
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev