On 13 August 2015 at 08:33, Gunnar Morling <gun...@hibernate.org> wrote:
> Hi,
>
> 2015-08-12 17:46 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
>> That's an interesting proposal, as index sharing inherently implies
>> that fields on different types shall not have conflicting mapping
>> (i.e. don't reuse the same field name for a different type).
>>
>> By default we don't share indexes across unrelated types, but also *by
>> default* subtypes are indexed in the same index as their parent - if
>> the parent is indexed as well.
>
> Yes, I think that's the case where it makes sense. It'd make sense to
> re-phrase the docs in that regard.
>
>>
>> The reason is to efficiently map a polymorphic domain: when people
>> search for type X, they implicitly also search for its subtypes as
>> these are valid candidates for the query.
>> Having them all in the same index makes for better result quality and
>> better search performance - as joining multiple IndexReaders to
>> perform a cross-index Query is generally a bad idea, as it's then
>> hard to accurately normalize statistics across different vector
>> spaces, and that's what defines the quality of the search result.
>> At least I believe that *generally* that would give you better
>> results, but that's why we give options, and also why sometimes people
>> might want multiple Domain objects to be stored in the same index:
>> they might be "subtypes" from a domain perspective even if they don't
>> technically use inheritance at the Java level: they might be different
>> types and yet be mapped to some common fields with (hopefully)
>> compatible indexing options.
>
> Have you ever seen this as an actual requirement by someone?

Yes, not least by myself :)
You might have various types which don't share a Java inheritance tree
but still have some common property. It could be a simple tagging
system, or just the classical example of the "title" of a product. Some
people will have a Product parent class; others might have no love for
expressing their model in a Java inheritance straitjacket. A real-world
large information system seldom follows the Animal examples of
textbooks.

Consider also that you might not want to *search* for these different
types, but still index them together. E.g. do some computation like
finding the most frequently used tag across various types, or implement
an auto-suggester field for a UI in which the exact target domain type
is yet to be filled in by some follow-up step.

So while I agree it doesn't seem a great idea to run a query which
could return multiple different types (unrelated other than by
inheritance from Object), there are many other cases; even a
mixed-type search is not too hard to handle when using a Projection.

>> If we were to drop index sharing, then I think it should be fair to
>> also not support multiple types as target for a query anymore; as I'm
>> assuming in this case you'd only share for subtypes of some common
>> parent, and you'd target that common parent exclusively to perform a
>> polymorphic query.
>
> Assuming we'd drop index sharing for unrelated types but would
> continue to support it for the types of one inheritance hierarchy, one
> still might want results only from a sub-set of the hierarchy's types.
>
>>
>> So that's the reasons for which it exists; there are some good reasons
>> to not allow it too: as you mention the filtering, but also the very
>> fact that the type information has to be stored in form of classname
>> (typename, in free-form).
>
> Interestingly, that's not so much an issue with ES. There you always
> add a "type" discriminator.

Right, any discriminator is quite cheap with Lucene.
I'm just trying to think what benefits it would have, but I think it's
clear we need to stick with it.

>> I think the strongest reason to not allow it is to avoid the
>> inconsistent field mappings, but we could compensate for that with
>> better schema validation - something which seems to be getting more
>> necessary anyway.
>
> Yes, that'd help. All in all, index sharing for inheritance hierarchies
> makes sense to me, but I am doubtful about sharing between unrelated
> types.

I'll assume the above examples changed your mind ;)

Cheers,
Sanne

>
>>
>> I didn't mean to kill the proposal :) just hoping it helps figure out
>> why someone might need it. Would be nice to think of alternatives out
>> of the box to avoid the filtering.
>>
>> Sanne
>
> --Gunnar
>
>>
>>
>>
>> On 12 August 2015 at 15:30, Gunnar Morling <gun...@hibernate.org> wrote:
>>> Hibernate Search aficionados,
>>>
>>> I am wondering what the rationale is for offering the feature of
>>> index sharing [1].
>>>
>>> The ref guide says "there is really not much benefit in sharing
>>> indexes". It complicates queries, as an additional filter on the type
>>> field must be applied in case of targeting only one entity using a
>>> shared index.
>>>
>>> Should we consider to drop this feature in HS 6?
>>>
>>> Thanks,
>>>
>>> --Gunnar
>>>
>>> [1]
>>> https://docs.jboss.org/hibernate/search/5.4/reference/en-US/html_single/#section-sharing-indexes
>>> _______________________________________________
>>> hibernate-dev mailing list
>>> hibernate-dev@lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
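
To make the trade-off discussed in this thread concrete, here is a minimal
sketch in plain Java. It does not use Hibernate Search or Lucene; the
"index" is just an in-memory list, and the names SharedIndexSketch, Doc,
sampleIndex, mostFrequentTag and searchByTag, as well as the Book/Video
sample data, are hypothetical and only there for illustration. It shows
two unrelated types sharing one index through a common "tag" field: a
cross-type computation needs no type filter, while a query targeting a
single type has to add a filter on the type discriminator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedIndexSketch {

    /** One indexed document: a type discriminator plus common fields. */
    public static final class Doc {
        public final String type;   // discriminator, e.g. the entity class name
        public final String title;  // common field shared by unrelated types
        public final String tag;    // common field shared by unrelated types

        public Doc(String type, String title, String tag) {
            this.type = type;
            this.title = title;
            this.tag = tag;
        }
    }

    /** Hypothetical sample data: two unrelated types in one shared index. */
    public static List<Doc> sampleIndex() {
        return Arrays.asList(
                new Doc("Book", "Lucene in Action", "search"),
                new Doc("Book", "Effective Java", "java"),
                new Doc("Video", "Intro to full-text search", "search"),
                new Doc("Video", "Cooking basics", "food"));
    }

    /** Cross-type computation: counts tags over all types, no type filter. */
    public static String mostFrequentTag(List<Doc> index) {
        Map<String, Integer> counts = new HashMap<>();
        for (Doc d : index) {
            counts.merge(d.tag, 1, Integer::sum);
        }
        String best = null;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (best == null || e.getValue() > counts.get(best)) {
                best = e.getKey();
            }
        }
        return best;
    }

    /** Single-type query: the extra filter on the discriminator is required. */
    public static List<Doc> searchByTag(List<Doc> index, String tag, String type) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.tag.equals(tag) && d.type.equals(type)) {
                hits.add(d);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> index = sampleIndex();
        System.out.println(mostFrequentTag(index));                      // prints "search"
        System.out.println(searchByTag(index, "search", "Book").size()); // prints "1"
    }
}
```

Without the type check in searchByTag, the "search" query would also
return the Video document, which is the filtering cost raised in the
original proposal.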