Re: [hibernate-dev] [Search] proposing an alternative to depth in @IndexedEmbedded

Zach Kurey Wed, 24 Aug 2011 12:29:56 -0700

On Aug 24, 2011, at 8:26 AM, Sanne Grinovero wrote:

> This complicates things. First of all it means that the "subPaths"
> property should now be named "includeSubPaths" instead, as opposing to
> "excludeSubPaths".


Yes, if 'excludeSubPaths' is provided, then 'subPaths' should be renamed to
'includeSubPaths', for the sake of cleanliness and symmetry.

> Also with such names I would expect the additional
> paths to work *in addition to* normal depth.

I think I wasn't exact enough.  I would expect 'includeSubPaths' to be
incompatible with both 'depth' and 'excludeSubPaths'.  However, I would expect
'depth' and 'excludeSubPaths' to be compatible.  That combination basically
says: index using the default approach, stop only at max depth, but exclude
the paths specified.

Given:

class C {
    @IndexedEmbedded
    private Collection<D> d;
    @Field
    private int foo;
}

Illegal configuration: can't specify depth and includeSubPaths simultaneously:

class A {
    @IndexedEmbedded(
        includeSubPaths = {"d.one", "d.two"}, depth = 5
    )
    private C see;
}

Illegal configuration: specifying both includeSubPaths and excludeSubPaths is
nonsense, since a path that isn't listed in includeSubPaths won't be indexed
anyway:

class A {
    @IndexedEmbedded(
        includeSubPaths = {"d.one", "d.two"}, excludeSubPaths = {"d.three"}
    )
    private C see;
}

Valid configuration:  Excludes indexing of d.  Maybe D leads to cycles, or
expensive nested joins, and it isn't used when searching index A, so we want to
exclude it.

class A {
    @IndexedEmbedded(depth = 5, excludeSubPaths = {"d"})
    private C see;
}

Also, what constitutes a valid path is different for excludeSubPaths.  Any
point in a 'path' can be a termination point where the user says they don't
want indexing to go any further down that path, and that point could be all
the way down at a leaf.  Entries in 'includeSubPaths', by contrast, must be
leaf nodes.
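
To make the two path rules concrete, here is a minimal sketch in plain Java. It
is not Hibernate Search code: the class SubPathFilter, its methods, the flat
list of leaf paths, and the rule that each path segment counts as one level of
depth are all assumptions made for illustration. The leaf names (foo, d.one,
d.two, d.three) are borrowed from the examples above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class SubPathFilter {

    // True when prefix equals path or is a whole-segment prefix of it:
    // "d" covers "d" and "d.one", but not "dx.one".
    static boolean covers(String prefix, String path) {
        return path.equals(prefix) || path.startsWith(prefix + ".");
    }

    // Default approach plus exclusions: keep every leaf that is within
    // maxDepth and is not covered by any exclude entry.
    static List<String> withDepthAndExcludes(
            List<String> leafPaths, int maxDepth, List<String> excludes) {
        List<String> kept = new ArrayList<>();
        for (String path : leafPaths) {
            // Assumption: one level of depth per dot-separated segment.
            int level = path.split("\\.").length;
            boolean excluded = false;
            for (String exclude : excludes) {
                if (covers(exclude, path)) {
                    excluded = true;
                    break;
                }
            }
            if (level <= maxDepth && !excluded) {
                kept.add(path);
            }
        }
        return kept;
    }

    // Include approach: keep only the leaves listed explicitly.
    // Depth plays no role here.
    static List<String> withIncludes(List<String> leafPaths, List<String> includes) {
        List<String> kept = new ArrayList<>();
        for (String path : leafPaths) {
            if (includes.contains(path)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> leaves = Arrays.asList("foo", "d.one", "d.two", "d.three");
        // Excluding the intermediate node "d" prunes everything below it.
        System.out.println(withDepthAndExcludes(leaves, 5, Arrays.asList("d")));
        // Excluding a single leaf leaves its siblings alone.
        System.out.println(withDepthAndExcludes(leaves, 5, Arrays.asList("d.three")));
        // Includes name leaves one by one.
        System.out.println(withIncludes(leaves, Arrays.asList("d.one", "d.two")));
    }
}
```

In this sketch an exclude entry may stop at any node, while an include entry
only ever matches a leaf exactly.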

> So to implement your original suggestion we should have thought of a
> mapping algorithm which would use either the _depth_ approach or the
> _subPaths_ approach, but you say that in practice you would apply them
> both?
> In this case if I wanted to use the subPaths strategy only I should
> use depth=0 and then add what I want to add? Just checking if we're on
> the same page.

No, that wasn't what I meant.  I'd expect the annotation processing to 
basically look like:

IndexedEmbedded embeddedConfig =
    (IndexedEmbedded) node.getAnnotation(IndexedEmbedded.class);

// Pseudocode: annotation attributes are never null in Java, so read
// "!= null" as "was explicitly specified".
if (embeddedConfig.includeSubPaths() != null
    && (embeddedConfig.depth() != null
        || embeddedConfig.excludeSubPaths() != null)) {
  throw new IllegalArgumentException("Invalid configuration: cannot specify "
      + "includeSubPaths together with depth or excludeSubPaths");
}

Hopefully it would be understood that if only includeSubPaths is provided, then
the default depth is irrelevant, because the depth is expressed explicitly by
each path.
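
Since Java annotation attributes can never be null, the null checks above are
best read as "was this attribute set?". Below is a minimal, self-contained
sketch of the same validation under stated assumptions: EmbeddedSketch is a
hypothetical stand-in annotation (not the real Hibernate Search type), NO_DEPTH
is a made-up sentinel meaning "depth not specified", and an empty array means
"no paths given". The condition is grouped as includeSubPaths AND (depth OR
excludeSubPaths).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

class EmbeddedConfigCheck {

    // Throws if includeSubPaths is combined with depth or with excludeSubPaths.
    static void validate(EmbeddedSketch cfg) {
        boolean hasInclude = cfg.includeSubPaths().length > 0;
        boolean hasDepth = cfg.depth() != EmbeddedSketch.NO_DEPTH;
        boolean hasExclude = cfg.excludeSubPaths().length > 0;
        if (hasInclude && (hasDepth || hasExclude)) {
            throw new IllegalArgumentException(
                "Invalid configuration: cannot specify includeSubPaths "
                + "together with depth or excludeSubPaths");
        }
    }

    // Looks up the named field on owner and reports whether its
    // configuration passes validate().
    static boolean isValid(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            EmbeddedSketch cfg = field.getAnnotation(EmbeddedSketch.class);
            if (cfg != null) {
                validate(cfg);
            }
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // The combinations discussed in this thread, as annotated fields.
    static class Samples {
        @EmbeddedSketch(includeSubPaths = {"d.one", "d.two"}, depth = 5)
        Object includeAndDepth;

        @EmbeddedSketch(includeSubPaths = {"d.one", "d.two"},
                        excludeSubPaths = {"d.three"})
        Object includeAndExclude;

        @EmbeddedSketch(depth = 5, excludeSubPaths = {"d"})
        Object depthAndExclude;

        @EmbeddedSketch(includeSubPaths = {"d.one", "d.two"})
        Object includeOnly;
    }

    public static void main(String[] args) {
        String[] names = {
            "includeAndDepth", "includeAndExclude", "depthAndExclude", "includeOnly"
        };
        for (String name : names) {
            System.out.println(name + " valid: " + isValid(Samples.class, name));
        }
    }
}

// Hypothetical stand-in for the proposed attributes; NOT the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface EmbeddedSketch {
    int NO_DEPTH = -1; // assumed sentinel meaning "depth was not specified"

    int depth() default NO_DEPTH;
    String[] includeSubPaths() default {};
    String[] excludeSubPaths() default {};
}
```

With these sample fields, only the two that combine includeSubPaths with
another attribute are rejected; depth plus excludeSubPaths, and includeSubPaths
on its own, both pass.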

> 
> Do you have a great example to support the more complex option? We
> have to start somewhere, but the property names should be final and
> the meaning should not change in future if we then want to add the
> exclusions in future.

I think the complex option you thought I was implying was a mixed-bag approach,
which I'm not advocating.  My only purpose in suggesting the 'exclude' option
is that if I have 100 properties I want to index for a particular entity, then
listing all 100 explicitly in 'includeSubPaths' could be laborious (and some
might think messy).  Those 100 properties could be directly on the entity, or
they could be reached through associated entities.  However, if I rely on
'depth' to get those 100 properties, I might end up with 1000 values indexed
(mostly waste, and potentially costly).  In that case maybe those 900 other
values come from a particular unused path, or from a path I can prune a bit
via 'excludeSubPaths'.

Overall I think three options (the default approach, default + excludeSubPaths,
or includeSubPaths with no default depth or excludes) give users good choices
for how they want to go about indexing, and they can pick the least painful
approach for their particular use case.  Most cases are going to be simple:
for a particular entity only a very limited subset of properties is needed for
search, so I'd probably go with 'includeSubPaths' most of the time in our
particular object model.

Hope that clarifies things?

Zach



_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
