> Current signature is not accepting the FullTextFilterImplementor, but
> accepts multiple classes:
> SearchFactory.openIndexReader(Class<?>... entities);
>
> Since we can't use two varargs on the same method, this won't work
> unless you're suggesting that we should support a single type only.

No, that's not what I meant. Maybe a ShardingOptions arg would be useful?
And that's actually another requirement: access to an IndexReader for
multiple classes/index names, applying the same 'sharding options' to all
of them.
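
To make the ShardingOptions idea a bit more concrete, here is a rough
sketch of how a single options argument could sit in front of the class
varargs, so only one varargs parameter is needed. Everything in it is an
assumption for illustration: ShardingOptions, selectIndexNames and the
"Entity.key" naming are hypothetical and are not part of Hibernate Search.
It only resolves index names instead of opening a real IndexReader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardingOptionsSketch {

    /** Hypothetical value object carrying String shard keys; not a Hibernate Search type. */
    public static final class ShardingOptions {
        private final List<String> shardKeys;

        private ShardingOptions(List<String> shardKeys) {
            this.shardKeys = shardKeys;
        }

        public static ShardingOptions of(String... shardKeys) {
            return new ShardingOptions(Arrays.asList(shardKeys));
        }

        public List<String> shardKeys() {
            return shardKeys;
        }
    }

    /**
     * Stand-in for an openIndexReader variant (hypothetical). The options object
     * comes first, so the entity classes can remain the only varargs parameter.
     * Returns the index names that would be selected, applying the same options
     * to every entity type. The "Entity.key" naming is an assumption.
     */
    public static List<String> selectIndexNames(ShardingOptions options, Class<?>... entities) {
        List<String> names = new ArrayList<>();
        for (Class<?> entity : entities) {
            for (String key : options.shardKeys()) {
                names.add(entity.getSimpleName() + "." + key);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // String and Integer only stand in for indexed entity classes.
        List<String> names = selectIndexNames(ShardingOptions.of("en", "nl"), String.class, Integer.class);
        System.out.println(names); // prints [String.en, String.nl, Integer.en, Integer.nl]
    }
}
```

In a real implementation the mapping from shard keys to IndexManagers would
presumably be delegated to the ShardingStrategy in use rather than done by
string concatenation as above.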
> I'd add another option:
>
> C - SearchFactory.openIndexReader(String... indexName);
>
> This is simple, but it is in no way delegating to the ShardingStrategy
> to make the index names choice which I think would be way more usable.

+1.

Q: Is there a difference between the readers returned by:

SearchFactory.openIndexReader("A");
SearchFactory.openIndexReader(SubA.class);

if A and SubA (a subclass of A) both have the bare @Indexed annotation,
thus sharing the same index name? That is, will
SearchFactory.openIndexReader(SubA.class) perform some filtering to only
return docs from SubA entities?

> Why would you need direct access to a Directory? isn't it enough to
> provide access to the IndexReader ?

No need for that, my mistake.

So the requirements become:
---------
*1 Have access to IndexReader for one class
*2 Have access to IndexReader with a subset of IndexManagers based on
*3 Have access to IndexReader for multiple classes/indexnames applying
the same 'sharding options' for all of them.
---------

As Emmanuel mentioned, can we think of use cases where we would like to
have access to Lucene Directories (/IndexManagers), which is currently
mentioned in the docs:
http://docs.jboss.org/hibernate/search/4.0/reference/en-US/html_single/#d0e6658
?

Elmer

On Thu, 2011-08-25 at 13:37 +0200, Sanne Grinovero wrote:
> 2011/8/25 Elmer van Chastelet <evanchaste...@gmail.com>:
> > Hi all,
> >
> > Yesterday I had a discussion with Sanne on irc [3] about the new api to
> > access index readers in HS4.0. We couldn't complete our discussion
> > yesterday, so let's continue here.
> > As explained in the forum [1], there is currently no good solution for
> > getting a reader with a subset of the indexes in a sharded environment.
> >
> > Currently two basic ideas came to mind:
> > A - Have a SearchFactory.openIndexReader(Class<?> c,
> > FullTextFilterImplementor...): This is similar to how the IndexManager's
> > are gathered at query time, and is probably therefore easy to understand
>
> Current signature is not accepting the FullTextFilterImplementor, but
> accepts multiple classes:
> SearchFactory.openIndexReader(Class<?>... entities);
>
> Since we can't use two varargs on the same method, this won't work
> unless you're suggesting that we should support a single type only.
>
> >
> > B - (to be further reviewed) Have something like
> > searchFactory.indexReaders().withShardingOptions( X, Y
> > ).includeType(Class<?> z).openIndexReader(). This also adds the ability
> > to get an IndexReader for multiple classes. But we need to think about
> > the .withShardingOptions (or something similar), what input should we
> > support here? Sharding properties are mostly based on some entity
> > property(/ies), probably easy to be encode as String. The (custom)
> > sharding strategy may use such String to select the proper index
> > managers. Using a String object for identifying which index managers to
> > use looks fine to me. It will be compatible with current implementation
> > of custom sharding strategies where one might use the Lucene document at
> > addition time, or if an entity instance will also be passed (see
> > discussion [2]), the properties of that entity can probably encoded to
> > some String. And if HS will cover the mapping/have support for Strings
> > as identifiers for sharding instead of a user defined mapping of the
> > index (integer) in the array of IndexManagers, that would be awesome :)
> > (Relieves the pain of having some mapping that should be stored
> > somewhere, which I currently do).
> >
> FullTextFilterImplementor could work as a withShardingOptions
> parameter, but while it's true that in the end it's going to filter
> some values, I wonder if the concept of Filters is misleading in this
> case.
>
> > Still, we need to know the use cases there might be, i.e. which
> > flexibility the API should offer.
>
> I'd add another option:
>
> C - SearchFactory.openIndexReader(String... indexName);
>
> This is simple, but it is in no way delegating to the ShardingStrategy
> to make the index names choice which I think would be way more usable.
>
> > As is also mentioned in [1], there is currently no direct access to the
> > index managers, so getting a FSDirectory is currently not possible in
> > 4.0alpha1. I think HS should support this to offer the flexibility to
> > work on the Lucene indexes directly (for example, to build an auto
> > completion/spell check index from an existing index)
>
> Why would you need direct access to a Directory? isn't it enough to
> provide access to the IndexReader ?
>
> >
> > Let's start by setting up some requirements?
> > ---------
> > *1 Have access to IndexReader for one class
> > *2 Have access to IndexReader with a subset of IndexManagers based on
> > sharding strategy. Sharding strategies are mostly based on some
> > propert(y/ies) of an entity instance, which can likely be encoded to
> > some String.
> > *3 Have access to index directories (FSDirectory/...). Unlike previous
> > versions (< HS4.0) it would be nice if this uses the ShardingStrategy
> > instance in use, so mapping is completely and exclusively done in a
> > ShardingStrategy
>
> We can't provide access to a "virtual" Directory exposing the contents
> of multiple Directories, that's possible with an IndexReader only.
>
> > * ...
> > ---------
> >
> > Please extend/modify the list of requirements if you think something is
> > missing/incorrect and drop your ideas/thoughts about the mentioned
> > ideas.
> >
> >
> > Elmer
> >
> >
> >
> > [1] https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
> > [2]
> > http://www.mailinglistarchive.com/html/hibernate-dev@lists.jboss.org/2011-08/msg00091.html
> > [3] IRC log:
> >
> > <elmervc> sannegrinovero, have you read/did you have time to think about
> > https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
> > <sannegrinovero> hi elmervc , yes I've read it. my next thing on the
> > todo is to make some prototype, as I'm not happy with the current ideas:
> > <sannegrinovero> elmervc, are you blocked by this? the workaround is
> > very simple
> > <sannegrinovero> generally, I'm wondering if we can avoid having to
> > expose the DirectoryProviders. I would want them gone from the public
> > API, but of course limitations like this are not acceptable.
> > <elmervc> sannegrinovero, I'm branching this migration, so it's not
> > really blocking. But I would like to try the new H core/search, so for
> > that to work I need access to the subset of indices
> > <elmervc> What workaround were you thinking about ?
> > <elmervc> Just construct an index reader/FSDirs myself using 'hardcoded'
> > paths ?
> > <sannegrinovero> nono that's ugly..
> > <sannegrinovero> elmervc, all logic to open this IR is in
> > org.hibernate.search.impl.ImmutableSearchFactory.openIndexReader(Class<?>...)
> > <sannegrinovero> elmervc, and it's just a couple of lines to change ;)
> > <sannegrinovero> the problem is more how to make it easy to consume
> > <elmervc> Ok, I'll look into that :)
> > <elmervc> Using filters is not a good idea?
> > <sannegrinovero> yes I liked your suggestion. but is it enough ?
> > <sannegrinovero> and how would the methods look like?
> > <sannegrinovero> (i.e. the signature)
> > <elmervc> SearchFactory.openIndexReader(Class<?> c,
> > FullTextFilterImplementor[] filters) , or what do you mean?
> > <sannegrinovero> I'd prefer SearchFactory.openIndexReader(Class<?> c,
> > FullTextFilterImplementor... filters)
> > <elmervc> But I'm not sure if this covers all use cases of sharding
> > <sannegrinovero> elmervc, the methods don't need necessarily be defined
> > on the SearchFactory. We can think of something like
> > searchFactory.indexReaders().withShardingOptions( X, Y
> > ).includeType(Class<?> z).openIndexReader() .. how does that look like?
> > <sannegrinovero> I'm just tossing out some ideas, but then we should
> > bring this up to the mailing list.
> > <elmervc> the .includeType , do you mean that multiple classes can be
> > included?
> > <sannegrinovero> yes
> > <sannegrinovero> basically the indexReaders() method would open a
> > context, private to this invocation chain only. (i.e. not affecting
> > other threads invoking .indexReaders() )
> > <elmervc> Sounds cool. But then we need to think about
> > the .withShardingOptions, or something similar. For transparancy it's
> > best to have something similar to the methods in the ShardingStrategy
> > interface
> > <elmervc> Or something similar to what is done @ querytime, i.e.
> > FullTextFilterImplementors
> > <elmervc> The point is, we need to know what other use cases one might
> > have
> > <elmervc> That's related to how sharding is done, i.e. ... might be a
> > field in the doc , full text filter, ...
> > <elmervc> (doc = doc to be added)
> > <sannegrinovero> yes exactly I need use cases to understand this, that's
> > why your feedback is very much appreciated :)
> > <elmervc> sannegrinovero, For example, our sharding strategy is based on
> > some field in an entity that is added to the Lucene Document (actually,
> > it has a @Field anno, and this field is removed from the Lucene Document
> > in the shardingstrategy.getDirectoryProviderForAddition(...)
> > <sannegrinovero> elmervc, lol that prooves another discussion I had
> > recently in proposing that we should pass the entity instance and not
> > the document to the sharding strategy.
> > <elmervc> It might be usefull indeed, but in our case it's easier to use
> > a Field in the doc, because that field will always have the same name,
> > i.e. we can reuse the same sharding strategy.
> > <sannegrinovero> elmervc, this discussion is very interesting but I'm
> > busy in other chats now which I can't postpone. Could you please
> > synthesize this and send a mail to the developer list?
> >
> > _______________________________________________
> > hibernate-dev mailing list
> > hibernate-dev@lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/hibernate-dev