2011/8/25 Elmer van Chastelet <evanchaste...@gmail.com>:
> Hi all,
>
> Yesterday I had a discussion with Sanne on IRC [3] about the new API to
> access index readers in HS4.0. We couldn't complete our discussion
> yesterday, so let's continue here.
> As explained in the forum [1], there is currently no good solution for
> getting a reader with a subset of the indexes in a sharded environment.
>
> Currently two basic ideas came to mind:
> A - Have a SearchFactory.openIndexReader(Class<?> c,
> FullTextFilterImplementor...): This is similar to how the IndexManagers
> are gathered at query time, and is probably therefore easy to understand

The current signature does not accept a FullTextFilterImplementor, but
accepts multiple classes:
SearchFactory.openIndexReader(Class<?>... entities);
Since we can't use two varargs on the same method, this won't work
unless you're suggesting that we should support a single type only.

>
> B - (to be further reviewed) Have something like
> searchFactory.indexReaders().withShardingOptions( X, Y
> ).includeType(Class<?> z).openIndexReader(). This also adds the ability
> to get an IndexReader for multiple classes. But we need to think about
> the .withShardingOptions (or something similar), what input should we
> support here? Sharding properties are mostly based on some entity
> property(/ies), probably easy to be encoded as String. The (custom)
> sharding strategy may use such String to select the proper index
> managers. Using a String object for identifying which index managers to
> use looks fine to me. It will be compatible with current implementation
> of custom sharding strategies where one might use the Lucene document at
> addition time, or if an entity instance will also be passed (see
> discussion [2]), the properties of that entity can probably be encoded to
> some String. And if HS will cover the mapping/have support for Strings
> as identifiers for sharding instead of a user defined mapping of the
> index (integer) in the array of IndexManagers, that would be awesome :)
> (Relieves the pain of having some mapping that should be stored
> somewhere, which I currently do).

FullTextFilterImplementor could work as a withShardingOptions parameter,
but while it's true that in the end it's going to filter some values, I
wonder if the concept of Filters is misleading in this case.

> Still, we need to know the use cases there might be, i.e. which
> flexibility the API should offer.

I'd add another option:
C - SearchFactory.openIndexReader(String...
indexName);
This is simple, but it in no way delegates the choice of index names to
the ShardingStrategy, which I think would be far more usable.

> As is also mentioned in [1], there is currently no direct access to the
> index managers, so getting a FSDirectory is currently not possible in
> 4.0alpha1. I think HS should support this to offer the flexibility to
> work on the Lucene indexes directly (for example, to build an auto
> completion/spell check index from an existing index)

Why would you need direct access to a Directory? Isn't it enough to
provide access to the IndexReader?

>
> Let's start by setting up some requirements?
> ---------
> *1 Have access to IndexReader for one class
> *2 Have access to IndexReader with a subset of IndexManagers based on
> sharding strategy. Sharding strategies are mostly based on some
> propert(y/ies) of an entity instance, which can likely be encoded to
> some String.
> *3 Have access to index directories (FSDirectory/...). Unlike previous
> versions (< HS4.0) it would be nice if this uses the ShardingStrategy
> instance in use, so mapping is completely and exclusively done in a
> ShardingStrategy

We can't provide access to a "virtual" Directory exposing the contents
of multiple Directories; that's possible with an IndexReader only.

> * ...
> ---------
>
> Please extend/modify the list of requirements if you think something is
> missing/incorrect and drop your ideas/thoughts about the mentioned
> ideas.
>
>
> Elmer
>
>
>
> [1] https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
> [2]
> http://www.mailinglistarchive.com/html/hibernate-dev@lists.jboss.org/2011-08/msg00091.html
> [3] IRC log:
>
> <elmervc> sannegrinovero, have you read/did you have time to think about
> https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
> <sannegrinovero> hi elmervc , yes I've read it.
> my next thing on the todo is to make some prototype, as I'm not happy
> with the current ideas:
> <sannegrinovero> elmervc, are you blocked by this? the workaround is
> very simple
> <sannegrinovero> generally, I'm wondering if we can avoid having to
> expose the DirectoryProviders. I would want them gone from the public
> API, but of course limitations like this are not acceptable.
> <elmervc> sannegrinovero, I'm branching this migration, so it's not
> really blocking. But I would like to try the new H core/search, so for
> that to work I need access to the subset of indices
> <elmervc> What workaround were you thinking about ?
> <elmervc> Just construct an index reader/FSDirs myself using 'hardcoded'
> paths ?
> <sannegrinovero> nono that's ugly..
> <sannegrinovero> elmervc, all logic to open this IR is in
> org.hibernate.search.impl.ImmutableSearchFactory.openIndexReader(Class<?>...)
> <sannegrinovero> elmervc, and it's just a couple of lines to change ;)
> <sannegrinovero> the problem is more how to make it easy to consume
> <elmervc> Ok, I'll look into that :)
> <elmervc> Using filters is not a good idea?
> <sannegrinovero> yes I liked your suggestion. but is it enough ?
> <sannegrinovero> and how would the methods look like?
> <sannegrinovero> (i.e. the signature)
> <elmervc> SearchFactory.openIndexReader(Class<?> c,
> FullTextFilterImplementor[] filters) , or what do you mean?
> <sannegrinovero> I'd prefer SearchFactory.openIndexReader(Class<?> c,
> FullTextFilterImplementor... filters)
> <elmervc> But I'm not sure if this covers all use cases of sharding
> <sannegrinovero> elmervc, the methods don't need necessarily be defined
> on the SearchFactory. We can think of something like
> searchFactory.indexReaders().withShardingOptions( X, Y
> ).includeType(Class<?> z).openIndexReader() .. how does that look like?
> <sannegrinovero> I'm just tossing out some ideas, but then we should
> bring this up to the mailing list.
> <elmervc> the .includeType , do you mean that multiple classes can be
> included?
> <sannegrinovero> yes
> <sannegrinovero> basically the indexReaders() method would open a
> context, private to this invocation chain only. (i.e. not affecting
> other threads invoking .indexReaders() )
> <elmervc> Sounds cool. But then we need to think about
> the .withShardingOptions, or something similar. For transparency it's
> best to have something similar to the methods in the ShardingStrategy
> interface
> <elmervc> Or something similar to what is done @ querytime, i.e.
> FullTextFilterImplementors
> <elmervc> The point is, we need to know what other use cases one might
> have
> <elmervc> That's related to how sharding is done, i.e. ... might be a
> field in the doc , full text filter, ...
> <elmervc> (doc = doc to be added)
> <sannegrinovero> yes exactly I need use cases to understand this, that's
> why your feedback is very much appreciated :)
> <elmervc> sannegrinovero, For example, our sharding strategy is based on
> some field in an entity that is added to the Lucene Document (actually,
> it has a @Field anno, and this field is removed from the Lucene Document
> in the shardingstrategy.getDirectoryProviderForAddition(...))
> <sannegrinovero> elmervc, lol that proves another discussion I had
> recently in proposing that we should pass the entity instance and not
> the document to the sharding strategy.
> <elmervc> It might be useful indeed, but in our case it's easier to use
> a Field in the doc, because that field will always have the same name,
> i.e. we can reuse the same sharding strategy.
> <sannegrinovero> elmervc, this discussion is very interesting but I'm
> busy in other chats now which I can't postpone. Could you please
> synthesize this and send a mail to the developer list?
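
To make option B from the thread a bit more concrete, here is a minimal
Java sketch of a fluent, per-invocation context. Everything in it is
hypothetical: IndexReaderSketch, ShardSelector, Request and
resolveIndexNames() are illustration names only and are not part of
Hibernate Search. It assumes sharding options are encoded as Strings, as
proposed above, and it returns the selected index names instead of
opening a real IndexReader. Because types are added one call at a time,
the chain also sidesteps the "two varargs on one method" limitation of
option A.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of option B; none of these types exist in Hibernate Search.
final class IndexReaderSketch {

    /** Hypothetical stand-in for a sharding strategy: picks index names
     *  for one entity type, given String-encoded sharding options. */
    interface ShardSelector {
        List<String> indexNamesFor(Class<?> entityType, List<String> shardingOptions);
    }

    /** Per-invocation context: state lives in this object only, so two
     *  threads building their own chains do not affect each other. */
    static final class Request {
        private final ShardSelector selector;
        private final List<String> shardingOptions = new ArrayList<>();
        private final Set<Class<?>> types = new LinkedHashSet<>();

        Request(ShardSelector selector) {
            this.selector = selector;
        }

        Request withShardingOptions(String... options) {
            shardingOptions.addAll(Arrays.asList(options));
            return this;
        }

        Request includeType(Class<?> type) {
            types.add(type);
            return this;
        }

        /** Stand-in for openIndexReader(): collects the index names the
         *  selector chooses for every included type, without duplicates. */
        List<String> resolveIndexNames() {
            Set<String> names = new LinkedHashSet<>();
            for (Class<?> type : types) {
                names.addAll(selector.indexNamesFor(type, shardingOptions));
            }
            return new ArrayList<>(names);
        }
    }

    /** Entry point mirroring searchFactory.indexReaders(): a fresh context per call. */
    static Request indexReaders(ShardSelector selector) {
        return new Request(selector);
    }
}
```

A call chain would then read
IndexReaderSketch.indexReaders(selector).withShardingOptions("2011").includeType(Book.class).resolveIndexNames(),
where selector and Book are again placeholders. A real implementation
would hand the resolved names (or IndexManagers) to whatever opens the
reader.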
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev