Yes, this is possible using Lucene's grouping APIs. It looks like index-time grouping won't work, since the same parent is spread out across time (and therefore across shards), but you can use two-pass grouping instead: run the FirstPassGroupingCollector on each shard, get the top groups from each, merge those and pick the top N groups, run the SecondPassGroupingCollector on each shard to get its TopGroups, and then use TopGroups.merge to merge the results.
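To make the flow of the two passes concrete, here is a rough sketch in plain Java. It does not call Lucene at all: the class and method names (ShardedGroupingSketch, firstPass, mergeFirstPass, secondPass, mergeSecondPass) are hypothetical stand-ins for the collector and merge steps described above, and it assumes all six sample rows from the question match the query, with groups ordered by their most recent date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of two-pass sharded grouping. No Lucene classes are used;
// every name here is hypothetical and only mirrors the steps described in the text.
class ShardedGroupingSketch {

    // One matching document: entity id, parent (the group key), and date.
    static final class Doc {
        final String entity, parent;
        final LocalDate date;
        Doc(String entity, String parent, String isoDate) {
            this.entity = entity;
            this.parent = parent;
            this.date = LocalDate.parse(isoDate);
        }
    }

    // Keys of the N groups with the newest date, newest first (ties broken by key).
    static List<String> newestKeys(Map<String, LocalDate> newest, int topN) {
        List<Map.Entry<String, LocalDate>> entries = new ArrayList<>(newest.entrySet());
        entries.sort((a, b) -> {
            int byDate = b.getValue().compareTo(a.getValue());
            return byDate != 0 ? byDate : a.getKey().compareTo(b.getKey());
        });
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, LocalDate> e : entries.subList(0, Math.min(topN, entries.size()))) {
            keys.add(e.getKey());
        }
        return keys;
    }

    // Pass 1, run per shard: the shard's top N groups with their newest date.
    static Map<String, LocalDate> firstPass(List<Doc> shardHits, int topN) {
        Map<String, LocalDate> newest = new HashMap<>();
        for (Doc d : shardHits) newest.merge(d.parent, d.date, (a, b) -> a.isAfter(b) ? a : b);
        Map<String, LocalDate> top = new HashMap<>();
        for (String key : newestKeys(newest, topN)) top.put(key, newest.get(key));
        return top;
    }

    // Coordinator: merge the per-shard top groups and keep the global top N.
    static List<String> mergeFirstPass(List<Map<String, LocalDate>> perShard, int topN) {
        Map<String, LocalDate> merged = new HashMap<>();
        for (Map<String, LocalDate> shard : perShard) {
            shard.forEach((k, v) -> merged.merge(k, v, (a, b) -> a.isAfter(b) ? a : b));
        }
        return newestKeys(merged, topN);
    }

    // Pass 2, run per shard: collect the shard's documents for the chosen groups only.
    static Map<String, List<Doc>> secondPass(List<Doc> shardHits, List<String> groups) {
        Map<String, List<Doc>> docs = new HashMap<>();
        for (Doc d : shardHits) {
            if (groups.contains(d.parent)) docs.computeIfAbsent(d.parent, k -> new ArrayList<>()).add(d);
        }
        return docs;
    }

    // Coordinator: merge per-shard documents, newest first, keeping docsPerGroup each.
    static Map<String, List<String>> mergeSecondPass(
            List<String> groups, List<Map<String, List<Doc>>> perShard, int docsPerGroup) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String group : groups) {
            List<Doc> all = new ArrayList<>();
            for (Map<String, List<Doc>> shard : perShard) {
                all.addAll(shard.getOrDefault(group, new ArrayList<>()));
            }
            all.sort((a, b) -> b.date.compareTo(a.date));
            List<String> ids = new ArrayList<>();
            for (Doc d : all.subList(0, Math.min(docsPerGroup, all.size()))) ids.add(d.entity);
            result.put(group, ids);
        }
        return result;
    }

    // Runs both passes over the sample data from the question (shards A1..A4).
    static Map<String, List<String>> run() {
        List<List<Doc>> shards = Arrays.asList(
                Arrays.asList(new Doc("M1", "C1", "2010-11-12")),
                Arrays.asList(new Doc("M2", "C2", "2011-11-12")),
                Arrays.asList(new Doc("M3", "C4", "2012-02-12")),
                Arrays.asList(new Doc("M4", "C1", "2012-11-12"),
                        new Doc("M5", "C2", "2012-11-13"),
                        new Doc("M6", "C3", "2012-11-14")));
        int topN = 4;
        List<Map<String, LocalDate>> firstPassResults = new ArrayList<>();
        for (List<Doc> shard : shards) firstPassResults.add(firstPass(shard, topN));
        List<String> topGroups = mergeFirstPass(firstPassResults, topN);
        List<Map<String, List<Doc>>> secondPassResults = new ArrayList<>();
        for (List<Doc> shard : shards) secondPassResults.add(secondPass(shard, topGroups));
        return mergeSecondPass(topGroups, secondPassResults, 2);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In a real deployment each firstPass and secondPass call would be a request to a remote shard, and the two merge steps would run on whichever node coordinates the search. With the sample data the groups come back in the order C3, C2, C1, C4.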
Lucene provides the APIs to do this ... but it's up to you to send requests out to other shards, gather the results, call the merge, etc.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Nov 16, 2012 at 9:43 AM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:
> The formatter has wrecked the table... Reposting it
>
> Please read it as follows
>
> {ENTITY,PARENT,DATE,SHARD} tuple
>
> M1 C1 12/11/2010 A1
> M2 C2 12/11/2011 A2
> M3 C4 12/02/2012 A3
> M4 C1 12/11/2012 A4
> M5 C2 13/11/2012 A4
> M6 C3 14/11/2012 A4
>
> I need to group this based on parents ordered by time. The shards
> themselves are in increasing order of time {A1-A4 in ascending order of
> time}
>
> So, if for some search, the entities matched are M1,M2,M3,M4&M6, the set of
> results returned should be *C3,C2,C1,C4*
>
> I am aware of grouping search in lucene, but extending it to multiple
> shards is possible? More importantly, are there ways by which I can
> re-organize my Documents during index-time to optimize query performance
> for such a grouping feature?
>
> --
> Ravi
>
>
> On Fri, Nov 16, 2012 at 8:05 PM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:
>
>> We are trying to do a grouping search that spans multiple shards ordered
>> by time.
>>
>>
>> *ENTITY PARENT
>> TIME SHARD*
>> M1 C1
>> 12-Nov-2010 A1
>> M2 C2
>> 12-Nov-2011 A2
>> M3 C4
>> 12-Feb-2012 A3
>> M4 C1
>> 12-Nov-2012 A4
>> M5 C2
>> 13-Nov-2012 A4
>> M6 C3
>> 14-Nov-2012 A4
>>
>> I need to group this based on parents ordered by time. The shards
>> themselves are in increasing order of time {A1-A4 in ascending order of
>> time}
>>
>> So, if for some search, the entities matched are M1,M2,M3,M4&M6, the set
>> of results returned should be *C3,C2,C1,C4*
>>
>> I am aware of grouping search in lucene, but extending it to multiple
>> shards is possible? More importantly, are there ways by which I can
>> re-organize my Documents during index-time to optimize query performance
>> for such a grouping feature?
>>
>> --
>> Ravi
>>