We have a similar distributed search system, and we ended up with the following scheme. Search replicas (the machines where the index resides) build FacetResults from their own index chunk (the top N categories with document counts). The results are then merged "by hand", summing the counts of matching categories from the different replicas.
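A minimal sketch of that merge step, assuming each replica's top-N result has already been flattened into a map from category path to document count. FacetMerge, mergeCounts and topN are hypothetical names for illustration and are not Lucene classes; the sketch uses only java.util. Categories are keyed by path (e.g. "Movie/Drama") rather than by ordinal, because the same category can get a different ordinal in each taxonomy index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: merges per-replica facet counts.
final class FacetMerge {

    // Sums the counts of identical category paths across all replicas.
    static Map<String, Integer> mergeCounts(List<Map<String, Integer>> perReplica) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> counts : perReplica) {
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    // Returns the n categories with the highest merged count,
    // breaking ties by category path so the order is deterministic.
    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> merged, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(merged.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }
}
```

One caveat with this kind of merge: each replica reports only its own top N, so a category that falls just outside the top N on some replica is undercounted in the merged result. Asking each replica for more than N categories is one common way to reduce that error.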

On Jan 22, 2013, at 3:08 AM, Nicola Buso <nb...@ebi.ac.uk> wrote:

> Hi Shai,
>
> I was thinking to that too, but I'm indexing all indexes in a custom
> distributed environment than I can't in this moment have a single
> categories index for all the content indexes at indexing time.
> A solution should be to merge all the categories indexes in one only
> index and use your solution but the merge code I see in the examples
> merge also the content index and I can't do that.
>
> I should share the taxonomy if is possible to merge (I see the resulting
> categories indexes are not that big currently), but I would prefer to
> have a solution where I can collect the facets over multiple categories
> indexes in this way I will be sure the solution will scale better.
>
>
> Nicola.
>
>
> On Mon, 2013-01-21 at 17:54 +0200, Shai Erera wrote:
>> Hi Nicola,
>>
>> I think that what you're describing corresponds to distributed faceted
>> search. I.e., you have N content indexes, alongside N taxonomy
>> indexes.
>>
>> The information that's indexed in each of those sub-indexes does not
>> correlate with the other ones.
>> For example, say that you index the category "Movie/Drama", it may
>> receive ordinal 12 in index1 and 23 in index2.
>>
>> If you'll try to count ordinals using MultiReader, you'll just mess up
>> everything.
>>
>> If you can share a single taxonomy index for all N content indexes,
>> then you'll be in a super-simple position:
>>
>> 1) Open one TaxonomyReader
>>
>> 2) Execute search with MultiReader and FacetsCollector
>>
>> It doesn't get simpler than that ! :)
>>
>> Before I go into great length describing what you should do if you
>> cannot share the taxonomy, let me know if that's not an option for
>> you.
>>
>> Shai
>>
>>
>> On Mon, Jan 21, 2013 at 5:39 PM, Nicola Buso <nb...@ebi.ac.uk> wrote:
>> Thanks for the reply Uwe,
>>
>> we currently can search with MultiReader over all the indexes we have.
>> Now I want to add the faceting search, than I created a categories
>> index for every index I currently have.
>> To accumulate the faceted results now I have a MultiReader pointing
>> all the indexes and I can create a TaxonomyReader for every categories
>> index I have; all the way I see to obtain FacetResults are:
>> 1 - FacetsCollector
>> 2 - a FacetsAccumulator implementation
>>
>> suppose I use the second option. I should:
>> - search as usual using the MultiReader
>> - than try to collect all the facetresults iterating over my
>>   TaxonomyReaders; at every iteration:
>>   - I create a FacetsAccumulator using the MultiReader and a
>>     TaxonomyReader
>>   - I get a list of FacetResult from the accumulator.
>> - as I finish I should in some way merge all the List<FacetResult> I
>>   have.
>>
>> I think this solution is not correct because the docsids from the
>> search are pointing the multireader instead the taxonomyreader is
>> pointing to the categories index of a single reader.
>> I neither like to merge all the List of FacetResult I retrieve from
>> the Accumulators.
>>
>> Probably I'm missing something, can somebody clarify to me how I
>> should collect the facets in this case?
>>
>>
>> Nicola.
>>
>>
>> On Mon, 2013-01-21 at 16:22 +0100, Uwe Schindler wrote:
>>> Just use MultiReader, it extends IndexReader, so you can pass it
>>> anywhere where IndexReader can be passed.
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>> -----Original Message-----
>>>> From: Nicola Buso [mailto:nb...@ebi.ac.uk]
>>>> Sent: Monday, January 21, 2013 3:59 PM
>>>> To: java-user@lucene.apache.org
>>>> Subject: FacetedSearch and MultiReader
>>>>
>>>> Hi all,
>>>>
>>>> I'm trying to develop faceted search using lucene 4.0 faceting
>>>> framework.
>>>> In our project we are searching on multiple indexes using lucene
>>>> MultiReader. How should we use the faceted framework to obtain
>>>> FacetResults starting from a MultiReader? all the example I see are
>>>> using a "single" IndexReader.
>>>>
>>>>
>>>> Nicola.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

---
Denis Bazhenov <dot...@gmail.com>