Hi,

I'm currently calling:

    FacetsCollector.create(new StandardFacetsAccumulator(facetSearchParams,
        indexReader, getTaxonomyReader()))

which in turn calls FacetRequest.createAggregator(...), and it is not
working properly; I'm extending CountingAggregator (and therefore
Aggregator). If I instead override FacetsAccumulator.getAggregator()
and so on, what is the difference between the two calls? I mean, the
difference between:

    Aggregator.aggregate(int docID, float score, IntsRef ordinals)

and

    FacetsAggregator.aggregate(FacetsCollector.MatchingDocs matchingDocs,
        CategoryListParams clp, FacetArrays facetArrays)

I suppose I can take the code from FastCountingFacetsAggregator,
recalculate each ordinal based on the merged taxonomy, and then count
at the correct position in facetArrays.getIntArray().
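As far as I can tell from the sources, the old Aggregator.aggregate()
is fed one matching document at a time (with setNextReader() called per
segment), while the new FacetsAggregator.aggregate() is called once per
segment with all matching documents bundled in MatchingDocs, so the
aggregator drives the iteration itself.

This is a rough, untested sketch of what I have in mind for 4.2. It is
modeled on the plain CountingFacetsAggregator rather than
FastCountingFacetsAggregator (the fast one, as far as I can see,
assumes the default DGapVInt encoding and decodes the ordinals itself).
The class name is mine; readerOrdinalsMap is the
Map<AtomicReader, DirectoryTaxonomyWriter.OrdinalMap> built while
merging the taxonomies (see my original mail below), and I'm assuming
IntRollupFacetsAggregator is the right base class to inherit the count
rollup from:

    // Untested sketch (Lucene 4.2 APIs): count each document's ordinals
    // at their positions in the *merged* taxonomy, using the per-reader
    // OrdinalMap recorded during the taxonomy merge.
    class OrdinalMappingFacetsAggregator extends IntRollupFacetsAggregator {

      private final IntsRef ordinals = new IntsRef(32);

      @Override
      public void aggregate(FacetsCollector.MatchingDocs matchingDocs,
          CategoryListParams clp, FacetArrays facetArrays) throws IOException {
        DirectoryTaxonomyWriter.OrdinalMap om =
            readerOrdinalsMap.get(matchingDocs.context.reader());
        if (om == null) {
          return; // no ordinal map recorded for this segment's reader
        }
        final int[] map = om.getMap();
        final int[] counts = facetArrays.getIntArray();

        final CategoryListIterator cli = clp.createCategoryListIterator(0);
        if (!cli.setNextReader(matchingDocs.context)) {
          return; // this segment has no category list data
        }

        final int length = matchingDocs.bits.length();
        int doc = 0;
        while (doc < length && (doc = matchingDocs.bits.nextSetBit(doc)) != -1) {
          cli.getOrdinals(doc, ordinals);
          final int upto = ordinals.offset + ordinals.length;
          for (int i = ordinals.offset; i < upto; i++) {
            // count at the merged-taxonomy position, not the local one
            counts[map[ordinals.ints[i]]]++;
          }
          ++doc;
        }
      }
    }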
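If I understand Shai's suggestion below correctly, the wiring would
then look something like this (also untested; mergedTaxoReader is a
DirectoryTaxonomyReader I open on catMergeDir, the merged taxonomy
directory from my original mail):

    // Untested wiring sketch: have the default FacetsAccumulator hand
    // out the remapping aggregator, against the merged taxonomy reader.
    FacetsAccumulator fa = new FacetsAccumulator(facetSearchParams,
        indexReader, mergedTaxoReader) {
      @Override
      public FacetsAggregator getAggregator() {
        return new OrdinalMappingFacetsAggregator();
      }
    };
    FacetsCollector fc = FacetsCollector.create(fa);
    searcher.search(query, fc);
    List<FacetResult> facetResults = fc.getFacetResults();

The important part, I think, is passing the merged taxonomy reader to
the accumulator, so that the FacetArrays are sized for the merged
ordinal space and the labels are resolved against the merged taxonomy.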
Nicola.

On Thu, 2013-04-11 at 13:23 +0300, Shai Erera wrote:
> Hi Nicola,
>
> I didn't read the code examples, but I'll relate to your last question
> regarding the Aggregator. Indeed, with Lucene 4.2,
> FacetRequest.createAggregator is not called by the default
> FacetsAccumulator. This method should go away from FacetRequest
> entirely, but unfortunately we did not finish all the refactoring work
> before 4.2.
>
> What you should do is extend the new FacetsAggregator and override
> FacetsAccumulator.getAggregator(). Can you try that and let us know if
> that resolves your problem?
>
> Shai
>
> On Thu, Apr 11, 2013 at 1:05 PM, Nicola Buso <nb...@ebi.ac.uk> wrote:
>     Hi all,
>
>     in Lucene 4.1, after some advice from the mailing list, I am
>     merging taxonomies (in memory, because the taxonomy indexes are
>     small) and collecting facet values from the merged taxonomy
>     instead of the single ones; the scenario is:
>     - you have a MultiReader pointing to several indexes
>     - you are querying the MultiReader
>     - you want to collect facets for the MultiReader
>
>     What I'm doing:
>
>     -1- taxonomy merging
>
>     long createStart = System.currentTimeMillis();
>     catMergeDir = new RAMDirectory();
>     readerOrdinalsMap =
>         new HashMap<AtomicReader, DirectoryTaxonomyWriter.OrdinalMap>();
>     DirectoryTaxonomyWriter taxoMergeWriter =
>         new DirectoryTaxonomyWriter(catMergeDir);
>     Directory taxoDirectory = null;
>     IndexReader contentReader = null;
>     OrdinalMap[] ordinalMapsArray =
>         new DirectoryTaxonomyWriter.MemoryOrdinalMap[taxoIdxRepoArray.length];
>
>     for (int idx = 0; idx < taxoIdxRepoArray.length; idx++) {
>       taxoDirectory =
>           LuceneDirectoryFactory.getDirectory(taxoIdxRepoArray[idx]);
>       contentReader = idxReaderArray[idx];
>       ordinalMapsArray[idx] = new DirectoryTaxonomyWriter.MemoryOrdinalMap();
>       taxoMergeWriter.addTaxonomy(taxoDirectory, ordinalMapsArray[idx]);
>
>       // remember which ordinal map belongs to each atomic reader
>       for (AtomicReaderContext readerCtx : contentReader.leaves()) {
>         readerOrdinalsMap.put(readerCtx.reader(), ordinalMapsArray[idx]);
>       }
>     }
>     taxoMergeWriter.close();
>     log.info(String.format("Taxonomy merge time elapsed: %s(ms)",
>         System.currentTimeMillis() - createStart));
>
>     ------
>     From the code above I'm holding:
>     - catMergeDir: the directory containing the merged categories
>     - readerOrdinalsMap: a map holding the OrdinalMap for every reader
>       in the MultiReader
>
>     -2- aggregator based on the ordinal maps constructed in -1-
>
>     class OrdinalMappingCountingAggregator extends CountingAggregator {
>       private int[] ordinalMap;
>
>       public OrdinalMappingCountingAggregator(int[] counterArray) {
>         super(counterArray);
>       }
>
>       @Override
>       public void aggregate(int docID, float score, IntsRef ordinals)
>           throws IOException {
>         int upto = ordinals.offset + ordinals.length;
>         for (int i = ordinals.offset; i < upto; i++) {
>           // original ordinal read for the AtomicReader given to
>           // setNextReader
>           int ordinal = ordinals.ints[i];
>           // mapped ordinal, following the taxonomy merge
>           int mappedOrdinal = ordinalMap[ordinal];
>           // count the mapped ordinal instead, so all AtomicReaders
>           // count that ordinal
>           counterArray[mappedOrdinal]++;
>         }
>       }
>
>       @Override
>       public boolean setNextReader(AtomicReaderContext ctx)
>           throws IOException {
>         if (readerOrdinalsMap.get(ctx.reader()) == null) {
>           return false;
>         }
>         ordinalMap = readerOrdinalsMap.get(ctx.reader()).getMap();
>         return true;
>       }
>     }
>     -3- override CountFacetRequest.createAggregator(..) to return -2-
>
>     return new CountFacetRequest(cp, maxCount) {
>
>       @Override
>       public Aggregator createAggregator(boolean useComplements,
>           FacetArrays arrays, TaxonomyReader taxonomy) {
>         int[] a = arrays.getIntArray();
>         return new OrdinalMappingCountingAggregator(a);
>       }
>     };
>
>     --------
>     In 4.2 this no longer works, and I'm not collecting facet values
>     from the merged taxonomy.
>
>     The first problem I realized: the new API
>     FacetsCollector.create(FacetSearchParams fsp, IndexReader
>     indexReader, TaxonomyReader taxoReader) gives back collectors and
>     accumulators that never call FacetRequest.createAggregator(). You
>     have to use the API FacetsCollector.create(FacetsAccumulator
>     accumulator), passing it a StandardFacetsAccumulator (the only one
>     that will call FacetRequest.createAggregator(..)).
>
>     Second: even using the StandardFacetsAccumulator it does not work,
>     because the facet counting is wrong. Any advice on why this is
>     happening?
>
>     I'm also going to check how to use this idea to mimic the
>     behaviour of the FastCountingFacetsAggregator, which I think
>     should be the right way.
>
>     I hope I gave enough information; any help in better understanding
>     how facets changed in 4.2 will be appreciated.
>
>     Nicola.