Hi,

The taxonomy faceting approach maintains a sidecar index that stores the taxonomy and assigns an integer (ordinal) to each category. The ordinals of a document's categories are encoded in a BinaryDocValues field for that document. This approach supports hierarchical faceting, as well as attaching additional metadata to each facet occurrence (called associations). At search time, faceting is done by aggregating the category ordinals found in each document. Since those ordinals are global to the index, merging counts and finding the top-K facets across segments is relatively cheap.
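To make the ordinal scheme concrete, here is a minimal sketch in plain Python. It is not Lucene's API: the `Taxonomy` class, `count_facets` and `top_k` are hypothetical names made up for illustration, and each document is reduced to the list of category ordinals stored for it. It only shows the idea that ordinals are handed out in insertion order (parents first) and that, because they are global, counts from all segments can go into one array.

```python
class Taxonomy:
    """Toy stand-in for the sidecar index: category path -> global ordinal."""

    def __init__(self):
        # Assumption for this sketch: the root (empty path) takes ordinal 0.
        self.ordinals = {(): 0}

    def add(self, *path):
        # Register every missing ancestor before the category itself,
        # so a parent always gets a smaller ordinal than its children.
        for depth in range(1, len(path) + 1):
            prefix = path[:depth]
            if prefix not in self.ordinals:
                self.ordinals[prefix] = len(self.ordinals)
        return self.ordinals[path]


def count_facets(segments, num_ordinals):
    """Aggregate ordinals from all segments into a single counts array."""
    counts = [0] * num_ordinals
    for segment in segments:
        for doc_ordinals in segment:
            for ordinal in doc_ordinals:
                counts[ordinal] += 1
    return counts


def top_k(taxonomy, counts, k):
    """Return the k most frequent categories as (path, count) pairs."""
    path_of = {ordinal: path for path, ordinal in taxonomy.ordinals.items()}
    ranked = sorted(range(len(counts)), key=lambda o: (-counts[o], o))
    return [("/".join(path_of[o]), counts[o]) for o in ranked[:k] if counts[o] > 0]


taxo = Taxonomy()
a_c = taxo.add("A", "C")  # "A" is added first, then "A/C"
a_b = taxo.add("A", "B")  # added later, so it gets a larger ordinal than "A/C"

# Two segments; every document lists the ordinals stored for it.
segments = [
    [[a_c], [a_b]],
    [[a_c], [a_c, a_b]],
]
counts = count_facets(segments, len(taxo.ordinals))
print(top_k(taxo, counts, 2))
```

Note that no per-segment ordinal mapping is needed before the counts are combined, which is why the cross-segment merge is cheap in this scheme.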
The SortedSet faceting approach does not need a sidecar index and relies on SortedSet doc values fields. Here too, each term/category is assigned an ordinal, and at search time the facets are aggregated using those ordinals. However, the ordinals of the same category are not the same across segments, so finding the top-K facets is a bit more expensive (roughly 20% slower, if I remember correctly).

Another difference is that the SortedSet approach assigns ordinals in sorted order, so e.g. the category A/B will always receive a smaller ordinal than A/C. In the taxonomy approach, though, whichever facet got added first receives the lowest ordinal, except that the parent of the categories at a given level of the hierarchy always receives a smaller ordinal than all of its children.

Working with SortedSet facets is indeed simpler than working with the taxonomy, but the taxonomy does not seriously complicate things. If you need a facet hierarchy, you should use the taxonomy approach. Otherwise, I would just try each and see which one works better for your use case.

As for optimizing an index, the taxonomy facets do not make any difference in that case.

Shai

On Mon, Sep 22, 2014 at 8:48 PM, Yonghui Zhao <zhaoyong...@gmail.com> wrote:

> If we want to implement simple facet counting feature, it seems we can do
> it via sortedset or taxonomy writer/reader.
>
> Seems sortedset is simpler but doesn't support hierarchical facet count
> such as A/B/C.
>
> I want to know what's advantage/disadvantage of sortedset or taxonomy?
>
> Is there any trouble with taxonomy when index is optimized(merged)?
>