Hi Greg,
Yes, in the future we are considering implementing our own shard management
with whole-index split and merge operations, so for now we just wanted to
make sure that Taxonomy won't make it too complicated.
In fact, I recently found TaxonomyMergeUtils, which does just that
- merging the main a
Interesting Alex. So for your "merge" case, are you suggesting you
would have a different taxonomy index for each segment and would need
to merge those? I could be completely mistaken (I'm not nearly as
familiar with the indexing side of things), but I thought Lucene
maintains one single taxonomy i
Hi Greg, Matt,
Thank you for the responses; they are very helpful, and it's great to hear
that Taxonomy is successfully used for large-scale products!
Our biggest concern with it right now is future complications related to
index split and merge, which we are most likely going to use to implement
sharding a
Hi Alex-
Amazon's product search engine is built on top of Lucene, and it is a
fairly large-scale application (w.r.t. index size, traffic and
use-case complexity). We have found taxonomy-based faceting to work
well for us generally, and haven't needed to do much to optimize
beyond what's alrea
Alex,
We did consider trying to optimize Taxonomy indexing performance but we
never really got around to it. The sidecar index is annoying to deal with
and we have had occasional issues with it. Zulia has sharding implemented.
The main issue here is not the taxonomy but rather just getting exact
Hi Matt,
That's very interesting, thanks for the response! Did you have any issues
with Taxonomy indexing performance, or did you try to optimize it somehow?
Also, any problems maintaining a sidecar index or experience building a
distributed system around it with sharding/rebalancing?
--
Regards,
Al
Alex,
With our Lucene-based implementation of Zulia (
https://github.com/zuliaio/zuliasearch) we have gone back and forth. We
started with Taxonomy, switched away, and then switched back to Taxonomy. In
our experience the Taxonomy based approach is more scalable and
performant. We do large search
Hello everyone,
We are trying to choose between the Taxonomy and SortedSetDocValuesFacetField
implementations for faceted search, and based on available information and
our quick tests, the differences are the following -
- Taxonomy is faster at query time (on our test workload, the difference
sometime