Re: Taxonomy vs SSDVFF for faceted search

2021-04-30 Thread Alexander Lukyanchikov
Hi Greg, Yes, in the future we are considering implementing our own shard management with whole-index split and merge operations, so for now we just wanted to make sure that Taxonomy won't make it too complicated. In fact, I recently found TaxonomyMergeUtils, which does just that - merging the main a

Re: Taxonomy vs SSDVFF for faceted search

2021-04-30 Thread Greg Miller
Interesting Alex. So for your "merge" case, are you suggesting you would have a different taxonomy index for each segment and would need to merge those? I could be completely mistaken (I'm not nearly as familiar with the indexing side of things), but I thought Lucene maintains one single taxonomy i

Re: Taxonomy vs SSDVFF for faceted search

2021-04-29 Thread Alexander Lukyanchikov
Hi Greg, Matt, Thank you for the responses, they are very helpful, and it's great to hear that Taxonomy is used successfully in large-scale products! Our biggest concern with it right now is future complications related to index split and merge, which we are most likely going to use to implement sharding a

Re: Taxonomy vs SSDVFF for faceted search

2021-04-29 Thread Greg Miller
Hi Alex- Amazon's product search engine, a fairly large-scale application (w.r.t. index size, traffic, and use-case complexity), is built on top of Lucene. We have found taxonomy-based faceting to work well for us generally, and haven't needed to do much to optimize beyond what's alrea

Re: Taxonomy vs SSDVFF for faceted search

2021-04-29 Thread Matt Davis
Alex, We did consider trying to optimize Taxonomy indexing performance but we never really got around to it. The sidecar index is annoying to deal with and we have had occasional issues with it. Zulia has sharding implemented. The main issue here is not the taxonomy but rather just getting exact

Re: Taxonomy vs SSDVFF for faceted search

2021-04-28 Thread Alexander Lukyanchikov
Hi Matt, That's very interesting, thanks for the response! Did you have any issues with Taxonomy indexing performance, or did you try to optimize it somehow? Also, did you have any problems maintaining a sidecar index, or any experience building a distributed system around it with sharding/rebalancing? -- Regards, Al

Re: Taxonomy vs SSDVFF for faceted search

2021-04-28 Thread Matt Davis
Alex, With our Lucene-based implementation of Zulia (https://github.com/zuliaio/zuliasearch) we have gone back and forth. We started with Taxonomy, switched away, and then switched back to Taxonomy. In our experience the Taxonomy-based approach is more scalable and performant. We do large search

Taxonomy vs SSDVFF for faceted search

2021-04-28 Thread Alexander Lukyanchikov
Hello everyone, We are trying to choose between the Taxonomy and SortedSetDocValuesFacetField implementations for faceted search, and based on available information and our quick tests, the differences are the following: - Taxonomy is faster at query time (on our test workload, the difference sometime