Hi Solr Group,

I am not sure whether the following is a viable use case, so I would welcome input and any implementation recommendations.
I would like to perform joins across two sharded collections. Documents are routed to specific shards based on a date range, and the date ranges are the same for the corresponding shards in each collection. I understand this means that the replicas from each collection that hold the data to be joined need to be co-located on the same Solr server.

I have read about solutions that use ADDREPLICA to add a Collection B replica to every Solr server, assuming Collection B has only one shard. For my use case, Collection B needs to have multiple shards:

Collection A    Collection B    Solr server
Shard1_2020     Shard1_2020     172.33.0.1:8983_solr
Shard2_2021     Shard2_2021     172.33.0.2:8983_solr
Shard3_2022     Shard3_2022     172.33.0.3:8983_solr

I think my question comes down to this: how do I split shards by date range, and do it in a way that Collections A and B are both defined by the same date ranges? If I could reliably split shards by date and know the date range of each shard, I think I could use the ADDREPLICA API to align them.

I am not sure a compositeId routing approach would work, but I suspect implicit routing may be hard to manage over time.

Is an approach like this viable? I am a bit concerned about the maintenance burden. Are there other ideas for supporting this join?

Note: I am considering this in the context of time series collections...

Matt
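
P.S. To make the layout above concrete, here is a minimal sketch of what I have in mind. It only builds the Collections API request URLs and does not call Solr. It assumes the implicit router with one named shard per calendar year, and it creates each collection with no replicas (createNodeSet=EMPTY) so that ADDREPLICA can pin every shard to a chosen node. The field name shard_s, the collection names, the base URL, and all helper function names are placeholders of mine, not anything taken from an existing setup.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed layout from the table above: one shard per year, one node per shard.
SHARD_NODES = {
    "Shard1_2020": "172.33.0.1:8983_solr",
    "Shard2_2021": "172.33.0.2:8983_solr",
    "Shard3_2022": "172.33.0.3:8983_solr",
}


def shard_for(d: date) -> str:
    """Return the shard name whose year suffix matches the document date."""
    for name in SHARD_NODES:
        if name.endswith(f"_{d.year}"):
            return name
    raise ValueError(f"no shard defined for year {d.year}")


def create_collection_params(collection: str) -> dict:
    """Parameters for a CREATE call that defines named shards but no replicas."""
    return {
        "action": "CREATE",
        "name": collection,
        "router.name": "implicit",
        "router.field": "shard_s",  # hypothetical field holding the shard name
        "shards": ",".join(SHARD_NODES),
        "createNodeSet": "EMPTY",  # replicas are added explicitly afterwards
    }


def add_replica_params(collection: str, shard: str) -> dict:
    """Parameters for an ADDREPLICA call that pins a shard to its node."""
    return {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": SHARD_NODES[shard],
    }


def admin_url(base: str, params: dict) -> str:
    """Build a Collections API URL from a base Solr URL and parameters."""
    return f"{base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://172.33.0.1:8983/solr"  # placeholder base URL
    for collection in ("collectionA", "collectionB"):
        print(admin_url(base, create_collection_params(collection)))
        for shard in SHARD_NODES:
            print(admin_url(base, add_replica_params(collection, shard)))
```

At index time, each document would carry shard_for(doc_date) in the shard_s field, so the same date always lands on the same-named shard in both collections. Whether the join itself then works across multiple co-located shards is exactly the part I am unsure about.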