[ https://issues.apache.org/jira/browse/SOLR-16717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704693#comment-17704693 ]
Mikhail Khludnev edited comment on SOLR-16717 at 3/24/23 7:30 PM:
------------------------------------------------------------------

Hi,

After all, I have something [https://github.com/apache/solr/compare/main...mkhludnev:solr:true-cluster-join?expand=1] that, honestly, I don't recommend looking at. It's too rough. Here are a few topics to discuss:
# Removed the single-shard constraint on the from side in the query parser.
# Instead, added many conditions checking that the router keys correspond to fromField and toField.
# I think it's ok to stick to the composite router, wdyt?
# What's not done is falling back to uniqueKey if routerKey isn't set explicitly. Should be ok.
# The code picking the collocated corresponding shard for the local join operation.
# I think it's worth bothering about replica placement to ensure that item 5 can always happen.
# I chose AffinityPlacementPlugin for experimentation. Is it a good decision? Or is it enough to start from a simpler one?
# It turns out only a slight change is necessary in {{computePlacement()}}, but there's no elegant way to amend the current {{withCollection}} flow, so I ended up copy-pasting and hacking. If we agree on this way of configuring, I need to refactor {{computePlacement()}} into a separate class, and then customize it for the "shard-2-shard" case.
# I think it might be configured via a new property {{AffinityPlacementPlugin.withCollectionShards}}, which would be a slight modification of the existing {{withCollection}}. WDYT?
# Another problem, and a surprise for me, is how to get a proper shard placement at all: my first experiments with the default placement plugin gave the expected (green) results only because of how it naturally places the same number of shards: shard1 goes to the first node, then shard2, etc. Should I bother about a replica placement plugin at all? How can I break the test by providing no corresponding shard on a node? I am thinking about a rolling drop and restart of the test cluster nodes, but I'm not sure it makes sense. WDYT?
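To make the "picking the collocated corresponding shard" topic above concrete, here is a minimal sketch of the idea, not the code from the linked branch. It assumes both collections use hash-range routing on the join fields, so a from-side shard can serve a local join when it lives on the same node and its hash range covers the to-side shard's range. The names {{CollocatedShardPicker}}, {{HashRange}}, {{Replica}} and {{pickLocalFromShard}} are hypothetical stand-ins and are not Solr APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class CollocatedShardPicker {

    /** Inclusive hash range owned by a shard (hypothetical stand-in, not Solr's class). */
    static final class HashRange {
        final int min;
        final int max;

        HashRange(int min, int max) {
            this.min = min;
            this.max = max;
        }

        /** True if every hash in {@code other} also falls inside this range. */
        boolean covers(HashRange other) {
            return min <= other.min && other.max <= max;
        }
    }

    /** One replica of a from-side shard and the node hosting it (hypothetical). */
    static final class Replica {
        final String shard;
        final HashRange range;
        final String node;

        Replica(String shard, HashRange range, String node) {
            this.shard = shard;
            this.range = range;
            this.node = node;
        }
    }

    /**
     * Returns a from-side replica on {@code localNode} whose range covers the
     * to-side shard's range, or empty if no such replica is collocated.
     */
    static Optional<Replica> pickLocalFromShard(
            HashRange toRange, String localNode, List<Replica> fromReplicas) {
        for (Replica r : fromReplicas) {
            if (r.node.equals(localNode) && r.range.covers(toRange)) {
                return Optional.of(r);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed layout: two from-side shards splitting the 32-bit hash space.
        List<Replica> from = Arrays.asList(
                new Replica("shard1", new HashRange(Integer.MIN_VALUE, -1), "node1"),
                new Replica("shard2", new HashRange(0, Integer.MAX_VALUE), "node2"));
        HashRange toShard1 = new HashRange(Integer.MIN_VALUE, -1);

        // Collocated case: node1 hosts the matching from-side shard.
        System.out.println(
                pickLocalFromShard(toShard1, "node1", from).map(r -> r.shard).orElse("none"));
        // Broken placement: node2 has no from-side shard covering this range.
        System.out.println(
                pickLocalFromShard(toShard1, "node2", from).map(r -> r.shard).orElse("none"));
    }
}
```

The empty result in the second call is the situation the placement question is about: without a placement rule that keeps corresponding shards together, the local join has nothing to join against on that node.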

> Join collocated shards
> ----------------------
>
>                 Key: SOLR-16717
>                 URL: https://issues.apache.org/jira/browse/SOLR-16717
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Mikhail Khludnev
>            Priority: Major
>
> h3. Context
> It's about the \{!join} query parser in distributed mode.
> h3. As is
> SOLR-4905 allows joining from a single-shard collection to a multi-shard
> collection.
> h3. Challenge
> * Support multiple shards on the from side as well,
> * but strictly stick to collocated indices, which promise much better performance.