Sorry, I'll add more context. The main collection is sharded across more
than ten shards, and each shard has 2 replicas. The from collection
(fromData) has a single shard, with one replica on each of the Solr nodes.
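The layout described above can be checked against the output of the Collections API CLUSTERSTATUS action. The following is only a minimal sketch: the collection name `mainData`, the node names, and the sample status are made up for illustration, and the helper names are hypothetical. It lists nodes that hold a replica of the main collection but no replica of fromData.

```python
import json
import urllib.request


def fetch_cluster_status(base_url):
    """Fetch CLUSTERSTATUS; base_url is assumed to look like http://host:8983/solr."""
    url = f"{base_url}/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read())


def nodes_hosting(cluster_status, collection):
    """Return the set of node names holding at least one replica of `collection`."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    return {
        replica["node_name"]
        for shard in shards.values()
        for replica in shard["replicas"].values()
    }


def nodes_missing_from_index(cluster_status, to_collection, from_collection):
    """Nodes that host the main collection but have no replica of the from collection."""
    return nodes_hosting(cluster_status, to_collection) - nodes_hosting(
        cluster_status, from_collection
    )


# Made-up sample in the shape of a CLUSTERSTATUS response (not real cluster data).
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "mainData": {"shards": {
                "shard1": {"replicas": {
                    "core_node1": {"node_name": "node1:8983_solr"},
                    "core_node2": {"node_name": "node2:8983_solr"}}},
                "shard2": {"replicas": {
                    "core_node3": {"node_name": "node2:8983_solr"},
                    "core_node4": {"node_name": "node3:8983_solr"}}}}},
            "fromData": {"shards": {
                "shard1": {"replicas": {
                    "core_node5": {"node_name": "node1:8983_solr"},
                    "core_node6": {"node_name": "node2:8983_solr"}}}}},
        }
    }
}

missing = nodes_missing_from_index(SAMPLE_STATUS, "mainData", "fromData")
print(sorted(missing))
```

An empty result would mean every node serving the main collection also has a local fromData replica. It says nothing about nodes that host no data at all.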
The query I send is a JSON query that looks like this:
{
  "filter":[{"join":{
    "query":{"lucene":{
      "query":"\"test\"",
      "df":"value_s"}},
    "from":"id",
    "to":"to_s",
    "fromIndex":"fromData"}}
  ],
  "offset":0,
  "query":"*:*",
  "limit":1,
  "params":{
    "TZ":"GMT+01:00",
    "timeAllowed":1800000},
  "fields":["id"]
}
It works perfectly fine when sent directly to any Solr data node, but it
fails when it is sent through the coordinator node. Every other query
without a join works fine, or at least I haven't found any other problems.
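To compare the two paths side by side, the same body can be posted to a data node and to the coordinator. This is a minimal sketch under assumptions: the collection name `mainData` and both host URLs are hypothetical placeholders, and the helper names are invented for illustration. It uses the `/query` endpoint of the JSON Request API.

```python
import json
import urllib.error
import urllib.request


def build_join_query(term, df="value_s", from_field="id",
                     to_field="to_s", from_index="fromData"):
    """Build the JSON request body shown above for a given search term."""
    return {
        "filter": [{"join": {
            "query": {"lucene": {"query": f'"{term}"', "df": df}},
            "from": from_field,
            "to": to_field,
            "fromIndex": from_index}}],
        "offset": 0,
        "query": "*:*",
        "limit": 1,
        "params": {"TZ": "GMT+01:00", "timeAllowed": 1800000},
        "fields": ["id"],
    }


def post_query(base_url, collection, body, timeout=30):
    """POST the body to <base_url>/<collection>/query; return (status, response text)."""
    req = urllib.request.Request(
        f"{base_url}/{collection}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Error responses (such as the 400 quoted below) arrive here.
        return err.code, err.read().decode("utf-8", "replace")


if __name__ == "__main__":
    body = build_join_query("test")
    # Hypothetical hosts; replace with a real data node and the coordinator.
    for base_url in ("http://data-node:8983/solr", "http://coordinator:8983/solr"):
        try:
            print(base_url, post_query(base_url, "mainData", body))
        except OSError as exc:
            print(base_url, "unreachable:", exc)
```

Collecting the request logs from both nodes for the same request would show where the join is being resolved.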
Thanks
On Tue, 3 Mar 2026 at 17:38, Mikhail Khludnev <[email protected]> wrote:
> Hello,
> I'm not sure. I assume you are using
>
> https://solr.apache.org/guide/solr/latest/query-guide/join-query-parser.html#joining-multiple-shard-collections
> Please confirm.
> There's no exact coordinator test for shard joins here
>
> https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/search/join/ShardToShardJoinAbstract.java#L58
> But it creates 5 nodes for 3 shard collections and, I believe, picks a
> coordinator randomly. So we may expect it to work.
> Then, the error you reported might occur at a "to" node when it doesn't
> find the expected co-located shard.
> I'm afraid we need to check shard alignment across the cluster, and the
> detailed request log across nodes: what exactly happened at the coordinator
> and subordinate nodes.
> Regarding shard allocation: even if there's a node with shard1 of the "to"
> collection collocated with "from" shard1, nothing will stop the coordinator
> from attempting to search "to" shard1 at another node where "from" shard1
> is absent, and getting an error like this.
>
> On Tue, Mar 3, 2026 at 6:02 PM Endika Posadas <[email protected]>
> wrote:
>
> > Hi,
> >
> > We're running dedicated coordinator nodes for query performance, with
> > collections that are properly co-located across data nodes.
> >
> >
> > When sending a join query (fromIndex pointing to a co-located collection)
> > through the coordinator, we get an error:
> >
> > "error":{
> >   "metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],
> >   "msg":"SolrCloud join: To join with a collection that might not be co-located, use method=crossCollection.",
> >   "code":400
> > }
> >
> >
> > The same query works fine when sent directly to a data node.
> >
> > It seems like the coordinator is trying to resolve the join instead of
> > delegating it to the data nodes. Is there a workaround for this?
> >
> > Thanks
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>