: > that seems... dangerous.  you could easily wind up in a situation where 
: > nodes just keep trying to forward forever?
: 
: There is some special http parameter being added when forwarding 
: requests, so I'm sure each node will be able to decide whether it should 
: act as LB or if it is supposed to be the final destination. Or we can 
: add such a param. Of course, if SolrJ on the client side has already 
: selected a replica, the receiving node should not discard that and do 
: its own balancing. So there is some state to get right here.
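
A minimal sketch of the kind of loop guard described above. It assumes a marker parameter is added on the first forward, so the receiving node knows it is the final destination. The name `_forwarded` and the `route` helper are hypothetical, used only for illustration; they are not actual Solr parameters or APIs.

```python
FORWARDED_PARAM = "_forwarded"  # hypothetical name, not a real Solr parameter

def route(params, local_node, replica_nodes, choose):
    """Return (target_node, params_to_send) for one incoming request."""
    # Already forwarded once: this node must act as the final destination.
    if params.get(FORWARDED_PARAM) == "true":
        return local_node, params
    target = choose(replica_nodes)
    if target == local_node:
        return local_node, params
    # Mark the request so the next node does not balance it again.
    forwarded = dict(params)
    forwarded[FORWARDED_PARAM] = "true"
    return target, forwarded

# nodeA receives the request and (here, deterministically) picks nodeB.
hop1 = route({"q": "*:*"}, "nodeA", ["nodeA", "nodeB"], lambda r: r[-1])
# nodeB sees the marker and serves locally, even if its picker says nodeA.
hop2 = route(hop1[1], "nodeB", ["nodeA", "nodeB"], lambda r: r[0])
```

With a guard like this a request is forwarded at most once, but it does not answer the question of when forwarding is worth the extra hop.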

"Forever" wasn't really what i meant to say ... I'm concerned more about how 
you would implement this to work well in the 'general case' -- ie: 
multiple nodes, multiple collections, multiple shards, multiple replicas 
per shard -- w/o doing "too much" forwarding.


If nodeA gets a request, when exactly should it decide "i *COULD* handle 
this request for collection1 using local core, but I'll go ahead and 
forward it to nodeB instead." ? ... should it be based on what percentage 
of collection1's total replica list are located on nodeA, or based on what 
percentage of nodeA is dedicated to collection1? ... should nodeB be more 
or less likely than nodeC to get the request based on how many total cores 
each node has for collection1, or how many unique shards each one has?


Also bear in mind that even if you assumed everything was nice and evenly 
distributed, a "simple" round robin based approach would have some pretty 
significant impacts on the number of inter-node network requests....  

Say you have a 5 node cluster, hosting a 1shard/5replica collection such 
that each node has 1 replica:  today any node can process the request 
locally; but if we did a round robin proxy of the request, that means we'd 
only handle it locally 1/5th the time, and 4/5ths of the time you add an 
extra network hop and the associated network IO involved (plus the 
original node has a thread tied up waiting to proxy the response) .. so 
you'd go from needing 0 "internal" network requests/IO to having internal 
traffic of 80% of the amount of external traffic received.

If those 5 nodes host a collection with 2 shards/5replicas each, spread 
evenly over the 5 nodes: today any given request typically causes 2 
intra-cluster network requests to get the per-shard data; but if we round 
robin proxy the initial request to a different node 4/5ths of the time we 
now typically need 2.8 internal requests for each external request...
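
The arithmetic in the two examples above can be checked with a small sketch. It assumes, as the examples do, that the round robin picks uniformly among all nodes (so the receiving node keeps the request 1/N of the time). The function name is made up for illustration.

```python
from fractions import Fraction

def internal_requests_per_external(num_nodes, baseline_internal):
    """Expected internal requests per external request when the receiving
    node round-robin proxies uniformly over all nodes (itself included)."""
    # The request stays local 1/num_nodes of the time; otherwise it costs
    # one extra hop to reach the node that will actually handle it.
    proxy_hops = Fraction(num_nodes - 1, num_nodes)
    return baseline_internal + proxy_hops

# 5 nodes, 1 shard: no internal requests today.
one_shard = internal_requests_per_external(5, 0)
# 5 nodes, 2 shards: typically 2 per-shard requests today.
two_shards = internal_requests_per_external(5, 2)
print(float(one_shard), float(two_shards))  # prints "0.8 2.8"
```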


It just seems like adding more forwarding/proxy logic -- that isn't 
strictly necessary to compute complete results -- could introduce a lot 
of complexity risk for a problem that already has multiple solutions:

1) client (or external load balancer) can round robin over live nodes (and 
given that cluster state and metrics are available via HTTP, a client can 
make very sophisticated choices)

2) a single "extra" solr node in the cluster can be used as a "self 
configuring" load balancer that will automatically know when new nodes are 
added to the cluster, or when replicas get moved/added, etc...
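
As a rough illustration of option 1, here is a sketch of a client that round robins over the live nodes. It assumes the client has already fetched the JSON from the Collections API CLUSTERSTATUS action and reads the `live_nodes` list from it. The helper names, the sample data, and the assumption that node names look like `host:port_context` served over plain HTTP are all illustrative.

```python
import itertools

def node_urls(cluster_status):
    """Turn live node names like 'host1:8983_solr' into base URLs."""
    live = cluster_status["cluster"]["live_nodes"]
    # Assumption: node names are 'host:port_context' and plain HTTP is used.
    return ["http://" + name.replace("_", "/", 1) for name in sorted(live)]

def round_robin(urls):
    """Endless iterator that cycles over the given base URLs."""
    return itertools.cycle(urls)

# Hand-written sample shaped like a CLUSTERSTATUS response (not real output).
sample = {"cluster": {"live_nodes": ["nodeB:8983_solr", "nodeA:8983_solr"]}}
picker = round_robin(node_urls(sample))
first_three = [next(picker) for _ in range(3)]
```

A real client would refresh the live node list periodically, and could weigh nodes using the metrics data instead of cycling blindly.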






: 
: Jan
: 
: > On 10 Mar 2021 at 19:32, Chris Hostetter <hossman_luc...@fucit.org> wrote:
: > 
: > 
: > : Is there any way whatsoever to solve this on the Solr side only?
: > : 
: > : Only I can think of is to send all requests to a 3rd node in the cluster 
: > : that does not have a core for the collection, then it will balance 
: > : between the two :)
: > 
: > correct -- you can create a Solr node w/o any cores that will act as a 
: > "load balancer" to other solr nodes.
: > 
: > : Or create a new, empty collection on the node, which acts as a routing 
: > : collection only to the target collection?
: > 
: > no -- this won't work, because the request your remote client sends will 
: > need to specify the actual collection you want to query, and when the node 
: > gets this it will hand it to the local core for that collection -- it 
: > won't care that there is another local collection that's unrelated.
: > 
: > : Sounds like there should be a way to explicitly disable the 
: > : "optimization" of always handling the request locally in single-shard 
: > : collections, i.e. always try to balance unless shards.preference=local?
: > 
: > that seems... dangerous.  you could easily wind up in a situation where 
: > nodes just keep trying to forward forever?
: > 
: > 
: > 
: > : 
: > : Jan
: > : 
: > : > On 10 Mar 2021 at 19:06, Chris Hostetter <hossman_luc...@fucit.org> wrote:
: > : > 
: > : > 
: > : > : Ah, I missed "single shard" ... this looks relevant:
: > : > : https://issues.apache.org/jira/browse/SOLR-12217
: > : > 
: > : > That improvement still isn't going to impact Jan's situation where the 
: > : > *client* isn't SolrJ ... as the description says:
: > : > 
: > : >>> NOTE: This Jira doesn't cover the single-sharded collections cases when
: > : >>> not using the CloudSolrClient or Streaming Expressions (i.e. if you do
: > : >>> a non-streaming curl request to a random node in the cluster, the 
: > : >>> shards.preference parameter is not considered in the case of single 
: > : >>> shards collections).
: > : > 
: > : > 
: > : > : 
: > : > : On Wed, Mar 10, 2021 at 12:43 PM Jan Høydahl <jan....@cominvent.com> wrote:
: > : > : 
: > : > : > We have not set any shards.preference, and I also think preferLocal
: > : > : > defaults to false, i.e. random
: > : > : >
: > : > : > Earlier we had 2 shards for the same collection (both existed on both
: > : > : > nodes) and then requests were distributed to both nodes. That’s why, when
: > : > : > we went to 1 shard, I was wondering if the “single-shard” code path perhaps
: > : > : > never attempts to utilize replicas?? But have not looked in code yet.
: > : > : >
: > : > : > Guess next step is to setup a small local test cluster and see what
: > : > : > happens.
: > : > : >
: > : > : > Jan Høydahl
: > : > : >
: > : > : > > On 10 Mar 2021 at 15:46, Michael Gibney <mich...@michaelgibney.net> wrote:
: > : > : > >
: > : > : > > You say not "anything fancy" -- depending on how you define "fancy", if
: > : > : > > you have an explicit `shards.preference` param, based on the version you're
: > : > : > > running (8.4) you might also take a look at
: > : > : > > https://issues.apache.org/jira/browse/SOLR-14471. (If SOLR-14471 is the
: > : > : > > problem, removing the explicit `shards.preference` param should restore
: > : > : > > default "shuffling" routing).
: > : > : > >
: > : > : > > I haven't dug too deep, but it looks like for 8.4 preferLocalShards
: > : > : > > actually defaults to false? I might be missing something though:
: > : > : > >
: > : > : > > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/routing/RequestReplicaListTransformerGenerator.java#L85
: > : > : > >
: > : > : > >
: > : > : > >
: > : > : > >> On Wed, Mar 10, 2021 at 9:10 AM Houston Putman <houstonput...@gmail.com>
: > : > : > >> wrote:
: > : > : > >>
: > : > : > >> I could be wrong, but I don't think preferLocalShards is the default in
: > : > : > >> multi-shard use cases.
: > : > : > >>
: > : > : > >>> On Wed, Mar 10, 2021 at 9:07 AM Mike Drob <md...@mdrob.com> wrote:
: > : > : > >>>
: > : > : > >>> I believe a server will always try to prefer local cores. Can you do an
: > : > : > >>> experiment with 3 nodes, and send http queries to the node not hosting
: > : > : > >>> any replicas? That should confirm the balanced distribution.
: > : > : > >>>
: > : > : > >>> If you have multiple shards, the receiving server will forward the
: > : > : > >>> requests for shards it doesn’t have, but would still prefer local shards
: > : > : > >>> when they are available.
: > : > : > >>>
: > : > : > >>> On Wed, Mar 10, 2021 at 8:00 AM Jan Høydahl <jan....@cominvent.com> wrote:
: > : > : > >>>
: > : > : > >>>> Hi,
: > : > : > >>>>
: > : > : > >>>> A client has a SolrCloud 8.4 setup with two nodes, and one collection
: > : > : > >>>> with one shard and replicationFactor=2.
: > : > : > >>>> Of course we want search traffic to be evenly distributed between the
: > : > : > >>>> two replicas.
: > : > : > >>>> The client is using plain HTTP requests, no SolrJ or anything fancy,
: > : > : > >>>> and sends all requests to one of the two nodes.
: > : > : > >>>> I was expecting Solr to forward about 50% of those requests to the
: > : > : > >>>> other replica, but it is serving them all locally.
: > : > : > >>>>
: > : > : > >>>> I know we can set up an LB in front or re-program the client to do
: > : > : > >>>> round robin, but that is not my question.
: > : > : > >>>> Is the select-random-replica logic only active when we have a sharded
: > : > : > >>>> collection, and not for a single-shard?
: > : > : > >>>>
: > : > : > >>>> Jan
: > : > : > >>>
: > : > : > >>
: > : > : >
: > : > : 
: > : > 
: > : > -Hoss
: > : > http://www.lucidworks.com/
: > : 
: > : 
: > 
: > -Hoss
: > http://www.lucidworks.com/
: 

-Hoss
http://www.lucidworks.com/
