Re: Querying locally before sending a distributed request

Steve Davids Wed, 10 Dec 2014 07:17:34 -0800

bq. In a one-shard case, no query really needs to be forwarded, since any
replica can fully get the results so in this case no query would be
forwarded.


You can pass the request parameter distrib=false to keep the request from
being distributed in that case; it will then gather results only from the
host that received it.
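
For illustration, a minimal sketch of building such a request URL in
Python. The host, port, collection name, and the /select handler path are
assumptions for the example, not values from this thread; only the
distrib=false parameter is the point here.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, distributed=True):
    """Build a Solr select URL, optionally adding distrib=false.

    base_url and collection are hypothetical placeholders supplied by
    the caller; adjust them to the actual deployment.
    """
    params = {"q": query, "wt": "json"}
    if not distributed:
        # Ask the receiving core to answer from its own index only.
        params["distrib"] = "false"
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local node and collection name, for demonstration only.
    print(build_select_url("http://localhost:8983/solr", "collection1",
                           "*:*", distributed=False))
```

With distributed left at its default the parameter is omitted, so the
node decides for itself whether to fan the request out.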

As for the SolrCloud case with more than one shard, your overall search
request time is bounded by the slowest shard's response time. So you would
potentially save one hop, but you are still making n-1 other hops to
gather the other shards' results, which makes it a moot point: you will be
waiting on the other shards to respond before you can return the
aggregated result list. You would then also be on the hook to set up load
balancing across replicas of the one host you have chosen to query, as
Erick said, which could have some gotchas for people not expecting that
behavior.
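
The latency argument above can be sketched with a toy model. This is an
assumption-laden illustration, not a measurement: the shard timings and
the hop cost are made-up numbers, and the model ignores merge time and
everything else except "the aggregator waits for the slowest shard".

```python
def distributed_latency_ms(shard_latencies_ms, local_shard=None,
                           hop_cost_ms=0.0):
    """Return the time until all shard responses have arrived.

    shard_latencies_ms: per-shard response times, network hop included.
    local_shard: index of a shard served locally (hypothetical
        optimisation), or None if every shard is queried over HTTP.
    hop_cost_ms: assumed network cost saved for the local shard.
    """
    effective = list(shard_latencies_ms)
    if local_shard is not None:
        # Serving locally removes only that one shard's hop cost.
        effective[local_shard] = max(0.0,
                                     effective[local_shard] - hop_cost_ms)
    # The aggregated result cannot be returned before the slowest shard.
    return max(effective)


if __name__ == "__main__":
    timings = [40, 55, 70]  # made-up per-shard times in milliseconds
    print(distributed_latency_ms(timings))
    print(distributed_latency_ms(timings, local_shard=0, hop_cost_ms=5))
```

In this model the saved hop changes the total only when the locally
served shard happens to be the slowest one, and even then by at most the
hop cost.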

-Steve

On Wed, Dec 10, 2014 at 9:26 AM, Erick Erickson wrote:

> Just skimming, but if I'm reading this right, your suggestion is
> that queries be served locally rather than being forwarded to
> another replica when possible.
>
> So let's take the one-shard case with N replicas to make sure
> I understand. In a one-shard case, no query really needs to
> be forwarded, since any replica can fully get the results so
> in this case no query would be forwarded.
>
> If this is a fair summary, then consider the situation where the
> outside world connects to a single server rather than to a
> fronting load balancer. Then only one replica would be doing
> any work....
>
> Or am I off in the weeds?
>
> That aside, if I've gotten it wrong and you want to put
> up a patch (or even just outline a better approach),
> feel free to open a JIRA and attach a patch...
>
> Best,
> Erick
>
> On Tue, Dec 9, 2014 at 11:55 PM, S G wrote:
> > Hello Solr Devs,
> >
> > I am a developer using Solr and wanted to have some opinion on a
> > performance change request.
> >
> > Currently, I see that code flow for a query in SolrCloud is as follows:
> >
> > For distributed query:
> > SolrCore -> SearchHandler.handleRequestBody() -> HttpShardHandler.submit()
> >
> > For non-distributed query:
> > SolrCore -> SearchHandler.handleRequestBody() -> QueryComponent.process()
> >
> >
> > For a distributed query, the request is always sent to all the shards,
> > even if the originating SolrCore (handling the original distributed
> > query) is a replica of one of the shards.
> > If the original SolrCore can check itself before sending HTTP requests
> > for any shard, we can probably save some network hopping and gain some
> > performance.
> >
> > If this idea seems feasible, I can submit a JIRA ticket and work on it.
> > I am planning to change SearchHandler.handleRequestBody() or
> > HttpShardHandler.submit()
> >
> > Thanks
> > SG
> >
>
