Re: Querying locally before sending a distributed request

S G Wed, 10 Dec 2014 09:22:23 -0800

I have opened https://issues.apache.org/jira/browse/SOLR-6832 to track this.


The performance gain increases if coresPerMachine is > 1 and a single JVM
hosts cores from 'k' shards, since up to k of the per-shard requests could
then be answered locally instead of over HTTP.
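
To make the 'k' shards case concrete, here is a minimal sketch of the split
I have in mind. It assumes a shard with a core in the local JVM can be
queried in-process; the class and method names are hypothetical and are not
part of Solr.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Solr class: splits the shards of a distributed
// query into those the local JVM can answer and those that need an HTTP hop.
final class LocalShardSplit {

    // Returns the shards that still need an HTTP request, in their original order.
    static List<String> shardsNeedingHttp(List<String> allShards, Set<String> locallyHosted) {
        List<String> remote = new ArrayList<>();
        for (String shard : allShards) {
            if (!locallyHosted.contains(shard)) {
                remote.add(shard);
            }
        }
        return remote;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("shard1", "shard2", "shard3", "shard4");
        // Assumption for illustration: this JVM hosts cores for k = 2 of the 4 shards.
        Set<String> local = new HashSet<>(Arrays.asList("shard2", "shard4"));
        List<String> remote = shardsNeedingHttp(all, local);
        System.out.println("HTTP requests: " + all.size() + " -> " + remote.size());
    }
}
```

This only counts the hops that could be avoided. As Steve notes below, the
overall response time is still bound by the slowest remaining shard.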

We can also look into preferring replicas on machines with the same IP
address as the current machine (for example, when multiple Tomcat instances
are running on the same machine).
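
A rough sketch of that preference order, using only java.net.URI. It assumes
replicas are identified by their base URLs and that the local host and port
are known; the class name, method names, and example URLs are made up for
illustration and are not Solr APIs.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a Solr class: orders the replica URLs of one shard
// so that the closest replica is tried first.
final class LocalFirstReplicaOrder {

    // 0 = same host and port (same servlet container), 1 = same host only, 2 = remote.
    static int rank(String replicaUrl, String localHost, int localPort) {
        URI uri = URI.create(replicaUrl);
        boolean sameHost = localHost.equalsIgnoreCase(uri.getHost());
        if (sameHost && uri.getPort() == localPort) {
            return 0;
        }
        return sameHost ? 1 : 2;
    }

    // Returns a sorted copy; List.sort is stable, so replicas of equal rank keep their order.
    static List<String> order(List<String> replicaUrls, String localHost, int localPort) {
        List<String> sorted = new ArrayList<>(replicaUrls);
        sorted.sort(Comparator.comparingInt((String url) -> rank(url, localHost, localPort)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList(
                "http://10.0.0.7:8983/solr/collection1_shard2_replica1",
                "http://10.0.0.5:8984/solr/collection1_shard2_replica2",
                "http://10.0.0.5:8983/solr/collection1_shard2_replica3");
        System.out.println(order(replicas, "10.0.0.5", 8983));
    }
}
```

Always picking the first entry would concentrate load on the nearest replica,
so a real change would still need some load balancing among replicas of equal
rank.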


On Wed, Dec 10, 2014 at 7:14 AM, Steve Davids <[email protected]> wrote:

> bq. In a one-shard case, no query really needs to be forwarded, since any
> replica can fully get the results so in this case no query would be
> forwarded.
>
> You can pass the request param distrib=false to keep the request from
> being distributed in that particular case, at which point it will only
> gather results from that particular host.
>
> As for the SolrCloud example with n-shards > 1, your overall search request
> time is limited by the slowest shard's response time. So, you would
> potentially be saving one hop, but you are still making n-1 other hops to
> gather all of the other shards' results, thus making it a moot point since
> you will be waiting on the other shards to respond before you can return
> the aggregated result list. You will then be on the hook to set up the load
> balancing across replicas of that one particular host you have chosen to
> query, as Erick said, which could have some gotchas for people not expecting
> that behavior.
>
> -Steve
>
> On Wed, Dec 10, 2014 at 9:26 AM, Erick Erickson <[email protected]>
> wrote:
>
>> Just skimming, but if I'm reading this right, your suggestion is
>> that queries be served locally rather than being forwarded to
>> another replica when possible.
>>
>> So let's take the one-shard case with N replicas to make sure
>> I understand. In a one-shard case, no query really needs to
>> be forwarded, since any replica can fully get the results so
>> in this case no query would be forwarded.
>>
>> If this is a fair summary, then consider the situation where the
>> outside world connects to a single server rather than to a
>> fronting load balancer. Then only one replica would be doing
>> any work....
>>
>> Or am I off in the weeds?
>>
>> That aside, if I've gotten it wrong and you want to put
>> up a patch (or even just outline a better approach),
>> feel free to open a JIRA and attach a patch...
>>
>> Best,
>> Erick
>>
>> On Tue, Dec 9, 2014 at 11:55 PM, S G <[email protected]> wrote:
>> > Hello Solr Devs,
>> >
>> > I am a developer using Solr and wanted to have some opinion on a
>> > performance change request.
>> >
>> > Currently, I see that code flow for a query in SolrCloud is as follows:
>> >
>> > For distributed query:
>> > SolrCore -> SearchHandler.handleRequestBody() -> HttpShardHandler.submit()
>> >
>> > For non-distributed query:
>> > SolrCore -> SearchHandler.handleRequestBody() -> QueryComponent.process()
>> >
>> >
>> > For a distributed query, the request is always sent to all the shards,
>> > even if the originating SolrCore (handling the original distributed
>> > query) is a replica of one of the shards.
>> > If the originating SolrCore can check itself before sending HTTP requests
>> > for any shard, we can probably save some network hops and gain some
>> > performance.
>> >
>> > If this idea seems feasible, I can submit a JIRA ticket and work on it.
>> > I am planning to change SearchHandler.handleRequestBody() or
>> > HttpShardHandler.submit()
>> >
>> > Thanks
>> > SG
>> >
>>
>>
>>
>
