[ 
https://issues.apache.org/jira/browse/SOLR-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022573#comment-18022573
 ] 

Chris M. Hostetter commented on SOLR-17926:
-------------------------------------------

I definitely prefer the approach in the PR over the use of "NOW" in the patch.

"NOW" really makes sense for ensuring that date rounding/arithmetic on values 
_in the documents_ is treated consistently regardless of replica or query 
stage _because we *expect* clock drift_ between the replicas - I don't think it 
makes sense to try to use it for "how much timeAllowed do I have left?" type 
calculations (on the order of 10s of milliseconds) in replicas that didn't 
generate the "NOW" value in the first place.

(We also document "NOW" in the ref-guide as a way for clients to specify their 
own frame of reference for requests that include date math, so anyone doing 
that would get all sorts of really wonky timeAllowed results if we go that 
route.)
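
To illustrate the drift concern, here is a minimal sketch. It is hypothetical code, not taken from the PR or the patch: the class and method names are made up, and the skew values are only examples. It computes the remaining budget the way a replica would if it compared its own clock against a "NOW" generated elsewhere.

```java
public class NowDriftSketch {

    /**
     * Hypothetical helper: remaining budget when a replica subtracts the
     * coordinator-supplied NOW from its *own* wall clock. Any skew between
     * the two clocks is silently counted as elapsed (or un-elapsed) time.
     */
    public static long remainingUsingNow(long coordinatorNowMs,
                                         long replicaClockMs,
                                         long timeAllowedMs) {
        return timeAllowedMs - (replicaClockMs - coordinatorNowMs);
    }

    public static void main(String[] args) {
        long timeAllowedMs = 50;          // budget requested by the client
        long coordinatorNow = 1_000_000L; // coordinator clock at request start
        long trueElapsedMs = 5;           // real time spent before the replica looks

        // Assumed example skews: replica clock in sync, 30ms ahead, 30ms behind.
        for (long skewMs : new long[] {0, 30, -30}) {
            long replicaClock = coordinatorNow + trueElapsedMs + skewMs;
            long remaining = remainingUsingNow(coordinatorNow, replicaClock, timeAllowedMs);
            System.out.println("skew=" + skewMs + "ms -> remaining=" + remaining + "ms");
        }
    }
}
```

With a 50ms budget, a skew of a few tens of milliseconds is the same order of magnitude as the budget itself, so the replica either gives up far too early or runs well past the intended limit.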
----
Two things about the PR that I'm confused by:
 # I'm not really sure I understand the utility of 
{{adjustShardRequestLimit}} adding {{USED_PARAM}} on sub-requests, instead of 
just decrementing the value of {{timeAllowed}} on the sub-requests (like the 
current grouping code does)?
 # I don't really understand the point of {{INFLIGHT_PARAM}}. Nothing in the 
code sets it, which I guess is fine? – it looks like it's intended to just be a 
way for external clients to override the implicit assumption that "2ms" isn't 
enough (remaining) time to bother sending sub-requests – but the only code path 
where {{req.getParams().getLong(INFLIGHT_PARAM, DEFAULT_INFLIGHT_MS)}} is 
called is a conditional block in the constructor where we already know 
"{{// this is a sub-request}}" ... which means {{adjustShardRequestLimit}} 
(which is only ever going to get called in the original parent request) will 
only ever use the {{DEFAULT_INFLIGHT_MS}} of 2ms ... right?
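
For reference, the "just decrement {{timeAllowed}}" alternative from the first point could look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's code and not the existing grouping code: the class, method, and constant names are hypothetical, and treating "2ms or less remaining" as the cut-off is an assumption about how the threshold would be applied.

```java
import java.util.OptionalLong;

public class DecrementTimeAllowedSketch {

    // Hypothetical stand-in for the 2ms default discussed above.
    static final long MIN_REMAINING_MS = 2;

    /**
     * Returns the timeAllowed value to put on a sub-request, or empty if so
     * little budget remains that sending the sub-request is not worthwhile.
     * elapsedMs is assumed to be measured on the coordinator only (for
     * example via System.nanoTime()), so no cross-node clock comparison
     * is involved.
     */
    public static OptionalLong subRequestTimeAllowed(long timeAllowedMs, long elapsedMs) {
        long remaining = timeAllowedMs - elapsedMs;
        return remaining > MIN_REMAINING_MS
                ? OptionalLong.of(remaining)
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Plenty of budget left: the sub-request gets the reduced limit.
        System.out.println(subRequestTimeAllowed(100, 40));
        // Almost nothing left: skip the sub-request entirely.
        System.out.println(subRequestTimeAllowed(100, 99));
    }
}
```

In this shape the sub-request only ever sees a smaller {{timeAllowed}}, so no extra request parameter is needed to carry the time already used.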

> Discount timeAllowed for all types of queries
> ---------------------------------------------
>
>                 Key: SOLR-17926
>                 URL: https://issues.apache.org/jira/browse/SOLR-17926
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 9.9
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: SOLR-17926-using-NOW.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Spin-off from SOLR-17869.
> Currently only {{TopGroupsShardRequestFactory}} subtracts the time already 
> spent on local request processing from {{timeAllowed}} before sending shard 
> requests.
> This is inconsistent and likely not optimal. Since {{timeAllowed}} tracks 
> wall-clock time it makes sense to track the same starting point for all 
> phases of distributed request processing and terminate processing early when 
> the allowed time runs out, as compared to the original starting point.
> This is not the way it works now, though (except for this special case of 
> grouping queries): the same time span is allocated to the query coordinator 
> and to the shard requests where the processing starts later, which means that 
> the coordinator may time out while waiting for responses even if all shard 
> requests succeeded.
> [~dsmiley] suggested using {{SolrRequestInfo.getNOW()}} instead, as the 
> absolute starting point for both local and distributed requests, and compare 
> {{timeAllowed}} to that starting point. However, this relies on correct time 
> sync between nodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
