[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836658#comment-13836658
 ] 

Anshum Gupta commented on SOLR-5477:
------------------------------------

Here's what I'd recommend. 

Have 3 queues in the first phase of implementation, one each for submitted, 
running, and completed tasks. The completed queue keeps only the top-X tasks 
(by recency of completion). The completed queue is important because it lets 
people look up details about a finished task, e.g. completion time, running 
time, etc.
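
To make the three-queue idea concrete, here is a minimal in-memory sketch. It is only an illustration: the real queues would presumably live in ZooKeeper, and the names TaskQueues, Completed, startNext and maxCompleted are hypothetical, not existing Solr classes or settings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the submitted/running/completed queues.
class TaskQueues {
    // Details kept for a finished task.
    static final class Completed {
        final String id;
        final long completedAtMs;
        final long runningTimeMs;

        Completed(String id, long completedAtMs, long runningTimeMs) {
            this.id = id;
            this.completedAtMs = completedAtMs;
            this.runningTimeMs = runningTimeMs;
        }
    }

    private final Deque<String> submitted = new ArrayDeque<>();
    private final Map<String, Long> running = new HashMap<>();      // task id -> start time (ms)
    private final Deque<Completed> completed = new ArrayDeque<>();  // most recently completed first
    private final int maxCompleted;                                 // the "top-X" cap

    TaskQueues(int maxCompleted) {
        this.maxCompleted = maxCompleted;
    }

    synchronized void submit(String taskId) {
        submitted.addLast(taskId);
    }

    // Moves the oldest submitted task to the running queue; returns null if none is waiting.
    synchronized String startNext(long nowMs) {
        String id = submitted.pollFirst();
        if (id != null) {
            running.put(id, nowMs);
        }
        return id;
    }

    // Moves a running task to the completed queue and evicts the oldest entries beyond the cap.
    synchronized void complete(String taskId, long nowMs) {
        Long startedAt = running.remove(taskId);
        if (startedAt == null) {
            throw new IllegalStateException("Task is not running: " + taskId);
        }
        completed.addFirst(new Completed(taskId, nowMs, nowMs - startedAt));
        while (completed.size() > maxCompleted) {
            completed.removeLast();
        }
    }

    synchronized List<Completed> completedTasks() {
        return new ArrayList<>(completed);
    }
}
```

With a cap of 2, completing three tasks in order leaves only the two most recent ones visible, each with its completion time and running time.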

I've started working on it and would recommend that we have a ThreadPool for 
the running tasks. The pool size can be capped by a config setting.
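
A capped pool could look roughly like the sketch below, which uses the standard java.util.concurrent fixed thread pool. The class name CappedTaskRunner and the maxRunningTasks parameter are made up for illustration; how the cap would actually be read from config is not decided here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: at most maxRunningTasks tasks execute at the same time.
class CappedTaskRunner {
    private final ExecutorService pool;

    CappedTaskRunner(int maxRunningTasks) {
        // A fixed pool queues extra tasks until one of its threads is free.
        this.pool = Executors.newFixedThreadPool(maxRunningTasks);
    }

    void run(Runnable task) {
        pool.execute(task);
    }

    // Stops accepting new tasks and waits for the queued ones to finish.
    // Returns false on timeout or if the waiting thread is interrupted.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```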

I am still debating when to reject tasks: at submission time, or accept 
everything and fail them when they run. Here's a sample case. Firing a shard 
split for collection1/shard1 would lead to an inactive shard1. If we continue 
to accept tasks until the split completes, we may accept actions that involve 
shard1. We may need to take a call on that.
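
As an illustration of the "reject at submission" option only (the other option, failing the task when it runs, is not shown), a check like the following could refuse a task whose target shard is already claimed by an unfinished task. ShardAdmission, tryAccept and release are hypothetical names, and tracking busy shards in a local set is an assumption made for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical admission check: one unfinished task per collection/shard.
class ShardAdmission {
    private final Set<String> busyShards = new HashSet<>();

    // Returns true and claims the shard if no unfinished task targets it; false otherwise.
    synchronized boolean tryAccept(String collection, String shard) {
        return busyShards.add(collection + "/" + shard);
    }

    // Called when the task that claimed the shard has finished.
    synchronized void release(String collection, String shard) {
        busyShards.remove(collection + "/" + shard);
    }
}
```

In the split example, the split task would claim collection1/shard1, and any later task naming shard1 would be refused until the split releases it.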

For now, I am not looking at truly multi-threading my implementation (though I 
will certainly do that before this JIRA is resolved). Once I get to it, I'd 
probably still run only one request per collection at a time, until we have 
more complex decision-making capability.
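
One simple way to get "one request per collection at a time" on top of a multi-threaded runner is a lock per collection, sketched below. PerCollectionSerializer and runExclusive are hypothetical names, and an in-JVM lock is an assumption that only holds while a single Overseer processes the tasks.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: tasks for the same collection run one at a time,
// while tasks for different collections may run in parallel.
class PerCollectionSerializer {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    void runExclusive(String collection, Runnable task) {
        ReentrantLock lock = locks.computeIfAbsent(collection, c -> new ReentrantLock());
        lock.lock();
        try {
            task.run();
        } finally {
            lock.unlock();
        }
    }
}
```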

Once a task is submitted, the OverseerCollectionProcessor peeks at the 
submitted queue, processes the task, and moves it to the running queue. We'll 
have to synchronize this step on the queue/collection.

Upon completion, the task is moved from the running queue to the completed 
queue.

Cleaning up the completed queue could also be tricky: we may need a separate 
failed-tasks queue, or a way to retain failed tasks in the completed queue 
longer.

> Async execution of OverseerCollectionProcessor tasks
> ----------------------------------------------------
>
>                 Key: SOLR-5477
>                 URL: https://issues.apache.org/jira/browse/SOLR-5477
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>
> Typical collection admin commands are long running, and it is very common 
> for the requests to time out. This is more of a problem if the cluster is 
> very large. Add an option to run these commands asynchronously:
> add an extra param async=true for all collection commands;
> the task is written to ZK and the caller is returned a task id.
> A separate collection admin command will be added to poll the status of the 
> task:
> command=status&id=7657668909
> If id is not passed, all running async tasks should be listed.
> A separate queue is created to store in-process tasks. After the tasks are 
> completed, the queue entry is removed. OverseerCollectionProcessor will 
> perform these tasks in multiple threads.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

