[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

Shalin Shekhar Mangar (JIRA) Fri, 16 May 2014 09:36:25 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998700#comment-13998700
 ]


Shalin Shekhar Mangar commented on SOLR-5681:
---------------------------------------------

More comments:
# DistributedQueue.peekTopN should count stats in the same way as peek() does 
by using “peekN_wait_forever” and “peekN_wait_” + wait.
# DistributedQueue.peekTopN is still not correct. Suppose orderedChildren 
returns 0 nodes, the childWatcher.await will be called, thread will wait and 
immediately return 0 results even if children were available. So there was no 
point in waiting at all if we were going to return 0 results.
# The same thing happens later in DQ.peekTopN after the loop. There’s no point 
in calling await if we’re going to return null anyway.
{code}
childWatcher.await(wait == Long.MAX_VALUE ? DEFAULT_TIMEOUT : wait);
          waitedEnough = wait != Long.MAX_VALUE;
          if (waitedEnough) {
            return null;
          }
{code}
# The DQ.getTailId (renamed from getLastElementId) still has an empty catch 
block for KeeperException.
# We should probably add unit test for the DQ.peekTopN method.

> Make the OverseerCollectionProcessor multi-threaded
> ---------------------------------------------------
>
>                 Key: SOLR-5681
>                 URL: https://issues.apache.org/jira/browse/SOLR-5681
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Anshum Gupta
>            Assignee: Anshum Gupta
>         Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, 
> SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single threaded i.e submitting 
> anything long running would have it block processing of other mutually 
> exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have 
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and 
> thereby, not processing a create collection task (which would stay queued in 
> zk) though both the tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An 
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
> The task from the workQueue is only removed on completion so that in case of 
> a failure, the new Overseer can re-consume the same task and retry. A queue 
> is not the right data structure in the first place to look ahead i.e. get the 
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks 
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting 
> and passing the request to a new thread (or one from the pool). The parent 
> method uses a peekAfter(last element) instead of a peek(). The peekAfter 
> returns the task after the 'last element'. Maintain this request information 
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also 
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as 
> soon as a task from this is picked up for processing by the thread, it's 
> removed from the queue. At the end, the cleanup is done from the workQueue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

Reply via email to