[
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998700#comment-13998700
]
Shalin Shekhar Mangar commented on SOLR-5681:
---------------------------------------------
More comments:
# DistributedQueue.peekTopN should count stats in the same way as peek() does
by using “peekN_wait_forever” and “peekN_wait_” + wait.
# DistributedQueue.peekTopN is still not correct. Suppose orderedChildren
returns 0 nodes: childWatcher.await will be called, the thread will wait, and
then it will immediately return 0 results even if children have become
available. So there was no point in waiting at all if we were going to return
0 results.
# The same thing happens later in DQ.peekTopN after the loop. There’s no point
in calling await if we’re going to return null anyway.
{code}
childWatcher.await(wait == Long.MAX_VALUE ? DEFAULT_TIMEOUT : wait);
waitedEnough = wait != Long.MAX_VALUE;
if (waitedEnough) {
return null;
}
{code}
# The DQ.getTailId (renamed from getLastElementId) still has an empty catch
block for KeeperException.
# We should probably add a unit test for the DQ.peekTopN method.
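
To illustrate the wait-then-recheck behavior asked for above, here is a minimal sketch. It is not the actual DistributedQueue code: the class PeekTopNSketch, its in-memory list, and the lock/condition standing in for the ZooKeeper child watcher are all assumptions made for illustration. The point it shows is that after waking up, the children are checked again, and an empty result is returned only once the wait budget is used up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for the queue; NOT the real DistributedQueue.
final class PeekTopNSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // Plays the role of the child watcher: signalled whenever a child is added.
    private final Condition childrenChanged = lock.newCondition();
    private final List<String> children = new ArrayList<>();

    // Adds an element and wakes up any waiting peekTopN callers.
    void offer(String item) {
        lock.lock();
        try {
            children.add(item);
            childrenChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Returns up to n elements from the head without removing them.
    // Waits at most waitMillis for the first element; Long.MAX_VALUE
    // effectively means "wait forever".
    List<String> peekTopN(int n, long waitMillis) {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(waitMillis);
        lock.lock();
        try {
            // Re-check the children after every wake-up instead of
            // returning an empty result right after the wait.
            while (children.isEmpty()) {
                if (remainingNanos <= 0) {
                    return new ArrayList<>(); // wait budget exhausted
                }
                remainingNanos = childrenChanged.awaitNanos(remainingNanos);
            }
            int count = Math.min(n, children.size());
            return new ArrayList<>(children.subList(0, count));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt
            return new ArrayList<>();
        } finally {
            lock.unlock();
        }
    }
}
```

A unit test for the real method could follow the same shape: peek on an empty queue with a short wait and expect nothing, then offer from a second thread and expect the waiting peek to return the new element.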
> Make the OverseerCollectionProcessor multi-threaded
> ---------------------------------------------------
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Anshum Gupta
> Assignee: Anshum Gupta
> Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch,
> SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch,
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch,
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch,
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch,
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch,
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting
> anything long-running would have it block processing of other mutually
> exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and
> thereby keep a create collection task from being processed (it would stay
> queued in zk), even though both tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue.
> The task from the workQueue is only removed on completion so that in case of
> a failure, the new Overseer can re-consume the same task and retry. A queue
> is not the right data structure in the first place to look ahead i.e. get the
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting
> and passing the request to a new thread (or one from the pool). The parent
> method uses a peekAfter(last element) instead of a peek(). The peekAfter
> returns the task after the 'last element'. Maintain this request information
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as
> soon as a task from this is picked up for processing by the thread, it's
> removed from the queue. At the end, the cleanup is done from the workQueue.