[jira] [Commented] (SOLR-6760) New optimized DistributedQueue implementation for overseer

Scott Blum (JIRA) Wed, 12 Aug 2015 09:47:26 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693804#comment-14693804 ]


Scott Blum commented on SOLR-6760:
----------------------------------

[~noble.paul] I feel like DistributedQueue, in both its API and its 
implementation, is pretty clean, cohesive, and general.  This is evidenced by 
the fact that most of the existing places we were using DQ "just work".

DistributedQueueExt is what I feel is kind of crap that was glommed on to 
support the collection task queue, specifically.  It has methods like 
containsTaskWithRequestId() that are highly specific to the collection task 
queue; the strange QueueEvent and response-prefix stuff, whose purpose I don't 
even understand; getTailId(), which peeks at the end of the queue with unclear 
semantics (is it good enough to answer with the end of the in-memory queue, or 
does the caller expect a synchronous read-through into ZK?); and a remove 
method that doesn't operate on the head of the queue.  In addition to the 
unclear semantics of some of these, the implementations of some of them 
necessarily break the clean model DQ uses and are in some cases FAR less 
efficient: containsTaskWithRequestId(), for example, has to not only fetch the 
entire list from ZK but then also read every data node.
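
To make the cost difference concrete, here is a minimal sketch.  It is not the 
Solr code: FakeZk is a hypothetical in-memory stand-in for a ZK queue 
directory that only counts calls, and the "requestid=..." payload format is 
made up for illustration.  Peeking at the head costs one listing plus one data 
read, while a request-id scan costs one listing plus up to one data read per 
queued item.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class QueueScanCost {
    /** Hypothetical in-memory stand-in for a ZK queue directory; it only counts calls. */
    static class FakeZk {
        final TreeMap<String, byte[]> nodes = new TreeMap<>();
        int listCalls = 0;
        int dataReads = 0;

        void put(String name, String payload) {
            nodes.put(name, payload.getBytes(StandardCharsets.UTF_8));
        }

        /** Lists child names in sorted order (TreeMap keeps keys sorted). */
        List<String> getChildren() {
            listCalls++;
            return new ArrayList<>(nodes.keySet());
        }

        byte[] getData(String name) {
            dataReads++;
            return nodes.get(name);
        }
    }

    /** Head-style access: one listing and one data read. */
    static String peekHead(FakeZk zk) {
        List<String> children = zk.getChildren();
        if (children.isEmpty()) {
            return null;
        }
        return new String(zk.getData(children.get(0)), StandardCharsets.UTF_8);
    }

    /** Scan-style access: one listing, then a data read per node until a match is found. */
    static boolean containsRequestId(FakeZk zk, String requestId) {
        String wanted = "requestid=" + requestId; // assumed payload format
        for (String child : zk.getChildren()) {
            String payload = new String(zk.getData(child), StandardCharsets.UTF_8);
            if (payload.equals(wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

With N items queued, a miss in containsRequestId() reads all N data nodes, 
which is the behavior I'm objecting to above.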

Suffice it to say I don't think anything in there is good enough to promote 
into the general purpose DQ.  Maybe the core issue is that the collection work 
queue is fundamentally looking for something more, like a distributed task 
queue.  I think someone should go back and analyze the true needs there and 
figure out if there's something better we can do.

> New optimized DistributedQueue implementation for overseer
> ----------------------------------------------------------
>
>                 Key: SOLR-6760
>                 URL: https://issues.apache.org/jira/browse/SOLR-6760
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>         Attachments: SOLR-6760.patch, deadlock.patch
>
>
> Currently the DQ works as follows:
> * read all items in the directory
> * sort them all
> * take the head, return it, and discard everything else
> * rinse and repeat
> This works well when we have only a handful of items in the queue. If the 
> queue holds many more items (tens of thousands), this is 
> counterproductive.
> As the overseer queue is a multiple-producer, single-consumer queue, we can 
> read all the items in bulk and, before processing each item, just do a 
> zk.exists(itemname); if all is well, we don't need to do the fetch-all + 
> sort step again.
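
The bulk-read idea in the description can be sketched roughly as follows.  
This is an illustration under stated assumptions, not the attached patch: 
FakeZk and Consumer are hypothetical names, the fake directory lives in memory, 
and item names are assumed to sort in enqueue order (as sequential node names 
would).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BulkReadQueueSketch {
    /** Hypothetical in-memory stand-in for the queue directory; it only counts calls. */
    static class FakeZk {
        final Set<String> nodes = new HashSet<>();
        int listCalls = 0;
        int existsCalls = 0;

        List<String> getChildren() {
            listCalls++;
            return new ArrayList<>(nodes);
        }

        boolean exists(String name) {
            existsCalls++;
            return nodes.contains(name);
        }

        void delete(String name) {
            nodes.remove(name);
        }
    }

    /** Single consumer that keeps a sorted snapshot instead of re-listing on every poll. */
    static class Consumer {
        private final FakeZk zk;
        private final Deque<String> cached = new ArrayDeque<>();

        Consumer(FakeZk zk) {
            this.zk = zk;
        }

        /** Removes and returns the next item name, or null if the queue is empty. */
        String poll() {
            while (true) {
                if (cached.isEmpty()) {
                    // Fetch all + sort, but only when the snapshot is used up or stale.
                    List<String> children = zk.getChildren();
                    if (children.isEmpty()) {
                        return null;
                    }
                    Collections.sort(children);
                    cached.addAll(children);
                }
                String head = cached.removeFirst();
                if (zk.exists(head)) {
                    zk.delete(head);
                    return head;
                }
                // The item vanished underneath us, so the snapshot is stale: start over.
                cached.clear();
            }
        }
    }
}
```

In this sketch, draining N items costs one listing and N exists checks when 
nothing is removed behind the consumer's back, instead of N listings and N 
sorts.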



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
