If it helps, this is the debug dump from the single regionserver.

Tasks:
===========================================================
Task: RpcServer.reader=5,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2326s
Task: RpcServer.reader=6,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2326s
Task: RpcServer.reader=7,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2313s
Task: RpcServer.reader=8,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2312s
Task: RpcServer.reader=9,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2282s
Task: RpcServer.reader=0,bindAddress=0.0.0.0,port=60020
Status: WAITING:Waiting for a call
Running for 2282s

On Mon, Sep 14, 2015 at 1:49 PM, [email protected] <[email protected]> wrote:

> So circling back a little bit on this discussion, if someone can help me
> understand this behavior I would really appreciate that. So as mentioned
> above, my num RPC handlers on RS and master are set to 1. Also I trimmed
> down the number of RS to 1. However when I restart the cluster, I still see
> 6 of them showing up in the UI all waiting for an operation from the client:
>
> Mon Sep 14 13:19:38 CDT 2015 | RpcServer.reader=0,bindAddress=0.0.0.0,port=60020 | WAITING (since 29sec ago) | Waiting for a call (since 29sec ago)
> Mon Sep 14 13:19:38 CDT 2015 | RpcServer.reader=9,bindAddress=0.0.0.0,port=60020 | WAITING (since 59sec ago) | Waiting for a call (since 59sec ago)
> Mon Sep 14 13:19:08 CDT 2015 | RpcServer.reader=8,bindAddress=0.0.0.0,port=60020 | WAITING (since 59sec ago) | Waiting for a call (since 59sec ago)
> Mon Sep 14 13:19:07 CDT 2015 | RpcServer.reader=7,bindAddress=0.0.0.0,port=60020 | WAITING (since 34sec ago) | Waiting for a call (since 34sec ago)
> Mon Sep 14 13:18:54 CDT 2015 | RpcServer.reader=6,bindAddress=0.0.0.0,port=60020 | WAITING (since 1mins, 59sec ago) | Waiting for a call (since 1mins, 59sec ago)
> Mon Sep 14 13:18:54 CDT 2015 | RpcServer.reader=5,bindAddress=0.0.0.0,port=60020 | WAITING (since 1mins, 59sec ago) | Waiting for a call (since 1mins, 59sec ago)
>
> Anything obvious that I am missing here?
> Shouldn't the num handlers on the RS show up as 1 if that is what the
> config is set to?
>
> Thanks for the help,
>
> On Fri, Sep 4, 2015 at 12:35 PM, [email protected] <[email protected]> wrote:
>
>> > if you receive [scan, get] the get has to wait for the scan to complete.
>>
>> Agreed. And like I mentioned in my original post, this is what I was
>> expecting to happen with FIFO too, but did not see. Maybe having multiple
>> handlers in the wait queue was the reason[1], even though I put the num RPC
>> handlers on the RS down to 1, but I am not sure.
>>
>> Also thanks for the explanation on the "deadline" concept. It makes much
>> more sense now.
>>
>> [1] http://imgur.com/tUu4y8r
>>
>> On Fri, Sep 4, 2015 at 12:12 PM, Matteo Bertozzi <[email protected]> wrote:
>>
>>> On Fri, Sep 4, 2015 at 10:03 AM, [email protected] <[email protected]> wrote:
>>>
>>> > Ah ok. So if I understand you right, irrespective of which queue type I
>>> > use, if a get comes when a chunk is being processed, it is going to wait.
>>>
>>> assuming no other threads (handlers) are available, yes.
>>> we don't have the ability to eject in-progress requests.
>>>
>>> > > what fifo vs deadline does is pushing the scan chunk back in the queue if
>>> > > you have multiple operations with higher priority.
>>> >
>>> > So, if I set the queue type to FIFO, the scans shouldn't get pushed back,
>>> > correct? Wouldn't this mean that a "get" would be after a scan in the queue
>>> > if the scan was submitted before? Or is it still possible in this case for
>>> > the handler to switch between the scan chunks, quickly execute the "get"
>>> > and then get back to execute the rest of the chunks?
>>>
>>> using fifo, the requests are executed in the order the server receives them.
>>> assuming 1 thread (handler). if you receive [scan, get] the get has to wait
>>> for the scan to complete.
>>> if you receive [get, scan] the scan has to wait for the get to complete.
>>> currently, there is no way to interrupt the operation being executed and
>>> execute the one with higher priority.
>>>
>>> deadline does this thing:
>>> let's say you have 1 thread, and that you are executing one operation.
>>> a scan comes in, and then a get comes in.
>>> the other operation is not completed yet. so your request order looks like
>>> [scan, get]
>>> the deadline queue reorders the operations as [get, scan] if the scan
>>> should be pushed back.
>>> but this reorder happens before execution. they are still in the queue when
>>> they are reordered.
>>> for now, once one operation is in execution there is no way to switch to
>>> another with higher priority.
>>
>> --
>> Swarnim
>
> --
> Swarnim

--
Swarnim
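
For anyone reading this thread later, the FIFO vs. deadline behaviour Matteo describes can be sketched as a toy single-handler simulation. This is only an illustration, not HBase code: the `simulate` function, the `PENALTY` table, and the request tuples are all hypothetical names made up for this example, and the fixed per-kind penalty is a stand-in for whatever HBase's deadline queue actually uses to decide that a scan should be pushed back. The one thing it is meant to show is that reordering only affects requests still waiting in the queue; the request already running is never interrupted.

```python
# Toy model of ONE handler thread pulling from a call queue.
# Hypothetical priorities (assumption, not HBase's real deadline function):
# a lower penalty is served first by the "deadline" queue.
PENALTY = {"get": 0, "put": 0, "scan": 1}


def simulate(requests, queue_type):
    """Return the names of requests in the order a single handler runs them.

    requests:   list of (arrival_time, name, kind, duration) tuples
    queue_type: "fifo" or "deadline"
    """
    pending = sorted(requests, key=lambda r: r[0])  # stable sort by arrival
    queue, order = [], []
    now, i = 0, 0
    while i < len(pending) or queue:
        # Everything that has arrived by `now` is waiting in the queue.
        while i < len(pending) and pending[i][0] <= now:
            queue.append(pending[i])
            i += 1
        if not queue:
            now = pending[i][0]  # handler is idle until the next arrival
            continue
        if queue_type == "fifo":
            idx = 0  # strictly arrival order
        elif queue_type == "deadline":
            # Reorder only among WAITING requests; ties keep arrival order.
            idx = min(range(len(queue)), key=lambda k: PENALTY[queue[k][2]])
        else:
            raise ValueError("unknown queue type: " + queue_type)
        _, name, _, duration = queue.pop(idx)
        order.append(name)
        now += duration  # runs to completion, no preemption
    return order


# Handler is busy with a put when a scan and then a get arrive.
busy = [(0, "put", "put", 5), (1, "scan", "scan", 5), (2, "get", "get", 1)]
print(simulate(busy, "fifo"))      # ['put', 'scan', 'get']
print(simulate(busy, "deadline"))  # ['put', 'get', 'scan']

# Handler is idle when the scan arrives, so the scan starts immediately
# and the get has to wait under either queue type.
idle = [(0, "scan", "scan", 5), (1, "get", "get", 1)]
print(simulate(idle, "fifo"))      # ['scan', 'get']
print(simulate(idle, "deadline"))  # ['scan', 'get']
```

The second scenario is the case discussed above: with one handler and nothing else in flight, [scan, get] runs as scan then get regardless of the queue type, because the scan is already executing by the time the get is queued.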
