[ 
https://issues.apache.org/jira/browse/KUDU-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625692#comment-17625692
 ] 

Bakai Ádám commented on KUDU-1698:
----------------------------------

I tried to recreate the exact steps in a test, but it failed because the 
client did not rediscover; instead it retried over and over until the session 
timeout expired.  I talked with [~aserbin] and we decided to create a new 
issue for the missing rediscovery behaviour, and to test the session and RPC 
timeouts in another way. The new issue is KUDU-3414.
The new idea for testing that the two timeouts are separate entities:
* Make the tablet lookup artificially slow by adding latency.
* Verify that the RPC times out but is retried.
* Remove the artificial delay.
* Check that the operation succeeded in the end and that the tablet lookup 
happened twice.
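
The plan above can be modelled outside of Kudu roughly as follows. This is 
only a sketch with a simulated clock, not the actual test: Result, 
RunWithRetries, SlowFirstLookup and the latency values are hypothetical names 
and numbers chosen for illustration, and no Kudu client API is used.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>

using Ms = std::chrono::milliseconds;

// Outcome of one simulated operation (hypothetical type, not part of Kudu).
struct Result {
  bool ok;       // true if some attempt finished within its per-call budget
  int attempts;  // number of lookups issued
  Ms elapsed;    // simulated time spent in total
};

// Retries a call until it succeeds or the operation timeout is used up.
// Each attempt gets at most rpc_timeout, capped by what is left of the
// overall budget. latency_for_attempt(n) is the simulated latency of the
// n-th attempt (1-based). Time is simulated; nothing actually sleeps.
Result RunWithRetries(Ms rpc_timeout, Ms op_timeout,
                      const std::function<Ms(int)>& latency_for_attempt) {
  Ms elapsed{0};
  int attempts = 0;
  while (elapsed < op_timeout) {
    ++attempts;
    const Ms budget = std::min(rpc_timeout, op_timeout - elapsed);
    const Ms latency = latency_for_attempt(attempts);
    if (latency <= budget) {
      elapsed += latency;  // the call completed in time
      return {true, attempts, elapsed};
    }
    elapsed += budget;  // the call timed out; retry if budget remains
  }
  return {false, attempts, elapsed};
}

// Scenario from the list above: the first lookup is artificially slow,
// then the delay is removed for later attempts.
Result SlowFirstLookup(Ms rpc_timeout, Ms op_timeout) {
  return RunWithRetries(rpc_timeout, op_timeout, [](int attempt) {
    return attempt == 1 ? Ms(3000) : Ms(10);
  });
}
```

In this model, SlowFirstLookup(Ms(500), Ms(2000)) succeeds on the second 
lookup, while SlowFirstLookup(Ms(2000), Ms(2000)) fails after a single lookup 
because the one slow call consumes the whole operation budget, which is the 
behaviour the issue describes as a bug.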

> Kudu C++ client: add a new unit test to make sure default_rpc_timeout and 
> session timeout are separate entities
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-1698
>                 URL: https://issues.apache.org/jira/browse/KUDU-1698
>             Project: Kudu
>          Issue Type: Task
>          Components: client, test
>            Reporter: Alexey Serbin
>            Assignee: Bakai Ádám
>            Priority: Minor
>              Labels: newbie
>
> We need a new unit test that makes sure there is a difference between the 
> top-level operation timeout and the per-call RPC timeout in the Kudu C++ 
> client library.  Prior to the change introduced in 
> 5195ce573850653e0e53094cdd35a1da93d33444 they were the same (which was a bug).
> The test should:
> * set the per-call RPC timeout when creating the KuduClient object
> * set KuduSession::SetTimeoutMillis() for the target session: the value 
> should be about 2 times the per-call RPC timeout.
> * create a tablet with a replication factor of at least 2.
> * find current tablet replica leader and pause it (send SIGSTOP)
> * make a write into the table
> * make sure the write operation was successful
> Prior to the change introduced in 5195ce573850653e0e53094cdd35a1da93d33444, 
> such a test would fail because the C++ client used the full operation 
> deadline on every RPC call.
> I.e., it would wait until the call to the current leader timed out, and that 
> would consume the time budget of the whole operation.  Once the RPC timeout 
> is less than the timeout for the whole write operation, the call to the 
> frozen tablet server should time out, and the client should re-discover a 
> new tablet replica leader and complete the write operation successfully.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
