[ https://issues.apache.org/jira/browse/HIVE-21506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817618#comment-16817618 ]
Peter Vary commented on HIVE-21506:
-----------------------------------

[~tlipcon]: I have read through the paper and it has plenty of interesting points. My most important takeaway is that we really need the current/expected workload patterns to be able to optimize for them. The solution proposed in the paper helps a lot in the case where "a scalable application will nearly always request hot (often accessed) locks in a compatible mode". I am not entirely sure that we have arrived at this phase. My understanding is that we are not yet blocked by the concurrency checks when acquiring locks; the bottleneck is simply the number of HMS/RDBMS calls needed to implement them.

The solution proposed in the description of this jira works only with active-passive HMS installations. The one based on the paper above enhances that, so it could also work with active-active HMS installations with sticky transactions/queries, with the same caveat: if an HMS instance fails, then all of the transactions/queries connected to it fail as well.

{quote}Does this imply that you'd move the transaction manager out of the HMS into a standalone daemon?{quote}

I am not entirely happy that we tie together the Metadata handling and the Locking/Transaction handling components in HMS. These components have different basic workload characteristics:
* Metadata
** Read heavy
** Content heavy
** Slow changing
* Locking/Transactions
** Balanced read/write
** Low on data
** Fast changing

Since the Metadata side handles a bigger amount of data, we might really need active-active load balancing to serve the needs there, but for Locking/Transactions I think we can easily get away with active-passive load balancing. The possible gains here might offset the extra cost of introducing a new component.

{quote}so maybe a first step is to just look at some kind of stress test/benchmark and understand if we can do any changes to the way we manage the RDBMS table to be more efficient?{quote}

[~lpinter] did some research into reusing the HMS benchmark tool to test the transaction related APIs. If we have the proposed workload, that would help a lot.

> Memory based TxnHandler implementation
> --------------------------------------
>
>                 Key: HIVE-21506
>                 URL: https://issues.apache.org/jira/browse/HIVE-21506
>             Project: Hive
>          Issue Type: New Feature
>          Components: Transactions
>            Reporter: Peter Vary
>            Priority: Major
>
> The current TxnHandler implementations use the backend RDBMS to store every
> Hive lock and all transaction data, so multiple TxnHandler instances can run
> simultaneously and serve requests. The continuous communication/locking done
> on the RDBMS side puts a serious load on the backend databases and also
> restricts the possible throughput.
> If it is possible to have only a single active TxnHandler (with the current
> design, HMS) instance, then we can provide much better performance (using
> only Java based locking). We still have to store the committed write
> transactions in the RDBMS (or later some other persistent storage), but other
> lock and transaction operations could remain memory only.
> The most important drawbacks of this solution are that we definitely lose
> scalability once a single TxnHandler instance is no longer able to serve the
> requests (see NameNode), and fault tolerance in the sense that ongoing
> transactions have to be terminated when the TxnHandler fails. If these
> drawbacks are acceptable in certain situations, then we can provide better
> throughput for the users.
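To make the "memory only" idea from the description more concrete, here is a minimal, hypothetical Java sketch. It is not the proposed patch and not the existing TxnHandler API; class and method names such as InMemoryTxnHandlerSketch and persistCommittedWriteTxn are made up for illustration. It only shows the shape of the design: open transactions and locks live in JVM data structures guarded by java.util.concurrent primitives, and the single RDBMS round trip happens when a write transaction commits.

{code:java}
// Hypothetical sketch only - names and structure are illustrative, not the proposed patch.
// Idea: keep open transactions and locks purely in JVM memory (Java based locking),
// and touch the RDBMS only when a write transaction commits.
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class InMemoryTxnHandlerSketch {

  /** In-memory state of one open transaction. */
  private static final class OpenTxn {
    final long txnId;
    final Set<String> lockedResources = ConcurrentHashMap.newKeySet();
    OpenTxn(long txnId) { this.txnId = txnId; }
  }

  private final AtomicLong txnIdGenerator = new AtomicLong(0);
  private final Map<Long, OpenTxn> openTxns = new ConcurrentHashMap<>();
  // resource key (e.g. "db.table") -> owning txnId; the memory-only lock table
  private final Map<String, Long> exclusiveLocks = new ConcurrentHashMap<>();

  /** Opening a transaction is a pure in-memory operation - no RDBMS round trip. */
  public long openTxn() {
    long id = txnIdGenerator.incrementAndGet();
    openTxns.put(id, new OpenTxn(id));
    return id;
  }

  /** Acquiring a lock is also memory-only; returns false if another txn holds it. */
  public boolean lockExclusive(long txnId, String resource) {
    OpenTxn txn = openTxns.get(txnId);
    if (txn == null) {
      return false;
    }
    Long owner = exclusiveLocks.putIfAbsent(resource, txnId);
    boolean acquired = owner == null || owner == txnId;
    if (acquired) {
      txn.lockedResources.add(resource);
    }
    return acquired;
  }

  /** Only the commit of a write transaction is persisted; everything else stays in memory. */
  public void commitTxn(long txnId, boolean isWriteTxn) {
    OpenTxn txn = openTxns.remove(txnId);
    if (txn == null) {
      return;
    }
    if (isWriteTxn) {
      persistCommittedWriteTxn(txnId, txn.lockedResources); // the single RDBMS call
    }
    txn.lockedResources.forEach(r -> exclusiveLocks.remove(r, txnId));
  }

  /** Placeholder for the one durable write of the committed write transaction. */
  private void persistCommittedWriteTxn(long txnId, Set<String> resources) {
    // A JDBC insert would go here; omitted in this sketch.
  }
}
{code}

The drawback mentioned in the description is visible in the sketch as well: all of this state lives in a single JVM, so if that TxnHandler instance dies, the ongoing transactions and their locks are lost and have to be aborted on failover.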
-- This message was sent by Atlassian JIRA (v7.6.3#76005)