[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644142#comment-16644142
 ] 

Vihang Karajgaonkar commented on HIVE-20545:
--------------------------------------------

Hi [~anishek], the size of these stats is not significant for any single
partition (a few tens of bytes per column, IIRC), but it adds up to a
significant size if there are thousands of such partitions. Moreover, this
data will be persisted in the database for all notification events generated
on that table, irrespective of whether the stats changed or not. In general,
what we have observed is that as we increase the message size (for example,
when we add the whole JSON serialization of the before and after Thrift
objects), the API runtime degrades in highly concurrent workloads. For
example, based on our simulations, {{alter_partition}} time increases by
~30-35% when we add Thrift objects along with parameters to the notification
events and there are 15-20 concurrent sessions running in parallel. It is hard
to quantify the performance hit on overall HMS operation in a real-world
workload, since it depends heavily on the ratio of read-only metadata APIs
(get* calls) to metadata modification calls.

> Ability to exclude potentially large parameters in HMS Notifications
> --------------------------------------------------------------------
>
>                 Key: HIVE-20545
>                 URL: https://issues.apache.org/jira/browse/HIVE-20545
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Bharathkrishna Guruvayoor Murali
>            Assignee: Bharathkrishna Guruvayoor Murali
>            Priority: Major
>         Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch, 
> HIVE-20545.6.patch, HIVE-20545.7.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
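
To illustrate the idea in the description above, here is a minimal sketch of
regex-based parameter filtering. It is written in Python for brevity and is
not the code in the attached patches. The function name, the parameter keys,
and the choice of full-match semantics are all assumptions made for this
example.

```python
import re
from typing import Dict, Iterable


def filter_parameters(params: Dict[str, str],
                      exclude_patterns: Iterable[str]) -> Dict[str, str]:
    """Return a copy of params without keys that fully match any pattern.

    Assumption: a key is excluded only when the whole key matches a pattern.
    The real HMS behavior may differ.
    """
    compiled = [re.compile(p) for p in exclude_patterns]
    return {
        key: value
        for key, value in params.items()
        if not any(rx.fullmatch(key) for rx in compiled)
    }


# Hypothetical partition parameters; the key names are for illustration only.
partition_params = {
    "numRows": "1000",
    "example_stats_chunk0": "x" * 64,   # stands in for a large value
    "example_stats_num_chunks": "1",
    "transient_lastDdlTime": "1539100000",
}

# Drop every parameter whose key starts with "example_stats_" before the
# object would be serialized into a notification message.
filtered = filter_parameters(partition_params, [r"example_stats_.*"])
print(sorted(filtered))
```

The function returns a new dictionary rather than mutating its input, so the
stored table or partition object keeps its parameters and only the copy that
goes into the notification message is trimmed.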



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
