[ 
https://issues.apache.org/jira/browse/HIVE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867036#comment-13867036
 ] 

Ashutosh Chauhan commented on HIVE-5951:
----------------------------------------

I thought the perf gain would come from making a single call to the backend DBMS 
(probably using directsql). But looking at the newly added method in ObjectStore, 
we are still adding partitions one by one in a for loop. I think there is 
quite a bit of perf to be had by making a single call to the backend DB.
Apart from perf, it also makes sense to do one call for atomicity guarantees.
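
To illustrate the difference, here is a minimal sketch in plain Java. It is not 
the ObjectStore code from the patch: PartitionStoreSketch, addOneByOne and 
addBatch are hypothetical names, and the "backend" is just an in-memory set with 
a counter standing in for round trips to the DB.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for a metastore backend; not Hive's ObjectStore.
final class PartitionStoreSketch {
    private final Set<String> partitions = new LinkedHashSet<>();
    private int backendCalls = 0; // counts simulated round trips to the backend DB

    // Simulates one round trip that inserts a single partition.
    private void insertOne(String name) {
        backendCalls++;
        if (!partitions.add(name)) {
            throw new IllegalStateException("partition already exists: " + name);
        }
    }

    // Loop variant: one round trip per partition. A failure midway leaves
    // the partitions inserted before it in place.
    void addOneByOne(List<String> names) {
        for (String name : names) {
            insertOne(name);
        }
    }

    // Batch variant: one simulated round trip. Everything is validated
    // first, so either all partitions are added or none are.
    void addBatch(List<String> names) {
        backendCalls++;
        Set<String> staged = new LinkedHashSet<>();
        for (String name : names) {
            if (partitions.contains(name) || !staged.add(name)) {
                throw new IllegalStateException("partition already exists: " + name);
            }
        }
        partitions.addAll(staged);
    }

    int size() {
        return partitions.size();
    }

    int backendCalls() {
        return backendCalls;
    }
}
```

In this sketch, if a store already holds ds=1 and we try to add ds=2, ds=1, ds=3, 
the loop variant fails on the duplicate after ds=2 has already been inserted, 
while the batch variant leaves the store unchanged and uses a single call. A real 
backend would presumably get the same effect from a single transaction or a 
multi-row insert.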

> improve performance of adding partitions from client
> ----------------------------------------------------
>
>                 Key: HIVE-5951
>                 URL: https://issues.apache.org/jira/browse/HIVE-5951
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-5951.01.patch, HIVE-5951.02.patch, 
> HIVE-5951.03.patch, HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, 
> HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, HIVE-5951.patch
>
>
> Adding partitions to the metastore is currently very inefficient. There are small 
> things like, for the !ifNotExists case, DDLSemanticAnalyzer gets the full 
> partition object for every spec (which is a network call to the metastore), and 
> then discards it instantly; there's also the general problem that too much 
> processing is done on the client side. DDLSA should analyze the query and make 
> one call to the metastore (or maybe a set of batched calls if there are too many 
> partitions in the command); the metastore should then figure out the rest and 
> insert in batch.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
