[ https://issues.apache.org/jira/browse/HIVE-23018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058172#comment-17058172 ]
Vihang Karajgaonkar commented on HIVE-23018:
--------------------------------------------

{noformat}
struct InsertEventRequestData {
  1: optional bool replace,
  2: required list<string> filesAdded,
  // Checksum of files (hex string of checksum byte payload)
  3: optional list<string> filesAddedChecksum,
  // Used by acid operation to create the sub directory
  4: optional list<string> subDirectoryList,
}

union FireEventRequestData {
  1: InsertEventRequestData insertData
}

struct FireEventRequest {
  1: required bool successful,
  2: required FireEventRequestData data
  // dbname, tablename, and partition vals are included as optional in the top level event rather than placed in each type of
  // subevent as I assume they'll be used across most event types.
  3: optional string dbName,
  4: optional string tableName,
  5: optional list<string> partitionVals,
  6: optional string catName,
}

struct FireEventResponse {
  // NOP for now, this is just a place holder for future responses
}
{noformat}

Most of the Thrift structures above can be reused to build a new API that fires multiple events. I propose adding a field to {{FireEventRequestData}} that takes a {{list<InsertEventRequestData>}} so that multiple events can be fired. Since {{FireEventRequestData}} is defined as a union, only one field can be set at a time, so existing clients can continue to use the same API to fire events one by one, while newer clients can fire events in bulk in one RPC call.

> Provide a bulk API to fire multiple listener events
> ---------------------------------------------------
>
>                 Key: HIVE-23018
>                 URL: https://issues.apache.org/jira/browse/HIVE-23018
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>
> Metastore provides an API to fire a listener event (currently only the
> INSERT event is supported). The problem with that API is that it takes only
> one partition at a time. A typical query may insert data into multiple
> partitions at a time. In such a case, query engines like HS2 or Impala have
> to issue multiple RPCs to the metastore sequentially to fire these events.
> This can show up as a slowdown to the user if the query engine does not
> return the prompt until all the events are fired (as is the case with HS2
> and Impala). It would be great to have a bulk API which takes in multiple
> partitions for a table so that the metastore can generate many such events
> in one RPC.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
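
For illustration, the union change proposed in the comment above might look like the following. This is only a sketch under stated assumptions: the field name {{insertDatas}} and the field id 2 are hypothetical, since the comment specifies only the field's type.

{noformat}
union FireEventRequestData {
  // Existing field, unchanged, so current clients keep working
  1: InsertEventRequestData insertData,
  // Hypothetical new field (name and id assumed): carries several insert events in one request
  2: list<InsertEventRequestData> insertDatas
}
{noformat}

Because a Thrift union allows only one field to be set, a client would populate either {{insertData}} or the list field, not both. The comment does not say how per-event partition values would be carried, given that {{partitionVals}} sits on the top-level {{FireEventRequest}}, so the sketch leaves that open.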