[ https://issues.apache.org/jira/browse/HIVE-22221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934885#comment-16934885 ]
Hive QA commented on HIVE-22221:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12980909/HIVE-22221.1.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 16840 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=81)
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18672/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18672/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12980909 - PreCommit-HIVE-Build

> Llap external client - Need to reduce LlapBaseInputFormat#getSplits() footprint
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-22221
>                 URL: https://issues.apache.org/jira/browse/HIVE-22221
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, UDF
>            Reporter: Shubham Chaurasia
>            Assignee: Shubham Chaurasia
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22221.1.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While querying through llap external client, LlapBaseInputFormat#getSplits()
> invokes the get_splits() (GenericUDTFGetSplits) udtf under the hood.
> GenericUDTFGetSplits returns LlapInputSplit in which planBytes[] occupies
> around 90% of the split size.
> Depending on data size/partitions and plan, LlapInputSplit can grow up to
> 1 MB, with planBytes[] being common to all the splits and occupying more
> than 850 KB. It also sometimes causes OOM on HS2, depending on the HS2 heap
> size.
> This can be resolved by separating the common parts from the actual splits
> and reassembling them on the client side.
> We can also provide an option where the client can say it does not want
> them reassembled and can take control of reassembling into its own hands.
> Splits can be broken up like this:
> 1) schema split
> 2) plan split
> 3) actual split 1
> 4) actual split 2....and so on.
> This greatly reduces the memory (in my case from 5 GB (~5000 splits) to
> around 15 MB) on the server side, and hence the data transfer. It also
> eliminates OOM on the HS2 side.
> cc [~jdere] [~sankarh] [~thejas]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
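
The split layout proposed in the issue description (schema split, plan split, then the actual splits) can be sketched roughly as follows. This is only an illustration of the idea, not the code in HIVE-22221.1.patch: the class SplitDedupSketch, the FullSplit type, and the methods separate(), reassemble(), and totalBytes() are hypothetical names, and the byte sizes in main() are made-up placeholders rather than measurements.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these types exist in Hive. */
public class SplitDedupSketch {

    /** Stand-in for a split that carries the shared schema and plan bytes. */
    public static final class FullSplit {
        public final byte[] schemaBytes;
        public final byte[] planBytes;
        public final byte[] fragmentBytes;

        public FullSplit(byte[] schemaBytes, byte[] planBytes, byte[] fragmentBytes) {
            this.schemaBytes = schemaBytes;
            this.planBytes = planBytes;
            this.fragmentBytes = fragmentBytes;
        }
    }

    /**
     * Server side: emit the common parts once, then only the per-split payloads.
     * Order follows the issue description: schema, plan, split 1, split 2, ...
     */
    public static List<byte[]> separate(byte[] schemaBytes, byte[] planBytes,
                                        List<byte[]> fragments) {
        List<byte[]> wire = new ArrayList<>(fragments.size() + 2);
        wire.add(schemaBytes);
        wire.add(planBytes);
        wire.addAll(fragments);
        return wire;
    }

    /** Client side: reattach the shared schema and plan to every actual split. */
    public static List<FullSplit> reassemble(List<byte[]> wire) {
        if (wire.size() < 2) {
            throw new IllegalArgumentException("expected schema and plan entries first");
        }
        byte[] schemaBytes = wire.get(0);
        byte[] planBytes = wire.get(1);
        List<FullSplit> splits = new ArrayList<>(wire.size() - 2);
        for (byte[] fragment : wire.subList(2, wire.size())) {
            // Every split references the same arrays, so the plan is held only once.
            splits.add(new FullSplit(schemaBytes, planBytes, fragment));
        }
        return splits;
    }

    /** Sum of the payload sizes, to compare against sending the plan per split. */
    public static long totalBytes(List<byte[]> parts) {
        long total = 0;
        for (byte[] part : parts) {
            total += part.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder sizes, chosen only to make the effect visible.
        byte[] schema = new byte[1_000];
        byte[] plan = new byte[850_000];
        List<byte[]> fragments = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            fragments.add(new byte[2_000]);
        }

        List<byte[]> wire = separate(schema, plan, fragments);
        List<FullSplit> splits = reassemble(wire);
        System.out.println(splits.size() + " splits, "
                + totalBytes(wire) + " bytes transferred");
    }
}
```

A client that opts out of reassembly, as the description suggests, would in this sketch simply consume the list returned by separate() directly and decide for itself when to attach the schema and plan entries.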